MIT CogNet, The Brain Sciences Connection. From the MIT Press.

Lexical Semantics as a Basis for Argument Structure Frequency Biases

Vered Argaman, Neal J. Pearlmutter, Aurora A. Mendelsohn, Janet H. Randall, Elizabeth Meyers and Susan M. Garnsey
  
 

Abstract:
Recent constraint-based sentence comprehension theories make extensive use of argument structure frequency information in explaining ambiguity resolution (e.g., Garnsey et al., 1997; MacDonald et al., 1994; Trueswell & Tanenhaus, 1994). However, while such frequency measures often predict comprehenders' choices for ambiguities, the source of frequency differences remains unclear. One possibility is that they are simply the result of (historically) early random variation, reinforced over generations of use. Alternatively, frequency differences might reflect underlying differences in the frequency of semantic alternatives (e.g., extending Levin, 1993), which can explain argument structure frequency differences by appealing to differences in the frequency of events in the world. A second concern is that very little research has examined the appropriateness of corpus- versus survey-based frequency measures. These measures have generally been treated as interchangeable, but they do not necessarily reflect the same underlying distributions.

To examine these issues, we used data for verbs (e.g., "proposed") which can take sentence complements (SCs) and corresponding SC-taking nouns (e.g., "proposal"), from both the Penn Treebank's Wall Street Journal (WSJ) corpus and sentence-completion surveys. In the surveys, participants continued fragments such as "Bill proposed" (verbs) or "Caroline ignored the proposal" (nouns) to form complete sentences. Participants completed either a verb or a noun survey. Approximately 100 tokens of each word were coded from each source.

We compared corresponding nouns and verbs on the probability of taking an SC. If frequency biases are a matter of random variation, they should not be expected to correlate across noun-verb pairs. Alternatively, noun-verb pairs share substantial semantic information, and thus if frequency biases are determined by semantics, the biases across pairs should correlate. In both the survey and WSJ sources, the SC probabilities were strongly correlated (r = .49, p < .001; r = .67, p < .001, respectively), suggesting that semantics has a substantial influence on frequency biases. We further examined these relationships by sorting the nouns and verbs into semantic subcategories following Levin (1993) and Wierzbicka (1987) and comparing the subcategories. Levin's subcategories provided further evidence that fine-grained semantics is related to argument structure frequency.
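The noun-verb comparison can be illustrated with a minimal sketch. It assumes that a word's SC probability is the share of its coded tokens that take a sentence complement, and that the reported r values are Pearson correlations across noun-verb pairs; the abstract does not spell out either detail. The function names, the placeholder pair labels, and every count below are hypothetical and are not data from the study.

```python
from math import sqrt


def sc_probability(sc_tokens, total_tokens):
    """Proportion of a word's coded tokens that take a sentence complement."""
    return sc_tokens / total_tokens


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical counts, invented for illustration only:
# (verb label, noun label, verb SC tokens, verb total, noun SC tokens, noun total)
toy_pairs = [
    ("proposed", "proposal", 40, 100, 25, 100),
    ("verb_2", "noun_2", 70, 100, 55, 100),
    ("verb_3", "noun_3", 15, 100, 20, 100),
    ("verb_4", "noun_4", 55, 100, 30, 100),
]

verb_probs = [sc_probability(v_sc, v_n) for _, _, v_sc, v_n, _, _ in toy_pairs]
noun_probs = [sc_probability(n_sc, n_n) for _, _, _, _, n_sc, n_n in toy_pairs]

# A clearly positive r across pairs is the pattern the semantic account predicts;
# an r near zero is what purely random variation would predict.
r_noun_verb = pearson_r(verb_probs, noun_probs)
print(f"noun-verb SC probability correlation: r = {r_noun_verb:.2f}")
```

The same two functions would apply unchanged to a corpus-versus-survey comparison, by pairing each word's SC probability in one source with its SC probability in the other.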

To examine the relationship between corpora and surveys, we compared the sources separately for verbs and nouns, using the same SC probability measure. For verbs, the correlation was marginal (r = .26); for the nouns, the sources were strongly correlated (r = .59, p < .001), and this pattern also appeared for other probability measures. These results indicate that corpora and survey sources do tend to agree, but the relative weakness of the verb correlations suggests a need for consideration of both types of sources in using frequency information to predict comprehension difficulty.

 
 


© 2010 The MIT Press