MIT CogNet, The Brain Sciences Connection, from the MIT Press

 

Verb behavior is not verb nature: Sense and genre bias as sources of subcategorization probabilities

 Douglas Roland, Daniel Jurafsky and Laura Michaelis
  
 

Abstract:
The subcategorization preference of a verb is significantly affected by its discourse context and its word sense. This suggests that processing models must (1) represent subcategorization biases at the level of the semantic lemma ('sense bias') rather than the orthographic word ('verb bias'), and (2) be able to integrate multiple probabilistic factors (sense bias, genre bias, animacy bias, etc.). Subcategorization frequencies also depend strongly on the method used to compute them.

By comparing subcategorization frequencies across six corpora, namely Connine et al.'s (1984) sentence-production data, Garnsey et al.'s (1997) sentence-completion data, written text (the Brown, Wall Street Journal, and British National corpora), and conversational data (the Switchboard corpus), we show that:

1) Different verb senses have different subcategorization probabilities.

Processing models generally assume that subcategorization biases are specific to each verb entry (Clifton, Frazier, & Connine, 1994; Garnsey et al., 1997; Spivey-Knowlton & Sedivy, 1995; Trueswell et al., 1993, inter alia). Our results reveal biases at the level of the semantic lemma ('sense bias') rather than the orthographic word.

2) Genre constrains the subcategorization probabilities of a verb.

Following work by Merlo (1994), Roland and Jurafsky (1998) showed that subcategorization frequencies in elicited data (Connine et al., 1984) differed from those in natural corpora: elicited sentences have a greater probability of prepositional phrases (PPs) but fewer passives and zero arguments. They gave functional explanations for these differences. We extend this work by showing that discourse topic, verbal aspect, and the animacy of the syntactic subject all affect verb sense, and thereby subcategorization. For example, when subjects in Connine et al. were asked to use the verb pass with the discourse topic 'school', they tended to use the 'pass a test' sense, a direct-object (DO) biased sense that is rare in the Brown and WSJ data. Perfective aspect correlates with DO complementation and imperfective aspect with sentential complements (SC) (Dowty, 1990); such aspectual biases were found in the elicited data. Additionally, we found that inanimate-subject senses of verbs like worry were preempted by the generally animate subjects of the elicited sentences.
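
The contrast between 'verb bias', 'sense bias', and genre bias in points 1 and 2 can be made concrete with a small sketch. It estimates frame probabilities as relative frequencies, conditioning first on the verb alone, then on verb plus sense, then on verb plus corpus. Everything in it is an assumption made for illustration: the `Observation` record, the sense labels `pass_test` and `move_by`, the corpus labels, the `frame_probabilities` helper, and above all the counts, which are invented and are not data from the study.

```python
from collections import Counter, defaultdict
from typing import Callable, Hashable, NamedTuple


class Observation(NamedTuple):
    verb: str
    sense: str   # hypothetical sense label
    corpus: str  # hypothetical corpus/genre label
    frame: str   # subcategorization frame, e.g. "DO", "PP", "zero"


# Invented toy counts for illustration only; NOT data from the study.
TOY_DATA = (
    [Observation("pass", "pass_test", "elicited", "DO")] * 6
    + [Observation("pass", "move_by", "elicited", "PP")] * 3
    + [Observation("pass", "pass_test", "written", "DO")] * 1
    + [Observation("pass", "move_by", "written", "PP")] * 3
    + [Observation("pass", "move_by", "written", "zero")] * 6
)


def frame_probabilities(
    observations: list[Observation],
    key: Callable[[Observation], Hashable],
) -> dict[Hashable, dict[str, float]]:
    """Relative frequency of each frame within each group defined by `key`."""
    counts: dict[Hashable, Counter] = defaultdict(Counter)
    for obs in observations:
        counts[key(obs)][obs.frame] += 1
    return {
        group: {frame: n / sum(c.values()) for frame, n in c.items()}
        for group, c in counts.items()
    }


# 'Verb bias': condition on the orthographic word only.
by_verb = frame_probabilities(TOY_DATA, lambda o: o.verb)
# 'Sense bias': condition on the verb together with its sense.
by_sense = frame_probabilities(TOY_DATA, lambda o: (o.verb, o.sense))
# Genre bias: condition on the verb together with the corpus it came from.
by_corpus = frame_probabilities(TOY_DATA, lambda o: (o.verb, o.corpus))

for label, table in [("verb", by_verb), ("sense", by_sense), ("corpus", by_corpus)]:
    print(label, table)
```

In this toy setup the verb-level estimate blends two senses with very different frame preferences, and the per-corpus estimates differ only because the corpora sample the senses in different proportions. That is the pattern the abstract describes, shown here with made-up numbers rather than the authors' counts or method.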

As sentence-processing models grow more probabilistic, we need a deeper understanding of the constraints which affect these probabilities, in order to ensure the methodological soundness of our measures and accurate representation of the complex factors which influence our models.

 
 


© 2010 The MIT Press