MIT CogNet, The Brain Sciences Connection (The MIT Press)
A Distributed, Large Scale Connectionist Model of the Interaction of Lexical and Semantic Constraints in Syntactic Ambiguity Resolution

 Michael W. Harm, Robert Thornton and Maryellen C. MacDonald
Abstract:
Constraint-based accounts of sentence processing emphasize weighing multiple constraints during ambiguity resolution and point to connectionist models as a mechanism for doing so. To date, connectionist models have examined the effects of lexical and distributional frequency on processing but have not typically incorporated semantic representations. Because interactions of lexical and plausibility information are central to constraint-based proposals (e.g., Trueswell et al., 1994; Garnsey et al., 1997), it is clearly important to represent semantic information in the models. We present a model that maps distributed formal representations onto distributed semantic representations, providing a rigorous test of constraint-based accounts. We examined the interaction of lexical, semantic, and distributional constraints in processing syntactic category ambiguities, such as "the desert trains", in which "trains" can be a noun ("the desert trains were late") or a verb ("the desert trains the soldiers").

The model was trained on 20,000 word triplets taken from the tagged WSJ and Brown corpora. Word forms were represented syllabically, with morphological endings encoded separately (so CATS consists of a CAT node and an -S node). The semantic representations were derived from WordNet, encoding high-level features (such as ISA-ENTITY, ISA-ACTION, PLURAL), pragmatic features (such as ISA-TOOL, ISA-MAMMAL), and lexical, item-specific features. We used 8,207 semantic features and 3,421 formal features to encode 8,210 word forms and their meanings. A total of 350 items were ambiguous between noun and verb status (e.g., TRAINS).
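The encoding scheme above can be sketched as binary feature vectors over separate form and semantic inventories. The tiny inventories and the feature assignments for CATS below are illustrative stand-ins, not the authors' actual 3,421-unit formal and 8,207-unit semantic inventories:

```python
import numpy as np

# Hypothetical miniature inventories; the real model used 3,421 formal
# and 8,207 semantic features.
FORM_FEATURES = ["CAT", "-S", "TRAIN", "DESERT"]        # syllabic + morphological units
SEM_FEATURES = ["ISA-ENTITY", "ISA-ACTION", "PLURAL",
                "ISA-MAMMAL", "ISA-TOOL"]               # WordNet-derived features

def encode(active, inventory):
    """Return a binary vector with 1s at the positions of the active features."""
    vec = np.zeros(len(inventory))
    for feature in active:
        vec[inventory.index(feature)] = 1.0
    return vec

# CATS = a CAT form node plus a separate -S morphological node,
# mapped onto a distributed semantic pattern.
cats_form = encode(["CAT", "-S"], FORM_FEATURES)
cats_sem = encode(["ISA-ENTITY", "ISA-MAMMAL", "PLURAL"], SEM_FEATURES)
```

Representing the morphological ending as its own node, rather than folding it into the stem, is what lets the same -S unit participate in both the noun-plural and verb-agreement readings of an ambiguous form like TRAINS.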

For each phrase, the network was presented with each word in succession (e.g., DESERT TRAINS ARE). The target was the correct semantics for the current word. Interpretation of category-ambiguous second words was biased by the semantics of the preceding word. A pair of "interpretation" nodes was connected to the semantic and context representations, encoding the interpretation of the phrase as noun-noun (NN) or noun-verb (NV). High-level semantic features (such as ISA-ENTITY), pragmatic constraints, and item-specific regularities were all combined by the network and used to the extent that they were informative. For example, the semantic feature ISA-ABSTRACTION strongly inhibited the verb sense of CUTS, so items such as TAX CUTS or TARIFF CUTS tended to be interpreted as NN. This formalism allows us to examine the interaction of semantic and lexical biases (TARIFF CUTS has the same phrase plausibility as TAX CUTS, but TARIFF's frequency biases differ from TAX's), and more broadly to address the integration of constraints computed simultaneously across many grain sizes.
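The way context semantics bias the interpretation nodes can be sketched as a pair of linear readouts over the preceding word's semantic features. The weights below are invented for illustration (the model's weights were learned), but they capture the reported pattern in which ISA-ABSTRACTION in the first word inhibits the verb reading of the second:

```python
import numpy as np

# Hypothetical context semantic features for the first word of a phrase.
SEM = ["ISA-ABSTRACTION", "ISA-ENTITY", "ISA-ACTION"]

# Illustrative weights to the two interpretation nodes (not learned values):
# ISA-ABSTRACTION excites the NN node and inhibits the NV node.
W_NN = np.array([ 2.0, 0.5, -1.0])
W_NV = np.array([-2.0, 0.2,  1.0])

def interpret(context_sem):
    """Return (p_NN, p_NV): softmax over the two interpretation nodes."""
    z = np.array([W_NN @ context_sem, W_NV @ context_sem])
    e = np.exp(z - z.max())           # subtract max for numerical stability
    return e / e.sum()

# TAX is an abstract entity, so "TAX CUTS" should lean toward the NN reading.
tax_sem = np.array([1.0, 1.0, 0.0])
p_nn, p_nv = interpret(tax_sem)
```

A softmax readout is one simple stand-in for competing interpretation nodes; the point of the sketch is only that graded semantic features shift the balance between the two readings rather than deterministically selecting one.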

We tested the model on a set of phrases crossing the lexical frequency preferences of the first word (to be a modifying or head noun) with semantic plausibility, derived from calculations of the conditional entropy between semantic features of the first and second words. We assessed the model's preference by measuring the activity of the interpretation nodes. A statistically reliable effect of both first-word bias and phrase plausibility was found, replicating effects reported by MacDonald (1993). These and other results point to the importance of rich semantic representations and large training corpora in connectionist models of syntactic processing.
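The conditional-entropy measure of plausibility can be illustrated with a small sketch. Given co-occurrence counts of first-word and second-word features, H(second | first) = -sum over pairs of p(f, s) * log2 p(s | f); the pair data below are made up to show the two extremes:

```python
import math
from collections import Counter

def conditional_entropy(pairs):
    """H(second | first) in bits, estimated from (first, second) feature pairs."""
    joint = Counter(pairs)                    # counts of (first, second) pairs
    first = Counter(f for f, _ in pairs)      # marginal counts of first features
    n = len(pairs)
    h = 0.0
    for (f, s), count in joint.items():
        p_joint = count / n                   # p(f, s)
        p_cond = count / first[f]             # p(s | f)
        h -= p_joint * math.log2(p_cond)
    return h

# A first-word feature that fully predicts the second word's feature
# yields zero conditional entropy (maximal plausibility constraint) ...
h_predictive = conditional_entropy([("A", "x"), ("B", "y")] * 5)

# ... while an uninformative first word yields high conditional entropy.
h_uninformative = conditional_entropy([("A", "x"), ("A", "y")] * 5)
```

Lower conditional entropy means the first word's semantics more tightly constrain the second word's, which is one way to operationalize phrase plausibility on a corpus scale; the abstract does not specify the authors' exact estimator, so this is a generic plug-in estimate from counts.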

© 2010 The MIT Press