A Model that Derives the Syntax/Semantics Distinction

Whitney Tabor and Sean Hutchins

Abstract:
Diverse recent studies indicate that violations of syntactic constraints are processed differently from violations of semantic constraints (brain imaging: e.g., Ainsworth-Darnell et al., 1997; Ni et al., in press; speeded grammaticality judgment: McElree and Griffith, 1995; eye-tracking: Ni et al., 1998). Usually, these results are taken as support for the view that the processor employs two separate modules for enforcing the two classes of constraints. But this account leaves open the question of how a learner decides that a particular systematicity in language belongs to one constraint system or the other. (Why is "Dogs moo" a semantic violation while "Dogs barks" is a syntactic one?) Several studies of learning in connectionist networks show networks developing distinct responses to syntactic and semantic violations without architectural modularity (Plaut, 1999; Tabor and Tanenhaus, 1999; Rohde and Plaut, in press). The connectionist studies are appealing because they derive the distinction between the two types of violation rather than stipulating it, and they are explicit about how the distinction could be learned. But the source of the distinction in the connectionist studies has been unclear up to now.

We report on a replication of Rohde and Plaut's simulation as well as two new simulation studies that make it clear how the contrast is learned and why an assumption of architectural modularity is not necessary. Our new simulations used a Simple Recurrent Network (SRN) and focused on the distinction between (semantic) selectional constraints and (syntactic) subcategorization constraints, one of the subtlest kinds of syntax/semantics distinction.
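The abstract does not spell out the network's implementation. As a rough illustration only, the following is a minimal sketch of an Elman-style SRN trained on next-word prediction over a one-hot-coded toy vocabulary, written in Python/NumPy. The vocabulary size, hidden-layer size, learning rate, and single-step error propagation are our own illustrative assumptions, not the settings of the reported simulations.

```python
import numpy as np

# Minimal Elman-style Simple Recurrent Network (SRN) sketch for next-word
# prediction over a toy vocabulary. All sizes and the learning rate are
# illustrative placeholders, not the values used in the paper's simulations.

rng = np.random.default_rng(0)

VOCAB = 8       # toy vocabulary size (assumption)
HIDDEN = 16     # hidden/context layer size (assumption)
LR = 0.1        # learning rate (assumption)

# Weights: input->hidden, context->hidden, hidden->output
W_xh = rng.normal(0, 0.1, (HIDDEN, VOCAB))
W_hh = rng.normal(0, 0.1, (HIDDEN, HIDDEN))
W_hy = rng.normal(0, 0.1, (VOCAB, HIDDEN))
b_h = np.zeros(HIDDEN)
b_y = np.zeros(VOCAB)

def one_hot(i, n=VOCAB):
    v = np.zeros(n)
    v[i] = 1.0
    return v

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def train_step(sequence):
    """One pass over a word-index sequence, predicting each next word.
    Following Elman (1990), the context layer is a copy of the previous
    hidden state and error is not backpropagated through time."""
    h_prev = np.zeros(HIDDEN)
    for t in range(len(sequence) - 1):
        x = one_hot(sequence[t])
        target = sequence[t + 1]
        h = np.tanh(W_xh @ x + W_hh @ h_prev + b_h)   # hidden state
        p = softmax(W_hy @ h + b_y)                   # next-word distribution
        # Cross-entropy gradients for this single time step
        dy = p - one_hot(target)
        dh = (W_hy.T @ dy) * (1 - h ** 2)
        W_hy -= LR * np.outer(dy, h); b_y -= LR * dy
        W_xh -= LR * np.outer(dh, x)
        W_hh -= LR * np.outer(dh, h_prev); b_h -= LR * dh
        h_prev = h                                    # copy to context units

def hidden_states(sequence):
    """Hidden representation after each word, for cluster analysis."""
    h = np.zeros(HIDDEN)
    states = []
    for w in sequence:
        h = np.tanh(W_xh @ one_hot(w) + W_hh @ h + b_h)
        states.append(h.copy())
    return np.array(states)
```

In this sketch the hidden states returned by `hidden_states` play the role of the representations whose clustering is analyzed in the simulations described below.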

Simulation 1 examined a very simple language in order to bootstrap our understanding of more complex cases: only three phrase types and random semantic feature assignment were used. The network organized its hidden representations into three major clusters corresponding to the three phrase types, with many subclusters corresponding to the semantic contrasts within the phrase types. A single principle governed the network's response to all violations. A violation is a case in which the information provided by the current word clashes with the information provided by the preceding context, and the network responds to such a clash by averaging the conflicting signals. In the case of a selectional violation, this averaging puts the hidden representation outside of the subclusters but within a major cluster. In the case of a syntactic violation, averaging puts the representation between major clusters (and there is no containing supercluster). If we assume that reaction to a violation is slow when its representation falls within a familiar cluster and is thus confusable with a familiar grammatical case, this finding models the behavioral results identified above.
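One way to make this cluster analysis concrete is sketched below: given hidden states produced by grammatical training items, labeled by phrase type (major clusters) and by semantic class (subclusters), compute cluster centroids and ask how far a violation's hidden state lies from the nearest centroid at each level. The centroid-distance scheme, function names, and label arrays are our own illustrative choices, not the analysis procedure reported in the paper.

```python
import numpy as np

def centroids(states, labels):
    """Mean hidden vector for each label value (cluster centroid).
    `states` is (n_items, hidden); `labels` is a NumPy array of length n_items."""
    return {lab: states[labels == lab].mean(axis=0) for lab in np.unique(labels)}

def nearest(vec, cents):
    """Label and Euclidean distance of the closest centroid."""
    lab, d = min(((l, np.linalg.norm(vec - c)) for l, c in cents.items()),
                 key=lambda t: t[1])
    return lab, d

def locate_violation(h_violation, train_states, phrase_labels, sem_labels):
    """Report how a violation's hidden state relates to the major
    (phrase-type) clusters and the semantic subclusters."""
    major = centroids(train_states, phrase_labels)
    sub = centroids(train_states, sem_labels)
    major_lab, major_d = nearest(h_violation, major)
    sub_lab, sub_d = nearest(h_violation, sub)
    return {"nearest_major_cluster": major_lab, "distance_to_major": major_d,
            "nearest_subcluster": sub_lab, "distance_to_subcluster": sub_d}
```

On the interpretation in the text, a selectional violation would show a small distance to some major-cluster centroid but a large distance to every subcluster centroid, while a syntactic violation would be far from the major-cluster centroids as well.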

Simulation 2 scaled these results up to a more complex case involving realistic noun classes. Whereas Rohde and Plaut's large-scale simulation examined predictions made by the network immediately preceding a violation, we studied its responses to the violations themselves. In support of the interpretation suggested by our Simulation 1 analysis, we found clear differences in the average minimum distance from familiar grammatical cases among well-formed test cases (0.040), selectional violations (0.176), and subcategorization violations (0.360). All pairwise contrasts in means were significant at p < .001; in particular, subcategorization violations were significantly farther from grammatical cases than selectional violations were. These results indicate that, again, confusability is the distinguishing factor.
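For concreteness, a minimum-distance measure of this kind could be computed as sketched below, assuming Euclidean distance in hidden-unit space and a matrix of hidden states from familiar grammatical continuations. The condition names in the commented usage are illustrative placeholders, not the paper's materials.

```python
import numpy as np

def min_distance_to_grammatical(test_states, grammatical_states):
    """For each test hidden state, the Euclidean distance to the nearest
    hidden state produced by a familiar grammatical continuation.
    Shapes: test_states (n_test, hidden), grammatical_states (n_gram, hidden)."""
    diffs = test_states[:, None, :] - grammatical_states[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)   # (n_test, n_gram)
    return dists.min(axis=1)                 # (n_test,)

# Illustrative use: mean of the measure within each test condition.
# scores = {cond: min_distance_to_grammatical(states[cond], grammatical).mean()
#           for cond in ("well_formed", "selectional_violation",
#                        "subcategorization_violation")}
```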

 
 

