MIT CogNet, The Brain Sciences Connection (The MIT Press)
Learning Semantic Structures From a Corpus

 Michiro Negishi
  
 

Abstract:

Connectionist modeling of language processing (e.g., St. John and McClelland 1990) is especially important because it provides models of development as well as of performance. This poster presents an unsupervised connectionist model of language comprehension at the syntactic and semantic levels, focusing on the acquisition of semantic role assignment.

The syntactic part (parser) of the model has a two-sub-parser architecture motivated by linguistic (Pesetsky 1995) and neurophysiological (Friederici 1995) observations. The first level operates on the categories of lexical items and serves to combine a head with its arguments. The second level operates in cooperation with the semantic part and serves to combine phrases into extended maximal projections; its output can be fed back to the first level as an argument. Working memories interface between the inputs and the two sub-parsers. Because there are separate working memories for heads and arguments, and because there are two sub-parsers, the parser can handle inputs in which heads and arguments have been moved from their base positions.
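The two-level organization with separate head and argument memories can be sketched as a toy data structure. The class and method names, and the granularity of the operations, are illustrative assumptions; the poster does not specify the parser's internal representations.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Phrase:
    """A head together with the arguments it has combined with."""
    head: str
    args: List["Phrase"] = field(default_factory=list)

class TwoLevelParser:
    """Toy sketch: separate working memories feed two combination levels."""

    def __init__(self):
        self.head_memory: List[str] = []     # pending lexical heads
        self.arg_memory: List[Phrase] = []   # pending arguments

    def read(self, word: str, is_head: bool):
        """Route each input item to the head or the argument memory."""
        if is_head:
            self.head_memory.append(word)
        else:
            self.arg_memory.append(Phrase(word))

    def combine_level1(self) -> Optional[Phrase]:
        """Level 1: combine the most recent head with the buffered arguments."""
        if not self.head_memory:
            return None
        phrase = Phrase(self.head_memory.pop(), self.arg_memory[:])
        self.arg_memory.clear()
        return phrase

    def combine_level2(self, phrase: Phrase):
        """Level 2: a completed phrase (extended maximal projection)
        re-enters the first level as an argument."""
        self.arg_memory.append(phrase)

# Usage: because heads and arguments live in separate memories, the
# argument can arrive before its head without being lost.
parser = TwoLevelParser()
parser.read("dog", is_head=False)     # argument buffered first
parser.read("chased", is_head=True)   # head arrives later
vp = parser.combine_level1()          # "chased" takes "dog" as its argument
parser.combine_level2(vp)             # the phrase feeds back as an argument
```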

The semantic part consists of two distribution-analysis networks (in the sense of Saussure and Harris) and a categorizer. The distribution-analysis networks are two-layer perceptrons trained with a least-mean-square error algorithm. One network learns the mapping from a predicate (the lexical head of an extended maximal projection) to the distribution of the lexical heads of its arguments; the other learns the mapping from the lexical head of each argument to the distribution of predicates. Thus, for each predicate-argument pair from the parser, the distribution-analysis networks yield a pair of probability distributions over arguments and predicates (note the reverse order). The categorizer receives these probability distributions, along with positional information about the argument from the parser, and outputs the semantic role of the argument with respect to the predicate. The categorizer is a two-layer network that employs a competitive learning algorithm (Grossberg 1976).
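A minimal sketch of one distribution-analysis network and the categorizer follows. The vocabulary sizes, one-hot codings, and learning rates are illustrative assumptions, not details from the poster; the least-mean-square update and the winner-take-all competitive rule are the only parts the abstract specifies.

```python
import numpy as np

# Hypothetical toy sizes (assumptions for illustration only).
n_predicates, n_arguments, n_roles = 4, 6, 3

# Distribution-analysis network: a two-layer perceptron trained with the
# least-mean-square (delta) rule. From a one-hot predicate it learns to
# predict the empirical distribution of that predicate's argument heads.
W = np.zeros((n_arguments, n_predicates))

def lms_step(pred: int, arg: int, lr: float = 0.1):
    """One LMS update: move the predicted distribution toward the observed argument."""
    x = np.eye(n_predicates)[pred]
    t = np.eye(n_arguments)[arg]
    W[:] += lr * np.outer(t - W @ x, x)

# Train on predicate-argument pairs: predicate 0 co-occurs with
# argument 1 twice and argument 2 once per pass.
for _ in range(300):
    for pred, arg in [(0, 1), (0, 1), (0, 2), (1, 3)]:
        lms_step(pred, arg)

dist = W @ np.eye(n_predicates)[0]   # approximates the 2:1 co-occurrence ratio

# Categorizer: winner-take-all competitive learning (after Grossberg 1976).
# Input would be the concatenated distributions plus positional information;
# here we feed only one distribution to keep the sketch short.
rng = np.random.default_rng(0)
V = rng.random((n_roles, n_arguments))

def categorize(x: np.ndarray, lr: float = 0.05, learn: bool = True) -> int:
    winner = int(np.argmax(V @ x))         # best-matching role unit wins
    if learn:
        V[winner] += lr * (x - V[winner])  # move only the winner toward the input
    return winner

role = categorize(dist)
```

The LMS network converges toward the conditional frequency of each argument given the predicate, which is what lets the categorizer cluster arguments into role-like classes without supervision.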

After the model is trained on a corpus of mothers' utterances to their children (Snow 1989), it is tested on held-out sentences, randomly selected and excluded from the training data, to confirm that reasonable semantic roles are distinguished and assigned to each argument. Analysis of the results indicates that syntactic structure and word category are initially the main cues for learning semantic roles, and that the lexical identity of each head is exploited at a later stage of development.

Friederici, A. D., (1995). "The time course of syntactic activation during language processing: A model based on neuropsychological and neurophysiological data." Brain and Language, 50, 259-281.

Grossberg, S. (1976). "Adaptive pattern classification and universal recoding, I: Parallel development and coding of neural detectors." Biological Cybernetics, 23, 121-134.

Pesetsky, D. (1995). Zero Syntax: Experiencers and Cascades. MIT Press, Cambridge.

Snow, C. (1989). "Imitativeness: a trait or a skill?" In Speidel, G. and Nelson, K., editors, The many faces of imitation. Reidel, NY.

St. John, M. F., and McClelland, J. L. (1990). "Learning and applying contextual constraints in sentence comprehension." Artificial Intelligence, 46, 217-257.

 
 


© 2010 The MIT Press