Abstract:
Connectionist modeling of language processing (e.g., St. John
and McClelland 1990) is especially important because it provides
models of development as well as of performance. This poster
presents an unsupervised connectionist model of language
comprehension at the syntactic and semantic levels, focusing on
the acquisition of semantic role assignment.
The syntactic part (parser) of the model has a two-sub-parser
architecture motivated by linguistic (Pesetsky 1995) and
neurophysiological (Friederici 1995) observations. The first
level operates on the categories of lexical items and combines a
head with its arguments. The second level operates in cooperation
with the semantic part and combines phrases into extended maximal
projections; its output can be fed back to the first level as an
argument. Working memories interface the inputs and the two
sub-parsers. Because heads and arguments have separate working
memories and there are two sub-parsers, the parser can handle
inputs in which heads and arguments are moved from their base
positions.
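The two-level parsing scheme can be sketched as follows; everything here (the class names, the toy category labels, and the omission of the level-two feedback loop) is an illustrative assumption, not the model's actual implementation:

```python
from dataclasses import dataclass, field

# Illustrative sketch only: a head is buffered separately from its
# arguments, so either may arrive displaced from its base position.

@dataclass
class Phrase:
    head: str                       # lexical head (e.g., a verb)
    arguments: list = field(default_factory=list)

class TwoLevelParser:
    def __init__(self):
        # Separate working memories for heads and arguments.
        self.head_memory = []
        self.argument_memory = []

    def receive(self, word, category):
        """Level 1: route each lexical item by its category."""
        if category == "head":
            self.head_memory.append(word)
        else:
            self.argument_memory.append(word)

    def combine(self):
        """Combine a buffered head with the buffered arguments.
        (In the full model, level 2 would close the result into an
        extended maximal projection and could feed it back to level 1.)"""
        if not self.head_memory:
            return None
        phrase = Phrase(self.head_memory.pop(0), list(self.argument_memory))
        self.argument_memory.clear()
        return phrase

parser = TwoLevelParser()
parser.receive("the-dog", "argument")   # argument arrives before its head
parser.receive("chased", "head")
parser.receive("the-cat", "argument")
vp = parser.combine()
```

Because arrival order is decoupled from combination, the same code handles the head-final and head-initial orderings alike.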
The semantic part consists of two distribution-analysis networks
(in the sense of Saussure and Harris) and a categorizer. The
distribution-analysis networks are two-layer perceptrons trained
with a least-mean-squares (LMS) error algorithm. One network
learns the mapping from a predicate (the lexical head of an
extended maximal projection) to the distribution of the lexical
heads of its arguments. The other learns the mapping from the
lexical head of each argument to the distribution of predicates.
Thus, for each predicate-argument pair from the parser, the
networks yield a pair of probability distributions: one over
argument heads and one over predicates. The categorizer receives
these distributions, together with positional information about
the argument from the parser, and outputs the semantic role of
the argument with respect to the predicate. The categorizer is a
two-layer network that employs a competitive learning algorithm
(Grossberg 1976).
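As a rough illustration of the distribution-analysis idea, the sketch below trains a two-layer (input-output) perceptron with the LMS delta rule to map a one-hot predicate to the distribution of its argument heads; the vocabulary and co-occurrence counts are invented toy data, not the model's training corpus:

```python
import numpy as np

# Hedged sketch: with a one-hot input, the LMS update pulls each weight
# row toward the target distribution for that predicate.

predicates = ["eat", "read"]
arg_heads = ["apple", "bread", "book"]

# Toy co-occurrence counts: rows = predicates, columns = argument heads.
counts = np.array([[3.0, 2.0, 0.0],
                   [0.0, 0.0, 5.0]])
targets = counts / counts.sum(axis=1, keepdims=True)  # target distributions

rng = np.random.default_rng(0)
W = rng.normal(scale=0.01, size=(len(predicates), len(arg_heads)))

lr = 0.5
for _ in range(200):
    for i in range(len(predicates)):
        x = np.zeros(len(predicates))
        x[i] = 1.0                              # one-hot predicate input
        y = x @ W                               # linear output layer
        W += lr * np.outer(x, targets[i] - y)   # LMS (delta-rule) update

query = np.zeros(len(predicates))
query[0] = 1.0                  # the predicate "eat"
dist = query @ W                # learned distribution over argument heads
```

The companion network (argument head to predicate distribution) would be the same perceptron with input and output vocabularies swapped.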
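The categorizer's learning step can likewise be sketched. The winner-take-all rule below is one standard form of competitive learning in the spirit of Grossberg (1976); the input patterns are invented stand-ins for the concatenated distribution-plus-position features described above:

```python
import numpy as np

# Hedged sketch of a winner-take-all competitive learner: the unit whose
# weights best match the input wins and moves its weights toward it.

# Two toy clusters of inputs (say, "agent-like" vs. "patient-like").
cluster_a = [np.array([1.0, 0.9, 0.0, 0.1]), np.array([0.9, 1.0, 0.1, 0.0])]
cluster_b = [np.array([0.0, 0.1, 1.0, 0.9]), np.array([0.1, 0.0, 0.9, 1.0])]

# Initialize each role unit on a data point (a common remedy for units
# that never win and therefore never learn).
W = np.stack([cluster_a[0], cluster_b[0]])
W = W / np.linalg.norm(W, axis=1, keepdims=True)

def train(patterns, epochs=50, lr=0.2):
    for _ in range(epochs):
        for x in patterns:
            winner = np.argmax(W @ x)           # competition: best match wins
            W[winner] += lr * (x - W[winner])   # only the winner learns
            W[winner] /= np.linalg.norm(W[winner])

train(cluster_a + cluster_b)

def role(x):
    """Semantic-role unit assigned to an input pattern."""
    return int(np.argmax(W @ x))
```

After training, inputs from the same cluster are assigned the same role unit, which is the unsupervised behavior the categorizer relies on.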
After the model is trained on a corpus of mothers' utterances to
children (Snow 1989), it is tested on sentences randomly held out
from the training data to confirm that reasonable semantic roles
are distinguished and assigned to each argument. Analysis of the
results indicates that syntactic structure and word category are
the main cues for learning semantic roles initially, while
lexical differences among heads come into play at a later stage
of development.
References:

Friederici, A. D. (1995). "The time course of syntactic
activation during language processing: A model based on
neuropsychological and neurophysiological data." Brain and
Language, 50, 259-281.

Grossberg, S. (1976). "Adaptive pattern classification and
universal recoding, I: Parallel development and coding of neural
feature detectors." Biological Cybernetics, 23, 121-134.

Pesetsky, D. (1995). Zero Syntax: Experiencers and Cascades.
MIT Press, Cambridge.

Snow, C. (1989). "Imitativeness: A trait or a skill?" In Speidel,
G. and Nelson, K., editors, The Many Faces of Imitation.
Reidel, NY.

St. John, M. F., and McClelland, J. L. (1990). "Learning and
applying contextual constraints in sentence comprehension."
Artificial Intelligence, 46, 217-257.