Abstract:
Interactive constraint-based theories of sentence
processing have gained increasing support, as a growing body of
empirical evidence demonstrates early influences of multiple
factors on comprehension performance. Connectionist
networks naturally reflect many properties of constraint-based
theories and thus provide one framework in which those theories
may be instantiated.
Unfortunately, most connectionist language
models implemented to date have suffered from severe
limitations. Comprehension and production models have, by
and large, been limited to simple sentences with small
vocabularies (cf. St. John & McClelland, 1990). Most
models that have addressed the problem of complex, multi-clausal
sentence processing have been prediction networks (cf. Elman,
1991; Christiansen & Chater, 1999). Although a useful
component of a language processing system, prediction does not
get at the heart of language: the interface between syntax and
semantics.
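For concreteness, here is a minimal sketch, in Python, of an
Elman-style simple recurrent network trained on next-word
prediction. The toy corpus, network size, and one-step learning
rule are assumptions made purely for illustration; they do not
reproduce Elman (1991) or the model described below.

    # Minimal simple recurrent network (SRN) for next-word prediction.
    # Corpus, sizes, and training details are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)

    corpus = "the dog chased the cat . the cat saw the dog .".split()
    vocab = sorted(set(corpus))
    V = len(vocab)
    idx = {w: i for i, w in enumerate(vocab)}

    H = 16                              # hidden/context layer size
    Wxh = rng.normal(0.0, 0.1, (H, V))  # input -> hidden
    Whh = rng.normal(0.0, 0.1, (H, H))  # context -> hidden (the recurrence)
    Why = rng.normal(0.0, 0.1, (V, H))  # hidden -> output

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    lr = 0.1
    for epoch in range(500):
        h = np.zeros(H)                  # context is reset on each pass
        for t in range(len(corpus) - 1):
            x = np.zeros(V)
            x[idx[corpus[t]]] = 1.0
            h_prev = h
            h = np.tanh(Wxh @ x + Whh @ h_prev)  # combine input and context
            p = softmax(Why @ h)                 # distribution over next word
            dy = p.copy()
            dy[idx[corpus[t + 1]]] -= 1.0        # cross-entropy gradient
            dh = (Why.T @ dy) * (1.0 - h * h)
            # One-step truncated gradient keeps the sketch short; full
            # backpropagation through time is omitted.
            Why -= lr * np.outer(dy, h)
            Wxh -= lr * np.outer(dh, x)
            Whh -= lr * np.outer(dh, h_prev)

    # The network's prediction error on each incoming word can serve as a
    # rough word-difficulty signal of the kind used in reading-time work.
    h0 = np.zeros(H)
    x = np.zeros(V)
    x[idx["the"]] = 1.0
    h = np.tanh(Wxh @ x + Whh @ h0)
    print("after 'the':", vocab[int(np.argmax(softmax(Why @ h)))])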
The current study involves a recurrent neural
network model that has been trained to both comprehend and
produce a relatively complex subset of English. This
language includes such features as tense and number, adjectives
and adverbs, prepositional phrases, relative clauses, subordinate
clauses, and sentential complements, with roughly 50 each of noun
and verb stems, for a total of about 300 words. The language is
broad enough to permit the replication of a wide range of
sentence processing experiments.
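To give a concrete sense of what such a language can look like,
the following sketch samples sentences from a small hypothetical
phrase-structure grammar containing relative clauses,
prepositional phrases, and sentential complements. The rules and
vocabulary are invented for illustration and are not the grammar
used in this study.

    # A hypothetical grammar fragment with the construction types named
    # above. Rules and words are invented; this is not the study's grammar.
    import random

    GRAMMAR = {
        "S":   [["NP", "VP"]],
        "NP":  [["Det", "N"], ["Det", "N", "RC"], ["Det", "N", "PP"]],
        "VP":  [["Vi"], ["Vt", "NP"], ["Vc", "that", "S"]],  # Vc: sentential complement
        "RC":  [["that", "VP"]],                             # subject relative clause
        "PP":  [["P", "NP"]],
        "Det": [["the"], ["a"]],
        "N":   [["dog"], ["cat"], ["girl"], ["ball"]],
        "Vi":  [["ran"], ["slept"]],
        "Vt":  [["chased"], ["saw"]],
        "Vc":  [["knew"], ["thought"]],
        "P":   [["near"], ["with"]],
    }

    def generate(symbol="S", depth=0, max_depth=5):
        # Expand a symbol by sampling rules; the depth limit forces the
        # simplest expansion so recursion stays finite.
        if symbol not in GRAMMAR:
            return [symbol]                 # terminal word
        rules = GRAMMAR[symbol]
        if depth >= max_depth:
            rules = [rules[0]]
        out = []
        for sym in random.choice(rules):
            out.extend(generate(sym, depth + 1, max_depth))
        return out

    random.seed(1)
    for _ in range(3):
        print(" ".join(generate()) + " .")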
Critical to the model is the way in which the
meanings of complex sentences are encoded. Finite
slot-filler representations will not suffice, so complex sentence
meanings are instead encoded as sets of propositions. The "encoder"
portion of the model is responsible for compressing this set into
a single static representation of sentence meaning, which serves
as the target of comprehension and the source of
production. The comprehension and production systems, which
are largely integrated, map between a sequence of words and a
message. Whether the model has properly encoded or
comprehended the message can be verified by asking it
fill-in-the-blank questions. A method has also been
developed for obtaining simulated reading times based on the
difficulty of both word prediction and semantic
integration.
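The proposition format is not spelled out above, but one plausible
reading can be sketched as follows: each proposition is a
predicate-role-filler tuple, and a recurrent encoder folds the set
into a single fixed-width meaning vector that comprehension
targets and production starts from. The embeddings, dimensions,
and encoder here are illustrative assumptions, not the actual
architecture.

    # Sketch: a sentence meaning as a set of propositions, compressed into
    # one static vector. Format, embeddings, and encoder are assumptions.
    import numpy as np

    rng = np.random.default_rng(0)

    # "The dog that chased the cat ran" as predicate-role-filler tuples.
    propositions = [
        ("chase", "agent", "dog"),
        ("chase", "patient", "cat"),
        ("run", "agent", "dog"),
    ]

    # Fixed random embeddings for every symbol appearing in a proposition.
    symbols = sorted({s for prop in propositions for s in prop})
    D = 8
    embed = {s: rng.normal(0.0, 1.0, D) for s in symbols}

    H = 16
    W_in = rng.normal(0.0, 0.1, (H, 3 * D))  # proposition -> hidden
    W_hh = rng.normal(0.0, 0.1, (H, H))      # hidden -> hidden recurrence

    def encode(props):
        # Fold the proposition set into one static meaning vector. An RNN
        # is order-sensitive, so sorting gives the set a canonical order.
        h = np.zeros(H)
        for pred, role, filler in sorted(props):
            x = np.concatenate([embed[pred], embed[role], embed[filler]])
            h = np.tanh(W_in @ x + W_hh @ h)
        return h

    message = encode(propositions)  # the target of comprehension and
    print(message.shape)            # the source of production: (16,)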
The model has been extensively tested on a variety
of tasks, including the processing of lexical and structural
ambiguities, and a range of unambiguous sentence types. It
is able to replicate many key aspects of human sentence
processing, including sensitivity to lexical and structural
frequencies, semantic plausibility, and locality effects.
In this presentation I will briefly describe the model, review in
detail its processing of several interesting features of English,
including the NP/S ambiguity, and discuss some of the lessons
learned from its study.