Abstract:
Constraint-based accounts of sentence processing emphasize
weighing multiple constraints during ambiguity resolution and point
to connectionist models as a mechanism for doing so. To date,
connectionist models have examined the effects of lexical and
distributional frequency on processing but have not typically
incorporated semantic representations. Because interactions of
lexical and plausibility information are central to
constraint-based proposals (e.g., Trueswell et al., 1994; Garnsey
et al., 1997), it is clearly important to represent semantic
information in the models. We present a model that maps distributed
formal representations onto distributed semantic representations,
providing a rigorous test of constraint-based accounts. We examined
the interaction of lexical, semantic, and distributional
constraints in processing syntactic category ambiguities, such as
"the desert trains", in which "trains" can be a noun ("the desert
trains were late") or a verb ("the desert trains the soldiers").
The model was trained on 20,000 word triplets taken from the
tagged WSJ and Brown corpora. The word forms were represented
syllabically, with morphological endings separately encoded (so
CATS consists of a CAT node and an -S node). The semantic
representations were derived from WordNet, encoding high-level
features (such as ISA-ENTITY, ISA-ACTION, PLURAL), pragmatic
features (such as ISA-TOOL, ISA-MAMMAL), and lexical, item-specific
features. We used 8,207 semantic features and 3,421 formal features
to encode 8,210 word forms and their meanings. A total of 350 items
were ambiguous as to noun or verb status (e.g., TRAINS).
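The encoding scheme above can be sketched as sparse binary vectors over feature nodes. The tiny feature inventories and the `encode` helper below are illustrative assumptions, not the model's actual 3,421-feature formal and 8,207-feature semantic sets:

```python
import numpy as np

# Hypothetical feature inventories (stand-ins for the model's much
# larger syllabic/morphological and WordNet-derived feature sets).
FORM_FEATURES = ["CAT", "TRAIN", "DESERT", "-S", "-ED"]
SEM_FEATURES = ["ISA-ENTITY", "ISA-ACTION", "ISA-MAMMAL", "PLURAL"]

def encode(active, inventory):
    """Return a binary vector with a 1 for each active feature node."""
    vec = np.zeros(len(inventory))
    for unit in active:
        vec[inventory.index(unit)] = 1.0
    return vec

# CATS = a CAT node plus an -S node on the form side; its semantics
# include ISA-ENTITY, ISA-MAMMAL, and PLURAL.
cats_form = encode(["CAT", "-S"], FORM_FEATURES)
cats_sem = encode(["ISA-ENTITY", "ISA-MAMMAL", "PLURAL"], SEM_FEATURES)
```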
For each phrase, the network was presented with each word in
succession (e.g., DESERT TRAINS ARE). The target was the correct
semantics for the current word. Interpretation of category
ambiguous second words was biased by the semantics of the preceding
word. A pair of "interpretation" nodes was connected to the
semantic and context representations, encoding the interpretation
of the phrase as NN (noun-noun) or NV (noun-verb). High-level
semantic features (such as ISA-ENTITY), pragmatic constraints, and
item-specific regularities were all combined by the network and
used to the extent that they were informative. For example, the semantic feature
ISA-ABSTRACTION strongly inhibited the verb sense of CUTS, so items
such as TAX CUTS or TARIFF CUTS tended to be interpreted as NN.
This formalism allows us to examine the interaction of semantic and
lexical biases (TARIFF CUTS has the same phrase plausibility as TAX
CUTS, but TARIFF's frequency biases differ from TAX's), and more
broadly to address the integration of constraints computed
simultaneously across many grain sizes.
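The interpretation-node readout might be sketched as two sigmoid units fed by the concatenated semantic and context patterns. The layer sizes and random weights below are assumptions for illustration, not the trained network:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical sizes; the model's semantic layer had 8,207 features.
n_sem, n_ctx = 16, 8
W = rng.normal(scale=0.1, size=(2, n_sem + n_ctx))  # one row per node (NN, NV)
b = np.zeros(2)

def interpretation(semantics, context):
    """Activity of the NN and NV interpretation nodes, given the
    current semantic pattern and the preceding-word context."""
    x = np.concatenate([semantics, context])
    return sigmoid(W @ x + b)

nn_act, nv_act = interpretation(rng.random(n_sem), rng.random(n_ctx))
```

Measuring the model's phrase preference then reduces to comparing these two activities.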
We tested the model on a set of phrases crossing the first
word's lexical frequency bias toward modifier or head-noun use
with semantic plausibility, computed as the conditional entropy
between the semantic features of the first and second words. We
assessed the model's preference by measuring the activity of the
interpretation nodes. A statistically reliable
effect of both first-word bias and phrase plausibility was found,
replicating effects reported by MacDonald (1993). These and other
results point to the importance of rich semantic representations
and large training corpora in connectionist models of syntactic
processing.
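The conditional-entropy plausibility measure used to construct the test phrases could be computed as follows; the toy co-occurrence counts are an assumption for illustration (the actual model estimated statistics over its 20,000 training triplets):

```python
import math

# Toy joint counts over (first-word feature, second-word feature) pairs.
joint = {("ISA-ABSTRACTION", "ISA-ACTION"): 2,
         ("ISA-ABSTRACTION", "ISA-ENTITY"): 8,
         ("ISA-ENTITY", "ISA-ACTION"): 5,
         ("ISA-ENTITY", "ISA-ENTITY"): 5}

total = sum(joint.values())
marg1 = {}
for (f1, _), count in joint.items():
    marg1[f1] = marg1.get(f1, 0) + count

# H(F2 | F1) = -sum_{f1,f2} p(f1,f2) * log2 p(f2 | f1):
# low conditional entropy means the first word's semantics strongly
# constrain the second word's, i.e. a more predictable phrase.
h_cond = -sum((count / total) * math.log2(count / marg1[f1])
              for (f1, _f2), count in joint.items())
```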