Abstract:
Experience-based theories of structural
disambiguation preferences (e.g., Mitchell et al., 1995) claim
that people disambiguate in a way that is consistent with their
past experience of syntactic configurations. If this is
correct, it is important to ask which features of linguistic
experience are used to inform people's disambiguation
preferences. One way to consider this question is to test
the effect of different linguistic "experiences" on an explicit
model, thus generating predictions for human sentence
processing.
We have previously presented a recursive neural
network model trained to disambiguate by recognising
the correct partial tree (henceforth "incremental tree") spanning
the sentence from the first word to the current word, given a
(usually very large) set of alternatives generated from a
large-scale treebank grammar. The model has been shown to
capture some well-known structural preferences in human
parsing. Here we test the effect of linguistic experience
on this model.
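As an illustration, the disambiguation step amounts to
scoring each candidate incremental tree and selecting the
best one. A minimal sketch in Python, with the trained
network abstracted as a scoring function (the names here are
hypothetical, not the model's actual interface):

    def disambiguate(candidates, recursive_net_score):
        """Pick the incremental tree the network prefers.

        candidates: non-empty list of candidate incremental
            trees, each spanning the sentence from the first
            word to the current word.
        recursive_net_score: callable mapping a tree to a
            float, standing in for the trained recursive
            network's goodness score.
        """
        return max(candidates, key=recursive_net_score)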
Experiment 1. One network was trained on
full incremental trees from a large treebank sample.
Another was trained on reduced trees from the same sample, from
which we removed all nodes not c-commanding the right frontier of
the incremental tree. The tree reduction had no adverse
effects on disambiguation performance; in fact, performance
improved (81.2% vs. 85.0% correct choice). This shows that
the model does not use information that is deeply embedded
beyond the right frontier (cf. the Right Roof Constraint; Ross,
1967).
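A minimal sketch of such a reduction, assuming trees are
represented as (label, children) tuples; the representation
and the treatment of unary branches are simplifications, not
the authors' implementation:

    def reduce_tree(node, on_spine=True):
        # node is a (label, children) pair; children == [] at a leaf.
        label, children = node
        if not children:
            return (label, [])
        if on_spine:
            # The rightmost child continues the right spine; its
            # siblings c-command the right frontier, so they are
            # kept, but their internal structure is pruned.
            kept = [reduce_tree(child, on_spine=False)
                    for child in children[:-1]]
            kept.append(reduce_tree(children[-1], on_spine=True))
            return (label, kept)
        # Off the spine: keep the node itself but drop its
        # descendants, which do not c-command the right frontier.
        return (label, [])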
Experiment 2. Syntactic disambiguation
requires choosing the correct attachment site (anchor), and the
correct tree fragment (connection path) to connect the current
word with the previous incremental tree. Experiment 2
showed that the network has very high accuracy in anchor
prediction (91.5%), and that, given the correct anchor, its
performance can almost be matched by choosing the most frequent
connection path. Thus, the network appears to give priority
to its choice of anchor over its choice of connection
path.
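The frequency baseline can be sketched as a simple table of
connection-path counts; the conditioning keys used here
(anchor category and the current word's part of speech) are
our assumption, chosen for illustration:

    from collections import Counter

    def train_path_baseline(attachments):
        # attachments: iterable of (anchor_label, word_pos, path)
        # triples observed in training data.
        counts = {}
        for anchor, pos, path in attachments:
            counts.setdefault((anchor, pos), Counter())[path] += 1
        return counts

    def most_frequent_path(counts, anchor, pos):
        # Given the correct anchor, return the connection path
        # seen most often in training, if any.
        paths = counts.get((anchor, pos))
        return paths.most_common(1)[0][0] if paths else None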
Experiment 3. Several networks were trained,
each on a sample of text identical to the others apart from the
relative frequencies of high and low relative clause attachments
(0% to 100% low attachment). The resulting networks were
tested on a sample of unseen relative clause ambiguities from the
treebank. The different biases clearly affected the
network's preferences, although there was also an underlying
low-attachment bias; even the network trained with 50% high and
50% low attachments showed a reliable low-attachment
preference. A further network, trained on a sample with no
relative clause ambiguities, also showed a reliable
low-attachment bias. Thus, the network exploits underlying
biases as well as experience of specific examples, and it is able
to generalise from experience to process novel
ambiguities.
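The construction of the biased training samples can be
sketched as resampling the attachment of each relative clause
ambiguity in an otherwise fixed text; the random thresholding
below is one illustrative way to realise a target proportion,
not necessarily the authors' procedure:

    import random

    def build_biased_sample(rc_items, p_low, seed=0):
        # rc_items: (low_attachment_tree, high_attachment_tree)
        # pairs for the relative clause sentences in the shared
        # text sample.
        # p_low: target proportion of low attachments (0.0-1.0).
        rng = random.Random(seed)
        return [low if rng.random() < p_low else high
                for low, high in rc_items]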