Abstract:
Many models of parsing assume that the processor behaves
incrementally, reading and interpreting its input from left to
right without delay. A natural assumption to make is that the
parser maintains a fully connected syntactic structure as each
word is read (e.g. the "Left-to-Right" constraint of Frazier and
Rayner, 1988), and this view has recently received empirical
support in a number of studies, including those investigating
head-final constructions (Bader and Lasser, 1994; Yamashita,
1994; Hemforth et al., 1994).
However, if we accept incrementality, we also have to accept
that some structure building and syntactic disambiguation cannot
be directly lexically driven (see Crocker, 1994; Lombardo and
Sturt, 1997), posing important questions for theories which claim
a lexical basis for these processes (Pritchett, 1992; MacDonald
et al., 1994). Consider the following sentence fragment, for
example.
(1) John thought my... (brother was ill).
Here, the word "my" can only be incorporated into a
connected grammatical representation via a path of nodes,
some of which have not yet been licensed by lexical input
(e.g. the NP which will later be headed by "brother," and the
clause which will be headed by "was"). The construction of such
paths of nodes, which we will call "connection paths," introduces
serious non-determinacy into fully incremental parsing,
particularly when we consider realistic wide-coverage
grammars. If human perceivers construct such connection paths,
they must employ systematic strategies and constraints, which by
definition cannot be purely lexically driven. Our goal in the
research reported in this abstract is to determine the nature of
these strategies.
In order to consider this question, we have constructed a
parsing simulation algorithm, which takes a treebank as its
input, and builds a database of the connection paths required to
arrive at the correct parse of each tree, assuming an incremental
parsing algorithm. By examining these connection paths, we intend
to analyse the requirements of non-lexical structure building in
the incremental parsing of unrestricted language, and also to
determine any systematic heuristics that may be at work, with
the goal of defining general principles that guide the processor in
such cases. We also intend to examine the extent to which lexical
information (for example, subcategorization information
associated with nodes on the right frontier) can influence the
construction of connection paths. We will provide both a detailed
description of the algorithm and a discussion of the results.
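As a rough illustration of what such a database might contain, the
following sketch extracts one connection path per word from a small
hand-built tree. It is a minimal sketch under stated assumptions, not
the algorithm reported here: the tuple encoding of trees, the bracketing
chosen for sentence (1), the names connection_paths and build_database,
and the simplification that a connection path is just the list of nodes
on the root-to-word path not already built for earlier words are all
hypothetical.

```python
from collections import Counter

# Assumed encoding: a node is (label, child, ...); a word is a plain string.
# The bracketing of sentence (1) below is illustrative only.
EXAMPLE = (
    "S",
    ("NP", ("N", "John")),
    ("VP",
     ("V", "thought"),
     ("S",
      ("NP", ("Det", "my"), ("N", "brother")),
      ("VP", ("V", "was"), ("AP", ("A", "ill"))))),
)


def connection_paths(tree):
    """Return [(word, labels)] where labels are the nodes that must be
    newly built to attach the word to the structure built so far."""
    results = []
    seen = set()  # tree addresses of nodes built for earlier words

    def walk(node, address, path):
        label, *children = node
        path = path + [(address, label)]
        for i, child in enumerate(children):
            if isinstance(child, str):
                # Nodes on the root-to-word path not yet in the structure.
                new = tuple(lab for addr, lab in path if addr not in seen)
                seen.update(addr for addr, _ in path)
                results.append((child, new))
            else:
                walk(child, address + (i,), path)

    walk(tree, (), [])
    return results


def build_database(treebank):
    """Count how often each connection path occurs across a set of trees."""
    counts = Counter()
    for tree in treebank:
        counts.update(path for _, path in connection_paths(tree))
    return counts


paths = dict(connection_paths(EXAMPLE))
print(paths["my"])  # ('S', 'NP', 'Det')
```

Under this encoding, the path for "my" contains the embedded S and the
NP, neither of which has yet been licensed by its lexical head, which is
the situation described for fragment (1).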