Abstract:
Prior research has demonstrated that
eye movements around a visual scene are closely time-locked to
aspects of the mental processes implicated in sentence processing
(e.g., Tanenhaus et al., 1995; Allopenna et al., 1998). A
range of data suggest that "listeners establish reference
incrementally, rapidly integrating the information in the visual
model with the speech" (Tanenhaus et al., 1995, p. 474). In this
paper, we describe data which demonstrate that sentences are not
mapped onto static representations of the concurrent visual input,
but rather they are mapped onto interpreted, and dynamically
changeable, representations of that input. Participants were
shown a visual scene containing an open box, an open briefcase,
some books, and various other objects. They were told simply
to listen to each sentence and observe each picture. We
contrasted four conditions:
a. The boy will close the box. Then, he will drop the books into the briefcase.
b. The boy will move the box closer. Then, he will drop the books into the box.
c. The boy will close the briefcase. Then, he will drop the books into the box.
d. The boy will move the briefcase closer. Then, he will drop the books into the briefcase.
In each case, the concurrent scene (displayed
throughout the two sentences) was identical (as was the fragment
'Then, he will drop the books'). During the second sentence,
we found that up until verb offset there was a very strong bias to
look towards whichever container had been mentioned in the prior
sentence. In the 'closer' contexts, this bias was maintained
during 'the books' (when the processor is anticipating a subsequent
expression that will potentially refer to the Goal of the action
denoted by the verb); there were more looks towards the 'closer'
(previously mentioned) container than towards the other one.
Crucially, this bias was eliminated in the 'closed' contexts: the
tendency to look towards the previously mentioned (but 'closed')
container was now matched by an equivalent tendency to look
towards the container that had not been 'closed'. Prior
sentential context thus interacted with the concurrent visual input
to allow the rapid integration of the target sentence with a mental
representation of the concurrent visual scene that took account of
the 'changes' introduced to that scene
linguistically. The target sentence was not, therefore,
mapped onto the concurrent visual scene, but was mapped instead
onto a linguistically-mediated representation of that scene.
Our data demonstrate that the 'world' in the visual-world paradigm
is not a visual world at all, but rather a mental world,
potentially mediated by linguistic context, and potentially
somewhat different to the actual world that forms the concurrent
visual input. Thus, 'what you see' is not 'what you get' when
mapping language onto the 'visual' world.
References
Allopenna, P. D., Magnuson, J. S., &
Tanenhaus, M. K. (1998). Tracking the time course of spoken
word recognition using eye movements: Evidence for continuous
mapping models. Journal of Memory and Language, 38(4),
419-439.
Tanenhaus, M. K., Spivey-Knowlton, M. J.,
Eberhard, K. M., & Sedivy, J. C. (1995). Integration of
visual and linguistic information in spoken language
comprehension. Science, 268, 1632-1634.
Tanenhaus, M. K., Spivey-Knowlton, M. J.,
Eberhard, K. M., & Sedivy, J. C. (1996). Using eye
movements to study spoken language comprehension: Evidence for
visually mediated incremental interpretation. In T. Inui
& J. L. McClelland (Eds.), Attention and Performance XVI:
Information integration in perception and communication.
Cambridge, MA: MIT Press/Bradford Books.