Introduction
Speech perceivers are informational omnivores. Although the acoustic medium provides necessary and sometimes sufficient information for phonetic perception, listeners use other sources as well. The surface of the speaking face provides surprising amounts of phonetic information (e.g., Yehia & Vatikiotis-Bateson, 1998), at least some of which observers use when it is available. In noisy settings, perceivers who can see the face of a speaker achieve more accurate percepts than those who cannot (e.g., Sumby & Pollack, 1954). Moreover, given appropriate dubbings of acoustic syllables or words onto visible facial gestures for other syllables or words, perceivers integrate information from the two modalities (the McGurk effect; e.g., Brancazio, in press; Massaro, 1998; McGurk & MacDonald, 1976). For example, acoustic ma dubbed onto visible da is identified predominantly as na, an outcome that integrates visible information for place of articulation with acoustic information for manner and voicing. This outcome can be phenomenally striking; people hear one syllable with their eyes open and a different one with their eyes closed.
Another perceptual system that provides useful information about speech is the haptic system. Some individuals who are deaf and blind have learned to talk by placing their hands on the face of (and, in Helen Keller's case, sometimes in the mouth of) a speaker (Lash, 1980; Chomsky, 1986). Moreover, naive individuals with normal sight and hearing show a haptic version of the McGurk effect. With their hands, in surgical gloves, placed over the mouth and jaw of a speaker who mouths ga, perceivers identify more of the syllables along an acoustic continuum from ba to ga as ga (Fowler & Dekle, 1991).
How should we understand speech perception such that phonetic perception can be achieved in all of these ways?
An analogous question arises when we consider speech production as perceptually guided action. Consider first speech production guided by the perception of one's own speech. When speakers' own acoustic signals are fed back to them transformed in some way, their speech is affected. For example, in the Lombard effect, speakers increase their vocal amplitude in the presence of noise (e.g., Lane & Tranel, 1971). When feedback about vowel production is transformed acoustically, speakers change the way they produce vowels as if in compensation (Houde & Jordan, 1998). Hommel, Müsseler, Aschersleben, and Prinz (in press) raise the more general question: how can percepts communicate with action plans? Aren't they coded in different ways, percepts as representations of features of the stimulus input and action plans as some kind of motoric specification? How should we understand speech production and speech perception such that perceived speech can affect produced speech?
Finally, a related question arises when we consider speech as a communicative activity taking place between people. In cooperative conversations, speakers may converge in their dialects, vocal intensity, speaking rate, and rate of pausing (see Giles, Coupland, & Coupland, 1991, for a review). More generally, listeners perceive the phonological message that talkers meant to convey. Speakers talk by producing actions of the vocal tract articulators; ultimately, their speech plan must be to produce those actions. For their part, listeners receive acoustic speech signals, signals that have acoustic, not motoric, features. How can speakers communicate with listeners? How can we understand speaking and listening such that a perceived dialect can affect a produced one and, more fundamentally, such that a listener can perceive a talker's phonological message? Each of these questions is addressed in this chapter.