Carl Wernicke argued that cortical areas involved in the sensory representation of speech played an important role in speech production (Wernicke, 1874/1969). His argument was based on the clinical observation that the speech output of aphasic patients with posterior lesions in the left hemisphere was fluent but error prone. Modern evidence from both lesion and neuroimaging studies strongly supports Wernicke's claim, and recent work has made progress in identifying an auditory-motor interface circuit for speech.
Developmental considerations make a strong case for the existence of an auditory-motor integration network for speech (Doupe and Kuhl, 1999). In learning to articulate the speech sounds of the local linguistic environment, there must be a mechanism by which (1) sensory representations of speech uttered by others can be stored, (2) articulatory attempts can be compared against these stored representations, and (3) the degree of mismatch revealed by this comparison can be used to shape future articulatory attempts. This auditory-motor integration network remains functional in adulthood, as revealed by adults' ability to repeat pseudo-words accurately and by the degradation of speech output that follows late-onset deafness (Waldstein, 1989).
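Viewed computationally, steps (1)-(3) amount to an error-driven feedback loop: store an auditory target, compare each articulatory attempt against it, and use the mismatch to adjust the next attempt. The following Python sketch is purely illustrative and is not drawn from the source; the identity "forward model" mapping motor parameters to auditory outcomes and all variable names are simplifying assumptions made to keep the example minimal.

```python
# Minimal sketch (hypothetical, not the authors' model) of the three-step
# mechanism: store a target, compare attempts to it, update from the mismatch.
import numpy as np

rng = np.random.default_rng(0)

target = rng.normal(size=8)     # (1) stored auditory representation of a heard word
motor_params = np.zeros(8)      # articulatory command, initially untuned

def articulate(params):
    """Toy forward model: map motor parameters to a predicted auditory outcome.
    An identity mapping is assumed here purely for simplicity."""
    return params

learning_rate = 0.5
for attempt in range(20):
    produced = articulate(motor_params)
    error = target - produced                 # (2) auditory mismatch
    motor_params += learning_rate * error     # (3) error-driven update
    print(f"attempt {attempt:2d}: mismatch = {np.linalg.norm(error):.4f}")
```

With this identity forward model the mismatch simply halves on every attempt; any real articulatory mapping would be nonlinear, but the store-compare-update structure of the loop is the point of the sketch.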
Clinical evidence supports the view that “sensory” cortex participates in speech production. The classical fluent aphasias—Wernicke's aphasia, conduction aphasia, transcortical sensory aphasia, and anomic aphasia—are all associated with left posterior cerebral lesions, that is, with regions that are commonly thought to be sensory in nature. Yet each of these fluent aphasias has prominent speech output symptoms: semantic and/or phonemic paraphasias (speech errors), paragrammatism (inappropriate use of grammatical markers), and anomia (naming difficulties) (Damasio, 1992). This observation demonstrates the general point that posterior “sensory” systems play an important role in speech production.
Evidence relevant to the more specific issue of auditory-motor integration comes from conduction aphasia (Hickok, 2001). A hallmark of conduction aphasia is the predominance of phonemic paraphasias, which can be evident across a wide range of production tasks, including spontaneous speech, naming, reading aloud, and repetition (Goodglass, 1992). The preponderance of phonemic errors has led some authors to characterize conduction aphasia as a selective impairment of phonological encoding for production (Wilshire and McCarthy, 1996). Although the classical model holds that conduction aphasia is a disconnection syndrome involving damage to the arcuate fasciculus (Geschwind, 1965), recent evidence has shown that the syndrome can be caused by damage to, or electrical stimulation of, auditory-related cortical fields in the left superior temporal gyrus (Damasio and Damasio, 1980; Anderson et al., 1999). Neuroimaging data strongly implicate this same region in speech perception (Zatorre et al., 1996; Norris and Wise, 2000), suggesting some degree of overlap in the systems supporting sensory and motor aspects of speech. This argument raises an apparent paradox: damage to a system strongly implicated in speech perception (the left superior temporal gyrus) produces a syndrome, conduction aphasia, characterized predominantly by a production deficit. The paradox can be resolved, however, on the assumption that speech perception is largely bilaterally organized, so that the spared right-hemisphere auditory systems function sufficiently well to support auditory comprehension (Hickok, 2000; Hickok and Poeppel, 2000).
Recent neuroimaging studies have supported and extended findings from the clinical literature. The left superior temporal gyrus has been shown to activate during a variety of speech production tasks (in which speech is produced covertly, so that there is no external auditory input), including picture naming (Levelt et al., 1998; Hickok et al., 2000), repetition (Buchsbaum, Hickok, and Humphries, 2001), and word generation (Wise et al., 1991). Importantly, evidence from an MEG study of picture naming (Levelt et al., 1998) has shown that this left superior temporal activation occurs in a time window preceding articulatory processes, suggesting that the region is involved in retrieving the phonological code in preparation for speaking and is not merely part of a motor-to-sensory feedback mechanism, although such a feedback mechanism may also exist.
Two studies have looked explicitly for overlap between activation associated with speech perception and activation associated with speech production. The first used positron emission tomography to map areas of overlap when participants listened to stories versus performed a verb generation task (Papathanassiou et al., 2000). As expected from the results reviewed earlier, a region of overlap was found in the superior temporal gyrus, predominantly on the left. Additional areas of overlap included inferior temporal regions and portions of the left inferior frontal gyrus. The second study (Buchsbaum et al., 2001) used functional magnetic resonance imaging to map activated regions when subjects first listened to and then covertly rehearsed a set of three multisyllabic pseudo-words. Two left posterior sites responded to both the auditory and motor phases of the trial: a site in the sylvian fissure at the parietal-temporal boundary (area Spt) and a more ventral site in the superior temporal sulcus (STS). Brodmann's area 44 (posterior Broca's area) and a more dorsal premotor site also responded to both phases. The activation time courses of area Spt and area 44 in that study were particularly strongly correlated, suggesting a tight functional relation. A viable hypothesis is that the STS site supports auditory representations of speech and that area Spt serves as an interface system translating between auditory and motor representations of speech. This hypothesis is consistent with recent work in vision demonstrating visuomotor systems in the dorsal parietal lobe that compute coordinate transformations, such as the remapping of retinocentric coordinates to head- and body-centered coordinates, which allow visual information to interface with the various motor-effector systems that act on that input (Andersen, 1997; Rizzolatti, Fogassi, and Gallese, 1997).
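To make the analogy concrete, consider the simplest additive form of such a transformation (a textbook simplification offered here for illustration, not a formula from the studies cited): a stimulus location coded in retinocentric coordinates can be remapped to head-centered coordinates by adding the current eye-in-head position, and to body-centered coordinates by further adding head-on-body position,

\[
\mathbf{x}_{\mathrm{head}} = \mathbf{x}_{\mathrm{retina}} + \mathbf{x}_{\mathrm{eye}}, \qquad
\mathbf{x}_{\mathrm{body}} = \mathbf{x}_{\mathrm{head}} + \mathbf{x}_{\mathrm{neck}},
\]

where \(\mathbf{x}_{\mathrm{eye}}\) denotes eye position relative to the head and \(\mathbf{x}_{\mathrm{neck}}\) head position relative to the body (rotational components are ignored in this translation-only sketch). By analogy, area Spt would implement a comparable translation between auditory and articulatory coordinate frames.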
Sensorimotor interaction is pervasive across many hierarchical levels in the central nervous system. The empirical record supports conceptual arguments for sensorimotor interaction in speech and language and has begun to elucidate sensorimotor cortical circuits for speech. This work helps bridge the gap between functional anatomical models of speech and language and models of the functional organization of cortex more generally (Hickok and Poeppel, 2000).