One of the enduring interests of both neuroscientists and philosophers of science concerns how we perceive the external world, and speculation about the underlying processes involved has been ongoing since long before the advent of formal scientific disciplines (e.g., see Aristotle's De Anima). The rapid pace of technological developments in recent years has markedly advanced research in all the disciplines in which perception is a topic of interest, and which are loosely captured by the term neuroscience. However, much of the history of perceptual research can be characterized as a “sense-by-sense” approach, in which researchers have focused on the functional properties of one sensory modality at a time. These efforts have produced an enormous amount of information about sensory perception at every level of analysis, from the single-cell level to the level of phenomenology. We know far more about each of the senses than could possibly have been imagined by the early philosophers, who nevertheless posed some of the most fundamental questions about the mechanisms underlying our ability to perceive.
However, it is interesting to note that with the specialization of modern research and the tendency to focus on the functional properties of individual senses, an early perspective was set aside, namely, that perception is fundamentally a multisensory phenomenon. There can be no doubt that our senses are designed to function in concert and that our brains are organized to use the information they derive from their various sensory channels cooperatively in order to enhance the probability that objects and events will be detected rapidly, identified correctly, and responded to appropriately. Thus, even those experiences that at first may appear to be modality-specific are most likely to have been influenced by activity in other sensory modalities; indeed, mounting evidence suggests that we are rarely aware of the full extent of these multisensory contributions to our perception.
Researchers now recognize that the everyday environment, outside the highly controlled laboratory of the scientific researcher, engenders a constant influx of sensory information in most of the sensory pathways. The brain's task is to sort through the massive and multiple streams of information it receives and to couple those signals that, regardless of their modality, should be related to one another because they are derived from a common event. At the same time, the brain also needs to keep separate the signals derived from different perceptual events. The final decision concerning what a particular event or object is, and what should be done with (or about) it, is frequently an operation requiring the synthesis of information derived from multiple sensory channels. Thus, to fully appreciate the processes underlying much of sensory perception, we must understand not only how information from each sensory modality is transduced and decoded along the pathways primarily devoted to that sense, but also how this information is modulated by what is going on in the other sensory pathways.
The recognition that a multisensory perspective is necessary to enhance our understanding of sensory perception has led many scientists to adopt a different research strategy in recent years. The result has been the emergence of a distinct field of scientific endeavor that has been loosely designated multisensory integration or multisensory processing. At the same time, an international organization, the International Multisensory Research Forum (IMRF), has been formed that is dedicated to promoting these efforts and has helped to spur progress in this emerging field. Indeed, there has been a dramatic upsurge over the last decade in the number of research programs directed toward understanding how the brain synthesizes information from the different senses, and this expansion of interest shows no sign of abating. This in turn has led to a growing awareness that multisensory processes are quite common and profoundly affect our appreciation of environmental events. The rapid growth of this field has not been restricted to an individual technique, discipline, species, or perspective; advances have been made in multiple species at the levels of the single neuron, networks of neurons, neural modeling, development, and psychophysics. These efforts have yielded data suggesting the presence of remarkable constancies in some of the underlying principles by which the brain synthesizes the different sensory inputs available to it, indicating their broad applicability in very different ecological circumstances. These principles appear to be operative regardless of the specific combination of senses being assessed. Thus, some of the same principles governing the synthesis of visual and auditory information apply equally well to other combinations of the spatial senses, and may even apply to the chemical senses (such as gustation and olfaction) as well. To take but one example close to the hearts of the editors of this volume, the multisensory response enhancement first reported at the single-cell level in the cat (see Stein & Meredith, 1993, for a review) has subsequently been documented in domains as diverse as the combination of olfactory and gustatory cues in flavor perception (e.g., Dalton, Doolittle, Nagata, & Breslin, 2000) and the hemodynamic responses of the human brain to congruent versus incongruent audiovisual speech stimuli (Calvert, Campbell, & Brammer, 2000). It will be particularly interesting in future research to determine how these principles affect the perception of multisensory signals that are critical for social interaction in the animal kingdom (see, e.g., Hughes, 1998; Partan & Marler, 2000).
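To give a concrete sense of what is being measured in such single-cell studies, multisensory response enhancement is conventionally quantified by comparing the response evoked by a cross-modal stimulus combination with the largest response evoked by its component stimuli presented individually. The index below follows the convention reviewed by Stein and Meredith (1993); the symbols are ours:

\[
\mathrm{ME} = \frac{CM - SM_{\max}}{SM_{\max}} \times 100\%,
\]

where $CM$ is the response (e.g., mean number of impulses) to the combined-modality stimulus and $SM_{\max}$ is the response to the most effective single-modality stimulus. Positive values indicate enhancement; negative values indicate response depression.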
Closely related, and of general concern at the perceptual level, is the question of how the brain weighs the inputs it receives from the different senses in producing a final perceptual output or experience. This is an issue with a rich history in experimental psychology. It has been known for some time that when conflicting signals are presented via the different sensory modalities, the emergent percept is typically dominated by the most persuasive sensory cue in that particular context (e.g., Rock & Victor, 1964; Welch & Warren, 1980). These sorts of dominance effects have been reported in a number of species (see, e.g., Wilcoxon, Dragoin, & Kral, 1971), and recent computational modeling has revealed how they may actually reflect an optimal integration of what are inherently noisy perceptual inputs (e.g., Ernst & Banks, 2002).
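The core claim of such optimal-integration accounts can be stated compactly. Assuming that each modality delivers an independent, unbiased estimate $\hat{s}_i$ of the same property, corrupted by Gaussian noise of variance $\sigma_i^2$, the statistically optimal (maximum-likelihood) combined estimate is a reliability-weighted average; this is the formulation tested by Ernst and Banks (2002), although the notation here is ours:

\[
\hat{s} = w_1 \hat{s}_1 + w_2 \hat{s}_2, \qquad w_i = \frac{1/\sigma_i^2}{1/\sigma_1^2 + 1/\sigma_2^2},
\]

with combined variance $\sigma^2 = \sigma_1^2 \sigma_2^2 / (\sigma_1^2 + \sigma_2^2)$, which is always smaller than that of either modality alone. On this view, sensory "dominance" is simply the limiting case in which one modality's estimate is far more reliable than the other's, so that its weight approaches 1.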
Because the relevant literature on multisensory processing is spread across multiple disciplines, it has become increasingly fragmented in recent years. Providing a single source for descriptions of the most notable advances in these diverse areas has been a central motivation in preparing this book. We hope that by bringing together in one place as much of this disparate information as we can, we will provide researchers with a convenient source of knowledge with which to examine how data from very different experimental paradigms relate to one another. This organization allows readers to examine what is known of the principles of multisensory integration that operate at the level of the individual neuron, as well as those that operate at the level of neural networks and thereby encompass many neurons in multiple brain structures. These principles can then be related to the final perceptual and behavioral products that are initiated by multisensory stimuli. It also provides a means of understanding much of the thinking that has guided the construction of computational models of multisensory processes in the past few years.
By juxtaposing discussions of how different brain areas come to be capable of processing their respective sensory information as a consequence of development and sensory experience, and of how the brain is able to coordinate the different reference schemes used by the different senses so that stimuli can be localized regardless of their modality of input, we hope to provide some insight into the monumental biological problems inherent in building a system of sensors that can be used synergistically. In addition, discussions of the fascinating cases of synesthesia and of the susceptibility of multisensory information processing to various kinds of brain damage provide an appreciation of the magnitude of individual variability in these multisensory processes, even among members of the same species.
Organization of the Book
This handbook is organized around eight key themes, with each chapter presented as a state-of-the-art review by leading researchers in the field. Although the chapters are grouped into sections that might appear to be independent of one another, the grouping is merely one of convenience. Many of the themes and models discussed cut across the organizational framework of particular sections, and many of the problems and solutions are referenced in multiple sections.
Part I: Perceptual Consequences of Multiple Sensory Systems
The chapters in Part I focus primarily on multisensory contributions to perception in humans. Included are chapters on the multisensory recognition of objects (Newell), multisensory contributions to the perception of movement (Soto-Faraco and Kingstone), multisensory flavor perception (Stevenson and Boakes), and multisensory texture perception (Lederman and Klatzky). Many of the chapters highlight studies documenting the perceptual consequences of conflicting cues in different senses, such as the ventriloquism effect (Woods and Recanzone) and the cross-modal dynamic capture effect (Soto-Faraco and Kingstone). The various newly discovered multisensory illusions, such as the “freezing” illusion (Vroomen & de Gelder, 2000), the “double flash” illusion (Shams, Kamitani, & Shimojo, 2000), and the “bouncing balls” illusion (Sekuler, Sekuler, & Lau, 1997), are also discussed at some length in several of the chapters (e.g., Shams, Kamitani, and Shimojo; Vroomen and de Gelder). Studying such illusions has helped scientists to understand some of the rules by which the senses interact even if their particular perceptual products do not enhance perceptual performance. Such laboratory conditions of sensory “conflict” also raise important questions about which responses on the part of the participants reflect the genuine products of multisensory integration and which reflect decisions or response-related strategies instead (e.g., see the chapters by Soto-Faraco and Kingstone and by Marks).
Finally, while the impact of prior experience on multisensory integration is addressed most thoroughly in animal studies (see the section on cross-modal plasticity), Stevenson and Boakes provide some fascinating evidence suggesting that our previous experience with specific combinations of tastes and smells in particular foodstuffs may affect the nature of the multisensory interactions that emerge in the perception of flavor. Thus, future studies of multisensory flavor perception may provide a particularly revealing window into the way in which the multisensory interactions that govern our perception and behavior are molded by the sensory environments in which we develop.
Part II: Is Speech a Special Case of Multisensory Integration?
The multisensory perception of speech has already garnered more than its fair share of analysis and commentary in the literature in recent years (e.g., Campbell, Dodd, & Burnham, 1998; Dodd & Campbell, 1987; Massaro, 1998). When current views of speech research are brought under one roof, it is possible to see how some of the rules of audiovisual integration derived for speech perception fit the broader context of multisensory integration. A question that then emerges is whether the processes of multisensory integration taking place in the case of speech are somehow special. We are fortunate to have contributions from researchers on both sides of this debate. In particular, the chapters incorporated here address the issue of whether the type of multisensory integration that influences speech perception is fundamentally different from other forms of multisensory integration. Taking a computational approach to audiovisual speech perception, Massaro argues that data on the McGurk effect can be easily described by a general pattern recognition algorithm (in this case, the fuzzy logical model of perception), which indicates that speech is simply another case of audiovisual integration. A contrary stance is taken by Fowler, who expounds the gestural theory of speech perception. Fowler argues that the results from modality-specific auditory speech experiments are in direct contrast to the conclusions drawn from Massaro's studies based on auditory-visual speech perception experiments. Meanwhile, Munhall and Vatikiotis-Bateson review studies assessing whether the spatial and temporal constraints affecting multisensory speech perception are the same as for other types of audiovisual integration. In their chapter, Bernstein, Auer, and Moore investigate the neural sites specialized for the processing of speech stimuli. Finally, Partan broadens the discussion by reviewing what is currently known about multisensory contributions to animal communication.
Part III: The Neural Mechanisms Underlying the Integration of Cross-Modal Cues
Part III directly approaches the neural mechanisms underlying multisensory integration. A number of model species (e.g., rat, cat, monkey) are used to explore what is known about multisensory processing in the midbrain and cortex. The section begins with two chapters dealing with what is perhaps the best-known model of multisensory integration, the cat superior colliculus (SC) neuron. The first chapter, by Stein, Jiang, and Stanford, discusses the principles guiding the integration of visual, auditory, and somatosensory information in SC neurons and the consequences of multisensory integration for overt orientation. It also introduces the inherent problem of maintaining alignment among sensory representations when the peripheral sensory organs move with respect to one another (a longstanding problem in neuroscience research; e.g., see Pöppel, 1973). This issue is dealt with in regard to the alignment of cortical maps in Part VI. The neural data that have been obtained in SC neurons are then evaluated and modeled in the chapter by Anastasio and Patton, who apply a quantitative Bayesian perspective to the underlying processes. This analysis is followed by a series of chapters dealing with the many cortical regions in which convergent sensory information is processed. Kaas and Collins initiate this discussion with an overview of the monkey cortex and the possible anatomical substrates for multisensory convergence. Schroeder and Foxe, who also use the nonhuman primate model, show that multisensory influences can be exerted very early in the stream of cortical information processing, an issue that has proved to be particularly contentious in the last few years (e.g., Falchier, Clavagnier, Barone, & Kennedy, 2002; Rockland & Ojima, 2003). The rich variety of multisensory interactions should become apparent in the chapter by Rolls, which deals with the chemical senses as well as the special senses in this model. Parker and Easton discuss the integral role of learning and memory in dealing with multisensory events in monkeys, and Meredith provides a broad overview of the neural mechanisms of integration in cat cortex that impact a variety of physiological measures of sensory processing. Finally, Barth and Brett-Green show that multisensory convergence is present in several areas of the rat cortex previously thought to be modality-specific. Together, these chapters use various models to illustrate how different regions of the brain integrate multisensory information in performing their various roles.
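To convey the flavor of Anastasio and Patton's quantitative perspective, a minimal sketch of the Bayesian computation can be given (the specific notation and the conditional-independence assumption here are ours, chosen for simplicity). Let $T \in \{0, 1\}$ indicate the absence or presence of a target, and let $V$ and $A$ denote the visual and auditory inputs available to an SC neuron. The neuron's multisensory response is modeled as reflecting the posterior probability of a target:

\[
P(T = 1 \mid V, A) = \frac{P(V \mid T = 1)\, P(A \mid T = 1)\, P(T = 1)}{\sum_{t \in \{0, 1\}} P(V \mid T = t)\, P(A \mid T = t)\, P(T = t)}.
\]

Multisensory enhancement then arises naturally: two weak but congruent inputs can raise this posterior far more than either input can alone, mirroring the superadditive responses observed physiologically.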
Part IV: Multisensory Mechanisms in Orientation
Some of the early themes in Part III are revisited in Part IV in order to deal specifically with the behavioral consequences of multisensory integration. The research in this section is heavily dependent on the nonhuman primate model, as reflected in each of the chapters included here. In the first two chapters, Van Opstal and Munoz discuss how the integration of auditory and visual information at the level of the individual neuron speeds shifts of gaze and thereby facilitates the localization of external events, and Diederich and Colonius discuss both manual and eye movement control within this context. Lackner and DiZio then address the utility of multisensory integration in the broader circumstances in which body orientation and movement must be controlled. Fogassi and Gallese discuss similar issues, but from the perspective of how an organism's action influences the integration of multisensory cues, and Graziano, Gross, Taylor, and Moore evaluate the role of multisensory neurons in initiating protective movements in response to approaching targets. Then Ishibashi, Obayashi, and Iriki show the existence of multisensory neurons specifically involved in tool use and demonstrate that the experience of using a tool can change the properties of these multisensory neurons. Finally, Cohen and Andersen deal with an inherent problem in coordinating cues from different senses to locate an external event: because each sensory modality uses a different coordinate system to represent space, the brain has had to find an efficient way of interrelating these schemes in order to use cross-modal cues cooperatively.
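The reference-frame problem that Cohen and Andersen address can be illustrated with a deliberately simplified example (the equation below is our illustration, not a summary of their specific model). An auditory target is initially localized in head-centered coordinates, $x_{\text{head}}$, because the cues that specify its position are defined relative to the two ears, whereas a saccade toward that target requires an eye-centered location. Given the current eye-in-head position $e$, the remapping is, to a first approximation, a subtraction:

\[
x_{\text{eye}} = x_{\text{head}} - e,
\]

which makes clear that cross-modal localization requires postural signals (eye, head, and body position) in addition to the sensory cues themselves.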
Part V: Human Brain Studies of Multisensory Processes
The introduction of modern neuroimaging techniques has enabled researchers to examine the neural consequences of multisensory integration in the human brain (e.g., Calvert et al., 1997; Macaluso, Frith, & Driver, 2000). The chapters in Part V span a range of human imaging techniques, including electroencephalography (EEG), magnetoencephalography (MEG), positron emission tomography (PET), and functional magnetic resonance imaging (fMRI). The excellent spatial resolution of methods such as PET and fMRI has been exploited to localize the neuronal networks involved in multisensory integration, while the superior temporal resolution of electromagnetic techniques such as EEG and MEG has yielded vital insights into the time course of multisensory interactions during distinct cross-modal operations. On the basis of these studies, researchers are now beginning to construct a picture of multisensory integration in humans that includes parallel stages of convergence at both early (see the chapter by Fort and Giard) and late (see the chapters by Calvert and Lewis and by Raij and Jousmäki) stages of information processing in what have traditionally been considered primary sensory areas, as well as in the better-known multisensory cortical regions. It is interesting that despite the considerable variability among the different techniques, paradigms, and analytic strategies used, similar principles of multisensory integration reported at the cellular level in nonhuman species appear to be operative in large populations of neurons in humans. These processes also appear to be sensitive to attentional state (see the chapters by Fort and Giard, Eimer, and Macaluso and Driver), and the emotional context of an experience may reflect the synthesis of cross-modal cues as readily as the detection and localization of events (see Chapter 35, by O'Doherty, Rolls, and Kringelbach, and Chapter 36, by de Gelder, Vroomen, and Pourtois).
Part VI: The Maturation and Plasticity of Multisensory Processes
Many of the multisensory processes discussed in previous parts require postnatal sensory experience to emerge, and this part deals with the role of such experience. A number of animal models have been used in these studies with the objective of manipulating experience in one sensory modality and determining its consequences for perception in other modalities. Part VI begins with a chapter by King, Doubell, and Skaliora on the impact of experience on aligning the visual and auditory maps in the ferret midbrain. These issues, and the presumptive neural circuits involved in effecting this spatial register between the visual and auditory maps in the midbrain, are then dealt with in the owl in Chapter 38, by Gutfreund and Knudsen. In Chapter 39, by Wallace, these issues are extended beyond the spatial register of visual and auditory midbrain maps to the impact of experience on the ability of neurons to synthesize cross-modal cues. Together, these chapters help clarify the ontogeny of the fundamental multisensory processes in the midbrain and their impact on overt orientation. The discussion then moves from orientation to perception. Chapter 40, by Lickliter and Bahrick, and Chapter 41, by Lewkowicz and Kraebel, discuss the maturation of multisensory processing in human infants, providing a timely update on Lewkowicz and Lickliter's (1994) seminal edited volume on this topic. These authors deal with some longstanding controversies regarding whether perception develops from a state in which the senses are differentiated at birth and become integrated later through experience, or vice versa. Examples from multiple species are brought to bear on these arguments, thereby demonstrating that many of the principles underlying multisensory processes are highly conserved across species.
Part VII: Cross-Modal Plasticity
The chapters in Part VII address some of the basic issues about the possible equivalence of brain regions thought to be genetically crafted to deal with input from a given sensory modality. In Chapter 42, Sur describes experiments in which early surgical intervention forces signals from one sense to be directed to the cortex normally used to deal with information from another. The ability of these cortices to deal with these redirected inputs is striking and raises questions about the driving force behind brain specialization. Drawing on the results of animal experiments, Rauschecker examines how depriving the brain of information from one sense affects the functional properties of the remaining senses. This issue is of particular concern for those who have lost the use of one sensory system (such as the blind or deaf), and the consequences for human perception and performance are examined in detail in the chapters by Röder and Rösler and by Bergeson and Pisoni.
Part VIII: Perspectives Derived from Clinical Studies
What happens to the multisensory processes underlying human perception when the normal processes of multisensory integration break down? That is the topic addressed by the chapters assembled in this final part of the handbook. Làdavas and Farnè in Chapter 50 and Maravita and Driver in Chapter 51 discuss what happens when brain damage results in attentional deficits such as extinction and neglect. Persons with these deficits can fail to respond to a variety of sensory stimuli situated in the side of space contralateral to their brain lesion. Synesthesia, a condition in which a particular sensory event in one modality produces an additional sensory experience in a different modality, provides some fascinating insights into the multisensory brain. Research in this area has progressed very rapidly in the last few years with the advent of increasingly sophisticated psychophysical paradigms, which, combined with neuroimaging technologies, are now providing some provocative insights into the differences in the connections and patterns of brain activation in synesthetic, as compared to nonsynesthetic, individuals. These results have led to the development of a variety of neurocognitive theories of synesthesia, which are discussed in Chapter 53, by Mattingley and Rich, and in Chapter 54, by Ramachandran, Hubbard, and Butcher.