From Towards a Science of Consciousness 3 Section 4: Vision and Consciousness -- Introduction CogNet Proceedings
Standard accounts of vision implicitly assume that the purpose of the visual system is to construct some sort of internal model of the world outside, a kind of simulacrum of the real thing, which can then serve as the perceptual foundation for all visually derived thought and action. The association of rich and distinctive conscious experiences with most of our perceptions gives credence to the idea that they must constitute a vital and necessary prerequisite for all of our visually based behavior.
But even though the perceptual representation of objects and events in the world is an important function of vision, it should not be forgotten that vision evolved in the first place, not to provide perception of the world per se, but to provide distal sensory control of the many different movements that organisms make. Many of the visual control systems for the different motor outputs evolved as relatively independent input-output modules. Thus, the different patterns of behavior exhibited by vertebrates, from catching prey to avoiding obstacles, can be shown to depend on independent pathways from the visual receptors through to the motor nuclei, each pathway processing a particular constellation of inputs and each evoking a particular combination of effector outputs.
Of course, the visually guided behavior of many animals, particularly complex animals such as humans, is not rigidly bound to a set of visuomotor modules, however subtle those mechanisms might be. Much of our behavior is quite arbitrary with respect to sensory input and is clearly mediated by some sort of internal model of the world in which we live. In other words, representational systems have evolved: systems that permit the brain to model the world, to identify objects and events, to attach meaning and significance to them, and to establish their causal relations. In humans and other primates, vision provides some of the most important inputs to these representational systems. Such systems are not linked directly to specific motor outputs but are linked instead to cognitive systems subserving memory, semantics, planning, and communication. Of course the ultimate function even of these higher-order systems has to be the production of adaptive behavior. The distinction between systems of this kind and the dedicated visuomotor modules described earlier is that the former enable us to select appropriate courses of action with respect to patterns of visual input, while the latter provide the immediate visual control required to execute those actions.
In our book The Visual Brain in Action, we argue that these two broad kinds of vision can be distinguished not only on functional grounds, but also by the fact that they are subserved by anatomically distinct substrates in the brain. Thus the distinction between vision for action and vision for perception helps us to understand the logic lying behind the organization of the visual pathways in the brain.
[Insert figure 11.1 about here; figures not yet available]
Evolution has provided primates with a complex patchwork of visual areas occupying the posterior 50 percent or so of the cerebral cortex (for review see Zeki 1993). But despite the complexity of the interconnections between these different areas, two broad "streams" of projections have been identified in the macaque monkey brain, each originating from the primary visual area (V1): a ventral stream projecting eventually to the inferior temporal (IT) cortex, and a dorsal stream projecting to the posterior parietal (PP) cortex (Ungerleider and Mishkin 1982). The two streams and the cortical regions to which they project are illustrated in figure 11.1. Of course, these regions also receive inputs from a number of other subcortical visual structures, such as the superior colliculus (via the thalamus). Although some caution must be exercised in generalizing from monkey to human, it seems likely that the visual projections from primary visual cortex to the temporal and parietal lobes in the human brain will involve a separation into ventral and dorsal streams similar to those seen in the monkey.
In 1982, Ungerleider and Mishkin argued that the two streams of visual processing play different but complementary roles in the perception of incoming visual information. According to their original account, the ventral stream plays a critical role in the identification and recognition of objects, while the dorsal stream mediates the localization of those same objects. Some have referred to this distinction in visual processing as one between object vision and spatial vision: "what" versus "where." Apparent support for this idea came from work with monkeys. Lesions of inferior temporal cortex produced deficits in the animals' ability to discriminate between objects on the basis of their visual features but did not affect their performance on a spatially demanding "landmark" task. Conversely, lesions of the posterior parietal cortex produced deficits in performance on the landmark task but did not affect object discrimination learning. Although the evidence available at the time fitted well with Ungerleider and Mishkin's proposal, recent findings from a broad range of studies in both humans and monkeys are more consistent with a distinction not between subdomains of perception, but between perception on the one hand and the guidance of action on the other.
One source of evidence for the perception-action distinction comes from the study of the visual properties of neurons in the ventral and dorsal streams. Neurons in ventral stream areas such as IT are tuned to the features of objects, and many of them show remarkable categorical specificity; some of these category-specific cells maintain their selectivity irrespective of viewpoint, retinal image size, and even color. They are little affected by the monkey's motor behavior, but many are modulated by how often the visual stimulus has been presented and others by whether or not it has been associated with reward. Such observations are consistent with the suggestion that the ventral stream is more concerned with the enduring characteristics and significance of objects than with moment-to-moment changes in the visual array.
Neurons in the dorsal stream show quite different properties from those in the ventral stream. In fact, the visual properties of neurons in this stream were discovered only when methodological advances permitted the experimenter to record from awake monkeys performing visuomotor tasks. Different subsets of neurons in PP cortex turned out to be activated by visual stimuli as a function of the different kinds of responses the monkey makes to those stimuli. For example, some cells respond when the stimulus is the target of an arm reach; others when it is the object of a grasp response; others when it is the target of a saccadic eye movement; others when the stimulus is moving and is followed by a slow pursuit eye movement; and still others when the stimulus is stationary and the object of an ocular fixation. In addition, of course, there are many cells in the dorsal stream, as there are in the ventral stream, that can be activated passively by visual stimuli; indeed, logic requires that the visuomotor neurons must receive their visual inputs from visual cells that are not themselves visuomotor. These purely visual neurons are now known to include some that are selective for the orientation of a stimulus object. One important characteristic of many PP neurons is that they respond better to a visual stimulus when the monkey is attending to it, in readiness to make a saccadic or manual response. This phenomenon is known as neuronal enhancement.
The electrophysiology can readily explain why posterior parietal lesions impair landmark task performance: quite simply, the monkey fails to orient toward the landmark. Recent behavioral studies bear out this interpretation. The electrophysiology also explains one of the most obvious effects of PP lesions, namely the monkeys' inability to reach accurately to grasp a moving or stationary food morsel, and why they fail to shape and orient their hands and fingers appropriately to pick up the morsel. The most recent development in this area has been the elegant experiments of Gallese and his colleagues (1997). They have demonstrated that micro-injections of a drug (muscimol) into a particular part of the PP cortex will cause a temporary impairment in hand shaping when the monkey reaches to grasp objects. This fits well with the recent discovery of visually responsive cells within that same part of PP cortex, as well as in anatomically linked areas of premotor cortex, which respond selectively during the grasping of particular objects (Sakata et al. 1997; Rizzolatti et al. 1988). Such evidence is consistent with the proposal that visual networks in the dorsal stream compute more than just spatial location. Indeed, in agreement with the electrophysiology, the behavioral literature is fully consistent with the idea that the dorsal stream has a primary role in mediating the visual control and guidance of a wide range of behavioral acts (Milner and Goodale 1993). Furthermore, even though the egocentric locations of visual targets are indeed computed within the PP cortex, it has now been clearly shown that this is done separately for guiding movements of the eyes and for movements of the hands, both in the monkey brain (Snyder et al. 1997) and in the human brain (Kawashima et al. 1996).
While lesions of one system (the dorsal stream) can thus disrupt visuomotor control without affecting perception, the converse is also true. The classic studies of bilateral temporal lobe lesions in monkeys showed unequivocally that visual recognition was severely affected (Klüver and Bucy 1938), but the investigators noticed that the monkeys retained a wide range of visuomotor skills. For example, they observed that the lesioned monkeys did not bump into obstacles or misjudge distances when jumping. In a more recent study, IT-lesioned monkeys that had failed to learn a pattern discrimination despite many weeks of training, nevertheless remained highly adept at catching gnats flying within the cage room. In another study, inferotemporal monkeys were found able to track and seize a rapidly and erratically moving peanut. Thus the evidence from IT lesions allows us to delineate a range of residual visual skills that do not depend on the ventral stream.
The same dissociations following brain damage have been observed in humans. The first systematic description of a patient with bilateral posterior parietal damage was published by Bálint (see Harvey 1995). Bálint's patient had three major groups of symptoms: attentional (including a narrowing of visual attention), visuomotor (what Bálint called optic ataxia), and oculomotor (fixed gaze). Optic ataxia was manifest as a difficulty in accurately reaching in space to pick up objects with the right hand. In many respects, these disorders closely resemble those seen in the PP-lesioned monkey. In both monkey and man, for example, optic ataxia appears to be visuomotor rather than purely visual or purely motor.
Accordingly, similar lesions in the superior parietal lobule and the neighboring intraparietal sulcus also cause difficulties in executing visually controlled saccadic eye movements in space. Furthermore, patients with optic ataxia not only fail to reach in the right direction but also have difficulty orienting their hand and forming their grasp appropriately with respect to target objects. For example, Perenin and Vighetto (1988) found that their optic ataxic subjects made errors in hand rotation as they tried to reach toward and into a large oriented slot. Often such patients are also unable to use visual information to form their grip as they reach toward an object. Although a normal individual opens the hand in anticipation of the target object, the maximum aperture being scaled in proportion to the size of the object, patients with lesions in the superior parietal cortex often show deficient grip scaling as they reach out to pick up an object (Jeannerod 1986). Yet despite the failure of these patients to orient their hands, to scale their grip appropriately, or to reach toward the right location, they have comparatively little difficulty in giving perceptual reports of the orientation and location of the very objects they fail to grasp. [Insert figure 11.2 about here]
On the other side of the equation, an impairment of ventral stream function seems to occur in humans who suffer from the condition known as visual form agnosia. The classic case of this disorder was described by Benson and Greenberg (1969). Their patient was not only unable to recognize faces or objects, he could not even reliably identify geometric shapes visually, nor distinguish reliably between a square and a rectangle with a 2:1 aspect ratio. Yet the patient was certainly not cortically blind. Recently we have described a very similar patient, D. F. (Milner et al. 1991). We have examined her spared abilities to use visual information in a series of experimental studies. We have found that her attempts to make a perceptual report of the orientation of an oriented slot show little relationship to its actual orientation, whether her reports are made verbally or by manual means. However, when she is asked to insert her hand or a hand-held card into the slot, she shows no difficulty, moving her hand or the card toward the slot in the correct orientation and inserting it quite accurately. Video recordings have shown that her hand begins to rotate in the appropriate direction as soon as it leaves the start position. In short, although she cannot report the orientation of the slot, she can insert her hand or post a card into it with considerable skill. This dissociation is illustrated in figure 11.2.
Similar dissociations between perceptual report and visuomotor control were also observed in D. F. when she was asked to deal with the intrinsic properties of objects such as their size and shape. Thus, she showed excellent visual control of anticipatory hand posture when she was asked to reach out to pick up blocks of different sizes that she could not distinguish perceptually. Just like normal subjects, D. F. adjusted her finger-thumb separation well in advance of her hand's arrival at the object, and scaled her grip size in a perfectly normal and linear fashion in relation to the target width (Goodale et al. 1991). Yet when she was asked to use her finger and thumb to make a perceptual judgment of the object's width on a separate series of trials, D. F.'s responses were unrelated to the actual stimulus dimensions, and showed high variation from trial to trial.
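As a rough illustration of how grip scaling of this kind might be quantified, the following sketch regresses maximum grip aperture on object width and reports the slope and the Pearson correlation. It is only a minimal sketch under stated assumptions: the function name `slope_and_r` and all of the numbers are invented for the example and are not data from the studies cited here. One series mimics anticipatory grip during grasping; the other mimics manual size estimates that bear no systematic relation to width.

```python
from math import sqrt


def slope_and_r(widths, apertures):
    """Return the least-squares slope and Pearson r of aperture on width."""
    n = len(widths)
    mean_w = sum(widths) / n
    mean_a = sum(apertures) / n
    # Sums of squares and cross-products about the means.
    sww = sum((w - mean_w) ** 2 for w in widths)
    saa = sum((a - mean_a) ** 2 for a in apertures)
    swa = sum((w - mean_w) * (a - mean_a) for w, a in zip(widths, apertures))
    return swa / sww, swa / sqrt(sww * saa)


# Invented values in millimetres, NOT measurements from any cited study.
widths = [20, 30, 40, 50]
grasp_apertures = [62, 70, 79, 87]     # aperture grows with object width
estimate_apertures = [55, 41, 60, 44]  # no systematic relation to width

grasp_slope, grasp_r = slope_and_r(widths, grasp_apertures)
estimate_slope, estimate_r = slope_and_r(widths, estimate_apertures)

print(f"grasping:   slope={grasp_slope:.2f}, r={grasp_r:.2f}")
print(f"estimation: slope={estimate_slope:.2f}, r={estimate_r:.2f}")
```

On this way of summarizing the data, a dissociation like the one described above would appear as a clearly positive slope and high correlation in the grasping series alongside a weak, unsystematic relation in the estimation series.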
D. F.'s accurate calibration of grip size during reaching to grasp contrasts markedly with the poor performance of optic ataxic patients with occipito-parietal damage. D. F. is as adept as normal subjects in many grasping tasks. In a recent study (Carey et al. 1996), for example, we have shown that when reaching to pick up rectangular shapes that varied in their orientation as well as their width, D. F. showed simultaneously both normal sensitivity to orientation and normal sensitivity to width. She is not entirely normal in dealing with complex shapes however. We found no evidence, for example, that she is able to deal with two different orientations present in a single target object, such as a cross, when reaching to grasp it. Yet, despite this difficulty with two oriented contours, we have found some evidence that the gross shape of an object can influence where D. F. places her fingers when picking it up (Goodale et al. 1994b, Carey et al. 1996).
If, then, we make the plausible assumption that the ventral stream is severely damaged and/or disconnected in D. F. (an assumption that is quite consistent with her pattern of brain damage), it is reasonable to infer that the calibration of these various residual visuomotor skills must depend on intact mechanisms within the dorsal stream. The visual inputs to this stream, which provide the necessary information for coding orientation, size, and shape, could possibly arise via V1, or via the collicular-thalamic route, or via both. Both routes would appear to be available to D. F., since MRI evidence indicates a substantial sparing of V1 in this patient, with no suggestion of collicular or thalamic damage. Patients with lesions of V1, however, although in some cases able to perform such visuomotor tasks at an above-chance level ('blindsight': Perenin and Rossetti 1996, Rossetti 1998), do so far less proficiently than D. F. We therefore believe that the collicular-pulvinar route alone cannot account for her preserved abilities.
Our various studies of D. F. show that she is able to govern many of her actions using visual information of which she has no awareness. But it is clear that this is only true of actions that are targeted directly at the visual stimulus. She cannot successfully use the same visual information to guide an identical but displaced response, that is, a response using the same distal musculature but at another location. Presumably the difference is that a response displaced in this way is necessarily an arbitrary or symbolic one, not one that would fall within the natural repertoire of a hard-wired visuomotor control system. Thus D. F. seems to be using a visual processing system dedicated to motor control, which will normally only come into play when she carries out natural goal-directed actions.
There are temporal as well as spatial limits on D. F.'s ability to drive her motor behavior visually. After showing her a rectangular block, Goodale et al. (1994a) asked D. F. to delay for either 2 or 30 seconds with eyes closed, before allowing her to reach out as if to grasp it. Even after a 30-second delay, the preparatory grip size of normal subjects still correlated well with object width. In D. F., however, all evidence of grip scaling during her reaches had evaporated after a delay of even 2 seconds. This failure was not due to a general impairment in short-term memory. Instead, it seems that a delayed reach is no longer a natural movement, and indeed this is so even for normal subjects. A detailed kinematic analysis of the control subjects showed that they moved their hand abnormally in the delay conditions, as if their apparently normal grip scaling was actually generated artificially by imagining the object and then "pantomiming" the grasp. This pantomiming strategy would not have been open to D. F., since she could not have generated a visual image of something that she failed to perceive in the first place. Presumably the visual processing that is available to her has a very short time constant, because it is designed to deal with present or imminent states of the visual world, and to disregard past states that may no longer be relevant (for example as a result of self-motion). Rossetti (1998) has recently described a similar loss of visuomotor control in the hemianopic field of a "blindsight" patient following a brief delay. Perhaps more surprisingly, we have recently observed a complementary improvement in visuomotor performance in a bilateral optic ataxic patient (A. T.) after a 5-second delay. Presumably in this case the patient was able to throw off the dominance of the dorsal stream under the delay condition, allowing her to make use of her better-preserved ventral system.
One of the ways in which the visual information used by the motor system can be shown to be quite different from that which we experience perceptually is through the study of visual illusions. Gregory (1997) has argued over many years that higher-level visual illusions, including geometric illusions, deceive the perceptual system because the system makes (false) assumptions about the structure of the world based on stored knowledge. These include, for example, assumptions about perceptual stability and spatial constancy. It seems that the dorsal system, by and large, is not deceived by such spatial illusions (Bridgeman et al. 1979, 1981; Wong and Mack 1981; Goodale et al. 1986), perhaps because evolution has taught it that a little "knowledge" can be quite literally a dangerous thing. Instead, the dorsal stream directs our saccadic eye movements and our hand movements to where a target really is, which is not always where our perceptual system tells us it is. Similarly, under appropriate circumstances geometric illusions can be seen to affect visually guided reaching (Gentilucci et al. 1996) and grasping (Aglioti et al. 1995; Brenner and Smeets 1996; Haffenden and Goodale 1998) far less than they affect our perceptual judgments (see figures 11.3 and 11.4). Thus we may perceive an object as bigger than it really is, but we open our finger-thumb grip veridically when reaching for it.
We propose that the processing accomplished by the ventral stream both generates and is informed by stored abstract visual knowledge about objects and their spatial relationships. We further surmise that the particular kinds of coding that are necessary to achieve these ends coincide with those that render the representations accessible to our awareness. This would fit with the idea that coded descriptions of enduring object properties, rather than transitory egocentric views, are precisely what we need for mental manipulations such as those required for the planning of action sequences and the mental rehearsal of alternative courses of action.
But of course, the mere fact that processing occurs in this generalized way in the ventral stream could not be a sufficient condition for its reaching visual awareness. For example, there are generally many items processed in parallel at any given time, most of which will be filtered out of awareness by the operation of selective attention. We have therefore proposed that it is only those items that receive more than a certain threshold level of relative activation, for example through the sharpening effects of spatial gating processes known to be active during selective attention (e.g., Moran and Desimone 1985, Chelazzi et al. 1993), that will reach awareness. That is, we are proposing a conjoint requirement for an item to attain visual awareness: (a) a certain kind of coding (one that is object-based and abstracted from the viewer-centered and egocentric particulars of the visual stimulation that gives rise to it) and (b) a certain level of activation of these coding circuits above the background level of neighbouring circuits.
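The conjoint requirement can be restated as a simple logical rule. The sketch below is only a toy rendering of that rule, not a model of any neural mechanism: the class `Item`, the function `reaches_awareness`, the activation units, and the threshold ratio of 1.5 are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    object_based: bool  # True: abstract, object-based code; False: egocentric code
    activation: float   # arbitrary, hypothetical units


def reaches_awareness(item, background, threshold=1.5):
    """Apply both conditions: (a) object-based coding, and
    (b) activation exceeding the background level by a threshold ratio."""
    if background <= 0:
        raise ValueError("background activation must be positive")
    return item.object_based and (item.activation / background) >= threshold


background = 1.0  # hypothetical activation of neighbouring circuits

attended = Item("attended object", object_based=True, activation=3.0)
unattended = Item("unattended object", object_based=True, activation=1.1)
reach_target = Item("egocentric reach target", object_based=False, activation=5.0)

for item in (attended, unattended, reach_target):
    print(item.name, "->", reaches_awareness(item, background))
```

In this toy rule, strong activation alone is not enough (the egocentrically coded reach target fails condition a), and object-based coding alone is not enough (the unattended object fails condition b).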
We do not deny, then, that perception can proceed unconsciously under some circumstances, for example, when the stimuli are degraded by masking or short exposure, or when they are outside the current focus of selective attention. We believe that there is good empirical evidence for such "subliminal" perception of complex patterns, processing that is capable of activating semantic representations of certain kinds. Our assumption is that this form of unconscious perception arises through the partial or diffused activation of neuronal assemblies in the ventral stream, and that it does not reach awareness due to the fact that there is insufficient focussing of the activation above the noise of the surrounding assemblies. If this notion is correct, we would predict that such subconscious stimulation, although able to prime certain kinds of semantic decision tasks, would not provide usable inputs to the visuomotor system. Conversely, visual form information that can successfully guide action in a patient like D. F. should not be expected to have significant priming effects on semantic tasks - precisely because that visual processing is never available to conscious experience, even in the normal observer. In short, it may be the case that for an "undetected" visual stimulus to be able to prime decision tasks, it must at least in principle be accessible to consciousness.
At the level of visual processing, however, the visuomotor modules in the primate parietal lobe function quite independently from the occipitotemporal mechanisms generating perception-based knowledge of the world. Only this latter, perceptual, system can provide suitable raw materials for our thought processes to act upon. In contrast, the other is designed to guide actions purely in the "here and now," and its products are consequently useless for later reference. To put it another way, it is only through knowledge gained via the ventral stream that we can exercise insight, hindsight and foresight about the visual world. The visuomotor system may be able to give us "blindsight," but in doing so can offer no direct input to our mental life (Weiskrantz 1997).
Aglioti, S., M. A. Goodale and J. F. X. DeSouza. 1995. Size-contrast illusions deceive the eye but not the hand. In Curr. Biol. 5:679-685.
Benson, D. F., and J. P. Greenberg. 1969. Visual form agnosia: a specific deficit in visual discrimination. In Arch. Neurol., 20:82-89.
Brenner, E., and J. B. J. Smeets. 1996. Size illusion influences how we lift but not how we grasp an object. In Exp. Brain Res., 111:473-476.
Bridgeman, B., S. Lewis, G. Heit, and M. Nagle. 1979. Relation between cognitive and motor-oriented systems of visual position perception. In J. Exp. Psychol. (Hum. Percept.), 5:692-700.
Bridgeman, B., M. Kirch, and A. Sperling. 1981. Segregation of cognitive and motor aspects of visual function using induced motion. In Percept. Psychophys., 29:336-342.
Carey, D. P., M. Harvey, and A. D. Milner. 1996. Visuomotor sensitivity for shape and orientation in a patient with visual form agnosia. In Neuropsychologia, 34:329-338.
Chelazzi, L., E. K. Miller, J. Duncan, and R. Desimone. 1993. A neural basis for visual search in inferior temporal cortex. In Nature 363:345-347.
Gallese, V., L. Fadiga, L. Fogassi, G. Luppino, and A. Murata. 1997. A parietal-frontal circuit for hand grasping movements in the monkey: evidence from reversible inactivation experiments. In: Parietal lobe contributions to orientation in 3D space (Eds. P. Thier and H.-O. Karnath) pp. 255-270. Springer-Verlag, Heidelberg.
Gentilucci, M., S. Chieffi, E. Daprati, M. C. Saetti, and I. Toni. 1996. Visual illusion and action. In Neuropsychologia, 34:369-376.
Goodale, M. A., D. Pélisson, and C. Prablanc. 1986. Large adjustments in visually guided reaching do not depend on vision of the hand or perception of target displacement. In Nature 320:748-750.
Goodale, M. A., A. D. Milner, L. S. Jakobson, and D. P. Carey. 1991. A neurological dissociation between perceiving objects and grasping them. In Nature 349:154-156.
Goodale, M. A., L. S. Jakobson, and J. M. Keillor. 1994a. Differences in the visual control of pantomimed and natural grasping movements. In Neuropsychologia 32:1159-1178.
Goodale, M. A., J. P. Meenan, H. H. Bülthoff, D. A. Nicolle, K. J. Murphy, and C. I. Racicot. 1994b. Separate neural pathways for the visual analysis of object shape in perception and prehension. In Curr. Biol. 4:604-610.
Gregory, R. 1997. Knowledge in perception and illusion. In Phil. Trans. R. Soc. Lond. B 352:1121-1127.
Haffenden, A. M., and M. A. Goodale. 1998. The effect of pictorial illusion on prehension and perception. In J. Cogn. Neurosci. 10:122-136.
Harvey, M. 1995. Translation of "Psychic paralysis of gaze, optic ataxia, and spatial disorder of attention" by Rudolph Bálint. In Cognitive Neuropsychol. 12:261-282.
Jeannerod, M. 1986. The formation of finger grip during prehension: a cortically mediated visuomotor pattern. In Behav. Brain Res., 19:99-116.
Kawashima, R., E. Naitoh, M. Matsumura, H. Itoh, S. Ono, K. Satoh, R. Gotoh, M. Koyama, K. Inoue, S. Yoshioka, and H. Fukuda. 1996. Topographic representation in human intraparietal sulcus of reaching and saccade. In Neuroreport 7:1253-1256.
Klüver, H., and P. C. Bucy. 1938. An analysis of certain effects of bilateral temporal lobectomy in the rhesus monkey, with special reference to "psychic blindness." In J. Psychol. 5:33-54.
Milner, A. D., and M. A. Goodale. 1993. Visual pathways to perception and action. In: Progress in Brain Research, Vol. 95 (Eds. T. P. Hicks, S. Molotchnikoff, and T. Ono) pp. 317-337. Elsevier, Amsterdam.
Milner, A. D., D. I. Perrett, R. S. Johnston, P. J. Benson, T. R. Jordan, D. W. Heeley, D. Bettucci, F. Mortara, R. Mutani, E. Terazzi, and D. L. W. Davidson. 1991. Perception and action in visual form agnosia. In Brain 114:405-428.
Moran, J., and R. Desimone. 1985. Selective attention gates visual processing in the extrastriate cortex. In Science 229:782-784.
Perenin, M.-T., and Y. Rossetti. 1996. Grasping without form discrimination in a hemianopic field. In Neuroreport 7:793-797.
Perenin, M.-T., and A. Vighetto. 1988. Optic ataxia: a specific disruption in visuomotor mechanisms. I. Different aspects of the deficit in reaching for objects. In Brain 111:643-674.
Rizzolatti, G., R. Camarda, L. Fogassi, M. Gentilucci, G. Luppino, and M. Matelli. 1988. Functional organization of inferior area 6 in the macaque monkey. II. Area F5 and the control of distal movements. In Exp. Brain Res. 71:491-507.
Rossetti, Y. 1998. Implicit perception in action: short-lived motor representations of space. In Consciousness and Cognition, 7: in press.
Sakata, H., M. Taira, A. Murata, V. Gallese, Y. Tanaka, E. Shikata, and M. Kusunoki. 1997. Parietal visual neurons coding 3-D characteristics of objects and their relation to hand action. In: Parietal lobe contributions to orientation in 3D space (Eds. P. Thier and H.-O. Karnath) pp. 237-254. Springer-Verlag, Heidelberg.
Snyder, L. H., A. P. Batista, and R. A. Andersen. 1997. Coding of intention in the posterior parietal cortex. In Nature 386:167-170.
Ungerleider, L. G., and M. Mishkin. 1982. Two cortical visual systems. In D. J. Ingle, M. A. Goodale, and R. J. W. Mansfield (Eds.). Analysis of Visual Behavior (pp. 549-586). Cambridge, Mass.: MIT Press.
Weiskrantz, L. 1997. Consciousness Lost and Found. Oxford: Oxford University Press.
Wong, E., and A. Mack. 1981. Saccadic programming and perceived location. In Acta Psychol. 48:123-131.
Zeki, S. 1993. A Vision of the Brain. Oxford: Blackwell.