History of the multiscale view
The emergence of spatial scale as an important aspect of visual processing came at about the same time from neurophysiological studies of animals and psychophysical studies of humans. In humans, following from the early aerial reconnaissance work of Selwyn (1948) in England and the applied work of Schade (1956) in the United States, Campbell and Green (1965) began to develop and apply the measurement of contrast sensitivity to better specify human vision. This approach emphasized the importance of visual processing for object sizes larger than the resolution limit.
In particular, the contrast sensitivity function describes the relationship between the size of a stimulus and the contrast necessary to just detect it (i.e., its contrast threshold). The stimulus of choice is a sinusoidal grating (black and white bars with a sinusoidal luminance profile modulated about a mean light level) specified in units of spatial frequency (in cycles per degree subtended at the eye). Such a stimulus allows contrast to be altered without affecting the mean adaptational state of the eyes; the retinal image will also be sinusoidal in form. The form of this relationship is shown in Figure 68.1. Human sensitivity is best at intermediate object sizes (or spatial frequencies) and is reduced at both higher and lower spatial frequencies. Although the optics of the eye contribute to the reduction in contrast sensitivity at high spatial frequencies, the majority of the falloff at high spatial frequencies and all of the falloff at low spatial frequencies are due to the sensitivity of neural processes. The contrast sensitivity curve depicted in Figure 68.1 is for foveal viewing and photopic light levels. If stimuli are imaged on more peripheral parts of the field or viewed under scotopic light levels, there is a preferential loss of sensitivity at high spatial frequencies as a consequence of the reduced neural sensitivity under these conditions.
Figure 68.1.
Contrast sensitivity function for foveal vision under photopic conditions. Contrast sensitivity (the reciprocal of the contrast needed for threshold detection) is plotted against the spatial frequency of a sinusoidal grating stimulus. This overall sensitivity function is itself composed of a set of more spatially restricted mechanisms termed spatial channels.
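The grating stimulus and the definition of contrast sensitivity described above can be made concrete with a small sketch. The function names, the default mean luminance, and the sampling density below are assumptions chosen for illustration and do not come from the studies cited; the threshold value in the last line is a made-up number, not measured data. The sketch shows that changing the contrast of a sinusoidal grating leaves its mean luminance unchanged, and that sensitivity is simply the reciprocal of the threshold contrast.

```python
import math

def sinusoidal_grating(spatial_freq_cpd, contrast, mean_luminance=50.0,
                       width_deg=2.0, samples_per_deg=100):
    """Return a 1-D luminance profile L(x) = L_mean * (1 + c * sin(2*pi*f*x)).

    x is position in degrees of visual angle, f is spatial frequency in
    cycles per degree, and c is the contrast (0 to 1).
    All default values are illustrative assumptions.
    """
    n = int(width_deg * samples_per_deg)
    profile = []
    for i in range(n):
        x_deg = i / samples_per_deg
        phase = 2.0 * math.pi * spatial_freq_cpd * x_deg
        profile.append(mean_luminance * (1.0 + contrast * math.sin(phase)))
    return profile

def michelson_contrast(profile):
    """Contrast as (Lmax - Lmin) / (Lmax + Lmin)."""
    lo, hi = min(profile), max(profile)
    return (hi - lo) / (hi + lo)

def contrast_sensitivity(threshold_contrast):
    """Sensitivity is the reciprocal of the contrast needed for detection."""
    return 1.0 / threshold_contrast

# Two gratings of the same spatial frequency but different contrast.
low = sinusoidal_grating(spatial_freq_cpd=4.0, contrast=0.05)
high = sinusoidal_grating(spatial_freq_cpd=4.0, contrast=0.20)

# The mean light level is the same for both (whole cycles are sampled),
# so contrast can be varied without changing the adaptational state.
mean_low = sum(low) / len(low)
mean_high = sum(high) / len(high)

# Hypothetical threshold contrast of 0.005, for illustration only.
example_sensitivity = contrast_sensitivity(0.005)
```

Plotting such sensitivities against spatial frequency, for thresholds measured at each frequency, gives a curve of the kind shown in Figure 68.1.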
The next major advance came when Campbell and Robson (1968), Pantle and Sekuler (1968), and Blakemore and Campbell (1969) provided evidence that the contrast sensitivity function was itself composed of a number of more narrowly tuned, independent spatial mechanisms. A series of psychophysical studies followed, outlining the degree to which these spatial channels, as they were called, were independent (Graham et al., 1978) and the ways in which they interacted (Graham and Nachmias, 1971).
During this same period, single-cell measurements from different parts of the visual pathway showed that neuronal receptive fields came in various sizes (Hubel and Wiesel, 1959, 1962) and that there was a systematic scaling of receptive field size with eccentricity in the retina and, to some extent, in the cortex (Hubel and Wiesel, 1968). Up to this time, neurons were not really considered filters (in the sense that they only transmitted information of a particular spatial scale, just as a green filter attenuates blue and red light and only allows light of intermediate wavelengths to pass through); the prevailing idea was that they encoded certain stimulus features, and that they needed to do this at a range of different sizes. The idea that neurons could be sufficiently linear to be considered filters emerged from the retinal work of Enroth-Cugell and Robson (1966), which was very controversial at the time, beginning to have its main impact only a decade later. The primary visual cortex contains cells that are grouped together along a number of key processing dimensions: orientation, ocular dominance, and spatial frequency. Cells with similar spatial frequency preference are grouped together into domains whose map is locally continuous across V1. The distance between domains conforms to the hypercolumn description of cortical organization (Issa et al., 2000).
The impact of spatial scale on our thinking in vision research is reflected in the different computational approaches (Marr, 1982; Marr and Hildreth, 1980; Marr and Poggio, 1979; Morrone and Burr, 1988; Watt and Morgan, 1985; Wilson and Gelb, 1984; Wilson and Richards, 1989) that have developed subsequent to the work described above. All of these models assume an initial spatial decomposition. They differ in the extent of this decomposition and in the level at which the information from the different spatial filters is combined and analyzed. The two extremes are represented by Watt and Morgan's MIRAGE model and Wilson's Line Element Model. In the MIRAGE model, the outputs of the spatial filters were combined at an early stage, and only symbolic descriptors derived after this combination were used (Watt and Morgan, 1985). In the Wilson Line Element Model, later stages of processing had independent access to the output of individual spatial filters, and these outputs were flexibly combined to solve different tasks (Wilson and Gelb, 1984; Wilson and Richards, 1989).
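As a rough illustration of the initial spatial decomposition that these models share, the sketch below filters a one-dimensional luminance edge with band-pass filters of several sizes, and then contrasts two ways of using the outputs: pooling them early into a single response, or keeping each scale separately available. It is not an implementation of MIRAGE or of the Line Element Model. The difference-of-Gaussians filters, the surround ratio of 1.6, the particular filter sizes, and all function names are assumptions made purely for illustration.

```python
import math

def gaussian_kernel(sigma, radius):
    """Unit-sum Gaussian sampled at integer positions -radius..radius."""
    weights = [math.exp(-0.5 * (x / sigma) ** 2)
               for x in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def dog_kernel(sigma, ratio=1.6):
    """Difference-of-Gaussians kernel (assumed filter shape).

    Center and surround each sum to one, so the kernel sums to zero:
    it ignores mean luminance and responds to structure near its own scale.
    """
    radius = int(math.ceil(4 * sigma * ratio))
    center = gaussian_kernel(sigma, radius)
    surround = gaussian_kernel(sigma * ratio, radius)
    return [c - s for c, s in zip(center, surround)]

def convolve_same(signal, kernel):
    """Filter with edge replication; output has the same length as signal.

    The kernel is symmetric, so correlation and convolution coincide.
    """
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out

def decompose(signal, sigmas=(1.0, 2.0, 4.0, 8.0)):
    """Initial spatial decomposition: one filtered copy per scale."""
    return {s: convolve_same(signal, dog_kernel(s)) for s in sigmas}

def combine_early(channels):
    """Early combination: sum across scales, losing scale identity."""
    n = len(next(iter(channels.values())))
    return [sum(resp[i] for resp in channels.values()) for i in range(n)]

def channel_energies(channels):
    """Independent access: a separate summary value for each scale."""
    return {s: sum(r * r for r in resp) for s, resp in channels.items()}

# A luminance step edge as the test stimulus.
signal = [0.0] * 64 + [1.0] * 64
channels = decompose(signal)
pooled = combine_early(channels)       # one response profile
energies = channel_energies(channels)  # one value per scale
```

After early pooling, a later stage can no longer tell which filter size produced a given response, whereas the per-scale values leave that choice open to whatever task-specific combination follows. This is only the broad contrast drawn in the text; the published models differ in many details not represented here.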
In this chapter, I will give examples of the advantages of accessing information at different spatial scales; these include foveal specialization and visual stability under different light levels. After detailing the evidence for independent access to scale information, I will discuss the scale selection rules. Finally, I will give examples of situations where information at different scales is not kept separate but is combined in specific ways (the scale combination rules).