MIT CogNet, The Brain Sciences Connection. From the MIT Press.
The Visual Neurosciences: A Modern View of the Classical Receptive Field: Linear and Nonlinear Spatiotemporal Processing by V1 Neurons, Section 1
 

Framework

The receptive field is a central construct in virtually all physiological and computational accounts of the workings of the early visual system. It describes how visual inputs are transformed into neural responses and thus represents the computations performed by a neuron. For this reason, studies of receptive fields are essential for understanding the neural basis of visual perception.

The receptive field of a visual neuron was originally defined as the region of the retina that must be illuminated in order to evoke a response from the neuron (Hartline, 1940). In addition to defining the spatial extent of the receptive field, traditional descriptions specified that particular regions of a receptive field could respond to luminance increments (ON subregions), luminance decrements (OFF subregions), or both (ON-OFF subregions). Using this basic scheme, the spatial organization of receptive fields in the retina, lateral geniculate nucleus (LGN), and striate cortex was elucidated in a series of classic studies (e.g., Barlow, 1953; Hubel and Wiesel, 1961, 1962; Kuffler, 1953), some of which are described below. However, research over the past two decades has revealed a number of response properties that are not easily described within the basic scheme of ON and OFF subregions, suggesting that an expanded view of the receptive field is needed.

There are two major limitations to the traditional characterization of receptive fields. One is that it does not describe the temporal dynamics of neuronal responses. The response of a neuron to a very brief stimulus rises after some delay and then decays over time, and the time course of this response varies among neurons in the early visual pathways (e.g., DeAngelis et al., 1993a; De Valois et al., 2000; Wolfe and Palmer, 1998). In addition, the temporal dynamics may vary dramatically from one spatial location to another within the classically defined receptive field (DeAngelis et al., 1993a; McLean et al., 1994). Thus, it is essential for any modern description of receptive fields to incorporate both spatial and temporal dimensions.

The second major limitation of the traditional description of receptive fields is that it does not allow for a rigorous characterization of nonlinear response properties. The response of a neuron to multiple stimuli distributed across space and time is often not equal to the linear summation of the responses to the individual stimuli, and in many cases the dominant component of the response is due to nonlinear interactions among stimuli. Thus, a modern description of the receptive field must be able to represent nonlinear response properties in addition to linear characteristics.

In this chapter, we define a receptive field as the spatiotemporal structure that characterizes the transformation between the visual image and a neuron's response, a view that has emerged from research done over the past few decades in numerous laboratories. The main advantage of this modern view is that it describes temporal dynamics of neuronal responses, and it considers the information processing performed by neurons through spatiotemporal interactions (both linear and nonlinear) within the classical receptive field. We show that this view provides new and better explanations for some of the response properties of neurons in primary visual cortex. It should be noted, however, that we do not deal with response modulations originating from outside the classical receptive field, which are described elsewhere in this volume (Chapter 45). Although nonclassical surround effects constitute important receptive field properties, they are not generally elicited by the types of stimuli discussed here and will not be a part of our descriptions.

To characterize the transformation from visual image to neuronal response, it is useful to start by considering the relevant dimensions of the visual input (Adelson and Bergen, 1991). Ignoring color for the sake of simplicity, the image formed on the retina of each eye is a time (T)-varying pattern of luminance across two spatial dimensions (X,Y), which we denote as I(X,Y,T). The response (instantaneous spike rate) of a neuron, on the other hand, is a function of only time and will be represented here by R(T). In our framework, the receptive field transforms the image, I(X,Y,T), into the neuronal response, R(T), in two possible ways: linearly or nonlinearly (Fig. 44.1A, B). In the linear case, the neuron's response, R(T), is described by a convolution of the image, I(X,Y,T), with the spatiotemporal receptive field, denoted by RFL(X,Y,T):
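Equation 1 can be written as follows. This is a reconstruction from the definitions above rather than the authors' exact notation: RFL is typeset as RF_L, and τ is a symbol assumed here for the delay between a visual input and the response it contributes to.

```latex
R(T) = \iiint RF_L(X, Y, \tau)\, I(X, Y, T - \tau)\, dX\, dY\, d\tau \tag{1}
```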

Figure 44.1.

A visual neuron can be viewed as a system that takes an image, I(X,Y,T), as input and produces a response, R(T), as output. I(X,Y,T) represents a luminance distribution over two dimensions of space (X,Y) and time (T), whereas R(T) denotes the instantaneous spike rate as a function of time. The receptive field of a visual neuron can then be defined as a spatiotemporal structure that characterizes the transformation between the image, I(X,Y,T), and the response, R(T). A, For a linear neuron, this transformation is a linear summation of the image I(X,Y,T) weighted by RFL(X,Y,T), which we will call a linear receptive field map. B, For a nonlinear neuron, the response is the sum of a linear component due to RFL(X,Y,T), and a nonlinear component due to interactions among visual inputs over space and time. Transformations between combinations of inputs and nonlinear responses are represented by nonlinear receptive field maps, RFNp, where the subscript p indicates the order of nonlinear interaction. For example, the second-order nonlinear interaction between two inputs, i and j, is denoted by RFN2(Xi,Yi,Ti, Xj,Yj,Tj). C, Space-time separable and inseparable receptive fields. Each contour plot shows a spatiotemporal receptive field map, RFL(X,T), for a hypothetical V1 simple cell (the Y-axis of RFL(X,Y,T) has been eliminated for simplicity). Solid and dashed contours represent responses to luminance increments (bright stimuli) and decrements (dark stimuli), respectively. The receptive field map on the left is space-time separable, meaning that RFL(X,T) can be described as a product of a spatial function G(X) and a temporal function H(T), that is, RFL(X,T) = G(X)H(T). The map on the right is inseparable; it cannot be decomposed into a spatial function and a temporal function.


In other words, the neuron's response is the linear summation of I(X,Y,T) weighted by RFL(X,Y,T). We shall call RFL(X,Y,T) a linear receptive field map or a linear map for short. For a linear neuron, RFL(X,Y,T) can be obtained by measuring the neuron's response to a small, brief stimulus (i.e., a spatiotemporal impulse) presented at various locations throughout the receptive field. In turn, RFL(X,Y,T) can be used to predict the response of the linear neuron to any arbitrary stimulus according to equation 1. It follows that the response to a combination of multiple stimuli can be predicted by the sum of responses to the individual stimuli. In reality, no neuron can be strictly linear, because spike generation has a threshold. However, if a neuron otherwise behaves linearly, equation 1 still holds over a limited range of inputs, namely those that drive the neuron above threshold. In the third section, we will examine the spatiotemporal structure of linear receptive field maps for neurons in the central visual pathways. We will see that linear maps provide a good account of the response properties, including directional selectivity, of many neurons.
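The three properties just described (prediction from the linear map, mapping with impulses, and superposition) can be illustrated with a minimal Python sketch. It assumes a discretized map with one spatial dimension, an idealized neuron with no spike threshold, and a randomly generated map that stands in for real data. The function names `linear_response` and `map_with_impulses` are hypothetical and are not taken from the chapter.

```python
import numpy as np

def linear_response(rf, stimulus):
    """Discrete analogue of equation 1 with one spatial dimension.

    rf:       array (n_x, n_tau), hypothetical linear map RFL(X, tau)
    stimulus: array (n_x, n_t), image I(X, T) as deviation from mean luminance
    returns:  array (n_t,), response R(T) of an idealized linear neuron
    """
    n_x, n_t = stimulus.shape
    response = np.zeros(n_t)
    for x in range(n_x):
        # Temporal convolution at this position, truncated to the stimulus duration.
        response += np.convolve(stimulus[x], rf[x])[:n_t]
    return response

def map_with_impulses(neuron, n_x, n_tau):
    """Estimate the linear map by probing each position with a brief unit stimulus."""
    estimate = np.zeros((n_x, n_tau))
    for x in range(n_x):
        impulse = np.zeros((n_x, n_tau))
        impulse[x, 0] = 1.0  # small, brief stimulus at position x, time 0
        # For a linear neuron, the response over time is the map's row at x.
        estimate[x] = neuron(impulse)
    return estimate

rng = np.random.default_rng(0)
true_rf = rng.standard_normal((8, 12))  # made-up map, not measured data

def neuron(stimulus):
    return linear_response(true_rf, stimulus)

# Impulse mapping recovers the map that generated the responses.
recovered_rf = map_with_impulses(neuron, 8, 12)

# Superposition: the response to a sum of stimuli equals the sum of responses.
stim_a = rng.standard_normal((8, 40))
stim_b = rng.standard_normal((8, 40))
superposition_holds = np.allclose(
    neuron(stim_a + stim_b), neuron(stim_a) + neuron(stim_b)
)
```

Adding a threshold to `neuron` (for example, clipping negative values to zero) would break both checks for inputs that fall below threshold, which is the limitation noted above.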

On the other hand, if a neuron combines visual inputs across space and time in a nonlinear fashion, the response will deviate from that predicted by the convolution integral of equation 1. For instance, the response of a nonlinear neuron to a combination of two stimuli would be the sum of responses to the individual stimuli (linear response) combined with the response caused by interactions between the two stimuli (nonlinear response). Thus, to describe the response, R(T), of a nonlinear neuron completely, the nonlinear response due to interactions among combinations of inputs must be specified (Fig. 44.1B). We will call maps that describe this transformation nonlinear receptive field maps or nonlinear maps. For example, the map that describes nonlinear interactions between two inputs i and j (i.e., second-order interactions) will be a function of the spatial and temporal dimensions associated with the two inputs. We denote this second-order nonlinear map as RFN2(Xi,Yi,Ti,Xj,Yj,Tj). Then the response, R(T), of a second-order nonlinear neuron can be described as the sum of two components: a linear response given by equation 1 and a nonlinear response given by a combination of two visual inputs, I(Xi,Yi,Ti) and I(Xj,Yj,Tj), weighted by the nonlinear map RFN2(Xi,Yi,Ti,Xj,Yj,Tj):
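Following the description above, the second-order response can be written as the linear term of equation 1 plus a term in which pairs of inputs are weighted by the nonlinear map. This is a reconstruction from the surrounding definitions, with RFL and RFN2 typeset as RF_L and RF_{N2} and with τ, τ_i, and τ_j assumed here as delay symbols:

```latex
R(T) = \iiint RF_L(X, Y, \tau)\, I(X, Y, T - \tau)\, dX\, dY\, d\tau
     + \idotsint RF_{N2}(X_i, Y_i, \tau_i, X_j, Y_j, \tau_j)\,
       I(X_i, Y_i, T - \tau_i)\, I(X_j, Y_j, T - \tau_j)\,
       dX_i\, dY_i\, d\tau_i\, dX_j\, dY_j\, d\tau_j \tag{2}
```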

In theory, nonlinear interactions can be of any order. Therefore, the receptive field of a nonlinear neuron is generally represented by multiple maps. However, responses of many neurons in early visual pathways are well described by just two maps: a linear map and a second-order nonlinear map. As we will see in the fourth section, second-order nonlinear interactions are characteristic of complex cells and can be measured with appropriately chosen pairs of stimuli. These interactions serve to extract useful information from the visual image, such as motion and binocular disparity. We will see that the essence of such second-order nonlinear computations can be captured with an energy model.
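As a rough illustration of how paired stimuli expose second-order interactions, the sketch below builds a toy energy unit (the sum of the squared outputs of two linear filters in quadrature) and measures its interaction map as the pair response minus the sum of the two single-stimulus responses. Everything here is an assumption made for illustration: the model is purely spatial and instantaneous, the Gabor parameters are arbitrary, and the function names are hypothetical.

```python
import numpy as np

def quadrature_pair(x, sigma=1.0, freq=0.5):
    """Two hypothetical linear spatial filters that differ in phase by 90 degrees."""
    envelope = np.exp(-x**2 / (2 * sigma**2))
    even = envelope * np.cos(2 * np.pi * freq * x)
    odd = envelope * np.sin(2 * np.pi * freq * x)
    return even, odd

def energy_response(stimulus, even, odd):
    """Toy energy unit: sum of the squared outputs of the two linear filters."""
    return np.dot(even, stimulus) ** 2 + np.dot(odd, stimulus) ** 2

def measure_interaction_map(respond, n_x):
    """Pair response minus the two single-stimulus responses, for every pair i != j."""
    single = np.array([respond(np.eye(n_x)[i]) for i in range(n_x)])
    interaction = np.zeros((n_x, n_x))
    for i in range(n_x):
        for j in range(n_x):
            if i == j:
                continue
            pair = np.zeros(n_x)
            pair[i] = 1.0
            pair[j] = 1.0
            interaction[i, j] = respond(pair) - single[i] - single[j]
    return interaction

x = np.linspace(-3, 3, 25)
even, odd = quadrature_pair(x)

def respond(stimulus):
    return energy_response(stimulus, even, odd)

measured = measure_interaction_map(respond, x.size)

# For this toy unit the interaction follows from expanding the squares:
# (a + b)^2 - a^2 - b^2 = 2ab for each filter.
predicted = 2 * (np.outer(even, even) + np.outer(odd, odd))
np.fill_diagonal(predicted, 0.0)  # the measurement skips i == j
```

Because both filters share one envelope, the off-diagonal interaction depends on the separation between the two positions through cos(2π·freq·(x_i − x_j)), not on the phase of either filter alone.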

It is worth noting that the temporal dimension, T, of the linear and nonlinear receptive field maps represents the time over which the neuron integrates visual inputs. Thus, it differs from the temporal dimension of the image, I(X,Y,T). One of the central tenets of our framework is that the time dimension needs to be considered as part of receptive field descriptions. If the temporal dynamics of a receptive field map do not depend on spatial location, then the map can be summarized by separate spatial and temporal profiles: RFL(X,Y,T) = G(X,Y) H(T). This case is referred to as space-time separable (Fig. 44.1C). However, as we will see shortly, receptive field maps of neurons in the primary visual cortex often depart from space-time separability in functionally important ways. For these neurons, a joint space-time map is the minimum acceptable descriptor of the receptive field.
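One common way to quantify space-time separability, which is an analysis choice assumed here rather than one described in this section, is to ask how much of a map is captured by its best rank-1 approximation. A map of the form G(X)H(T) is exactly rank 1, so a singular value decomposition gives a simple index. The sketch below uses one spatial dimension and made-up profiles; `separability_index` is a hypothetical name.

```python
import numpy as np

def separability_index(rf_map):
    """Fraction of squared amplitude captured by the best G(X)H(T) approximation.

    rf_map: array (n_x, n_t). Returns a value in (0, 1]; 1 means separable.
    """
    singular_values = np.linalg.svd(rf_map, compute_uv=False)
    return singular_values[0] ** 2 / np.sum(singular_values ** 2)

x = np.linspace(-2, 2, 41)[:, None]   # space, as a column (arbitrary units)
t = np.linspace(0, 0.3, 31)[None, :]  # time, as a row (seconds)

# Hypothetical profiles, chosen only to give plausible-looking maps.
spatial_envelope = np.exp(-x**2 / 0.5)
temporal_envelope = (t / 0.05) * np.exp(-t / 0.05)

# Separable map: the same temporal profile at every position.
g = spatial_envelope * np.cos(2 * np.pi * 0.8 * x)
h = temporal_envelope * np.cos(2 * np.pi * 4 * t)
separable_map = g * h

# Inseparable map: the spatial phase drifts with time, tilting the subregions
# in the space-time plane.
inseparable_map = (
    spatial_envelope * temporal_envelope * np.cos(2 * np.pi * (0.8 * x - 4 * t))
)

index_separable = separability_index(separable_map)      # close to 1
index_inseparable = separability_index(inseparable_map)  # clearly below 1
```

The tilted map is the sum of two separable maps, cos·cos plus sin·sin, so it has rank 2 and no single pair of profiles G(X) and H(T) can reproduce it.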

 


© 2010 The MIT Press