Visual psychophysics
Introduction: Comparison of Psychophysical and Electrophysiological Approaches
Electrophysiological procedures have proven to be of considerable value in assessing the functional properties of classes of neurons within the visual pathway in both normal visual systems and those with pathology. For example, the a-wave of the ERG has provided important information about the integrity of the rod and cone photoreceptors in retinal degenerations such as retinitis pigmentosa (RP).46
In many applications, the electrophysiological response represents the summed activity of neurons that are responding to stimuli covering a broad region of visual space. However, with the advent of focal and multifocal ERG and multifocal VEP techniques (reviewed by Hood45; see also chapter 14), it has become possible to record the electrical activity of neurons responding to stimuli that are presented within spatially delimited regions. Nevertheless, the ability to record specifically from spatially localized generators of electrophysiological responses remains somewhat limited, owing in part to the need to achieve adequate signal-to-noise ratios.
By comparison, psychophysical procedures can provide a measure of visual function within a quite small region of the visual field, with stimuli sometimes subtending less than 1 minute of visual angle. However, unlike electrophysiological responses, psychophysical responses represent the properties of the entire visual pathway, from photoreceptors to cortex. Furthermore, psychophysical measurements are subject to the potential influence of cognitive factors such as attention, and they are also dependent on the motor skills that are involved in producing a response.
Nevertheless, psychophysical procedures, particularly in combination with electrophysiological techniques, can provide important insights into the site and nature of defects within the visual pathway in disorders of the visual system. For example, Seiple and colleagues92 investigated the retinal site of adaptation defects in patients with RP by comparing increment thresholds that were derived from psychophysics with those that were derived from the focal ERG. On the basis of similarities in the results obtained with the two approaches, they concluded that the adaptation defects shown by the patients with RP had an outer retinal locus.
As was discussed by Seiple et al.,91 however, any direct comparison between psychophysical and electrophysiological procedures should consider a number of factors before firm conclusions can be drawn about the relationship with the disease process. These factors include the size and duration of the stimulus, the mechanism of response generation (whether the response is generated by the most sensitive unit or is the summed response of a number of units), the gain of the response, and the adaptation level. Additional discussions of the linking hypotheses or propositions that should be considered in specifying the relationship between psychophysical results and physiological states can be found in the work of Brindley,16 Teller,100 and Lee.58
Fundamental Concepts of Psychophysics
In current usage, the term psychophysics refers both to a set of methods and to a body of knowledge about the visual system that has been gathered with these methods. The general aim of psychophysical methods is to relate sensory states to the physical properties of visual stimuli in a quantitative manner. The physical properties of visual stimuli can be easily obtained through the appropriate instrumentation, such as photometers. Information about sensory states is less readily available. Observers typically communicate information about sensory states through a verbal response, such as “yes, those two lights look the same,” or by a motor response, such as the press of a particular button or the reaction time to stimulus presentation.
It is important to note that sensory states can vary in either quantity or quality. Sensations that vary in quantity, such as brightness, are termed prothetic, and can be plotted on a scale of magnitude. For example, a light of 100 cd/m² presented in darkness appears to have a greater brightness than does a light of 10 cd/m² under the same conditions, and so brightness is considered to be a prothetic sensation. Sensations that vary in quality but not quantity, such as hue, are termed metathetic and cannot be plotted on a magnitude scale. For example, the hue “green” is neither more nor less in quantity than the hue “red,” so no magnitude scale of hue is justified. However, both prothetic and metathetic sensations are amenable to psychophysical measurements.
The usual goal of a psychophysical experiment is to determine a person's threshold. The term threshold refers to the stimulus magnitude that provides a transition between two sensory states, either between “no sensation” and “sensation” or between two different sensations. In some applications, the outcome measure is termed sensitivity, which is the reciprocal of threshold.
In current usage, the threshold is not a fixed stimulus value but varies stochastically. That is, owing to various sources of variability, such as quantal fluctuations in the light output or intrinsic noise within the visual system, there is no one stimulus magnitude that forms the absolute boundary between two sensory states. Instead, stimuli that are near a certain value may be reported as “seen” on some occasions and not others. As a consequence, when one plots the percent of trials on which a stimulus is reported “seen” versus the values of the stimulus, the data typically form an S-shaped function rather than a function with an abrupt step at some particular stimulus value.
There are two general classes of thresholds: detection thresholds and difference thresholds. The detection or absolute threshold represents the minimum stimulation necessary to detect the presence of a stimulus. The difference or increment threshold refers to the change in visual stimulation that is necessary for the observer to discriminate between a test stimulus and a reference stimulus. Detection can be considered to be a special case of discrimination in which the reference stimulus has a value of zero.
A clinical example of a detection threshold is the quantal flux necessary for detecting a flash of light that is presented to the visual field periphery in dark-adapted perimetry. An example of a difference threshold can be found in the anomaloscope test of color vision defects. In this test, the observer's task is to determine whether a mixture of middle- and long-wavelength lights is different from a reference light of intermediate wavelength. Static perimetry is an additional example of the clinical application of the difference or increment threshold. In static perimetry, the observer's task is to discriminate a small flash of light (an increment) from the adapting field of the perimeter bowl.
Classical Psychophysical Techniques
Three basic psychophysical methods for measuring thresholds were introduced by Gustav Fechner in the 1800s. Perhaps the most straightforward of these classical psychophysical techniques is the method of adjustment, in which the observer manipulates the test stimulus until it is just detectable or is just noticeably different from a reference stimulus. Typically, the threshold is defined as the mean of a series of such measurements. A variation of the method of adjustment is the tracking procedure, in which the observer continuously adjusts the stimulus to maintain it at a threshold level. Tracking has proven useful in measuring sensory events that change over time, such as the recovery of sensitivity following exposure to a bleaching light. Because the observer has direct control over the stimulus, however, the method of adjustment is open to potential artifacts. For example, the observer may adjust the stimulus by some fixed, arbitrary amount on each trial without regard to sensory events. Furthermore, the tracking method can be influenced by any changes that may occur in the observer's response criterion over time.
A second classical psychophysical procedure is the method of limits. In this procedure, the experimenter initially sets the stimulus to a value that is either below or above the estimated threshold and then alters the stimulus value in small steps until the observer signals that the stimulus has just been detected (ascending method) or that it has just disappeared (descending method). The threshold is defined as the mean of a series of such measurements. The method of limits has proven valuable in the clinical setting but is also vulnerable to artifacts. These include errors of habituation, in which the observer maintains the same response (“seen” or “not seen”) from trial to trial without regard to sensory events, and errors of anticipation, in which the observer reports prematurely that the stimulus has become visible or has disappeared.
The third classical approach is the method of constant stimuli. In this technique, a fixed number of different test stimuli are presented whose values span the region of the estimated threshold in discrete steps. Each stimulus is presented the same number of times in a random order. The observer responds “seen” or “not seen” on each trial. The percentage “seen” is plotted for each stimulus value, resulting in a psychometric function, as illustrated in figure 26.1. The data are typically fit with an ogival function such as a cumulative normal distribution, or with a sigmoidal function, such as a logistic function or a Weibull function. The threshold is defined typically as the value that is reported “seen” on 50% of the trials. In the method of constant stimuli, catch trials are often used, in which no test stimulus is presented. If the observer responds that a stimulus was seen on such a trial, he or she is informed of the mistake and is urged to try harder.
Figure 26.1.
An example of a psychometric function derived from the method of constant stimuli, in which the percent of trials on which a stimulus was reported “seen” is plotted against the stimulus magnitude. The curve fit to the data points represents a logistic function. The threshold refers to the stimulus value that was reported “seen” on 50% of the trials, as derived from the fitted curve.
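The fitting step in the method of constant stimuli can be sketched in a few lines of Python. This is a minimal illustration rather than a clinical implementation: the data set is hypothetical, the function names (`logistic`, `log_likelihood`, `fit_logistic`) and the slope search range are assumptions made for the example, and a coarse grid search stands in for the numerical optimizers that are normally used for maximum-likelihood fitting.

```python
import math

def logistic(x, threshold, slope):
    """Probability of a 'seen' response at stimulus magnitude x."""
    return 1.0 / (1.0 + math.exp(-slope * (x - threshold)))

def log_likelihood(levels, n_seen, n_trials, threshold, slope):
    """Binomial log-likelihood of the 'seen' counts under a logistic function."""
    total = 0.0
    for x, k, n in zip(levels, n_seen, n_trials):
        # Clamp p away from 0 and 1 so that log() is always defined.
        p = min(max(logistic(x, threshold, slope), 1e-9), 1.0 - 1e-9)
        total += k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total

def fit_logistic(levels, n_seen, n_trials, steps=200):
    """Grid-search maximum-likelihood fit; returns (threshold, slope)."""
    lo, hi = min(levels), max(levels)
    best = (None, None, -math.inf)
    for i in range(steps + 1):
        threshold = lo + (hi - lo) * i / steps
        for j in range(1, steps + 1):
            slope = 10.0 * j / steps  # assumed search range: 0.05 to 10
            ll = log_likelihood(levels, n_seen, n_trials, threshold, slope)
            if ll > best[2]:
                best = (threshold, slope, ll)
    return best[0], best[1]

# Hypothetical data: seven stimulus levels, 20 presentations of each.
levels = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
n_seen = [0, 1, 4, 10, 16, 19, 20]
n_trials = [20] * 7

threshold, slope = fit_logistic(levels, n_seen, n_trials)
print(f"50% 'seen' threshold: {threshold:.2f} (slope {slope:.2f})")
```

Because the logistic function equals 0.5 when the stimulus equals the `threshold` parameter, the fitted parameter is itself the 50% "seen" point described above.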
Signal Detection Theory
Although the classical psychophysical techniques have proven useful in a clinical setting, it is apparent that they do not take into account the observer's response bias or criterion, which can have a substantial effect on the threshold estimate. An alternative approach is the set of methods derived from signal detection theory (SDT), in which there is no assumption of a sensory threshold. Instead, an emphasis is placed on the decision strategies that are employed by the observer, who is required to detect a signal in the presence of noise. The noise is usually considered to consist of some combination of external noise (outside the observer) and internal noise (within the observer). The SDT approach provides a way to separate an observer's actual sensitivity from his or her response criterion. It should be noted that the term sensitivity in this context refers to an observer's ability to discriminate a signal from noise, not to the reciprocal of threshold.
A standard SDT method is the “yes-no” procedure, in which there is a single observation period that contains either noise alone or a signal embedded in noise. An example of this approach is the detection of a grating patch (signal) that has been added to external white noise.74 The observer responds either “yes,” a signal was present, or “no,” a signal was not present. Various values of the signal are presented across trials. For simplicity of analysis, it is often assumed that the probability density distributions of both the noise and the signal-plus-noise are Gaussian. In an analysis of the results, the two most important events are hits, in which the observer correctly responds that a signal was present, and false alarms, in which the observer reports that a signal was present when it was not.
For a given signal strength, a plot of hit rate versus false alarm rate yields a point on a function termed a receiver operating characteristic (ROC) curve. To generate an ROC curve with multiple points, the observer's criterion is usually manipulated by varying the payoffs associated with hits and false alarms and/or by varying the probability of signal presentation. The distance of the ROC curve from chance performance provides an index of the observer's sensitivity. Although it is an excellent method for distinguishing between an observer's sensitivity and his or her response criterion, the yes-no procedure is generally a time-consuming technique that has seen limited clinical application.
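The separation of sensitivity from response criterion can be illustrated with a short calculation. The sketch below assumes the equal-variance Gaussian model mentioned above and uses the conventional index d′ (the difference between the z-transformed hit and false alarm rates) together with a criterion measure c; neither symbol appears in the text, and the trial counts are hypothetical.

```python
from statistics import NormalDist

def dprime_and_criterion(hits, misses, false_alarms, correct_rejections):
    """Return (d', c) under the equal-variance Gaussian assumption.

    Rates of exactly 0 or 1 have no finite z-score, so inv_cdf raises an
    error for them; real analyses apply a correction before this step.
    """
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    z = NormalDist().inv_cdf          # inverse of the standard normal CDF
    d_prime = z(hit_rate) - z(fa_rate)
    criterion = -0.5 * (z(hit_rate) + z(fa_rate))
    return d_prime, criterion

# Hypothetical observers: A is unbiased, B says "yes" far more readily.
d_a, c_a = dprime_and_criterion(80, 20, 20, 80)
d_b, c_b = dprime_and_criterion(95, 5, 50, 50)
print(f"Observer A: d' = {d_a:.2f}, c = {c_a:.2f}")
print(f"Observer B: d' = {d_b:.2f}, c = {c_b:.2f}")
```

The two hypothetical observers have nearly the same d′ even though their hit rates differ markedly; the difference lies in the criterion, which is the distinction that a hit rate alone cannot reveal.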
A related SDT procedure that has been widely used in the clinical setting is the forced-choice procedure. In this technique, the observer is presented with two or more observation intervals. These may be separated spatially (i.e., the test stimulus is presented at one of several possible test locations within the visual field) or temporally (i.e., the test stimulus is always presented at the same location but in one of two or more well-defined time periods). Only one of the observation intervals contains a signal, and the observer's task is to report the interval in which the signal occurred.
In the forced-choice procedure, a fixed set of stimulus values is presented multiple times across a series of trials, and the percentage correct value is derived for each stimulus magnitude. The percentage correct value is then plotted as a function of stimulus magnitude to derive a psychometric function, from which the observer's sensitivity can be derived. An example of a psychometric function derived from a two-alternative forced-choice (2AFC) procedure is given in figure 26.2.
Figure 26.2.
An example of a psychometric function derived from a two-alternative forced-choice procedure, in which the percent correct value is plotted for each stimulus value. The curve fit to the data points represents a Weibull function. The threshold refers to the stimulus value at which the observer was correct on 75% of the trials, as derived from the fitted curve.
A typical (although arbitrary) measure of sensitivity is the stimulus that results in a percentage correct value that lies halfway between chance performance and perfect performance. Chance performance, or the “guessing rate,” is equal to 1/n, where n is the number of alternatives (e.g., chance performance in a 2AFC procedure is 50%). The results of a 2AFC procedure are related quantitatively to those of a comparable yes-no procedure in that, for a given signal strength, the percentage correct value is equal to the area under the ROC curve.31
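Both statements in the preceding paragraph can be checked numerically. The following sketch computes the halfway criterion for an n-alternative task and, under the assumption of equal-variance Gaussian distributions, compares the area under a numerically integrated ROC curve with the predicted 2AFC proportion correct. The function names and the signal strength of 1.0 are illustrative choices.

```python
import math
from statistics import NormalDist

def halfway_percent_correct(n_alternatives):
    """Percent correct halfway between chance (1/n) and perfect performance."""
    chance = 1.0 / n_alternatives
    return 100.0 * (chance + 1.0) / 2.0

def roc_area(d_prime, n_points=2001):
    """Area under the yes-no ROC curve by trapezoidal integration."""
    phi = NormalDist().cdf
    # Sweep the criterion from strict to lax so the false alarm rate
    # rises from near 0 to near 1.
    criteria = [8.0 - 16.0 * i / (n_points - 1) for i in range(n_points)]
    fa = [1.0 - phi(c) for c in criteria]
    hit = [1.0 - phi(c - d_prime) for c in criteria]
    area = 0.0
    for i in range(1, n_points):
        area += 0.5 * (hit[i] + hit[i - 1]) * (fa[i] - fa[i - 1])
    return area

def two_afc_proportion_correct(d_prime):
    """Predicted 2AFC proportion correct for the same signal strength."""
    return NormalDist().cdf(d_prime / math.sqrt(2.0))

print(halfway_percent_correct(2))   # 2AFC criterion
print(halfway_percent_correct(4))   # 4AFC criterion
print(roc_area(1.0), two_afc_proportion_correct(1.0))
```

For two alternatives the halfway criterion is 75% correct, which is the threshold definition used in figure 26.2.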
The forced-choice procedure is often termed criterion free, because it provides a way to measure an observer's sensitivity independent of the response criterion. The forced-choice method is not without potential drawbacks, however. First, it depends on an observer's cooperation. As an extreme example, a malingering observer might decide to respond incorrectly on some trials, thereby affecting the estimate of sensitivity. Frequently, an observer may be inattentive on a certain percentage of trials, which leads to a lapsing rate, or less-than-perfect performance at high stimulus values. Another potential drawback to the forced-choice procedure is that it is based on the idea of an unbiased observer, and this is not likely to be the case. Observers tend to exhibit nonrandom behavior, such as an avoidance of long strings of identical responses, even though these can occur statistically. Observers may also have a position bias in a spatial forced-choice procedure or an interval bias in a temporal forced-choice procedure such that they tend to prefer one response interval over another. In addition, observers may become confused by the number of possible choices if there are more than two alternatives.
In the forced-choice approach, certain observers, especially patients, may be unwilling to give a response when they are certain that they see nothing or when they feel that they cannot discriminate between alternatives. Consequently, an “unforced-choice” method has been proposed, in which the response “I don't know” is allowed. The properties of the unforced-choice procedure have been analyzed statistically,50,53 and it has been shown that under certain conditions, this technique can have advantages over the standard forced-choice approach.
One of the useful concepts derived from SDT is that of the “ideal observer.” An ideal observer is one who has access to all the information that is present in the stimulus. The stimulus can be defined either as a distal stimulus (before it has entered the eye) or, more commonly, as a proximal stimulus (one that has been subjected to some degree of optical and/or neural processing). The efficiency of human performance can then be derived from a comparison of the results of an actual human observer with that of the ideal observer. This approach has been applied to a wide variety of visual tasks, ranging from simple two-point discrimination,36 in which the optical and photoreceptoral properties of the eye are taken into account, to complex tasks such as reading,60 in which visual, lexical, and oculomotor sources of information are included.
Adaptive Psychophysical Techniques
The classical psychophysical methods and those of SDT tend to be time-consuming and inefficient, because they typically present a set of stimuli that span a range of values, from those that are non-detectable to those that are detected with a high degree of probability. A more efficient strategy is to concentrate on stimulus values that lie near the presumed threshold. This is the approach taken by adaptive psychophysical procedures. In adaptive psychophysics, the stimulus to be presented on a given trial depends on the observer's prior responses. An example of a simple adaptive technique is the tracking method, described earlier. A number of different adaptive psychophysical procedures have been devised that use various decision rules to guide the stimulus choice on any given trial. An excellent historical overview of these adaptive procedures has been given by Leek.59
Adaptive procedures can be divided into two general categories: those that are parametric, in which there is an explicit assumption about the nature of the underlying psychometric function, and those that are non-parametric, in which there is no particular assumption about the psychometric function except that it is monotonic with stimulus magnitude. Non-parametric techniques are generally variations of the staircase method, which is related in turn to the tracking method. In the simplest staircase procedure, a response of “seen” (or a correct response in a forced-choice procedure) results in a decrease in stimulus magnitude on the subsequent trial. A response of “not seen” (or an incorrect response in a forced-choice procedure) results in an increase in the stimulus magnitude.
This conceptually simple “up-down” staircase approach has not proven effective, however. For example, a “seen/not seen” staircase is subject to changes in an observer's response criterion over time, just as the tracking procedure is. Furthermore, in a 2AFC staircase, the observer will be correct on 50% of the trials by chance alone, independent of the detectability of the stimulus. Therefore, chance plays too large a role in governing the decision as to whether to increase or decrease the stimulus magnitude in the simple up-down forced-choice staircase.
As a result of these inadequacies, the simple up-down staircase was replaced by the transformed up-down staircase.63 In the transformed staircase, the stimulus value to be presented on a given trial depends on the outcome of more trials than just the preceding one. In deciding which stimulus value to use on each trial, a number of different decision rules can be applied. A common rule is the “two-down, one-up” decision rule. According to this rule, two consecutive correct responses are required before the stimulus magnitude can be decreased, whereas only one incorrect response is sufficient to increase the stimulus magnitude. An illustration of a forced-choice staircase using a “three-down, one-up” decision rule is given in figure 26.3. Different staircase decision rules can be used to estimate specific points on a psychometric function.63 For example, the “two-down, one-up” rule provides an estimate of the 70.7% correct point. Staircase procedures often use step sizes that are equivalent for the upward and downward directions, but some advantages of using asymmetrical steps (larger up than down) have been discussed by Garcia-Perez.34
Figure 26.3.
An illustration of an up-down transformed staircase based on a two-alternative forced-choice procedure, with the stimulus magnitude plotted for each trial. The staircase used a “one-down, one-up” decision rule until the first reversal was reached. Subsequently, a “three-down, one-up” decision rule was used, in which three consecutive correct responses were required to decrease the stimulus value by one step, whereas a single incorrect response was sufficient to increase the stimulus value by one step. The open circles represent correct responses; the solid circles represent incorrect responses. The arrows indicate the staircase reversal points. The dashed line represents the threshold, which was defined as the mean of the last six reversal points.
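The transformed staircase logic can be sketched as follows. The code is an illustration under stated assumptions, not the procedure used to generate figure 26.3: the simulated observer, its logistic psychometric function, the starting level, and the step size are all hypothetical, and the initial "one-down, one-up" phase shown in the figure is omitted for brevity. The convergence point follows from the decision rule itself: a staircase settles where a downward step is as likely as an upward step, that is, where the probability of n consecutive correct responses is 0.5, which gives p = 0.5^(1/n).

```python
import math
import random

def convergence_point(n_down):
    """Proportion correct targeted by an n-down, one-up rule (p**n = 0.5)."""
    return 0.5 ** (1.0 / n_down)

def run_staircase(respond, start, step, n_down=2, n_reversals=8,
                  max_trials=10000):
    """Run an n-down, one-up staircase; return the reversal levels.

    respond(level) must return True for a correct response.
    """
    level = start
    correct_run = 0
    last_direction = 0          # +1 = up, -1 = down, 0 = no step yet
    reversals = []
    for _ in range(max_trials):
        if len(reversals) >= n_reversals:
            break
        if respond(level):
            correct_run += 1
            if correct_run < n_down:
                continue        # not enough consecutive correct responses yet
            correct_run = 0
            direction = -1
        else:
            correct_run = 0
            direction = +1
        if last_direction != 0 and direction != last_direction:
            reversals.append(level)   # the direction of travel has changed
        last_direction = direction
        level += direction * step
    return reversals

def simulated_observer(level, midpoint=0.0, spread=0.2):
    """Hypothetical 2AFC observer: 50% correct at low levels, 100% at high."""
    p = 0.5 + 0.5 / (1.0 + math.exp(-(level - midpoint) / spread))
    return random.random() < p

random.seed(1)
reversals = run_staircase(simulated_observer, start=1.0, step=0.1, n_down=2)
estimate = sum(reversals[-6:]) / 6     # mean of the last six reversals
print(f"two-down, one-up targets {100 * convergence_point(2):.1f}% correct")
print(f"threshold estimate: {estimate:.2f}")
```

Setting `n_down=3` gives the "three-down, one-up" rule of figure 26.3, which by the same argument targets a higher point on the psychometric function than the "two-down, one-up" rule does.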
In the staircase approach, sensitivity is often defined as the mean of a number of staircase reversal points, as illustrated in figure 26.3. Sensitivity can also be derived by fitting a psychometric function to the complete data set by using a maximum likelihood procedure and then obtaining the stimulus magnitude that corresponds to a particular percentage correct value. A discussion of the relative merits of these two approaches has been provided by Klein.53
In addition to the transformed staircase, non-parametric staircase approaches include parameter estimation by sequential testing (PEST),99 which uses a heuristic set of rules to define the step size, with the staircase terminating when the step size reaches a predefined value, and the modified binary search (MOBS),102 which uses a bisection method to define the step size. The parametric approach is typified by QUEST,104 in which the stimulus value on a given trial is based on the most probable estimate of sensitivity as derived from a Bayesian statistical analysis of the results of previous trials. In QUEST, the underlying psychometric function is assumed to correspond to a Weibull function. A comprehensive, quantitative analysis of the various parametric and non-parametric adaptive techniques and their advantages and disadvantages has been provided by Treutwein.101
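The general idea behind a parametric Bayesian procedure can be conveyed with a simplified sketch. The code below is not the published QUEST algorithm: it assumes a Weibull psychometric function with a fixed slope, a 2AFC guessing rate of 0.5, a small lapse rate, a Gaussian prior over log threshold, and placement of each trial at the current posterior mode. All parameter values and function names are hypothetical choices made for illustration.

```python
import math
import random

def weibull_2afc(x, threshold, beta=3.5, guess=0.5, lapse=0.02):
    """Probability correct at log stimulus x for an assumed Weibull observer."""
    return guess + (1.0 - guess - lapse) * (
        1.0 - math.exp(-(10.0 ** (beta * (x - threshold)))))

def bayesian_threshold(respond, grid, prior_mean=0.0, prior_sd=1.0,
                       n_trials=40):
    """Track the posterior over candidate thresholds; return its mode."""
    # Log of an (unnormalized) Gaussian prior over the candidate thresholds.
    log_post = [-0.5 * ((t - prior_mean) / prior_sd) ** 2 for t in grid]
    for _ in range(n_trials):
        # Test at the currently most probable threshold.
        x = grid[log_post.index(max(log_post))]
        correct = respond(x)
        # Bayes' rule: add the log-likelihood of the response for each candidate.
        for i, t in enumerate(grid):
            p = weibull_2afc(x, t)
            log_post[i] += math.log(p if correct else 1.0 - p)
    return grid[log_post.index(max(log_post))]

# Candidate thresholds from -3 to +1 log units in steps of 0.05.
grid = [-3.0 + 0.05 * i for i in range(81)]

# Hypothetical observer whose true threshold is -1.0 log units.
random.seed(0)
estimate = bayesian_threshold(
    lambda x: random.random() < weibull_2afc(x, -1.0), grid)
print(f"estimated log threshold: {estimate:.2f}")
```

In contrast to the staircase, every response here updates the probability of every candidate threshold, so the assumed shape of the psychometric function determines how much each trial contributes to the estimate.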
Because of their relative efficiency, adaptive psychophysical techniques have seen widespread use in the clinical setting. One common application is in static perimetry, in which the goal is the rapid, accurate assessment of increment thresholds at multiple locations throughout the visual field. Some of the adaptive algorithms that are commonly used in commercial static perimeters have been discussed by Johnson.49
Interactive examples of some of the psychophysical methods described in the preceding sections can be found on the CD-ROM that accompanies the textbook on sensation and perception by Levine.62
Suprathreshold Psychophysical Techniques
Although psychophysical procedures are used primarily to derive thresholds, much of sensory experience results from suprathreshold stimulation. The assessment of visual responses to suprathreshold stimulation is readily performed using electrophysiological procedures, such as the measurement of an ERG luminance-response function. However, the assessment of visual responses to suprathreshold stimuli is problematic for psychophysical techniques. Nevertheless, sensory scaling procedures have been developed, in large part by S. S. Stevens, that can potentially assess the suprathreshold properties of the human visual system (reviewed by Marks and Gescheider70).
One approach to sensory scaling is the method of magnitude estimation, in which an observer is asked to assign a number to the magnitude of the sensory experience that is elicited by a stimulus presentation. The technique of magnitude production has also been used, in which the observer manipulates the stimulus value in order to generate sensory events of particular magnitudes. The stipulation in both magnitude estimation and magnitude production is that the assigned numbers or stimulus settings reflect the magnitude of sensory experience on a ratio scale (i.e., this light is twice as bright as that one). From such techniques, it is possible to derive a scale of the relationship between stimulus magnitude (S) and sensation magnitude (R). Experiments of this nature have typically reported a power law relationship between these two variables:
$R = kS^{n}$    (1)
in which k is a constant of proportionality and n is an exponent that varies with the sensory modality. Psychophysical methods for sensory scaling have found limited application in the clinical setting but have proven useful in specialized circumstances, such as in characterizing the nature of suprathreshold contrast perception in amblyopia.65
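Because equation 1 becomes a straight line in logarithmic coordinates (log R = log k + n log S), the exponent can be estimated by linear regression on log-transformed data. The sketch below illustrates this with hypothetical magnitude estimates generated from assumed values of k = 2 and n = 0.33; the function name `fit_power_law` is likewise an assumption of the example.

```python
import math

def fit_power_law(stimuli, responses):
    """Least-squares fit of log10(R) = log10(k) + n * log10(S); returns (k, n)."""
    xs = [math.log10(s) for s in stimuli]
    ys = [math.log10(r) for r in responses]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    n = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    k = 10.0 ** (mean_y - n * mean_x)
    return k, n

# Hypothetical noise-free magnitude estimates generated from R = 2 * S**0.33.
stimuli = [1.0, 10.0, 100.0, 1000.0]
responses = [2.0 * s ** 0.33 for s in stimuli]

k, n = fit_power_law(stimuli, responses)
print(f"k = {k:.2f}, n = {n:.2f}")
```

With real magnitude estimates the points scatter about the line, and the slope of the fitted line provides the estimate of the exponent n.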
Clinical Applications of Psychophysical Methodology
Psychophysical methods are of considerable interest in their own right, but they are also of immense practical value in the clinical setting by virtue of their ability to provide important information about disease processes that is not readily available by other means. For this reason, psychophysical techniques have long played a key role in the clinical evaluation of various forms of visual deficits, whether applied informally, as in the measurement of Snellen visual acuity or, more formally, as in the sophisticated adaptive algorithms used in commercial static perimetry. The aim has generally been to provide information about the characteristics, time course, and underlying pathophysiology of visual system disorders, which is useful in patient management and in determining whether visual disorders are amenable to therapeutic intervention.
It has become increasingly apparent, however, that the traditional clinical psychophysical tests such as those of visual acuity and perimetry might not reveal the full extent of damage in disorders of the visual system.83 As an example, people with melanoma-associated retinopathy, a form of night blindness associated with a malignant skin cancer, have normal visual acuity but a substantial reduction in their sensitivity to motion that is not apparent on standard clinical vision tests.106 Therefore, new clinical psychophysical approaches are being developed that take into account the fact that the visual system consists of parallel pathways that may be differentially vulnerable to ocular disease processes.
A number of different parallel pathways have been identified in electrophysiological recordings from the primate visual system, and it is generally assumed that these pathways are applicable to human vision as well. Examples include the red-green and blue-yellow chromatic pathways for spectral coding;25 the ON and OFF pathways, which code information about light increments and decrements, respectively;89 the magnocellular and parvocellular pathways, which have different contrast coding properties as well as other distinctive features;51 and dorsal and ventral streams, which are thought to process information about “where” an object is in space and “what” it is, respectively.71
These multiple processing streams can be viewed as consisting of sets of filters or analyzers that extract specific types of information from the visual environment.38,83 Traditionally, standard clinical psychophysical procedures have tended to emphasize only first-order analyzers, which encode luminance variations, while neglecting other types of analyzers, such as the second-order system that responds to local variations in contrast or texture.66
By choosing the appropriate stimuli and visual tasks, it is possible to emphasize specific visual subsystems to test for “hidden” losses that are not readily apparent by standard clinical psychophysical procedures. As an example of this approach, patients with glaucoma were tested with specialized perimetric procedures that consisted of red-on-white increments, blue-on-white increments, and critical flicker frequency.73 The goal was to evaluate relative sensitivity losses within a red-green chromatic mechanism, a “blue-on” chromatic mechanism, and an achromatic temporal mechanism, respectively. Chromatic defects were more apparent than achromatic defects in the glaucoma patients, emphasizing the need to test for specific pathway deficits in patients with ocular disease.
In summary, a wide variety of psychophysical procedures have been developed for evaluating and understanding visual function, both in visually normal individuals and in those with visual system disorders. The psychophysical method of choice in any given situation depends ultimately on trade-offs among a number of different factors, including the necessity for the control of the observer's criterion, the efficiency of the threshold estimation procedure, the cost in terms of the observer's and experimenter's time, and the nature of the visual task. Applied optimally, clinical psychophysical techniques provide a powerful noninvasive means for defining the pathophysiology of visual system disorders and for assessing the impact of potential treatment methods.