Monthly
288 pp. per issue
6 x 9, illustrated
ISSN
0899-7667
E-ISSN
1530-888X
2014 Impact factor:
2.21

Neural Computation

November 1, 2000, Vol. 12, No. 11, Pages 2547-2572
(doi: 10.1162/089976600300014845)
© 2000 Massachusetts Institute of Technology
A Model of Invariant Object Recognition in the Visual System: Learning Rules, Activation Functions, Lateral Inhibition, and Information-Based Performance Measures
Article PDF (245.37 KB)
Abstract

VisNet2 is a model to investigate some aspects of invariant visual object recognition in the primate visual system. It is a four-layer feedforward network with convergence to each part of a layer from a small region of the preceding layer, with competition between the neurons within a layer and with a trace learning rule to help it learn transform invariance. The trace rule is a modified Hebbian rule, which modifies synaptic weights according to both the current firing rates and the firing rates to recently seen stimuli. This enables neurons to learn to respond similarly to the gradually transforming inputs it receives, which over the short term are likely to be about the same object, given the statistics of normal visual inputs. First, we introduce for VisNet2 both single-neuron and multiple-neuron information-theoretic measures of its ability to respond to transformed stimuli. Second, using these measures, we show that quantitatively resetting the trace between stimuli is not necessary for good performance. Third, it is shown that the sigmoid activation functions used in VisNet2, which allow the sparseness of the representation to be controlled, allow good performance when using sparse distributed representations. Fourth, it is shown that VisNet2 operates well with medium-range lateral inhibition with a radius in the same order of size as the region of the preceding layer from which neurons receive inputs. Fifth, in an investigation of different learning rules for learning transform invariance, it is shown that VisNet2 operates better with a trace rule that incorporates in the trace only activity from the preceding presentations of a given stimulus, with no contribution to the trace from the current presentation, and that this is related to temporal difference learning.