ISSN: 0899-7667
E-ISSN: 1530-888X
2014 Impact factor: 2.21

Neural Computation

July 1, 1998, Vol. 10, No. 5, Pages 1157-1178
(doi: 10.1162/089976698300017395)
© 1998 Massachusetts Institute of Technology
A Learning Theorem for Networks at Detailed Stochastic Equilibrium
Abstract

This article analyzes learning in continuous stochastic neural networks defined by stochastic differential equations (SDE). In particular, it studies gradient descent learning rules to train the equilibrium solutions of these networks. A theorem is given that specifies sufficient conditions for the gradient descent learning rules to be local covariance statistics between two random variables: (1) an evaluator that is the same for all the network parameters and (2) a system variable that is independent of the learning objective. While this article focuses on continuous stochastic neural networks, the theorem applies to any other system with Boltzmann-like equilibrium distributions. The generality of the theorem suggests that instead of suppressing noise present in physical devices, a natural alternative is to use it to simplify the credit assignment problem. In deterministic networks, credit assignment requires an evaluation signal that is different for each node in the network. Surprisingly, when noise is not suppressed, all that is needed is an evaluator that is the same for the entire network and a local Hebbian signal. This modularization of signals greatly simplifies hardware and software implementations. The article shows how the theorem applies to four different learning objectives that span supervised, reinforcement, and unsupervised problems: (1) regression, (2) density estimation, (3) risk minimization, and (4) information maximization. Simulations, implementation issues, and implications for computational neuroscience are discussed.
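The covariance form of the learning rule can be checked numerically on a toy system. The sketch below is not the article's SDE network: it assumes a small discrete system with a Boltzmann-like equilibrium distribution p(x) proportional to exp(sum of w_ij x_i x_j), and an evaluator that does not depend on the weights. Under those assumptions, the derivative of the expected evaluator with respect to w_ij equals the covariance between the evaluator (one scalar shared by all weights) and the local Hebbian product x_i x_j. All names here (boltzmann, covariance_gradient, evaluator, WEIGHTS) are hypothetical and chosen only for illustration.

```python
import itertools
import math


def boltzmann(weights, states):
    """Exact equilibrium probabilities, p(x) proportional to exp(sum_ij w_ij x_i x_j)."""
    scores = [sum(w * x[i] * x[j] for (i, j), w in weights.items()) for x in states]
    z = sum(math.exp(s) for s in scores)
    return [math.exp(s) / z for s in scores]


def expectation(f, probs, states):
    """Expected value of f(x) under the given state probabilities."""
    return sum(p * f(x) for p, x in zip(probs, states))


def covariance_gradient(weights, states, evaluator):
    """Gradient of E[evaluator] as Cov(evaluator, x_i * x_j) for each weight.

    Assumption: the evaluator itself does not depend on the weights.
    """
    probs = boltzmann(weights, states)
    mean_eval = expectation(evaluator, probs, states)
    grad = {}
    for (i, j) in weights:
        # Local Hebbian signal for the connection between units i and j.
        hebb = lambda x, i=i, j=j: x[i] * x[j]
        mean_hebb = expectation(hebb, probs, states)
        mean_prod = expectation(lambda x: evaluator(x) * hebb(x), probs, states)
        grad[(i, j)] = mean_prod - mean_eval * mean_hebb
    return grad


def finite_difference_gradient(weights, states, evaluator, eps=1e-6):
    """Central-difference reference gradient, used only as a sanity check."""
    grad = {}
    for key in weights:
        up = dict(weights)
        up[key] += eps
        down = dict(weights)
        down[key] -= eps
        f_up = expectation(evaluator, boltzmann(up, states), states)
        f_down = expectation(evaluator, boltzmann(down, states), states)
        grad[key] = (f_up - f_down) / (2 * eps)
    return grad


# Three binary units with illustrative (made-up) symmetric couplings.
STATES = list(itertools.product([-1, 1], repeat=3))
WEIGHTS = {(0, 1): 0.5, (1, 2): -0.3, (0, 2): 0.2}


def evaluator(x):
    """Hypothetical global cost: penalize disagreement between units 0 and 2."""
    return (x[0] - x[2]) ** 2


COV_GRAD = covariance_gradient(WEIGHTS, STATES, evaluator)
FD_GRAD = finite_difference_gradient(WEIGHTS, STATES, evaluator)

if __name__ == "__main__":
    for key in WEIGHTS:
        print(key, COV_GRAD[key], FD_GRAD[key])
```

Every weight uses the same scalar evaluator and differs only in its local Hebbian term, which is the modularization the abstract describes. The expectations here are computed exactly by enumerating all eight states; in a physical or simulated noisy network they would presumably be estimated from samples at equilibrium.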