Monthly
288 pp. per issue
6 x 9, illustrated
ISSN
0899-7667
E-ISSN
1530-888X
2014 Impact factor:
2.21

Neural Computation

February 2009, Vol. 21, No. 2, Pages 301-339
(doi: 10.1162/neco.2008.08-07-593)
© 2008 Massachusetts Institute of Technology
A Spiking Neural Network Model of an Actor-Critic Learning Agent
Article PDF (321.5 KB)
Abstract

The ability to adapt behavior to maximize reward as a result of interactions with the environment is crucial for the survival of any higher organism. In the framework of reinforcement learning, temporal-difference learning algorithms provide an effective strategy for such goal-directed adaptation, but it is unclear to what extent these algorithms are compatible with neural computation. In this article, we present a spiking neural network model that implements actor-critic temporal-difference learning by combining local plasticity rules with a global reward signal. The network is capable of solving a nontrivial gridworld task with sparse rewards. We derive a quantitative mapping of plasticity parameters and synaptic weights to the corresponding variables in the standard algorithmic formulation and demonstrate that the network learns with a similar speed to its discrete time counterpart and attains the same equilibrium performance.