MIT CogNet, The Brain Sciences Connection, from the MIT Press

Actor-Critic Algorithms

 Vijay R. Konda and John N. Tsitsiklis
  
 

Abstract:
We propose and analyze a class of actor-critic algorithms for simulation-based optimization of a Markov decision process over a parameterized family of randomized stationary policies. These are two-time-scale algorithms in which the critic uses TD learning with a linear approximation architecture, and the actor is updated in an approximate gradient direction based on information provided by the critic. We show that a set of appropriate features for the critic is prescribed by the choice of parametrization of the actor. We provide an interpretation of the gradient in terms of Riemannian geometry, and conclude by discussing convergence properties and some open problems.
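The two-time-scale structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' exact algorithm: it assumes a small tabular MDP, a softmax policy, an average-reward TD(0) critic, and hand-picked step-size exponents, and it omits any safeguards the full method may use. The names `softmax_policy`, `compatible_features`, and `run_actor_critic`, and the toy arrays `P` and `R`, are hypothetical and chosen only for illustration. The critic is linear in the features psi(s, a) = grad_theta log pi(a | s), which is one way to read the statement that the actor's parametrization prescribes the critic's features.

```python
import numpy as np


def softmax_policy(theta, s):
    """Action probabilities pi(. | s) for a tabular softmax policy."""
    z = theta[s] - theta[s].max()  # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()


def compatible_features(theta, s, a):
    """psi(s, a) = grad_theta log pi(a | s), flattened to a vector."""
    psi = np.zeros_like(theta)
    psi[s] = -softmax_policy(theta, s)
    psi[s, a] += 1.0
    return psi.ravel()


def run_actor_critic(P, R, steps=5000, seed=0):
    """Illustrative actor-critic loop (assumed setup, not the paper's exact method).

    P[s, a] is a distribution over next states; R[s, a] is the reward.
    """
    rng = np.random.default_rng(seed)
    n_states, n_actions = R.shape
    theta = np.zeros((n_states, n_actions))  # actor parameters
    w = np.zeros(n_states * n_actions)       # critic weights
    avg_reward = 0.0                         # running average-reward estimate
    s = 0
    a = rng.choice(n_actions, p=softmax_policy(theta, s))
    for k in range(1, steps + 1):
        beta = 0.2 / k**0.6   # critic step size (faster time scale)
        alpha = 0.1 / k**0.9  # actor step size (slower; alpha / beta -> 0)
        r = R[s, a]
        s_next = rng.choice(n_states, p=P[s, a])
        a_next = rng.choice(n_actions, p=softmax_policy(theta, s_next))
        psi = compatible_features(theta, s, a)
        psi_next = compatible_features(theta, s_next, a_next)
        # Critic: average-reward TD(0) with a linear architecture.
        td_error = r - avg_reward + w @ psi_next - w @ psi
        avg_reward += beta * (r - avg_reward)
        w += beta * td_error * psi
        # Actor: approximate gradient step using the critic's estimate w @ psi.
        theta += alpha * (w @ psi) * psi.reshape(theta.shape)
        s, a = s_next, a_next
    return theta, w, avg_reward


# Toy problem (assumed): two states, two actions, uniform transitions,
# and action 1 always pays reward 1 while action 0 pays 0.
P = np.full((2, 2, 2), 0.5)
R = np.array([[0.0, 1.0], [0.0, 1.0]])
theta, w, avg_reward = run_actor_critic(P, R)
```

The critic step size decays more slowly than the actor step size, so the critic sees a nearly stationary policy while the actor sees a nearly converged critic. The exponents 0.6 and 0.9 are one arbitrary choice that satisfies this separation.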

 
 


© 2010 The MIT Press