MIT CogNet, The Brain Sciences ConnectionFrom the MIT Press, Link to Online Catalog
SPARC Communities
Subscriber : Stanford University Libraries » LOG IN

space

Powered By Google 
Advanced Search

 

An Environment Model for Nonstationary Reinforcement Learning

 Samuel P. Choi, Dit-Yan Yeung and Nevin L. Zhang
  
 

Abstract:
Reinforcement learning in nonstationary environments is generally regarded as an important and yet difficult problem. This paper partially addresses the problem by formalizing a subclass of nonstationary environments. The environment model, called hidden-mode Markov decision process (HM-MDP), assumes that environmental changes are always confined to a small number of hidden modes. A mode basically indexes a Markov decision process (MDP) and evolves with time according to a Markov chain. While HM-MDP is a special case of partially observable Markov decision processes (POMDP), modeling an HM-MDP environment via the more general POMDP model unnecessarily increases the problem complexity. A variant of the Baum-Welch algorithm is developed for model learning requiring less data and time.

 
 


© 2010 The MIT Press
MIT Logo