Monte Carlo POMDPs

Sebastian Thrun
 

Abstract:
We present a Monte Carlo algorithm for learning to act optimally in partially observable Markov decision processes (POMDPs). Our approach uses importance sampling for representing beliefs, and Monte Carlo approximation for belief revision. Reinforcement learning (value iteration) is employed to learn value functions over belief states, and a sample-based version of nearest neighbor is used to generalize across beliefs. Our approach departs from previous work in the POMDP field in that it can handle real-valued state spaces. Initial empirical results suggest that our approach may work well in practical applications.
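
The two Monte Carlo ingredients named above, a sample-based (particle) representation of beliefs and importance sampling for belief revision, can be made concrete with a short sketch. The sketch below is illustrative only: transition_model and observation_likelihood are hypothetical placeholders for a POMDP's generative model, and the update shown is a generic particle-filter step under those assumed interfaces, not necessarily the paper's exact procedure.

import random

def update_belief(particles, action, observation,
                  transition_model, observation_likelihood):
    """One step of sample-based belief revision (generic particle filter).

    particles: states sampled from the current belief.
    transition_model(state, action) -> a sampled successor state.
    observation_likelihood(observation, state) -> p(observation | state).
    Returns a list of particles approximating the posterior belief.
    """
    # Propagate each particle through the stochastic transition model.
    proposals = [transition_model(s, action) for s in particles]
    # Importance weights: how well each proposal explains the observation.
    weights = [observation_likelihood(observation, s) for s in proposals]
    if sum(weights) == 0.0:
        # Observation is inconsistent with every particle; return the
        # unweighted proposals rather than dividing by zero.
        return proposals
    # Resample in proportion to the importance weights.
    return random.choices(proposals, weights=weights, k=len(particles))

In this representation a belief is just a set of samples, so a value function over beliefs can be generalized by nearest neighbor over stored sample sets, consistent with the abstract's description; this is what lets the approach operate in real-valued state spaces where exact belief updates are intractable.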

 
 

