Abstract:
Many researchers have explored methods for hierarchical
reinforcement learning (RL) with temporal abstractions, in which
abstract actions are defined that can perform many primitive
actions before terminating. However, little is known about learning
with state abstractions, in which aspects of the state space are
ignored. In previous work, we developed the MAXQ method for
hierarchical RL. In this paper, we define five conditions under
which state abstraction can be combined with the MAXQ value
function decomposition. We prove that the MAXQ-Q learning algorithm
converges under these conditions, and we show experimentally that state
abstraction is important for the successful application of MAXQ-Q
learning.