Abstract:
Learning Real-Time A* (LRTA*) is a popular control method
that interleaves planning and plan execution. Its advantages include:
it allows fine-grained control over how much planning to do between
plan executions, can use heuristic knowledge to guide planning, can be
interrupted at any state and resume execution at a different state,
and improves its plan-execution time as it solves similar planning
tasks, until its plan-execution time is optimal. LRTA* has been shown
to solve search problems in
known environments efficiently. In this paper, we apply LRTA* to
the problem of getting to a given goal location in an initially
unknown environment. Uninformed LRTA* with maximal lookahead always
moves on a shortest path to the closest unvisited state, that is, to
the closest potential goal state. This strategy was believed to be a
good exploration heuristic, but we show that it does not minimize the
worst-case plan-execution time compared to other uninformed
exploration methods. This result is also of interest to
reinforcement-learning researchers, since many reinforcement-learning
methods use asynchronous dynamic programming, just like LRTA*.
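To make the interleaving of planning, learning, and plan execution
concrete, the following is a minimal Python sketch of LRTA* with
lookahead one. It is an illustration only, not the paper's
implementation; the helper names (successors, cost, h0, is_goal) and
the step budget are assumptions supplied by the caller.

# Minimal LRTA* sketch (lookahead one); helper names are illustrative.
def lrta_star(start, is_goal, successors, cost, h0, max_steps=10000):
    h = {}            # learned heuristic table (asynchronous DP values)
    state = start
    path = [state]
    for _ in range(max_steps):
        if is_goal(state):
            return path
        # Planning: one-step lookahead, f(s') = c(s, s') + h(s').
        best_succ, best_f = None, float("inf")
        for succ in successors(state):
            f = cost(state, succ) + h.get(succ, h0(succ))
            if f < best_f:
                best_succ, best_f = succ, f
        if best_succ is None:
            return None  # dead end: no successors to move to
        # Learning: raise h(state) to the best lookahead estimate;
        # over repeated, similar tasks this drives the plan-execution
        # time toward optimal.
        h[state] = max(h.get(state, h0(state)), best_f)
        # Plan execution: move to the most promising successor.
        state = best_succ
        path.append(state)
    return None  # step budget exhausted without reaching the goal

On a four-connected gridworld, for example, h0 could be the Manhattan
distance to the goal; the uninformed variant discussed above
corresponds to h0(s) = 0 for every state s.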