Abstract:
We propose local error estimates together with algorithms for
adaptive a posteriori grid and time refinement in reinforcement
learning. We consider a deterministic system with continuous state
and time and an infinite-horizon discounted cost functional. For
grid refinement we follow the procedure of numerical methods for
the Bellman equation. For time refinement we propose a new
criterion based on consistency estimates of discrete solutions of
the Bellman equation. We demonstrate that an optimal ratio of time
to space discretization is crucial for optimal learning rates and
accuracy of the approximate optimal value function.