Abstract:
To find the optimal control for reinforcement learning (RL) problems
with continuous state space and continuous time, we
approximate the value function (VF) with a particular class of
functions called barycentric interpolators. We establish
sufficient conditions under which an RL algorithm converges to the
optimal VF, even when we use approximate models of the state
dynamics and the reinforcement functions.