Abstract:
In this article, we propose a new reinforcement learning
(RL) method based on an actor-critic architecture. The actor and
the critic are approximated by Normalized Gaussian Networks
(NGnets), which are networks of local linear regression units. The
NGnets are trained by the on-line EM algorithm proposed in our
previous paper. We apply our RL method to the task of swinging up
and stabilizing a single pendulum and the task of balancing a
double pendulum near the upright position. The experimental results
show that our RL method can be applied to optimal control problems
with continuous state/action spaces and that it achieves good
control within a small number of trial-and-error episodes.
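To make the abstract's description of the function approximator concrete, the following is a minimal sketch of an NGnet forward pass: a set of local linear regression units mixed by normalized Gaussian activations. All names, shapes, and the function ngnet_predict are illustrative assumptions for this sketch, not code from the paper, and the on-line EM training step is not shown.

    import numpy as np

    # Sketch of a Normalized Gaussian Network (NGnet) prediction:
    # M local linear units, each gated by a normalized Gaussian activation.
    def ngnet_predict(x, centers, inv_covs, weights, biases):
        """x: (N,) input; centers: (M, N); inv_covs: (M, N, N);
        weights: (M, D, N); biases: (M, D); returns a (D,) output."""
        diffs = centers - x                               # (M, N)
        # Unnormalized Gaussian activations; constant factors cancel
        # after normalization, so they are omitted here.
        exponents = -0.5 * np.einsum('mi,mij,mj->m', diffs, inv_covs, diffs)
        g = np.exp(exponents - exponents.max())           # numerically stabilized
        p = g / g.sum()                                   # normalized Gaussian weights
        # Each unit contributes a local linear model W_i x + b_i,
        # and the units are mixed by the normalized activations p_i.
        local = np.einsum('mdi,i->md', weights, x) + biases   # (M, D)
        return p @ local                                  # (D,)

In the actor-critic setting described above, one such network would map states to actions (actor) and another would map states to value estimates (critic), with their parameters updated by the on-line EM algorithm.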