Autonomous control of real snake-like robot using reinforcement learning; Abstraction of state-action space using properties of real world

Author(s):
Kazuyuki Ito, Yoshitaka Fukumori, Akihiro Takayama

2012, Vol. 45, pp. 515-564
Author(s):
J. Garcia, F. Fernandez

In this paper, we consider the important problem of safe exploration in reinforcement learning. While reinforcement learning is well suited to domains with complex transition dynamics and high-dimensional state-action spaces, an additional challenge is posed by the need for safe and efficient exploration. Traditional exploration techniques are not particularly useful for dangerous tasks, where the trial-and-error process may select actions whose execution in some states damages the learning system (or any other system). Consequently, when an agent begins interacting with a dangerous, high-dimensional state-action space, an important question arises: how to avoid (or at least minimize) the damage caused by exploring that space. We introduce the PI-SRL algorithm, which safely improves suboptimal albeit robust behaviors for continuous state and action control tasks and which efficiently learns from the experience gained from the environment. We evaluate the proposed method in four complex tasks: automatic car parking, pole-balancing, helicopter hovering, and business management.


Author(s):
Takaaki Kobayashi, Takeshi Shibuya, Masahiko Morita

When applying reinforcement learning (RL) algorithms such as Q-learning to real-world applications, we must consider the influence of sensor noise. The simplest way to reduce this influence is to add other types of sensors, but doing so enlarges the state space and is likely to introduce redundancy. Conventional value-function approximators used for RL in continuous state-action spaces do not handle such situations well. The selective desensitization neural network (SDNN) has high generalization ability and is robust against noise and redundant input. We therefore propose an SDNN-based value-function approximator for Q-learning in continuous state-action space, and evaluate its performance in terms of robustness against redundant input and sensor noise. Results show that our proposal is strongly robust against noise and redundant input and enables the agent to take better actions by using additional inputs without degrading learning efficiency. These properties are highly advantageous in real-world applications such as robotic systems.

