Autonomous control of real snake-like robot using reinforcement learning; Abstraction of state-action space using properties of real world

Author(s):
Kazuyuki Ito, Yoshitaka Fukumori, Akihiro Takayama

2012, Vol. 45, pp. 515-564
Author(s):
J. Garcia, F. Fernandez

In this paper, we consider the important problem of safe exploration in reinforcement learning. While reinforcement learning is well suited to domains with complex transition dynamics and high-dimensional state-action spaces, an additional challenge is posed by the need for safe and efficient exploration. Traditional exploration techniques are not particularly useful for dangerous tasks, where the trial-and-error process may select actions whose execution in some states damages the learning system (or any other system). Consequently, when an agent begins interacting with a dangerous, high-dimensional state-action space, an important question arises: how to avoid (or at least minimize) the damage caused by exploring that space. We introduce the PI-SRL algorithm, which safely improves suboptimal albeit robust behaviors for continuous state and action control tasks and which efficiently learns from the experience gained from the environment. We evaluate the proposed method in four complex tasks: automatic car parking, pole-balancing, helicopter hovering, and business management.


Author(s):
Takaaki Kobayashi, Takeshi Shibuya, Masahiko Morita

When applying reinforcement learning (RL) algorithms such as Q-learning to real-world applications, we must consider the influence of sensor noise. The simplest way to reduce this influence is to add other types of sensors, but doing so enlarges the state space and is likely to introduce redundancy. Conventional value-function approximators used for RL in continuous state-action spaces do not handle such situations well. The selective desensitization neural network (SDNN) has high generalization ability and is robust against noise and redundant input. We therefore propose an SDNN-based value-function approximator for Q-learning in continuous state-action space, and evaluate its performance in terms of robustness against redundant input and sensor noise. Results show that our proposal is strongly robust against noise and redundant input and enables the agent to take better actions by using additional inputs without degrading learning efficiency. These properties are highly advantageous in real-world applications such as robotic systems.

