TOWARDS CONTINUOUS CONTROL FOR MOBILE ROBOT NAVIGATION: A REINFORCEMENT LEARNING AND SLAM BASED APPROACH

Author(s): K. A. A. Mustafa, N. Botteghi, B. Sirmacek, M. Poel, S. Stramigioli

Abstract. We introduce a new autonomous path-planning algorithm for mobile robots that must reach target locations in an unknown environment using only on-board sensors. In particular, we describe the design and evaluation of a deep reinforcement learning motion planner, based on the deep deterministic policy gradient (DDPG), that outputs continuous linear and angular velocities to navigate to a desired target location. Additionally, the algorithm exploits the knowledge of the environment provided by a grid-based SLAM with a Rao-Blackwellized particle filter to shape the reward function, aiming to improve the convergence rate, escape local optima, and reduce collisions with obstacles. We compare a reward function shaped using the map provided by the SLAM algorithm against a reward function with no knowledge of the map. Results show that the proposed approach shortens the learning time, converging after 560 episodes compared to 1450 episodes for the standard RL algorithm, and reduces obstacle collisions, achieving a success ratio of 83% compared to 56% for the standard RL algorithm. The results are validated in a simulated experiment on a skid-steering mobile robot.
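The abstract does not reproduce the exact reward definition, so the following is only a minimal sketch of how an occupancy grid from a Rao-Blackwellized-particle-filter SLAM could shape a DDPG reward: a progress term toward the goal plus a penalty on cells the map believes are occupied. The function name, the weights, and the grid-indexing convention are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def shaped_reward(pos, prev_pos, goal, occ_grid, resolution=0.05,
                  collided=False, reached=False):
    """Hypothetical SLAM-shaped reward for a DDPG navigation policy.

    pos, prev_pos, goal : 2D positions in metres (np.ndarray of shape (2,))
    occ_grid            : SLAM occupancy grid with cell values in [0, 1]
    """
    if reached:
        return 100.0                                 # terminal goal bonus
    if collided:
        return -100.0                                # terminal collision penalty
    # Progress term: positive when this step moved the robot closer to the goal.
    progress = np.linalg.norm(prev_pos - goal) - np.linalg.norm(pos - goal)
    # Map term: penalise sitting in cells the SLAM map marks as likely occupied.
    ix, iy = (pos / resolution).astype(int)          # world -> grid indices
    occupancy = float(occ_grid[ix, iy])
    return 10.0 * progress - 5.0 * occupancy
```

Without the map term, the agent learns about obstacles only from terminal collision penalties; a shaped variant of this kind provides a dense gradient away from mapped obstacles, which is consistent with the faster convergence and lower collision count reported above.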

Sensors, 2020, Vol. 20 (19), p. 5493
Author(s): Junli Gao, Weijie Ye, Jing Guo, Zhongjuan Li

This paper proposes a novel incremental training mode to address the problem of Deep Reinforcement Learning (DRL) based path planning for a mobile robot. Firstly, we evaluate related graph-search algorithms and Reinforcement Learning (RL) algorithms in a lightweight 2D environment. Then, we design the DRL algorithm, including the observation states, reward function, network structure, and parameter optimization, in the 2D environment to circumvent the time-consuming work a 3D environment would require. We transfer the designed algorithm to a simple 3D environment for retraining to obtain converged network parameters, including the weights and biases of the deep neural network (DNN). Using these parameters as initial values, we continue training the model in a complex 3D environment. To improve the generalization of the model across different scenes, we propose to combine the DRL algorithm Twin Delayed Deep Deterministic Policy Gradient (TD3) with the traditional global path-planning algorithm Probabilistic Roadmap (PRM) as a novel path planner (PRM+TD3). Experimental results show that the incremental training mode notably improves development efficiency, and that the PRM+TD3 path planner effectively improves the generalization of the model.
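A minimal sketch of the incremental training mode described above, using PyTorch purely for illustration: the converged parameters of one stage become the initial values of the next, so only the parameter hand-off is shown. The network shape, environment names, checkpoint file, and stub training loop are assumptions, not details from the paper.

```python
import torch
import torch.nn as nn

# Hypothetical TD3 actor; the paper's exact architecture is not given here.
actor = nn.Sequential(nn.Linear(24, 256), nn.ReLU(),
                      nn.Linear(256, 256), nn.ReLU(),
                      nn.Linear(256, 2), nn.Tanh())  # continuous (v, w) actions

def train(actor, env_name):
    """Placeholder for a full TD3 training loop in the named environment."""
    print(f"training in {env_name} ...")

# Stage 1: design and converge the agent in a lightweight 2D environment.
train(actor, "2d-lightweight")
torch.save(actor.state_dict(), "stage1.pt")          # converged weights and biases

# Stage 2: reuse those parameters as initial values in a simple 3D environment.
actor.load_state_dict(torch.load("stage1.pt"))
train(actor, "3d-simple")

# Stage 3: continue training the same parameters in the complex 3D environment.
train(actor, "3d-complex")
```

In the PRM+TD3 combination, a roadmap planner would additionally supply coarse global waypoints for the trained actor to track with continuous commands; that integration is omitted from this sketch.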


Sensors, 2019, Vol. 19 (18), p. 3837
Author(s): Junjie Zeng, Rusheng Ju, Long Qin, Yue Hu, Quanjun Yin, et al.

In this paper, we propose a novel Deep Reinforcement Learning (DRL) algorithm that can navigate non-holonomic robots with continuous control in an unknown dynamic environment with moving obstacles. We call the approach MK-A3C (Memory and Knowledge-based Asynchronous Advantage Actor-Critic) for short. As its first component, MK-A3C builds a GRU-based memory neural network to enhance the robot’s capability for temporal reasoning: robots without such memory tend to behave irrationally in the face of incomplete and noisy state estimates in complex environments, whereas the memory endowed by MK-A3C lets the robot escape local-minimum traps by implicitly estimating the environment model. Secondly, MK-A3C combines a domain-knowledge-based reward function with a transfer-learning-based training-task architecture, which addresses the policy non-convergence problems caused by sparse rewards. Together, these improvements allow MK-A3C to efficiently navigate robots in unknown dynamic environments, satisfying kinematic constraints while handling moving obstacles. Simulation experiments show that, compared with existing methods, MK-A3C achieves successful robotic navigation in unknown and challenging environments by outputting continuous acceleration commands.
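As a sketch of the memory component only: a GRU cell carries a hidden state across time steps so the policy can reason over incomplete, noisy observations. This is not the authors' network; the layer sizes, observation layout, and two-dimensional acceleration output are assumptions for illustration.

```python
import torch
import torch.nn as nn

class MemoryActorCritic(nn.Module):
    """GRU-based actor-critic in the spirit of MK-A3C (sizes are assumptions)."""

    def __init__(self, obs_dim=32, hidden_dim=128, act_dim=2):
        super().__init__()
        self.encoder = nn.Linear(obs_dim, hidden_dim)
        self.memory = nn.GRUCell(hidden_dim, hidden_dim)  # temporal memory
        self.policy = nn.Linear(hidden_dim, act_dim)      # continuous accelerations
        self.value = nn.Linear(hidden_dim, 1)             # state-value head

    def forward(self, obs, hidden):
        x = torch.relu(self.encoder(obs))
        hidden = self.memory(x, hidden)           # fold this step into the memory
        action = torch.tanh(self.policy(hidden))  # bounded acceleration command
        return action, self.value(hidden), hidden

# One rollout step; the hidden state persists between environment steps.
net = MemoryActorCritic()
hidden = torch.zeros(1, 128)
obs = torch.randn(1, 32)                          # placeholder sensor reading
action, value, hidden = net(obs, hidden)
```

Because the hidden state is threaded through successive calls, the policy can condition on observation history rather than the current frame alone, which is what lets a memory-based agent detect and escape local-minimum traps.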

