Distributed Non-Communicating Multi-Robot Collision Avoidance via Map-Based Deep Reinforcement Learning

Sensors ◽  
2020 ◽  
Vol 20 (17) ◽  
pp. 4836
Author(s):  
Guangda Chen ◽  
Shunyi Yao ◽  
Jun Ma ◽  
Lifan Pan ◽  
Yu’an Chen ◽  
...  

It is challenging to avoid obstacles safely and efficiently for multiple robots of different shapes in distributed and communication-free scenarios, where robots do not communicate with each other and only sense other robots’ positions and the obstacles around them. Most existing multi-robot collision avoidance systems either require communication between robots or require expensive movement data of other robots, such as velocities, accelerations and paths. In this paper, we propose a map-based deep reinforcement learning approach for multi-robot collision avoidance in a distributed and communication-free environment. We use the egocentric local grid map of a robot to represent the environmental information around it, including its own shape and the observable appearances of other robots and obstacles, which can be easily generated using multiple sensors or sensor fusion. Then we apply the distributed proximal policy optimization (DPPO) algorithm to train a convolutional neural network that directly maps three frames of egocentric local grid maps and the robot’s relative local goal positions into low-level robot control commands. Compared to other methods, the map-based approach is more robust to noisy sensor data, does not require other robots’ movement data, and accounts for the sizes and shapes of the robots involved, which makes it more efficient and easier to deploy on real robots. We first train the neural network in a dedicated multi-robot simulator using DPPO, where a multi-stage curriculum learning strategy over multiple scenarios is used to improve performance. Then we deploy the trained model to real robots to perform collision avoidance during navigation without tedious parameter tuning. We evaluate the approach in multiple scenarios, both in the simulator and on four differential-drive mobile robots in the real world.
Both qualitative and quantitative experiments show that our approach is efficient and outperforms existing DRL-based approaches in many indicators. We also conduct ablation studies showing the positive effects of using egocentric grid maps and multi-stage curriculum learning.
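The egocentric local grid map described above can be illustrated with a minimal sketch: world-frame obstacle points (e.g., from a fused laser scan) are transformed into the robot's body frame and rasterized into an occupancy grid centred on the robot. This is an illustrative reconstruction, not the authors' implementation; the grid size and resolution are assumed values.

```python
import numpy as np

def egocentric_grid(pose, obstacle_points, size=60, resolution=0.1):
    """Rasterize world-frame obstacle points into an egocentric
    occupancy grid (1 = occupied, 0 = free) centred on the robot.

    pose: (x, y, theta) of the robot in the world frame.
    obstacle_points: (N, 2) array of world-frame obstacle positions.
    Returns a (size, size) uint8 grid; the robot sits at the centre
    cell, with the robot's forward axis along increasing row index.
    """
    x, y, theta = pose
    # Rotation by -theta maps world-frame displacements into the robot frame.
    c, s = np.cos(-theta), np.sin(-theta)
    R = np.array([[c, -s], [s, c]])
    local = (np.asarray(obstacle_points) - [x, y]) @ R.T
    grid = np.zeros((size, size), dtype=np.uint8)
    idx = np.round(local / resolution).astype(int) + size // 2
    valid = np.all((idx >= 0) & (idx < size), axis=1)  # drop points outside the map
    grid[idx[valid, 0], idx[valid, 1]] = 1
    return grid
```

In the paper's pipeline, three consecutive frames of such grids (plus the relative goal) would form the network input; stacking frames lets the policy infer other robots' motion without receiving their velocities.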

2019 ◽  
Vol 99 (2) ◽  
pp. 371-386 ◽  
Author(s):  
Junchong Ma ◽  
Huimin Lu ◽  
Junhao Xiao ◽  
Zhiwen Zeng ◽  
Zhiqiang Zheng

Robotica ◽  
2014 ◽  
Vol 33 (2) ◽  
pp. 332-347 ◽  
Author(s):  
Riccardo Falconi ◽  
Lorenzo Sabattini ◽  
Cristian Secchi ◽  
Cesare Fantuzzi ◽  
Claudio Melchiorri

SUMMARY: In this paper, a consensus-based control strategy is presented to gather a group of differential-wheeled robots into a formation. The formation shape and the avoidance of collisions between robots are obtained by exploiting the properties of weighted graphs. Since the mobile robots are assumed to move in unknown environments, the presented approach to multi-robot coordination has been extended to include obstacle avoidance. The effectiveness of the proposed control strategy has been demonstrated by means of analytical proofs. Moreover, results of simulations and experiments on real robots are provided for validation purposes.
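The weighted-graph consensus idea in this abstract can be sketched in a few lines: each robot moves to reduce the mismatch between its actual displacement from its neighbours and the desired formation offsets, weighted by the graph's edge weights. This is a generic discrete-time consensus sketch under assumed gains and step size, not the paper's controller (which also includes collision and obstacle avoidance terms).

```python
import numpy as np

def formation_consensus(x0, offsets, W, dt=0.05, steps=400):
    """Discrete-time consensus with formation offsets.

    x0: (n, 2) initial positions; offsets: (n, 2) desired offsets
    from an (unspecified) formation reference; W: (n, n) symmetric
    non-negative weight matrix of the communication graph.
    """
    x = np.array(x0, dtype=float)
    d = np.asarray(offsets, dtype=float)
    n = len(x)
    for _ in range(steps):
        u = np.zeros_like(x)
        for i in range(n):
            for j in range(n):
                # Drive the inter-robot displacement toward d[i] - d[j].
                u[i] -= W[i, j] * ((x[i] - x[j]) - (d[i] - d[j]))
        x += dt * u
    return x
```

Only relative displacements are controlled, so the formation converges up to a common translation; this matches the decentralized spirit of the approach, since each robot needs only its neighbours' relative positions.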


2020 ◽  
Vol 39 (7) ◽  
pp. 856-892 ◽  
Author(s):  
Tingxiang Fan ◽  
Pinxin Long ◽  
Wenxi Liu ◽  
Jia Pan

Developing a safe and efficient collision-avoidance policy for multiple robots is challenging in decentralized scenarios, where each robot generates its paths with limited observation of other robots’ states and intentions. Prior distributed multi-robot collision-avoidance systems often require frequent inter-robot communication or agent-level features to plan a local collision-free action, which is neither robust nor computationally affordable. In addition, the performance of these methods is not comparable with that of their centralized counterparts in practice. In this article, we present a decentralized sensor-level collision-avoidance policy for multi-robot systems, which shows promising results in practical applications. In particular, our policy directly maps raw sensor measurements to an agent’s steering commands in terms of movement velocity. As a first step toward reducing the performance gap between decentralized and centralized methods, we present a multi-scenario multi-stage training framework to learn an optimal policy. The policy is trained over a large number of robots in rich, complex environments simultaneously using a policy-gradient-based reinforcement-learning algorithm. The learning algorithm is also integrated into a hybrid control framework to further improve the policy’s robustness and effectiveness. We validate the learned sensor-level collision-avoidance policy in a variety of simulated and real-world scenarios with thorough performance evaluations for large-scale multi-robot systems. The generalization of the learned policy is verified in a set of unseen scenarios, including the navigation of a group of heterogeneous robots and a large-scale scenario with 100 robots.
Although the policy is trained using simulation data only, we have successfully deployed it on physical robots whose shapes and dynamics differ from those of the simulated agents, demonstrating the controller’s robustness to simulation-to-real modeling error. Finally, we show that the collision-avoidance policy learned from multi-robot navigation tasks provides an excellent solution for safe and effective autonomous navigation of a single robot working in a dense human crowd. Our learned policy enables a robot to make effective progress in a crowd without getting stuck. More importantly, the policy has been successfully deployed on different types of physical robot platforms without tedious parameter tuning. Videos are available at https://sites.google.com/view/hybridmrca.
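The hybrid control framework mentioned above can be illustrated as a per-cycle mode switch: trust the learned sensor-level policy in normal operation, but fall back to a conservative recovery behaviour when an obstacle enters a safety radius, and stop at the goal. This is a hypothetical sketch with assumed thresholds and recovery action, not the article's actual switching logic.

```python
def hybrid_controller(scan_min, goal_dist, learned_cmd,
                      safe_radius=0.4, goal_radius=0.1):
    """Select a (linear, angular) velocity command each control cycle.

    scan_min: smallest range in the current laser scan (metres).
    goal_dist: distance to the local goal (metres).
    learned_cmd: (v, w) proposed by the learned policy.
    """
    if goal_dist < goal_radius:
        return (0.0, 0.0)   # reached the goal: stop
    if scan_min < safe_radius:
        return (0.0, 0.5)   # obstacle too close: rotate in place (conservative recovery)
    return learned_cmd      # normal mode: trust the learned sensor-level policy
```

Keeping the safety check outside the network is what makes the deployed system robust: the learned policy handles the hard interactive cases, while a simple verifiable rule guards the worst case.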


2020 ◽  
Vol 17 (3) ◽  
pp. 172988142092167
Author(s):  
Hao Quan ◽  
Yansheng Li ◽  
Yi Zhang

Mobile robots are now used in an increasingly wide range of applications, and their movement depends on effective navigation, especially path exploration. To address such navigation problems, this article proposes a method based on deep reinforcement learning and recurrent neural networks, combining a double-network (double net) module and a recurrent neural network module with reinforcement learning. The article also designs corresponding parameter functions to improve the model's performance. To test the effectiveness of this method, training is carried out on a grid map model in a two-dimensional simulation environment, a three-dimensional TurtleBot simulation environment, and a physical robot environment, and the resulting data are collected for comparative analysis. The experimental results show that the proposed algorithm achieves a clear improvement in path-finding efficiency and path length.
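The "double net" idea referenced in this abstract is commonly realized as Double DQN, where the online network selects the next action and a separate target network evaluates it, reducing Q-value over-estimation. A minimal sketch of that target computation, under the assumption that the article's double-net module follows this standard form:

```python
import numpy as np

def double_dqn_targets(q_online_next, q_target_next, rewards, dones, gamma=0.99):
    """Double-DQN bootstrap targets for a batch of transitions.

    q_online_next: (B, A) Q-values of next states from the online net.
    q_target_next: (B, A) Q-values of next states from the target net.
    rewards, dones: (B,) arrays; dones masks out the bootstrap term.
    """
    a_star = np.argmax(q_online_next, axis=1)            # online net picks the action
    q_eval = q_target_next[np.arange(len(a_star)), a_star]  # target net evaluates it
    return rewards + gamma * (1.0 - dones) * q_eval
```

Decoupling action selection from action evaluation is the whole trick: a single network tends to pick the action whose value estimate is most over-optimistic, and then evaluate it with that same optimistic estimate.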

