Reinforcement-Learning-Based Asynchronous Formation Control Scheme for Multiple Unmanned Surface Vehicles

The high performance and efficiency of multiple unmanned surface vehicles (multi-USV) promote further civilian and military applications of coordinated USVs. Because formation control underlies cooperative multi-USV work, considerable attention has been devoted to developing decentralized formation control for USV swarms. Formation control of multiple USVs is a geometric problem of multi-robot systems; the main challenge is how to generate and maintain the formation. The rapid development of reinforcement learning offers a new way to address these problems. In this paper, we introduce a decentralized structure for the multi-USV system and employ reinforcement learning to handle formation control in a leader–follower topology. Specifically, we propose an asynchronous decentralized formation control scheme based on reinforcement learning for multiple USVs. First, a simplified USV model is established, together with a formation shape model that provides formation parameters and describes the physical relationship between USVs. Second, the advantage deep deterministic policy gradient (ADDPG) algorithm is proposed. Third, formation generation and formation maintenance policies based on ADDPG are proposed to form and maintain the given geometric structure of the USV team during movement. Moreover, three new reward functions are designed to promote policy learning. Finally, various experiments are conducted to validate the proposed formation control scheme. Simulation results and comparison experiments demonstrate its efficiency and stability.
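The abstract does not give the formation shape model itself, so the following is only a minimal sketch of one common leader–follower parameterization: each follower is assigned a distance and a bearing relative to the leader's heading. The function names, the distance-bearing parameters, and the error measure are illustrative assumptions, not the authors' definitions.

```python
import math

def desired_follower_position(leader_x, leader_y, leader_heading, distance, bearing):
    """Target position for a follower kept at `distance` from the leader.

    `bearing` is the angle (radians) of the follower's slot measured from the
    leader's heading. This distance-bearing form is an assumed parameterization.
    """
    angle = leader_heading + bearing
    return (leader_x + distance * math.cos(angle),
            leader_y + distance * math.sin(angle))

def formation_error(follower_x, follower_y, target):
    """Euclidean distance between a follower and its assigned slot."""
    return math.hypot(target[0] - follower_x, target[1] - follower_y)

# Example: a follower slot 2 m directly behind a leader heading along +x.
slot = desired_follower_position(0.0, 0.0, 0.0, distance=2.0, bearing=math.pi)
```

An error of this kind could plausibly feed a reward term for formation maintenance, but the three reward functions mentioned in the abstract are not specified there.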


Mapless Collaborative Navigation for a Multi-Robot System Based on the Deep Reinforcement Learning

Applied Sciences ◽

10.3390/app9204198 ◽

2019 ◽

Vol 9 (20) ◽

pp. 4198

Author(s):

Wenzhou Chen ◽

Shizheng Zhou ◽

Zaisheng Pan ◽

Huixian Zheng ◽

Yong Liu

Keyword(s):

Reinforcement Learning ◽

Learning Algorithm ◽

Gradient Algorithm ◽

Lidar Data ◽

Robot System ◽

Navigation Task ◽

System A ◽

Group Navigation ◽

Policy Gradient ◽

Multi Robot

Compared with a single-robot system, a multi-robot system offers higher efficiency and fault tolerance, and it has great potential in application scenarios such as robot search, rescue, and escort tasks. Deep reinforcement learning provides a potential framework for multi-robot formation and collaborative navigation. This paper studies the collaborative formation and navigation of multiple robots using deep reinforcement learning. The proposed method improves the classical Deep Deterministic Policy Gradient (DDPG) algorithm to address the single-robot mapless navigation task. We also extend the single-robot DDPG algorithm to the multi-robot system, obtaining the Parallel Deep Deterministic Policy Gradient (PDDPG). Using a 2D lidar sensor, the group of robots can accomplish the formation construction task and the collaborative formation navigation task. Experimental results on the Gazebo simulation platform illustrate that our method can guide mobile robots to construct a formation and keep it during group navigation, directly from raw lidar data inputs.
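For readers unfamiliar with the baseline, the sketch below shows the soft target-network update used in classical DDPG, where target parameters slowly track the online parameters. It illustrates standard DDPG only; the abstract does not describe how the improved single-robot variant or PDDPG modifies it. The flat-list parameter representation and the default `tau` are assumptions made for illustration.

```python
def soft_update(target_params, online_params, tau=0.001):
    """Return target parameters moved a fraction `tau` toward the online ones.

    Implements theta_target <- tau * theta_online + (1 - tau) * theta_target,
    element by element, on plain lists of floats (an assumed representation).
    """
    return [(1.0 - tau) * t + tau * o
            for t, o in zip(target_params, online_params)]

# Example: with tau = 0.1 the target moves 10% of the way to the online value.
updated = soft_update([0.0, 1.0], [1.0, 1.0], tau=0.1)
```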


Reinforcement Learning Based Multi-robot Formation Control Under Separation Bearing Orientation Scheme

2020 Chinese Automation Congress (CAC) ◽

10.1109/cac51589.2020.9327315 ◽

2020 ◽

Author(s):

Zichen He ◽

Lu Dong ◽

Changyin Sun ◽

Jiawei Wang

Keyword(s):

Reinforcement Learning ◽

Formation Control ◽

Multi Robot


Research Progress on Synergistic Technologies of Agricultural Multi-Robots

Applied Sciences ◽

10.3390/app11041448 ◽

2021 ◽

Vol 11 (4) ◽

pp. 1448

Author(s):

Wenju Mao ◽

Zhijie Liu ◽

Heng Liu ◽

Fuzeng Yang ◽

Meirong Wang

Keyword(s):

Path Planning ◽

Formation Control ◽

Research Progress ◽

Hybrid Architecture ◽

Labor Costs ◽

Robot System ◽

System Architectures ◽

Research Results ◽

Robot Systems ◽

Multi Robot

Multi-robot systems have shown good application prospects in agricultural production. Studying the synergistic technologies of agricultural multi-robots can not only improve the efficiency of the overall robot system and meet the needs of precision farming but also address the decreasing effective labor supply and increasing labor costs in agriculture. Starting from agricultural multi-robot system architectures, this paper reviews representative recent research results on five synergistic technologies of agricultural multi-robots, namely environment perception, task allocation, path planning, formation control, and communication, and summarizes the technological progress and development characteristics of these five technologies. Based on these characteristics, the trends and research focus for agricultural multi-robots are to optimize the existing technologies and apply them to a variety of agricultural multi-robots, for example by building hybrid architectures of multi-robot systems, SLAM (simultaneous localization and mapping), cooperative learning of robots, hybrid path planning, and formation reconstruction. Although synergistic technologies of agricultural multi-robots are extremely challenging in production, previous research results for real agricultural multi-robots and the demands of social development lead us to conclude that it is realistic to expect automated multi-robot systems in the future.


Analysis and solution of a predator–protector–prey multi-robot system by a high-level reinforcement learning architecture and the adaptive systems theory

Robotics and Autonomous Systems ◽

10.1016/j.robot.2010.08.005 ◽

2010 ◽

Vol 58 (12) ◽

pp. 1266-1272 ◽

Cited By ~ 3

Author(s):

José Antonio Martín H. ◽

Javier de Lope ◽

Darío Maravall

Keyword(s):

Reinforcement Learning ◽

Systems Theory ◽

Adaptive Systems ◽

Robot System ◽

High Level ◽

Multi Robot


Improving the Robustness of Reinforcement Learning for a Multi-Robot System Environment

Advances in Soft Computing - Soft Computing as Transdisciplinary Science and Technology ◽

10.1007/3-540-32391-0_34 ◽

2007 ◽

pp. 263-272 ◽

Cited By ~ 1

Author(s):

Toshiyuki Yasuda ◽

Kazuhiro Ohkura

Keyword(s):

Reinforcement Learning ◽

Robot System ◽

System Environment ◽

Multi Robot


Position adaptive formation control for multi-robot system using a redundant adaptive robust Kalman filter

2014 IEEE International Conference on Robotics and Biomimetics (ROBIO 2014) ◽

10.1109/robio.2014.7090445 ◽

2014 ◽

Cited By ~ 1

Author(s):

Xiancui Wei ◽

Zhiguo Shi

Keyword(s):

Kalman Filter ◽

Formation Control ◽

Robot System ◽

Robust Kalman Filter ◽

Multi Robot


Dynamic model based formation control and obstacle avoidance of multi-robot systems

Robotica ◽

10.1017/s0263574707004092 ◽

2008 ◽

Vol 26 (3) ◽

pp. 345-356 ◽

Cited By ~ 58

Author(s):

Celso De La Cruz ◽

Ricardo Carelli

Keyword(s):

Dynamic Model ◽

Obstacle Avoidance ◽

Formation Control ◽

Inverse Dynamics ◽

Single Equation ◽

Model Parameters ◽

Robot System ◽

System A ◽

Robot Systems ◽

Multi Robot

Summary: This work first presents a complete dynamic model of a unicycle-like mobile robot that takes part in a multi-robot formation. A linear parameterization of this model is performed in order to identify the model parameters. The robot model is then input-output feedback linearized. In a second stage, a model of the multi-robot system is obtained by arranging all the feedback-linearized robot models into a single equation. This multi-robot model is expressed in terms of formation states by applying a coordinate transformation. The inverse dynamics technique is then applied to design a formation controller. The controller can be applied both to positioning and to tracking desired robot formations. The formation control can be centralized or decentralized and is scalable to any number of robots. A strategy for rigid formation obstacle avoidance is also proposed. Experimental results validate the control system design.
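To illustrate what input-output feedback linearization means for a unicycle-like robot, the sketch below uses the simpler kinematic model rather than the paper's full dynamic model, which the summary does not reproduce. The output is a point at an assumed offset `a` ahead of the wheel axle; choosing the inputs as shown makes that point's velocity equal to a commanded velocity `(ux, uy)`. All names and the default offset are hypothetical.

```python
import math

def linearizing_inputs(theta, ux, uy, a=0.2):
    """Linear and angular velocity that give the offset point velocity (ux, uy).

    Kinematic unicycle only (an assumption): the point at distance `a` ahead
    of the axle moves with
        xp_dot = v*cos(theta) - a*omega*sin(theta)
        yp_dot = v*sin(theta) + a*omega*cos(theta)
    and this function inverts that relation. Requires a != 0.
    """
    v = ux * math.cos(theta) + uy * math.sin(theta)
    omega = (-ux * math.sin(theta) + uy * math.cos(theta)) / a
    return v, omega

def point_velocity(theta, v, omega, a=0.2):
    """Forward map: velocity of the offset point for given robot inputs."""
    return (v * math.cos(theta) - a * omega * math.sin(theta),
            v * math.sin(theta) + a * omega * math.cos(theta))

# Round trip: the commanded point velocity is recovered exactly (up to rounding).
v, omega = linearizing_inputs(0.7, 0.3, -0.5)
vx, vy = point_velocity(0.7, v, omega)
```

With the nonlinearity cancelled in this way, each robot behaves like a simple linear system in the point coordinates, which is what allows several robot models to be stacked into one equation as the summary describes.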
