Reinforcement Learning Based Evolution and Learning Algorithm for Cooperative Behavior of Swarm Robot System

Swarm robotic systems (SRSs) are multi-robot systems in which robots operate without any form of centralized control. SRSs are typically designed with a behavior-based approach, in which the designer hand-crafts the behavior of individual robots in advance to obtain the desired collective behavior. In contrast, an automatic design approach applies a general methodology rather than hand-crafting each behavior. This paper presents a deep reinforcement learning approach to collective behavior acquisition in SRSs. The swarm robots are expected to collect information in parallel and share their experience to accelerate learning. We conducted experiments with real swarm robots and evaluated the learning performance of the swarm in a scenario in which the robots repeatedly traveled between two landmarks.
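The idea of robots collecting information in parallel and sharing experience can be illustrated with a shared replay buffer. The following is a minimal sketch, not the paper's implementation: the class `SharedReplayBuffer`, the function `collect_step`, and the toy random environment are all hypothetical names and assumptions introduced here for illustration.

```python
import random
from collections import deque


class SharedReplayBuffer:
    """Experience pool that every robot writes to and one learner samples from."""

    def __init__(self, capacity):
        # Oldest transitions are discarded once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def add(self, robot_id, state, action, reward, next_state):
        self.buffer.append((robot_id, state, action, reward, next_state))

    def sample(self, batch_size, rng=random):
        # Draw a minibatch without replacement, mixing all robots' experience.
        k = min(batch_size, len(self.buffer))
        return rng.sample(list(self.buffer), k)

    def __len__(self):
        return len(self.buffer)


def collect_step(buffer, robot_ids, rng):
    """Each robot takes one step in a toy environment (an assumption, not a
    real robot interface) and stores its transition in the shared buffer."""
    for rid in robot_ids:
        state = rng.random()
        action = rng.choice([0, 1])
        reward = 1.0 if action == 1 else 0.0
        next_state = rng.random()
        buffer.add(rid, state, action, reward, next_state)


rng = random.Random(0)
buffer = SharedReplayBuffer(capacity=1000)
for _ in range(10):
    collect_step(buffer, robot_ids=[0, 1, 2], rng=rng)

# A single learner would update one shared policy from this mixed minibatch.
batch = buffer.sample(8, rng)
```

With three robots, the buffer fills three times faster than with one, which is one simple way in which parallel collection can speed up learning.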

Cooperative behavior acquisition mechanism for a multi-robot system based on reinforcement learning in continuous space

Proceedings 2003 IEEE International Symposium on Computational Intelligence in Robotics and Automation, 2004

DOI: 10.1109/cira.2003.1222226

Author(s): T. Yasuda, K. Ohkura, T. Taura

Keyword(s): Reinforcement Learning, Cooperative Behavior, Robot System, Continuous Space, Multi Robot

Behavior Learning and Evolution of Individual Robot for Cooperative Behavior of Swarm Robot System

Journal of Korean Institute of Intelligent Systems, 2006, Vol 16 (2), pp. 131-137

DOI: 10.5391/jkiis.2006.16.2.131

Cited By ~ 5

Author(s): Kwee-Bo Sim, Dong-Wook Lee

Keyword(s): Cooperative Behavior, Robot System, Behavior Learning, Swarm Robot

Behavior learning and evolution of swarm robot system for cooperative behavior

2009 IEEE/ASME International Conference on Advanced Intelligent Mechatronics, 2009

DOI: 10.1109/aim.2009.5229933

Cited By ~ 1

Author(s): Seo Sang-Wook, Yang Hyun-Chang, Sim Kwee-Bo

Keyword(s): Cooperative Behavior, Robot System, Behavior Learning, Swarm Robot

Adaptive reinforcement Q-Learning algorithm for swarm-robot system using pheromone mechanism

2013 IEEE International Conference on Robotics and Biomimetics (ROBIO), 2013

DOI: 10.1109/robio.2013.6739586

Cited By ~ 2

Author(s): Zhiguo Shi, Jun Tu, Yuankai Li, Zeying Wang

Keyword(s): Learning Algorithm, Robot System, Q Learning, Swarm Robot

Cooperative Behavior Acquisition for a Multi-Robot System by Reinforcement Learning in Continuous Space

The Proceedings of JSME annual Conference on Robotics and Mechatronics (Robomec), 2003, Vol 2003 (0), pp. 108

DOI: 10.1299/jsmermd.2003.108_1

Author(s): T. Yasuda, K. Ohkura, K. Ueda, T. Taura

Keyword(s): Reinforcement Learning, Cooperative Behavior, Robot System, Continuous Space, Multi Robot

Mapless Collaborative Navigation for a Multi-Robot System Based on the Deep Reinforcement Learning

Applied Sciences, 2019, Vol 9 (20), pp. 4198

DOI: 10.3390/app9204198

Author(s): Wenzhou Chen, Shizheng Zhou, Zaisheng Pan, Huixian Zheng, Yong Liu

Keyword(s): Reinforcement Learning, Learning Algorithm, Gradient Algorithm, Lidar Data, Robot System, Navigation Task, System A, Group Navigation, Policy Gradient, Multi Robot

Compared with a single-robot system, a multi-robot system offers higher efficiency and fault tolerance. Multi-robot systems have great potential in application scenarios such as search, rescue, and escort tasks. Deep reinforcement learning provides a potential framework for multi-robot formation and collaborative navigation. This paper studies collaborative formation and navigation of multiple robots using deep reinforcement learning. The proposed method improves the classical Deep Deterministic Policy Gradient (DDPG) algorithm to address the single-robot mapless navigation task. We also extend the single-robot DDPG algorithm to the multi-robot system, obtaining the Parallel Deep Deterministic Policy Gradient (PDDPG) algorithm. Using 2D lidar sensors, the group of robots can accomplish the formation construction task and the collaborative formation navigation task. Experimental results on the Gazebo simulation platform show that our method can guide mobile robots to construct a formation and maintain it during group navigation, directly from raw lidar data.
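For reference, the classical DDPG update that the abstract takes as its starting point can be written as below. This is the standard formulation only, not the paper's improved or parallel (PDDPG) variant, which the abstract does not specify. The notation is an assumption chosen here: critic $Q$ with parameters $\theta^{Q}$, actor $\mu$ with parameters $\theta^{\mu}$, primed symbols for target networks, discount factor $\gamma$, soft-update rate $\tau$, and minibatch size $N$.

```latex
\begin{align}
% TD target computed with the target actor and target critic
y_i &= r_i + \gamma\, Q'\!\left(s_{i+1},\, \mu'(s_{i+1} \mid \theta^{\mu'}) \mid \theta^{Q'}\right) \\
% Critic loss: mean squared TD error over the minibatch
L(\theta^{Q}) &= \frac{1}{N} \sum_{i} \left(y_i - Q(s_i, a_i \mid \theta^{Q})\right)^2 \\
% Actor update: deterministic policy gradient
\nabla_{\theta^{\mu}} J &\approx \frac{1}{N} \sum_{i}
  \nabla_{a} Q(s, a \mid \theta^{Q})\Big|_{s=s_i,\, a=\mu(s_i)}\,
  \nabla_{\theta^{\mu}} \mu(s \mid \theta^{\mu})\Big|_{s=s_i} \\
% Soft updates of the target networks
\theta^{Q'} &\leftarrow \tau\,\theta^{Q} + (1-\tau)\,\theta^{Q'}, \qquad
\theta^{\mu'} \leftarrow \tau\,\theta^{\mu} + (1-\tau)\,\theta^{\mu'}
\end{align}
```

In a mapless navigation setting such as the one described, the state $s_i$ would presumably include the raw lidar readings, though the exact state and action definitions are not given in the abstract.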
