Instance-Based Reinforcement Learning Technique with a Meta-learning Mechanism for Robust Multi-Robot Systems

We have been developing a reinforcement learning technique called BRL as an approach to autonomous specialization, a new concept in cooperative multi-robot systems. BRL has a mechanism for autonomously segmenting the continuous state and action space. However, as with other machine learning approaches, overfitting is occasionally observed after successful learning. This paper proposes a technique that makes more sophisticated use of the messy knowledge acquired by BRL. The proposed technique is expected to be more robust against environmental changes. We evaluate it in computer simulations of a cooperative carrying task.

Distributed multi-robot collision avoidance via deep reinforcement learning for navigation in complex scenarios

The International Journal of Robotics Research ◽

10.1177/0278364920916531 ◽

2020 ◽

Vol 39 (7) ◽

pp. 856-892 ◽

Cited By ~ 4

Author(s):

Tingxiang Fan ◽

Pinxin Long ◽

Wenxi Liu ◽

Jia Pan

Keyword(s):

Reinforcement Learning ◽

Collision Avoidance ◽

Autonomous Navigation ◽

Large Scale ◽

Learning Algorithm ◽

Free Action ◽

Parameter Tuning ◽

Movement Velocity ◽

Robot Systems ◽

Multi Robot

Developing a safe and efficient collision-avoidance policy for multiple robots is challenging in decentralized scenarios, where each robot generates its paths with limited observation of other robots’ states and intentions. Prior distributed multi-robot collision-avoidance systems often require frequent inter-robot communication or agent-level features to plan a local collision-free action, which is not robust and is computationally prohibitive. In addition, the performance of these methods is not comparable with that of their centralized counterparts in practice. In this article, we present a decentralized sensor-level collision-avoidance policy for multi-robot systems, which shows promising results in practical applications. In particular, our policy directly maps raw sensor measurements to an agent’s steering commands in terms of the movement velocity. As a first step toward reducing the performance gap between decentralized and centralized methods, we present a multi-scenario multi-stage training framework to learn an optimal policy. The policy is trained over a large number of robots in rich, complex environments simultaneously using a policy-gradient-based reinforcement-learning algorithm. The learning algorithm is also integrated into a hybrid control framework to further improve the policy’s robustness and effectiveness. We validate the learned sensor-level collision-avoidance policy in a variety of simulated and real-world scenarios with thorough performance evaluations for large-scale multi-robot systems. The generalization of the learned policy is verified in a set of unseen scenarios, including the navigation of a group of heterogeneous robots and a large-scale scenario with 100 robots. Although the policy is trained using simulation data only, we have successfully deployed it on physical robots whose shapes and dynamics differ from those of the simulated agents, demonstrating the controller’s robustness against simulation-to-real modeling error. Finally, we show that the collision-avoidance policy learned from multi-robot navigation tasks provides an excellent solution for safe and effective autonomous navigation for a single robot working in a dense real human crowd. Our learned policy enables a robot to make effective progress in a crowd without getting stuck. More importantly, the policy has been successfully deployed on different types of physical robot platforms without tedious parameter tuning. Videos are available at https://sites.google.com/view/hybridmrca .
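The abstract describes a policy that maps raw sensor measurements to velocity commands and is trained with a policy-gradient method. As a rough illustration only, the sketch below uses a linear Gaussian policy with a plain REINFORCE update. The class name `GaussianVelocityPolicy`, the linear model, the fixed `sigma`, and the learning rate are all assumptions made for this example; the abstract does not specify the authors' network, algorithm variant, or training framework.

```python
import random


class GaussianVelocityPolicy:
    """Hypothetical linear-Gaussian policy: sensor vector -> (linear, angular) velocity."""

    def __init__(self, n_inputs, sigma=0.2, seed=0):
        self.rng = random.Random(seed)
        self.sigma = sigma  # fixed exploration noise (assumed, not from the paper)
        self.weights = [[0.0] * n_inputs for _ in range(2)]
        self.bias = [0.0, 0.0]

    def mean(self, obs):
        # Mean velocity command is a linear function of the raw observation.
        return [
            sum(w * x for w, x in zip(row, obs)) + b
            for row, b in zip(self.weights, self.bias)
        ]

    def act(self, obs):
        # Sample a velocity command around the mean.
        return [self.rng.gauss(m, self.sigma) for m in self.mean(obs)]

    def update(self, obs, action, advantage, lr=0.01):
        # REINFORCE step: for a Gaussian with fixed sigma, the gradient of
        # log pi(action | obs) with respect to the mean is (action - mean) / sigma^2.
        mu = self.mean(obs)
        for k in range(2):
            g = advantage * (action[k] - mu[k]) / self.sigma ** 2
            for i, x in enumerate(obs):
                self.weights[k][i] += lr * g * x
            self.bias[k] += lr * g


policy = GaussianVelocityPolicy(n_inputs=3)
obs = [1.0, 0.5, 0.0]      # stand-in for a (tiny) range-sensor reading
command = policy.act(obs)  # [linear velocity, angular velocity]
```

A positive advantage moves the mean command toward the sampled action and a negative one moves it away; a practical system would replace the linear map with a neural network and estimate the advantage from collected rollouts.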

A Reinforcement Learning Technique with an Adaptive Action Generator for a Multi-Robot System

Multi-Robot Systems, Trends and Development ◽

10.5772/13337 ◽

2011 ◽

Author(s):

Kazuhiro Ohkura ◽

Toshiyuki Yasu

Keyword(s):

Reinforcement Learning ◽

Robot System ◽

Learning Technique ◽

Adaptive Action ◽

Multi Robot

A reinforcement learning trained fuzzy neural network controller for maintaining wireless communication connections in multi-robot systems

10.1117/12.2053991 ◽

2014 ◽

Author(s):

Xu Zhong ◽

Yu Zhou

Keyword(s):

Neural Network ◽

Reinforcement Learning ◽

Wireless Communication ◽

Fuzzy Neural Network ◽

Neural Network Controller ◽

Network Controller ◽

Robot Systems ◽

Fuzzy Neural ◽

Multi Robot

A Reinforcement Learning Technique with an Adaptive Action Generator for a Multi-robot System

Lecture Notes in Computer Science - From Animals to Animats 10 ◽

10.1007/978-3-540-69134-1_25 ◽

2008 ◽

pp. 250-259 ◽

Cited By ~ 5

Author(s):

Toshiyuki Yasuda ◽

Kazuhiro Ohkura

Keyword(s):

Reinforcement Learning ◽

Robot System ◽

Learning Technique ◽

Adaptive Action ◽

Multi Robot

A deep reinforcement learning approach to preserve connectivity for multi-robot systems

2017 10th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI) ◽

10.1109/cisp-bmei.2017.8302157 ◽

2017 ◽

Cited By ~ 2

Author(s):

Wanrong Huang ◽

Yanzhen Wang ◽

Xiaodong Yi

Keyword(s):

Reinforcement Learning ◽

Learning Approach ◽

Robot Systems ◽

Multi Robot

An Analysis of Rule Deletion Scheme in XCS on Reinforcement Learning Problem

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2017.p0876 ◽

2017 ◽

Vol 21 (5) ◽

pp. 876-884

Author(s):

Masaya Nakata ◽

Tomoki Hamagami

Keyword(s):

Reinforcement Learning ◽

Learning Problem ◽

Rule Based ◽

Learning Mechanism ◽

Q Learning ◽

State Action ◽

Classifier System ◽

Specific Subset ◽

Learning Technique

The XCS classifier system is an evolutionary rule-based learning technique powered by a Q-learning-like learning mechanism. It employs a global deletion scheme that selects rules for deletion from the whole rule set, which covers all state-action pairs. However, the optimality of this scheme remains unclear owing to the lack of intensive analysis. We here introduce two deletion schemes: 1) local deletion, which is applied to the subset of rules covering each state (a match set), and 2) stronger local deletion, which is applied to the more specific subset covering each state-action pair (an action set). The aim of this paper is to reveal how these three deletion schemes affect the performance of XCS. Our analysis shows that the local deletion schemes promote the elimination of inaccurate rules compared with the global deletion scheme. However, the stronger local deletion scheme occasionally deletes a good rule. We further show that the two local deletion schemes greatly improve the performance of XCS on a set of noisy maze problems. Although the localization strength of the proposed deletion schemes may require consideration, they can be more suitable for XCS than the original global deletion scheme.
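The three deletion schemes differ only in which pool of rules a deletion candidate is drawn from. The sketch below illustrates that difference under simplifying assumptions: rules use ternary conditions, and the victim is chosen by roulette-wheel selection weighted by an `error` field. That weighting is a simplified stand-in, not the deletion vote used by XCS itself, and the names `Rule`, `deletion_candidates`, and `delete_one` are hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass(eq=False)  # identity-based equality, so duplicate rules stay distinct
class Rule:
    condition: str  # ternary string over {'0', '1', '#'}; '#' matches either bit
    action: int
    error: float    # illustrative inaccuracy score; higher means less accurate


def matches(rule, state):
    return all(c == "#" or c == s for c, s in zip(rule.condition, state))


def deletion_candidates(population, state, action, scheme):
    """Return the pool of rules that a deletion may be drawn from."""
    if scheme == "global":
        return list(population)  # whole population
    match_set = [r for r in population if matches(r, state)]
    if scheme == "local":
        return match_set         # rules covering the current state
    if scheme == "stronger_local":
        # rules covering the current state-action pair (the action set)
        return [r for r in match_set if r.action == action]
    raise ValueError(f"unknown scheme: {scheme}")


def delete_one(population, state, action, scheme, rng):
    """Remove one rule from the population, sampled from the scheme's pool."""
    pool = deletion_candidates(population, state, action, scheme)
    if not pool:
        return None
    # Assumed weighting: less accurate rules are more likely to be deleted.
    weights = [r.error + 1e-9 for r in pool]
    victim = rng.choices(pool, weights=weights, k=1)[0]
    population.remove(victim)
    return victim


population = [
    Rule("0#", 0, 0.1),
    Rule("0#", 1, 0.5),
    Rule("1#", 0, 0.9),
    Rule("##", 1, 0.2),
]
removed = delete_one(population, "01", 1, "stronger_local", random.Random(0))
```

For state `"01"` and action `1`, the global pool holds all four rules, the match set holds three, and the action set holds two, which shows how each scheme narrows the set of rules exposed to deletion.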

Multi-robot systems with agent-based reinforcement learning: evolution, opportunities and challenges

International Journal of Modelling Identification and Control ◽

10.1504/ijmic.2009.024735 ◽

2009 ◽

Vol 6 (4) ◽

pp. 271 ◽

Cited By ~ 4

Author(s):

Erfu Yang ◽

Dongbing Gu

Keyword(s):

Reinforcement Learning ◽

Agent Based ◽

Robot Systems ◽

Multi Robot
