Sequencing of multi-robot behaviors using reinforcement learning

Control Theory and Technology ◽

10.1007/s11768-021-00069-5 ◽

2021 ◽

Author(s):

Pietro Pierpaoli ◽

Thinh T. Doan ◽

Justin Romberg ◽

Magnus Egerstedt

Keyword(s):

Reinforcement Learning ◽

Task Performance ◽

Gradient Descent ◽

Object Manipulation ◽

Individual Behavior ◽

Performance Level ◽

Sequencing Problem ◽

Differential Drive ◽

The Individual ◽

Multi Robot

Abstract: Given a collection of parameterized multi-robot controllers associated with individual behaviors designed for particular tasks, this paper considers the problem of how to sequence and instantiate the behaviors for the purpose of completing a more complex, overarching mission. In addition, uncertainties about the environment or even the mission specifications may require the robots to learn, in a cooperative manner, how best to sequence the behaviors. In this paper, we approach this problem by using reinforcement learning to approximate the solution to the computationally intractable sequencing problem, combined with an online gradient descent approach to selecting the individual behavior parameters, while the transitions among behaviors are triggered automatically when the behaviors have reached a desired performance level relative to a task performance cost. To illustrate the effectiveness of the proposed method, it is implemented on a team of differential-drive robots for solving two different missions, namely, convoy protection and object manipulation.
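
The abstract describes two mechanisms that a toy example can illustrate: gradient descent on a behavior's parameter, and an automatic hand-over to the next behavior once a task cost falls below a desired level. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the behavior names, the scalar parameter `theta`, the quadratic costs, and the fixed behavior order are all hypothetical, and the fixed order stands in for the sequencing that the paper learns with reinforcement learning.

```python
from typing import Callable, List, Tuple

# Hypothetical behavior: (name, task cost J(theta), gradient dJ/dtheta).
Behavior = Tuple[str, Callable[[float], float], Callable[[float], float]]


def run_sequence(
    behaviors: List[Behavior],
    theta0: float = 0.0,
    step: float = 0.2,
    threshold: float = 1e-3,
    max_iters: int = 1000,
) -> List[Tuple[str, float, int]]:
    """Run behaviors in the given (assumed, fixed) order.

    Each behavior's parameter is tuned by plain gradient descent until its
    cost drops to `threshold` or below, which triggers the transition to the
    next behavior. Returns (name, final theta, iterations) per behavior.
    """
    log = []
    for name, cost, grad in behaviors:
        theta = theta0  # each behavior starts from its own initial parameter
        iters = 0
        while cost(theta) > threshold and iters < max_iters:
            theta -= step * grad(theta)  # gradient descent step on the task cost
            iters += 1
        log.append((name, theta, iters))
    return log


# Toy quadratic costs (assumptions for illustration only).
behaviors = [
    ("go_to_target", lambda t: (t - 2.0) ** 2, lambda t: 2.0 * (t - 2.0)),
    ("hold_formation", lambda t: (t + 1.0) ** 2, lambda t: 2.0 * (t + 1.0)),
]
log = run_sequence(behaviors)
for name, theta, iters in log:
    print(name, round(theta, 3), iters)
```

In this sketch the transition rule is purely cost-driven, so replacing the fixed list with a learned choice of the next behavior would leave the inner loop unchanged.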

Distributed Reinforcement Learning for Multi-robot Decentralized Collective Construction

Distributed Autonomous Robotic Systems - Springer Proceedings in Advanced Robotics ◽

10.1007/978-3-030-05816-6_3 ◽

2019 ◽

pp. 35-49 ◽

Cited By ~ 10

Author(s):

Guillaume Sartoretti ◽

Yue Wu ◽

William Paivine ◽

T. K. Satish Kumar ◽

Sven Koenig ◽

...

Keyword(s):

Reinforcement Learning ◽

Collective Construction ◽

Distributed Reinforcement ◽

Multi Robot

Multi-Robot Collision Avoidance with Map-based Deep Reinforcement Learning

2020 IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI) ◽

10.1109/ictai50040.2020.00088 ◽

2020 ◽

Author(s):

Shunyi Yao ◽

Guangda Chen ◽

Lifan Pan ◽

Jun Ma ◽

Jianmin Ji ◽

...

Keyword(s):

Reinforcement Learning ◽

Collision Avoidance ◽

Multi Robot

Reinforcement-Learning-Based Asynchronous Formation Control Scheme for Multiple Unmanned Surface Vehicles

Applied Sciences ◽

10.3390/app11020546 ◽

2021 ◽

Vol 11 (2) ◽

pp. 546

Author(s):

Jiajia Xie ◽

Rui Zhou ◽

Yuan Liu ◽

Jun Luo ◽

Shaorong Xie ◽

...

Keyword(s):

Reinforcement Learning ◽

Formation Control ◽

Rapid Development ◽

Gradient Algorithm ◽

Robot System ◽

Physical Relationship ◽

Unmanned Surface Vehicles ◽

Main Challenge ◽

Control Scheme ◽

Multi Robot

The high performance and efficiency of multiple unmanned surface vehicles (multi-USV) promote further civilian and military applications of coordinated USVs. Because it underpins cooperative work among multiple USVs, decentralized formation control of USV swarms has received considerable attention. Formation control of multiple USVs is a geometric problem of multi-robot systems; the main challenge is how to generate and maintain the formation. The rapid development of reinforcement learning offers a new way to address these problems. In this paper, we introduce a decentralized structure for the multi-USV system and employ reinforcement learning for formation control in a leader–follower topology, proposing an asynchronous decentralized formation control scheme based on reinforcement learning for multiple USVs. First, a simplified USV model is established, together with a formation shape model that provides formation parameters and describes the physical relationship between USVs. Second, the advantage deep deterministic policy gradient algorithm (ADDPG) is proposed. Third, formation generation policies and formation maintenance policies based on the ADDPG are proposed to form and maintain the given geometric structure of the USV team during movement. Moreover, three new reward functions are designed and used to promote policy learning. Finally, various experiments are conducted to validate the performance of the proposed formation control scheme. Simulation results and comparative experiments demonstrate the efficiency and stability of the formation control scheme.
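
To make the leader–follower formation idea concrete, here is a small sketch of how a formation shape can be expressed as offsets in the leader's frame and turned into follower target positions. It is an illustration under assumptions, not the paper's method: the function names, the offsets, and the gain are hypothetical, and a hand-written proportional update stands in for the learned ADDPG policies.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def desired_positions(leader: Point, heading: float, offsets: List[Point]) -> List[Point]:
    """Rotate leader-frame offsets by the leader heading, then translate
    by the leader position to get world-frame follower targets."""
    c, s = math.cos(heading), math.sin(heading)
    return [
        (leader[0] + c * dx - s * dy, leader[1] + s * dx + c * dy)
        for dx, dy in offsets
    ]


def step_followers(followers: List[Point], targets: List[Point], gain: float = 0.5) -> List[Point]:
    """Move each follower a fraction `gain` of the way toward its target
    (a simple proportional rule used here in place of a learned policy)."""
    return [
        (x + gain * (tx - x), y + gain * (ty - y))
        for (x, y), (tx, ty) in zip(followers, targets)
    ]


# Hypothetical V-shaped formation: two followers behind and beside the leader.
offsets = [(-1.0, 1.0), (-1.0, -1.0)]
followers = [(0.0, 0.0), (0.0, 0.0)]
leader, heading = (5.0, 0.0), 0.0

targets = desired_positions(leader, heading, offsets)
for _ in range(20):
    followers = step_followers(followers, targets)
print(targets, followers)
```

Because each follower only needs the leader's pose and its own offset, the update can run independently on every vehicle, which is the sense in which such a scheme is decentralized.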

EVA 2.0: Emotional and rational multimodal argumentation between virtual agents

it - Information Technology ◽

10.1515/itit-2020-0050 ◽

2021 ◽

Vol 0 (0) ◽

Author(s):

Niklas Rach ◽

Klaus Weber ◽

Yuchi Yang ◽

Stefan Ultes ◽

Elisabeth André ◽

...

Keyword(s):

Reinforcement Learning ◽

User Feedback ◽

Virtual Agents ◽

Multi Agent System ◽

Dialogue Game ◽

Multi Agent ◽

The Individual ◽

Minimal Bias ◽

Intuitive Interface ◽

Emotional Level

Abstract: Persuasive argumentation depends on multiple aspects, which include not only the content of the individual arguments, but also the way they are presented. The presentation of arguments is crucial, in particular in the context of dialogical argumentation. However, the effects of different discussion styles on the listener are hard to isolate in human dialogues. In order to demonstrate and investigate various styles of argumentation, we propose a multi-agent system in which different aspects of persuasion can be modelled and investigated separately. Our system utilizes argument structures extracted from text-based reviews for which a minimal bias of the user can be assumed. The persuasive dialogue is modelled as a dialogue game for argumentation, motivated by the objective of enabling both natural and flexible interactions between the agents. In order to support a comparison of factual against affective persuasion approaches, we implemented two fundamentally different strategies for both agents. The logical policy utilizes deep Reinforcement Learning in a multi-agent setup to optimize the strategy with respect to the game formalism and the available arguments. In contrast, the emotional policy selects the next move in compliance with an agent emotion that is adapted to user feedback in order to persuade on an emotional level. The resulting interaction is presented to the user via virtual avatars and can be rated through an intuitive interface.

Reinforcement Learning Based Multi-robot Formation Control Under Separation Bearing Orientation Scheme

2020 Chinese Automation Congress (CAC) ◽

10.1109/cac51589.2020.9327315 ◽

2020 ◽

Author(s):

Zichen He ◽

Lu Dong ◽

Changyin Sun ◽

Jiawei Wang

Keyword(s):

Reinforcement Learning ◽

Formation Control ◽

Multi Robot

Optimising Performance for NB-IoT UE Devices through Data Driven Models

Journal of Sensor and Actuator Networks ◽

10.3390/jsan10010021 ◽

2021 ◽

Vol 10 (1) ◽

pp. 21

Author(s):

Omar Nassef ◽

Toktam Mahmoodi ◽

Foivos Michelinakis ◽

Kashif Mahmood ◽

Ahmed Elmokashfi

Keyword(s):

Neural Network ◽

Reinforcement Learning ◽

Gradient Descent ◽

Deep Neural Network ◽

Narrow Band ◽

Learning Algorithm ◽

Base Station ◽

User Equipment ◽

Data Driven ◽

Superior Performance

This paper presents a data-driven framework for performance optimisation of Narrow-Band IoT user equipment. The proposed framework is an edge micro-service that suggests one-time configurations to user equipment communicating with a base station. Suggested configurations are delivered by a Configuration Advocate to improve energy consumption, delay, throughput, or a combination of those metrics, depending on the user-end device and the application. Reinforcement learning utilising gradient descent and a genetic algorithm is adopted synchronously with machine and deep learning algorithms to predict the environmental states and suggest an optimal configuration. The results highlight the adaptability of the Deep Neural Network in predicting intermediary environmental states. They also show that the genetic reinforcement learning algorithm achieves superior results in performance optimisation.
