DQN as an alternative to Market-based approaches for Multi-Robot processing Task Allocation (MRpTA)

2021 ◽  
Vol 3 (1) ◽  
pp. 69-98
Author(s):  
Paul Gautier ◽  
Johann Laurent

Multi-robot task allocation (MRTA) problems require that robots make complex choices based on their understanding of a dynamic and uncertain environment. As a distributed computing system, the Multi-Robot System (MRS) must handle and distribute processing tasks (MRpTA). Each robot must contribute to the overall efficiency of the system based solely on limited knowledge of its environment. Market-based methods are a natural candidate for distributing processing tasks across an MRS, but numerous recent developments in reinforcement learning, especially Deep Q-Networks (DQN), provide new opportunities to solve the problem. In this paper we propose a new DQN-based method so that robots can learn directly from experience, and compare it with Market-based approaches as well as with centralized and purely local solutions. Our study shows the relevance of learning-based methods and also highlights research challenges in solving the processing load-balancing problem in MRS.
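As a rough illustration of what a market-based baseline for processing load balancing can look like, the sketch below runs one sealed-bid auction per task: every robot bids its current load plus the task cost, and the lowest bid wins. This is not the method from the paper; the function names `auction_round` and `allocate`, the bid rule, and the alphabetical tie-break are all assumptions made for the example.

```python
from typing import Dict, List


def auction_round(task_cost: float, loads: Dict[str, float]) -> str:
    """Return the robot with the lowest bid for one task.

    Assumed bid rule: current processing load plus the cost of the new task.
    Ties go to the alphabetically first robot name.
    """
    bids = {robot: load + task_cost for robot, load in loads.items()}
    # sorted() fixes the iteration order, so min() breaks ties by name.
    return min(sorted(bids), key=bids.get)


def allocate(tasks: List[float], robots: List[str]) -> Dict[str, List[float]]:
    """Auction the tasks one at a time and record who wins each one."""
    loads = {robot: 0.0 for robot in robots}
    assignment: Dict[str, List[float]] = {robot: [] for robot in robots}
    for cost in tasks:
        winner = auction_round(cost, loads)
        loads[winner] += cost
        assignment[winner].append(cost)
    return assignment


if __name__ == "__main__":
    print(allocate([4.0, 3.0, 2.0, 1.0], ["r1", "r2"]))
```

A learned policy such as a DQN would replace the fixed bid rule with a decision learned from experience; the auction above only shows the kind of baseline such a policy could be compared against.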

2021 ◽  
Vol 11 (7) ◽  
pp. 2895
Author(s):  
Ahmed Elfakharany ◽  
Zool Hilmi Ismail

In this paper, we present a novel deep reinforcement learning (DRL) based method that performs multi-robot task allocation (MRTA) and navigation in an end-to-end fashion. The policy operates in a decentralized manner, mapping raw sensor measurements to the robot’s steering commands without the need to construct a map of the environment. We also present a new metric called the Task Allocation Index (TAI), which measures how well a method that performs MRTA and navigation end to end carries out the task allocation itself. The policy was trained in a simulated Gazebo environment using the centralized learning and decentralized execution paradigm, and was evaluated quantitatively and visually. The simulation results showed the effectiveness of the proposed method deployed on multiple TurtleBot3 robots.


2021 ◽  
Author(s):  
Ching-Wei Chuang ◽  
Harry H. Cheng

In the modern world, building an autonomous multi-robot system is essential to coordinate and control robots to help humans, because using several low-cost robots is more robust and efficient than using one expensive, powerful robot to execute tasks and achieve the overall goal of a mission. One research area, multi-robot task allocation (MRTA), has become important in multi-robot systems. Assigning suitable tasks to suitable robots is crucial in coordination and may directly influence the result of a mission. Although numerous researchers have proposed various algorithms and approaches over the past few decades to solve MRTA problems in different multi-robot systems, it is still difficult to overcome certain challenges, such as dynamic environments, changeable task information, miscellaneous robot abilities, the dynamic condition of a robot, or uncertainties from sensors or actuators. In this paper, we propose a novel approach to handle MRTA problems with Bayesian Networks (BNs) under these challenging circumstances. Our experiments show that the proposed approach may effectively solve real problems in a search-and-rescue mission in centralized, decentralized, and distributed multi-robot systems with real, low-cost robots in dynamic environments. In the future, we will demonstrate that our approach is trainable and can be utilized in a large-scale, complicated environment. Researchers might be able to apply our approach to other applications to explore its extensibility.


2021 ◽  
Vol 12 (1) ◽  
pp. 272
Author(s):  
Bumjin Park ◽  
Cheongwoong Kang ◽  
Jaesik Choi

This paper deals with multi-robot task allocation, that is, the assignment of multiple robots to tasks such that an objective function is maximized. The performance of existing meta-heuristic methods worsens as the number of robots or tasks increases. To tackle this problem, a novel Markov decision process formulation for multi-robot task allocation is presented for reinforcement learning. The proposed formulation sequentially allocates robots to tasks to minimize the total time taken to complete them. Additionally, we propose a deep reinforcement learning method to find the best allocation schedule for each problem. Our method adopts the cross-attention mechanism to compute the preference of robots for tasks. The experimental results show that the proposed method finds better solutions than meta-heuristic methods, especially when solving large-scale allocation problems.
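To make the idea of sequential allocation concrete, the sketch below treats each step as choosing one (robot, task) pair and uses a greedy earliest-finish rule as a stand-in for a learned policy. It is not the formulation from the paper: the names `greedy_schedule`, `workloads`, and `speeds`, the duration model (workload divided by robot speed, no travel time), and the reading of "total time" as the time at which the last robot finishes are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def greedy_schedule(
    workloads: Dict[str, float], speeds: Dict[str, float]
) -> Tuple[List[Tuple[str, str]], float]:
    """Allocate tasks one at a time and return (schedule, finish time).

    State: the remaining tasks and the time at which each robot is free.
    Action: one (robot, task) pair.
    Policy (assumed stand-in): pick the pair that finishes earliest.
    """
    free_at = {robot: 0.0 for robot in speeds}
    remaining = dict(workloads)
    schedule: List[Tuple[str, str]] = []
    while remaining:
        # Tuples compare by finish time first, then robot and task name,
        # so ties are broken deterministically.
        finish, robot, task = min(
            (free_at[r] + w / speeds[r], r, t)
            for r in sorted(speeds)
            for t, w in sorted(remaining.items())
        )
        free_at[robot] = finish
        del remaining[task]
        schedule.append((robot, task))
    return schedule, max(free_at.values())


if __name__ == "__main__":
    print(greedy_schedule({"a": 4.0, "b": 2.0}, {"r1": 2.0, "r2": 1.0}))
```

In a reinforcement learning setting, the `min(...)` selection would be replaced by a policy that scores every (robot, task) pair, for example with an attention mechanism, and samples or picks the best one.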


2021 ◽  
Vol 11 (2) ◽  
pp. 546
Author(s):  
Jiajia Xie ◽  
Rui Zhou ◽  
Yuan Liu ◽  
Jun Luo ◽  
Shaorong Xie ◽  
...  

The high performance and efficiency of multiple unmanned surface vehicles (multi-USV) promote further civilian and military applications of coordinated USVs. Because it underpins the cooperative work of multiple USVs, decentralized formation control of the USV swarm has received considerable attention. Formation control of multiple USVs belongs to the geometric problems of a multi-robot system. The main challenge is how to generate and maintain the formation of a multi-robot system. The rapid development of reinforcement learning provides a new way to deal with these problems. In this paper, we introduce a decentralized structure for the multi-USV system and employ reinforcement learning to deal with its formation control in a leader–follower topology. To this end, we propose an asynchronous decentralized formation control scheme based on reinforcement learning for multiple USVs. First, a simplified USV model is established. Simultaneously, the formation shape model is built to provide formation parameters and to describe the physical relationship between USVs. Second, the advantage deep deterministic policy gradient algorithm (ADDPG) is proposed. Third, formation generation policies and formation maintenance policies based on the ADDPG are proposed to form and maintain the given geometric structure of the team of USVs during movement. Moreover, three new reward functions are designed and utilized to promote policy learning. Finally, various experiments are conducted to validate the performance of the proposed formation control scheme. Simulation results and comparative experiments demonstrate the efficiency and stability of the formation control scheme.


2006 ◽  
Vol 13 (5) ◽  
pp. 548-551 ◽  
Author(s):  
Ping-an Gao ◽  
Zi-xing Cai
