Learning multi-agent communication with double attentional deep reinforcement learning

2020 ◽  
Vol 34 (1) ◽  
Author(s):  
Hangyu Mao ◽  
Zhengchao Zhang ◽  
Zhen Xiao ◽  
Zhibo Gong ◽  
Yan Ni


2020 ◽  
Vol 34 (05) ◽  
pp. 7160-7168
Author(s):  
Benjamin Freed ◽  
Guillaume Sartoretti ◽  
Jiaheng Hu ◽  
Howie Choset

This work focuses on multi-agent reinforcement learning (RL) with inter-agent communication, in which communication is differentiable and optimized through backpropagation. Such differentiable approaches tend to converge more quickly to higher-quality policies than techniques that treat communication as actions in a traditional RL framework. However, modern communication networks (e.g., Wi-Fi or Bluetooth) rely on discrete communication channels, to which existing differentiable approaches that assume real-valued messages cannot be directly applied, or for which they require biased gradient estimators. Some works overcome this problem by treating the message space as an extension of the action space and using standard RL to optimize message selection, but these methods tend to converge more slowly and to inferior policies. In this paper, we propose a stochastic message encoding/decoding procedure that makes a discrete communication channel mathematically equivalent to an analog channel with additive noise, through which gradients can be backpropagated. Additionally, we introduce an encryption step for use in noisy channels that forces the channel noise to be message-independent, allowing us to compute unbiased derivative estimates even in the presence of unknown channel noise. To the best of our knowledge, this work presents the first differentiable communication learning approach that can compute unbiased derivatives through channels with unknown noise. We demonstrate the effectiveness of our approach on two example multi-robot tasks: a path-finding problem and a collaborative search problem. There, we show that our approach achieves learning speed and performance similar to differentiable communication learning with real-valued messages (i.e., unlimited communication bandwidth), while naturally handling more realistic real-world communication constraints. Content Areas: Multi-Agent Communication, Reinforcement Learning.
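The channel construction can be illustrated with subtractive dithering, a standard trick from quantization theory. The sketch below is ours, not the authors' implementation: the function name dithered_channel, the unit quantization step, and the shared-seed dither are illustrative assumptions. It checks numerically that rounding a dithered message and subtracting the shared dither yields a reconstruction whose error is uniform and independent of the message, i.e., the discrete channel behaves like an analog channel with additive noise:

    import numpy as np

    rng = np.random.default_rng(0)

    def dithered_channel(m, u):
        # Send the real value m over an integer channel using subtractive
        # dithering; sender and receiver share the dither u (e.g., via a
        # common random seed).
        symbol = np.round(m + u)  # discrete symbol actually transmitted
        return symbol - u         # receiver's reconstruction

    # The reconstruction error is Uniform(-0.5, 0.5) regardless of m, so
    # gradients can flow through the channel as through an identity map
    # plus message-independent additive noise.
    for m in [0.0, 0.3, 7.77]:
        u = rng.uniform(-0.5, 0.5, size=100_000)
        err = dithered_channel(m, u) - m
        print(f"m={m:5.2f}  mean err={err.mean():+.4f}  "
              f"range=({err.min():.3f}, {err.max():.3f})")

Because the reconstruction equals the message plus message-independent noise, a loss downstream of the receiver can be differentiated with respect to the sender's output without a biased straight-through estimator, which is the property the paper's encoding/decoding and encryption steps are designed to preserve under unknown channel noise.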


2020 ◽  
Vol 34 (04) ◽  
pp. 5142-5149
Author(s):  
Hangyu Mao ◽  
Zhengchao Zhang ◽  
Zhen Xiao ◽  
Zhibo Gong ◽  
Yan Ni

Communication is crucial for large multi-agent systems to stay organized and productive. Recently, Deep Reinforcement Learning (DRL) has been applied to learn both the communication strategy and the control policy for multiple agents. However, the limited bandwidth available in practice for multi-agent communication has been largely ignored by existing DRL methods. Specifically, many methods keep sending messages incessantly, which consumes too much bandwidth; as a result, they are inapplicable to multi-agent systems with limited bandwidth. To handle this problem, we propose a gating mechanism that adaptively prunes less beneficial messages. We evaluate the gating mechanism on several tasks. Experiments demonstrate that it can prune a large fraction of messages with little impact on performance; in fact, performance may even improve when redundant messages are pruned. Moreover, the proposed gating mechanism is applicable to several previous methods, equipping them with the ability to address bandwidth-restricted settings.
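As a rough illustration of message pruning, the sketch below is ours rather than the paper's architecture: the linear gate, the sigmoid, and the fixed threshold are illustrative assumptions, and the raw observation stands in for a learned message encoding. A learned gate value per agent suppresses messages that fall below a cut-off, so only agents whose gate fires consume bandwidth:

    import numpy as np

    rng = np.random.default_rng(0)

    def gated_messages(obs, w, b, threshold=0.5):
        # One gate logit per agent; agents whose gate probability falls
        # below the threshold stay silent, saving bandwidth.
        gate_prob = 1.0 / (1.0 + np.exp(-(obs @ w + b)))  # sigmoid gate
        send_mask = gate_prob > threshold                 # hard send/skip
        return obs * send_mask[:, None], send_mask        # pruned messages

    # Toy usage: 4 agents with 8-dimensional observations.
    obs = rng.normal(size=(4, 8))
    w = rng.normal(size=8) * 0.5
    msgs, mask = gated_messages(obs, w, b=0.0)
    print("agents sending:", mask.astype(int), "->", int(mask.sum()), "of 4")

In the actual method the gate is trained jointly with the policy, so the send/skip decision itself is optimized; the fixed threshold here only demonstrates the bandwidth-saving step.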


Author(s):  
Hao Jiang ◽  
Dianxi Shi ◽  
Chao Xue ◽  
Yajie Wang ◽  
Gongju Wang ◽  
...  

Author(s):  
Xiaoyu Zhu ◽  
Yueyi Luo ◽  
Anfeng Liu ◽  
Md Zakirul Alam Bhuiyan ◽  
Shaobo Zhang

2021 ◽  
Vol 11 (11) ◽  
pp. 4948
Author(s):  
Lorenzo Canese ◽  
Gian Carlo Cardarilli ◽  
Luca Di Nunzio ◽  
Rocco Fazzolari ◽  
Daniele Giardino ◽  
...  

In this review, we present an analysis of the most widely used multi-agent reinforcement learning algorithms. Starting from single-agent reinforcement learning algorithms, we focus on the most critical issues that must be taken into account when extending them to multi-agent scenarios. The analyzed algorithms are grouped according to their features. We present a detailed taxonomy of the main multi-agent approaches proposed in the literature, focusing on their underlying mathematical models. For each algorithm, we describe its possible application fields and point out its pros and cons. The described multi-agent algorithms are compared in terms of the characteristics most important for multi-agent reinforcement learning applications: nonstationarity, scalability, and observability. We also describe the most common benchmark environments used to evaluate the performance of the considered methods.

