Attention-Based Fault-Tolerant Approach for Multi-Agent Reinforcement Learning Systems

The aim of multi-agent reinforcement learning systems is to provide interacting agents with the ability to collaboratively learn and adapt to the behavior of other agents. Typically, an agent receives its private observations providing a partial view of the true state of the environment. However, in realistic settings, the harsh environment might cause one or more agents to show arbitrarily faulty or malicious behavior, which may suffice to allow the current coordination mechanisms fail. In this paper, we study a practical scenario of multi-agent reinforcement learning systems considering the security issues in the presence of agents with arbitrarily faulty or malicious behavior. The previous state-of-the-art work that coped with extremely noisy environments was designed on the basis that the noise intensity in the environment was known in advance. However, when the noise intensity changes, the existing method has to adjust the configuration of the model to learn in new environments, which limits the practical applications. To overcome these difficulties, we present an Attention-based Fault-Tolerant (FT-Attn) model, which can select not only correct, but also relevant information for each agent at every time step in noisy environments. The multihead attention mechanism enables the agents to learn effective communication policies through experience concurrent with the action policies. Empirical results showed that FT-Attn beats previous state-of-the-art methods in some extremely noisy environments in both cooperative and competitive scenarios, much closer to the upper-bound performance. Furthermore, FT-Attn maintains a more general fault tolerance ability and does not rely on the prior knowledge about the noise intensity of the environment.

Download Full-text

Multi-agent Fault-tolerant Reinforcement Learning with Noisy Environments

2020 IEEE 26th International Conference on Parallel and Distributed Systems (ICPADS) ◽

10.1109/icpads51040.2020.00031 ◽

2020 ◽

Author(s):

Canhui Luo ◽

Xuan Liu ◽

Xinning Chen ◽

Juan Luo

Keyword(s):

Reinforcement Learning ◽

Fault Tolerant ◽

Noisy Environments ◽

Multi Agent

Download Full-text

Generalized and Sub-Optimal Bipartite Constraints for Conflict-Based Search

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6219 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7277-7284

Author(s):

Thayne T. Walker ◽

Nathan R. Sturtevant ◽

Ariel Felner

Keyword(s):

Bipartite Graph ◽

Continuous Time ◽

State Of The Art ◽

Main Idea ◽

Constraint Generation ◽

Empirical Results ◽

Generation Technique ◽

Unit Costs ◽

Previous State ◽

Multi Agent

The main idea of conflict-based search (CBS), a popular, state-of-the-art algorithm for multi-agent pathfinding is to resolve conflicts between agents by systematically adding constraints to agents. Recently, CBS has been adapted for new domains and variants, including non-unit costs and continuous time settings. These adaptations require new types of constraints. This paper introduces a new automatic constraint generation technique called bipartite reduction (BR). BR converts the constraint generation step of CBS to a surrogate bipartite graph problem. The properties of BR guarantee completeness and optimality for CBS. Also, BR's properties may be relaxed to obtain suboptimal solutions. Empirical results show that BR yields significant speedups in 2k connected grids over the previous state-of-the-art for both optimal and suboptimal search.

Download Full-text

Self-Attention ConvLSTM for Spatiotemporal Prediction

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6819 ◽

2020 ◽

Vol 34 (07) ◽

pp. 11531-11538

Author(s):

Zhihui Lin ◽

Maomao Li ◽

Zhuobin Zheng ◽

Yangyang Cheng ◽

Chun Yuan

Keyword(s):

Long Range ◽

State Of The Art ◽

The Self ◽

Time Step ◽

Traffic Flow Prediction ◽

Spatial Features ◽

Gating Mechanism ◽

Spatial Dependencies ◽

Previous State ◽

Global And Local

Spatiotemporal prediction is challenging due to the complex dynamic motion and appearance changes. Existing work concentrates on embedding additional cells into the standard ConvLSTM to memorize spatial appearances during the prediction. These models always rely on the convolution layers to capture the spatial dependence, which are local and inefficient. However, long-range spatial dependencies are significant for spatial applications. To extract spatial features with both global and local dependencies, we introduce the self-attention mechanism into ConvLSTM. Specifically, a novel self-attention memory (SAM) is proposed to memorize features with long-range dependencies in terms of spatial and temporal domains. Based on the self-attention, SAM can produce features by aggregating features across all positions of both the input itself and memory features with pair-wise similarity scores. Moreover, the additional memory is updated by a gating mechanism on aggregated features and an established highway with the memory of the previous time step. Therefore, through SAM, we can extract features with long-range spatiotemporal dependencies. Furthermore, we embed the SAM into a standard ConvLSTM to construct a self-attention ConvLSTM (SA-ConvLSTM) for the spatiotemporal prediction. In experiments, we apply the SA-ConvLSTM to perform frame prediction on the MovingMNIST and KTH datasets and traffic flow prediction on the TexiBJ dataset. Our SA-ConvLSTM achieves state-of-the-art results on both datasets with fewer parameters and higher time efficiency than previous state-of-the-art method.

Download Full-text

Arena: A General Evaluation Platform and Building Toolkit for Multi-Agent Intelligence

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6216 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7253-7260 ◽

Cited By ~ 2

Author(s):

Yuhang Song ◽

Andrzej Wojcicki ◽

Thomas Lukasiewicz ◽

Jianyi Wang ◽

Abi Aryan ◽

...

Keyword(s):

Reinforcement Learning ◽

State Of The Art ◽

Research Community ◽

Learning Agents ◽

General Evaluation ◽

Agent Learning ◽

Multi Agent ◽

Agent Intelligence ◽

Training Schemes ◽

Evaluation Platform

Learning agents that are not only capable of taking tests, but also innovating is becoming a hot topic in AI. One of the most promising paths towards this vision is multi-agent learning, where agents act as the environment for each other, and improving each agent means proposing new problems for others. However, existing evaluation platforms are either not compatible with multi-agent settings, or limited to a specific game. That is, there is not yet a general evaluation platform for research on multi-agent intelligence. To this end, we introduce Arena, a general evaluation platform for multi-agent intelligence with 35 games of diverse logics and representations. Furthermore, multi-agent intelligence is still at the stage where many problems remain unexplored. Therefore, we provide a building toolkit for researchers to easily invent and build novel multi-agent problems from the provided game set based on a GUI-configurable social tree and five basic multi-agent reward schemes. Finally, we provide Python implementations of five state-of-the-art deep multi-agent reinforcement learning baselines. Along with the baseline implementations, we release a set of 100 best agents/teams that we can train with different training schemes for each game, as the base for evaluating agents with population performance. As such, the research community can perform comparisons under a stable and uniform standard. All the implementations and accompanied tutorials have been open-sourced for the community at https://sites.google.com/view/arena-unity/.

Download Full-text

Design of an Adaptive e-Learning System based on Multi-Agent Approach and Reinforcement Learning

Engineering, Technology & Applied Science Research ◽

10.48084/etasr.3905 ◽

2021 ◽

Vol 11 (1) ◽

pp. 6637-6644

Author(s):

H. El Fazazi ◽

M. Elgarej ◽

M. Qbadou ◽

K. Mansouri

Keyword(s):

Reinforcement Learning ◽

Learning Style ◽

Learning Process ◽

Learning Experience ◽

Personalized Learning ◽

Learning Objects ◽

Learning Systems ◽

Learning System ◽

E Learning ◽

Multi Agent

Adaptive e-learning systems are created to facilitate the learning process. These systems are able to suggest the student the most suitable pedagogical strategy and to extract the information and characteristics of the learners. A multi-agent system is a collection of organized and independent agents that communicate with each other to resolve a problem or complete a well-defined objective. These agents are always in communication and they can be homogeneous or heterogeneous and may or may not have common objectives. The application of the multi-agent approach in adaptive e-learning systems can enhance the learning process quality by customizing the contents to students’ needs. The agents in these systems collaborate to provide a personalized learning experience. In this paper, a design of an adaptative e-learning system based on a multi-agent approach and reinforcement learning is presented. The main objective of this system is the recommendation to the students of a learning path that meets their characteristics and preferences using the Q-learning algorithm. The proposed system is focused on three principal characteristics, the learning style according to the Felder-Silverman learning style model, the knowledge level, and the student's possible disabilities. Three types of disabilities were taken into account, namely hearing impairments, visual impairments, and dyslexia. The system will be able to provide the students with a sequence of learning objects that matches their profiles for a personalized learning experience.

Download Full-text

VarLenMARL: A Framework of Variable-Length Time-Step Multi-Agent Reinforcement Learning for Cooperative Charging in Sensor Networks

2021 18th Annual IEEE International Conference on Sensing, Communication, and Networking (SECON) ◽

10.1109/secon52354.2021.9491594 ◽

2021 ◽

Author(s):

Yuxin Chen ◽

Hejun Wu ◽

Yongheng Liang ◽

Guoming Lai

Keyword(s):

Reinforcement Learning ◽

Sensor Networks ◽

Variable Length ◽

Time Step ◽

Multi Agent

Download Full-text

Multi-Agent Deep Reinforcement Learning for Solving Large-scale Air Traffic Flow Management Problem: A Time-Step Sequential Decision Approach

10.1109/dasc52595.2021.9594329 ◽

2021 ◽

Author(s):

Yifan Tang ◽

Yan Xu

Keyword(s):

Reinforcement Learning ◽

Traffic Flow ◽

Large Scale ◽

Management Problem ◽

Sequential Decision ◽

Time Step ◽

Flow Management ◽

Air Traffic Flow Management ◽

Traffic Flow Management ◽

Multi Agent

Download Full-text

Social interaction of cooperative communication and group generation in multi-agent reinforcement learning systems

2011 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2011) ◽

10.1109/fuzzy.2011.6007577 ◽

2011 ◽

Cited By ~ 1

Author(s):

Kun Zhang ◽

Yoichiro Maeda ◽

Yasutake Takahashi

Keyword(s):

Reinforcement Learning ◽

Social Interaction ◽

Cooperative Communication ◽

Learning Systems ◽

Multi Agent ◽

Group Generation

Download Full-text

Formal Reachability Analysis for Multi-Agent Reinforcement Learning Systems

IEEE Access ◽

10.1109/access.2021.3060156 ◽

2021 ◽

Vol 9 ◽

pp. 45812-45821

Author(s):

Xiaoyan Wang ◽

Jun Peng ◽

Shuqiu Li ◽

Bing Li

Keyword(s):

Reinforcement Learning ◽

Reachability Analysis ◽

Learning Systems ◽

Multi Agent

Download Full-text

KnowRU: Knowledge Reuse via Knowledge Distillation in Multi-Agent Reinforcement Learning

Entropy ◽

10.3390/e23081043 ◽

2021 ◽

Vol 23 (8) ◽

pp. 1043

Author(s):

Zijian Gao ◽

Kele Xu ◽

Bo Ding ◽

Huaimin Wang

Keyword(s):

Reinforcement Learning ◽

State Of The Art ◽

Training Phase ◽

Knowledge Reuse ◽

Asymptotic Performance ◽

Significant Progress ◽

Historical Experience ◽

Training Performance ◽

Knowledge Distillation ◽

Multi Agent

Recently, deep reinforcement learning (RL) algorithms have achieved significant progress in the multi-agent domain. However, training for increasingly complex tasks would be time-consuming and resource intensive. To alleviate this problem, efficient leveraging of historical experience is essential, which is under-explored in previous studies because most existing methods fail to achieve this goal in a continuously dynamic system owing to their complicated design. In this paper, we propose a method for knowledge reuse called “KnowRU”, which can be easily deployed in the majority of multi-agent reinforcement learning (MARL) algorithms without requiring complicated hand-coded design. We employ the knowledge distillation paradigm to transfer knowledge among agents to shorten the training phase for new tasks while improving the asymptotic performance of agents. To empirically demonstrate the robustness and effectiveness of KnowRU, we perform extensive experiments on state-of-the-art MARL algorithms in collaborative and competitive scenarios. The results show that KnowRU outperforms recently reported methods and not only successfully accelerates the training phase, but also improves the training performance, emphasizing the importance of the proposed knowledge reuse for MARL.

Download Full-text