Alternation Measures for the Evaluation of Selfish Agents’ Turn-Taking

Mapping Intimacies ◽

10.3233/faia210145 ◽

2021 ◽

Author(s):

Nikolaos Al. Papadopoulos ◽

Marti Sanchez-Fibla

Keyword(s):

Reinforcement Learning ◽

State Of The Art ◽

Social Phenomena ◽

Learning Agents ◽

Fairness And Efficiency ◽

Turn Taking ◽

Multi Agent ◽

Series Of Experiments ◽

Selfish Agents ◽

Efficient Distribution

Multi-Agent Reinforcement Learning reductionist simulations can provide a spectrum of opportunities towards the modeling and understanding of complex social phenomena such as common-pool appropriation. In this paper, a multiplayer variant of Battle-of-the-Exes is suggested as appropriate for experimentation regarding fair and efficient coordination and turn-taking among selfish agents. Going beyond literature’s fairness and efficiency, a novel measure is proposed for turn-taking coordination evaluation, robust to the number of agents and episodes of a system. Six variants of this measure are defined, entitled Alternation Measures or ALT. ALT measures were found sufficient to capture the desired properties (alternation, fair and efficient distribution) in comparison to state-of-the-art measures, thus they were benchmarked and tested through a series of experiments with Reinforcement Learning agents, aspiring to contribute novel tools for a deeper understanding of emergent social outcomes.
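
The abstract does not give the ALT definitions, so the following is only an illustrative sketch of why fairness and efficiency alone can miss turn-taking. The function `turn_taking_stats`, its three scores, and the encoding of a collision as `None` are assumptions made for this example and are not the paper's measures.

```python
from collections import Counter

def turn_taking_stats(winners, n_agents):
    """Toy scores for a repeated game with one high reward per episode.

    winners[t] is the index of the agent that obtained the high reward in
    episode t, or None when nobody did (hypothetical encoding of a collision).
    """
    episodes = len(winners)
    wins = Counter(w for w in winners if w is not None)
    # Efficiency: share of episodes in which the high reward was claimed at all.
    efficiency = sum(wins.values()) / episodes
    # Fairness: wins of the least successful agent over wins of the most successful.
    counts = [wins.get(agent, 0) for agent in range(n_agents)]
    fairness = min(counts) / max(counts) if max(counts) else 0.0
    # Alternation: share of consecutive episode pairs where the winner changes.
    switches = sum(
        1 for prev, cur in zip(winners, winners[1:])
        if prev is not None and cur is not None and prev != cur
    )
    alternation = switches / (episodes - 1) if episodes > 1 else 0.0
    return efficiency, fairness, alternation
```

Under these toy definitions, the sequences `[0, 1, 0, 1, 0, 1]` and `[0, 0, 0, 1, 1, 1]` score the same on efficiency and fairness, and only the alternation score tells strict turn-taking apart from two long blocks.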


Arena: A General Evaluation Platform and Building Toolkit for Multi-Agent Intelligence

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6216 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7253-7260 ◽

Cited By ~ 2

Author(s):

Yuhang Song ◽

Andrzej Wojcicki ◽

Thomas Lukasiewicz ◽

Jianyi Wang ◽

Abi Aryan ◽

...

Keyword(s):

Reinforcement Learning ◽

State Of The Art ◽

Research Community ◽

Learning Agents ◽

General Evaluation ◽

Agent Learning ◽

Multi Agent ◽

Agent Intelligence ◽

Training Schemes ◽

Evaluation Platform

Learning agents that are capable not only of taking tests but also of innovating are becoming a hot topic in AI. One of the most promising paths towards this vision is multi-agent learning, where agents act as the environment for each other, and improving each agent means proposing new problems for others. However, existing evaluation platforms are either not compatible with multi-agent settings, or limited to a specific game. That is, there is not yet a general evaluation platform for research on multi-agent intelligence. To this end, we introduce Arena, a general evaluation platform for multi-agent intelligence with 35 games of diverse logics and representations. Furthermore, multi-agent intelligence is still at the stage where many problems remain unexplored. Therefore, we provide a building toolkit for researchers to easily invent and build novel multi-agent problems from the provided game set based on a GUI-configurable social tree and five basic multi-agent reward schemes. Finally, we provide Python implementations of five state-of-the-art deep multi-agent reinforcement learning baselines. Along with the baseline implementations, we release a set of 100 best agents/teams that we can train with different training schemes for each game, as the base for evaluating agents with population performance. As such, the research community can perform comparisons under a stable and uniform standard. All the implementations and accompanying tutorials have been open-sourced for the community at https://sites.google.com/view/arena-unity/.


Combat Robot Strategy Adaptation Using Multiple Learning Agents

Volume 4: Dynamics, Control and Uncertainty, Parts A and B ◽

10.1115/imece2012-87521 ◽

2012 ◽

Author(s):

Thomas Recchia ◽

Jae Chung ◽

Kishore Pochiraju

Keyword(s):

Reinforcement Learning ◽

Robotic Systems ◽

Multi Agent System ◽

Learning Agents ◽

Loosely Coupled ◽

Reward Function ◽

Strategy Adaptation ◽

Agent Learning ◽

Multi Agent ◽

Reward Functions

As robotic systems become more prevalent, it is highly desirable for them to be able to operate in highly dynamic environments. A common approach is to use reinforcement learning to allow an agent controlling the robot to learn and adapt its behavior based on a reward function. This paper presents a novel multi-agent system that cooperates to control a single robot battle tank in a melee battle scenario, with no prior knowledge of its opponents’ strategies. The agents learn through reinforcement learning, and are loosely coupled by their reward functions. Each agent controls a different aspect of the robot’s behavior. In addition, the problem of delayed reward is addressed through a time-averaged reward applied to several sequential actions at once. This system was evaluated in a simulated melee combat scenario and was shown to learn to improve its performance over time. This was accomplished by each agent learning to pick specific battle strategies for each different opponent it faced.
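
The abstract mentions a time-averaged reward applied to several sequential actions at once but gives no update rule. The following is a minimal sketch of one way that idea could look in a tabular setting; the function name `time_averaged_updates`, the dictionary-based value table, and the learning rate `alpha` are assumptions for illustration, not the paper's implementation.

```python
def time_averaged_updates(q, steps, alpha=0.1):
    """Credit every (state, action) pair in a window with the window's mean reward.

    q     : dict mapping (state, action) -> estimated value (hypothetical table)
    steps : list of (state, action, reward) tuples for consecutive actions
    """
    # A delayed reward that arrives on the last step is shared across the window.
    mean_reward = sum(reward for _, _, reward in steps) / len(steps)
    for state, action, _ in steps:
        old = q.get((state, action), 0.0)
        # Move each estimate a fraction alpha toward the shared averaged reward.
        q[(state, action)] = old + alpha * (mean_reward - old)
    return mean_reward
```

For example, if only the last of three actions receives a reward of 3.0, all three actions are updated toward the average of 1.0, so the earlier actions that set up the outcome also receive credit.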


Building autonomic systems using collaborative reinforcement learning

The Knowledge Engineering Review ◽

10.1017/s0269888906000956 ◽

2006 ◽

Vol 21 (3) ◽

pp. 231-238 ◽

Cited By ~ 12

Author(s):

JIM DOWLING ◽

RAYMOND CUNNINGHAM ◽

EOIN CURRAN ◽

VINNY CAHILL

Keyword(s):

Reinforcement Learning ◽

Ad Hoc ◽

Optimization Problems ◽

System Optimization ◽

Multi Agent Systems ◽

Learning Agents ◽

Coordination Model ◽

Routing Performance ◽

Multi Agent ◽

Unpredictable Environment

This paper presents Collaborative Reinforcement Learning (CRL), a coordination model for online system optimization in decentralized multi-agent systems. In CRL, system optimization problems are represented as a set of discrete optimization problems, each of whose solution cost is minimized by model-based reinforcement learning agents collaborating on their solution. CRL systems can be built to provide autonomic behaviours such as optimizing system performance in an unpredictable environment and adaptation to partial failures. We evaluate CRL using an ad hoc routing protocol that optimizes system routing performance in an unpredictable network environment.


Attention-Based Fault-Tolerant Approach for Multi-Agent Reinforcement Learning Systems

Entropy ◽

10.3390/e23091133 ◽

2021 ◽

Vol 23 (9) ◽

pp. 1133

Author(s):

Shanzhi Gu ◽

Mingyang Geng ◽

Long Lan

Keyword(s):

Reinforcement Learning ◽

Noise Intensity ◽

Fault Tolerant ◽

State Of The Art ◽

Learning Systems ◽

Noisy Environments ◽

Time Step ◽

Malicious Behavior ◽

Previous State ◽

Multi Agent

The aim of multi-agent reinforcement learning systems is to provide interacting agents with the ability to collaboratively learn and adapt to the behavior of other agents. Typically, an agent receives its private observations providing a partial view of the true state of the environment. However, in realistic settings, the harsh environment might cause one or more agents to show arbitrarily faulty or malicious behavior, which may suffice to make the current coordination mechanisms fail. In this paper, we study a practical scenario of multi-agent reinforcement learning systems considering the security issues in the presence of agents with arbitrarily faulty or malicious behavior. The previous state-of-the-art work that coped with extremely noisy environments was designed on the basis that the noise intensity in the environment was known in advance. However, when the noise intensity changes, the existing method has to adjust the configuration of the model to learn in new environments, which limits its practical applications. To overcome these difficulties, we present an Attention-based Fault-Tolerant (FT-Attn) model, which can select not only correct, but also relevant information for each agent at every time step in noisy environments. The multihead attention mechanism enables the agents to learn effective communication policies through experience concurrent with the action policies. Empirical results showed that FT-Attn beats previous state-of-the-art methods in some extremely noisy environments in both cooperative and competitive scenarios, coming much closer to the upper-bound performance. Furthermore, FT-Attn maintains a more general fault tolerance ability and does not rely on prior knowledge about the noise intensity of the environment.


Research on Tensor-Based Cooperative and Competitive in Multi-Agent Reinforcement Learning

European Journal of Electrical Engineering and Computer Science ◽

10.24018/ejece.2020.4.6.262 ◽

2020 ◽

Vol 4 (6) ◽

Author(s):

Tsega Weldu Araya ◽

Md Rashed Ibn Nawab ◽

A. P. Yuan Ling

Keyword(s):

Reinforcement Learning ◽

Learning Algorithm ◽

Data Representation ◽

Training Data ◽

Two Dimensional ◽

Multiple Agents ◽

Learning Agents ◽

Dimensional Array ◽

Multi Agent ◽

Agent Cooperation

As technology grows rapidly, the variety of information and the density of work become demanding to manage. Machine-learning (ML) technology was developed to reduce this workload and human labor. Reinforcement learning (RL) is a recent advancement in ML studies, and multi-agent reinforcement learning (MARL) is useful for training multiple agents in a shared environment. Previous research focused on two-agent cooperation, with data held in a two-dimensional array, i.e., a matrix. The limitation of the two-dimensional array appears as the agents' training data grows, creating storage drawbacks and data redundancy. Our first aim in this research is to improve an algorithm that can represent MARL training in a tensor. In MARL, multiple agents work together to achieve a joint task. To share the training records and data of numerous agents, we need to collect the agents' previous cumulative experience in a tensor. Secondly, we will explore agent cooperation and competition, with local and global goals of agents in MARL. The local goal is the cooperation of agents in a group or team, where we use the training model as a student and teacher agent. The global goal is the competition between two opposing teams to acquire the reward. Each learning agent has its own Q table for storing its training data in an environment. The growth in the number of learning agents, their training experience in Q tables, and the requirement for representing multiple data become the most challenging issue. We introduce a tensor to store these data and to resolve the data-representation challenges in multi-agent settings. The tensor is expressed here as a three-dimensional array, although in general it is an N-way array, which is useful for representing and accessing numerous data. Finally, we will implement an algorithm for training three cooperative agents against an opposing team using a tensor-based framework in the Q-learning algorithm. We will provide an algorithm that can store the training records and data of multiple agents. The tensor achieves a smaller storage size than the matrix for the agents' training records, and three-agent cooperation helps to attain the maximum reward.
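
As a rough illustration of holding several agents' Q tables in one three-way array, the sketch below indexes a nested list by agent, state, and action and applies the standard tabular Q-learning update. The names `make_q_tensor` and `q_update`, the sizes used, and the values of `alpha` and `gamma` are assumptions for this example; the abstract does not specify the authors' data layout or algorithm.

```python
def make_q_tensor(n_agents, n_states, n_actions):
    """Build a zero-initialised three-way array indexed [agent][state][action]."""
    return [[[0.0] * n_actions for _ in range(n_states)] for _ in range(n_agents)]

def q_update(q, agent, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning step applied to one agent's slice of the tensor."""
    # Greedy value of the next state, read from the same agent's slice.
    best_next = max(q[agent][next_state])
    td_target = reward + gamma * best_next
    q[agent][state][action] += alpha * (td_target - q[agent][state][action])

# Three cooperating agents, four states, two actions (illustrative sizes).
q = make_q_tensor(3, 4, 2)
q_update(q, agent=0, state=1, action=0, reward=1.0, next_state=2)
```

Keeping every agent's table in one structure means a teammate's experience can be read with a single index, for example `q[1][state]`, rather than through a separate per-agent matrix.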


Co-Evolution of Predator-Prey Ecosystems by Reinforcement Learning Agents

Entropy ◽

10.3390/e23040461 ◽

2021 ◽

Vol 23 (4) ◽

pp. 461

Author(s):

Jeongho Park ◽

Juwon Lee ◽

Taehwan Kim ◽

Inkyung Ahn ◽

Jooyoung Park

Keyword(s):

Reinforcement Learning ◽

Expected Returns ◽

Dynamic Nature ◽

Predator Prey ◽

Learning Agents ◽

Complex Interactions ◽

Multi Agent ◽

Simulation Results ◽

Multiple Species ◽

And Robotics

The problem of finding adequate population models in ecology is important for understanding essential aspects of their dynamic nature. Since analyzing and accurately predicting the intelligent adaptation of multiple species is difficult due to their complex interactions, the study of population dynamics still remains a challenging task in computational biology. In this paper, we use a modern deep reinforcement learning (RL) approach to explore a new avenue for understanding predator-prey ecosystems. Recently, reinforcement learning methods have achieved impressive results in areas such as games and robotics. RL agents generally focus on building strategies for taking actions in an environment in order to maximize their expected returns. Here we frame the co-evolution of predators and prey in an ecosystem as allowing agents to learn and evolve toward better ones in a manner appropriate for multi-agent reinforcement learning. Recent significant advancements in reinforcement learning allow for new perspectives on these types of ecological issues. Our simulation results show that throughout the scenarios with RL agents, predators can achieve a reasonable level of sustainability, along with their prey.


KnowRU: Knowledge Reuse via Knowledge Distillation in Multi-Agent Reinforcement Learning

Entropy ◽

10.3390/e23081043 ◽

2021 ◽

Vol 23 (8) ◽

pp. 1043

Author(s):

Zijian Gao ◽

Kele Xu ◽

Bo Ding ◽

Huaimin Wang

Keyword(s):

Reinforcement Learning ◽

State Of The Art ◽

Training Phase ◽

Knowledge Reuse ◽

Asymptotic Performance ◽

Significant Progress ◽

Historical Experience ◽

Training Performance ◽

Knowledge Distillation ◽

Multi Agent

Recently, deep reinforcement learning (RL) algorithms have achieved significant progress in the multi-agent domain. However, training for increasingly complex tasks would be time-consuming and resource intensive. To alleviate this problem, efficient leveraging of historical experience is essential, which is under-explored in previous studies because most existing methods fail to achieve this goal in a continuously dynamic system owing to their complicated design. In this paper, we propose a method for knowledge reuse called “KnowRU”, which can be easily deployed in the majority of multi-agent reinforcement learning (MARL) algorithms without requiring complicated hand-coded design. We employ the knowledge distillation paradigm to transfer knowledge among agents to shorten the training phase for new tasks while improving the asymptotic performance of agents. To empirically demonstrate the robustness and effectiveness of KnowRU, we perform extensive experiments on state-of-the-art MARL algorithms in collaborative and competitive scenarios. The results show that KnowRU outperforms recently reported methods and not only successfully accelerates the training phase, but also improves the training performance, emphasizing the importance of the proposed knowledge reuse for MARL.
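
The abstract says knowledge is transferred among agents through knowledge distillation but does not state the loss. A commonly used form is the KL divergence between temperature-softened teacher and student action distributions; the sketch below shows that generic form only. The function names and the default `temperature` are assumptions, and this should not be read as KnowRU's exact objective.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the maximum for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over softened action distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    # Terms with zero teacher probability contribute nothing to the divergence.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, so minimising it pulls a newly trained agent's policy toward that of an experienced one.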


Rebalancing Expanding EV Sharing Systems with Deep Reinforcement Learning

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/186 ◽

2020 ◽

Author(s):

Man Luo ◽

Wenzhe Zhang ◽

Tianyou Song ◽

Kun Li ◽

Hongming Zhu ◽

...

Keyword(s):

Reinforcement Learning ◽

State Of The Art ◽

Charging Time ◽

Novel Approach ◽

The World ◽

Net Revenue ◽

Multi Agent ◽

User Demand ◽

Policy Optimization ◽

Over Time

Electric Vehicle (EV) sharing systems have recently experienced unprecedented growth across the world. One of the key challenges in their operation is vehicle rebalancing, i.e., repositioning the EVs across stations to better satisfy future user demand. This is particularly challenging in the shared EV context, because i) the range of EVs is limited while charging time is substantial, which constrains the rebalancing options; and ii) as a new mobility trend, most of the current EV sharing systems are still continuously expanding their station networks, i.e., the targets for rebalancing can change over time. To tackle these challenges, in this paper we model the rebalancing task as a Multi-Agent Reinforcement Learning (MARL) problem, which directly takes the range and charging properties of the EVs into account. We propose a novel approach of policy optimization with action cascading, which isolates the non-stationarity locally, and use two connected networks to solve the formulated MARL. We evaluate the proposed approach using a simulator calibrated with 1-year operation data from a real EV sharing system. Results show that our approach significantly outperforms the state-of-the-art, offering up to 14% gain in order satisfied rate and 12% increase in net revenue.


Inducing selfish agents towards social efficient solutions

10.5753/kdmile.2020.11953 ◽

2020 ◽

Author(s):

João Schapke ◽

Ana Bazzan

Keyword(s):

Reinforcement Learning ◽

Repeated Games ◽

Price Of Anarchy ◽

Efficient Solutions ◽

High Price ◽

Central Controller ◽

The Social ◽

Multi Agent ◽

The Individual ◽

Selfish Agents

Many multi-agent reinforcement learning (MARL) scenarios lead towards Nash equilibria, which are known to not always be socially efficient. In this study we aim to align the social optimization objective of the system with the individual objectives of the agents by adopting a central controller which can interact with the agents. In detail, our approach establishes a communication channel between reinforcement learning agents and a controller implemented with metaheuristics. The interaction benefits the convergence of both algorithms. Further, we evaluate our method in repeated games with high price of anarchy and show that our approach is able to overcome many of the issues caused by the non-cooperative behaviour of the agents and the non-stationary effects they cause.
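
To make the notion of a high price of anarchy concrete, the sketch below computes it for a two-player reward game as the best achievable total reward divided by the total reward at the worst pure Nash equilibrium. The function names and the Prisoner's Dilemma payoffs are assumptions chosen for illustration; the abstract does not list the games the authors used.

```python
from itertools import product

def pure_nash_equilibria(payoffs):
    """payoffs[(a0, a1)] = (reward to player 0, reward to player 1)."""
    actions0 = sorted({a for a, _ in payoffs})
    actions1 = sorted({b for _, b in payoffs})
    equilibria = []
    for a, b in product(actions0, actions1):
        r0, r1 = payoffs[(a, b)]
        # Neither player can gain by deviating unilaterally.
        best0 = all(payoffs[(alt, b)][0] <= r0 for alt in actions0)
        best1 = all(payoffs[(a, alt)][1] <= r1 for alt in actions1)
        if best0 and best1:
            equilibria.append((a, b))
    return equilibria

def price_of_anarchy(payoffs):
    """Optimal welfare over worst pure-equilibrium welfare.

    Assumes at least one pure equilibrium with positive welfare exists.
    """
    welfare = {joint: sum(rewards) for joint, rewards in payoffs.items()}
    worst_equilibrium = min(welfare[e] for e in pure_nash_equilibria(payoffs))
    return max(welfare.values()) / worst_equilibrium

# Illustrative Prisoner's Dilemma: C = cooperate, D = defect.
dilemma = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}
```

In this example mutual defection is the only pure equilibrium, with total reward 2 against an optimum of 6, which gives a price of anarchy of 3: selfish play leaves two thirds of the achievable welfare unclaimed.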


Multi-agent reinforcement learning for character control

The Visual Computer ◽

10.1007/s00371-021-02269-1 ◽

2021 ◽

Author(s):

Cheng Li ◽

Levi Fussell ◽

Taku Komura

Keyword(s):

Deep Learning ◽

Reinforcement Learning ◽

Computer Games ◽

State Of The Art ◽

Research Topic ◽

Future Directions ◽

Computer Animations ◽

Survey Papers ◽

Simultaneous Control ◽

Multi Agent

Simultaneous control of multiple characters has been extensively pursued as a research topic for applications in computer games and computer animation, such as crowd simulation, controlling two characters carrying objects or fighting with one another, and controlling a team of characters playing collective sports. With the advance in deep learning and reinforcement learning, there is a growing interest in applying multi-agent reinforcement learning (MARL) for intelligently controlling the characters to produce realistic movements. In this paper we survey the state-of-the-art MARL techniques that are applicable to character control. We then survey papers that make use of MARL for multi-character control and discuss possible future directions of research.
