Dynamical systems as a level of cognitive analysis of multi-agent learning

Neural Computing and Applications ◽

10.1007/s00521-021-06117-0 ◽

2021 ◽

Author(s):

Wolfram Barfuss

Keyword(s):

Game Theory ◽

Reinforcement Learning ◽

Dynamical Systems ◽

Full Potential ◽

Temporal Difference ◽

Cognitive Analysis ◽

Deterministic Learning ◽

Agent Learning ◽

Multi Agent ◽

Abstraction Levels

A dynamical systems perspective on multi-agent learning, based on the link between evolutionary game theory and reinforcement learning, provides an improved, qualitative understanding of the emerging collective learning dynamics. However, confusion exists with respect to how this dynamical systems account of multi-agent learning should be interpreted. In this article, I propose to embed the dynamical systems description of multi-agent learning into different abstraction levels of cognitive analysis. The purpose of this work is to make the connections between these levels explicit in order to gain improved insight into multi-agent learning. I demonstrate the usefulness of this framework with the general and widespread class of temporal-difference reinforcement learning. I find that its deterministic dynamical systems description follows a minimum free-energy principle and unifies a boundedly rational account of game theory with decision-making under uncertainty. I then propose an on-line sample-batch temporal-difference algorithm which is characterized by the combination of applying a memory-batch and separated state-action value estimation. I find that this algorithm serves as a micro-foundation of the deterministic learning equations by showing that its learning trajectories approach the ones of the deterministic learning equations under large batch sizes. Ultimately, this framework of embedding a dynamical systems description into different abstraction levels gives guidance on how to unleash the full potential of the dynamical systems approach to multi-agent learning.
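As a rough illustration of the batch idea mentioned in this abstract, the sketch below averages temporal-difference errors over a stored batch of transitions before updating a tabular value estimate. It is a generic Q-learning-style update, not the algorithm proposed in the paper; the function name `batch_td_update`, the tuple layout, and the parameters `alpha` and `gamma` are assumptions made for the example.

```python
from collections import defaultdict


def batch_td_update(q, batch, alpha=0.1, gamma=0.9):
    """Apply one averaged TD update per (state, action) pair found in the batch.

    q     : dict mapping (state, action) -> value estimate (hypothetical layout)
    batch : list of (state, action, reward, next_state, next_actions) tuples
    """
    errors = defaultdict(list)
    for state, action, reward, next_state, next_actions in batch:
        # Bootstrapped target: reward plus discounted best next value.
        next_value = max(
            (q.get((next_state, a), 0.0) for a in next_actions), default=0.0
        )
        td_error = reward + gamma * next_value - q.get((state, action), 0.0)
        errors[(state, action)].append(td_error)
    # Average the errors per pair, so larger batches give less noisy updates.
    for key, errs in errors.items():
        q[key] = q.get(key, 0.0) + alpha * sum(errs) / len(errs)
    return q


q = {}
batch = [
    ("s0", "a", 1.0, "s1", ["a", "b"]),
    ("s0", "a", 0.0, "s1", ["a", "b"]),
]
batch_td_update(q, batch, alpha=0.5, gamma=0.9)
```

All errors are computed against the old estimates before any value is changed, which is what makes the update a batch update rather than a sequence of single-sample updates.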

User-Centric Radio Access Technology Selection: A Survey of Game Theory Models and Multi-Agent Learning Algorithms

IEEE Access ◽

10.1109/access.2021.3087410 ◽

2021 ◽

pp. 1-1

Author(s):

Giuseppe Caso ◽

Ozgu Alay ◽

Guido Carlo Ferrante ◽

Luca De Nardis ◽

Maria-Gabriella Di Benedetto ◽

...

Keyword(s):

Game Theory ◽

Learning Algorithms ◽

Technology Selection ◽

Access Technology ◽

Radio Access Technology ◽

Radio Access ◽

Agent Learning ◽

Multi Agent ◽

User Centric ◽

Radio Access Technology Selection

Hierarchical Reinforcement Learning

ACM Computing Surveys ◽

10.1145/3453160 ◽

2021 ◽

Vol 54 (5) ◽

pp. 1-35

Author(s):

Shubham Pateria ◽

Budhitama Subagdja ◽

Ah-hwee Tan ◽

Chai Quek

Keyword(s):

Reinforcement Learning ◽

Future Research ◽

Comprehensive Overview ◽

Open Problems ◽

Practical Applications ◽

Hierarchical Reinforcement Learning ◽

The Past ◽

Agent Learning ◽

Multi Agent ◽

Supplementary Material

Hierarchical Reinforcement Learning (HRL) enables autonomous decomposition of challenging long-horizon decision-making tasks into simpler subtasks. In recent years, the landscape of HRL research has grown profoundly, resulting in copious approaches. A comprehensive overview of this vast landscape is necessary to study HRL in an organized manner. We provide a survey of the diverse HRL approaches concerning the challenges of learning hierarchical policies, subtask discovery, transfer learning, and multi-agent learning using HRL. The survey is presented according to a novel taxonomy of the approaches. Based on the survey, a set of important open problems is proposed to motivate future research in HRL. Furthermore, we outline a few suitable task domains for evaluating the HRL approaches and a few interesting examples of the practical applications of HRL in the Supplementary Material.

Robust Multi-Agent Reinforcement Learning via Minimax Deep Deterministic Policy Gradient

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33014213 ◽

2019 ◽

Vol 33 ◽

pp. 4213-4220 ◽

Cited By ~ 12

Author(s):

Shihui Li ◽

Yi Wu ◽

Xinyue Cui ◽

Honghua Dong ◽

Fei Fang ◽

...

Keyword(s):

Reinforcement Learning ◽

Gradient Algorithm ◽

Training Environment ◽

Local Optima ◽

Continuous Action ◽

Agent Learning ◽

Policy Gradient ◽

Multi Agent ◽

Continuous Actions ◽

Computational Intractability

Despite the recent advances of deep reinforcement learning (DRL), agents trained by DRL tend to be brittle and sensitive to the training environment, especially in multi-agent scenarios. In the multi-agent setting, a DRL agent’s policy can easily get stuck in a poor local optimum w.r.t. its training partners: the learned policy may be only locally optimal to other agents’ current policies. In this paper, we focus on the problem of training robust DRL agents with continuous actions in the multi-agent learning setting so that the trained agents can still generalize when their opponents’ policies change. To tackle this problem, we propose a new algorithm, MiniMax Multi-agent Deep Deterministic Policy Gradient (M3DDPG), with the following contributions: (1) we introduce a minimax extension of the popular multi-agent deep deterministic policy gradient algorithm (MADDPG) for robust policy learning; (2) since the continuous action space leads to computational intractability in our minimax learning objective, we propose Multi-Agent Adversarial Learning (MAAL) to efficiently solve our proposed formulation. We empirically evaluate our M3DDPG algorithm in four mixed cooperative and competitive multi-agent environments, and the agents trained by our method significantly outperform existing baselines.
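The minimax idea in this abstract can be illustrated on a toy discrete payoff table: the learner scores each of its actions by the worst outcome over the opponent's actions and then picks the best of those scores. The paper itself addresses continuous actions with deep networks, so the sketch below is only a simplified discrete stand-in; the function name `maximin_action` and the payoff values are assumed for the example.

```python
def maximin_action(payoff):
    """Return (action_index, guaranteed_value) for the row player.

    payoff[i][j] is the row player's payoff when it plays i and the
    opponent plays j (a hypothetical toy table).
    """
    # Worst case for each of our actions, taken over all opponent actions.
    worst_case = [min(row) for row in payoff]
    # Choose the action whose worst case is the least bad.
    best = max(range(len(payoff)), key=lambda i: worst_case[i])
    return best, worst_case[best]


payoff = [
    [3, 0],  # high payoff only if the opponent plays its first action
    [2, 1],  # lower peak, but safer if the opponent's policy changes
]
action, value = maximin_action(payoff)
```

In this table the second action is selected because its worst-case payoff is higher, even though the first action has the larger best-case payoff.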

Arena: A General Evaluation Platform and Building Toolkit for Multi-Agent Intelligence

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6216 ◽

2020 ◽

Vol 34 (05) ◽

pp. 7253-7260 ◽

Cited By ~ 2

Author(s):

Yuhang Song ◽

Andrzej Wojcicki ◽

Thomas Lukasiewicz ◽

Jianyi Wang ◽

Abi Aryan ◽

...

Keyword(s):

Reinforcement Learning ◽

State Of The Art ◽

Research Community ◽

Learning Agents ◽

General Evaluation ◽

Agent Learning ◽

Multi Agent ◽

Agent Intelligence ◽

Training Schemes ◽

Evaluation Platform

Learning agents that are capable not only of taking tests but also of innovating are becoming a hot topic in AI. One of the most promising paths towards this vision is multi-agent learning, where agents act as the environment for each other, and improving each agent means proposing new problems for others. However, existing evaluation platforms are either not compatible with multi-agent settings, or limited to a specific game. That is, there is not yet a general evaluation platform for research on multi-agent intelligence. To this end, we introduce Arena, a general evaluation platform for multi-agent intelligence with 35 games of diverse logics and representations. Furthermore, multi-agent intelligence is still at the stage where many problems remain unexplored. Therefore, we provide a building toolkit for researchers to easily invent and build novel multi-agent problems from the provided game set based on a GUI-configurable social tree and five basic multi-agent reward schemes. Finally, we provide Python implementations of five state-of-the-art deep multi-agent reinforcement learning baselines. Along with the baseline implementations, we release a set of 100 best agents/teams that we can train with different training schemes for each game, as the base for evaluating agents with population performance. As such, the research community can perform comparisons under a stable and uniform standard. All the implementations and accompanying tutorials have been open-sourced for the community at https://sites.google.com/view/arena-unity/.

Combat Robot Strategy Adaptation Using Multiple Learning Agents

Volume 4: Dynamics, Control and Uncertainty, Parts A and B ◽

10.1115/imece2012-87521 ◽

2012 ◽

Author(s):

Thomas Recchia ◽

Jae Chung ◽

Kishore Pochiraju

Keyword(s):

Reinforcement Learning ◽

Robotic Systems ◽

Multi Agent System ◽

Learning Agents ◽

Loosely Coupled ◽

Reward Function ◽

Strategy Adaptation ◽

Agent Learning ◽

Multi Agent ◽

Reward Functions

As robotic systems become more prevalent, it is highly desirable for them to be able to operate in highly dynamic environments. A common approach is to use reinforcement learning to allow an agent controlling the robot to learn and adapt its behavior based on a reward function. This paper presents a novel multi-agent system that cooperates to control a single robot battle tank in a melee battle scenario, with no prior knowledge of its opponents’ strategies. The agents learn through reinforcement learning, and are loosely coupled by their reward functions. Each agent controls a different aspect of the robot’s behavior. In addition, the problem of delayed reward is addressed through a time-averaged reward applied to several sequential actions at once. This system was evaluated in a simulated melee combat scenario and was shown to learn to improve its performance over time. This was accomplished by each agent learning to pick specific battle strategies for each different opponent it faced.
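The time-averaged reward mentioned in this abstract can be sketched as follows: the rewards observed over a window of sequential actions are averaged, and every action in the window is credited with that mean. This is one plausible reading of the abstract, not the authors' implementation; the function name `time_averaged_credit` and the action labels are assumptions.

```python
def time_averaged_credit(actions, rewards):
    """Assign the mean reward of a window to every action in that window."""
    if len(actions) != len(rewards):
        raise ValueError("actions and rewards must have the same length")
    if not actions:
        return []
    mean_reward = sum(rewards) / len(rewards)
    # Each action in the window shares the same averaged reward, so early
    # actions are credited for a reward that only arrived at the end.
    return [(action, mean_reward) for action in actions]


credited = time_averaged_credit(["aim", "fire", "retreat"], [0.0, 0.0, 3.0])
```

Here the delayed reward of 3.0 is spread evenly, so each of the three actions is credited with 1.0.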

Multi-agent learning methods with reinforcement using game theory algorithms

Politechnical student journal ◽

10.18698/2541-8009-2020-11-652 ◽

2020 ◽

Author(s):

V.E. Bolshakov

Keyword(s):

Game Theory ◽

Learning Methods ◽

Agent Learning ◽

Multi Agent

Adaptive Load Balancing: A Study in Multi-Agent Learning

Journal of Artificial Intelligence Research ◽

10.1613/jair.121 ◽

1995 ◽

Vol 2 ◽

pp. 475-500 ◽

Cited By ~ 80

Author(s):

A. Schaerf ◽

Y. Shoham ◽

M. Tennenholtz

Keyword(s):

Reinforcement Learning ◽

Load Balancing ◽

Distributed System ◽

Adaptive Behavior ◽

Local Information ◽

System Efficiency ◽

Agent Learning ◽

Multi Agent ◽

Explicit Communication

We study the process of multi-agent reinforcement learning in the context of load balancing in a distributed system, without use of either central coordination or explicit communication. We first define a precise framework in which to study adaptive load balancing, important features of which are its stochastic nature and the purely local information available to individual agents. Given this framework, we show illuminating results on the interplay between basic adaptive behavior parameters and their effect on system efficiency. We then investigate the properties of adaptive load balancing in heterogeneous populations, and address the issue of exploration vs. exploitation in that context. Finally, we show that naive use of communication may not improve, and might even harm system efficiency.

Deep Reinforcement Learning Versus Evolution Strategies: A Comparative Survey

10.36227/techrxiv.14679504.v2 ◽

2021 ◽

Author(s):

Amjad Yousef Majid ◽

Serge Saaybi ◽

Tomas van Rietbergen ◽

Vincent Francois-Lavet ◽

R Venkatesha Prasad ◽

...

Keyword(s):

Reinforcement Learning ◽

Evolution Strategies ◽

Sequential Decision Making ◽

Sequential Decision ◽

Level Control ◽

Agent Learning ◽

Real World Applications ◽

Multi Agent ◽

Comparative Survey ◽

Key Aspects

Deep Reinforcement Learning (DRL) and Evolution Strategies (ESs) have surpassed human-level control in many sequential decision-making problems, yet many open challenges still exist. To get insights into the strengths and weaknesses of DRL versus ESs, an analysis of their respective capabilities and limitations is provided. After presenting their fundamental concepts and algorithms, a comparison is provided on key aspects such as scalability, exploration, adaptation to dynamic environments, and multi-agent learning. Then, the benefits of hybrid algorithms that combine concepts from DRL and ESs are highlighted. Finally, to have an indication about how they compare in real-world applications, a survey of the literature for the set of applications they support is provided.

A Cooperative Multi-Agent System for Traffic Signal Control Using Game Theory and Reinforcement Learning

IEEE Intelligent Transportation Systems Magazine ◽

10.1109/mits.2020.2990189 ◽

2020 ◽

pp. 0-0

Author(s):

Monireh Abdoos

Keyword(s):

Game Theory ◽

Reinforcement Learning ◽

Traffic Signal ◽

Signal Control ◽

Traffic Signal Control ◽

Multi Agent System ◽

Agent System ◽

Multi Agent

Adaptation Method of the Exploration Ratio Based on the Orientation of Equilibrium in Multi-Agent Reinforcement Learning Under Non-Stationary Environments

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2017.p0939 ◽

2017 ◽

Vol 21 (5) ◽

pp. 939-947

Author(s):

Takuya Okano ◽

Itsuki Noda

Keyword(s):

Reinforcement Learning ◽

Learning Performance ◽

Evolutionary Adaptation ◽

Agent Learning ◽

Optimal Value ◽

Multi Agent ◽

Adaptation Method

In this paper, we propose a method to adapt the exploration ratio in multi-agent reinforcement learning. The adaptation of the exploration ratio is important in multi-agent learning, as it is one of the key parameters that affect the learning performance. In our observation, the adaptation method can adjust the exploration ratio suitably (but not optimally) according to the characteristics of environments. We investigated the evolutionary adaptation of the exploration ratio in multi-agent learning. We conducted several experiments to adapt the exploration ratio in a simple evolutionary way, namely, mimicking advantageous exploration ratio (MAER), and confirmed that MAER always acquires a relatively lower exploration ratio than the optimal value for the change ratio of the environments. In this paper, we propose a second evolutionary adaptation method, namely, win or update exploration ratio (WoUE). The results of the experiments showed that WoUE can acquire a more suitable exploration ratio than MAER, and the obtained ratio was near-optimal.
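For context, an exploration ratio is commonly the probability with which an agent takes a random action instead of its currently best-valued one (epsilon-greedy selection). Reading the abstract's exploration ratio this way is an assumption, and the sketch below does not implement MAER or WoUE; it only shows the parameter those methods would adapt. The function name `epsilon_greedy` and the value list are hypothetical.

```python
import random


def epsilon_greedy(values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        # Explore: choose uniformly among all action indices.
        return rng.randrange(len(values))
    # Exploit: choose the index of the highest estimated value.
    return max(range(len(values)), key=lambda i: values[i])


# With epsilon = 0 the agent never explores and always picks the best value.
greedy_choice = epsilon_greedy([0.1, 0.9, 0.4], epsilon=0.0)
```

A higher `epsilon` makes the agent react faster to a changing environment at the cost of more random actions, which is the trade-off an adaptation method has to balance.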
