Multiagent Learning in Large Anonymous Games

2011 ◽  
Vol 40 ◽  
pp. 571-598 ◽  
Author(s):  
I. A. Kash ◽  
E. J. Friedman ◽  
J. Y. Halpern

In large systems, it is important for agents to learn to act effectively, but sophisticated multi-agent learning algorithms generally do not scale. An alternative approach is to identify restricted classes of games in which simple, efficient algorithms converge. It is shown that stage learning efficiently converges to Nash equilibria in large anonymous games if best-reply dynamics converge. Two features are identified that improve convergence. First, rather than making learning more difficult, having more agents is actually beneficial in many settings. Second, providing agents with statistical information about the behavior of others can significantly reduce the number of observations needed.
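A minimal sketch may make the stage-learning idea concrete. It is our illustration, not the authors' code: in a toy anonymous congestion game, payoffs depend only on an agent's own action and the fraction of agents choosing each action; agents explore within a stage, and at the stage boundary each agent adopts its empirically best reply. All names and constants (N_AGENTS, STAGES, payoff, ...) are invented for illustration.

```python
import random
from collections import Counter

# Toy anonymous congestion game: an agent's payoff depends only on its own
# action and the fraction of agents choosing each action. All constants
# below are illustrative assumptions, not values from the paper.
N_AGENTS, ACTIONS, STAGES, ROUNDS, EPS = 100, [0, 1, 2], 30, 50, 0.05

def payoff(action, fractions):
    # Congestion: an action is worth less the more agents pick it.
    return 1.0 - fractions[action]

choices = [random.choice(ACTIONS) for _ in range(N_AGENTS)]
for stage in range(STAGES):
    stats = [{a: [0.0, 0] for a in ACTIONS} for _ in range(N_AGENTS)]
    for _ in range(ROUNDS):
        # Within a stage, each agent mostly plays its current choice,
        # exploring uniformly with probability EPS.
        played = [c if random.random() > EPS else random.choice(ACTIONS)
                  for c in choices]
        counts = Counter(played)
        fractions = {a: counts[a] / N_AGENTS for a in ACTIONS}
        for i, a in enumerate(played):
            stats[i][a][0] += payoff(a, fractions)
            stats[i][a][1] += 1
    # Stage boundary: each agent switches to its empirically best action,
    # i.e. a best reply to the observed behavior of the crowd.
    choices = [max(ACTIONS,
                   key=lambda a: stats[i][a][0] / max(stats[i][a][1], 1))
               for i in range(N_AGENTS)]

print(Counter(choices))  # roughly even split near the equilibrium
```

With congestion payoffs like these, best-reply dynamics converge, which is exactly the regime in which the convergence result above applies.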

IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Giuseppe Caso ◽  
Ozgu Alay ◽  
Guido Carlo Ferrante ◽  
Luca De Nardis ◽  
Maria-Gabriella Di Benedetto ◽  
...  

2013 ◽  
Vol 47 ◽  
pp. 441-473 ◽  
Author(s):  
L. Cigler ◽  
B. Faltings

To achieve an optimal outcome in many situations, agents need to choose distinct actions from one another. This is notably the case in many resource allocation problems, where a single resource can be used by only one agent at a time. How should the designer of a multi-agent system program its identical agents so that each behaves differently? From a game-theoretic perspective, such situations lead to undesirable Nash equilibria. Consider, for example, a resource allocation game in which two players compete for exclusive access to a single resource. It has three Nash equilibria: the two pure-strategy equilibria are efficient but not fair, while the one mixed-strategy equilibrium is fair but not efficient. Aumann's notion of correlated equilibrium fixes this problem: it assumes a correlation device that suggests an action to each agent. However, such a "smart" coordination device might not be available. We propose using a randomly chosen, "stupid" integer coordination signal; "smart" agents learn which action to take for each value of the coordination signal. We present a multi-agent learning algorithm that converges in a polynomial number of steps to a correlated equilibrium of a channel allocation game, a variant of the resource allocation game. We show that, for each coordination signal value, the agents learn to play a randomly chosen pure-strategy Nash equilibrium of the game. The outcome is therefore an efficient correlated equilibrium, and it becomes fairer as the number of available coordination signal values increases.
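As a concrete toy version of the coordination-signal idea, the sketch below uses a simple win-stay / lose-randomize backoff rule rather than the paper's actual algorithm; the agent count, the number of signal values K, and the backoff probability are all illustrative assumptions.

```python
import random

# Two agents, one shared channel, K values of a public random signal.
# The backoff rule is a simplified win-stay / lose-randomize heuristic,
# not the paper's exact algorithm; all constants are illustrative.
N_AGENTS, K, ROUNDS = 2, 4, 5000
ACCESS, YIELD = 1, 0

# Each agent keeps one action per value of the coordination signal.
policy = [[random.choice([ACCESS, YIELD]) for _ in range(K)]
          for _ in range(N_AGENTS)]

for _ in range(ROUNDS):
    k = random.randrange(K)                 # the "stupid" public signal
    acts = [policy[i][k] for i in range(N_AGENTS)]
    accessors = [i for i, a in enumerate(acts) if a == ACCESS]
    for i in range(N_AGENTS):
        if acts[i] == ACCESS and len(accessors) > 1:
            # Collision: back off with some probability.
            if random.random() < 0.5:
                policy[i][k] = YIELD
        elif acts[i] == YIELD and not accessors:
            # Idle channel: try accessing next time this signal appears.
            policy[i][k] = ACCESS

for k in range(K):                          # one pure NE per signal value
    print(k, [policy[i][k] for i in range(N_AGENTS)])
```

Once this process settles, each signal value indexes a pure-strategy Nash equilibrium (exactly one agent accesses), so the induced joint play is a correlated equilibrium whose fairness grows with K.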


2015 ◽  
Vol 53 ◽  
pp. 659-697 ◽  
Author(s):  
Daan Bloembergen ◽  
Karl Tuyls ◽  
Daniel Hennes ◽  
Michael Kaisers

The interaction of multiple autonomous agents gives rise to highly dynamic and nondeterministic environments, contributing to the complexity of applications such as automated financial markets, smart grids, or robotics. Due to the sheer number of situations that may arise, it is not possible to foresee and program the optimal behaviour for all agents beforehand. Consequently, it becomes essential for the success of the system that the agents can learn their optimal behaviour and adapt to new situations or circumstances. The past two decades have seen the emergence of reinforcement learning, in both single-agent and multi-agent settings, as a strong, robust and adaptive learning paradigm. Progress has been substantial, and a wide range of algorithms are now available. An important challenge in the domain of multi-agent learning is to gain qualitative insights into the resulting system dynamics. In the past decade, tools and methods from evolutionary game theory have been successfully employed to formally study multi-agent learning dynamics in strategic interactions. This article surveys the dynamical models that have been derived for various multi-agent reinforcement learning algorithms, making it possible to study and compare them qualitatively. Furthermore, new learning algorithms that have been introduced using these evolutionary game-theoretic tools are reviewed. The evolutionary models can be used to study complex strategic interactions; examples of such analysis are given for the domains of automated trading in stock markets and collision avoidance in multi-robot systems. The paper provides a roadmap of the progress achieved in analysing the evolutionary dynamics of multi-agent learning, highlighting the main results and accomplishments.
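A classical instance of this link between learning and evolutionary dynamics is that Cross learning converges, in expectation, to the replicator dynamics. The sketch below (ours, for illustration only) numerically integrates the two-population replicator equations for Matching Pennies; the payoff matrices, step size, and horizon are arbitrary choices rather than anything prescribed by the survey.

```python
import numpy as np

# Two-population replicator dynamics for Matching Pennies. Cross learning
# is known to converge, in expectation, to these dynamics; matrices, step
# size, and horizon here are illustrative choices.
A = np.array([[1., -1.], [-1., 1.]])   # row player's payoff matrix
B = -A                                 # zero-sum: column player's payoffs

def replicator_step(x, y, dt=0.01):
    # x, y are the two populations' mixed strategies (probability vectors).
    fx = A @ y                         # fitness of each row action vs. y
    fy = B.T @ x                       # fitness of each column action vs. x
    x = x + dt * x * (fx - x @ fx)     # grow actions doing better than average
    y = y + dt * y * (fy - y @ fy)
    return x / x.sum(), y / y.sum()

x, y = np.array([0.9, 0.1]), np.array([0.2, 0.8])
for _ in range(2000):
    x, y = replicator_step(x, y)
print(x, y)  # trajectories cycle around the mixed equilibrium (0.5, 0.5)
```

Plotting x and y over time reproduces the cyclic orbits around the mixed equilibrium that such dynamical analyses use to explain why naive independent learners fail to converge in zero-sum interactions.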


2021 ◽  
Vol 22 (2) ◽  
pp. 1-38
Author(s):  
Julian Gutierrez ◽  
Paul Harrenstein ◽  
Giuseppe Perelli ◽  
Michael Wooldridge

We define and investigate a novel notion of expressiveness for temporal logics that is based on game-theoretic equilibria of multi-agent systems. We use iterated Boolean games as our abstract model of multi-agent systems [Gutierrez et al. 2013, 2015a]. In such a game, each agent i has a goal γi, represented using (a fragment of) Linear Temporal Logic (LTL). The goal γi captures agent i's preferences, in the sense that the models of γi represent system behaviours that would satisfy i. Each player i controls a subset Φi of the Boolean variables, and at each round in the game, player i is at liberty to choose values for the variables in Φi in any way that she sees fit. Play continues for an infinite sequence of rounds, and so as players act they collectively trace out a model for LTL formulae, which for every player will either satisfy or fail to satisfy her goal. Players are assumed to act strategically, taking into account the goals of other players, in an attempt to bring about computations satisfying their goal. In this setting, we apply the standard game-theoretic concept of (pure) Nash equilibrium. The (possibly empty) set of Nash equilibria of an iterated Boolean game can be understood as inducing a set of computations, each computation representing one way the system could evolve if players chose strategies that together constitute a Nash equilibrium. Such a set of equilibrium computations expresses a temporal property, which may or may not be expressible within a particular fragment of LTL. The new notion of expressiveness that we formally define and investigate is then as follows: which temporal properties are characterised by the Nash equilibria of games in which agent goals are expressed in specific fragments of LTL? We formally define and investigate this notion of expressiveness for a range of LTL fragments. For example, a very natural question is the following: suppose we have an iterated Boolean game in which every goal is represented using a particular fragment L of LTL; is it then always the case that the equilibria of the game can be characterised within L? We show that this is not true in general.
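The following toy sketch, entirely our own and heavily simplified, may make the setup concrete: two players each control one Boolean variable, strategies are restricted to be memoryless so that every play is eventually periodic, and "infinitely often" goals can be checked on the periodic suffix. The variable names and goals are invented for illustration.

```python
# Toy iterated Boolean game: player 0 controls variable p, player 1
# controls q. Strategies are memoryless (the next value depends only on
# the current valuation), so every play is eventually periodic and
# "infinitely often" goals can be checked on the periodic suffix.

def play(strat0, strat1, init=(False, False), rounds=32):
    state, trace = init, []
    for _ in range(rounds):
        trace.append(state)
        state = (strat0(state), strat1(state))  # each sets her own variable
    return trace

# With 4 possible valuations, the loop has period at most 4, so the last
# 8 states are guaranteed to contain at least one full cycle.
def goal0(trace): return any(p == q for p, q in trace[-8:])  # ~ GF(p == q)
def goal1(trace): return any(q for _, q in trace[-8:])       # ~ GF q

s0 = lambda st: st[1]        # player 0 copies q
s1 = lambda st: not st[1]    # player 1 flips q every round
trace = play(s0, s1)
print(goal0(trace), goal1(trace))  # False True: only player 1 is satisfied
```

In the full setting, strategies may depend on the entire history and goals are arbitrary LTL formulae; the question above is which temporal properties the equilibrium computations of such games can express.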


2021 ◽  
Vol 11 (11) ◽  
pp. 4948
Author(s):  
Lorenzo Canese ◽  
Gian Carlo Cardarilli ◽  
Luca Di Nunzio ◽  
Rocco Fazzolari ◽  
Daniele Giardino ◽  
...  

In this review, we present an analysis of the most widely used multi-agent reinforcement learning algorithms. Starting from single-agent reinforcement learning, we focus on the most critical issues that must be taken into account when extending these algorithms to multi-agent scenarios. The analyzed algorithms are grouped according to their features. We present a detailed taxonomy of the main multi-agent approaches proposed in the literature, focusing on their underlying mathematical models. For each algorithm, we describe the possible application fields and point out its pros and cons. The described multi-agent algorithms are compared in terms of the most important characteristics for multi-agent reinforcement learning applications, namely nonstationarity, scalability, and observability. We also describe the most common benchmark environments used to evaluate the performance of the considered methods.
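As an illustration of the simplest single-agent-to-multi-agent extension such a taxonomy covers, here is a sketch of independent Q-learning on a two-agent coordination game: each agent runs ordinary Q-learning and treats the other agent as part of the environment, which is precisely the source of the nonstationarity discussed above. The payoff table and hyperparameters are invented for illustration.

```python
import random

# Independent Q-learning on a stateless two-agent coordination game.
# Payoffs and hyperparameters are invented for illustration.
ACTIONS, EPISODES, ALPHA, EPS = [0, 1], 5000, 0.1, 0.1
PAYOFF = {(0, 0): 1.0, (1, 1): 1.0, (0, 1): 0.0, (1, 0): 0.0}

Q = [[0.0, 0.0], [0.0, 0.0]]            # Q[agent][action]

def act(i):
    if random.random() < EPS:           # epsilon-greedy exploration
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[i][a])

for _ in range(EPISODES):
    a0, a1 = act(0), act(1)
    r = PAYOFF[(a0, a1)]                # shared team reward
    Q[0][a0] += ALPHA * (r - Q[0][a0])  # each agent updates independently,
    Q[1][a1] += ALPHA * (r - Q[1][a1])  # treating the other as environment

print(Q)  # both agents typically lock onto the same coordinated action
```

From each agent's perspective the reward for a fixed action drifts as the other agent learns, which is the nonstationarity that the surveyed multi-agent algorithms are designed to handle.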


2021 ◽  
Vol 54 (5) ◽  
pp. 1-35
Author(s):  
Shubham Pateria ◽  
Budhitama Subagdja ◽  
Ah-hwee Tan ◽  
Chai Quek

Hierarchical Reinforcement Learning (HRL) enables the autonomous decomposition of challenging long-horizon decision-making tasks into simpler subtasks. Over the past years, the landscape of HRL research has grown profoundly, resulting in a copious body of approaches. A comprehensive overview of this vast landscape is necessary to study HRL in an organized manner. We provide a survey of the diverse HRL approaches concerning the challenges of learning hierarchical policies, subtask discovery, transfer learning, and multi-agent learning using HRL. The survey is presented according to a novel taxonomy of the approaches. Based on the survey, a set of important open problems is proposed to motivate future research in HRL. Furthermore, we outline a few suitable task domains for evaluating HRL approaches and a few interesting examples of practical applications of HRL in the Supplementary Material.
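To make the notion of a hierarchical policy concrete, here is a toy options-style sketch of our own: a high-level policy selects subgoals, and a low-level policy executes primitive actions until the chosen subtask terminates. The corridor environment, subgoals, and the greedy top-level rule are all invented for illustration.

```python
# Toy options-style hierarchy on a 1-D corridor: a high-level policy
# picks a subgoal (an option), and a low-level policy issues primitive
# steps until that subtask terminates.
SUBGOALS = [5, 10]

def low_level(pos, subgoal):
    # Primitive policy for one option: step toward the subgoal.
    while pos != subgoal:
        pos += 1 if pos < subgoal else -1
        yield pos

def high_level(pos, goal=10):
    # Naive top layer: repeatedly pick the nearest subgoal that is
    # strictly closer to the task goal (a learned policy would choose
    # options by value instead), then run the option to termination.
    while pos != goal:
        option = min((k for k in range(len(SUBGOALS))
                      if abs(SUBGOALS[k] - goal) < abs(pos - goal)),
                     key=lambda k: abs(SUBGOALS[k] - pos))
        for pos in low_level(pos, SUBGOALS[option]):
            pass
    return pos

print(high_level(0))  # reaches 10 via the intermediate subgoal at 5
```

The surveyed approaches replace both the hand-written subgoal list (subtask discovery) and the greedy top-level rule (learning hierarchical policies) with learned components.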

