Bounds and dynamics for empirical game theoretic analysis

Karl Tuyls; Julien Perolat; Marc Lanctot; Edward Hughes; Richard Everett; Joel Z. Leibo; Csaba Szepesvári; Thore Graepel

doi:10.1007/s10458-019-09432-y

Autonomous Agents and Multi-Agent Systems ◽ 10.1007/s10458-019-09432-y ◽ 2019 ◽ Vol 34 (1) ◽ Cited By ~ 2

Author(s): Karl Tuyls ◽ Julien Perolat ◽ Marc Lanctot ◽ Edward Hughes ◽ Richard Everett ◽ ...

Keyword(s): Nash Equilibrium ◽ Evolutionary Dynamics ◽ Learning Algorithm ◽ Agent Interactions ◽ Approximate Nash Equilibrium ◽ Blotto Game ◽ Multi Agent ◽ Payoff Structure ◽ Game Theoretic ◽ Colonel Blotto

Abstract: This paper provides several theoretical results for empirical game theory. Specifically, we introduce bounds for empirical game-theoretic analysis of complex multi-agent interactions. In doing so we provide insights into the empirical meta-game, showing that a Nash equilibrium of the estimated meta-game is an approximate Nash equilibrium of the true underlying meta-game. We investigate and show how many data samples are required to obtain a close enough approximation of the underlying game. Additionally, we extend the evolutionary dynamics analysis of meta-games using heuristic payoff tables (HPTs) to asymmetric games. The state of the art has only considered evolutionary dynamics of symmetric HPTs, in which agents have access to the same strategy sets and the payoff structure is symmetric, implying that agents are interchangeable. Finally, we carry out an empirical illustration of the generalised method in several domains, illustrating the theory and evolutionary dynamics of several versions of the AlphaGo algorithm (symmetric), the dynamics of the Colonel Blotto game played by human players on Facebook (symmetric), the dynamics of several teams of players in the capture-the-flag game (symmetric), and an example of a meta-game in Leduc Poker (asymmetric), generated by the policy-space response oracle multi-agent learning algorithm.
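The claim that an equilibrium of the estimated game is an approximate equilibrium of the true game can be illustrated on a toy example. The sketch below is not taken from the paper: the matching-pennies payoffs, the noise matrix, and the helper names `nash_gap` and `interior_equilibrium_2x2` are all assumptions made for illustration. It relies on the standard argument that if every payoff entry is estimated to within eps, an exact equilibrium of the estimate is at most a 2*eps-Nash equilibrium of the true game; the paper's own bounds and sample-complexity results are not reproduced here.

```python
import numpy as np

def nash_gap(A, B, x, y):
    """Largest gain either player can get by deviating unilaterally from (x, y).

    A[i, j] and B[i, j] are the row and column player's payoffs; x and y are
    mixed strategies. A gap of 0 means an exact Nash equilibrium, and a gap
    of at most eps means an eps-Nash equilibrium.
    """
    A, B, x, y = (np.asarray(v, dtype=float) for v in (A, B, x, y))
    row_gain = np.max(A @ y) - x @ A @ y
    col_gain = np.max(x @ B) - x @ B @ y
    return max(row_gain, col_gain)

def interior_equilibrium_2x2(A):
    """Mixed equilibrium of a 2x2 zero-sum game with row payoffs A.

    Assumes the equilibrium is fully mixed (the game has no saddle point),
    so each player mixes to make the opponent indifferent.
    """
    (a, b), (c, d) = A
    denom = a - b - c + d
    x1 = (d - c) / denom  # row mix that makes the column player indifferent
    y1 = (d - b) / denom  # column mix that makes the row player indifferent
    return np.array([x1, 1 - x1]), np.array([y1, 1 - y1])

# Hypothetical "true" meta-game: matching pennies (zero-sum).
A_true = np.array([[1.0, -1.0], [-1.0, 1.0]])

# Hypothetical estimate: every entry is off by at most eps.
eps = 0.1
noise = np.array([[0.1, -0.1], [0.0, 0.1]])
A_hat = A_true + noise

# Solve the estimated game, then measure the gap in both games.
x_hat, y_hat = interior_equilibrium_2x2(A_hat)
gap_hat = nash_gap(A_hat, -A_hat, x_hat, y_hat)
gap_true = nash_gap(A_true, -A_true, x_hat, y_hat)

print(f"gap in estimated game: {gap_hat:.6f}")
print(f"gap in true game:      {gap_true:.6f} (bound 2*eps = {2 * eps})")
```

The profile is an exact equilibrium of the estimated game, while its gap in the true game is small but nonzero and stays within 2*eps.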

Expressiveness and Nash Equilibrium in Iterated Boolean Games

ACM Transactions on Computational Logic ◽ 10.1145/3439900 ◽ 2021 ◽ Vol 22 (2) ◽ pp. 1-38

Author(s): Julian Gutierrez ◽ Paul Harrenstein ◽ Giuseppe Perelli ◽ Michael Wooldridge

Keyword(s): Nash Equilibrium ◽ Nash Equilibria ◽ Infinite Sequence ◽ Multi Agent Systems ◽ Temporal Logics ◽ Agent Systems ◽ Temporal Properties ◽ Multi Agent ◽ Game Theoretic ◽ Boolean Games

We define and investigate a novel notion of expressiveness for temporal logics that is based on game-theoretic equilibria of multi-agent systems. We use iterated Boolean games as our abstract model of multi-agent systems [Gutierrez et al. 2013, 2015a]. In such a game, each agent i has a goal γ_i, represented using (a fragment of) Linear Temporal Logic (LTL). The goal γ_i captures agent i's preferences, in the sense that the models of γ_i represent system behaviours that would satisfy i. Each player i controls a subset of Boolean variables Φ_i, and at each round in the game, player i is at liberty to choose values for the variables Φ_i in any way that she sees fit. Play continues for an infinite sequence of rounds, and so as players act they collectively trace out a model for LTL, which for every player will either satisfy or fail to satisfy their goal. Players are assumed to act strategically, taking into account the goals of other players, in an attempt to bring about computations satisfying their goal. In this setting, we apply the standard game-theoretic concept of (pure) Nash equilibria. The (possibly empty) set of Nash equilibria of an iterated Boolean game can be understood as inducing a set of computations, each computation representing one way the system could evolve if players chose strategies that together constitute a Nash equilibrium. Such a set of equilibrium computations expresses a temporal property, which may or may not be expressible within a particular LTL fragment. The new notion of expressiveness that we formally define and investigate is then as follows: What temporal properties are characterised by the Nash equilibria of games in which agent goals are expressed in specific fragments of LTL? We formally define and investigate this notion of expressiveness for a range of LTL fragments. For example, a very natural question is the following: Suppose we have an iterated Boolean game in which every goal is represented using a particular fragment L of LTL: is it then always the case that the equilibria of the game can be characterised within L? We show that this is not true in general.

A Note on a Comparison of Simultaneous and Sequential Colonel Blotto Games

Peace Economics, Peace Science and Public Policy ◽ 10.1515/peps-2012-0007 ◽ 2012 ◽ Vol 18 (3) ◽ Cited By ~ 1

Author(s): Yumiko Baba

Keyword(s): Nash Equilibrium ◽ Theoretical Prediction ◽ Cyber Terrorism ◽ Attack Pattern ◽ Period 2 ◽ Blotto Game ◽ Irrational Behavior ◽ Theoretical Predictions ◽ Colonel Blotto ◽ Colonel Blotto Games

Abstract: Clark and Konrad (2007) introduce the weakest-link-against-best-shot property to the Colonel Blotto game, where the defender has to win all the battlefields while the attacker only needs to win at least one battlefield. They characterize the Nash equilibrium assuming that the attacker attacks all the battlefields simultaneously. We construct a two-stage model and endogenize the attacker's attack pattern. We show that the attacker chooses a sequential attack pattern in the subgame perfect Nash equilibrium. Therefore, the game analyzed by Clark and Konrad (2007) is never realized. We also conducted experiments and found that the subjects' behavior was inconsistent with the theoretical predictions. Both players overinvested and the variances were large. In the simultaneous game, the attackers took a guerrilla strategy 30% of the time, investing in only one battlefield, and the defenders took a surrender strategy 11% of the time, investing nothing. Both players invested more in period 1 than in period 2 in the sequential game. Although all of these findings are inconsistent with the theoretical predictions, the winning probability was consistent with the theoretical prediction in the simultaneous games, but it was lower than the theoretical prediction in the sequential games. We conclude that a subject's irrational behavior is mainly a rational response to the opponent's irrational behavior. Our model can explain terrorism, cyber terrorism, lobbying, and patent trolls, and the huge gap between the theory and the experiments is important considering the significance of these problems.

An Enhanced Model-Free Reinforcement Learning Algorithm to Solve Nash Equilibrium for Multi-Agent Cooperative Game Systems

IEEE Access ◽ 10.1109/access.2020.3043806 ◽ 2020 ◽ Vol 8 ◽ pp. 223743-223755

Author(s): Yuannan Jiang ◽ Fuxiao Tan

Keyword(s): Reinforcement Learning ◽ Nash Equilibrium ◽ Cooperative Game ◽ Learning Algorithm ◽ Model Free ◽ Multi Agent ◽ Reinforcement Learning Algorithm

Algorithm for Computing Approximate Nash Equilibrium in Continuous Games with Application to Continuous Blotto

Games ◽ 10.3390/g12020047 ◽ 2021 ◽ Vol 12 (2) ◽ pp. 47

Author(s): Sam Ganzfried

Keyword(s): Nash Equilibrium ◽ Imperfect Information ◽ Pure Strategy ◽ Strategy Space ◽ Equilibrium Strategies ◽ Approximate Nash Equilibrium ◽ Blotto Game ◽ Nash Equilibrium Strategies ◽ Zero Sum ◽ Action Spaces

Successful algorithms have been developed for computing Nash equilibrium in a variety of finite game classes. However, solving continuous games—in which the pure strategy space is (potentially uncountably) infinite—is far more challenging. Nonetheless, many real-world domains have continuous action spaces, e.g., where actions refer to an amount of time, money, or other resource that is naturally modeled as being real-valued as opposed to integral. We present a new algorithm for approximating Nash equilibrium strategies in continuous games. In addition to two-player zero-sum games, our algorithm also applies to multiplayer games and games with imperfect information. We experiment with our algorithm on a continuous imperfect-information Blotto game, in which two players distribute resources over multiple battlefields. Blotto games have frequently been used to model national security scenarios and have also been applied to electoral competition and auction theory. Experiments show that our algorithm is able to quickly compute close approximations of Nash equilibrium strategies for this game.

Evolutionary Dynamics of Resource Allocation in the Colonel Blotto Game

Journal of Statistical Physics ◽ 10.1007/s10955-012-0659-7 ◽ 2012 ◽ Vol 151 (3-4) ◽ pp. 623-636 ◽ Cited By ~ 3

Author(s): Damián G. Hernández ◽ Damián H. Zanette

Keyword(s): Resource Allocation ◽ Evolutionary Dynamics ◽ Blotto Game ◽ Colonel Blotto ◽ Colonel Blotto Game

Bi-Level Actor-Critic for Multi-Agent Coordination

Proceedings of the AAAI Conference on Artificial Intelligence ◽ 10.1609/aaai.v34i05.6226 ◽ 2020 ◽ Vol 34 (05) ◽ pp. 7325-7332

Author(s): Haifeng Zhang ◽ Weizhe Chen ◽ Zeren Huang ◽ Minne Li ◽ Yaodong Yang ◽ ...

Keyword(s): Reinforcement Learning ◽ Nash Equilibrium ◽ Learning Algorithm ◽ Stackelberg Equilibrium ◽ Multi Agent Systems ◽ Matrix Games ◽ Markov Games ◽ The Arts ◽ Convergence Point ◽ Multi Agent

Coordination is one of the essential problems in multi-agent systems. Typically, multi-agent reinforcement learning (MARL) methods treat agents equally, and the goal is to solve the Markov game to an arbitrary Nash equilibrium (NE) when multiple equilibria exist, thus lacking a solution for NE selection. In this paper, we treat agents unequally and consider Stackelberg equilibrium as a potentially better convergence point than Nash equilibrium in terms of Pareto superiority, especially in cooperative environments. Under Markov games, we formally define the bi-level reinforcement learning problem of finding a Stackelberg equilibrium. We propose a novel bi-level actor-critic learning method that allows agents to have different knowledge bases (thus intelligent), while their actions can still be executed simultaneously and in a distributed manner. A convergence proof is given, and the resulting learning algorithm is tested against the state of the art. We found that the proposed bi-level actor-critic algorithm successfully converged to the Stackelberg equilibria in matrix games and found an asymmetric solution in a highway merge environment.
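To make the equilibrium-selection point concrete, the sketch below contrasts pure Nash equilibria with a pure-strategy Stackelberg outcome in a small coordination matrix game. It is only an illustration of the two solution concepts, not the paper's bi-level actor-critic method; the payoff matrices and the function names `pure_nash` and `pure_stackelberg` are assumptions introduced here, and the follower is assumed to break ties by taking the lowest-indexed best response.

```python
import numpy as np

def pure_nash(A, B):
    """All pure-strategy Nash equilibria (i, j) of the bimatrix game (A, B)."""
    eqs = []
    for i in range(A.shape[0]):
        for j in range(A.shape[1]):
            row_is_best = A[i, j] >= A[:, j].max()  # row player cannot improve
            col_is_best = B[i, j] >= B[i, :].max()  # column player cannot improve
            if row_is_best and col_is_best:
                eqs.append((i, j))
    return eqs

def pure_stackelberg(A, B):
    """Pure Stackelberg outcome: the row player leads, the column player follows.

    The leader commits to a row, the follower best-responds to it, and the
    leader picks the row whose induced outcome maximises the leader's payoff.
    """
    best = None
    for i in range(A.shape[0]):
        j = int(np.argmax(B[i]))  # follower's best response (first index on ties)
        if best is None or A[i, j] > A[best]:
            best = (i, j)
    return best

# Hypothetical cooperative coordination game: both players receive the same
# payoff, and coordinating on action 0 is better than coordinating on action 1.
A = np.array([[10.0, 0.0], [0.0, 5.0]])
B = A.copy()

print("pure Nash equilibria:", pure_nash(A, B))
print("Stackelberg outcome: ", pure_stackelberg(A, B))
```

In this toy game there are two pure Nash equilibria, and letting one agent lead selects the Pareto-superior one, which is the kind of selection effect the abstract describes.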
