Decentralized No-regret Learning Algorithms for Extensive-form Correlated Equilibria (Extended Abstract)

Author(s):  
Andrea Celli ◽  
Alberto Marchesi ◽  
Gabriele Farina ◽  
Nicola Gatti

The existence of uncoupled no-regret learning dynamics converging to correlated equilibria in normal-form games is a celebrated result in the theory of multi-agent systems. Specifically, it has been known for more than 20 years that when all players seek to minimize their internal regret in a repeated normal-form game, the empirical frequency of play converges to a normal-form correlated equilibrium. Extensive-form games generalize normal-form games by modeling both sequential and simultaneous moves, as well as imperfect information. Because of the sequential nature and the presence of private information, correlation in extensive-form games possesses significantly different properties than in normal-form games. The extensive-form correlated equilibrium (EFCE) is the natural extensive-form counterpart to the classical notion of correlated equilibrium in normal-form games. Compared to the latter, the constraints that define the set of EFCEs are significantly more complex, as the correlation device (a.k.a. mediator) must take into account the evolution of beliefs of each player as they make observations throughout the game. Due to this additional complexity, the existence of uncoupled learning dynamics leading to an EFCE has remained a challenging open research question for a long time. In this article, we settle that question by giving the first uncoupled no-regret dynamics which provably converge to the set of EFCEs in n-player general-sum extensive-form games with perfect recall. We show that each iterate can be computed in time polynomial in the size of the game tree, and that, when all players play repeatedly according to our learning dynamics, the empirical frequency of play after T game repetitions is guaranteed to be an O(T^{-1/2})-approximate EFCE with high probability, and an EFCE almost surely in the limit.
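To make the normal-form notion mentioned in this abstract concrete, the following minimal sketch computes one player's average internal regret over a recorded sequence of play. It is an illustration under stated assumptions only: the function name `max_internal_regret`, the two-action game of Chicken, and its payoff numbers are hypothetical and do not come from the paper, and the code does not implement the paper's EFCE dynamics.

```python
from itertools import product

# Hypothetical row-player payoffs for a game of Chicken (illustrative numbers).
# Actions: 0 = Dare, 1 = Chicken. CHICKEN_ROW[own][opponent] is the payoff.
CHICKEN_ROW = [
    [0, 7],
    [2, 6],
]


def max_internal_regret(payoff, history):
    """Largest average internal regret of one player.

    payoff[a][b]: the player's payoff for playing a against opponent action b.
    history: list of (own_action, opponent_action) pairs, one per round.
    """
    num_actions = len(payoff)
    rounds = len(history)
    worst = 0.0
    for a, b in product(range(num_actions), repeat=2):
        if a == b:
            continue
        # Total gain from having played b on every round where a was played.
        gain = sum(payoff[b][opp] - payoff[a][opp]
                   for own, opp in history if own == a)
        worst = max(worst, gain / rounds)
    return worst


# Play spread evenly over (Dare, Chicken), (Chicken, Dare), (Chicken, Chicken).
balanced_history = [(0, 1), (1, 0), (1, 1)]
# Both players always dare.
all_dare_history = [(0, 0)] * 4
```

When this quantity is at most ε for every player, the empirical frequency of play is an ε-approximate correlated equilibrium of the normal-form game. In the sketch, `balanced_history` leaves the row player with no positive internal regret, while `all_dare_history` does not.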

2020 ◽  
Vol 34 (02) ◽  
pp. 1934-1941
Author(s):  
Gabriele Farina ◽  
Tommaso Bianchi ◽  
Tuomas Sandholm

Coarse correlation models strategic interactions of rational agents complemented by a correlation device, that is, a mediator that can recommend behavior but not enforce it. Despite being a classical concept in the theory of normal-form games since 1978, not much is known about the merits of coarse correlation in extensive-form settings. In this paper, we consider two instantiations of the idea of coarse correlation in extensive-form games: normal-form coarse-correlated equilibrium (NFCCE), already defined in the literature, and extensive-form coarse-correlated equilibrium (EFCCE), a new solution concept that we introduce. We show that EFCCEs are a subset of NFCCEs and a superset of the related extensive-form correlated equilibria. We also show that, in n-player extensive-form games, social-welfare-maximizing EFCCEs and NFCCEs are bilinear saddle points, and give new efficient algorithms for the special case of two-player games with no chance moves. Experimentally, our proposed algorithm for NFCCE is two to four orders of magnitude faster than the prior state of the art.


Author(s):  
Trevor Davis ◽  
Kevin Waugh ◽  
Michael Bowling

Extensive-form games are a common model for multiagent interactions with imperfect information. In two-player zero-sum games, the typical solution concept is a Nash equilibrium over the unconstrained strategy set for each player. In many situations, however, we would like to constrain the set of possible strategies. For example, constraints are a natural way to model limited resources, risk mitigation, safety, consistency with past observations of behavior, or other secondary objectives for an agent. In small games, optimal strategies under linear constraints can be found by solving a linear program; however, state-of-the-art algorithms for solving large games cannot handle general constraints. In this work we introduce a generalized form of Counterfactual Regret Minimization that provably finds optimal strategies under any feasible set of convex constraints. We demonstrate the effectiveness of our algorithm for finding strategies that mitigate risk in security games, and for opponent modeling in poker games when given only partial observations of private information.
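Counterfactual Regret Minimization applies a regret-matching update at every information set. As a rough illustration of that building block, the sketch below runs plain, unconstrained regret matching in self-play on a small zero-sum matrix game. It is not the constrained algorithm of the paper: the function names, the 2x2 payoff matrix, and the iteration count are all assumptions chosen for the example.

```python
def regret_matching_strategy(regrets):
    """Mix over actions in proportion to their positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def solve_zero_sum(payoff, iterations=10000):
    """Regret-matching self-play; returns both players' average strategies.

    payoff[i][j] is the row player's payoff; the column player receives its negative.
    """
    n, m = len(payoff), len(payoff[0])
    reg_row, reg_col = [0.0] * n, [0.0] * m
    sum_row, sum_col = [0.0] * n, [0.0] * m
    for _ in range(iterations):
        x = regret_matching_strategy(reg_row)
        y = regret_matching_strategy(reg_col)
        # Expected payoff of each pure action against the opponent's current mix.
        u_row = [sum(payoff[i][j] * y[j] for j in range(m)) for i in range(n)]
        u_col = [-sum(payoff[i][j] * x[i] for i in range(n)) for j in range(m)]
        v_row = sum(x[i] * u_row[i] for i in range(n))
        v_col = sum(y[j] * u_col[j] for j in range(m))
        for i in range(n):
            reg_row[i] += u_row[i] - v_row
            sum_row[i] += x[i]
        for j in range(m):
            reg_col[j] += u_col[j] - v_col
            sum_col[j] += y[j]
    return ([s / iterations for s in sum_row],
            [s / iterations for s in sum_col])


def exploitability(payoff, x, y):
    """Sum of both players' best-response gains; zero at a Nash equilibrium."""
    n, m = len(payoff), len(payoff[0])
    best_row = max(sum(payoff[i][j] * y[j] for j in range(m)) for i in range(n))
    best_col = min(sum(payoff[i][j] * x[i] for i in range(n)) for j in range(m))
    return best_row - best_col


# Hypothetical 2x2 zero-sum game with a fully mixed equilibrium.
GAME = [[2, -1], [-1, 1]]
row_avg, col_avg = solve_zero_sum(GAME)
gap = exploitability(GAME, row_avg, col_avg)
```

The average strategies, not the last iterates, are the ones that approach equilibrium, which is why the sketch accumulates `sum_row` and `sum_col`. Handling convex constraints as described in the abstract would require changes that this sketch does not attempt.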


2019 ◽  
Vol 67 (3-4) ◽  
pp. 185-195
Author(s):  
Kazuhiro Ohnishi

Which choice will a player make when two options give him equal payoffs but give his rival unequal payoffs, that is, one option yields a large payoff for his rival and the other a small one? Motivated by this problem, this paper introduces non-altruistic equilibria for normal-form games and extensive-form non-altruistic equilibria for extensive-form games as equilibrium concepts for non-cooperative games. It then examines how these concepts relate to Nash and subgame perfect equilibria, two important and frequently encountered equilibrium concepts.


2014 ◽  
Vol 51 ◽  
pp. 829-866
Author(s):  
B. Bosansky ◽  
C. Kiekintveld ◽  
V. Lisy ◽  
M. Pechoucek

Developing scalable solution algorithms is one of the central problems in computational game theory. We present an iterative algorithm for computing an exact Nash equilibrium for two-player zero-sum extensive-form games with imperfect information. Our approach combines two key elements: (1) the compact sequence-form representation of extensive-form games and (2) the algorithmic framework of double-oracle methods. The main idea of our algorithm is to restrict the game by allowing the players to play only selected sequences of available actions. After solving the restricted game, new sequences are added by finding best responses to the current solution using fast algorithms. We experimentally evaluate our algorithm on a set of games inspired by patrolling scenarios, board, and card games. The results show significant runtime improvements in games admitting an equilibrium with small support, and substantial improvement in memory use even on games with large support. The improvement in memory use is particularly important because it allows our algorithm to solve much larger game instances than existing linear programming methods. Our main contributions include (1) a generic sequence-form double-oracle algorithm for solving zero-sum extensive-form games; (2) fast methods for maintaining a valid restricted game model when adding new sequences; (3) a search algorithm and pruning methods for computing best-response sequences; (4) theoretical guarantees about the convergence of the algorithm to a Nash equilibrium; (5) experimental analysis of our algorithm on several games, including an approximate version of the algorithm.
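The restrict, solve, and expand loop described in this abstract can be sketched on a plain matrix game. This is a simplified illustration, not the authors' sequence-form algorithm: it works with pure strategies instead of sequences, it replaces the exact linear-programming solve of the restricted game with approximate regret-matching self-play, and the function names and the 4x4 payoff matrix are made up for the example.

```python
def solve_restricted(sub, iterations=5000):
    """Approximate equilibrium of a small matrix game via regret-matching self-play."""
    n, m = len(sub), len(sub[0])
    reg = [[0.0] * n, [0.0] * m]
    avg = [[0.0] * n, [0.0] * m]

    def mix(regrets):
        pos = [max(v, 0.0) for v in regrets]
        s = sum(pos)
        return [p / s for p in pos] if s > 0 else [1.0 / len(pos)] * len(pos)

    for _ in range(iterations):
        x, y = mix(reg[0]), mix(reg[1])
        u_row = [sum(sub[i][j] * y[j] for j in range(m)) for i in range(n)]
        u_col = [-sum(sub[i][j] * x[i] for i in range(n)) for j in range(m)]
        v = sum(x[i] * u_row[i] for i in range(n))  # row's payoff; column's is -v
        for i in range(n):
            reg[0][i] += u_row[i] - v
            avg[0][i] += x[i] / iterations
        for j in range(m):
            reg[1][j] += u_col[j] + v
            avg[1][j] += y[j] / iterations
    return avg[0], avg[1]


def double_oracle(payoff):
    """Grow a restricted game by adding best responses until none are new."""
    n, m = len(payoff), len(payoff[0])
    rows, cols = [0], [0]  # arbitrary initial pure strategy for each player
    while True:
        sub = [[payoff[i][j] for j in cols] for i in rows]
        x_sub, y_sub = solve_restricted(sub)
        # Value of every full-game pure strategy against the restricted solution.
        row_vals = [sum(payoff[i][j] * q for j, q in zip(cols, y_sub))
                    for i in range(n)]
        col_vals = [sum(payoff[i][j] * p for i, p in zip(rows, x_sub))
                    for j in range(m)]
        br_row = max(range(n), key=lambda i: row_vals[i])
        br_col = min(range(m), key=lambda j: col_vals[j])
        if br_row in rows and br_col in cols:
            # No new best response: report the remaining exploitability gap.
            gap = row_vals[br_row] - col_vals[br_col]
            return dict(zip(rows, x_sub)), dict(zip(cols, y_sub)), gap
        if br_row not in rows:
            rows.append(br_row)
        if br_col not in cols:
            cols.append(br_col)


# Hypothetical zero-sum game (row player's payoffs) with a small-support equilibrium.
GAME = [
    [0, -1, 1, -2],
    [1, 0, -1, -2],
    [-1, 1, 0, -2],
    [-3, -3, -3, -3],
]
row_mix, col_mix, gap = double_oracle(GAME)
```

On this example the loop stops after adding only two of the four pure strategies for each player, which mirrors the abstract's observation that games with small-support equilibria benefit most. Because the restricted solver is approximate, the returned `gap` is small but not exactly zero; an exact solver would be needed to reproduce the exact-equilibrium guarantee stated in the abstract.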


2014 ◽  
Vol 14 (5&6) ◽  
pp. 493-516
Author(s):  
Alan Deckelbaum

We ask whether players of a classical game can partition a pure quantum state to implement classical correlated equilibrium distributions. The main contribution of this work is an impossibility result: we provide an example of a classical correlated equilibrium that cannot be securely implemented without useful information leaking outside the system. We study the model where players of a classical complete information game initially share an entangled pure quantum state. Players may perform arbitrary local operations on their subsystems, but no direct communication (either quantum or classical) is allowed. We explain why, for the purpose of implementing classical correlated equilibria, it is desirable to restrict the initial state to be pure and to restrict communication. In this framework, we define the concept of pure quantum correlated equilibrium (PQCE) and show that in a normal form game, any outcome distribution implementable by a PQCE can also be implemented by a classical correlated equilibrium (CE), but that the converse is false. We extend our analysis to extensive form games, and compare the power of PQCE to extensive form classical correlated equilibria (EFCE) and immediate-revelation extensive form correlated equilibria (IR-EFCE).


2019 ◽  
Vol 20 (1) ◽  
pp. 52-66
Author(s):  
Dieter Balkenborg ◽  
Christoph Kuzmics ◽  
Josef Hofbauer

Fixed points of the (most) refined best reply correspondence, introduced in Balkenborg et al. (2013), in the agent normal form of extensive form games with perfect recall have a remarkable property. They induce fixed points of the same correspondence in the agent normal form of every subgame. Furthermore, in a well-defined sense, fixed points of this correspondence refine even trembling hand perfect equilibria, while, on the other hand, reasonable equilibria that are not weak perfect Bayesian equilibria are fixed points of this correspondence.

