Equilibrium in behavior strategies in infinite extensive form games with imperfect information

1992 ◽  
Vol 2 (4) ◽  
pp. 481-494
Author(s):  
Subir K. Chakrabarti
Author(s):  
Trevor Davis ◽  
Kevin Waugh ◽  
Michael Bowling

Extensive-form games are a common model for multiagent interactions with imperfect information. In two-player zero-sum games, the typical solution concept is a Nash equilibrium over the unconstrained strategy set for each player. In many situations, however, we would like to constrain the set of possible strategies. For example, constraints are a natural way to model limited resources, risk mitigation, safety, consistency with past observations of behavior, or other secondary objectives for an agent. In small games, optimal strategies under linear constraints can be found by solving a linear program; however, state-of-the-art algorithms for solving large games cannot handle general constraints. In this work, we introduce a generalized form of Counterfactual Regret Minimization that provably finds optimal strategies under any feasible set of convex constraints. We demonstrate the effectiveness of our algorithm for finding strategies that mitigate risk in security games, and for opponent modeling in poker games when given only partial observations of private information.
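For orientation, Counterfactual Regret Minimization applies a regret-matching update at every information set. The sketch below shows only that unconstrained building block, run in self-play on a small zero-sum matrix game; it is not the constrained algorithm of the abstract above. The function names and the 2x2 payoff matrix are hypothetical, chosen for illustration.

```python
def regret_matching_strategy(regrets):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total <= 0.0:
        # No action has positive regret yet: fall back to the uniform strategy.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def self_play(payoff, iterations):
    """Run regret matching for both players of a zero-sum matrix game.

    payoff[i][j] is the row player's payoff; the column player receives
    its negative. Returns the average strategies of both players.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_regret, col_regret = [0.0] * n_rows, [0.0] * n_cols
    row_sum, col_sum = [0.0] * n_rows, [0.0] * n_cols
    for _ in range(iterations):
        x = regret_matching_strategy(row_regret)
        y = regret_matching_strategy(col_regret)
        # Expected payoff of each pure action against the opponent's current mix.
        row_values = [sum(payoff[i][j] * y[j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [-sum(payoff[i][j] * x[i] for i in range(n_rows))
                      for j in range(n_cols)]
        row_expected = sum(p * v for p, v in zip(x, row_values))
        col_expected = sum(p * v for p, v in zip(y, col_values))
        for i in range(n_rows):
            row_regret[i] += row_values[i] - row_expected
            row_sum[i] += x[i]
        for j in range(n_cols):
            col_regret[j] += col_values[j] - col_expected
            col_sum[j] += y[j]
    return ([s / iterations for s in row_sum],
            [s / iterations for s in col_sum])


def nash_gap(payoff, x, y):
    """Sum of both players' best-response gains; zero exactly at equilibrium."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    best_row = max(sum(payoff[i][j] * y[j] for j in range(n_cols))
                   for i in range(n_rows))
    best_col = min(sum(payoff[i][j] * x[i] for i in range(n_rows))
                   for j in range(n_cols))
    return best_row - best_col


# Hypothetical example game: an asymmetric matching-pennies variant.
PAYOFF = [[2.0, -1.0], [-1.0, 1.0]]
avg_row, avg_col = self_play(PAYOFF, 20000)
```

The average strategies, not the last iterates, are what approach an equilibrium; `nash_gap` gives a simple way to check how close they are.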


2014 ◽  
Vol 51 ◽  
pp. 829-866 ◽  
Author(s):  
B. Bosansky ◽  
C. Kiekintveld ◽  
V. Lisy ◽  
M. Pechoucek

Developing scalable solution algorithms is one of the central problems in computational game theory. We present an iterative algorithm for computing an exact Nash equilibrium for two-player zero-sum extensive-form games with imperfect information. Our approach combines two key elements: (1) the compact sequence-form representation of extensive-form games and (2) the algorithmic framework of double-oracle methods. The main idea of our algorithm is to restrict the game by allowing the players to play only selected sequences of available actions. After solving the restricted game, new sequences are added by finding best responses to the current solution using fast algorithms. We experimentally evaluate our algorithm on a set of games inspired by patrolling scenarios, board games, and card games. The results show significant runtime improvements in games admitting an equilibrium with small support, and substantial improvement in memory use even on games with large support. The improvement in memory use is particularly important because it allows our algorithm to solve much larger game instances than existing linear programming methods. Our main contributions include (1) a generic sequence-form double-oracle algorithm for solving zero-sum extensive-form games; (2) fast methods for maintaining a valid restricted game model when adding new sequences; (3) a search algorithm and pruning methods for computing best-response sequences; (4) theoretical guarantees about the convergence of the algorithm to a Nash equilibrium; (5) experimental analysis of our algorithm on several games, including an approximate version of the algorithm.
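The double-oracle loop described above can be sketched on a plain matrix game, which is a much simpler setting than the sequence form used in the paper. In this sketch the restricted game is solved only approximately, by regret-matching self-play standing in for a linear program; all function names and the 4x4 payoff matrix are hypothetical, and the sequence-form bookkeeping and pruning methods of the paper are not reproduced.

```python
def solve_restricted(payoff, rows, cols, iterations=20000):
    """Approximately solve the game restricted to the given row/column indices.

    Uses regret-matching self-play (an assumption of this sketch; an LP
    solver would give an exact solution). Returns average strategies
    expressed over the full action sets, with zeros outside the restriction.
    """
    def mix(regret):
        positive = [max(r, 0.0) for r in regret]
        total = sum(positive)
        if total <= 0.0:
            return [1.0 / len(regret)] * len(regret)
        return [p / total for p in positive]

    row_regret, col_regret = [0.0] * len(rows), [0.0] * len(cols)
    row_sum, col_sum = [0.0] * len(rows), [0.0] * len(cols)
    for _ in range(iterations):
        x, y = mix(row_regret), mix(col_regret)
        row_values = [sum(payoff[i][j] * y[b] for b, j in enumerate(cols))
                      for i in rows]
        col_values = [-sum(payoff[i][j] * x[a] for a, i in enumerate(rows))
                      for j in cols]
        row_expected = sum(p * v for p, v in zip(x, row_values))
        col_expected = sum(p * v for p, v in zip(y, col_values))
        row_regret = [r + v - row_expected for r, v in zip(row_regret, row_values)]
        col_regret = [r + v - col_expected for r, v in zip(col_regret, col_values)]
        row_sum = [s + p for s, p in zip(row_sum, x)]
        col_sum = [s + p for s, p in zip(col_sum, y)]
    full_x, full_y = [0.0] * len(payoff), [0.0] * len(payoff[0])
    for a, i in enumerate(rows):
        full_x[i] = row_sum[a] / iterations
    for b, j in enumerate(cols):
        full_y[j] = col_sum[b] / iterations
    return full_x, full_y


def double_oracle(payoff):
    """Grow the restricted game until neither best response lies outside it."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    rows, cols = [0], [0]  # arbitrary initial restriction: one action each
    while True:
        x, y = solve_restricted(payoff, rows, cols)
        # Best responses are computed in the full game.
        row_values = [sum(payoff[i][j] * y[j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [sum(payoff[i][j] * x[i] for i in range(n_rows))
                      for j in range(n_cols)]
        best_row = max(range(n_rows), key=row_values.__getitem__)
        best_col = min(range(n_cols), key=col_values.__getitem__)
        grew = False
        if best_row not in rows:
            rows.append(best_row)
            grew = True
        if best_col not in cols:
            cols.append(best_col)
            grew = True
        if not grew:
            return x, y, rows, cols


# Hypothetical game whose equilibrium uses only the first two actions of each
# player; the remaining actions are dominated.
PAYOFF = [
    [2.0, -1.0, 3.0, 3.0],
    [-1.0, 1.0, 3.0, 3.0],
    [-2.0, -2.0, 0.0, 0.0],
    [-3.0, -3.0, 0.0, 0.0],
]
x, y, used_rows, used_cols = double_oracle(PAYOFF)
```

Because dominated actions are never a best response here, the restricted game never grows beyond the equilibrium support, which is the small-support situation the abstract identifies as favorable for runtime.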


Games ◽  
2021 ◽  
Vol 13 (1) ◽  
pp. 2
Author(s):  
Valeria Zahoransky ◽  
Julian Gutierrez ◽  
Paul Harrenstein ◽  
Michael Wooldridge

We introduce a non-cooperative game model in which players’ decision nodes are partially ordered by a dependence relation, which directly captures informational dependencies in the game. In saying that a decision node v is dependent on decision nodes v1,…,vk, we mean that the information available to a strategy making a choice at v is precisely the choices that were made at v1,…,vk. Although partial order games are no more expressive than extensive form games of imperfect information (we show that any partial order game can be reduced to a strategically equivalent extensive form game of imperfect information, though possibly at the cost of an exponential blowup in the size of the game), they provide a more natural and compact representation for many strategic settings of interest. After introducing the game model, we investigate the relationship to extensive form games of imperfect information, the problem of computing Nash equilibria, and conditions that enable backwards induction in this new model.


Author(s):  
Andrea Celli ◽  
Alberto Marchesi ◽  
Gabriele Farina ◽  
Nicola Gatti

The existence of uncoupled no-regret learning dynamics converging to correlated equilibria in normal-form games is a celebrated result in the theory of multi-agent systems. Specifically, it has been known for more than 20 years that when all players seek to minimize their internal regret in a repeated normal-form game, the empirical frequency of play converges to a normal-form correlated equilibrium. Extensive-form games generalize normal-form games by modeling both sequential and simultaneous moves, as well as imperfect information. Because of the sequential nature and the presence of private information, correlation in extensive-form games possesses significantly different properties than in normal-form games. The extensive-form correlated equilibrium (EFCE) is the natural extensive-form counterpart to the classical notion of correlated equilibrium in normal-form games. Compared to the latter, the constraints that define the set of EFCEs are significantly more complex, as the correlation device (a.k.a. mediator) must take into account the evolution of beliefs of each player as they make observations throughout the game. Due to this additional complexity, the existence of uncoupled learning dynamics leading to an EFCE has remained a challenging open research question for a long time. In this article, we settle that question by giving the first uncoupled no-regret dynamics which provably converge to the set of EFCEs in n-player general-sum extensive-form games with perfect recall. We show that each iterate can be computed in time polynomial in the size of the game tree, and that, when all players play repeatedly according to our learning dynamics, the empirical frequency of play after T game repetitions is guaranteed to be an O(T^(-1/2))-approximate EFCE with high probability, and an EFCE almost surely in the limit.


Author(s):  
Aviad Heifetz ◽  
Martin Meier ◽  
Burkhard C. Schipper
