scholarly journals Solving Large Extensive-Form Games with Strategy Constraints

Author(s):  
Trevor Davis ◽  
Kevin Waugh ◽  
Michael Bowling

Extensive-form games are a common model for multiagent interactions with imperfect information. In two-player zerosum games, the typical solution concept is a Nash equilibrium over the unconstrained strategy set for each player. In many situations, however, we would like to constrain the set of possible strategies. For example, constraints are a natural way to model limited resources, risk mitigation, safety, consistency with past observations of behavior, or other secondary objectives for an agent. In small games, optimal strategies under linear constraints can be found by solving a linear program; however, state-of-the-art algorithms for solving large games cannot handle general constraints. In this work we introduce a generalized form of Counterfactual Regret Minimization that provably finds optimal strategies under any feasible set of convex constraints. We demonstrate the effectiveness of our algorithm for finding strategies that mitigate risk in security games, and for opponent modeling in poker games when given only partial observations of private information.

Author(s):  
Andrea Celli ◽  
Alberto Marchesi ◽  
Gabriele Farina ◽  
Nicola Gatti

The existence of uncoupled no-regret learning dynamics converging to correlated equilibria in normal-form games is a celebrated result in the theory of multi-agent systems. Specifically, it has been known for more than 20 years that when all players seek to minimize their internal regret in a repeated normal-form game, the empirical frequency of play converges to a normal-form correlated equilibrium. Extensive-form games generalize normal-form games by modeling both sequential and simultaneous moves, as well as imperfect information. Because of the sequential nature and the presence of private information, correlation in extensive-form games possesses significantly different properties than in normal-form games. The extensive-form correlated equilibrium (EFCE) is the natural extensive-form counterpart to the classical notion of correlated equilibrium in normal-form games. Compared to the latter, the constraints that define the set of EFCEs are significantly more complex, as the correlation device ({\em a.k.a.} mediator) must take into account the evolution of beliefs of each player as they make observations throughout the game. Due to this additional complexity, the existence of uncoupled learning dynamics leading to an EFCE has remained a challenging open research question for a long time. In this article, we settle that question by giving the first uncoupled no-regret dynamics which provably converge to the set of EFCEs in n-player general-sum extensive-form games with perfect recall. We show that each iterate can be computed in time polynomial in the size of the game tree, and that, when all players play repeatedly according to our learning dynamics, the empirical frequency of play after T game repetitions is guaranteed to be a O(T^-1/2)-approximate EFCE with high probability, and an EFCE almost surely in the limit.


2014 ◽  
Vol 51 ◽  
pp. 829-866 ◽  
Author(s):  
B. Bosansky ◽  
C. Kiekintveld ◽  
V. Lisy ◽  
M. Pechoucek

Developing scalable solution algorithms is one of the central problems in computational game theory. We present an iterative algorithm for computing an exact Nash equilibrium for two-player zero-sum extensive-form games with imperfect information. Our approach combines two key elements: (1) the compact sequence-form representation of extensive-form games and (2) the algorithmic framework of double-oracle methods. The main idea of our algorithm is to restrict the game by allowing the players to play only selected sequences of available actions. After solving the restricted game, new sequences are added by finding best responses to the current solution using fast algorithms. We experimentally evaluate our algorithm on a set of games inspired by patrolling scenarios, board, and card games. The results show significant runtime improvements in games admitting an equilibrium with small support, and substantial improvement in memory use even on games with large support. The improvement in memory use is particularly important because it allows our algorithm to solve much larger game instances than existing linear programming methods. Our main contributions include (1) a generic sequence-form double-oracle algorithm for solving zero-sum extensive-form games; (2) fast methods for maintaining a valid restricted game model when adding new sequences; (3) a search algorithm and pruning methods for computing best-response sequences; (4) theoretical guarantees about the convergence of the algorithm to a Nash equilibrium; (5) experimental analysis of our algorithm on several games, including an approximate version of the algorithm.


Author(s):  
Christian Kroer ◽  
Gabriele Farina ◽  
Tuomas Sandholm

Nash equilibrium is a popular solution concept for solving imperfect-information games in practice. However, it has a major drawback: it does not preclude suboptimal play in branches of the game tree that are not reached in equilibrium. Equilibrium refinements can mend this issue, but have experienced little practical adoption. This is largely due to a lack of scalable algorithms.Sparse iterative methods, in particular first-order methods, are known to be among the most effective algorithms for computing Nash equilibria in large-scale two-player zero-sum extensive-form games. In this paper, we provide, to our knowledge, the first extension of these methods to equilibrium refinements. We develop a smoothing approach for behavioral perturbations of the convex polytope that encompasses the strategy spaces of players in an extensive-form game. This enables one to compute an approximate variant of extensive-form perfect equilibria. Experiments show that our smoothing approach leads to solutions with dramatically stronger strategies at information sets that are reached with low probability in approximate Nash equilibria, while retaining the overall convergence rate associated with fast algorithms for Nash equilibrium. This has benefits both in approximate equilibrium finding (such approximation is necessary in practice in large games) where some probabilities are low while possibly heading toward zero in the limit, and exact equilibrium computation where the low probabilities are actually zero.


Author(s):  
Jiri Cermak ◽  
Branislav Bošanský ◽  
Viliam Lisý

We solve large two-player zero-sum extensive-form games with perfect recall. We propose a new algorithm based on fictitious play that significantly reduces memory requirements for storing average strategies. The key feature is exploiting imperfect recall abstractions while preserving the convergence rate and guarantees of fictitious play applied directly to the perfect recall game. The algorithm creates a coarse imperfect recall abstraction of the perfect recall game and automatically refines its information set structure only where the imperfect recall might cause problems. Experimental evaluation shows that our novel algorithm is able to solve a simplified poker game with 7.10^5 information sets using an abstracted game with only 1.8% of information sets of the original game. Additional experiments on poker and randomly generated games suggest that the relative size of the abstraction decreases as the size of the solved games increases.


2014 ◽  
Vol 11 (4) ◽  
pp. 1249-1269
Author(s):  
Luís Teófilo ◽  
Luís Reis ◽  
Henrique Cardoso ◽  
Pedro Mendes

Poker is used to measure progresses in extensive-form games research due to its unique characteristics: it is a game where playing agents have to deal with incomplete information and stochastic scenarios and a large number of decision points. The development of Poker agents has seen significant advances in one-on-one matches but there are still no consistent results in multiplayer and in games against human experts. In order to allow for experts to aid the improvement of the agents? performance, we have created a high-level strategy specification language. To support strategy definition, we have also developed an intuitive graphical tool. Additionally, we have also created a strategy inferring system, based on a dynamically weighted Euclidean distance. This approach was validated through the creation of simple agents and by successfully inferring strategies from 10 human players. The created agents were able to beat previously developed mid-level agents by a good profit margin.


2020 ◽  
Vol 34 (02) ◽  
pp. 2318-2325
Author(s):  
Youzhi Zhang ◽  
Bo An

The study of finding the equilibrium for multiplayer games is challenging. This paper focuses on computing Team-Maxmin Equilibria (TMEs) in zero-sum multiplayer Extensive-Form Games (EFGs), which describes the optimal strategies for a team of players who share the same goal but they take actions independently against an adversary. TMEs can capture many realistic scenarios, including: 1) a team of players play against a target player in poker games; and 2) defense resources schedule and patrol independently in security games. However, the study of efficiently finding TMEs within any given accuracy in EFGs is almost completely unexplored. To fill this gap, we first study the inefficiency caused by computing the equilibrium where team players correlate their strategies and then transforming it into the mixed strategy profile of the team and show that this inefficiency can be arbitrarily large. Second, to efficiently solve the non-convex program for finding TMEs directly, we develop the Associated Recursive Asynchronous Multiparametric Disaggregation Technique (ARAMDT) to approximate multilinear terms in the program with two novel techniques: 1) an asynchronous precision method to reduce the number of constraints and variables for approximation by using different precision levels to approximate these terms; and 2) an associated constraint method to reduce the feasible solution space of the mixed-integer linear program resulting from ARAMDT by exploiting the relation between these terms. Third, we develop a novel iterative algorithm to efficiently compute TMEs within any given accuracy based on ARAMDT. Our algorithm is orders of magnitude faster than baselines in the experimental evaluation.


2020 ◽  
Vol 34 (02) ◽  
pp. 1934-1941
Author(s):  
Gabriele Farina ◽  
Tommaso Bianchi ◽  
Tuomas Sandholm

Coarse correlation models strategic interactions of rational agents complemented by a correlation device which is a mediator that can recommend behavior but not enforce it. Despite being a classical concept in the theory of normal-form games since 1978, not much is known about the merits of coarse correlation in extensive-form settings. In this paper, we consider two instantiations of the idea of coarse correlation in extensive-form games: normal-form coarse-correlated equilibrium (NFCCE), already defined in the literature, and extensive-form coarse-correlated equilibrium (EFCCE), a new solution concept that we introduce. We show that EFCCEs are a subset of NFCCEs and a superset of the related extensive-form correlated equilibria. We also show that, in n-player extensive-form games, social-welfare-maximizing EFCCEs and NFCCEs are bilinear saddle points, and give new efficient algorithms for the special case of two-player games with no chance moves. Experimentally, our proposed algorithm for NFCCE is two to four orders of magnitude faster than the prior state of the art.


Author(s):  
Jiří Čermák ◽  
Viliam Lisý ◽  
Branislav Bošanský

Information abstraction is one of the methods for tackling large extensive-form games (EFGs). Removing some information available to players reduces the memory required for computing and storing strategies. We present novel domain-independent abstraction methods for creating very coarse abstractions of EFGs that still compute strategies that are (near) optimal in the original game. First, the methods start with an arbitrary abstraction of the original game (domain-specific or the coarsest possible). Next, they iteratively detect which information is required in the abstract game so that a (near) optimal strategy in the original game can be found and include this information into the abstract game. Moreover, the methods are able to exploit imperfect-recall abstractions where players can even forget the history of their own actions. We present two algorithms that follow these steps -- FPIRA, based on fictitious play, and CFR+IRA, based on counterfactual regret minimization. The experimental evaluation confirms that our methods can closely approximate Nash equilibrium of large games using abstraction with only 0.9% of information sets of the original game.


Sign in / Sign up

Export Citation Format

Share Document