Computing Team-Maxmin Equilibria in Zero-Sum Multiplayer Extensive-Form Games

Youzhi Zhang; Bo An

doi:10.1609/aaai.v34i02.5610

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i02.5610 ◽

2020 ◽

Vol 34 (02) ◽

pp. 2318-2325

Author(s):

Youzhi Zhang ◽

Bo An

Keyword(s):

Solution Space ◽

Optimal Strategies ◽

Mixed Integer ◽

Multiplayer Games ◽

Extensive Form ◽

Mixed Integer Linear Program ◽

Extensive Form Games ◽

Security Games ◽

Zero Sum ◽

Constraint Method

Finding equilibria in multiplayer games is challenging. This paper focuses on computing Team-Maxmin Equilibria (TMEs) in zero-sum multiplayer Extensive-Form Games (EFGs), which describe the optimal strategies for a team of players who share the same goal but take actions independently against an adversary. TMEs can capture many realistic scenarios, including: 1) a team of players playing against a target player in poker games; and 2) defense resources scheduling and patrolling independently in security games. However, the study of efficiently finding TMEs within any given accuracy in EFGs is almost completely unexplored. To fill this gap, we first study the inefficiency caused by computing the equilibrium where team players correlate their strategies and then transforming it into the mixed strategy profile of the team, and show that this inefficiency can be arbitrarily large. Second, to efficiently solve the non-convex program for finding TMEs directly, we develop the Associated Recursive Asynchronous Multiparametric Disaggregation Technique (ARAMDT) to approximate multilinear terms in the program with two novel techniques: 1) an asynchronous precision method to reduce the number of constraints and variables for approximation by using different precision levels to approximate these terms; and 2) an associated constraint method to reduce the feasible solution space of the mixed-integer linear program resulting from ARAMDT by exploiting the relation between these terms. Third, we develop a novel iterative algorithm to efficiently compute TMEs within any given accuracy based on ARAMDT. Our algorithm is orders of magnitude faster than baselines in the experimental evaluation.
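
To illustrate where the non-convexity mentioned in this abstract comes from, here is a sketch in the simpler normal-form setting with two team members and one adversary. The symbols $x_1$, $x_2$, $S_1$, $S_2$, $A$ and $U$ are notation assumed here for illustration; they are not taken from the paper.

```latex
\max_{x_1 \in \Delta(S_1),\; x_2 \in \Delta(S_2)} \;\; \min_{a \in A} \;\;
\sum_{s_1 \in S_1} \sum_{s_2 \in S_2} x_1(s_1)\, x_2(s_2)\, U(s_1, s_2, a)
```

Because the team members randomize independently, the objective contains the products $x_1(s_1)\,x_2(s_2)$, which makes the program non-convex. If the team could correlate, a single joint distribution over $S_1 \times S_2$ would replace each product and the program would become linear, which is the correlated relaxation whose inefficiency the abstract discusses.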


Solving Large Extensive-Form Games with Strategy Constraints

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33011861 ◽

2019 ◽

Vol 33 ◽

pp. 1861-1868

Author(s):

Trevor Davis ◽

Kevin Waugh ◽

Michael Bowling

Keyword(s):

Private Information ◽

Imperfect Information ◽

Risk Mitigation ◽

Solution Concept ◽

Optimal Strategies ◽

Linear Constraints ◽

Convex Constraints ◽

Extensive Form ◽

Extensive Form Games ◽

Large Extensive Form Games

Extensive-form games are a common model for multiagent interactions with imperfect information. In two-player zero-sum games, the typical solution concept is a Nash equilibrium over the unconstrained strategy set for each player. In many situations, however, we would like to constrain the set of possible strategies. For example, constraints are a natural way to model limited resources, risk mitigation, safety, consistency with past observations of behavior, or other secondary objectives for an agent. In small games, optimal strategies under linear constraints can be found by solving a linear program; however, state-of-the-art algorithms for solving large games cannot handle general constraints. In this work we introduce a generalized form of Counterfactual Regret Minimization that provably finds optimal strategies under any feasible set of convex constraints. We demonstrate the effectiveness of our algorithm for finding strategies that mitigate risk in security games, and for opponent modeling in poker games when given only partial observations of private information.
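
As background for the abstract above, the following is a minimal sketch of regret matching, the per-decision update that Counterfactual Regret Minimization builds on. It is the plain, unconstrained version applied to a zero-sum matrix game, not the generalized constrained algorithm the paper introduces. The function names and the example payoff matrix are assumptions made for illustration.

```python
def regret_matching_strategy(cum_regret):
    """Play each action in proportion to its positive cumulative regret."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    n = len(cum_regret)
    if total <= 0.0:
        return [1.0 / n] * n  # no positive regret yet: play uniformly
    return [p / total for p in positive]


def self_play(payoff, iterations):
    """Run regret matching for both players of a zero-sum matrix game.

    payoff[i][j] is the row player's payoff; the column player receives
    its negative. Returns the time-averaged strategies of both players.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    regret_row, regret_col = [0.0] * n_rows, [0.0] * n_cols
    sum_row, sum_col = [0.0] * n_rows, [0.0] * n_cols
    for _ in range(iterations):
        x = regret_matching_strategy(regret_row)
        y = regret_matching_strategy(regret_col)
        # Expected payoff of each pure action against the opponent's mix.
        u_row = [sum(payoff[i][j] * y[j] for j in range(n_cols))
                 for i in range(n_rows)]
        u_col = [-sum(payoff[i][j] * x[i] for i in range(n_rows))
                 for j in range(n_cols)]
        v_row = sum(x[i] * u_row[i] for i in range(n_rows))
        v_col = sum(y[j] * u_col[j] for j in range(n_cols))
        for i in range(n_rows):
            regret_row[i] += u_row[i] - v_row  # regret for not playing i
            sum_row[i] += x[i]
        for j in range(n_cols):
            regret_col[j] += u_col[j] - v_col
            sum_col[j] += y[j]
    return ([s / iterations for s in sum_row],
            [s / iterations for s in sum_col])


if __name__ == "__main__":
    # Hypothetical 2x2 game whose equilibrium mixes (0.4, 0.6) for both players.
    x, y = self_play([[2.0, -1.0], [-1.0, 1.0]], 20000)
    print(x, y)
```

The averaged strategies, not the last iterates, are what approach an equilibrium; a constrained variant would additionally have to keep the strategies inside the feasible set.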


An Exact Double-Oracle Algorithm for Zero-Sum Extensive-Form Games with Imperfect Information

Journal of Artificial Intelligence Research ◽

10.1613/jair.4477 ◽

2014 ◽

Vol 51 ◽

pp. 829-866 ◽

Cited By ~ 14

Author(s):

B. Bosansky ◽

C. Kiekintveld ◽

V. Lisy ◽

M. Pechoucek

Keyword(s):

Nash Equilibrium ◽

Imperfect Information ◽

Search Algorithm ◽

Main Idea ◽

Substantial Improvement ◽

Extensive Form ◽

Extensive Form Games ◽

Solution Algorithms ◽

Restricted Game ◽

Zero Sum

Developing scalable solution algorithms is one of the central problems in computational game theory. We present an iterative algorithm for computing an exact Nash equilibrium for two-player zero-sum extensive-form games with imperfect information. Our approach combines two key elements: (1) the compact sequence-form representation of extensive-form games and (2) the algorithmic framework of double-oracle methods. The main idea of our algorithm is to restrict the game by allowing the players to play only selected sequences of available actions. After solving the restricted game, new sequences are added by finding best responses to the current solution using fast algorithms. We experimentally evaluate our algorithm on a set of games inspired by patrolling scenarios, board, and card games. The results show significant runtime improvements in games admitting an equilibrium with small support, and substantial improvement in memory use even on games with large support. The improvement in memory use is particularly important because it allows our algorithm to solve much larger game instances than existing linear programming methods. Our main contributions include (1) a generic sequence-form double-oracle algorithm for solving zero-sum extensive-form games; (2) fast methods for maintaining a valid restricted game model when adding new sequences; (3) a search algorithm and pruning methods for computing best-response sequences; (4) theoretical guarantees about the convergence of the algorithm to a Nash equilibrium; (5) experimental analysis of our algorithm on several games, including an approximate version of the algorithm.
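
The loop described in this abstract (solve a restricted game, compute best responses in the full game, add them, repeat) can be sketched for a small matrix game. This is a simplified normal-form illustration, not the paper's sequence-form algorithm: the restricted game is solved approximately by fictitious play instead of an exact linear program, and all names and the example payoff matrix are assumptions.

```python
def solve_restricted(payoff, rows, cols, iterations=20000):
    """Approximate an equilibrium of the restricted game by fictitious play."""
    count_r = {i: 0 for i in rows}
    count_c = {j: 0 for j in cols}
    count_r[rows[0]] = 1  # arbitrary initial plays
    count_c[cols[0]] = 1
    for _ in range(iterations):
        # Each player best-responds to the opponent's empirical play so far.
        br_r = max(rows, key=lambda i: sum(payoff[i][j] * count_c[j] for j in cols))
        br_c = min(cols, key=lambda j: sum(payoff[i][j] * count_r[i] for i in rows))
        count_r[br_r] += 1
        count_c[br_c] += 1
    total_r, total_c = sum(count_r.values()), sum(count_c.values())
    x = {i: count_r[i] / total_r for i in rows}
    y = {j: count_c[j] / total_c for j in cols}
    return x, y


def double_oracle(payoff):
    """Grow the restricted action sets until no best response lies outside them.

    payoff[i][j] is the row player's payoff (row maximizes, column minimizes).
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    rows, cols = [0], [0]  # start from one arbitrary action per player
    while True:
        x, y = solve_restricted(payoff, rows, cols)
        # The best-response oracles search the FULL action sets.
        br_row = max(range(n_rows),
                     key=lambda i: sum(payoff[i][j] * y[j] for j in cols))
        br_col = min(range(n_cols),
                     key=lambda j: sum(payoff[i][j] * x[i] for i in rows))
        if br_row in rows and br_col in cols:
            value = sum(payoff[i][j] * x[i] * y[j] for i in rows for j in cols)
            return x, y, value, rows, cols
        if br_row not in rows:
            rows.append(br_row)
        if br_col not in cols:
            cols.append(br_col)


if __name__ == "__main__":
    # Hypothetical game: row 2 and column 2 are dominated and are never added.
    game = [[2, -1, 5], [-1, 1, 5], [-3, -2, 4]]
    print(double_oracle(game))
```

In this example the loop stops with two actions per player, which mirrors the abstract's point that games with small-support equilibria need only a small restricted game.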


Efficient airplane arrival scheduling using a set partitioning-based branch-and-price method

Proceedings of the Institution of Mechanical Engineers Part G Journal of Aerospace Engineering ◽

10.1177/0954410017718566 ◽

2017 ◽

Vol 232 (16) ◽

pp. 2939-2951

Author(s):

Jae-Hoon Song ◽

Han-Lim Choi

Keyword(s):

Large Scale ◽

Time Window ◽

Heuristic Method ◽

Solution Space ◽

Exact Algorithm ◽

Set Partitioning ◽

Mixed Integer ◽

Branch And Price ◽

Mixed Integer Linear Program ◽

Public Data

This article presents an exact algorithm that is combined with a heuristic method to find the optimal solution for an airplane landing problem. For a given set of airplanes and runways, the objective is to minimize the accumulated deviations from the target landing time of the airplanes. A cost associated with landing either earlier or later than the target landing time is incurred for each airplane within its predetermined time window. In order to manage this type of large-scale optimization problem, a set partitioning formulation that results in a mixed integer linear program is proposed. One key contribution of this article is the development of a branch-and-price methodology, in which the column generation method is integrated with the branch-and-bound method in order to find the optimal integer solution. In addition to the exact algorithm, a simple heuristic method is also presented to tighten the solution space. Numerical experiments are undertaken for the proposed algorithm in order to confirm its effectiveness using public data from the OR-Library. As an application in the real-world situation of airplane landing, air traffic data from Incheon International Airport is employed to assure the efficiency of the proposed algorithm.
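
The per-airplane cost described in this abstract can be sketched as follows. The linear penalty rates, the parameter names, and the numbers in the usage example are assumptions for illustration; the abstract states only that landing earlier or later than the target time within the time window incurs a cost.

```python
def landing_cost(landing_time, target_time, earliest, latest,
                 early_penalty, late_penalty):
    """Cost of one landing, assuming linear penalties per unit of deviation."""
    if not earliest <= landing_time <= latest:
        raise ValueError("landing time outside the allowed time window")
    early = max(0.0, target_time - landing_time)  # time units before target
    late = max(0.0, landing_time - target_time)   # time units after target
    return early_penalty * early + late_penalty * late


if __name__ == "__main__":
    # Hypothetical airplane: target 100, window [90, 120], rates 10 and 30.
    print(landing_cost(95, 100, 90, 120, 10, 30))   # lands 5 units early
    print(landing_cost(104, 100, 90, 120, 10, 30))  # lands 4 units late
```

The scheduling objective would then sum this cost over all airplanes, subject to runway assignment and separation constraints that this sketch does not model.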


Computing optimal strategies to commit to in extensive-form games

Proceedings of the 11th ACM conference on Electronic commerce - EC '10 ◽

10.1145/1807342.1807354 ◽

2010 ◽

Cited By ~ 18

Author(s):

Joshua Letchford ◽

Vincent Conitzer

Keyword(s):

Optimal Strategies ◽

Extensive Form ◽

Extensive Form Games


An Algorithm for Constructing and Solving Imperfect Recall Abstractions of Large Extensive-Form Games

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/130 ◽

2017 ◽

Cited By ~ 2

Author(s):

Jiri Cermak ◽

Branislav Bošanský ◽

Viliam Lisý

Keyword(s):

Relative Size ◽

Fictitious Play ◽

Perfect Recall ◽

Extensive Form ◽

Extensive Form Games ◽

Imperfect Recall ◽

Information Sets ◽

Information Set ◽

Zero Sum ◽

Large Extensive Form Games

We solve large two-player zero-sum extensive-form games with perfect recall. We propose a new algorithm based on fictitious play that significantly reduces memory requirements for storing average strategies. The key feature is exploiting imperfect recall abstractions while preserving the convergence rate and guarantees of fictitious play applied directly to the perfect recall game. The algorithm creates a coarse imperfect recall abstraction of the perfect recall game and automatically refines its information set structure only where the imperfect recall might cause problems. Experimental evaluation shows that our novel algorithm is able to solve a simplified poker game with 7·10^5 information sets using an abstracted game with only 1.8% of information sets of the original game. Additional experiments on poker and randomly generated games suggest that the relative size of the abstraction decreases as the size of the solved games increases.


Large Scale Learning of Agent Rationality in Two-Player Zero-Sum Games

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33016104 ◽

2019 ◽

Vol 33 ◽

pp. 6104-6111 ◽

Cited By ~ 1

Author(s):

Chun Kai Ling ◽

Fei Fang ◽

J. Zico Kolter

Keyword(s):

Real World ◽

Large Scale ◽

Scale Up ◽

Synthetic Data ◽

Extensive Form ◽

Extensive Form Games ◽

Zero Sum Games ◽

Primal Dual ◽

End To End ◽

Zero Sum

With the recent advances in solving large, zero-sum extensive-form games, there is a growing interest in the inverse problem of inferring underlying game parameters given only access to agent actions. Although a recent work provides a powerful differentiable end-to-end learning framework that embeds a game solver within a deep-learning framework, allowing unknown game parameters to be learned via backpropagation, this framework faces significant limitations when applied to boundedly rational human agents and large-scale problems, which limits its practicality. In this paper, we address these limitations and propose a framework that is applicable to more practical settings. First, seeking to learn the rationality of human agents in complex two-player zero-sum games, we draw upon well-known ideas in decision theory to obtain a concise and interpretable agent behavior model, and derive solvers and gradients for end-to-end learning. Second, to scale up to large, real-world scenarios, we propose an efficient first-order primal-dual method which exploits the structure of extensive-form games, yielding significantly faster computation for both game solving and gradient computation. When tested on randomly generated games, we report speedups of orders of magnitude over previous approaches. We also demonstrate the effectiveness of our model on both real-world one-player settings and synthetic data.


Attack–Defense Trees and Two-Player Binary Zero-Sum Extensive Form Games Are Equivalent

Lecture Notes in Computer Science - Decision and Game Theory for Security ◽

10.1007/978-3-642-17197-0_17 ◽

2010 ◽

pp. 245-256 ◽

Cited By ~ 18

Author(s):

Barbara Kordy ◽

Sjouke Mauw ◽

Matthijs Melissen ◽

Patrick Schweitzer

Keyword(s):

Extensive Form ◽

Extensive Form Games ◽

Zero Sum


A Memetic Approach for Sequential Security Games on a Plane with Moving Targets

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.3301970 ◽

2019 ◽

Vol 33 ◽

pp. 970-977

Author(s):

Jan Karwowski ◽

Jacek Mańdziuk ◽

Adam Żychowski ◽

Filip Grajek ◽

Bo An

Keyword(s):

Linear Time ◽

Approximate Solutions ◽

Mixed Integer ◽

Medium Size ◽

Local Improvement ◽

Straight Line ◽

Equilibrium Profiles ◽

Security Games ◽

New Type ◽

Zero Sum

This paper introduces a new type of Security Games (SG) played on a plane with targets moving along predefined straight-line trajectories, together with its Mixed Integer Linear Programming (MILP) formulation. Three approaches for solving the game are proposed and experimentally evaluated: application of an MILP solver to find exact solutions for small-size games, an MILP-based extension of a recently published zero-sum SG approach to the case of general-sum games for finding approximate solutions of medium-size games, and the use of a Memetic Algorithm (MA) for medium-size and large-size game instances, which are beyond MILP’s scalability. Utilization of MA is, to the best of our knowledge, a new idea in the field of SG. The novelty of the proposed solution lies specifically in efficient chromosome-based game encoding and dedicated local improvement heuristics. In the vast majority of test cases with known equilibrium profiles, the method leads to optimal solutions with high stability and approximately linear time scalability. Another advantage is an iteration-based construction of the system, which makes the approach essentially an anytime method. This property is of paramount importance in the case of restrictive time limits, which could hinder the possibility of calculating an exact solution. On a general note, we believe that MA-based methods may offer a viable alternative to MILP solvers for complex games that require application of approximate solving methods.


Designing the Game to Play: Optimizing Payoff Structure in Security Games

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/71 ◽

2018 ◽

Author(s):

Zheyuan Ryan Shi ◽

Ziye Tang ◽

Long Tran-Thanh ◽

Rohit Singh ◽

Fei Fang

Keyword(s):

Polynomial Time ◽

Approximation Scheme ◽

Mixed Integer ◽

Budget Constraints ◽

Mixed Integer Linear Program ◽

Norm Form ◽

Security Games ◽

Payoff Structure ◽

Norm Constraint ◽

Approximation Guarantee

We study Stackelberg Security Games where the defender, in addition to allocating defensive resources to protect targets from the attacker, can strategically manipulate the attacker’s payoff under budget constraints in weighted L^p-norm form regarding the amount of change. For the case of weighted L^1-norm constraint, we present (i) a mixed integer linear program-based algorithm with approximation guarantee; (ii) a branch-and-bound based algorithm with improved efficiency achieved by effective pruning; (iii) a polynomial time approximation scheme for a special but practical class of problems. In addition, we show that problems under budget constraints in L^0 and weighted L^\infty-norm form can be solved in polynomial time.


Unit commitment using complementarity

10.32920/ryerson.14648058.v1 ◽

2021 ◽

Author(s):

Steven V. Craig

Keyword(s):

Unit Commitment ◽

Discontinuous Solution ◽

Solution Space ◽

Mixed Integer ◽

Nonlinear Constraints ◽

Mixed Integer Linear Program ◽

Integer Variables ◽

Complementarity Theory ◽

Speed And Accuracy ◽

Linear Nature

A need exists to optimally dispatch power generation to meet per-hour requirements on the power grid. This is a well-documented and established problem called Unit Commitment (UC). It is commonly formulated as a Mixed Integer Linear Program (MILP), which utilizes intelligent solvers to produce a solution with speed and accuracy. The linear nature of MILP requires linear approximations of nonlinear constraints. This work introduces the Theory of Complementarity in order to remove integer variables, resulting in a continuous rather than a discontinuous solution space. This permits use of classical solution techniques, as well as nonlinear constraints, thereby increasing accuracy. A formulation is developed to demonstrate a proof of concept of the complementarity theory as used in UC. A subset of constraints will be used and the results will be compared against an MILP optimization, for 10- and 26-generator configurations. Similar trends in generator status and total cost are noted.
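
To make the idea of removing integer variables concrete, one standard complementarity reformulation of a binary on/off variable is shown below. The symbol $u$ for a generator's commitment status is an assumption for illustration, and the abstract does not state that the work uses exactly this form.

```latex
u \in \{0, 1\}
\quad \Longleftrightarrow \quad
0 \le u \;\perp\; (1 - u) \ge 0
\quad \Longleftrightarrow \quad
u \ge 0, \;\; 1 - u \ge 0, \;\; u\,(1 - u) = 0
```

The right-hand side uses only continuous variables, at the price of the nonlinear equality $u(1-u) = 0$, which is why such a formulation calls for nonlinear rather than MILP solution techniques.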
