Rule based strategies for large extensive-form games: A specification language for No-Limit Texas Hold'em agents

2014, Vol. 11(4), pp. 1249-1269
Author(s): Luís Teófilo, Luís Reis, Henrique Cardoso, Pedro Mendes

Poker is used to measure progress in extensive-form games research due to its unique characteristics: playing agents must deal with incomplete information, stochastic scenarios, and a large number of decision points. The development of Poker agents has seen significant advances in one-on-one matches, but there are still no consistent results in multiplayer games or in games against human experts. To allow experts to aid in improving the agents' performance, we have created a high-level strategy specification language. To support strategy definition, we have also developed an intuitive graphical tool. Additionally, we have created a strategy inference system based on a dynamically weighted Euclidean distance. This approach was validated through the creation of simple agents and by successfully inferring strategies from 10 human players. The created agents were able to beat previously developed mid-level agents by a good profit margin.
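
As a rough illustration of the inference step described above, here is a minimal Python sketch of nearest-rule matching under a weighted Euclidean distance. The rule representation, feature vectors, and weights are assumptions for illustration; the abstract does not specify how the dynamic weighting is computed.

```python
import math

def weighted_euclidean(x, center, weights):
    """Distance between an observed game-state feature vector and a
    rule's reference point, with a per-feature weight vector."""
    return math.sqrt(sum(w * (a - b) ** 2
                         for w, a, b in zip(weights, x, center)))

def infer_rule(state_features, rules, weights):
    """Return the name of the rule whose reference point is closest to
    the current state. `rules` maps rule names to feature vectors;
    the weights would be recomputed dynamically (e.g. per betting round)."""
    return min(rules, key=lambda name: weighted_euclidean(
        state_features, rules[name], weights))

# Hypothetical features: (hand strength, pot odds, aggression).
rules = {"fold": (0.1, 0.2, 0.1),
         "call": (0.5, 0.5, 0.4),
         "raise": (0.9, 0.7, 0.8)}
action = infer_rule((0.85, 0.6, 0.7), rules, weights=(2.0, 1.0, 1.0))  # -> "raise"
```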

Author(s): Trevor Davis, Kevin Waugh, Michael Bowling

Extensive-form games are a common model for multiagent interactions with imperfect information. In two-player zero-sum games, the typical solution concept is a Nash equilibrium over the unconstrained strategy set for each player. In many situations, however, we would like to constrain the set of possible strategies. For example, constraints are a natural way to model limited resources, risk mitigation, safety, consistency with past observations of behavior, or other secondary objectives for an agent. In small games, optimal strategies under linear constraints can be found by solving a linear program; however, state-of-the-art algorithms for solving large games cannot handle general constraints. In this work we introduce a generalized form of Counterfactual Regret Minimization that provably finds optimal strategies under any feasible set of convex constraints. We demonstrate the effectiveness of our algorithm for finding strategies that mitigate risk in security games, and for opponent modeling in poker games when given only partial observations of private information.
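
To make the idea concrete, below is a hedged Python sketch of one CFR building block (regret matching) combined with a simple convex constraint: per-action probability floors. This is only a toy instance of a convex constraint set, not the paper's general constrained-CFR algorithm; the clip-and-renormalise step is an assumption chosen for simplicity.

```python
import numpy as np

def regret_matching(regrets):
    """Standard regret matching: play each action in proportion to its
    positive cumulative regret; fall back to uniform if none is positive."""
    pos = np.maximum(regrets, 0.0)
    total = pos.sum()
    if total == 0.0:
        return np.full(len(regrets), 1.0 / len(regrets))
    return pos / total

def enforce_lower_bounds(strategy, min_probs):
    """Map a strategy into the convex set {s : s >= min_probs, sum(s) = 1}
    by clipping and renormalising the slack mass. Box lower bounds
    (e.g. 'bet at least 10% of the time') are one simple convex family;
    the paper handles general convex constraints. Assumes min_probs.sum() < 1."""
    clipped = np.maximum(strategy, min_probs)
    slack = clipped - min_probs
    free_mass = 1.0 - min_probs.sum()
    return min_probs + slack * (free_mass / slack.sum())

# Example: three actions, cumulative regrets, and a 10% floor on action 0.
strategy = regret_matching(np.array([2.0, -1.0, 0.5]))
constrained = enforce_lower_bounds(strategy, np.array([0.1, 0.0, 0.0]))
```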


Author(s): Jiri Cermak, Branislav Bošanský, Viliam Lisý

We solve large two-player zero-sum extensive-form games with perfect recall. We propose a new algorithm based on fictitious play that significantly reduces memory requirements for storing average strategies. The key feature is exploiting imperfect recall abstractions while preserving the convergence rate and guarantees of fictitious play applied directly to the perfect recall game. The algorithm creates a coarse imperfect recall abstraction of the perfect recall game and automatically refines its information set structure only where the imperfect recall might cause problems. Experimental evaluation shows that our novel algorithm is able to solve a simplified poker game with 7×10^5 information sets using an abstracted game with only 1.8% of the information sets of the original game. Additional experiments on poker and randomly generated games suggest that the relative size of the abstraction decreases as the size of the solved games increases.
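
For context, the following self-contained Python sketch shows plain fictitious play on a small zero-sum matrix game, the baseline whose average strategies the abstraction above compresses. The matrix-game setting and iteration count are illustrative simplifications of the extensive-form case.

```python
import numpy as np

def fictitious_play(payoff, iters=10000):
    """Fictitious play on a two-player zero-sum matrix game: each player
    best-responds to the opponent's empirical average strategy, and the
    averages converge to a Nash equilibrium."""
    m, n = payoff.shape
    counts_row = np.zeros(m)
    counts_col = np.zeros(n)
    counts_row[0] += 1  # seed with an arbitrary first pure play
    counts_col[0] += 1
    for _ in range(iters):
        br_row = np.argmax(payoff @ counts_col)   # row player maximises
        br_col = np.argmin(counts_row @ payoff)   # column player minimises
        counts_row[br_row] += 1
        counts_col[br_col] += 1
    return counts_row / counts_row.sum(), counts_col / counts_col.sum()

# Rock-paper-scissors: both averages approach (1/3, 1/3, 1/3).
rps = np.array([[0.0, -1.0, 1.0],
                [1.0, 0.0, -1.0],
                [-1.0, 1.0, 0.0]])
avg_row, avg_col = fictitious_play(rps)
```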


Author(s): Jiří Čermák, Viliam Lisý, Branislav Bošanský

Information abstraction is one of the methods for tackling large extensive-form games (EFGs). Removing some information available to players reduces the memory required for computing and storing strategies. We present novel domain-independent abstraction methods for creating very coarse abstractions of EFGs that still yield strategies that are (near) optimal in the original game. First, the methods start with an arbitrary abstraction of the original game (domain-specific or the coarsest possible). Next, they iteratively detect which information is required in the abstract game so that a (near) optimal strategy in the original game can be found, and incorporate this information into the abstract game. Moreover, the methods are able to exploit imperfect-recall abstractions, where players can even forget the history of their own actions. We present two algorithms that follow these steps -- FPIRA, based on fictitious play, and CFR+IRA, based on counterfactual regret minimization. The experimental evaluation confirms that our methods can closely approximate the Nash equilibrium of large games using an abstraction with only 0.9% of the information sets of the original game.
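
The shared solve-measure-refine loop of FPIRA and CFR+IRA can be summarised by the following Python skeleton. All the callables (solve, exploitability, split_worst) and the abstraction object are hypothetical placeholders; the paper's actual refinement criteria are not shown here.

```python
def refine_abstraction(solve, exploitability, split_worst, abstraction,
                       epsilon=1e-3, max_rounds=100):
    """Generic solve-measure-refine loop in the spirit of FPIRA/CFR+IRA.
    `solve` computes a strategy for the abstract game, `exploitability`
    evaluates that strategy lifted back into the original game, and
    `split_worst` refines the information sets responsible for the gap."""
    strategy = solve(abstraction)
    for _ in range(max_rounds):
        if exploitability(strategy) <= epsilon:
            break  # near-optimal in the original game; stop refining
        abstraction = split_worst(abstraction, strategy)
        strategy = solve(abstraction)
    return strategy, abstraction
```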


1994, Vol. 7(3), pp. 309-317
Author(s): Jacques Crémer

2011, Vol. 52(1), pp. 75-102
Author(s): Carlos Alós-Ferrer, Klaus Ritzberger

2021, Vol. 31(3), pp. 1-26
Author(s): Aravind Balakrishnan, Jaeyoung Lee, Ashish Gaurav, Krzysztof Czarnecki, Sean Sedwards

Reinforcement learning (RL) is an attractive way to implement high-level decision-making policies for autonomous driving, but learning directly from a real vehicle or a high-fidelity simulator is variously infeasible. We therefore consider the problem of transfer reinforcement learning and study how a policy learned in a simple environment using WiseMove can be transferred to our high-fidelity simulator, WiseSim. WiseMove is a framework to study safety and other aspects of RL for autonomous driving. WiseSim accurately reproduces the dynamics and software stack of our real vehicle. We find that the accurately modelled perception errors in WiseSim contribute the most to the transfer problem. These errors, even when naively modelled in WiseMove, yield an RL policy that performs better in WiseSim than a hand-crafted rule-based policy. Applying domain randomization to the environment in WiseMove yields an even better policy. The final RL policy reduces the failures due to perception errors from 10% to 2.75%. We also observe that the RL policy relies significantly less on velocity than the rule-based policy, having learned that its measurement is unreliable.
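
A minimal Python sketch of the domain-randomization idea applied to a perception channel follows. The noise model (per-episode bias, Gaussian noise, dropout) and all parameter ranges are assumptions for illustration; the paper's perception-error models are more detailed.

```python
import random

def sample_noise_model(rng=random):
    """Draw one randomized perception-error model per training episode,
    so the policy cannot over-commit to any single error profile.
    All ranges below are illustrative assumptions."""
    bias = rng.uniform(-1.0, 1.0)    # m/s systematic velocity offset
    sigma = rng.uniform(0.0, 2.0)    # m/s noise standard deviation
    drop = rng.uniform(0.0, 0.1)     # probability of a missing reading

    def observe(true_velocity):
        if rng.random() < drop:
            return None              # sensor dropout: no measurement
        return true_velocity + bias + rng.gauss(0.0, sigma)
    return observe

# Usage: resample the error model at the start of every episode.
observe = sample_noise_model()
noisy_velocity = observe(12.5)
```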


2021, Vol. 20(01), pp. 2150013
Author(s): Mohammed Abu-Arqoub, Wael Hadi, Abdelraouf Ishtaiwi

Associative Classification (AC) classifiers are of substantial interest due to their ability to mine vast sets of rules. However, researchers have shown over the decades that a large number of these mined rules are trivial, irrelevant, redundant, and sometimes harmful, as they can bias decision-making. Accordingly, in this paper we address these challenges and propose a novel AC approach based on the RIPPER algorithm, which we refer to as ACRIPPER. Our new approach combines the strength of the RIPPER algorithm with the classical AC method in order to achieve: (1) a reduction in the number of rules being mined, especially those that are largely insignificant; and (2) tight integration between the confidence and support of the rules on the one hand and the class imbalance level in the prediction phase on the other. Our experimental results, using 20 different well-known datasets, reveal that the proposed ACRIPPER significantly outperforms the well-known rule-based algorithms RIPPER and J48. Moreover, ACRIPPER significantly outperforms the current AC-based algorithms CBA, CMAR, ECBA, FACA, and ACPRISM. Finally, ACRIPPER achieves the best average and the best ranking on the accuracy measure.
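
To illustrate the rule-reduction idea in AC methods generally, here is a small Python sketch that filters and ranks mined class-association rules by support and confidence. The thresholds and rule representation are illustrative assumptions, not ACRIPPER's actual mechanism.

```python
def prune_rules(rules, min_support=0.01, min_confidence=0.6):
    """Filter mined class-association rules the way AC methods commonly do:
    keep a rule only if it is frequent and predictive enough. Each rule is
    a dict with 'support' and 'confidence' in [0, 1]; both thresholds are
    illustrative, not ACRIPPER's actual values."""
    kept = [r for r in rules
            if r["support"] >= min_support
            and r["confidence"] >= min_confidence]
    # Rank for prediction: higher confidence first, ties broken by support.
    kept.sort(key=lambda r: (r["confidence"], r["support"]), reverse=True)
    return kept

# Example: the second rule is dropped as trivial (low support and confidence).
mined = [{"rule": "A & B -> class1", "support": 0.05, "confidence": 0.9},
         {"rule": "C -> class2", "support": 0.001, "confidence": 0.4}]
ranked = prune_rules(mined)
```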

