Heuristic Sensing: An Uncertainty Exploration Method in Imperfect Information Games

Imperfect information games have served as benchmarks and milestones in artificial intelligence (AI) and game theory for decades. Besides computing or approximating an optimal strategy, sensing and exploiting information to describe the game environment effectively is critical for game solving. Reconnaissance blind chess (RBC), a new variant of chess, is a quintessential imperfect information game in which a player's actions are not observed by the opponent. This characteristic exponentially expands the size of the information set and greatly increases the uncertainty of the game environment. In this paper, we introduce a novel sensing method, Heuristic Search of Uncertainty Control (HSUC), to significantly reduce the uncertainty of the real-time information set. The key idea of HSUC is to consider the overall uncertainty of the environment rather than to predict the opponent's strategy. Furthermore, we implement a practical framework for RBC that combines HSUC with Monte Carlo Tree Search (MCTS). In our experiments, HSUC showed better effectiveness and robustness in information sensing than the compared opponents. Our RBC agent won first place in uncertainty management in the NeurIPS 2019 RBC tournament.
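
The idea of choosing a sensing action by how much it shrinks the information set can be illustrated with a minimal sketch. This is not the HSUC algorithm from the paper, whose details the abstract does not give. It is a generic heuristic under stated assumptions: candidate board states are held as hypothetical dictionaries mapping square names to piece symbols, all candidates are treated as equally likely, and the names `observation`, `expected_remaining`, and `best_sense` are invented for illustration.

```python
from collections import Counter
from typing import Dict, FrozenSet, Sequence, Tuple

# Hypothetical state encoding: square name -> piece symbol (absent = empty).
State = Dict[str, str]
Window = FrozenSet[str]


def observation(state: State, window: Window) -> Tuple[Tuple[str, str], ...]:
    """What a sense over `window` would reveal if `state` were the true board."""
    return tuple((sq, state.get(sq, ".")) for sq in sorted(window))


def expected_remaining(states: Sequence[State], window: Window) -> float:
    """Expected number of candidate states left after sensing `window`,
    assuming (for illustration) a uniform belief over `states`."""
    if not states:
        return 0.0
    groups = Counter(observation(s, window) for s in states)
    # A group of size c is hit with probability c/n and leaves c candidates.
    return sum(c * c for c in groups.values()) / len(states)


def best_sense(states: Sequence[State], windows: Sequence[Window]) -> Window:
    """Pick the window that minimizes the expected remaining uncertainty."""
    return min(windows, key=lambda w: expected_remaining(states, w))
```

A window that separates the candidate states into many small groups scores low, so it is preferred over a window on which all candidates look identical.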

Alternative Selection Functions for Information Set Monte Carlo Tree Search

Acta Polytechnica ◽ 10.14311/ap.2014.54.0333 ◽ 2014 ◽ Vol 54 (5) ◽ pp. 333-340

Author(s): Viliam Lisy

Keyword(s): Monte Carlo ◽ Imperfect Information ◽ Search Algorithm ◽ Superior Performance ◽ Tree Search ◽ Monte Carlo Tree Search ◽ Information Set ◽ Imperfect Information Games ◽ Zero Sum ◽ Tree Search Algorithm

We evaluate the performance of various selection methods for the Monte Carlo Tree Search algorithm in two-player zero-sum extensive-form games with imperfect information. We compare the standard Upper Confidence Bounds applied to Trees (UCT) with the less common Exponential Weights for Exploration and Exploitation (Exp3) and the novel Regret Matching (RM) selection in two distinct imperfect information games: Imperfect Information Goofspiel and Phantom Tic-Tac-Toe. We show that UCT, after initial fast convergence towards a Nash equilibrium, computes increasingly worse strategies after some point in time. This is not the case with Exp3 and RM, which also show superior performance in head-to-head matches.
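
To make the difference between selection rules concrete, here is a minimal sketch of two of them at a single decision node: the UCB1 rule that underlies UCT, and a regret-matching rule. It is an illustration under assumptions, not the paper's implementation. The function names, the exploration constant `c`, and the uniform-mixing parameter `gamma` are all assumed, and the bookkeeping that updates counts, mean rewards, and cumulative regrets during search is omitted.

```python
import math
import random
from typing import List, Sequence


def ucb1_select(counts: Sequence[int], mean_rewards: Sequence[float],
                c: float = math.sqrt(2)) -> int:
    """Deterministic UCB1 choice: try each action once, then maximize
    mean reward plus an exploration bonus."""
    for action, n in enumerate(counts):
        if n == 0:
            return action
    total = sum(counts)
    return max(
        range(len(counts)),
        key=lambda a: mean_rewards[a] + c * math.sqrt(math.log(total) / counts[a]),
    )


def regret_matching_strategy(cum_regrets: Sequence[float],
                             gamma: float = 0.0) -> List[float]:
    """Mixed strategy proportional to positive cumulative regret,
    optionally blended with uniform exploration (gamma is an assumption)."""
    k = len(cum_regrets)
    positive = [max(r, 0.0) for r in cum_regrets]
    total = sum(positive)
    base = [p / total for p in positive] if total > 0 else [1.0 / k] * k
    return [(1.0 - gamma) * b + gamma / k for b in base]


def sample_action(strategy: Sequence[float], rng: random.Random) -> int:
    """Draw an action index according to the mixed strategy."""
    return rng.choices(range(len(strategy)), weights=strategy, k=1)[0]
```

The structural difference is that UCB1 returns a single action deterministically, while regret matching produces a probability distribution that is sampled, which is one reason randomized rules are natural candidates when the target is a mixed-strategy equilibrium.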

Deep Reinforcement Learning from Self-Play in No-limit Texas Hold'em Poker

Studia Universitatis Babeș-Bolyai Informatica ◽ 10.24193/subbi.2021.2.04 ◽ 2021 ◽ Vol 66 (2) ◽ pp. 51

Author(s): T.-V. Pricope

Keyword(s): Nash Equilibrium ◽ Imperfect Information ◽ Large Scale ◽ Adaptive Methods ◽ Single Step ◽ Random Factor ◽ Approximate Nash Equilibrium ◽ New Variant ◽ Imperfect Information Games ◽ Information Games

Imperfect information games describe many practical real-world applications, since the information space is rarely fully available. This class of problems is challenging because of the random factor, which causes even adaptive methods to fail to model the problem correctly and find the best solution. Neural Fictitious Self-Play (NFSP) is a powerful algorithm for learning an approximate Nash equilibrium of imperfect information games from self-play. However, it uses only crude data as input, and its most successful experiment was on the limit version of Texas Hold'em Poker. In this paper, we develop a new variant of NFSP that combines the established fictitious self-play with neural gradient play in an attempt to improve performance on large-scale zero-sum imperfect information games and to solve the more complex no-limit version of Texas Hold'em Poker, using powerful handcrafted metrics and heuristics alongside crude, raw data. When applied to no-limit Hold'em Poker, the agents trained through self-play outperformed those that used fictitious play with a normal-form single-step approach to the game. Moreover, we show that our algorithm converges close to a Nash equilibrium within the limited training process of our agents on very limited hardware. Finally, our best self-play-based agent learnt a strategy that rivals expert human play.
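
For readers unfamiliar with the fictitious play idea that NFSP builds on, the following is a minimal sketch of classical fictitious play on a small zero-sum matrix game. It is not NFSP and not the variant proposed in the paper: there are no neural networks, and the function name, the initial action choice, and the iteration count are assumptions made for illustration. Each player repeatedly best-responds to the opponent's empirical action frequencies, and those frequencies approach a Nash equilibrium in zero-sum games.

```python
from typing import List, Sequence, Tuple


def fictitious_play(payoff: Sequence[Sequence[float]],
                    iterations: int = 20000) -> Tuple[List[float], List[float]]:
    """Classical fictitious play for a two-player zero-sum matrix game.

    payoff[i][j] is the row player's payoff; the column player receives
    the negative. Returns the empirical action frequencies of both players.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols
    row_counts[0] = 1  # arbitrary first actions (an assumption)
    col_counts[0] = 1
    for _ in range(iterations):
        # Value of each action against the opponent's empirical play so far.
        row_values = [sum(payoff[i][j] * col_counts[j] for j in range(n_cols))
                      for i in range(n_rows)]
        col_values = [-sum(payoff[i][j] * row_counts[i] for i in range(n_rows))
                      for j in range(n_cols)]
        best_row = max(range(n_rows), key=row_values.__getitem__)
        best_col = max(range(n_cols), key=col_values.__getitem__)
        row_counts[best_row] += 1
        col_counts[best_col] += 1
    row_total, col_total = sum(row_counts), sum(col_counts)
    return ([c / row_total for c in row_counts],
            [c / col_total for c in col_counts])


# Rock-paper-scissors from the row player's point of view.
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]
```

Calling `fictitious_play(RPS)` yields frequencies near the uniform equilibrium of rock-paper-scissors. NFSP replaces the exact best response and the explicit averaging with learned approximations, which this sketch does not attempt.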

Limited lookahead in imperfect-information games

Artificial Intelligence ◽ 10.1016/j.artint.2019.103218 ◽ 2020 ◽ Vol 283 ◽ pp. 103218 ◽ Cited By ~ 2

Author(s): Christian Kroer ◽ Tuomas Sandholm

Keyword(s): Imperfect Information ◽ Imperfect Information Games ◽ Information Games

Imperfect Information Games

Microeconomic Theory - Springer Texts in Business and Economics ◽ 10.1007/978-981-13-0041-7_7 ◽ 2018 ◽ pp. 209-269

Author(s): Susheng Wang

Keyword(s): Imperfect Information ◽ Imperfect Information Games ◽ Information Games

Solving Imperfect-Information Games

SSRN Electronic Journal ◽ 10.2139/ssrn.3416010 ◽ 2019

Author(s): Deepanshu Vasal

Keyword(s): Imperfect Information ◽ Imperfect Information Games ◽ Information Games

A hybrid architecture for strategically complex imperfect information games

1999 Third International Conference on Knowledge-Based Intelligent Information Engineering Systems. Proceedings (Cat. No.99TH8410) ◽ 10.1109/kes.1999.820115 ◽ 2003

Author(s): A.E. Bud ◽ A.E. Nicholson ◽ I. Zukerman ◽ D.W. Albrecht

Keyword(s): Imperfect Information ◽ Hybrid Architecture ◽ Imperfect Information Games ◽ Information Games

A Decision Making Method Based on Society of Mind Theory in Multi-Player Imperfect Information Games

Deep Learning and Neural Networks ◽ 10.4018/978-1-7998-0414-7.ch019 ◽ 2020 ◽ pp. 317-329

Author(s): Mitsuo Wakatsuki ◽ Mari Fujimura ◽ Tetsuro Nishino

Keyword(s): Machine Learning ◽ Decision Making ◽ Imperfect Information ◽ Card Game ◽ Mind Theory ◽ Imperfect Information Games ◽ Game Players ◽ The University ◽ Society Of Mind ◽ Information Games

The authors study a card game called Daihinmin (Extreme Needy), which is a multi-player imperfect information game. Using Marvin Minsky's "Society of Mind" theory, they attempt to model the workings of the minds of game players. The UEC Computer Daihinmin Competitions (UECda) have been held at the University of Electro-Communications since 2006 to bring together client programs that act as Daihinmin players and to compare their strengths. In this paper, the authors extract the behavior of client programs from actual records of the computer Daihinmin competitions and propose a method of building a system that determines the parameters of Daihinmin agencies by machine learning.
