scholarly journals Heuristic Sensing: An Uncertainty Exploration Method in Imperfect Information Games

Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Zhenyang Guo ◽  
Xuan Wang ◽  
Shuhan Qi ◽  
Tao Qian ◽  
Jiajia Zhang

Imperfect information games have served as benchmarks and milestones in fields of artificial intelligence (AI) and game theory for decades. Sensing and exploiting information to effectively describe the game environment is of critical importance for game solving, besides computing or approximating an optimal strategy. Reconnaissance blind chess (RBC), a new variant of chess, is a quintessential game of imperfect information where the player’s actions are definitely unobserved by the opponent. This characteristic of RBC exponentially expands the scale of the information set and extremely invokes uncertainty of the game environment. In this paper, we introduce a novel sense method, Heuristic Search of Uncertainty Control (HSUC), to significantly reduce the uncertainty of real-time information set. The key idea of HSUC is to consider the whole uncertainty of the environment rather than predicting the opponents’ strategy. Furthermore, we realize a practical framework for RBC game that incorporates our HSUC method with Monte Carlo Tree Search (MCTS). In the experiments, HSUC has shown better effectiveness and robustness than comparison opponents in information sensing. It is worth mentioning that our RBC game agent has won the first place in terms of uncertainty management in NeurIPS 2019 RBC tournament.

2014 ◽  
Vol 54 (5) ◽  
pp. 333-340
Author(s):  
Viliam Lisy

We evaluate the performance of various selection methods for the Monte Carlo Tree Search algorithm in two-player zero-sum extensive-form games with imperfect information. We compare the standard Upper Confident Bounds applied to Trees (UCT) along with the less common Exponential Weights for Exploration and Exploitation (Exp3) and novel Regret matching (RM) selection in two distinct imperfect information games: Imperfect Information Goofspiel and Phantom Tic-Tac-Toe. We show that UCT after initial fast convergence towards a Nash equilibrium computes increasingly worse strategies after some point in time. This is not the case with Exp3 and RM, which also show superior performance in head-to-head matches.


2021 ◽  
Vol 66 (2) ◽  
pp. 51
Author(s):  
T.-V. Pricope

Imperfect information games describe many practical applications found in the real world as the information space is rarely fully available. This particular set of problems is challenging due to the random factor that makes even adaptive methods fail to correctly model the problem and find the best solution. Neural Fictitious Self Play (NFSP) is a powerful algorithm for learning approximate Nash equilibrium of imperfect information games from self-play. However, it uses only crude data as input and its most successful experiment was on the in-limit version of Texas Hold’em Poker. In this paper, we develop a new variant of NFSP that combines the established fictitious self-play with neural gradient play in an attempt to improve the performance on large-scale zero-sum imperfect information games and to solve the more complex no-limit version of Texas Hold’em Poker using powerful handcrafted metrics and heuristics alongside crude, raw data. When applied to no-limit Hold’em Poker, the agents trained through self-play outperformed the ones that used fictitious play with a normal-form single-step approach to the game. Moreover, we showed that our algorithm converges close to a Nash equilibrium within the limited training process of our agents with very limited hardware. Finally, our best self-play-based agent learnt a strategy that rivals expert human level.  


2020 ◽  
Vol 283 ◽  
pp. 103218 ◽  
Author(s):  
Christian Kroer ◽  
Tuomas Sandholm

Author(s):  
Mitsuo Wakatsuki ◽  
Mari Fujimura ◽  
Tetsuro Nishino

The authors are concerned with a card game called Daihinmin (Extreme Needy), which is a multi-player imperfect information game. Using Marvin Minsky's “Society of Mind” theory, they attempt to model the workings of the minds of game players. The UEC Computer Daihinmin Competitions (UECda) have been held at the University of Electro-Communications since 2006, to bring together competitive client programs that correspond to players of Daihinmin, and contest their strengths. In this paper, the authors extract the behavior of client programs from actual competition records of the computer Daihinmin, and propose a method of building a system that determines the parameters of Daihinmin agencies by machine learning.


Author(s):  
Darse Billings ◽  
Aaron Davidson ◽  
Terence Schauenberg ◽  
Neil Burch ◽  
Michael Bowling ◽  
...  

Sign in / Sign up

Export Citation Format

Share Document