Imperfect Information Games
Recently Published Documents

Total documents: 36 (five years: 16)
H-index: 7 (five years: 1)
2021 · Vol. 66 (2) · pp. 51
Author(s): T.-V. Pricope

Imperfect information games describe many practical real-world applications, since the full information space is rarely available. This class of problems is challenging because stochasticity can cause even adaptive methods to model the problem incorrectly and miss the best solution. Neural Fictitious Self-Play (NFSP) is a powerful algorithm for learning an approximate Nash equilibrium of imperfect information games from self-play. However, it uses only raw data as input, and its most successful experiment was on the limit version of Texas Hold’em Poker. In this paper, we develop a new variant of NFSP that combines established fictitious self-play with neural gradient play, aiming to improve performance on large-scale zero-sum imperfect information games and to tackle the more complex no-limit version of Texas Hold’em Poker using handcrafted metrics and heuristics alongside raw data. When applied to no-limit Hold’em Poker, the agents trained through self-play outperformed those that used fictitious play with a normal-form, single-step view of the game. Moreover, we show that our algorithm converges close to a Nash equilibrium within a limited training budget on very modest hardware. Finally, our best self-play-based agent learned a strategy that rivals expert human play.
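For context, standard NFSP maintains two networks per player: a best-response network trained by off-policy reinforcement learning and an average-policy network trained by supervised learning on the agent's own best-response actions. The sketch below illustrates only that action-selection step, with illustrative network sizes and an assumed anticipatory parameter; it is not the authors' variant, which additionally incorporates neural gradient play and handcrafted poker features.

```python
import random
import numpy as np
import torch
import torch.nn as nn

class NFSPAgent:
    """Minimal sketch of standard NFSP action selection (not the paper's exact variant).

    Two networks per agent:
      - br_net:  approximate best response, trained off-policy (e.g. DQN) on RL transitions
      - avg_net: average policy, trained by supervised learning on the agent's own
                 best-response actions
    """

    def __init__(self, state_dim, num_actions, anticipatory=0.1, epsilon=0.06):
        self.num_actions = num_actions
        self.anticipatory = anticipatory  # probability of acting with the best response
        self.epsilon = epsilon            # epsilon-greedy exploration for the best response
        self.br_net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                    nn.Linear(64, num_actions))   # Q-values
        self.avg_net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                                     nn.Linear(64, num_actions))  # policy logits
        self.rl_buffer = []   # (s, a, r, s', done) transitions for the RL update
        self.sl_buffer = []   # (s, a) pairs for the supervised average-policy update

    def act(self, state):
        state_t = torch.as_tensor(state, dtype=torch.float32)
        if random.random() < self.anticipatory:
            # Play the (epsilon-greedy) approximate best response.
            if random.random() < self.epsilon:
                action = random.randrange(self.num_actions)
            else:
                action = int(self.br_net(state_t).argmax().item())
            # Only best-response actions supervise the average policy.
            self.sl_buffer.append((state, action))
        else:
            # Play the average policy, which approximates the fictitious-play average.
            probs = torch.softmax(self.avg_net(state_t), dim=-1).detach().numpy()
            action = int(np.random.choice(self.num_actions, p=probs))
        return action
```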


2021 · Vol. 231 · pp. 107434
Author(s): Huale Li, Xuan Wang, Kunchi Li, Fengwei Jia, Yulin Wu, ...

Electronics · 2021 · Vol. 10 (17) · pp. 2087
Author(s): Jiahui Xu, Jing Chen, Shaofei Chen

In the development of artificial intelligence (AI), games have often served as benchmarks that drive remarkable breakthroughs in models and algorithms. No-limit Texas Hold’em (NLTH) is one of the most popular and challenging poker games. Despite numerous studies on the subject, important problems remain unsolved, such as opponent exploitation, i.e., adaptively and effectively exploiting a specific opponent’s strategy; this is acknowledged as a vital issue in NLTH and in many real-world scenarios. Previous researchers tried to use off-policy reinforcement learning (RL) to train agents that learn directly from historical strategy interactions, but suffered from sparse rewards. Others adopted neuroevolution (NE) in place of RL for policy parameter updates, but suffered from high sample complexity due to the large scale of NLTH. In this work, we propose NE_RL, a novel method combining NE with RL for opponent exploitation in NLTH. Our method is a hybrid framework that uses NE’s evolutionary computation with a long-term fitness metric to cope with the sparse reward feedback of NLTH, while retaining RL’s gradient-based updates for higher learning efficiency. Experimental results against multiple baseline opponents demonstrate the feasibility of our method, with significant improvements over previous methods. We hope this paper provides an effective new approach for opponent exploitation in NLTH and other large-scale imperfect information games.
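To make the hybrid idea concrete, the sketch below shows one generic way to interleave an evolutionary outer loop (long-term fitness, parameter-noise mutation) with gradient-based refinement of elite policies. All function names and hyperparameters are illustrative assumptions, not the paper's NE_RL implementation.

```python
import copy
import numpy as np
import torch

def hybrid_ne_rl(make_policy, evaluate, rl_update, population_size=20,
                 elite_frac=0.25, noise_std=0.02, generations=50):
    """Sketch of one way to hybridise neuroevolution with gradient-based RL.

    make_policy: () -> nn.Module     fresh policy network
    evaluate:    policy -> float     long-term fitness (e.g. mean chips won over many
                                     hands), which sidesteps per-step sparse rewards
    rl_update:   policy -> None      in-place gradient update from recent episodes
    """
    population = [make_policy() for _ in range(population_size)]
    num_elites = max(1, int(elite_frac * population_size))

    for _ in range(generations):
        fitness = np.array([evaluate(p) for p in population])
        elite_idx = fitness.argsort()[::-1][:num_elites]
        elites = [population[i] for i in elite_idx]

        # RL step: refine the elites with policy gradients for sample efficiency.
        for policy in elites:
            rl_update(policy)

        # NE step: repopulate by mutating elites with Gaussian parameter noise.
        children = []
        while len(children) < population_size - num_elites:
            child = copy.deepcopy(elites[np.random.randint(num_elites)])
            with torch.no_grad():
                for param in child.parameters():
                    param.add_(noise_std * torch.randn_like(param))
            children.append(child)
        population = elites + children

    return max(population, key=evaluate)
```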


2021 · Vol. 294 · pp. 103453
Author(s): Guifei Jiang, Dongmo Zhang, Laurent Perrussel, Heng Zhang

2021 · pp. 1-1
Author(s): Rinu Boney, Alexander Ilin, Juho Kannala, Jarno Seppanen

2020 · Vol. 65 (2) · pp. 31
Author(s): T.V. Pricope

Many real-world applications can be described as large-scale games of imperfect information. This kind of game is considerably harder than its deterministic counterpart because the search space is even larger. In this paper, I explore the power of reinforcement learning in such an environment by studying one of the most popular games of this type, no-limit Texas Hold’em Poker, which remains unsolved, developing multiple agents with different learning paradigms and techniques and then comparing their respective performances. When applied to no-limit Hold’em Poker, deep reinforcement learning agents clearly outperform agents with a more traditional approach. Moreover, while the latter rival a beginner human level of play, the reinforcement learning agents compare to an amateur human player. The main algorithm uses fictitious play in combination with artificial neural networks and some handcrafted metrics. We also applied the main algorithm to another, less complex game of imperfect information in order to show the scalability of this solution and the performance gain when compared head to head with established classical approaches from the reinforcement learning literature.
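As a reference point for the fictitious-play component, the sketch below runs classical, table-based fictitious play on a small normal-form game (rock-paper-scissors); the paper's agents replace the explicit best-response tables with neural networks and handcrafted features.

```python
import numpy as np

def fictitious_play(payoff_matrix, iterations=10000):
    """Classical fictitious play for a two-player zero-sum normal-form game.

    payoff_matrix[i, j] is the row player's payoff when row plays i and column plays j
    (the column player receives the negation). In zero-sum games the empirical action
    frequencies of both players converge to a Nash equilibrium.
    """
    num_rows, num_cols = payoff_matrix.shape
    row_counts = np.ones(num_rows)   # how often the row player has chosen each action
    col_counts = np.ones(num_cols)   # how often the column player has chosen each action

    for _ in range(iterations):
        # Each player best-responds to the opponent's empirical mixed strategy.
        col_strategy = col_counts / col_counts.sum()
        row_strategy = row_counts / row_counts.sum()
        row_action = int(np.argmax(payoff_matrix @ col_strategy))
        col_action = int(np.argmin(row_strategy @ payoff_matrix))
        row_counts[row_action] += 1
        col_counts[col_action] += 1

    return row_counts / row_counts.sum(), col_counts / col_counts.sum()

# Rock-paper-scissors: both averages approach the uniform (1/3, 1/3, 1/3) equilibrium.
rps = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]], dtype=float)
print(fictitious_play(rps))
```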


Complexity · 2020 · Vol. 2020 · pp. 1-9
Author(s): Zhenyang Guo, Xuan Wang, Shuhan Qi, Tao Qian, Jiajia Zhang

Imperfect information games have served as benchmarks and milestones in artificial intelligence (AI) and game theory for decades. Besides computing or approximating an optimal strategy, sensing and exploiting information to effectively describe the game environment is of critical importance for game solving. Reconnaissance blind chess (RBC), a new variant of chess, is a quintessential game of imperfect information in which a player’s actions are entirely unobserved by the opponent. This characteristic exponentially expands the size of the information set and greatly increases the uncertainty of the game environment. In this paper, we introduce a novel sensing method, Heuristic Search of Uncertainty Control (HSUC), which significantly reduces the uncertainty of the real-time information set. The key idea of HSUC is to consider the overall uncertainty of the environment rather than predicting the opponent’s strategy. Furthermore, we realize a practical framework for the RBC game that combines our HSUC method with Monte Carlo Tree Search (MCTS). In the experiments, HSUC shows better effectiveness and robustness in information sensing than the baseline opponents. It is worth mentioning that our RBC agent won first place in terms of uncertainty management in the NeurIPS 2019 RBC tournament.
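The uncertainty-driven sensing idea can be sketched as follows: keep the set of board states consistent with play so far and pick the sensing location that is expected to shrink that set the most. This is a generic illustration under a uniform-likelihood assumption, not the authors' HSUC implementation.

```python
from collections import defaultdict

def choose_sense_square(hypotheses, sense_squares, observe):
    """Uncertainty-driven sensing sketch in the spirit of HSUC (not the authors' code).

    hypotheses:    list of candidate opponent board states consistent with play so far
    sense_squares: candidate sensing locations (e.g. centres of 3x3 windows in RBC)
    observe(board, square): the (hashable) observation a sense at `square` would
                            return on `board`

    Assuming every hypothesis is equally likely, the expected number of hypotheses
    remaining after sensing `square` is  sum_o |group_o|^2 / N,  where the groups
    partition the hypotheses by the observation they would produce. We pick the square
    that minimises this expectation, i.e. the one that shrinks the information set the
    most, instead of trying to predict the opponent's strategy.
    """
    total = len(hypotheses)
    best_square, best_expected = None, float("inf")
    for square in sense_squares:
        groups = defaultdict(int)
        for board in hypotheses:
            groups[observe(board, square)] += 1
        expected_remaining = sum(count * count for count in groups.values()) / total
        if expected_remaining < best_expected:
            best_square, best_expected = square, expected_remaining
    return best_square
```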


Author(s): Daochen Zha, Kwei-Herng Lai, Songyi Huang, Yuanpu Cao, Keerthana Reddy, ...

We present RLCard, a Python platform for reinforcement learning research and development in card games. RLCard supports various card environments and several baseline algorithms with unified, easy-to-use interfaces, aiming to bridge reinforcement learning and imperfect information games. The platform provides flexible configuration of state representation, action encoding, and reward design. RLCard also supports visualizations for algorithm debugging. In this demo, we showcase two representative environments and their visualization results. We conclude the demo with the challenges and research opportunities opened up by RLCard. A video is available on YouTube.
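A minimal usage sketch, based on RLCard's documented interface (the environment id, agent class, and attribute names may differ between library versions):

```python
import rlcard
from rlcard.agents import RandomAgent

# Create a card-game environment through RLCard's unified interface.
env = rlcard.make('leduc-holdem')

# Attach one baseline (random) agent per seat.
env.set_agents([RandomAgent(num_actions=env.num_actions)
                for _ in range(env.num_players)])

# Play one full game: trajectories hold per-player state/action sequences,
# payoffs hold each player's final reward.
trajectories, payoffs = env.run(is_training=False)
print(payoffs)
```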


2020 · Vol. 283 · pp. 103218
Author(s): Christian Kroer, Tuomas Sandholm
