Monte Carlo Tree Search for Bayesian Reinforcement Learning

2013 ◽
Vol 39 (2) ◽
pp. 345-353 ◽
Author(s):
Ngo Anh Vien ◽
Wolfgang Ertel ◽
Viet-Hung Dang ◽
TaeChoong Chung

2020 ◽  
Vol 11 (40) ◽  
pp. 10959-10972
Author(s):  
Xiaoxue Wang ◽  
Yujie Qian ◽  
Hanyu Gao ◽  
Connor W. Coley ◽  
Yiming Mo ◽  
...  

A new MCTS variant with a reinforcement learning value network and solvent prediction model proposes shorter synthesis routes with greener solvents.


2021 ◽  
Vol 5 (CHI PLAY) ◽  
pp. 1-17
Author(s):  
Shaghayegh Roohi ◽  
Christian Guckelsberger ◽  
Asko Relas ◽  
Henri Heiskanen ◽  
Jari Takatalo ◽  
...  

This paper presents a novel approach to automated playtesting for the prediction of human player behavior and experience. We have previously demonstrated that Deep Reinforcement Learning (DRL) game-playing agents can predict both game difficulty and player engagement, operationalized as average pass and churn rates. We improve this approach by enhancing DRL with Monte Carlo Tree Search (MCTS). We also motivate an enhanced selection strategy for predictor features, based on the observation that an AI agent's best-case performance can yield stronger correlations with human data than the agent's average performance. Both additions consistently improve the prediction accuracy, and the DRL-enhanced MCTS outperforms both DRL and vanilla MCTS in the hardest levels. We conclude that player modelling via automated playtesting can benefit from combining DRL and MCTS. Moreover, it can be worthwhile to investigate a subset of repeated best AI agent runs, if AI gameplay does not yield good predictions on average.
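Below is a minimal sketch (not the authors' code) of the feature-selection idea described above: comparing how well the agent's average performance versus its best run out of k repeats correlates with human data. The per-level pass rates here are synthetic placeholders, purely to show how the two predictor features would be computed and compared.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)

num_levels, runs_per_level = 40, 20
# Hypothetical data: per-level AI agent pass rates across repeated runs, and human pass rates.
agent_runs = rng.uniform(0.0, 1.0, size=(num_levels, runs_per_level))
human_pass_rate = rng.uniform(0.0, 1.0, size=num_levels)

avg_performance = agent_runs.mean(axis=1)     # average-case predictor feature
best_performance = agent_runs.max(axis=1)     # best-case predictor feature (best of k repeated runs)

r_avg, _ = pearsonr(avg_performance, human_pass_rate)
r_best, _ = pearsonr(best_performance, human_pass_rate)
print(f"Pearson r with human pass rate: average = {r_avg:.2f}, best-case = {r_best:.2f}")
```

With real playtesting data, the paper's observation would show up as r_best exceeding r_avg on the hardest levels; the random data above only demonstrates the computation.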


2013 ◽  
Vol 48 ◽  
pp. 841-883 ◽  
Author(s):  
A. Guez ◽  
D. Silver ◽  
P. Dayan

Bayesian planning is a formally elegant approach to learning optimal behaviour under model uncertainty, trading off exploration and exploitation in an ideal way. Unfortunately, planning optimally in the face of uncertainty is notoriously taxing, since the search space is enormous. In this paper we introduce a tractable, sample-based method for approximate Bayes-optimal planning which exploits Monte Carlo Tree Search. Our approach avoids expensive applications of Bayes' rule within the search tree by sampling models from current beliefs, and furthermore performs this sampling in a lazy manner. This enables it to outperform previous Bayesian model-based reinforcement learning algorithms by a significant margin on several well-known benchmark problems. As we show, our approach can even work in problems with an infinite state space that lie qualitatively out of reach of almost all previous work in Bayesian exploration.
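The following is a minimal sketch of the root-sampling idea described in the abstract, not the authors' BAMCP implementation: instead of applying Bayes' rule at every tree node, each simulation samples one MDP model from the current posterior at the root and reuses it for the whole rollout. The posterior object (sample_mdp) and the mdp.step interface are illustrative assumptions, and the lazy sampling and separate expansion/rollout phases of the full algorithm are omitted.

```python
import math
import random
from collections import defaultdict


class RootSamplingMCTS:
    def __init__(self, posterior, actions, horizon=50, c_uct=1.4, gamma=0.95):
        self.posterior = posterior      # assumed to provide sample_mdp() -> model with step(state, action)
        self.actions = actions
        self.horizon = horizon
        self.c_uct = c_uct
        self.gamma = gamma
        self.N = defaultdict(int)       # visit counts per (history, action)
        self.Q = defaultdict(float)     # action-value estimates per (history, action)

    def plan(self, state, num_simulations=1000):
        for _ in range(num_simulations):
            mdp = self.posterior.sample_mdp()            # one sampled model per simulation (root sampling)
            self._simulate(state, history=(), mdp=mdp, depth=0)
        return max(self.actions, key=lambda a: self.Q[((), a)])

    def _simulate(self, state, history, mdp, depth):
        if depth >= self.horizon:
            return 0.0
        action = self._select(history)
        next_state, reward = mdp.step(state, action)     # transitions come from the sampled model only
        ret = reward + self.gamma * self._simulate(next_state, history + (action,), mdp, depth + 1)
        key = (history, action)
        self.N[key] += 1
        self.Q[key] += (ret - self.Q[key]) / self.N[key]  # incremental mean of returns
        return ret

    def _select(self, history):
        total = sum(self.N[(history, a)] for a in self.actions)
        if total == 0:
            return random.choice(self.actions)           # unvisited node: act uniformly (stand-in for rollouts)
        # UCB1 over actions at this history node
        return max(
            self.actions,
            key=lambda a: self.Q[(history, a)]
            + self.c_uct * math.sqrt(math.log(total + 1) / (self.N[(history, a)] + 1)),
        )
```

The key point the sketch illustrates is that belief updates never appear inside the tree: all uncertainty enters through which model is drawn at the root of each simulation.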

