Scalable and Efficient Bayes-Adaptive Reinforcement Learning Based on Monte-Carlo Tree Search

2013 ◽  
Vol 48 ◽  
pp. 841-883 ◽  
Author(s):  
A. Guez ◽  
D. Silver ◽  
P. Dayan

Bayesian planning is a formally elegant approach to learning optimal behaviour under model uncertainty, trading off exploration and exploitation in an ideal way. Unfortunately, planning optimally in the face of uncertainty is notoriously taxing, since the search space is enormous. In this paper we introduce a tractable, sample-based method for approximate Bayes-optimal planning which exploits Monte-Carlo tree search. Our approach avoids expensive applications of Bayes rule within the search tree by sampling models from current beliefs, and furthermore performs this sampling in a lazy manner. This enables it to outperform previous Bayesian model-based reinforcement learning algorithms by a significant margin on several well-known benchmark problems. As we show, our approach can even work in problems with an infinite state space that lie qualitatively out of reach of almost all previous work in Bayesian exploration.
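
To make the root-sampling idea concrete, here is a minimal Python sketch. It is a simplification of the paper's BAMCP (which plans in the augmented history space; here states are used directly), and `belief.actions`, `belief.sample_mdp()`, and the sampled model's `step(s, a) -> (next_state, reward)` interface are assumptions for illustration.

```python
import math
from collections import defaultdict

def bamcp_search(belief, root_state, n_sims=10000, depth=15, gamma=0.95, c=1.0):
    """Approximate Bayes-optimal action selection via root sampling."""
    N = defaultdict(int)      # visit count per (state, action)
    Ns = defaultdict(int)     # visit count per state
    Q = defaultdict(float)    # running action-value estimates
    actions = belief.actions

    def simulate(mdp, s, d):
        if d == 0:
            return 0.0
        def ucb(a):  # UCB1: untried actions get infinite priority
            if N[(s, a)] == 0:
                return float('inf')
            return Q[(s, a)] + c * math.sqrt(math.log(Ns[s]) / N[(s, a)])
        a = max(actions, key=ucb)
        s2, r = mdp.step(s, a)                      # transition in the *sampled* model
        g = r + gamma * simulate(mdp, s2, d - 1)    # discounted return of this rollout
        Ns[s] += 1
        N[(s, a)] += 1
        Q[(s, a)] += (g - Q[(s, a)]) / N[(s, a)]    # incremental mean update
        return g

    for _ in range(n_sims):
        # One posterior sample per simulation: Bayes' rule is never applied inside the tree.
        simulate(belief.sample_mdp(), root_state, depth)
    return max(actions, key=lambda a: Q[(root_state, a)])
```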

Author(s):  
Feng Wu ◽  
Sarvapali D. Ramchurn

We propose a novel algorithm based on Monte-Carlo tree search for the problem of coalition structure generation (CSG). Specifically, we find the optimal solution by sampling the coalition structure graph and incrementally expanding a search tree, which represents the partial space that has been searched. We prove that our algorithm is complete and converges to the optimal solution given a sufficient number of iterations. Moreover, it is anytime and can scale to large CSG problems with many agents. Experimental results on six common CSG benchmark problems and a disaster response domain confirm the advantages of our approach compared to state-of-the-art methods.
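
A minimal sketch of this style of search, assuming a characteristic function `value(coalition)`; the paper's exact expansion and pruning differ, so this only illustrates anytime UCT over partial coalition structures, where each child splits one coalition off the remaining agents.

```python
import math
import random
from itertools import combinations

def splits(remaining):
    """Coalitions containing the first remaining agent (avoids duplicate partitions)."""
    first, rest = remaining[0], remaining[1:]
    for k in range(len(rest) + 1):
        for combo in combinations(rest, k):
            yield (first,) + combo

class Node:
    def __init__(self, remaining, parent=None, coalition=None):
        self.remaining, self.parent, self.coalition = remaining, parent, coalition
        self.children = []
        self.untried = list(splits(remaining)) if remaining else []
        self.n, self.total = 0, 0.0

def mcts_csg(agents, value, n_iter=20000, c=1.4):
    """Anytime MCTS for CSG; returns the best coalition structure found so far."""
    root = Node(tuple(agents))
    best_cs, best_val = None, float('-inf')
    for _ in range(n_iter):
        node, cs = root, []
        while not node.untried and node.children:          # selection (UCT)
            node = max(node.children, key=lambda ch:
                       ch.total / ch.n + c * math.sqrt(math.log(node.n) / ch.n))
            cs.append(node.coalition)
        if node.untried:                                   # expansion
            coal = node.untried.pop(random.randrange(len(node.untried)))
            child = Node(tuple(a for a in node.remaining if a not in coal), node, coal)
            node.children.append(child)
            node, cs = child, cs + [coal]
        rest = list(node.remaining)                        # random rollout
        random.shuffle(rest)
        rollout = []
        while rest:
            k = random.randint(1, len(rest))
            rollout.append(tuple(rest[:k]))
            rest = rest[k:]
        reward = sum(value(coal) for coal in cs + rollout)
        if reward > best_val:
            best_val, best_cs = reward, cs + rollout
        while node:                                        # backpropagation
            node.n += 1
            node.total += reward
            node = node.parent
    return best_cs, best_val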


2020 ◽  
Vol 11 (40) ◽  
pp. 10959-10972
Author(s):  
Xiaoxue Wang ◽  
Yujie Qian ◽  
Hanyu Gao ◽  
Connor W. Coley ◽  
Yiming Mo ◽  
...  

A new MCTS variant with a reinforcement learning value network and solvent prediction model proposes shorter synthesis routes with greener solvents.
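
As a rough illustration of how such a variant might score search nodes, the sketch below replaces a random rollout with a learned value estimate plus a solvent-greenness bonus; the models, their interfaces, and the weighting are hypothetical, not the paper's implementation.

```python
def evaluate(node, value_net, solvent_model, alpha=0.1):
    """Hypothetical leaf evaluation for retrosynthetic MCTS: a learned value
    network scores how promising the partial route is, and a solvent predictor
    adds a greenness bonus. `value_net`, `solvent_model`, and `alpha` are
    illustrative assumptions."""
    route_value = value_net(node.state)      # learned estimate of route quality
    greenness = solvent_model(node.state)    # e.g. share of steps with green solvents
    return route_value + alpha * greenness   # guides selection and backpropagation
```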


2021 ◽  
Vol 15 (1) ◽  
pp. 46-58
Author(s):  
Xuanhe Zhou ◽  
Guoliang Li ◽  
Chengliang Chai ◽  
Jianhua Feng

Query rewrite transforms a SQL query into an equivalent one with higher performance. However, SQL rewrite is an NP-hard problem, and existing approaches adopt heuristics to rewrite the queries. These heuristics have two main limitations. First, the order in which different rewrite rules are applied significantly affects query performance, but the search space of all possible rewrite orders grows exponentially with the number of query operators and rules, making the optimal order hard to find. Existing methods apply a pre-defined order to rewrite queries and can fall into a local optimum. Second, different rewrite rules have different benefits for different queries. Existing methods work on single plans and cannot effectively estimate the benefit of rewriting a query. To address these challenges, we propose a policy-tree-based query rewrite framework, where the root is the input query and each node is a query rewritten from its parent. We aim to explore the nodes of the policy tree to find the optimal rewritten query. We use Monte Carlo Tree Search to explore the policy tree, navigating it efficiently toward the optimal node. Moreover, we propose a learning-based model that estimates the expected performance improvement of each rewritten query, which guides the tree search more accurately. We also propose a parallel algorithm that explores the policy tree in parallel to improve performance. Experimental results show that our method significantly outperforms existing approaches.
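
The following sketch illustrates the overall loop, assuming `rules` is a list of functions mapping a query to a rewritten query (or `None` when a rule does not apply) and `estimate_gain` stands in for the learned benefit model; it is an illustration, not the paper's implementation.

```python
import math
import random

class RewriteNode:
    """Policy-tree node: a query produced by applying one rewrite rule to its parent."""
    def __init__(self, query, parent=None):
        self.query, self.parent = query, parent
        self.children, self.n, self.total = [], 0, 0.0

def policy_tree_search(query, rules, estimate_gain, n_iter=500, c=1.2):
    root = RewriteNode(query)
    best, best_gain = query, estimate_gain(query)
    for _ in range(n_iter):
        node = root
        while node.children:                          # selection (UCT)
            node = max(node.children,
                       key=lambda ch: float('inf') if ch.n == 0 else
                       ch.total / ch.n + c * math.sqrt(math.log(node.n) / ch.n))
        for rule in rules:                            # expansion: apply each applicable rule once
            rewritten = rule(node.query)
            if rewritten is not None:
                node.children.append(RewriteNode(rewritten, node))
        leaf = random.choice(node.children) if node.children else node
        gain = estimate_gain(leaf.query)              # learned estimate instead of a rollout
        if gain > best_gain:
            best_gain, best = gain, leaf.query
        while leaf is not None:                       # backpropagation
            leaf.n += 1
            leaf.total += gain
            leaf = leaf.parent
    return best
```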


Author(s):  
Christian Roberson ◽  
Katarina Sperduto

Artificial intelligence in games serves as an excellent platform for facilitating collaborative research with undergraduates. This paper explores several aspects of a research challenge proposed for a newly developed variant of a solitaire game. We present multiple classes of game states that can be identified as solvable or unsolvable. We present a heuristic for quickly finding goal states in a game-state search tree. Finally, we introduce a Monte Carlo Tree Search-based player for the solitaire variant that can efficiently win almost any solvable starting deal.


2021 ◽  
Vol 5 (CHI PLAY) ◽  
pp. 1-17
Author(s):  
Shaghayegh Roohi ◽  
Christian Guckelsberger ◽  
Asko Relas ◽  
Henri Heiskanen ◽  
Jari Takatalo ◽  
...  

This paper presents a novel approach to automated playtesting for the prediction of human player behavior and experience. We have previously demonstrated that Deep Reinforcement Learning (DRL) game-playing agents can predict both game difficulty and player engagement, operationalized as average pass and churn rates. We improve this approach by enhancing DRL with Monte Carlo Tree Search (MCTS). We also motivate an enhanced selection strategy for predictor features, based on the observation that an AI agent's best-case performance can yield stronger correlations with human data than the agent's average performance. Both additions consistently improve the prediction accuracy, and the DRL-enhanced MCTS outperforms both DRL and vanilla MCTS in the hardest levels. We conclude that player modelling via automated playtesting can benefit from combining DRL and MCTS. Moreover, it can be worthwhile to investigate a subset of repeated best AI agent runs, if AI gameplay does not yield good predictions on average.
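
A small sketch of the feature-selection idea, assuming `run_scores` holds one score per agent run per level: compare the mean over all runs against the mean over the top-k ("best-case") runs when correlating agent performance with human pass and churn data.

```python
import numpy as np

def agent_features(run_scores, k=5):
    """Per-level predictor features from repeated agent runs.
    `run_scores` is an assumed array of shape (n_levels, n_runs), n_runs >= k."""
    run_scores = np.asarray(run_scores, dtype=float)
    average = run_scores.mean(axis=1)                              # average-performance feature
    best_case = np.sort(run_scores, axis=1)[:, -k:].mean(axis=1)   # best-case (top-k) feature
    return average, best_case
```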


2013 ◽  
Vol 22 (01) ◽  
pp. 1250035 ◽  
Author(s):  
TRISTAN CAZENAVE

Monte-Carlo Tree Search is a general search algorithm that gives good results in games. Genetic Programming evaluates and combines trees to discover expressions that maximize a given fitness function. In this paper, Monte-Carlo Tree Search is used to generate expressions that are evaluated in the same way as in Genetic Programming. Monte-Carlo Tree Search is adapted to search expression trees rather than lists of moves. We compare Nested Monte-Carlo Search to UCT (Upper Confidence Bounds for Trees) on various problems. Monte-Carlo Tree Search achieves state-of-the-art results on multiple benchmark problems. The proposed approach is simple to program, does not suffer from expression growth, has a natural restart strategy to avoid local optima, and is extremely easy to parallelize.
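
A compact sketch of Nested Monte-Carlo Search over prefix expressions, assuming a toy primitive set and a user-supplied `fitness` evaluator of a complete prefix sequence; the paper's encoding of expression trees may differ.

```python
import random

OPS = ['+', '*']     # assumed binary primitives, for illustration only
TERMS = ['x', '1']   # assumed terminal symbols

def playout(seq, open_slots):
    """Level-0 search: complete the prefix expression with random symbols,
    forcing terminals once the sequence gets long so the playout terminates."""
    seq = list(seq)
    while open_slots > 0:
        sym = random.choice(TERMS if len(seq) > 20 else OPS + TERMS)
        seq.append(sym)
        open_slots += 1 if sym in OPS else -1   # a binary op opens one extra slot
    return seq

def nested(level, seq, open_slots, fitness):
    """Nested Monte-Carlo Search (level >= 1): score each candidate symbol with
    a level-(n-1) search, then commit to the next symbol of the best complete
    expression seen so far."""
    best_seq, best_val = None, float('-inf')
    while open_slots > 0:
        for sym in OPS + TERMS:
            slots = open_slots + (1 if sym in OPS else -1)
            cand = (playout(seq + [sym], slots) if level == 1
                    else nested(level - 1, seq + [sym], slots, fitness)[0])
            val = fitness(cand)
            if val > best_val:
                best_val, best_seq = val, cand
        seq = seq + [best_seq[len(seq)]]        # follow the best expression one step
        open_slots += 1 if seq[-1] in OPS else -1
    return best_seq, best_val
```

For example, `nested(2, [], 1, fitness)` runs a level-2 search from the empty expression with one open slot, returning the best complete expression found and its fitness.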

