Scalable and Efficient Bayes-Adaptive Reinforcement Learning Based on Monte-Carlo Tree Search

2013 ◽  
Vol 48 ◽  
pp. 841-883 ◽  
Author(s):  
A. Guez ◽  
D. Silver ◽  
P. Dayan

Bayesian planning is a formally elegant approach to learning optimal behaviour under model uncertainty, trading off exploration and exploitation in an ideal way. Unfortunately, planning optimally in the face of uncertainty is notoriously taxing, since the search space is enormous. In this paper we introduce a tractable, sample-based method for approximate Bayes-optimal planning which exploits Monte-Carlo tree search. Our approach avoids expensive applications of Bayes rule within the search tree by sampling models from current beliefs, and furthermore performs this sampling in a lazy manner. This enables it to outperform previous Bayesian model-based reinforcement learning algorithms by a significant margin on several well-known benchmark problems. As we show, our approach can even work in problems with an infinite state space that lie qualitatively out of reach of almost all previous work in Bayesian exploration.
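
To make the root-sampling idea concrete, here is a minimal Python sketch. It is a simplification of the paper's BAMCP (which plans in the augmented history space; here states are used directly), and `belief.actions`, `belief.sample_mdp()`, and the sampled model's `step(s, a) -> (next_state, reward)` interface are assumptions for illustration.

```python
import math
from collections import defaultdict

def bamcp_search(belief, root_state, n_sims=10000, depth=15, gamma=0.95, c=1.0):
    """Approximate Bayes-optimal action selection via root sampling."""
    N = defaultdict(int)      # visit count per (state, action)
    Ns = defaultdict(int)     # visit count per state
    Q = defaultdict(float)    # running action-value estimates
    actions = belief.actions

    def simulate(mdp, s, d):
        if d == 0:
            return 0.0
        def ucb(a):  # UCB1: untried actions get infinite priority
            if N[(s, a)] == 0:
                return float('inf')
            return Q[(s, a)] + c * math.sqrt(math.log(Ns[s]) / N[(s, a)])
        a = max(actions, key=ucb)
        s2, r = mdp.step(s, a)                      # transition in the *sampled* model
        g = r + gamma * simulate(mdp, s2, d - 1)    # discounted return of this rollout
        Ns[s] += 1
        N[(s, a)] += 1
        Q[(s, a)] += (g - Q[(s, a)]) / N[(s, a)]    # incremental mean update
        return g

    for _ in range(n_sims):
        # One posterior sample per simulation: Bayes' rule is never applied inside the tree.
        simulate(belief.sample_mdp(), root_state, depth)
    return max(actions, key=lambda a: Q[(root_state, a)])
```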

Author(s):  
Feng Wu ◽  
Sarvapali D. Ramchurn

We propose a novel algorithm based on Monte-Carlo tree search for the problem of coalition structure generation (CSG). Specifically, we find the optimal solution by sampling the coalition structure graph and incrementally expanding a search tree, which represents the partial space that has been searched. We prove that our algorithm is complete and converges to the optimal solution given a sufficient number of iterations. Moreover, it is anytime and can scale to large CSG problems with many agents. Experimental results on six common CSG benchmark problems and a disaster response domain confirm the advantages of our approach compared to state-of-the-art methods.
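
A minimal sketch of this style of search, assuming a characteristic function `value(coalition)`; the paper's exact expansion and pruning differ, so this only illustrates anytime UCT over partial coalition structures, where each child splits one coalition off the remaining agents.

```python
import math
import random
from itertools import combinations

def splits(remaining):
    """Coalitions containing the first remaining agent (avoids duplicate partitions)."""
    first, rest = remaining[0], remaining[1:]
    for k in range(len(rest) + 1):
        for combo in combinations(rest, k):
            yield (first,) + combo

class Node:
    def __init__(self, remaining, parent=None, coalition=None):
        self.remaining, self.parent, self.coalition = remaining, parent, coalition
        self.children = []
        self.untried = list(splits(remaining)) if remaining else []
        self.n, self.total = 0, 0.0

def mcts_csg(agents, value, n_iter=20000, c=1.4):
    """Anytime MCTS for CSG; returns the best coalition structure found so far."""
    root = Node(tuple(agents))
    best_cs, best_val = None, float('-inf')
    for _ in range(n_iter):
        node, cs = root, []
        while not node.untried and node.children:          # selection (UCT)
            node = max(node.children, key=lambda ch:
                       ch.total / ch.n + c * math.sqrt(math.log(node.n) / ch.n))
            cs.append(node.coalition)
        if node.untried:                                   # expansion
            coal = node.untried.pop(random.randrange(len(node.untried)))
            child = Node(tuple(a for a in node.remaining if a not in coal), node, coal)
            node.children.append(child)
            node, cs = child, cs + [coal]
        rest = list(node.remaining)                        # random rollout
        random.shuffle(rest)
        rollout = []
        while rest:
            k = random.randint(1, len(rest))
            rollout.append(tuple(rest[:k]))
            rest = rest[k:]
        reward = sum(value(coal) for coal in cs + rollout)
        if reward > best_val:
            best_val, best_cs = reward, cs + rollout
        while node:                                        # backpropagation
            node.n += 1
            node.total += reward
            node = node.parent
    return best_cs, best_val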


2020 ◽  
Vol 11 (40) ◽  
pp. 10959-10972
Author(s):  
Xiaoxue Wang ◽  
Yujie Qian ◽  
Hanyu Gao ◽  
Connor W. Coley ◽  
Yiming Mo ◽  
...  

A new MCTS variant with a reinforcement learning value network and solvent prediction model proposes shorter synthesis routes with greener solvents.
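
As a rough illustration of how such a variant might score search nodes, the sketch below replaces a random rollout with a learned value estimate plus a solvent-greenness bonus; the models, their interfaces, and the weighting are hypothetical, not the paper's implementation.

```python
def evaluate(node, value_net, solvent_model, alpha=0.1):
    """Hypothetical leaf evaluation for retrosynthetic MCTS: a learned value
    network scores how promising the partial route is, and a solvent predictor
    adds a greenness bonus. `value_net`, `solvent_model`, and `alpha` are
    illustrative assumptions."""
    route_value = value_net(node.state)      # learned estimate of route quality
    greenness = solvent_model(node.state)    # e.g. share of steps with green solvents
    return route_value + alpha * greenness   # guides selection and backpropagation
```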


2021 ◽  
Vol 15 (1) ◽  
pp. 46-58
Author(s):  
Xuanhe Zhou ◽  
Guoliang Li ◽  
Chengliang Chai ◽  
Jianhua Feng

Query rewrite transforms a SQL query into an equivalent one with higher performance. However, SQL rewrite is an NP-hard problem, and existing approaches adopt heuristics to rewrite the queries. These heuristics have two main limitations. First, the order in which different rewrite rules are applied significantly affects query performance, but the search space of all possible rewrite orders grows exponentially with the number of query operators and rules, making the optimal order hard to find. Existing methods apply a pre-defined order to rewrite queries and can fall into a local optimum. Second, different rewrite rules have different benefits for different queries. Existing methods work on single plans and cannot effectively estimate the benefit of rewriting a query. To address these challenges, we propose a policy-tree-based query rewrite framework, where the root is the input query and each node is a query rewritten from its parent. We aim to explore the nodes of the policy tree to find the optimal rewritten query. We use Monte Carlo Tree Search to explore the policy tree, navigating it efficiently toward the optimal node. Moreover, we propose a learning-based model that estimates the expected performance improvement of each rewritten query, which guides the tree search more accurately. We also propose a parallel algorithm that explores the policy tree in parallel to improve performance. Experimental results show that our method significantly outperforms existing approaches.
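
The following sketch illustrates the overall loop, assuming `rules` is a list of functions mapping a query to a rewritten query (or `None` when a rule does not apply) and `estimate_gain` stands in for the learned benefit model; it is an illustration, not the paper's implementation.

```python
import math
import random

class RewriteNode:
    """Policy-tree node: a query produced by applying one rewrite rule to its parent."""
    def __init__(self, query, parent=None):
        self.query, self.parent = query, parent
        self.children, self.n, self.total = [], 0, 0.0

def policy_tree_search(query, rules, estimate_gain, n_iter=500, c=1.2):
    root = RewriteNode(query)
    best, best_gain = query, estimate_gain(query)
    for _ in range(n_iter):
        node = root
        while node.children:                          # selection (UCT)
            node = max(node.children,
                       key=lambda ch: float('inf') if ch.n == 0 else
                       ch.total / ch.n + c * math.sqrt(math.log(node.n) / ch.n))
        for rule in rules:                            # expansion: apply each applicable rule once
            rewritten = rule(node.query)
            if rewritten is not None:
                node.children.append(RewriteNode(rewritten, node))
        leaf = random.choice(node.children) if node.children else node
        gain = estimate_gain(leaf.query)              # learned estimate instead of a rollout
        if gain > best_gain:
            best_gain, best = gain, leaf.query
        while leaf is not None:                       # backpropagation
            leaf.n += 1
            leaf.total += gain
            leaf = leaf.parent
    return best
```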


Author(s):  
Christian Roberson ◽  
Katarina Sperduto

Artificial intelligence in games serves as an excellent platform for facilitating collaborative research with undergraduates. This paper explores several aspects of a research challenge proposed for a newly developed variant of a solitaire game. We present multiple classes of game states that can be identified as solvable or unsolvable. We present a heuristic for quickly finding goal states in a game-state search tree. Finally, we introduce a Monte Carlo Tree Search-based player for the solitaire variant that can efficiently win almost any solvable starting deal.


2021 ◽  
Vol 5 (CHI PLAY) ◽  
pp. 1-17
Author(s):  
Shaghayegh Roohi ◽  
Christian Guckelsberger ◽  
Asko Relas ◽  
Henri Heiskanen ◽  
Jari Takatalo ◽  
...  

This paper presents a novel approach to automated playtesting for the prediction of human player behavior and experience. We have previously demonstrated that Deep Reinforcement Learning (DRL) game-playing agents can predict both game difficulty and player engagement, operationalized as average pass and churn rates. We improve this approach by enhancing DRL with Monte Carlo Tree Search (MCTS). We also motivate an enhanced selection strategy for predictor features, based on the observation that an AI agent's best-case performance can yield stronger correlations with human data than the agent's average performance. Both additions consistently improve the prediction accuracy, and the DRL-enhanced MCTS outperforms both DRL and vanilla MCTS in the hardest levels. We conclude that player modelling via automated playtesting can benefit from combining DRL and MCTS. Moreover, it can be worthwhile to investigate a subset of repeated best AI agent runs, if AI gameplay does not yield good predictions on average.
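
A small sketch of the feature-selection idea, assuming `run_scores` holds one score per agent run per level: compare the mean over all runs against the mean over the top-k ("best-case") runs when correlating agent performance with human pass and churn data.

```python
import numpy as np

def agent_features(run_scores, k=5):
    """Per-level predictor features from repeated agent runs.
    `run_scores` is an assumed array of shape (n_levels, n_runs), n_runs >= k."""
    run_scores = np.asarray(run_scores, dtype=float)
    average = run_scores.mean(axis=1)                              # average-performance feature
    best_case = np.sort(run_scores, axis=1)[:, -k:].mean(axis=1)   # best-case (top-k) feature
    return average, best_case
```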


2013 ◽  
Vol 22 (01) ◽  
pp. 1250035 ◽  
Author(s):  
TRISTAN CAZENAVE

Monte-Carlo Tree Search is a general search algorithm that gives good results in games. Genetic Programming evaluates and combines trees to discover expressions that maximize a given fitness function. In this paper, Monte-Carlo Tree Search is used to generate expressions that are evaluated in the same way as in Genetic Programming. Monte-Carlo Tree Search is adapted to search expression trees rather than lists of moves. We compare Nested Monte-Carlo Search to UCT (Upper Confidence Bounds for Trees) on various problems. Monte-Carlo Tree Search achieves state-of-the-art results on multiple benchmark problems. The proposed approach is simple to program, does not suffer from expression growth, has a natural restart strategy to avoid local optima, and is extremely easy to parallelize.
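
A compact sketch of Nested Monte-Carlo Search over prefix expressions, assuming a toy primitive set and a user-supplied `fitness` evaluator of a complete prefix sequence; the paper's encoding of expression trees may differ.

```python
import random

OPS = ['+', '*']     # assumed binary primitives, for illustration only
TERMS = ['x', '1']   # assumed terminal symbols

def playout(seq, open_slots):
    """Level-0 search: complete the prefix expression with random symbols,
    forcing terminals once the sequence gets long so the playout terminates."""
    seq = list(seq)
    while open_slots > 0:
        sym = random.choice(TERMS if len(seq) > 20 else OPS + TERMS)
        seq.append(sym)
        open_slots += 1 if sym in OPS else -1   # a binary op opens one extra slot
    return seq

def nested(level, seq, open_slots, fitness):
    """Nested Monte-Carlo Search (level >= 1): score each candidate symbol with
    a level-(n-1) search, then commit to the next symbol of the best complete
    expression seen so far."""
    best_seq, best_val = None, float('-inf')
    while open_slots > 0:
        for sym in OPS + TERMS:
            slots = open_slots + (1 if sym in OPS else -1)
            cand = (playout(seq + [sym], slots) if level == 1
                    else nested(level - 1, seq + [sym], slots, fitness)[0])
            val = fitness(cand)
            if val > best_val:
                best_val, best_seq = val, cand
        seq = seq + [best_seq[len(seq)]]        # follow the best expression one step
        open_slots += 1 if seq[-1] in OPS else -1
    return best_seq, best_val
```

For example, `nested(2, [], 1, fitness)` runs a level-2 search from the empty expression with one open slot, returning the best complete expression found and its fitness.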

