Using Control Theory and Bayesian Reinforcement Learning for Policy Management in Pandemic Situations

Author(s): Heena Rathore ◽ Abhay Samant

2014 ◽ Vol 513-517 ◽ pp. 1092-1095
Author(s): Bo Wu ◽ Yan Peng Feng ◽ Hong Yan Zheng

Bayesian reinforcement learning has proven to be an effective solution to the optimal tradeoff between exploration and exploitation. In practical applications, however, the exponential growth of the learning parameters is the main impediment to online planning and learning. To overcome this problem, we bring factored representations, model-based learning, and Bayesian reinforcement learning together in a new approach. First, we exploit a factored representation of the states to reduce the number of learning parameters, and adopt a Bayesian inference method to learn the unknown structure and parameters simultaneously. Then, we use an online point-based value iteration algorithm to plan and learn. The experimental results show that the proposed approach is an effective way to improve learning efficiency in large-scale state spaces.
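
The abstract's central claim is that a factored state representation keeps the number of learned parameters from growing exponentially with the number of state variables. The sketch below is hypothetical; the abstract publishes no code, and the class name FactoredBayesianModel, the per-variable independence structure, and all parameter names are assumptions. It illustrates the idea with Dirichlet-multinomial posteriors over per-variable transitions instead of one distribution per flat state.

    # A minimal sketch (not the authors' implementation) of the core idea:
    # with k binary state variables, keep one Dirichlet posterior per
    # (variable, action, previous value) instead of one transition
    # distribution per flat state, shrinking the parameter count from
    # O(2^k) to O(k).
    import numpy as np


    class FactoredBayesianModel:
        """Dirichlet-multinomial posterior over factored transition dynamics.

        Assumes each next-state variable depends only on its own previous
        value and the action (a deliberately simple structure for this
        sketch; the paper learns the structure as well).
        """

        def __init__(self, n_vars, n_actions, prior=1.0):
            # counts[v][a] is a 2x2 table: previous value -> next value
            self.counts = np.full((n_vars, n_actions, 2, 2), prior)

        def update(self, state, action, next_state):
            # Bayesian update: add one observation to the Dirichlet counts.
            for v, (s_v, ns_v) in enumerate(zip(state, next_state)):
                self.counts[v, action, s_v, ns_v] += 1.0

        def transition_prob(self, state, action, next_state):
            # Posterior mean of P(next_state | state, action), which
            # factors as a product over the independent state variables.
            p = 1.0
            for v, (s_v, ns_v) in enumerate(zip(state, next_state)):
                row = self.counts[v, action, s_v]
                p *= row[ns_v] / row.sum()
            return p


    # Usage: 10 binary variables -> 1024 flat states, but only
    # 10 * n_actions * 2 Dirichlet rows to learn.
    model = FactoredBayesianModel(n_vars=10, n_actions=4)
    model.update(state=[0] * 10, action=2, next_state=[0] * 9 + [1])
    print(model.transition_prob([0] * 10, 2, [0] * 9 + [1]))

In an online planner such as point-based value iteration, the posterior mean (or a sample from the posterior) would stand in for the unknown transition model at each planning step.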


2019
Author(s): Alexandra O. Cohen ◽ Kate Nussenbaum ◽ Hayley Dorfman ◽ Samuel J. Gershman ◽ Catherine A. Hartley

Beliefs about the controllability of positive or negative events in the environment can shape learning throughout the lifespan. Previous research has shown that adults' learning is modulated by beliefs about the causal structure of the environment, such that they update their value estimates to a lesser extent when outcomes can be attributed to hidden causes. The present study examined whether external causes similarly influence outcome attributions and learning across development. Ninety participants, ages 7 to 25 years, completed a reinforcement learning task in which they chose between two options with fixed reward probabilities. Choices were made in three distinct environments in which different hidden agents occasionally intervened to generate positive, negative, or random outcomes. Participants' beliefs about hidden-agent intervention aligned well with the true probabilities of positive, negative, or random outcome manipulation in each of the three environments. Computational modeling of the learning data revealed that while the choices of both adults (ages 18-25) and adolescents (ages 13-17) were best fit by Bayesian reinforcement learning models that incorporate beliefs about hidden-agent intervention, those of children (ages 7-12) were best fit by a single-learning-rate model that updates value estimates from choice outcomes alone. Together, these results suggest that although children demonstrate explicit awareness of the causal structure of the task environment, they do not implicitly use beliefs about that structure to guide reinforcement learning in the same manner as adolescents and adults.
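
To make the modeling contrast concrete, here is a minimal sketch (hypothetical; hidden_agent_update, p_agent, and the numeric values are assumptions, not the paper's fitted models) of the two update rules the abstract compares: a single-learning-rate delta rule, and a Bayesian-style rule that discounts prediction errors by the believed probability that a hidden agent, rather than the choice, produced the outcome.

    # A minimal sketch contrasting the two model families described above:
    # a single-learning-rate delta rule versus an update that discounts
    # outcomes likely caused by a hidden agent.

    def delta_rule_update(value, reward, alpha=0.3):
        """One-learning-rate model: V <- V + alpha * (r - V)."""
        return value + alpha * (reward - value)


    def hidden_agent_update(value, reward, p_agent, alpha=0.3):
        """Down-weight the prediction error by the believed probability
        that the choice (not the hidden agent) produced the outcome.

        p_agent: believed probability that a hidden agent generated this
        outcome (an assumed stand-in for the paper's Bayesian attribution).
        """
        credit = 1.0 - p_agent  # credit assigned to one's own choice
        return value + alpha * credit * (reward - value)


    # Example: a positive outcome in an environment where the learner
    # believes the positive agent likely intervened (p_agent = 0.7)
    # moves the value estimate much less than the plain delta rule.
    v = 0.5
    print(delta_rule_update(v, reward=1.0))                  # 0.65
    print(hidden_agent_update(v, reward=1.0, p_agent=0.7))   # 0.545

Under this contrast, fitting children's choices best with the first rule and adolescents' and adults' choices best with the second is what licenses the abstract's conclusion about implicit use of causal structure.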

