Bayesian Reinforcement Learning
Recently Published Documents


TOTAL DOCUMENTS: 55 (five years: 15)

H-INDEX: 9 (five years: 2)

Energies, 2021, Vol. 14 (22), pp. 7481
Author(s): Mohammad Sadeghi, Shahram Mollahasani, Melike Erol-Kantarci

Microgrids are empowered by advances in renewable energy generation, which enable them to generate the energy required to supply their loads and to trade surplus energy with other microgrids or the macrogrid. To minimize overall cost, microgrids must optimize the scheduling of their demands and energy levels while trading their surplus with others. This optimization is affected by factors such as variations in demand, fluctuations in energy generation, and competition among microgrids arising from their dynamic nature. Reaching an optimal schedule is therefore challenging, owing to the uncertainty in renewable energy generation and consumption and to the complexity of interconnected microgrids and their interplay. Previous work relies mainly on model-based approaches that assume precise information about microgrid dynamics. This paper addresses the energy trading problem among microgrids, minimizing cost under uncertainty in microgrid generation and demand. To this end, a Bayesian coalitional reinforcement learning model is introduced that minimizes the energy trading cost among microgrids by forming stable coalitions. The results show that the proposed model reduces the cost by up to 23% relative to a coalitional game theory model.
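To make the idea of learning cost-minimizing coalitions concrete, here is a minimal sketch using plain tabular Q-learning over a fixed set of coalition choices with noisy trading costs. This is an illustration only, not the paper's Bayesian coalitional model: the coalition names, base costs, and noise model are all invented for the example.

```python
import random

random.seed(0)

# Hypothetical setup: a microgrid repeatedly picks a coalition to trade within;
# the realized trading cost is the coalition's base cost plus noise that stands
# in for uncertain generation and demand.
COALITIONS = ["alone", "pair", "grand"]              # illustrative choices
BASE_COST = {"alone": 10.0, "pair": 7.0, "grand": 6.0}

def trading_cost(coalition):
    """Stochastic trading cost for one round."""
    return BASE_COST[coalition] + random.gauss(0.0, 1.0)

# Tabular Q-learning with epsilon-greedy exploration. Reward is negative cost,
# so maximizing reward minimizes expected trading cost.
q = {c: 0.0 for c in COALITIONS}
alpha, epsilon = 0.1, 0.2
for _ in range(5000):
    if random.random() < epsilon:
        c = random.choice(COALITIONS)                # explore
    else:
        c = max(q, key=q.get)                        # exploit current estimate
    q[c] += alpha * (-trading_cost(c) - q[c])        # delta-rule update

best = max(q, key=q.get)
print(best, {c: round(v, 2) for c, v in q.items()})
```

Under these illustrative base costs the grand coalition has the lowest expected cost, so the learned Q-values converge toward the negated base costs and the greedy choice settles on it despite the noise.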


2021, pp. 027836492110376
Author(s): Haruki Nishimura, Mac Schwager

We propose a novel belief space planning technique for continuous dynamics by viewing the belief system as a hybrid dynamical system with time-driven switching. Our approach is based on the perturbation theory of differential equations and extends sequential action control to stochastic dynamics. The resulting algorithm, which we name SACBP, does not require discretization of spaces or time and synthesizes control signals in near real-time. SACBP is an anytime algorithm that can handle general parametric Bayesian filters under certain assumptions. We demonstrate the effectiveness of our approach in an active sensing scenario and a model-based Bayesian reinforcement learning problem. In these challenging problems, we show that the algorithm significantly outperforms other existing solution techniques including approximate dynamic programming and local trajectory optimization.
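The "parametric Bayesian filters" the abstract mentions are belief systems whose parameters (e.g. a mean and variance) evolve as a dynamical system, which is what a belief-space planner steers. A minimal sketch of such a filter, assuming a scalar random-walk state and a noisy direct measurement (an invented toy model, not the paper's setup):

```python
# One-dimensional Kalman filter: the (mean, variance) pair is the belief state.
# Process noise q and measurement noise r are assumed values for illustration.

def kalman_step(mean, var, z, q=0.1, r=0.5):
    """One predict/update cycle of a scalar Kalman filter."""
    # Predict: random-walk dynamics inflate the variance by the process noise.
    var_pred = var + q
    # Update: fuse the measurement z, weighted by the Kalman gain.
    k = var_pred / (var_pred + r)
    mean_new = mean + k * (z - mean)
    var_new = (1.0 - k) * var_pred
    return mean_new, var_new

# Starting from a diffuse belief, repeated measurements near 1.0 pull the mean
# toward 1.0 and shrink the variance toward its steady-state value.
mean, var = 0.0, 1.0
for z in [1.2, 0.9, 1.1, 1.0]:
    mean, var = kalman_step(mean, var, z)
print(round(mean, 3), round(var, 3))
```

Viewing the `(mean, var)` trajectory as a continuous-time system with switching at measurement times is the hybrid-systems perspective the abstract describes.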


2020, Vol. 5 (1)
Author(s): Alexandra O. Cohen, Kate Nussenbaum, Hayley M. Dorfman, Samuel J. Gershman, Catherine A. Hartley

Beliefs about the controllability of positive or negative events in the environment can shape learning throughout the lifespan. Previous research has shown that adults’ learning is modulated by beliefs about the causal structure of the environment, such that they update their value estimates to a lesser extent when outcomes can be attributed to hidden causes. This study examined whether external causes similarly influenced outcome attributions and learning across development. Ninety participants, ages 7 to 25 years, completed a reinforcement learning task in which they chose between two options with fixed reward probabilities. Choices were made in three distinct environments in which different hidden agents occasionally intervened to generate positive, negative, or random outcomes. Participants’ beliefs about hidden-agent intervention aligned with the true probabilities of the positive, negative, or random outcome manipulation in each of the three environments. Computational modeling of the learning data revealed that while the choices made by both adults (ages 18–25) and adolescents (ages 13–17) were best fit by Bayesian reinforcement learning models that incorporate beliefs about hidden-agent intervention, those of children (ages 7–12) were best fit by a one-learning-rate model that updates value estimates based on choice outcomes alone. Together, these results suggest that while children demonstrate explicit awareness of the causal structure of the task environment, they do not implicitly use beliefs about the causal structure of the environment to guide reinforcement learning in the same manner as adolescents and adults.
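The contrast between the two model classes can be sketched in a few lines: a one-learning-rate model updates fully from every outcome, while an attribution-weighted update discounts outcomes believed to be caused by a hidden agent. The weighting scheme below is a simplified stand-in for the paper's Bayesian models, and the numbers are invented for illustration.

```python
# One-learning-rate model (best fit for children): update from the outcome alone.
def update_one_lr(v, reward, alpha=0.3):
    return v + alpha * (reward - v)

# Simplified attribution-weighted update (stand-in for the Bayesian hidden-agent
# models): scale the update by the belief that the outcome reflects the option's
# true reward rather than a hidden agent's intervention.
def update_attributed(v, reward, p_self_caused, alpha=0.3):
    return v + alpha * p_self_caused * (reward - v)

# A surprising zero-reward outcome the learner believes was 80% agent-caused:
v_child = update_one_lr(0.5, 0.0)                          # full update
v_adult = update_attributed(0.5, 0.0, p_self_caused=0.2)   # discounted update
print(v_child, v_adult)
```

The one-learning-rate value drops sharply (0.5 to 0.35), while the attribution-weighted value barely moves (0.5 to 0.47), mirroring the finding that adults and adolescents update less when outcomes can be blamed on hidden causes.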


2020, Vol. 24 (8), pp. 1738-1741
Author(s): Hesam Khoshkbari, Vahid Pourahmadi, Hamid Sheikhzadeh

2020, Vol. 44 (5), pp. 845-857
Author(s): Kei Senda, Toru Hishinuma, Yurika Tani
