A Decision-Making Framework for Load Rating Planning of Aging Bridges Using Deep Reinforcement Learning

2021, Vol 35 (6), pp. 04021024
Author(s): Minghui Cheng, Dan M. Frangopol

2021, Vol 242, pp. 112544
Author(s): Nicola Caterino, Iolanda Nuzzo, Antonio Ianniello, Giorgio Varchetta, Edoardo Cosenza

2021, Vol 11 (1)
Author(s): Batel Yifrah, Ayelet Ramaty, Genela Morris, Avi Mendelsohn

Abstract
Decision making can be shaped both by trial-and-error experiences and by memory of unique contextual information. Moreover, these types of information can be acquired either through active experience or by observing others behave in similar situations. The interactions between the reinforcement learning parameters that inform decision updating and the formation of declarative memories in experienced and observational learning settings are, however, unknown. In the current study, participants took part in a probabilistic decision-making task involving situations that either yielded outcomes similar to those of an observed player or opposed them. By fitting alternative reinforcement learning models to each subject, we distinguished participants who learned similarly from experience and observation from those who assigned different weights to learning signals from these two sources. Participants who assigned different weights to their own experience versus that of others displayed enhanced memory performance, as well as greater subjective memory strength, for episodes involving significant reward prospects. Conversely, the memory performance of participants who did not prioritize their own experience over that of others did not appear to be influenced by reinforcement learning parameters. These findings demonstrate that interactions between implicit and explicit learning systems depend on the means by which individuals weigh relevant information conveyed via experience and observation.
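
A minimal sketch of the kind of per-subject model fitting the abstract describes: a delta-rule learner with separate learning rates for one's own (experienced) and observed outcomes, fit to choices by maximum likelihood. The two-learning-rate structure, softmax choice rule, parameter names, and SciPy optimizer are illustrative assumptions, not the authors' model.

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_lik(params, choices, outcomes, observed, n_options=2):
    """Negative log-likelihood of choices under a two-learning-rate delta rule."""
    alpha_exp, alpha_obs, beta = params
    q = np.zeros(n_options)
    nll = 0.0
    for c, r, was_observed in zip(choices, outcomes, observed):
        p = np.exp(beta * q) / np.exp(beta * q).sum()   # softmax choice probabilities
        nll -= np.log(p[c] + 1e-12)
        alpha = alpha_obs if was_observed else alpha_exp  # source-dependent learning rate
        q[c] += alpha * (r - q[c])                        # delta-rule value update
    return nll

def fit_subject(choices, outcomes, observed):
    """Fit (alpha_exp, alpha_obs, beta); unequal alphas suggest source-dependent weighting."""
    res = minimize(neg_log_lik, x0=[0.3, 0.3, 2.0],
                   args=(choices, outcomes, observed),
                   bounds=[(0.01, 1.0), (0.01, 1.0), (0.1, 20.0)])
    return res.x
```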


2021, Vol 11 (14), pp. 6620
Author(s): Arman Alahyari, David Pozo, Meisam Farrokhifar

With the recent advent of technology within the smart grid, many conventional concepts of power systems have undergone drastic changes. Owing to these technological developments, even small customers can monitor their energy consumption and schedule household appliances using smart meters and mobile devices. In this paper, we address the power set-point tracking problem for an aggregator that participates in a real-time ancillary-services program. Fast communication of data and control signals is possible, and the end-user side can exploit the provided signals through demand response programs, benefiting both customers and the power grid. However, existing optimization approaches rely on heavy computation and predictions of future parameters, making them ineffective for real-time decision-making. As an alternative to fixed control rules and offline optimization models, we propose an online optimization decision-making framework for the power set-point tracking problem. Within this framework, two types of online algorithms are investigated: with and without projections. The former is based on the standard online gradient descent (OGD) algorithm, while the latter is based on the Online Frank–Wolfe (OFW) algorithm. The results demonstrate that both algorithms achieve sub-linear regret, with the OGD approach reaching approximately 2.4-times lower average losses. However, the OFW-based demand response algorithm ran up to twenty-nine percent faster per round of optimization as the number of loads increased.
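
A minimal sketch, not the paper's implementation, contrasting the projected OGD update with a projection-free Online Frank–Wolfe-style update on a synthetic quadratic set-point tracking loss. The box feasible set, step-size schedules, and regulation signal are illustrative assumptions.

```python
import numpy as np

n_loads, T = 50, 200
p_max = np.ones(n_loads)                              # per-load power limits (assumed box feasible set)
setpoints = 20 + 5 * np.sin(np.linspace(0, 6, T))     # synthetic regulation signal

def loss_grad(x, target):
    """Quadratic tracking loss (sum(x) - target)^2 and its gradient."""
    err = x.sum() - target
    return err**2, 2 * err * np.ones_like(x)

def ogd(eta=0.02):
    """Projected online gradient descent over the box [0, p_max]."""
    x, total = p_max / 2, 0.0
    for t, s in enumerate(setpoints, start=1):
        f, g = loss_grad(x, s)
        total += f
        x = np.clip(x - eta / np.sqrt(t) * g, 0.0, p_max)   # gradient step + projection
    return total / T

def ofw(eta=0.01):
    """Projection-free Online Frank–Wolfe using a smoothed surrogate of past gradients."""
    x0 = p_max / 2
    x, G, total = x0.copy(), np.zeros(n_loads), 0.0
    for t, s in enumerate(setpoints, start=1):
        f, g = loss_grad(x, s)
        total += f
        G += g                                  # running sum of observed gradients
        d = eta * G + 2 * (x - x0)              # gradient of the surrogate objective
        v = np.where(d < 0, p_max, 0.0)         # linear minimization oracle over the box
        x = x + (2.0 / (t + 2)) * (v - x)       # FW step; stays feasible without projection
    return total / T

print("OGD avg loss:", ogd(), " OFW avg loss:", ofw())
```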


Author(s): Ming-Sheng Ying, Yuan Feng, Sheng-Gang Ying

Abstract
A Markov decision process (MDP) offers a general framework for modelling sequential decision making in which outcomes are random. In particular, it serves as a mathematical framework for reinforcement learning. This paper introduces an extension of the MDP, namely the quantum MDP (qMDP), that can serve as a mathematical model of decision making about quantum systems. We develop dynamic programming algorithms for policy evaluation and for finding optimal policies for qMDPs in the finite-horizon case. The results obtained in this paper provide some useful mathematical tools for reinforcement learning techniques applied to the quantum world.
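
In the classical case that the qMDP generalizes, finite-horizon dynamic programming reduces to backward induction. A minimal sketch for an ordinary (non-quantum) MDP under assumed toy transition matrices and rewards; the quantum version, which acts on density operators via superoperators, is not reproduced here.

```python
import numpy as np

def finite_horizon_dp(P, R, H):
    """Return optimal values V[t, s] and a greedy policy pi[t, s] for horizon H."""
    n_states, n_actions = R.shape
    V = np.zeros((H + 1, n_states))
    pi = np.zeros((H, n_states), dtype=int)
    for t in range(H - 1, -1, -1):                      # backward induction over stages
        # Q[s, a] = immediate reward + expected value at the next stage
        Q = R + np.stack([P[a] @ V[t + 1] for a in range(n_actions)], axis=1)
        V[t] = Q.max(axis=1)
        pi[t] = Q.argmax(axis=1)
    return V, pi

# Toy 2-state, 2-action example (illustrative numbers only)
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),               # transition matrix for action 0
     np.array([[0.1, 0.9], [0.7, 0.3]])]               # transition matrix for action 1
R = np.array([[1.0, 0.0], [0.0, 2.0]])                 # reward R[s, a]
V, pi = finite_horizon_dp(P, R, H=5)
print(V[0], pi[0])
```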


2021, Vol 35 (2)
Author(s): Nicolas Bougie, Ryutaro Ichise

Abstract
Deep reinforcement learning methods have achieved significant successes in complex decision-making problems. However, they traditionally rely on well-designed extrinsic rewards, which limits their applicability to the many real-world tasks where rewards are naturally sparse. While cloning behaviors provided by an expert is a promising approach to the exploration problem, learning from a fixed set of demonstrations may be impractical due to lack of state coverage or distribution mismatch, i.e., when the learner's goal deviates from the demonstrated behaviors. Moreover, we are interested in learning how to reach a wide range of goals from the same set of demonstrations. In this work, we propose a novel goal-conditioned method that leverages very small sets of goal-driven demonstrations to massively accelerate the learning process. Crucially, we introduce the concept of active goal-driven demonstrations, querying the demonstrator only in hard-to-learn and uncertain regions of the state space. We further present a strategy for prioritizing the sampling of goals where the disagreement between the expert and the policy is maximized. We evaluate our method on a variety of benchmark environments from the MuJoCo domain. Experimental results show that our method outperforms prior imitation learning approaches in most of the tasks, in terms of both exploration efficiency and average scores.
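
A minimal sketch, assumed rather than taken from the paper, of prioritizing goal sampling by the disagreement between the expert's and the current policy's actions, as the abstract describes. The goal buffer layout, distance metric, and temperature parameter are illustrative placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

def disagreement(policy_actions, expert_actions):
    """Per-goal disagreement score: distance between policy and expert actions."""
    return np.linalg.norm(policy_actions - expert_actions, axis=-1)

def sample_goals(goals, policy_actions, expert_actions, k, temperature=1.0):
    """Sample k goals with probability increasing in expert/policy disagreement."""
    scores = disagreement(policy_actions, expert_actions)
    logits = scores / temperature
    probs = np.exp(logits - logits.max())               # softmax over disagreement scores
    probs /= probs.sum()
    idx = rng.choice(len(goals), size=k, replace=False, p=probs)
    return goals[idx]

# Toy usage: 100 candidate goals in a 3-D goal space, actions in 2-D
goals = rng.uniform(-1, 1, size=(100, 3))
policy_actions = rng.normal(size=(100, 2))
expert_actions = rng.normal(size=(100, 2))
hard_goals = sample_goals(goals, policy_actions, expert_actions, k=10)
print(hard_goals.shape)   # (10, 3)
```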

