Evaluation of Reinforcement Learning for Optimal Control of Building Active and Passive Thermal Storage Inventory

2006 ◽  
Vol 129 (2) ◽  
pp. 215-225 ◽  
Author(s):  
Simeng Liu ◽  
Gregor P. Henze

This paper describes an investigation of machine learning for the supervisory control of active and passive thermal storage capacity in buildings. Previous studies show that utilizing active or passive thermal storage, or both, can yield significant peak cooling load reductions and associated electrical demand and operational cost savings. In this study, model-free learning control is investigated for the operation of electrically driven chilled water systems in heavy-mass commercial buildings. The reinforcement learning controller learns to operate the building and cooling plant from the reinforcement feedback (in this study, the monetary cost of each action) it receives for past control actions. The learning agent interacts with its environment by commanding the global zone temperature setpoints and the thermal energy storage charging/discharging rate. The controller extracts information about the environment based solely on the reinforcement signal; it contains no predictive or system model. Over time, by exploring the environment, the reinforcement learning controller builds a statistical summary of plant operation that is continuously updated as operation continues. The present analysis shows that learning control is a feasible methodology for finding a near-optimal strategy to exploit the active and passive building thermal storage capacity; it also shows that learning performance is affected by the dimensionality of the action and state spaces, the learning rate, and several other factors. Learning is found to be slow for tasks with large state and action spaces.
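
For illustration only, the sketch below shows the kind of tabular, cost-minimizing Q-learning update such a model-free controller could use. The state/action discretization, hyperparameters, and function names are assumptions made for this sketch, not the paper's exact formulation.

import random
from collections import defaultdict

# Illustrative discretization: each action pairs a zone-setpoint offset (K)
# with a TES charging (+) / discharging (-) rate fraction.
ACTIONS = [(dT, u) for dT in (-2, 0, 2) for u in (-1.0, 0.0, 1.0)]

ALPHA = 0.1    # learning rate
GAMMA = 0.95   # discount factor
EPSILON = 0.1  # exploration probability

# The Q-table plays the role of the "statistical summary of plant operation":
# the expected discounted cost of taking `action` in `state`.
Q = defaultdict(float)

def choose_action(state):
    """Epsilon-greedy selection: mostly exploit, occasionally explore."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return min(ACTIONS, key=lambda a: Q[(state, a)])  # minimize expected cost

def observe(state, action, cost, next_state):
    """One Q-learning update; the reinforcement signal is the monetary
    cost of the last action, so the agent minimizes rather than maximizes."""
    best_next = min(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (cost + GAMMA * best_next - Q[(state, action)])

Note how the summary grows with the sizes of the state and action spaces: every (state, action) pair must be visited repeatedly before its estimate is reliable, which is consistent with the slow learning the paper reports for large spaces.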


Author(s):  
Ernst Moritz Hahn ◽  
Mateo Perez ◽  
Sven Schewe ◽  
Fabio Somenzi ◽  
Ashutosh Trivedi ◽  
...  

We study reinforcement learning for the optimal control of Branching Markov Decision Processes (BMDPs), a natural extension of (multitype) Branching Markov Chains (BMCs). The state of a (discrete-time) BMC is a collection of entities of various types that, while spawning other entities, generate a payoff. In comparison with BMCs, where the evolution of each entity of the same type follows the same probabilistic pattern, BMDPs allow an external controller to pick from a range of options. This permits us to study the best/worst behaviour of the system. We generalise model-free reinforcement learning techniques to compute an optimal control strategy of an unknown BMDP in the limit. We present results of an implementation that demonstrate the practicality of the approach.
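
As a loose sketch of how a model-free update could exploit the branching structure (payoffs in a branching process accumulate additively over spawned entities): the bootstrap target sums the greedy values of all children. The names and the update rule below are assumptions for this sketch; the paper's actual algorithm and convergence treatment differ.

from collections import defaultdict

ALPHA = 0.05  # learning rate

# Q[(entity_type, action)] estimates the total payoff generated by an entity
# of this type, together with all of its descendants, under this action.
Q = defaultdict(float)

def best_value(entity_type, actions_of):
    """Greedy value estimate for an entity type (maximizing controller)."""
    return max((Q[(entity_type, a)] for a in actions_of(entity_type)),
               default=0.0)

def observe_step(entity_type, action, payoff, children, actions_of):
    """Model-free update after watching one entity evolve: it produced
    `payoff` and spawned `children` (a list of entity types). Because
    payoffs add up over spawned entities, the target sums the greedy
    values of all children rather than taking a single successor value."""
    target = payoff + sum(best_value(c, actions_of) for c in children)
    Q[(entity_type, action)] += ALPHA * (target - Q[(entity_type, action)])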


2016 ◽  
Vol 04 (01) ◽  
pp. 51-60 ◽  
Author(s):  
Bahare Kiumarsi ◽  
Wei Kang ◽  
Frank L. Lewis

This paper presents a completely model-free H∞ optimal tracking solution to the control of a general class of nonlinear nonaffine systems in the presence of input constraints. The proposed method is motivated by a nonaffine unmanned aerial vehicle (UAV) system as a real application. First, a general class of nonlinear nonaffine system dynamics is expressed as an affine system in terms of a nonlinear function of the control input. It is shown that the optimal control of nonaffine systems may not have an admissible solution if the utility function is not defined properly. Moreover, the boundedness of the optimal control input cannot be guaranteed for standard performance functions. A new performance function is defined and used in the L2-gain condition for this class of nonaffine systems. This performance function guarantees the existence of an admissible solution (if one exists) and the boundedness of the control input solution. An off-policy reinforcement learning (RL) algorithm is employed to iteratively solve the H∞ optimal tracking control problem online using measured data along the system trajectories. The proposed off-policy RL does not require any knowledge of the system dynamics. Moreover, the disturbance input does not need to be adjustable in a specific manner.
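
For context, the L2-gain (disturbance attenuation) condition that H∞ designs of this kind enforce has the generic form below, where U(x, u) stands for the redefined performance integrand, d is the disturbance, and γ is the prescribed attenuation level. This is a generic statement for orientation; the paper's exact condition may differ.

\int_0^{\infty} U\bigl(x(t), u(t)\bigr)\, dt \;\le\; \gamma^2 \int_0^{\infty} \lVert d(t) \rVert^2 \, dt,
\qquad \forall\, d \in L_2[0, \infty)

Replacing a standard quadratic integrand with a redefined U is what lets the method keep the control input bounded for the nonaffine input channel while preserving the attenuation guarantee.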


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Brydon Eastman ◽  
Michelle Przedborski ◽  
Mohammad Kohandel

The in-silico development of a chemotherapeutic dosing schedule for treating cancer relies upon a parameterization of a particular tumour growth model to describe the dynamics of the cancer in response to the dose of the drug. In practice, it is often prohibitively difficult to ensure the validity of patient-specific parameterizations of these models for any particular patient. As a result, sensitivity to these parameters can cause dosing schedules that are optimal in principle to perform poorly on particular patients. In this study, we demonstrate that chemotherapeutic dosing strategies learned via reinforcement learning methods are more robust to perturbations in patient-specific parameter values than those learned via classical optimal control methods. By training a reinforcement learning agent on mean-value parameters and allowing the agent periodic access to a more easily measurable metric, relative bone marrow density, for the purpose of optimizing the dose schedule while reducing drug toxicity, we are able to develop drug dosing schedules that outperform schedules learned via classical optimal control methods, even when such methods are allowed to leverage the same bone marrow measurements.
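
A minimal sketch of the robustness comparison the abstract describes: train a policy on mean-value parameters, then evaluate it on a population of perturbed "patients". The helper names, the perturbation model, and the simulate_episode hook are hypothetical stand-ins for this sketch, not the paper's implementation.

import random

def perturb(mean_params, scale=0.1):
    """Draw a hypothetical patient by multiplicatively perturbing each
    mean-value parameter (a stand-in for inter-patient variability)."""
    return {k: v * (1.0 + random.gauss(0.0, scale))
            for k, v in mean_params.items()}

def evaluate_robustness(policy, simulate_episode, mean_params, n_patients=100):
    """Roll out a fixed dosing policy (trained on mean_params) across a
    population of perturbed patients. `simulate_episode(policy, params)`
    is an assumed hook that runs the tumour growth model under one
    parameterization and returns an outcome (e.g. final tumour burden)."""
    return [simulate_episode(policy, perturb(mean_params))
            for _ in range(n_patients)]

The spread of outcomes returned by evaluate_robustness is what distinguishes a robust RL policy from a classically optimal schedule that was tuned to a single parameterization.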

