A Deep Reinforcement Learning Method for Model-based Optimal Control of HVAC Systems

Reinforcement learning has been established over the past decade as an effective tool to find optimal control policies for dynamical systems, with recent focus on approaches that guarantee safety during the learning and/or execution phases. In general, safety guarantees are critical in reinforcement learning when the system is safety-critical and/or task restarts are not practically feasible. In optimal control theory, safety requirements are often expressed in terms of state and/or control constraints. In recent years, reinforcement learning approaches that rely on persistent excitation have been combined with a barrier transformation to learn the optimal control policies under state constraints. To soften the excitation requirements, model-based reinforcement learning methods that rely on exact model knowledge have also been integrated with the barrier transformation framework. The objective of this paper is to develop a safe reinforcement learning method for deterministic nonlinear systems, with parametric uncertainties in the model, to learn approximate constrained optimal policies without relying on stringent excitation conditions. To that end, a model-based reinforcement learning technique that utilizes a novel filtered concurrent learning method, along with a barrier transformation, is developed in this paper to realize simultaneous learning of unknown model parameters and approximate optimal state-constrained control policies for safety-critical systems.

A model-based deep reinforcement learning method applied to finite-horizon optimal control of nonlinear control-affine system

Journal of Process Control, DOI: 10.1016/j.jprocont.2020.02.003, 2020, Vol 87, pp. 166-178, Cited By ~ 3

Author(s): Jong Woo Kim, Byung Jun Park, Haeun Yoo, Tae Hoon Oh, Jay H. Lee, ...

Keyword(s): Optimal Control, Reinforcement Learning, Nonlinear Control, Finite Horizon, Learning Method, Model Based, Affine System

A model-based reinforcement learning method based on conditional generative adversarial networks

Pattern Recognition Letters, DOI: 10.1016/j.patrec.2021.08.019, 2021, Vol 152, pp. 18-25

Author(s): Tingting Zhao, Ying Wang, Guixi Li, Le Kong, Yarui Chen, ...

Keyword(s): Reinforcement Learning, Generative Adversarial Networks, Learning Method, Model Based, Adversarial Networks

Model-based reinforcement learning for output-feedback optimal control of a class of nonlinear systems

2019 American Control Conference (ACC), DOI: 10.23919/acc.2019.8814910, 2019

Author(s): Ryan Self, Michael Harlan, Rushikesh Kamalapurkar

Keyword(s): Optimal Control, Reinforcement Learning, Nonlinear Systems, Output Feedback, Model Based, Feedback Optimal Control

Model-Based Safe Reinforcement Learning with Time-Varying State and Control Constraints: An Application to Intelligent Vehicles

DOI: 10.36227/techrxiv.17205740.v2, 2021

Author(s): Xinglong Zhang, Yaoqian Peng, Biao Luo, Wei Pan, Xin Xu, ...

Keyword(s): Optimal Control, Reinforcement Learning, Control Policy, Intelligent Vehicles, Time Varying, Control Constraints, Model Based, Safety Constraints, And Control, State And Control Constraints

Recently, barrier function-based safe reinforcement learning (RL) with the actor-critic structure for continuous control tasks has received increasing attention. It is still challenging to learn a near-optimal control policy with safety and convergence guarantees. Also, few works have addressed the safe RL algorithm design under time-varying safety constraints. This paper proposes a model-based safe RL algorithm for optimal control of nonlinear systems with time-varying state and control constraints. In the proposed approach, we construct a novel barrier-based control policy structure that can guarantee control safety. A multi-step policy evaluation mechanism is proposed to predict the policy's safety risk under time-varying safety constraints and guide the policy to update safely. Theoretical results on stability and robustness are proven. Also, the convergence of the actor-critic learning algorithm is analyzed. The proposed algorithm outperforms several state-of-the-art RL algorithms in the simulated Safety Gym environment. Furthermore, the approach is applied to the integrated path following and collision avoidance problem for two real-world intelligent vehicles. A differential-drive vehicle and an Ackermann-drive one are used to verify the offline deployment performance and the online learning performance, respectively. Our approach shows an impressive sim-to-real transfer capability and a satisfactory online control performance in the experiment.

Model‐based reinforcement learning for nonlinear optimal control with practical asymptotic stability guarantees

AIChE Journal, DOI: 10.1002/aic.16544, 2020, Vol 66 (10)

Author(s): Yeonsoo Kim, Jong Min Lee

Keyword(s): Optimal Control, Reinforcement Learning, Asymptotic Stability, Nonlinear Optimal Control, Model Based

Model-based reinforcement learning for approximate optimal control with temporal logic specifications

Proceedings of the 24th International Conference on Hybrid Systems: Computation and Control, DOI: 10.1145/3447928.3456639, 2021

Author(s): Max H. Cohen, Calin Belta

Keyword(s): Optimal Control, Reinforcement Learning, Temporal Logic, Model Based

A Model-Based Reinforcement Learning Approach to Time-Optimal Control Problems

Lecture Notes in Computer Science - Advances and Trends in Artificial Intelligence. From Theory to Practice, DOI: 10.1007/978-3-030-22999-3_56, 2019, pp. 657-665

Author(s): Hsuan-Cheng Liao, Jing-Sin Liu

Keyword(s): Optimal Control, Reinforcement Learning, Optimal Control Problems, Time Optimal Control, Control Problems, Learning Approach, Model Based, Time Optimal

Model-Based Safe Reinforcement Learning with Time-Varying State and Control Constraints: An Application to Intelligent Vehicles

DOI: 10.36227/techrxiv.17205740, 2021

Author(s): Xinglong Zhang, Yaoqian Peng, Biao Luo, Wei Pan, Xin Xu, ...

Keyword(s): Optimal Control, Reinforcement Learning, Control Policy, Intelligent Vehicles, Time Varying, Control Constraints, Model Based, Safety Constraints, And Control, State And Control Constraints

Model-Based Reinforcement Learning for Approximate Optimal Control

Communications and Control Engineering - Reinforcement Learning for Optimal Feedback Control, DOI: 10.1007/978-3-319-78384-0_4, 2018, pp. 99-148, Cited By ~ 1

Author(s): Rushikesh Kamalapurkar, Patrick Walters, Joel Rosenfeld, Warren Dixon

Keyword(s): Optimal Control, Reinforcement Learning, Model Based
