Reinforcement Learning-based HVAC Control Agent for Optimal Control of Particulate Matter in Railway Stations

Abstract: In this paper, we study a linear quadratic optimal control problem with unknown dynamics. As a modeling assumption, we suppose that the agent's knowledge of the current system is represented by a probability distribution $\pi$ on the space of matrices. Furthermore, we assume that this probability measure is suitably updated to account for the experience the agent gains while exploring the environment, so that it approximates the underlying dynamics with increasing accuracy. Under these assumptions, we show that the optimal control obtained by solving the “average” linear quadratic optimal control problem with respect to a given $\pi$ converges to the optimal control of the linear quadratic problem governed by the actual, underlying dynamics. This approach is closely related to model-based reinforcement learning algorithms in which prior and posterior probability distributions describing the knowledge of the uncertain system are recursively updated. In the last section, we present a numerical test that confirms the theoretical results.
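The convergence claim in the abstract can be illustrated with a small numerical sketch. The setting below is an assumption chosen for brevity and is not taken from the paper: a scalar discrete-time system x_{k+1} = a x_k + b u_k with known b, an open-loop control over a finite horizon, and a distribution $\pi$ supported on finitely many candidate values of a. The function names (`stacked_dynamics`, `averaged_lq_control`) and all numerical values are hypothetical. The averaged cost is quadratic in the control sequence, so its minimizer solves a linear system.

```python
import numpy as np

def stacked_dynamics(a, b, horizon):
    """Return (G, h) such that [x_1..x_N] = G @ u + h * x0 for x_{k+1} = a x_k + b u_k."""
    G = np.zeros((horizon, horizon))
    for k in range(horizon):          # row k describes x_{k+1}
        for j in range(k + 1):        # u_j reaches x_{k+1} with gain a^(k-j) * b
            G[k, j] = a ** (k - j) * b
    h = np.array([a ** (k + 1) for k in range(horizon)])
    return G, h

def averaged_lq_control(atoms, weights, b, x0, horizon, q=1.0, r=0.1):
    """Open-loop control minimizing sum_i p_i * q * ||x^(i)||^2 + r * ||u||^2.

    atoms/weights describe a finitely supported pi over the unknown parameter a
    (an illustrative assumption, not the paper's general setting).
    """
    H = r * np.eye(horizon)           # Hessian (up to a factor 2) of the averaged cost
    g = np.zeros(horizon)             # linear term of the averaged cost
    for a, p in zip(atoms, weights):
        G, h = stacked_dynamics(a, b, horizon)
        H += p * q * G.T @ G
        g += p * q * G.T @ h * x0
    return np.linalg.solve(H, -g)

# Hypothetical "true" system and a sequence of distributions concentrating on it.
a_true, b, x0, horizon = 1.2, 1.0, 1.0, 10
u_true = averaged_lq_control([a_true], [1.0], b, x0, horizon)

errors = {}
for spread in (0.5, 0.1, 0.01):
    atoms = [a_true - spread, a_true + spread]
    u_avg = averaged_lq_control(atoms, [0.5, 0.5], b, x0, horizon)
    errors[spread] = float(np.linalg.norm(u_avg - u_true))

for spread, err in errors.items():
    print(f"spread={spread}: distance to true optimal control = {err:.3e}")
```

As the two atoms move toward `a_true`, the averaged control should approach the control computed for the true system, which is the qualitative behavior the abstract describes.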

Robust Quadrotor Control through Reinforcement Learning with Disturbance Compensation

Applied Sciences ◽

10.3390/app11073257 ◽

2021 ◽

Vol 11 (7) ◽

pp. 3257

Author(s):

Chen-Huan Pi ◽

Wei-Yuan Ye ◽

Stone Cheng

Keyword(s):

Neural Network ◽

Reinforcement Learning ◽

Control Strategy ◽

External Disturbance ◽

Control Agent ◽

Network Control ◽

Outdoor Environment ◽

Disturbance Compensation ◽

Tracking Accuracy ◽

Control Scheme

In this paper, we present a novel control strategy that combines reinforcement learning with disturbance compensation to solve the problem of quadrotor positioning under external disturbance. The proposed control scheme applies a trained neural-network-based reinforcement learning agent to control the quadrotor, and its output is mapped directly to the four actuators in an end-to-end manner. The scheme also constructs a disturbance observer to estimate the external forces exerted on the three axes of the quadrotor, such as wind gusts in an outdoor environment. Introducing an interference compensator into the neural network control agent significantly increased tracking accuracy and robustness in indoor and outdoor experiments. The experimental results indicate that the proposed control strategy is highly robust to external disturbances. In the experiments, compensation improved control accuracy and reduced positioning error by 75%. To the best of our knowledge, this study is the first to achieve quadrotor positioning control through low-level reinforcement learning by using a global positioning system in an outdoor environment.

Data-driven dynamic multi-objective optimal control: A Hamiltonian-inequality driven satisficing reinforcement learning approach

IFAC-PapersOnLine ◽

10.1016/j.ifacol.2020.12.2275 ◽

2020 ◽

Vol 53 (2) ◽

pp. 8070-8075

Author(s):

Majid Mazouchi ◽

Yongliang Yang ◽

Hamidreza Modares

Keyword(s):

Optimal Control ◽

Reinforcement Learning ◽

Data Driven ◽

Learning Approach ◽

Multi Objective

Experimental evaluation of model-free reinforcement learning algorithms for continuous HVAC control

Applied Energy ◽

10.1016/j.apenergy.2021.117164 ◽

2021 ◽

Vol 298 ◽

pp. 117164

Author(s):

Marco Biemann ◽

Fabian Scheller ◽

Xiufeng Liu ◽

Lizhen Huang

Keyword(s):

Reinforcement Learning ◽

Experimental Evaluation ◽

Learning Algorithms ◽

Model Free ◽

HVAC Control

Reinforcement Learning and Adaptive Optimal Control for Continuous-Time Nonlinear Systems: A Value Iteration Approach

IEEE Transactions on Neural Networks and Learning Systems ◽

10.1109/tnnls.2020.3045087 ◽

2021 ◽

pp. 1-10

Author(s):

Tao Bian ◽

Zhong-Ping Jiang

Keyword(s):

Optimal Control ◽

Reinforcement Learning ◽

Nonlinear Systems ◽

Continuous Time ◽

Value Iteration ◽

Adaptive Optimal Control ◽

A Value

An Hybrid Model-Free Reinforcement Learning Approach for HVAC Control

10.1109/eeeic/icpseurope51590.2021.9584805 ◽

2021 ◽

Author(s):

Francesco M. Solinas ◽

Andrea Bellagarda ◽

Enrico Macii ◽

Edoardo Patti ◽

Lorenzo Bottaccioli

Keyword(s):

Reinforcement Learning ◽

Hybrid Model ◽

Learning Approach ◽

Model Free ◽

HVAC Control

Hierarchical Terrain-Aware Control for Quadrupedal Locomotion by Combining Deep Reinforcement Learning and Optimal Control

10.1109/iros51168.2021.9636738 ◽

2021 ◽

Author(s):

Qingfeng Yao ◽

Jilong Wang ◽

Donglin Wang ◽

Shuyu Yang ◽

Hongyin Zhang ◽

...

Keyword(s):

Optimal Control ◽

Reinforcement Learning ◽

Quadrupedal Locomotion

Reinforcement Learning-Based Approximate Optimal Control for Attitude Reorientation Under State Constraints

IEEE Transactions on Control Systems Technology ◽

10.1109/tcst.2020.3007401 ◽

2020 ◽

pp. 1-10

Author(s):

Hongyang Dong ◽

Xiaowei Zhao ◽

Haoyang Yang

Keyword(s):

Optimal Control ◽

Reinforcement Learning ◽

State Constraints

Online Optimal Control of Robotic Systems with Single Critic NN-Based Reinforcement Learning

Complexity ◽

10.1155/2021/8839391 ◽

2021 ◽

Vol 2021 ◽

pp. 1-7

Author(s):

Xiaoyi Long ◽

Zheng He ◽

Zhongyuan Wang

Keyword(s):

Optimal Control ◽

Reinforcement Learning ◽

Tracking Control ◽

Learning Algorithm ◽

Tracking Error ◽

Adaptive Dynamic Programming ◽

Robotic Systems ◽

Control Synthesis ◽

Optimal Tracking ◽

Optimal Tracking Control

This paper proposes an online solution for the optimal tracking control of robotic systems based on a single critic neural network (NN)-based reinforcement learning (RL) method. To this end, we rewrite the robotic system model in state-space form, which facilitates the synthesis of the optimal tracking control. To maintain the tracking response, a steady-state control is designed, and an adaptive optimal tracking control is then used to ensure that the tracking error converges in an optimal sense. To solve the resulting optimal control problem within the framework of adaptive dynamic programming (ADP), the command trajectory to be tracked and the modified tracking Hamilton-Jacobi-Bellman (HJB) equation are formulated. An online RL algorithm is then developed to solve the HJB equation using a critic NN with an online learning algorithm. Simulation results are given to verify the effectiveness of the proposed method.

Towards optimal HVAC control in non-stationary building environments combining active change detection and deep reinforcement learning

Building and Environment ◽

10.1016/j.buildenv.2021.108680 ◽

2022 ◽

pp. 108680

Author(s):

Xiangtian Deng ◽

Yi Zhang ◽

He Qi

Keyword(s):

Reinforcement Learning ◽

Change Detection ◽

HVAC Control ◽

Active Change
