Differences and similarities between reinforcement learning and the classical optimal control framework

PAMM ◽  
2019 ◽  
Vol 19 (1) ◽  
Author(s):  
Simon Gottschalk ◽  
Michael Burger
Author(s):  
Andrea Pesare ◽  
Michele Palladino ◽  
Maurizio Falcone

Abstract
In this paper, we deal with a linear quadratic optimal control problem with unknown dynamics. As a modeling assumption, we suppose that the knowledge that an agent has of the current system is represented by a probability distribution $$\pi$$ on the space of matrices. Furthermore, we assume that such a probability measure is suitably updated to take into account the increased experience that the agent obtains while exploring the environment, approximating the underlying dynamics with increasing accuracy. Under these assumptions, we show that the optimal control obtained by solving the "average" linear quadratic optimal control problem with respect to a certain $$\pi$$ converges to the optimal control of the linear quadratic problem governed by the actual, underlying dynamics. This approach is closely related to model-based reinforcement learning algorithms, where prior and posterior probability distributions describing the knowledge of the uncertain system are recursively updated. In the last section, we present a numerical test that confirms the theoretical results.
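The convergence claim can be sketched numerically in a discrete-time surrogate (the paper's setting and update rule differ; the system matrices, the Gaussian noise model for $$\pi$$, and the certainty-equivalent "average" controller below are illustrative assumptions): as the belief over the dynamics matrix concentrates, the gain computed from the averaged model approaches the gain for the true system.

```python
import numpy as np

def dlqr_gain(A, B, Q, R, iters=500):
    """Value-iterate the discrete-time Riccati equation and return the
    optimal state-feedback gain K (u = -K x)."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return K

rng = np.random.default_rng(0)
A_true = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R = np.eye(2), np.eye(1)

K_true = dlqr_gain(A_true, B, Q, R)

# Model the agent's belief pi as matrix samples around A_true whose spread
# shrinks with experience; solve the "average" problem on the sample mean.
for sigma in [0.5, 0.1, 0.01]:
    samples = [A_true + sigma * rng.standard_normal((2, 2)) for _ in range(200)]
    A_mean = np.mean(samples, axis=0)
    K_pi = dlqr_gain(A_mean, B, Q, R)
    print(f"sigma={sigma:5.2f}  ||K_pi - K_true|| = {np.linalg.norm(K_pi - K_true):.4f}")
```

As `sigma` shrinks, the gain gap goes to zero, which is the discrete-time analogue of the convergence statement above.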


2021 ◽  
Author(s):  
Qingfeng Yao ◽  
Jilong Wang ◽  
Donglin Wang ◽  
Shuyu Yang ◽  
Hongyin Zhang ◽  
...  

Author(s):  
Mohamed M. Alhneaish ◽  
Mohamed L. Shaltout ◽  
Sayed M. Metwalli

An economic model predictive control framework is presented in this study for an integrated wind turbine and flywheel energy storage system. The control objective is to smooth the wind power output and mitigate tower fatigue load. The optimal control problem within the model predictive control framework is formulated as a convex optimal control problem with linear dynamics and convex constraints that can be solved globally. The performance of the proposed control algorithm is compared to that of a standard wind turbine controller, and the effect of the proposed control actions on the fatigue loads acting on the tower and blades is studied. Simulation results for various wind scenarios show the ability of the proposed algorithm to achieve the aforementioned objectives, smoothing the output power and mitigating tower fatigue load at the cost of a minimal reduction in the harvested wind energy.
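The convex structure described above can be sketched as a single linear least-squares problem. This is a deliberately simplified, unconstrained toy (the paper's formulation also carries convex inequality constraints and a turbine model): grid power is chosen over a horizon to penalize ramps, while a soft penalty keeps the flywheel energy near a reference. All signals, horizon lengths, and weights below are invented for illustration.

```python
import numpy as np

N, dt = 24, 1.0
rng = np.random.default_rng(1)
p_wind = 2.0 + 0.8 * rng.standard_normal(N)   # assumed wind power forecast
E0, E_ref = 5.0, 5.0                          # initial / reference flywheel energy
lam = 0.5                                     # weight on energy deviation

# Flywheel energy is linear in the decision u (grid power):
#   E_t = E0 + dt * sum_{k<=t} (p_wind_k - u_k)
L = dt * np.tril(np.ones((N, N)))

# Ramp (smoothing) operator: (D u)_t = u_t - u_{t-1}; row 0 anchors the level.
D = np.eye(N) - np.eye(N, k=-1)
d = np.zeros(N)
d[0] = p_wind.mean()

# Stack into one least-squares problem:
#   min_u ||D u - d||^2 + lam * ||E - E_ref||^2,  E - E_ref = (E0 - E_ref + L p) - L u
A = np.vstack([D, np.sqrt(lam) * L])
b = np.concatenate([d, np.sqrt(lam) * ((E0 - E_ref) * np.ones(N) + L @ p_wind)])
u, *_ = np.linalg.lstsq(A, b, rcond=None)

E = E0 + L @ (p_wind - u)
print("output ramp std:", np.std(np.diff(u)), " wind ramp std:", np.std(np.diff(p_wind)))
```

With hard limits on flywheel energy and power, the same problem remains convex and would be handed to a QP solver instead of `lstsq`; the global-optimality property claimed in the abstract comes from exactly this convexity.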


Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-7
Author(s):  
Xiaoyi Long ◽  
Zheng He ◽  
Zhongyuan Wang

This paper proposes an online solution for the optimal tracking control of robotic systems based on a single critic neural network (NN)-based reinforcement learning (RL) method. To this end, we rewrite the robotic system model in state-space form, which facilitates the synthesis of the optimal tracking control. To maintain the tracking response, a steady-state control is designed, and an adaptive optimal tracking control is then used to ensure that the tracking error converges in an optimal sense. To solve the resulting optimal control problem within the framework of adaptive dynamic programming (ADP), the command trajectory to be tracked and the modified tracking Hamilton-Jacobi-Bellman (HJB) equation are formulated. An online RL algorithm is then developed to solve the HJB equation using a critic NN with an online weight-learning law. Simulation results verify the effectiveness of the proposed method.
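A minimal critic-only sketch of the ADP idea, on a scalar discrete-time LQ surrogate of the tracking problem: a quadratic critic V(e) = w·e² is trained on the Bellman residual while the control is improved greedily from the current critic. The paper's continuous-time HJB formulation, robot dynamics, and NN critic are reduced to this toy; the system parameters and learning rate are assumptions.

```python
import numpy as np

a, b, q, r = 1.1, 0.5, 1.0, 0.2   # assumed scalar error dynamics and cost weights
w = 1.0                           # critic weight: V(e) = w * e^2
lr = 0.1
rng = np.random.default_rng(2)

for _ in range(2000):
    e = rng.uniform(-2, 2)                    # sampled tracking error
    u = -(a * b * w / (r + b * b * w)) * e    # greedy control from current critic
    e_next = a * e + b * u
    target = q * e * e + r * u * u + w * e_next * e_next
    # Semi-gradient step on the normalized Bellman residual V(e) - target
    w -= lr * (w * e * e - target) * e * e / (1 + e ** 4)

# Reference: fixed point of the scalar discrete-time Riccati recursion
p = 1.0
for _ in range(1000):
    p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
print("critic weight:", w, " Riccati solution:", p)
```

The learned critic weight matches the Riccati solution, so the greedy policy extracted from the critic is the optimal tracking gain; the paper's single-critic NN plays the role of `w` for the full nonlinear robotic system.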

