Online Optimal Control of Robotic Systems with Single Critic NN-Based Reinforcement Learning

Complexity, 2021, Vol. 2021, pp. 1-7
Authors: Xiaoyi Long, Zheng He, Zhongyuan Wang

This paper proposes an online solution to the optimal tracking control problem for robotic systems, based on a single critic neural network (NN)-based reinforcement learning (RL) method. To this end, the robotic system model is rewritten in state-space form, which facilitates the synthesis of the optimal tracking control. To maintain the tracking response, a steady-state control is designed, and an adaptive optimal tracking control is then used to ensure that the tracking error converges in an optimal sense. To solve the resulting optimal control problem within the framework of adaptive dynamic programming (ADP), the command trajectory to be tracked and the modified tracking Hamilton-Jacobi-Bellman (HJB) equation are formulated. An online RL algorithm is then developed to solve the HJB equation using a single critic NN with an online learning law. Simulation results verify the effectiveness of the proposed method.
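To make the mechanics concrete, here is a minimal sketch of a single critic NN trained online on the HJB residual of a tracking-error system. The error dynamics, quadratic critic features, gains, and learning rate are illustrative assumptions, not the paper's robot model.

```python
import numpy as np

# Sketch: single-critic online ADP on assumed error dynamics e_dot = A e + B u.
A = np.array([[0.0, 1.0], [0.0, -1.0]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.array([[1.0]])

def phi(e):            # critic features: V(e) ~ W^T phi(e)
    e1, e2 = e
    return np.array([e1 * e1, e1 * e2, e2 * e2])

def dphi(e):           # feature Jacobian w.r.t. e (3x2)
    e1, e2 = e
    return np.array([[2 * e1, 0.0], [e2, e1], [0.0, 2 * e2]])

W = np.ones(3)         # critic weights, tuned online
e = np.array([1.0, 0.0])
dt, alpha = 0.001, 0.5

for _ in range(20000):
    # approximate optimal control: u = -1/2 R^{-1} B^T (dphi^T W)
    u = -0.5 * np.linalg.solve(R, B.T @ dphi(e).T @ W)
    e_dot = A @ e + B @ u
    # HJB (Bellman) residual along the trajectory
    delta = W @ (dphi(e) @ e_dot) + e @ Q @ e + u @ R @ u
    grad = dphi(e) @ e_dot                       # d(delta)/dW
    W -= alpha * delta * grad / (1.0 + grad @ grad) ** 2  # normalized step
    e = e + dt * e_dot

print("critic weights:", W, "final tracking error:", e)
```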

2020, Vol. 53 (5-6), pp. 778-787
Authors: Jingren Zhang, Qingfeng Wang, Tao Wang

In this article, a novel continuous-time optimal tracking controller is proposed for single-input single-output linear systems with completely unknown dynamics. Unlike existing solutions to the optimal tracking control problem, the proposed controller introduces an integral compensation to reduce the steady-state error and regulates the feedforward part simultaneously with the feedback part. An augmented system composed of the integral compensation, the error dynamics, and the desired trajectory is established to formulate the optimal tracking control problem. The input energy and tracking error of the optimal controller are minimized according to an infinite-horizon objective function. Owing to the use of reinforcement learning techniques, the proposed controller requires no prior knowledge of the system drift or input dynamics. The integral reinforcement learning method is employed to approximate the Q-function and update the critic network online, while the actor network is updated with a deterministic learning method. Lyapunov stability is proved under a persistence-of-excitation condition. A case study on a hydraulic loading system demonstrates the effectiveness of the proposed controller in both simulation and experiment.
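The following sketch illustrates the model-free flavor of this approach: a quadratic Q-function is learned by policy iteration from input-state data, and the plant matrices appear only in the simulator, never in the learner. The continuous-time integral formulation is approximated with short sampling intervals, and all numerical values are assumptions rather than the paper's hydraulic loading system.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # "unknown": used only to simulate
B = np.array([[0.0], [1.0]])
Qc, Rc, dt = np.eye(2), 1.0, 0.01

def quad_features(z, u):                   # Q(z,u) ~ theta^T psi(z,u)
    v = np.append(z, u)
    return np.array([v[i] * v[j] for i in range(3) for j in range(i, 3)])

K = np.zeros((1, 2))                       # initial admissible policy
for _ in range(6):                         # policy iteration loop
    Psi, c = [], []
    z = rng.standard_normal(2)
    for _ in range(400):                   # on-policy data with probing noise
        u = (-K @ z).item() + 0.3 * rng.standard_normal()
        z_next = z + dt * (A @ z + B.flatten() * u)
        u_next = (-K @ z_next).item()      # greedy action at the next state
        cost = (z @ Qc @ z + Rc * u * u) * dt
        Psi.append(quad_features(z, u) - quad_features(z_next, u_next))
        c.append(cost)
        z = z_next
    # least-squares solution of the Bellman equation for theta
    theta, *_ = np.linalg.lstsq(np.array(Psi), np.array(c), rcond=None)
    # unpack theta into the symmetric kernel H of the quadratic Q-function
    H, idx = np.zeros((3, 3)), 0
    for i in range(3):
        for j in range(i, 3):
            H[i, j] = H[j, i] = theta[idx] / (1.0 if i == j else 2.0)
            idx += 1
    K = np.atleast_2d(H[2, :2] / H[2, 2])  # improvement: u = -H_uu^{-1} H_uz z

print("learned feedback gain:", K)
```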


1995, Vol. 117 (3), pp. 292-303
Authors: M. Zaheer-uddin, R. V. Patel

Optimal control of indoor environmental spaces is explored. A physical model of the system, consisting of a heating system, a distribution system, and an environmental zone, is considered, and a seventh-order bilinear system model is developed. From the physical characteristics and open-loop response of the system, it is shown that the overall system consists of a fast subsystem and a slow subsystem. By including the effects of the slow subsystem in the fast subsystem, a reduced-order model is developed. An optimal control law is designed based on the reduced-order model and implemented on the full-order nonlinear system. Both local and global linearization techniques are used to design optimal control laws. Results showing the disturbance rejection characteristics of the resulting closed-loop system are presented. The use of optimal tracking control to implement large setpoint changes in a prescribed manner is also examined. A general model to describe environmental zones is proposed and its application to multi-zone spaces is illustrated. A multiple-input optimal tracking control law with output error integrators is designed; the resulting closed-loop system shows good rejection of step-like disturbances.
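A compact way to see the "output error integrator" construction is to augment the state with the integral of the output error and solve a standard LQR problem for the augmented plant. The sketch below does this for an illustrative second-order model (not the paper's seventh-order bilinear HVAC model), using SciPy's continuous-time Riccati solver.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Assumed reduced-order zone dynamics; values are illustrative only.
A = np.array([[-0.5, 0.2], [0.1, -0.3]])
B = np.array([[1.0], [0.0]])
C = np.array([[0.0, 1.0]])                 # regulated output (e.g., zone temperature)

# Augmented state x_a = [x; integral of (y - r)]
Aa = np.block([[A, np.zeros((2, 1))], [C, np.zeros((1, 1))]])
Ba = np.vstack([B, np.zeros((1, 1))])
Qa = np.diag([1.0, 10.0, 50.0])            # heavy weight on the error integral
Ra = np.array([[1.0]])

P = solve_continuous_are(Aa, Ba, Qa, Ra)
K = np.linalg.solve(Ra, Ba.T @ P)          # optimal gain: u = -K x_a

# Closed-loop response to a step change in setpoint r
r, x, xi, dt = 1.0, np.zeros(2), 0.0, 0.01
for _ in range(2000):
    u = -K @ np.append(x, xi)
    x = x + dt * (A @ x + B.flatten() * u)
    xi += dt * (C @ x - r).item()          # integrate the output error
print("output:", (C @ x).item(), "(setpoint 1.0)")
```

The integral state drives the steady-state output error to zero even under constant disturbances, which is the role the output error integrators play in the tracking design described above.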


Automatica, 2014, Vol. 50 (4), pp. 1167-1175
Authors: Bahare Kiumarsi, Frank L. Lewis, Hamidreza Modares, Ali Karimpour, Mohammad-Bagher Naghibi-Sistani

2013, Vol. 2013, pp. 1-16
Authors: Bo Dong, Yuanchun Li

A novel decentralized reinforcement-learning robust optimal tracking control theory for time-varying constrained reconfigurable modular robots, based on an action-critic-identifier (ACI) structure and a state-action value function (Q-function), is presented to obtain continuous-time nonlinear optimal control policies for strongly coupled, uncertain robotic systems. The dynamics of the time-varying constrained reconfigurable modular robot are described as a synthesis of interconnected subsystems, and a continuous-time state equation and Q-function are designed. Combining the ACI structure with RBF networks, the global uncertainty of each subsystem and the Hamilton-Jacobi-Bellman (HJB) equation are estimated: a critic NN and an action NN approximate the optimal Q-function and the optimal control policy, respectively, while an identifier estimates the global uncertainty and an RBF-NN updates the ACI-NN weights. On this basis, a novel decentralized robust optimal tracking controller is proposed for each subsystem, so that the subsystem tracks the desired trajectory and the tracking error converges to zero in finite time. The stability of the ACI structure and of the robust optimal tracking controller is confirmed by Lyapunov theory. Finally, comparative simulation examples illustrate the effectiveness of the proposed ACI and decentralized control theory.
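The sketch below distills the ACI structure to a single scalar subsystem: an RBF identifier estimates the lumped uncertainty, a critic RBF network approximates the value gradient, and the actor acts greedily on the critic. The nominal drift, gains, and uncertainty are illustrative assumptions, not the paper's reconfigurable-robot dynamics.

```python
import numpy as np

# Assumed scalar subsystem: x_dot = -2x + f(x) + u, with f(x) unknown.
centers = np.linspace(-2.0, 2.0, 7)          # shared RBF centers

def rbf(x):
    return np.exp(-(x - centers) ** 2)

f_true = lambda x: 0.5 * np.sin(2.0 * x)     # "unknown" subsystem uncertainty
W_id = np.zeros(7)                           # identifier weights
W_c = np.zeros(7)                            # critic weights: dV/dx ~ W_c^T rbf(x)
x, dt, k_id, k_c, Q, R = 1.5, 0.005, 5.0, 1.0, 1.0, 1.0

for _ in range(20000):
    phi = rbf(x)
    u = -0.5 * (W_c @ phi) / R               # actor: u = -1/2 R^{-1} dV/dx
    x_dot = -2.0 * x + f_true(x) + u         # true plant (f unknown to learner)
    f_hat = W_id @ phi                       # identifier output
    # identifier: prediction error drives the weight update
    W_id += dt * k_id * (x_dot - (-2.0 * x + f_hat + u)) * phi
    # critic: gradient step on the HJB residual built from the identified model
    sigma = phi * (-2.0 * x + f_hat + u)
    delta = W_c @ sigma + Q * x * x + R * u * u
    W_c -= dt * k_c * delta * sigma / (1.0 + sigma @ sigma)
    x += dt * x_dot

print("final state:", x, "critic weights:", np.round(W_c, 3))
```

In the decentralized scheme each subsystem runs its own copy of this loop, with interconnection effects folded into the lumped uncertainty that the identifier estimates.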

