An Improved Reinforcement Learning Based Heuristic Dynamic Programming Algorithm for Model-Free Optimal Control

Author(s): Jia Li, Zhaolin Yuan, Xiaojuan Ban

2016, Vol 817, pp. 150-161
Author(s): Marcin Szuster, Piotr Gierlak

The article focuses on the implementation of the globalized dual-heuristic dynamic programming algorithm in the discrete tracking control system of a three-degrees-of-freedom robotic manipulator. The globalized dual-heuristic dynamic programming algorithm belongs to the family of approximate dynamic programming algorithms, which are based on Bellman’s dynamic programming idea. These algorithms generally consist of actor and critic structures realized in the form of artificial neural networks. In addition, the control system includes a PD controller, a supervisory term, and an additional control signal. The structure of the supervisory term derives from a stability analysis carried out using the Lyapunov stability theorem. The control system works on-line, and the neural networks’ weights are adapted in every iteration step. A series of computer simulations was carried out in Matlab/Simulink to confirm the performance of the control system.
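As a rough illustration of how the control contributions listed in the abstract above might be combined, the following Python sketch computes a discrete PD term and adds it to the other signals. The function names, the backward-difference derivative, and the plain sum with positive signs are all assumptions made for illustration; the abstract does not give the actual control law.

```python
def pd_term(error, prev_error, kp, kd, dt):
    """Discrete PD control: proportional part plus a backward-difference derivative.

    kp, kd (gains) and dt (sample time) are hypothetical parameters.
    """
    derivative = (error - prev_error) / dt
    return kp * error + kd * derivative


def total_control(u_actor, u_pd, u_supervisory, u_extra):
    """Combine the four contributions named in the abstract.

    Assumption: a plain sum with positive signs; the article may use
    different signs or scaling.
    """
    return u_actor + u_pd + u_supervisory + u_extra


# One control step for a single joint, with made-up numbers.
u_pd = pd_term(error=1.0, prev_error=0.5, kp=2.0, kd=3.0, dt=0.5)
u = total_control(u_actor=0.25, u_pd=u_pd, u_supervisory=0.0, u_extra=0.0)
```

In a multi-joint manipulator the same computation would be applied per joint, with vector-valued errors and gain matrices in place of the scalars used here.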


2014, Vol 2014, pp. 1-16
Author(s): Marcin Szuster, Zenon Hendzel

Network-based control systems have emerged over the past few years as a technology for the control of nonlinear systems. This paper focuses on the implementation of the approximate dynamic programming algorithm in the network-based tracking control system of the two-wheeled mobile robot Pioneer 2-DX. The proposed discrete tracking control system consists of the globalised dual heuristic dynamic programming algorithm, a PD controller, a supervisory term, and an additional control signal. The structure of the supervisory term derives from a stability analysis realised using the Lyapunov stability theorem. The globalised dual heuristic dynamic programming algorithm consists of two structures, the actor and the critic, both realised in the form of neural networks. The actor generates the suboptimal control law, while the critic evaluates the realised control strategy by approximating the value function from Bellman’s equation. The presented discrete tracking control system works online: the neural networks’ weights are adapted in every iteration step, and no preliminary learning procedure for the neural networks is required. The performance of the proposed control system was verified by a series of computer simulations and by experiments realised using the wheeled mobile robot Pioneer 2-DX.
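The critic's role described above, approximating the value function from Bellman's equation with weights adapted at every step, can be sketched with a simple temporal-difference update. This is a minimal illustration, not the paper's method: it assumes a critic that is linear in hand-picked features, a discounted cost, and a semi-gradient step, and it estimates only the value function, whereas a globalised dual heuristic critic also involves the derivative of the value function. All names and parameters are hypothetical.

```python
def critic_value(weights, features):
    """Value estimate V(x) = w . phi(x) for a critic that is linear in its features."""
    return sum(w * f for w, f in zip(weights, features))


def critic_update(weights, features, next_features, cost, gamma, lr):
    """One on-line step that reduces the Bellman residual.

    td_error = cost + gamma * V(x_next) - V(x)
    Semi-gradient update: the next-state value is treated as a fixed target.
    Returns the new weights and the residual before the update.
    """
    td_error = (cost
                + gamma * critic_value(weights, next_features)
                - critic_value(weights, features))
    new_weights = [w + lr * td_error * f for w, f in zip(weights, features)]
    return new_weights, td_error


# A single adaptation step with made-up numbers.
weights = [0.0, 0.0]
weights, err = critic_update(
    weights,
    features=[1.0, 0.0],
    next_features=[0.0, 1.0],
    cost=1.0,
    gamma=0.9,
    lr=0.5,
)
```

Because the update uses only the current transition, it can run inside the control loop at every iteration without a separate pre-training phase, which matches the on-line operation the abstract describes.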

