Vague Neural Network Based Reinforcement Learning Control System for Inverted Pendulum

Author(s):  
Yibiao Zhao ◽  
Siwei Luo ◽  
Liang Wang ◽  
Aidong Ma ◽  
Rui Fang

2021 ◽  
Vol 54 (3-4) ◽  
pp. 417-428
Author(s):  
Yanyan Dai ◽  
KiDong Lee ◽  
SukGyu Lee

For real applications, rotary inverted pendulum systems have long served as a basic model in nonlinear control. Without a deep understanding of control theory, it is difficult to control a rotary inverted pendulum platform using classical control engineering models, as discussed in Section 2.1. Therefore, instead of relying on classical control theory, this paper controls the platform by training and testing a reinforcement learning algorithm. Reinforcement learning (RL) has produced many recent achievements, but there is little research on quickly testing high-frequency RL algorithms in a real hardware environment. In this paper, we propose a real-time hardware-in-the-loop (HIL) control system to train and test a deep reinforcement learning algorithm from simulation through to real hardware implementation. The agent is implemented with the Double Deep Q-Network (DDQN) with prioritized experience replay, which requires no deep understanding of classical control engineering. For the real experiment, we define 21 actions to swing up the rotary inverted pendulum and balance it so that the pendulum moves smoothly. Compared with the Deep Q-Network (DQN), the DDQN with prioritized experience replay reduces the overestimation of Q values and decreases the training time. Finally, this paper presents experimental results comparing classical control theory with different reinforcement learning algorithms.
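The two ingredients named in this abstract, the Double DQN target and prioritized experience replay, can be summarized in a short sketch. The buffer size, the priority exponent, the epsilon-greedy schedule, and the `q_online`/`q_target` helpers below are illustrative assumptions, not the authors' implementation:

```python
# Minimal sketch, assuming q_online(s) and q_target(s) each return a vector of
# 21 Q-values for state s. Hyperparameters are illustrative, not the paper's.
import numpy as np

GAMMA = 0.99          # discount factor (assumed)
N_ACTIONS = 21        # 21 swing-up/balance actions, per the abstract


class PrioritizedReplay:
    """Proportional prioritized experience replay (simplified, no sum-tree)."""

    def __init__(self, capacity=10000, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.prios = [], []

    def add(self, transition, td_error=1.0):
        if len(self.data) >= self.capacity:
            self.data.pop(0), self.prios.pop(0)
        self.data.append(transition)
        self.prios.append((abs(td_error) + 1e-6) ** self.alpha)

    def sample(self, batch_size):
        p = np.asarray(self.prios) / np.sum(self.prios)
        idx = np.random.choice(len(self.data), batch_size, p=p)
        return idx, [self.data[i] for i in idx]

    def update(self, idx, td_errors):
        for i, e in zip(idx, td_errors):
            self.prios[i] = (abs(e) + 1e-6) ** self.alpha


def select_action(q_online, state, epsilon=0.1):
    """Epsilon-greedy selection over the 21 discretized pendulum actions."""
    if np.random.rand() < epsilon:
        return np.random.randint(N_ACTIONS)
    return int(np.argmax(q_online(state)))


def ddqn_targets(batch, q_online, q_target):
    """Double DQN target: the online net selects the action, the target net
    evaluates it, which is what reduces the overestimation bias of plain DQN."""
    targets, td_errors = [], []
    for (s, a, r, s_next, done) in batch:
        if done:
            y = r
        else:
            a_star = int(np.argmax(q_online(s_next)))    # action selection
            y = r + GAMMA * q_target(s_next)[a_star]     # action evaluation
        td_errors.append(y - q_online(s)[a])
        targets.append(y)
    return np.asarray(targets), np.asarray(td_errors)
```

The TD errors returned by `ddqn_targets` are fed back into `PrioritizedReplay.update`, so transitions with larger errors are replayed more often, which is the mechanism the abstract credits with shortening training time.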


1995 ◽  
Vol 13 (7) ◽  
pp. 1006-1019 ◽  
Author(s):  
Teruo Fujii ◽  
Tamaki Ura ◽  
Taku Sutoh ◽  
Kazuo Ishii

2013 ◽  
Vol 13 (02) ◽  
pp. 1350040
Author(s):  
EHSAN TAHAMI ◽  
AMIR HOMAYOUN JAFARI ◽  
ALI FALLAH

In this paper, the control of a planar three-link musculoskeletal arm by an evolutionary actor–critic reinforcement learning (RL) method during a reaching movement to a stationary target is presented. The arm model used in this study included three skeletal links (wrist, forearm, and upper arm), three joints (wrist, elbow, and shoulder, without redundancy), and six non-linear monoarticular muscles (with redundancy), which were based on the Hill model. The learning control system was composed of actor, critic, and genetic algorithm (GA) parts. Two single-layer neural networks were used, one for the actor and one for the critic. This learning control system applied six activation commands to the six monoarticular muscles at each instant of time. It also used reinforcement (reward) feedback for the learning process and for controlling the direction of arm movement. In addition, the GA was implemented to select the best learning rates for the actor–critic neural networks. The results showed that the mean square error (MSE) and the average episode time gradually decreased, and the average reward gradually increased, toward constant values as the control policy was learned. Furthermore, when learning was complete, optimal values of the learning rates had been selected.
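A compact way to see how these parts fit together is the sketch below: a single-layer actor maps the arm state to six muscle activations, a single-layer critic estimates the state value, and both are updated from the TD error. The state dimension, noise scale, and learning rates are illustrative assumptions (in the paper, the learning rates are the quantities selected by the GA), not the authors' code:

```python
# Minimal actor-critic sketch under assumed dimensions and update rules.
import numpy as np

STATE_DIM, N_MUSCLES, GAMMA = 6, 6, 0.98   # illustrative values


class ActorCritic:
    def __init__(self, lr_actor=0.01, lr_critic=0.05):
        # the two learning rates are what the GA would tune in the paper
        self.W_actor = np.zeros((N_MUSCLES, STATE_DIM))
        self.w_critic = np.zeros(STATE_DIM)
        self.lr_a, self.lr_c = lr_actor, lr_critic

    def act(self, state, sigma=0.1):
        """Six muscle activations in [0, 1] with exploration noise."""
        mean = 1.0 / (1.0 + np.exp(-self.W_actor @ state))
        noise = sigma * np.random.randn(N_MUSCLES)
        return np.clip(mean + noise, 0.0, 1.0), noise

    def update(self, state, noise, reward, next_state):
        """TD error drives both the critic step and the actor step."""
        td_error = reward + GAMMA * self.w_critic @ next_state - self.w_critic @ state
        self.w_critic += self.lr_c * td_error * state                   # critic
        self.W_actor += self.lr_a * td_error * np.outer(noise, state)   # actor
        return td_error


# example usage with an assumed arm state vector
ac = ActorCritic(lr_actor=0.01, lr_critic=0.05)
activations, noise = ac.act(np.zeros(STATE_DIM))
```

A GA wrapper would then treat each candidate pair of learning rates as a genome and score it by the reward accumulated over a batch of training episodes, which matches the role the abstract assigns to the GA part.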


1993 ◽  
Vol 5 (6) ◽  
pp. 542-547
Author(s):  
Yasuo Kurematsu ◽  
Takashi Murai ◽  
Takuji Maeda ◽  
Shinzo Kitamura ◽  
...  

The authors are studying autonomous walking trajectory generation for a biped locomotive robot using a system consisting of an inverted pendulum equation and neural networks. This paper uses the trajectory generation system to simulate and verify how the robot reacts to a change in its initial posture, a change in the initial weight coefficients of a multi-layered neural network, or the addition of disturbances during walking. The simulations showed that the initial posture of the robot mainly determined whether walking succeeded, as well as the resulting gait, and that some disturbances did not prevent the robot from walking.
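The abstract does not give the specific pendulum equation, but a common choice for biped trajectory generation is the linear inverted pendulum model, in which the centre of mass moves at constant height on a massless leg. The sketch below assumes that standard model rather than the authors' formulation; the height, time step, and initial conditions are illustrative:

```python
# Linear inverted pendulum model (an assumption, not the paper's equation):
# horizontal motion of the centre of mass at constant height z_c obeys
# x_ddot = (g / z_c) * x.
import numpy as np

G, Z_C, DT = 9.81, 0.8, 0.001   # gravity, pendulum height, time step (illustrative)


def simulate_step(x0, v0, steps=300):
    """Integrate one single-support phase of the linear inverted pendulum."""
    x, v, traj = x0, v0, []
    for _ in range(steps):
        a = (G / Z_C) * x      # the pendulum falls away from the support point
        v += a * DT
        x += v * DT
        traj.append(x)
    return np.asarray(traj)


# example usage: centre-of-mass path for an assumed initial posture and speed
com_path = simulate_step(x0=-0.02, v0=0.15)
```

In a trajectory-generation system of the kind described, a neural network would shape or correct such pendulum-based reference trajectories, which is why the initial posture and the initial network weights both matter in the simulations.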


2020 ◽  
Vol 2020 (4) ◽  
pp. 43-54
Author(s):  
S.V. Khoroshylov ◽  
M.O. Redka ◽  

The aim of the article is to approximate optimal relative control of an underactuated spacecraft using reinforcement learning and to study the influence of various factors on the quality of such a solution. In the course of this study, methods of theoretical mechanics, control theory, stability theory, machine learning, and computer modeling were used. The problem of in-plane spacecraft relative control using only control actions applied tangentially to the orbit is considered. This approach makes it possible to reduce the propellant consumption of the reactive actuators and to simplify the architecture of the control system. However, in some cases, methods of classical control theory do not allow one to obtain acceptable results. In this regard, the possibility of solving this problem by reinforcement learning methods has been investigated; these methods allow designers to find near-optimal control algorithms as a result of interactions of the control system with the plant, using a reinforcement signal that characterizes the quality of the control actions. The well-known quadratic criterion is used as the reinforcement signal, which makes it possible to take into account both the accuracy requirements and the control costs. The search for control actions based on reinforcement learning is carried out using the policy iteration algorithm, implemented with an actor–critic architecture. Various representations of the actor, which implements the control law, and the critic, which provides value function estimates, using neural network approximators are considered. It is shown that the accuracy of the optimal control approximation depends on a number of factors, namely an appropriate structure of the approximators, the neural network parameter updating method, and the learning algorithm parameters. The investigated approach makes it possible to solve the considered class of control problems for controllers of different structures. Moreover, the approach allows the control system to refine its control algorithms during spacecraft operation.
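As a concrete illustration of a quadratic reinforcement signal and one critic update inside a policy-iteration loop, the sketch below uses placeholder weighting matrices, a single tangential control input, and a simple quadratic feature vector for the critic; all of these are assumptions, not the values or network structures used in the article:

```python
# Minimal sketch of a quadratic reward and a linear-in-features critic update.
import numpy as np

Q = np.diag([1.0, 1.0, 0.1, 0.1])   # weights on relative position/velocity errors (assumed)
R = np.array([[0.01]])               # weight on the single tangential control input (assumed)
GAMMA = 0.99


def reinforcement(x, u):
    """Quadratic reinforcement signal: tracking-accuracy term plus control-cost term."""
    return -(x @ Q @ x + u @ R @ u)


def critic_update(w, x, u, x_next, lr=1e-3):
    """One TD step of a critic with value approximator V(x) = w . phi(x),
    a stand-in for the neural-network critics considered in the article."""
    phi = np.concatenate([x, x * x])
    phi_next = np.concatenate([x_next, x_next * x_next])
    td_error = reinforcement(x, u) + GAMMA * w @ phi_next - w @ phi
    return w + lr * td_error * phi, td_error


# example usage with an assumed relative state and tangential thrust command
w = np.zeros(8)
x, u = np.array([1.0, 0.0, 0.0, 0.0]), np.array([0.001])
w, delta = critic_update(w, x, u, x_next=0.99 * x)
```

In a full policy-iteration loop, the actor would then be adjusted toward actions the updated critic scores more highly, alternating evaluation and improvement as the abstract describes.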

