Overview of Reinforcement Learning and its Application in Control Theory

A novel decentralized reinforcement learning robust optimal tracking control theory for time-varying constrained reconfigurable modular robots, based on the action-critic-identifier (ACI) structure and the state-action value function (Q-function), is presented to solve the continuous-time nonlinear optimal control problem for strongly coupled uncertain robotic systems. The dynamics of the time-varying constrained reconfigurable modular robot are described as a synthesis of interconnected subsystems, and a continuous-time state equation and Q-function are designed in this paper. By combining the ACI with an RBF network, the global uncertainty of the subsystem and the HJB (Hamilton-Jacobi-Bellman) equation are estimated: the critic-NN and action-NN approximate the optimal Q-function and the optimal control policy, the identifier identifies the global uncertainty, and the RBF-NN is used to update the weights of the ACI-NN. On this basis, a novel decentralized robust optimal tracking controller of the subsystem is proposed, so that the subsystem tracks the desired trajectory and the tracking error converges to zero in finite time. The stability of the ACI and of the robust optimal tracking controller is confirmed by Lyapunov theory. Finally, comparative simulation examples illustrate the effectiveness of the proposed ACI and decentralized control theory.
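
For orientation, the HJB equation that the critic and action networks approximate can be written in a generic textbook form. This is only a sketch under assumed notation, not the formulation used in the paper: the control-affine dynamics $f$, $g$, the state cost $\ell$, the control weight $R$, and the optimal value function $V^{*}$ are all illustrative symbols, and the paper's tracking and Q-function structure is not reproduced here.

```latex
% Generic continuous-time, infinite-horizon HJB equation.
% f, g, \ell, R, V^* are assumed symbols for illustration only.
\begin{align}
  \dot{x} &= f(x) + g(x)\,u, \\
  V^{*}(x(t)) &= \min_{u} \int_{t}^{\infty}
      \bigl( \ell(x) + u^{\top} R\, u \bigr)\, d\tau, \\
  0 &= \min_{u} \Bigl[ \ell(x) + u^{\top} R\, u
      + \nabla V^{*}(x)^{\top} \bigl( f(x) + g(x)\,u \bigr) \Bigr], \\
  u^{*}(x) &= -\tfrac{1}{2}\, R^{-1} g(x)^{\top} \nabla V^{*}(x).
\end{align}
```

The last line follows from setting the derivative of the bracketed term with respect to $u$ to zero. In an actor-critic scheme of this kind, one network approximates the value (or Q) function and another approximates the minimizing control.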

Implementation of Hypersonic Motion Primitives for Reinforcement Learning Using Optimal Control Theory.

10.2172/1837136 ◽

2020 ◽

Author(s):

Sean Nolan ◽

Matthew Lanier ◽

Andrew Haines ◽

Linus Mockus ◽

Kristopher Ezra ◽

...

Keyword(s):

Optimal Control ◽

Reinforcement Learning ◽

Control Theory ◽

Optimal Control Theory ◽

Motion Primitives ◽

Hypersonic Motion

An introduction to stochastic control theory, path integrals and reinforcement learning

10.1063/1.2709596 ◽

2007 ◽

Cited By ~ 33

Author(s):

Hilbert J. Kappen

Keyword(s):

Reinforcement Learning ◽

Control Theory ◽

Stochastic Control ◽

Path Integrals ◽

Stochastic Control Theory

Implementation of Hypersonic Motion Primitives for Reinforcement Learning Using Optimal Control Theory.

10.2172/1837137 ◽

2020 ◽

Author(s):

Sean Nolan ◽

Matthew Lanier ◽

Andrew Haines ◽

Linus Mockus ◽

Kristopher Ezra ◽

...

Keyword(s):

Optimal Control ◽

Reinforcement Learning ◽

Control Theory ◽

Optimal Control Theory ◽

Motion Primitives ◽

Hypersonic Motion

Using Control Theory for Analysis of Reinforcement Learning and Optimal Policy Properties in Grid-World Problems

Emerging Intelligent Computing Technology and Applications. With Aspects of Artificial Intelligence - Lecture Notes in Computer Science ◽

10.1007/978-3-642-04020-7_30 ◽

2009 ◽

pp. 276-285

Author(s):

S. Mostapha Kalami Heris ◽

Mohammad-Bagher Naghibi Sistani ◽

Naser Pariz

Keyword(s):

Reinforcement Learning ◽

Control Theory ◽

Optimal Policy

Using Control Theory and Bayesian Reinforcement Learning for Policy Management in Pandemic Situations

2021 IEEE International Conference on Communications Workshops (ICC Workshops) ◽

10.1109/iccworkshops50388.2021.9473604 ◽

2021 ◽

Author(s):

Heena Rathore ◽

Abhay Samant

Keyword(s):

Reinforcement Learning ◽

Control Theory ◽

Policy Management ◽

Bayesian Reinforcement Learning

The Threat of Intelligent Attackers Using Deep Learning

Deep Learning Strategies for Security Enhancement in Wireless Sensor Networks - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-7998-5068-7.ch006 ◽

2020 ◽

pp. 110-133

Author(s):

Juan Parras ◽

Santiago Zazo

Keyword(s):

Artificial Intelligence ◽

Deep Learning ◽

Reinforcement Learning ◽

Control Theory ◽

Defense Mechanism ◽

Ratio Test ◽

Network Vulnerabilities ◽

Recent Developments ◽

Sequential Probability ◽

Full Knowledge

The significant increase in the number of interconnected devices has brought new services and applications, as well as new network vulnerabilities. The increasing hardware capacities of these devices and the developments in the artificial intelligence field mean that new and complex attack methods are being developed. This chapter focuses on the backoff attack in a wireless network using CSMA/CA multiple access, and it shows that an intelligent attacker, making use of control theory, can successfully exploit a defense mechanism based on a sequential probability ratio test. Also, recent developments in the deep reinforcement learning field allow attackers that do not have full knowledge of the defense mechanism to learn to attack it successfully. Thus, this chapter illustrates, by means of the backoff attack, the possibilities that recent advances in the artificial intelligence field bring to intelligent attackers, and it highlights the importance of research into intelligent defense methods able to cope with such attackers.
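
To make the defense mechanism named in this abstract concrete, the following is a minimal sketch of Wald's sequential probability ratio test on Bernoulli observations. It is not the chapter's detector: the observation model (a 0/1 flag per transmission, for example "backoff was suspiciously short"), the rates `p0` and `p1`, and the function name `sprt_bernoulli` are all assumptions made for illustration.

```python
import math


def sprt_bernoulli(samples, p0, p1, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: p = p0 (honest) versus H1: p = p1 (attacker).

    samples: iterable of 0/1 flags (hypothetical observation model).
    alpha, beta: target false-alarm and missed-detection probabilities.
    Returns (decision, number_of_samples_used).
    """
    upper = math.log((1 - beta) / alpha)   # cross this: accept H1
    lower = math.log(beta / (1 - alpha))   # cross this: accept H0
    llr = 0.0                              # cumulative log-likelihood ratio
    n = 0
    for n, x in enumerate(samples, start=1):
        if x:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "undecided", n
```

Because the test only decides once the cumulative log-likelihood ratio leaves the band between the two thresholds, a sequence that keeps the ratio inside the band is never flagged. For instance, `sprt_bernoulli([1, 0, 1, 0], 0.2, 0.6)` stays undecided, which gives some intuition for how an attacker who shapes its behavior could exploit such a detector.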

Changing Not Just Analyzing: Control Theory and Reinforcement Learning

Realtime Data Mining - Applied and Numerical Harmonic Analysis ◽

10.1007/978-3-319-01321-3_3 ◽

2013 ◽

pp. 15-40

Author(s):

Alexander Paprotny ◽

Michael Thess

Keyword(s):

Reinforcement Learning ◽

Control Theory

Relative control of an underactuated spacecraft using reinforcement learning

Technical mechanics ◽

10.15407/itm2020.04.043 ◽

2020 ◽

Vol 2020 (4) ◽

pp. 43-54

Author(s):

S.V. Khoroshylov ◽

M.O. Redka

Keyword(s):

Neural Network ◽

Control System ◽

Reinforcement Learning ◽

Control Theory ◽

Learning Algorithm ◽

Control Algorithms ◽

Iteration Algorithm ◽

Quality Of Control ◽

Control Actions

The aim of the article is to approximate optimal relative control of an underactuated spacecraft using reinforcement learning and to study the influence of various factors on the quality of such a solution. In the course of this study, methods of theoretical mechanics, control theory, stability theory, machine learning, and computer modeling were used. The problem of in-plane spacecraft relative control using only control actions applied tangentially to the orbit is considered. This approach makes it possible to reduce the propellant consumption of reactive actuators and to simplify the architecture of the control system. However, in some cases, methods of classical control theory do not yield acceptable results. In this regard, the possibility of solving this problem by reinforcement learning methods has been investigated. These methods allow designers to find near-optimal control algorithms through interactions of the control system with the plant, using a reinforcement signal that characterizes the quality of control actions. The well-known quadratic criterion is used as the reinforcement signal, which makes it possible to take into account both the accuracy requirements and the control costs. The search for control actions based on reinforcement learning is made using the policy iteration algorithm, which is implemented using the actor-critic architecture. Various representations of the actor for control law implementation and of the critic for obtaining value function estimates using neural network approximators are considered. It is shown that the accuracy of the optimal control approximation depends on a number of features, namely, an appropriate structure of the approximators, the neural network parameter updating method, and the learning algorithm parameters. The investigated approach makes it possible to solve the considered class of control problems for controllers of different structures. Moreover, the approach allows the control system to refine its control algorithms during the spacecraft operation.
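
The interplay of policy iteration, a quadratic criterion, and an actor-critic split can be illustrated on a much simpler problem than the one studied in the article. The sketch below assumes a scalar discrete-time linear plant `x[k+1] = a*x[k] + b*u[k]` with stage cost `q*x**2 + r*u**2` and a linear control law `u = -k*x`; it uses closed-form expressions where the article uses neural network approximators, and every name and number in it is hypothetical.

```python
def evaluate_policy(a, b, q, r, k):
    """Critic step: return p such that V(x) = p * x**2 under u = -k * x."""
    a_cl = a - b * k                      # closed-loop dynamics coefficient
    if abs(a_cl) >= 1.0:
        raise ValueError("gain k does not stabilize the plant")
    return (q + r * k * k) / (1.0 - a_cl * a_cl)


def improve_policy(a, b, r, p):
    """Actor step: gain that minimizes q*x^2 + r*u^2 + p*(a*x + b*u)^2."""
    return a * b * p / (r + b * b * p)


def policy_iteration(a, b, q, r, k0, iterations=50):
    """Alternate critic and actor steps from a stabilizing gain k0.

    Assumes iterations >= 1. Returns (gain, value coefficient, history of p).
    """
    k = k0
    history = []
    for _ in range(iterations):
        p = evaluate_policy(a, b, q, r, k)
        history.append(p)
        k = improve_policy(a, b, r, p)
    return k, p, history


# Hypothetical unstable plant with an initial stabilizing gain.
gain, value, value_history = policy_iteration(a=1.1, b=0.5, q=1.0, r=1.0, k0=1.0)
```

In this setting the value coefficient does not increase from one iteration to the next and settles at the solution of the scalar discrete-time Riccati equation. The quadratic cost plays the role of the reinforcement signal: larger `q` penalizes tracking error, larger `r` penalizes control effort.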

Modeling an Inverted Pendulum via Differential Equations and Reinforcement Learning Techniques

10.20944/preprints202005.0181.v1 ◽

2020 ◽

Author(s):

Siddharth Sharma

Keyword(s):

Reinforcement Learning ◽

Differential Equations ◽

Control Theory ◽

Inverted Pendulum ◽

Control Mechanism ◽

Dynamic Environment ◽

Mathematical Technique ◽

Q Learning ◽

Research Systems ◽

Learning Techniques

The prevalence of differential equations as a mathematical technique has refined the fields of control theory and constrained optimization due to the newfound ability to accurately model chaotic, unbalanced systems. However, in recent research, systems are increasingly nonlinear and difficult to model using differential equations alone. Thus, a newer approach is to use policy iteration and reinforcement learning, techniques that center on an action and reward sequence for a controller. Reinforcement learning (RL) can be applied to control theory problems because it can operate robustly in a dynamic environment such as the cartpole system (an inverted pendulum). This solution avoids the use of PID or other dynamics optimization systems in favor of a more robust, reward-based control mechanism. This paper applies RL and Q-learning to the classic cartpole problem and discusses the mathematical background and the differential equations used to model the system.
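
The Q-learning update mentioned in this abstract can be sketched on a toy stand-in for the balancing task. The environment below is an assumption made purely for illustration and is not the paper's cartpole model: the pole angle is reduced to five hypothetical bins, bins 0 and 4 mean the pole has fallen, each push moves the angle one bin, and each surviving step earns a reward of 1.

```python
import random

N_STATES = 5          # hypothetical angle bins; 0 and 4 mean the pole fell
TERMINAL = {0, 4}
ACTIONS = (0, 1)      # 0 = push left (bin - 1), 1 = push right (bin + 1)


def step(state, action):
    """Toy dynamics: move one bin; reward 1.0 for every step survived."""
    next_state = state - 1 if action == 0 else state + 1
    done = next_state in TERMINAL
    return next_state, (0.0 if done else 1.0), done


def greedy_action(q_table, state):
    """Action with the highest Q-value (ties resolve to action 0)."""
    return max(ACTIONS, key=lambda a: q_table[state][a])


def train(episodes=500, alpha=0.1, gamma=0.9, epsilon=0.2,
          max_steps=20, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q_table = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 2                                  # start upright
        for _ in range(max_steps):
            if rng.random() < epsilon:             # explore
                action = 1 if rng.random() < 0.5 else 0
            else:                                  # exploit
                action = greedy_action(q_table, state)
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q_table[next_state])
            # Q-learning update: move Q(s, a) toward the bootstrapped target.
            target = reward + gamma * best_next
            q_table[state][action] += alpha * (target - q_table[state][action])
            if done:
                break
            state = next_state
    return q_table
```

After training, the greedy policy pushes back toward the center bin from either side, for example `greedy_action(train(), 1)` selects the push to the right. A real cartpole has a continuous four-dimensional state, so a tabular method would need a discretization of that state or a function approximator in place of `q_table`.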