Design and Comparison of Reinforcement-Learning-Based Time-Varying PID Controllers with Gain-Scheduled Actions

Machines ◽  
2021 ◽  
Vol 9 (12) ◽  
pp. 319
Author(s):  
Yi-Liang Yeh ◽  
Po-Kai Yang

This paper presents innovative reinforcement learning methods for automatically tuning the parameters of a proportional-integral-derivative (PID) controller. Conventionally, the high dimensionality of the Q-table is a primary drawback when implementing a reinforcement learning algorithm. To overcome this obstacle, the idea underlying the n-armed bandit problem is used in this paper. Moreover, gain-scheduled actions are presented to tune the algorithms and improve the overall system behavior, so that the proposed controllers fulfill multiple performance requirements. An experiment was conducted on a piezo-actuated stage to illustrate the effectiveness of the proposed control designs relative to competing algorithms.
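The paper's exact formulation is not reproduced here, but the bandit view of gain tuning can be sketched as follows: each candidate PID gain triple is an arm, and the reward for pulling an arm is the negative tracking error from one closed-loop run. Everything below (the first-order plant, the candidate gain sets, the reward definition) is an illustrative assumption, not the paper's setup.

```python
import random

def simulate_pid(kp, ki, kd, setpoint=1.0, dt=0.01, steps=500):
    """Run a PID loop on a toy first-order plant (dy/dt = -y + u);
    return the negative integrated absolute error as the reward."""
    y, integral, prev_err, iae = 0.0, 0.0, 0.0, 0.0
    for _ in range(steps):
        err = setpoint - y
        integral += err * dt
        deriv = (err - prev_err) / dt
        u = kp * err + ki * integral + kd * deriv
        y += dt * (-y + u)          # forward-Euler step of the plant
        prev_err = err
        iae += abs(err) * dt
    return -iae

def epsilon_greedy_tuning(arms, episodes=200, eps=0.1):
    """n-armed-bandit tuning: estimate each arm's value by an
    incremental mean, mostly exploiting the current best arm."""
    counts = [0] * len(arms)
    values = [0.0] * len(arms)
    for _ in range(episodes):
        if random.random() < eps:
            i = random.randrange(len(arms))          # explore
        else:
            i = max(range(len(arms)), key=lambda j: values[j])  # exploit
        r = simulate_pid(*arms[i])
        counts[i] += 1
        values[i] += (r - values[i]) / counts[i]     # incremental mean
    return arms[max(range(len(arms)), key=lambda j: values[j])]

# Hypothetical candidate gain sets (kp, ki, kd).
arms = [(2.0, 1.0, 0.0), (5.0, 2.0, 0.1), (10.0, 5.0, 0.2)]
best = epsilon_greedy_tuning(arms)
```

Because each arm is a whole gain set rather than a (state, action) pair, the table of value estimates has only as many entries as there are candidates, which is the dimensionality saving the abstract alludes to.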

2020 ◽  
Vol 34 (04) ◽  
pp. 6518-6525
Author(s):  
Xiao Xu ◽  
Fang Dong ◽  
Yanghua Li ◽  
Shaojian He ◽  
Xin Li

A contextual bandit problem is studied in a highly non-stationary environment, which is ubiquitous in various recommender systems due to the time-varying interests of users. Two models with disjoint and hybrid payoffs are considered to characterize the phenomenon that users' preferences towards different items vary differently over time. In the disjoint payoff model, the reward of playing an arm is determined by an arm-specific preference vector, which is piecewise-stationary with asynchronous and distinct changes across different arms. An efficient learning algorithm that is adaptive to abrupt reward changes is proposed, and theoretical regret analysis is provided to show that a sublinear scaling of regret in the time horizon T is achieved. The algorithm is further extended to a more general setting with hybrid payoffs, where the reward of playing an arm is determined by both an arm-specific preference vector and a joint coefficient vector shared by all arms. Empirical experiments are conducted on real-world datasets to verify the advantages of the proposed learning algorithms against baselines in both settings.
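The paper's change-adaptive algorithm is not reproduced here, but the core difficulty it addresses, stale reward estimates after an abrupt change, can be illustrated with a generic sliding-window UCB sketch: old rewards fall out of a fixed-length window, so the arm-value estimates re-adapt after a changepoint. The window size, two-arm environment, and changepoint below are all illustrative assumptions.

```python
import math
import random
from collections import deque

def sliding_window_ucb(reward_fn, n_arms, horizon, window=100):
    """UCB over only the last `window` rewards of each arm, so the
    estimates forget rewards observed before an abrupt change."""
    history = [deque(maxlen=window) for _ in range(n_arms)]
    total = 0.0
    for t in range(1, horizon + 1):
        ucb = []
        for a in range(n_arms):
            if not history[a]:
                ucb.append(float("inf"))     # play each arm at least once
            else:
                mean = sum(history[a]) / len(history[a])
                bonus = math.sqrt(2 * math.log(min(t, window)) / len(history[a]))
                ucb.append(mean + bonus)
        arm = max(range(n_arms), key=lambda a: ucb[a])
        r = reward_fn(arm, t)
        history[arm].append(r)
        total += r
    return total

# Piecewise-stationary environment: the two arms swap mean rewards at t = 500.
def reward_fn(arm, t):
    means = [0.9, 0.1] if t <= 500 else [0.1, 0.9]
    return means[arm] + random.gauss(0, 0.05)

random.seed(0)
total = sliding_window_ucb(reward_fn, n_arms=2, horizon=1000)
```

A stationary UCB would keep favoring the pre-change arm long after t = 500; the windowed estimator recovers within roughly one window length, which is the kind of adaptivity the paper's regret analysis quantifies in the linear-payoff setting.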


Processes ◽  
2020 ◽  
Vol 8 (2) ◽  
pp. 196 ◽  
Author(s):  
Shiquan Zhao ◽  
Sheng Liu ◽  
Robain De Keyser ◽  
Clara-Mihaela Ionescu

In large-scale ships, the most widely used controllers for the steam/water loop are still proportional-integral-derivative (PID) controllers. However, the tuning rules for the PID parameters are based on empirical knowledge, and the performance of the loops is not satisfactory. In order to improve the control performance of the steam/water loop, the application of a recently developed PID autotuning method is studied. Firstly, a 'forbidden region' on the Nyquist plane is obtained from user-defined performance requirements such as robustness or gain and phase margins. Secondly, the dynamics of the system are obtained with a sine test around the operating point. Finally, the PID controller's parameters are obtained by locating the frequency response of the controlled system at the edge of the 'forbidden region'. To verify the effectiveness of the new PID autotuning method, comparisons are presented with other PID autotuning methods, as well as with model predictive control. The results show the superiority of the new PID autotuning method.
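The sine-test step can be sketched in isolation: excite the plant with a sinusoid, correlate the steady-state output against a complex exponential to estimate G(jw) at the test frequency, then solve for controller gains that place the open-loop point C(jw)G(jw) at a chosen target on the Nyquist plane. The first-order plant, test frequency, and target point below are assumptions for illustration; the paper's forbidden-region construction itself is not reproduced.

```python
import cmath
import math

def plant_response(u, dt, tau=1.0, k=2.0):
    """Simulate a first-order plant k/(tau*s + 1) with forward Euler."""
    y, out = 0.0, []
    for ui in u:
        y += dt * (k * ui - y) / tau
        out.append(y)
    return out

def estimate_freq_response(w, dt=0.001, cycles=20):
    """Sine test: correlate the output with exp(-jwt) to recover G(jw)."""
    n = int(cycles * 2 * math.pi / (w * dt))
    u = [math.sin(w * i * dt) for i in range(n)]
    y = plant_response(u, dt)
    # Discard the first half so the start-up transient has died out.
    acc = sum(y[i] * cmath.exp(-1j * w * i * dt) for i in range(n // 2, n))
    # For a sin(wt) input, this correlation averages to G(jw) / (2j).
    n_used = n - n // 2
    return acc * 2j / n_used

w_test = 1.0
g = estimate_freq_response(w_test)            # estimated G(jw) at w_test
target = cmath.rect(0.5, math.radians(-120))  # assumed open-loop target point
c = target / g                                # required C(jw) at w_test
# C(jw) = kp + j*(kd*w - ki/w): fix ki, then match real and imaginary parts.
ki = 0.1
kp = c.real
kd = (c.imag + ki / w_test) / w_test
```

For this plant the true response at w = 1 is G(j1) = 2/(1 + j) = 1 - j, so the correlation estimate can be checked directly; a margin-based forbidden region would simply constrain where `target` is allowed to lie.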


Author(s):  
Arman Zandi Nia ◽  
Ryozo Nagamune

This paper proposes an application of the switching gain-scheduled (S-GS) proportional–integral–derivative (PID) control technique to the electronic throttle control (ETC) problem in automotive engines. For the S-GS PID controller design, a published linear parameter-varying (LPV) model of the electronic throttle valve (ETV) is adopted whose dynamics change with both the throttle valve velocity variation and the battery voltage fluctuation. The designed controller consists of multiple GS PID controllers assigned to local subregions defined for varying throttle valve velocity and battery voltage. Hysteresis switching logic is employed for switching between local GS PID controllers based on the operating point. The S-GS PID controller design problem is formulated as a nonconvex optimization problem and tackled by solving its convex subproblems iteratively. Experimental results demonstrate overall superiority of the S-GS PID controller to conventional controllers in reference tracking performance of the throttle valve under various scenarios.
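The hysteresis switching logic can be illustrated independently of the PID design: the scheduling variable's range is split into regions, and the active local controller changes only after the variable crosses a boundary by more than a hysteresis band, which prevents chattering near boundaries. The region boundaries and band width below are assumed values, and for simplicity this sketch steps one region per update rather than modelling the paper's two-dimensional (velocity, voltage) subregions.

```python
class HysteresisSwitch:
    """Select among local controllers indexed by region of a scalar
    scheduling variable, with hysteresis to avoid switching chatter."""

    def __init__(self, boundaries, band):
        self.boundaries = boundaries   # e.g. [2.0, 4.0] -> 3 regions
        self.band = band               # hysteresis half-width
        self.active = 0                # index of the active region

    def region_of(self, x):
        """Nominal region of x, ignoring hysteresis."""
        for i, b in enumerate(self.boundaries):
            if x < b:
                return i
        return len(self.boundaries)

    def update(self, x):
        """Return the active region; switch only once x is past the
        shared boundary by more than the hysteresis band."""
        nominal = self.region_of(x)
        if nominal > self.active and x > self.boundaries[self.active] + self.band:
            self.active += 1           # step up one region
        elif nominal < self.active and x < self.boundaries[self.active - 1] - self.band:
            self.active -= 1           # step down one region
        return self.active

sw = HysteresisSwitch(boundaries=[2.0, 4.0], band=0.2)
```

With `band = 0.2`, a value oscillating between 1.9 and 2.1 never toggles the controller, whereas a clean threshold at 2.0 would switch on every sample.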


Symmetry ◽  
2021 ◽  
Vol 13 (3) ◽  
pp. 471
Author(s):  
Jai Hoon Park ◽  
Kang Hoon Lee

Designing novel robots that can cope with a specific task is a challenging problem because of the enormous design space, which involves both morphological structures and control mechanisms. To this end, we present a computational method for automating the design of modular robots. Our method employs a genetic algorithm to evolve robotic structures as an outer optimization, and it applies a reinforcement learning algorithm to each candidate structure to train its behavior and evaluate its potential learning ability as an inner optimization. Compared with evolving both structure and behavior simultaneously, evolving only the robotic structure and optimizing behavior with a separate training algorithm reduces the size of the design space significantly. Mutual dependence between evolution and learning is achieved by regarding the mean cumulative reward of a candidate structure in the reinforcement learning as its fitness in the genetic algorithm. Therefore, our method searches for prospective robotic structures that can potentially lead to near-optimal behaviors if trained sufficiently. We demonstrate the usefulness of our method through several effective design results that were automatically generated in the process of experimenting with an actual modular robotics kit.
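The nested optimization described above can be sketched in miniature: an outer genetic algorithm evolves structures (here just bit-strings standing in for module layouts), and each candidate's fitness is the reward its behavior reaches after a short inner training phase. The toy reward function, random-search stand-in for the RL inner loop, and all hyperparameters are illustrative assumptions, not the paper's setup.

```python
import random

def train_and_evaluate(structure, trials=30):
    """Stand-in for the inner RL loop: random-search a single policy
    parameter and report the best reward the structure can reach."""
    capacity = sum(structure)                 # more modules -> higher ceiling
    best = float("-inf")
    for _ in range(trials):
        theta = random.uniform(0, 1)          # toy policy parameter
        reward = capacity * (1 - (theta - 0.7) ** 2)
        best = max(best, reward)
    return best

def evolve(pop_size=20, genome_len=8, generations=15, mut_rate=0.1):
    """Outer GA: fitness of each genome is its post-training reward."""
    pop = [[random.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=train_and_evaluate, reverse=True)
        parents = scored[: pop_size // 2]     # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, genome_len)
            child = a[:cut] + b[cut:]         # one-point crossover
            child = [g ^ (random.random() < mut_rate) for g in child]  # bit-flip mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=train_and_evaluate)

random.seed(1)
best = evolve()
```

The key coupling is that `train_and_evaluate` is called inside the GA's selection step, so evolution rewards structures for what they can *learn* to do, not for any fixed behavior, mirroring the mean-cumulative-reward fitness in the paper.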

