Online Tuning of a PID Controller with a Fuzzy Reinforcement Learning MAS for Flow Rate Control of a Desalination Unit

Electronics ◽  
2019 ◽  
Vol 8 (2) ◽  
pp. 231 ◽  
Author(s):  
Panagiotis Kofinas ◽  
Anastasios I. Dounis

This paper proposes a hybrid Ziegler-Nichols (Z-N) fuzzy reinforcement learning MAS (Multi-Agent System) approach for online tuning of a Proportional Integral Derivative (PID) controller in order to control the flow rate of a desalination unit. The PID gains are set by the Z-N method and are then adapted online through the fuzzy Q-learning MAS. Fuzzy Q-learning is introduced in each agent in order to cope with the continuous state-action space. The global state of the MAS is defined by the value of the error and the derivative of the error. The MAS consists of three agents, and the output signal of each agent defines the percentage change of its gain. Each gain can be increased or decreased by 0% to 100% of its initial value. The simulation results highlight the performance of the suggested hybrid control strategy through comparison with a conventional PID controller tuned by Z-N.
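The abstract gives no code; purely as an illustration of the scheme it describes, the sketch below uses three independent Q-learning agents over a crudely discretized (error, error-derivative) state in place of a full fuzzy inference system. Class names, bin counts, and learning rates are assumptions, not the authors' implementation.

```python
import numpy as np

# Minimal sketch: three Q-learning agents, one per PID gain (Kp, Ki, Kd).
# Each agent observes a coarse global state (error, d_error) and picks a
# percentage change of its gain in [-100%, +100%] of the initial Z-N value.
# A true fuzzy Q-learning agent would blend actions by membership degrees;
# here the state is simply discretized to keep the example short.

class GainAgent:
    def __init__(self, initial_gain, n_states=9,
                 actions=np.linspace(-1.0, 1.0, 5),
                 alpha=0.1, gamma=0.9, eps=0.1):
        self.g0 = initial_gain            # Z-N starting value
        self.actions = actions            # fraction of g0 to add or subtract
        self.q = np.zeros((n_states, len(actions)))
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def act(self, state):
        if np.random.rand() < self.eps:
            return np.random.randint(len(self.actions))
        return int(np.argmax(self.q[state]))

    def gain(self, a):
        # 0% to 100% increase or reduction of the initial gain
        return self.g0 * (1.0 + self.actions[a])

    def update(self, s, a, reward, s_next):
        td_target = reward + self.gamma * self.q[s_next].max()
        self.q[s, a] += self.alpha * (td_target - self.q[s, a])

def discretize(error, d_error, bins=3, scale=1.0):
    """Map (error, d_error) to one of bins*bins global states shared by all agents."""
    e = int(np.clip(np.digitize(error / scale, [-0.5, 0.5]), 0, bins - 1))
    de = int(np.clip(np.digitize(d_error / scale, [-0.5, 0.5]), 0, bins - 1))
    return e * bins + de
```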

Algorithms ◽  
2018 ◽  
Vol 11 (10) ◽  
pp. 148 ◽  
Author(s):  
Panagiotis Kofinas ◽  
Anastasios I. Dounis

This paper proposes a hybrid Ziegler-Nichols (Z-N) reinforcement learning approach for online tuning of the parameters of a Proportional Integral Derivative (PID) controller for controlling the speed of a DC motor. The PID gains are set by the Z-N method and are then adapted online through a fuzzy Q-learning agent. The fuzzy Q-learning agent is used instead of conventional Q-learning in order to deal with the continuous state-action space. The fuzzy Q-learning agent defines its state according to the value of the error. The output signal of the agent consists of three output variables, each of which defines the percentage change of one gain. Each gain can be increased or decreased by 0% to 50% of its initial value. Through this method, the gains of the controller are adjusted online via interaction with the environment, and expert knowledge is not required during the setup process. The simulation results highlight the performance of the proposed control strategy. After the exploration phase, the settling time is reduced in the steady states. In the transient states, the response has smaller amplitude oscillations and reaches the equilibrium point faster than with the conventional PID controller.
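Both abstracts start from Z-N gains before the learning agent adapts them. As a reference point, a minimal sketch of the classic closed-loop Ziegler-Nichols rules is shown below; the ultimate gain and period values in the example are purely illustrative.

```python
def ziegler_nichols_pid(k_u, t_u):
    """Classic closed-loop Ziegler-Nichols rules for a PID controller.

    k_u: ultimate gain at which the loop oscillates with constant amplitude
    t_u: period of that oscillation (seconds)
    Returns (Kp, Ki, Kd) for the parallel form u = Kp*e + Ki*integral(e) + Kd*de/dt.
    """
    kp = 0.6 * k_u
    ti = 0.5 * t_u          # integral time
    td = 0.125 * t_u        # derivative time
    return kp, kp / ti, kp * td

# Example: k_u = 4.0, t_u = 1.6 s  ->  Kp = 2.4, Ki = 3.0, Kd = 0.48
kp, ki, kd = ziegler_nichols_pid(4.0, 1.6)
```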


2020 ◽  
Vol 17 (2) ◽  
pp. 647-664
Author(s):  
Yangyang Ge ◽  
Fei Zhu ◽  
Wei Huang ◽  
Peiyao Zhao ◽  
Quan Liu

Multi-agent systems have broad applications in the real world, but their safety is rarely considered. Reinforcement learning is one of the most important methods for solving multi-agent problems. At present, progress has been made in applying multi-agent reinforcement learning to robot systems, human-machine matches, automation, and so on. However, in these areas an agent may fall into unsafe states in which it is difficult to bypass obstacles, to receive information from other agents, and so on. Ensuring the safety of a multi-agent system is of great importance in areas where an agent may fall into dangerous, irreversible states, causing great damage. To solve the safety problem, this paper introduces a Multi-Agent Cooperation Q-Learning Algorithm based on a Constrained Markov Game. In this method, safety constraints are added to the set of actions, and each agent, when interacting with the environment to search for optimal values, is restricted by the safety rules, so as to obtain an optimal policy that satisfies the safety requirements. Since traditional multi-agent reinforcement learning algorithms are no longer suitable for the proposed model, a new solution is introduced for calculating the global optimal state-action function that satisfies the safety constraints. The Lagrange multiplier method is used to determine the optimal action that can be performed in the current state, based on linearized constraint functions and under the condition that the state-action function and the constraint function are both differentiable. This not only improves the efficiency and accuracy of the algorithm but also guarantees a globally optimal solution. The experiments verify the effectiveness of the algorithm.
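For intuition only, the sketch below shows a single-agent Lagrangian form of constrained Q-learning: one table for the task reward, one for the expected safety cost, and a multiplier that penalizes actions whose long-run cost exceeds a budget. It is not the paper's multi-agent formulation; all table sizes, rates, and the budget are assumptions.

```python
import numpy as np

# Lagrangian action selection under a safety constraint (illustrative sketch).
n_states, n_actions = 10, 4
alpha, gamma, lr_lambda, d = 0.1, 0.95, 0.01, 0.2

q_r = np.zeros((n_states, n_actions))   # value of the task reward
q_c = np.zeros((n_states, n_actions))   # value of the safety cost
lam = 0.0                                # Lagrange multiplier

def select_action(s, eps=0.1):
    if np.random.rand() < eps:
        return np.random.randint(n_actions)
    # maximize the Lagrangian: reward value penalized by the weighted cost value
    return int(np.argmax(q_r[s] - lam * q_c[s]))

def update(s, a, r, c, s_next):
    global lam
    a_next = int(np.argmax(q_r[s_next] - lam * q_c[s_next]))
    q_r[s, a] += alpha * (r + gamma * q_r[s_next, a_next] - q_r[s, a])
    q_c[s, a] += alpha * (c + gamma * q_c[s_next, a_next] - q_c[s, a])
    # dual ascent: raise lam when the observed cost exceeds the budget d
    lam = max(0.0, lam + lr_lambda * (c - d))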


Respuestas ◽  
2018 ◽  
Vol 23 (2) ◽  
pp. 53-61
Author(s):  
David Luviano Cruz ◽  
Francesco José García Luna ◽  
Luis Asunción Pérez Domínguez

This paper presents a hybrid control proposal for multi-agent systems in which the advantages of reinforcement learning and nonparametric functions are exploited. A modified version of the Q-learning algorithm is used to provide training data for a kernel; this approach yields a sub-optimal set of actions to be used by the agents. The proposed algorithm is experimentally tested in a path-generation task for mobile robots in an unknown environment.
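One plausible reading of "Q-learning provides training data for a kernel" is nonparametric smoothing of the learned Q-values, so nearby unvisited states inherit usable actions. The sketch below uses a Gaussian (Nadaraya-Watson) kernel for that purpose; the bandwidth, data shapes, and function names are assumptions rather than the authors' method.

```python
import numpy as np

# A tabular Q-learning pass produces (state, q_values) samples; a Gaussian
# kernel then smooths them so nearby, unvisited states get a sub-optimal
# but usable action.

def gaussian_kernel(x, xi, h=0.5):
    return np.exp(-np.sum((x - xi) ** 2) / (2.0 * h ** 2))

def kernel_q(x, states, q_table, h=0.5):
    """Nadaraya-Watson estimate of the Q-vector at continuous state x."""
    w = np.array([gaussian_kernel(x, s, h) for s in states])
    w = w / (w.sum() + 1e-12)
    return w @ q_table            # weighted average of stored Q-rows

# states: visited states (n, dim); q_table: matching Q-rows (n, n_actions)
states = np.random.rand(50, 2)
q_table = np.random.rand(50, 3)
action = int(np.argmax(kernel_q(np.array([0.4, 0.7]), states, q_table)))
```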


2012 ◽  
Vol 566 ◽  
pp. 572-579
Author(s):  
Abdolkarim Niazi ◽  
Norizah Redzuan ◽  
Raja Ishak Raja Hamzah ◽  
Sara Esfandiari

In this paper, a new algorithm based on case-based reasoning and reinforcement learning (RL) is proposed to increase the convergence rate of RL algorithms. RL algorithms are very useful for solving a wide variety of decision problems when models are not available and correct decisions must be made in every state of the system, such as in multi-agent systems, control systems, robotics, tool condition monitoring, and so on. In the proposed method, we investigate how to improve action selection in the RL algorithm. A new combined model using a case-based reasoning system and a new optimized function is proposed to select the action, which increases the convergence rate of Q-learning-based algorithms. The algorithm was used to solve cooperative Markov games, one of the models of Markov-based multi-agent systems. The experimental results indicated that the proposed algorithm performs better than the existing algorithms in terms of the speed and accuracy of reaching the optimal policy.
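As a rough illustration of how case-based reasoning can bias action selection in Q-learning, the sketch below reuses the action of the most similar stored case instead of exploring uniformly at random. The similarity threshold and data structures are assumptions, not the paper's exact method.

```python
import numpy as np

# A case base of previously successful (state, action) pairs steers exploration,
# so the agent reuses past experience before falling back to random actions.

case_base = []          # list of (state_vector, action) pairs

def retrieve(state, threshold=0.5):
    """Return the action of the most similar stored case, if close enough."""
    best_a, best_d = None, np.inf
    for s, a in case_base:
        d = np.linalg.norm(state - s)
        if d < best_d:
            best_a, best_d = a, d
    return best_a if best_d < threshold else None

def select_action(state, q_row, eps=0.1):
    if np.random.rand() < eps:
        reused = retrieve(state)
        # explore randomly only when no similar case exists
        return reused if reused is not None else np.random.randint(len(q_row))
    return int(np.argmax(q_row))
```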


Author(s):  
Abdelghafour Harraz ◽  
Mostapha Zbakh

Artificial Intelligence makes it possible to create engines that can explore and learn environments and therefore build policies that control them in real time with no human intervention. Through its reinforcement learning component, using frameworks such as temporal differences, State-Action-Reward-State-Action (SARSA), and Q-learning, to name a few, it can be applied to systems that can be modeled as a Markov Decision Process. This opens the door to applying reinforcement learning to cloud load balancing, in order to dispatch load dynamically to a given cloud system. The authors describe different techniques that can be used to implement a reinforcement learning based engine in a cloud system.
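A hedged sketch of how such a load balancer could be framed as an MDP: the state is a coarse view of the server queues, the action is the server chosen for the next request, and the reward penalizes response time. Both update rules named in the text are shown; the server count and state discretization are illustrative assumptions.

```python
import numpy as np

n_servers, n_states = 3, 64
alpha, gamma = 0.1, 0.9
q = np.zeros((n_states, n_servers))

def q_learning_update(s, a, r, s_next):
    # off-policy: bootstrap from the greedy action in the next state
    q[s, a] += alpha * (r + gamma * q[s_next].max() - q[s, a])

def sarsa_update(s, a, r, s_next, a_next):
    # on-policy: bootstrap from the action actually taken next
    q[s, a] += alpha * (r + gamma * q[s_next, a_next] - q[s, a])
```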


Symmetry ◽  
2020 ◽  
Vol 12 (10) ◽  
pp. 1685 ◽  
Author(s):  
Chayoung Kim

Owing to the complexity involved in training an agent in a real-time environment, e.g., using the Internet of Things (IoT), reinforcement learning (RL) using a deep neural network, i.e., deep reinforcement learning (DRL), has been widely adopted on an online basis without prior knowledge and complicated reward functions. DRL can handle a symmetrical balance between bias and variance, which indicates that RL agents can be competently trained in real-world applications. The proposed model considers combinations of basic RL algorithms with online and offline use based on the empirical balance of bias and variance. Therefore, we exploited the balance between the offline Monte Carlo (MC) technique and online temporal difference (TD) with an on-policy method (state-action-reward-state-action, Sarsa) and an off-policy method (Q-learning) in terms of a DRL. The proposed balance of MC (offline) and TD (online) use, which is simple and applicable without a well-designed reward, is suitable for real-time online learning. We demonstrated that, for a simple control task, the balance between online and offline use without an on- and off-policy distinction shows satisfactory results. However, in complex tasks, the results clearly indicate the effectiveness of the combined method in improving the convergence speed and performance in a deep Q-network.
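To make the MC/TD balance concrete, here is a conceptual sketch that blends the full-episode Monte Carlo return with the one-step TD target through a fixed weight. The tabular setting and the weight value are assumptions made for brevity; the paper works with a deep Q-network.

```python
import numpy as np

gamma, alpha, beta = 0.99, 0.1, 0.5
q = np.zeros((16, 4))

def mixed_update(episode):
    """episode: list of (s, a, r, s_next) transitions from one rollout."""
    g = 0.0
    # walk backwards so the Monte Carlo return G_t is available at every step
    for s, a, r, s_next in reversed(episode):
        g = r + gamma * g                          # full-episode (MC) return
        td_target = r + gamma * q[s_next].max()    # one-step (TD) target
        target = beta * g + (1.0 - beta) * td_target
        q[s, a] += alpha * (target - q[s, a])
```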


2015 ◽  
Vol 787 ◽  
pp. 843-847
Author(s):  
Leo Raju ◽  
R.S. Milton ◽  
S. Sakthiyanandan

In this paper, two solar photovoltaic (PV) systems are considered: one in the department with a capacity of 100 kW and the other in the hostel with a capacity of 200 kW. Each has a battery and a load. The capital cost and energy savings of conventional methods are compared, and it is shown that dependency on grid energy is reduced when the solar micro-grid elements operate in a distributed environment. Within the smart grid framework, grid energy consumption is further reduced by optimal scheduling of the battery using reinforcement learning. Individual-unit optimization is done with a model-free reinforcement learning method, Q-learning, and it is compared with distributed operation of the solar micro-grid using a multi-agent reinforcement learning method, Joint Q-learning. The energy planning is designed according to the predicted solar PV energy production and the observed load patterns of the department and the hostel. A simulation model was developed using Python.
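A deliberately simplified single-unit sketch of the battery-scheduling idea is given below (not the paper's model): the state is (hour, battery level), the actions are charge, idle, or discharge, and the reward is the negative energy drawn from the grid. The PV and load profiles, power steps, and discretization are placeholders.

```python
import numpy as np

hours, levels, actions = 24, 5, 3          # actions: 0 charge, 1 idle, 2 discharge
q = np.zeros((hours * levels, actions))
alpha, gamma, eps = 0.1, 0.95, 0.1
pv = np.random.rand(hours) * 100.0         # kW, placeholder solar forecast
load = 50.0 + np.random.rand(hours) * 50.0 # kW, placeholder demand

def step(hour, level, action):
    # battery physics kept deliberately crude for the sketch
    delta = {0: +1, 1: 0, 2: -1}[action]
    new_level = int(np.clip(level + delta, 0, levels - 1))
    battery_out = 20.0 if action == 2 else -20.0 if action == 0 else 0.0
    grid = max(load[hour] - pv[hour] - battery_out, 0.0)   # kW bought from grid
    return new_level, -grid                                 # reward = -grid energy

for episode in range(200):
    level = levels // 2
    for hour in range(hours - 1):
        s = hour * levels + level
        a = np.random.randint(actions) if np.random.rand() < eps else int(np.argmax(q[s]))
        level, r = step(hour, level, a)
        s_next = (hour + 1) * levels + level
        q[s, a] += alpha * (r + gamma * q[s_next].max() - q[s, a])
```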


Author(s):  
DEAN C. WARDELL ◽  
GILBERT L. PETERSON

Reinforcement learning is one of the more attractive machine learning technologies, due to its unsupervised learning structure and ability to continue learning even as the operating environment changes. Additionally, applying reinforcement learning to multiple cooperative software agents (a multi-agent system) not only allows each individual agent to learn from its own experience but also opens up the opportunity for the individual agents to learn from the other agents in the system, thus accelerating the rate of learning. This research presents the novel use of fuzzy state aggregation, as the means of function approximation, combined with the fast policy hill-climbing methods Win or Learn Fast (WoLF) and policy-dynamics-based WoLF (PD-WoLF). The combination of fast policy hill climbing and fuzzy state aggregation function approximation is tested in two stochastic environments: Tileworld and the simulated robot soccer domain, RoboCup. The Tileworld results demonstrate that a single agent using the combination of FSA and PHC learns more quickly and performs better than combined fuzzy state aggregation and Q-learning reinforcement learning alone. Results from the multi-agent RoboCup domain again illustrate that the policy hill-climbing algorithms perform better than Q-learning alone in a multi-agent environment. The learning is further enhanced by allowing the agents to share their experience through weighted strategy sharing.
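A compact sketch of the WoLF policy hill-climbing update for one agent is shown below, with the fuzzy state aggregation replaced by a plain tabular state for brevity; the state/action counts and learning rates are assumptions.

```python
import numpy as np

n_states, n_actions = 8, 4
alpha, gamma = 0.1, 0.9
delta_win, delta_lose = 0.01, 0.04       # learn slowly when winning, fast when losing

q = np.zeros((n_states, n_actions))
pi = np.full((n_states, n_actions), 1.0 / n_actions)       # current mixed policy
pi_avg = np.full((n_states, n_actions), 1.0 / n_actions)   # running average policy
counts = np.zeros(n_states)

def wolf_phc_update(s, a, r, s_next):
    q[s, a] += alpha * (r + gamma * q[s_next].max() - q[s, a])

    # update the average policy toward the current policy
    counts[s] += 1
    pi_avg[s] += (pi[s] - pi_avg[s]) / counts[s]

    # "win or learn fast": compare expected value of current vs. average policy
    delta = delta_win if pi[s] @ q[s] > pi_avg[s] @ q[s] else delta_lose

    # move probability mass toward the greedy action, keep pi a distribution
    greedy = int(np.argmax(q[s]))
    for b in range(n_actions):
        step = delta if b == greedy else -delta / (n_actions - 1)
        pi[s, b] = np.clip(pi[s, b] + step, 0.0, 1.0)
    pi[s] /= pi[s].sum()
```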

