Decentralized Incremental Fuzzy Reinforcement Learning for Multi-Agent Systems

Author(s):  
Sam Hamzeloo
Mansoor Zolghadri Jahromi

We present a new incremental fuzzy reinforcement learning algorithm to find a sub-optimal policy for infinite-horizon Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs). The algorithm addresses the high computational complexity of solving large Dec-POMDPs by generating a compact fuzzy rule-base for each agent. In our method, each agent uses its own fuzzy rule-base to make decisions. The fuzzy rules in these rule-bases are incrementally created and tuned according to the agents' experiences. Reinforcement learning is used to tune the behavior of each agent so that the maximum global reward is achieved. In addition, we propose a method to construct the initial rule-base for each agent from the solution of the underlying MDP, which drastically improves performance compared with random initialization of the rule-base. We assess the performance of the proposed method on several benchmark problems against state-of-the-art methods. Experimental results show that our algorithm achieves better or similar reward compared with the other methods, while from the runtime point of view it is superior to all of them. Using a compact fuzzy rule-base not only decreases the amount of memory used but also significantly speeds up the learning phase.
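As a rough illustration of the general idea of an incrementally grown fuzzy rule-base tuned by reinforcement learning, the sketch below adds a Gaussian rule whenever no existing rule fires strongly on a new observation, and distributes the Q-learning temporal-difference error over rules in proportion to their firing strength. All names and thresholds here (`FuzzyRuleBase`, `add_threshold`, the fixed rule width) are illustrative assumptions, not the paper's exact formulation.

```python
import math

class FuzzyRuleBase:
    """Illustrative incremental fuzzy rule-base (a sketch, not the
    paper's algorithm): one Gaussian rule per observation prototype,
    with per-action q-values tuned by Q-learning."""

    def __init__(self, n_actions, width=0.5, add_threshold=0.3,
                 alpha=0.1, gamma=0.95):
        self.rules = []          # list of (center, q_values) pairs
        self.n_actions = n_actions
        self.width = width
        self.add_threshold = add_threshold
        self.alpha = alpha
        self.gamma = gamma

    def _firing(self, obs):
        """Gaussian firing strength of every rule on this observation."""
        return [math.exp(-sum((o - c) ** 2 for o, c in zip(obs, center))
                         / (2 * self.width ** 2))
                for center, _ in self.rules]

    def q_values(self, obs):
        """Firing-weighted average of rule q-values (zero-order inference)."""
        mu = self._firing(obs)
        total = sum(mu)
        if total == 0:
            return [0.0] * self.n_actions
        return [sum(m * q[a] for m, (_, q) in zip(mu, self.rules)) / total
                for a in range(self.n_actions)]

    def observe(self, obs):
        """Incremental growth: add a rule when nothing fires strongly."""
        mu = self._firing(obs)
        if not mu or max(mu) < self.add_threshold:
            self.rules.append((list(obs), [0.0] * self.n_actions))

    def update(self, obs, action, reward, next_obs):
        """Spread the TD error over rules proportional to their firing."""
        target = reward + self.gamma * max(self.q_values(next_obs))
        td = target - self.q_values(obs)[action]
        mu = self._firing(obs)
        total = sum(mu) or 1.0
        for m, (_, q) in zip(mu, self.rules):
            q[action] += self.alpha * (m / total) * td
```

A nearby observation reuses an existing rule, a distant one spawns a new rule, which is what keeps the rule-base compact relative to a tabular representation.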

2008
Vol 18 (1)
pp. 23-27
Author(s):  
Hamid Boubertakh
Mohamed Tadjine
Pierre-Yves Glorennec
Salim Labiod

This paper proposes a new fuzzy logic-based navigation method for a mobile robot moving in an unknown environment. The method enables the robot to avoid obstacles and seek its goal without becoming stuck in local minima. A simple fuzzy controller is constructed from human expert knowledge, and a fuzzy reinforcement learning algorithm is used to fine-tune the parameters of the fuzzy rule base. The advantages of the proposed method are its simplicity, its ease of implementation in industrial applications, and the fact that the robot reaches its objective despite the complexity of the environment. Simulation results of the proposed method and a comparison with previous works are provided.
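A minimal sketch of a hand-built fuzzy navigation controller of this flavor, before any reinforcement fine-tuning: triangular memberships on the goal bearing and the obstacle distance, a handful of steering rules, and centre-of-sets defuzzification. The membership breakpoints and rule consequents below are illustrative assumptions, not the paper's tuned controller.

```python
def tri(x, a, b, c):
    """Triangular membership function: 0 outside [a, c], peak 1 at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def steer(goal_angle, obstacle_dist):
    """Hypothetical fuzzy rule-base: steer toward the goal, but an
    obstacle-near rule overrides with an avoidance turn.
    Inputs: goal bearing in radians (+ means goal to the left) and
    obstacle distance in metres. Output: steering command in [-1, 1]."""
    near = tri(obstacle_dist, -1.0, 0.0, 1.0)   # fires when obstacle close
    far = 1.0 - near
    left = tri(goal_angle, 0.0, 1.5, 3.2)
    right = tri(goal_angle, -3.2, -1.5, 0.0)
    ahead = tri(goal_angle, -1.5, 0.0, 1.5)
    # Rule consequents (centre-of-sets defuzzification):
    #   far & goal-left  -> turn left (+1); far & goal-right -> turn right (-1)
    #   far & goal-ahead -> go straight (0); near -> avoidance turn (+1)
    num = far * (left * 1.0 + right * -1.0 + ahead * 0.0) + near * 1.0
    den = far * (left + right + ahead) + near
    return num / den if den else 0.0
```

In the paper's setting, a fuzzy reinforcement learning algorithm would then adjust parameters like these consequent values from experience rather than leaving them hand-set.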


2016
Author(s):  
Leonardo G. Melo
Luís A. Lucas
Myriam R. Delgado

2021
pp. 2150011
Author(s):  
Wei Dong
Jianan Wang
Chunyan Wang
Zhenqiang Qi
Zhengtao Ding

In this paper, the optimal consensus control problem is investigated for heterogeneous linear multi-agent systems (MASs) under a spanning-tree condition, based on game theory and reinforcement learning. First, the graphical minimax game algebraic Riccati equation (ARE) is derived by converting the consensus problem into a zero-sum game between each agent and its neighbors. The asymptotic stability and minimax validity of the closed-loop systems are proved theoretically. Then, a data-driven off-policy reinforcement learning algorithm is proposed to learn the optimal control policy online without knowledge of the system dynamics. A rank condition is established to guarantee convergence of the proposed algorithm to the unique solution of the ARE. Finally, the effectiveness of the proposed method is demonstrated through a numerical simulation.
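To give a concrete feel for the Riccati fixed point underlying such results, the sketch below runs the scalar discrete-time Riccati recursion for a single-agent LQR problem. This is a deliberately simplified, model-based stand-in: the paper's graphical minimax game ARE adds neighbor and adversarial terms and is solved from measured data without a model.

```python
def lqr_value_iteration(a, b, q, r, iters=200):
    """Scalar discrete-time Riccati recursion
        P <- q + a^2 P - (a b P)^2 / (r + b^2 P)
    for dynamics x' = a x + b u and stage cost q x^2 + r u^2.
    A single-agent, model-based sketch only, not the paper's
    data-driven multi-agent algorithm."""
    p = 0.0
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    k = a * b * p / (r + b * b * p)   # optimal feedback gain, u = -k x
    return p, k
```

The recursion converges to the unique stabilizing solution of the scalar ARE; the data-driven off-policy scheme in the paper learns the analogous solution from trajectories instead of iterating a known model.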


2012
Vol 566
pp. 572-579
Author(s):  
Abdolkarim Niazi
Norizah Redzuan
Raja Ishak Raja Hamzah
Sara Esfandiari

In this paper, a new algorithm based on case-based reasoning (CBR) and reinforcement learning (RL) is proposed to increase the convergence rate of RL algorithms. RL algorithms are useful for solving a wide variety of decision problems when no model is available and a correct decision must be made in every state of the system, as in multi-agent systems, control systems, robotics, and tool condition monitoring. The proposed method investigates how to improve action selection in RL: a combined model using a case-based reasoning system and a new optimized selection function chooses the action, which increases the convergence rate of Q-learning-based algorithms. The algorithm was applied to cooperative Markov games, one of the Markov-based models of multi-agent systems. Experimental results indicated that the proposed algorithm performs better than existing algorithms in both the speed and the accuracy of reaching the optimal policy.
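One simple way to combine the two ideas, sketched below under stated assumptions: standard Q-learning, except that the first time a state is visited, the action of the most similar previously stored case is reused instead of an arbitrary initial choice. The `env_step` and `distance` callables, and the whole retrieval scheme, are illustrative assumptions rather than the paper's exact model or optimized selection function.

```python
import random

def cbr_q_learning(env_step, n_states, n_actions, distance,
                   episodes=200, steps=50, alpha=0.2, gamma=0.9, eps=0.1):
    """Sketch of Q-learning with case-based action selection.
    cases maps each visited state to the best action found so far;
    a novel state retrieves the nearest case's action as its first try."""
    q = [[0.0] * n_actions for _ in range(n_states)]
    cases = {}  # the "case base": state -> best known action
    for _ in range(episodes):
        s = 0
        for _ in range(steps):
            if s not in cases and cases:
                # Case retrieval: reuse the most similar case's action.
                a = cases[min(cases, key=lambda c: distance(c, s))]
            elif random.random() < eps:
                a = random.randrange(n_actions)      # exploration
            else:
                a = max(range(n_actions), key=lambda x: q[s][x])
            s2, r, done = env_step(s, a)
            # Standard Q-learning temporal-difference update.
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            cases[s] = max(range(n_actions), key=lambda x: q[s][x])
            s = s2
            if done:
                break
    return q
```

The intended benefit, as in the abstract, is faster convergence: novel states start from a retrieved case rather than from scratch, which matters most when similar states genuinely share good actions.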

