Study on Statistics Based Q-Learning Algorithm for Multi-agent System

Author(s): Xie Ya, Huang Zhonghua
2020, Vol 17 (2), pp. 647-664

Author(s): Yangyang Ge, Fei Zhu, Wei Huang, Peiyao Zhao, Quan Liu

Multi-agent systems have broad applications in the real world, yet their safety performance is rarely considered. Reinforcement learning is one of the most important methods for solving multi-agent problems, and progress has been made in applying multi-agent reinforcement learning to robotic systems, man-machine games, automation, and other areas. In these settings, however, an agent may fall into unsafe states in which it cannot bypass obstacles, receive information from other agents, and so on. Ensuring the safety of a multi-agent system is therefore critical in areas where an agent may enter dangerous, irreversible states and cause great damage. To address this safety problem, this paper introduces a Multi-Agent Cooperation Q-Learning Algorithm based on a Constrained Markov Game. In this method, safety constraints are added to the action set, and each agent, when interacting with the environment to search for optimal values, is restricted by the safety rules, so that the resulting optimal policy satisfies the security requirements. Since traditional multi-agent reinforcement learning algorithms are no longer suitable for the proposed model, a new method is introduced for computing the globally optimal state-action function subject to the safety constraints. Under the conditions that the state-action function and the constraint function are both differentiable, we apply the Lagrange multiplier method, after linearizing the constraint functions, to determine the optimal action in the current state. This not only improves the efficiency and accuracy of the algorithm but also guarantees that the global optimal solution is obtained. Experiments verify the effectiveness of the algorithm.
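The core idea described in the abstract — Q-learning on a constrained Markov decision problem, with a Lagrange multiplier penalising actions whose expected safety cost exceeds a budget — can be sketched in a minimal single-agent form. Everything here is an illustrative assumption, not the paper's implementation: the environment, the cost model `C`, the cost budget, and the dual-ascent step size are all hypothetical.

```python
import numpy as np

# Hypothetical toy setting: one agent, small discrete state/action spaces.
# Q(s, a) is the value table; C(s, a) is the expected safety cost of taking
# action a in state s. Safety is enforced by a Lagrange multiplier rather
# than by pruning the action set directly.
N_STATES, N_ACTIONS = 4, 3
COST_BUDGET = 0.5        # safety threshold (assumed)
ALPHA, GAMMA = 0.1, 0.9  # learning rate and discount factor
LAMBDA_LR = 0.01         # step size for the Lagrange multiplier

rng = np.random.default_rng(0)
Q = np.zeros((N_STATES, N_ACTIONS))
C = rng.uniform(0.0, 1.0, size=(N_STATES, N_ACTIONS))  # fixed toy cost model
lam = 0.0  # Lagrange multiplier for the safety constraint

def safe_greedy_action(s):
    # Maximise the Lagrangian Q(s, a) - lam * C(s, a): the multiplier
    # penalises actions whose expected safety cost is high.
    return int(np.argmax(Q[s] - lam * C[s]))

def step(s, a):
    # Hypothetical environment dynamics: random next state, reward
    # favouring low-cost actions (purely illustrative).
    s_next = int(rng.integers(N_STATES))
    r = 1.0 - C[s, a]
    return s_next, r

s = 0
for _ in range(5000):
    # Epsilon-greedy exploration over the Lagrangian-greedy policy.
    a = safe_greedy_action(s) if rng.random() > 0.1 else int(rng.integers(N_ACTIONS))
    s_next, r = step(s, a)
    # Standard Q-learning update on the reward objective.
    Q[s, a] += ALPHA * (r + GAMMA * Q[s_next].max() - Q[s, a])
    # Dual ascent on the multiplier: it grows while the incurred cost
    # exceeds the budget, and is projected back to [0, inf).
    lam = max(0.0, lam + LAMBDA_LR * (C[s, a] - COST_BUDGET))
    s = s_next
```

The multiplier update is the primal-dual pattern commonly used for constrained MDPs; the paper's multi-agent, linearized-constraint formulation would replace the toy cost model and the single Q-table with per-agent constrained value functions.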


2017, Vol 52, pp. 519-531
Author(s): Farhad Pourpanah, Choo Jun Tan, Chee Peng Lim, Junita Mohamad-Saleh

2020, Vol 8 (3), pp. 201-224
Author(s): Faqihza Mukhlish, John Page, Michael Bain

Purpose: This paper aims to propose a novel epigenetic learning (EpiLearn) algorithm designed specifically for decentralised multi-agent systems such as swarm robotics.

Design/methodology/approach: First, the paper gives an overview of swarm robotics and the challenges in designing swarm behaviour automatically, indicating the improvements required to enhance automatic swarm design. Second, the epigenetic learning (EpiLearn) algorithm for a swarm system using an epigenetic layer is formulated and discussed. The algorithm is then tested on various test functions to investigate its performance. Finally, the results are discussed along with possible future research directions.

Findings: On various test functions, the algorithm can solve non-local and many-local-minima problems. The article also shows that, by using a reward system, the algorithm can handle the deceptive problems that often occur in dynamic settings. Moreover, using rewards from the environment in the form of a methylation process on the epigenetic layer improves the performance of traditional evolutionary algorithms applied to automatic swarm design. Finally, the article shows that a regeneration process that embeds an epigenetic layer in the inheritance process outperforms a traditional crossover operator in a swarm system.

Originality/value: This paper proposes a novel method for automatic swarm design that takes into account the importance of multi-agent settings and the environmental characteristics surrounding the swarm. The epigenetic learning (EpiLearn) algorithm, with its epigenetic layer, gives the swarm the ability to perform co-evolution and co-learning.
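The mechanism the abstract describes — an epigenetic layer whose "methylation" state is adjusted by environmental reward and inherited during regeneration in place of crossover — can be sketched as a small evolutionary loop. This is a hedged illustration only: the genome encoding, the sphere objective, the methylation rate, and the regeneration operator are all assumptions, not the paper's actual design.

```python
import random

# Hypothetical sketch: each individual carries a genome (real-valued genes)
# plus an epigenetic "methylation" mask that gates gene expression.
# Rewards from the environment adjust the mask, and the mask is inherited
# alongside the genome during regeneration, replacing crossover.
GENOME_LEN, POP_SIZE, GENERATIONS = 8, 20, 50
random.seed(1)

def new_individual():
    return {"genes": [random.uniform(-1, 1) for _ in range(GENOME_LEN)],
            "mask": [1.0] * GENOME_LEN}  # 1.0 = fully expressed

def express(ind):
    # Phenotype = genes gated (scaled) by the epigenetic mask.
    return [g * m for g, m in zip(ind["genes"], ind["mask"])]

def fitness(phenotype):
    # Toy objective (assumed): negated sphere function, maximised at 0.
    return -sum(x * x for x in phenotype)

def methylate(ind, reward, rate=0.05):
    # Reward-driven mask update: poor reward increases silencing of
    # randomly chosen genes, which can help escape deceptive optima.
    for i in range(GENOME_LEN):
        if reward < 0 and random.random() < rate:
            ind["mask"][i] = max(0.0, ind["mask"][i] - 0.1)

def regenerate(parent):
    # Regeneration: the child inherits genes AND the epigenetic mask,
    # with Gaussian mutation on the genes (no crossover operator).
    return {"genes": [g + random.gauss(0, 0.1) for g in parent["genes"]],
            "mask": list(parent["mask"])}

pop = [new_individual() for _ in range(POP_SIZE)]
for _ in range(GENERATIONS):
    scored = sorted(pop, key=lambda ind: fitness(express(ind)), reverse=True)
    for ind in scored:
        methylate(ind, fitness(express(ind)))
    elite = scored[: POP_SIZE // 2]
    pop = elite + [regenerate(random.choice(elite))
                   for _ in range(POP_SIZE - len(elite))]

best = max(pop, key=lambda ind: fitness(express(ind)))
```

The design point being illustrated is that selection acts on the expressed phenotype, not the raw genome, so two individuals with identical genes can behave differently depending on their inherited methylation state.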

