Cooperative Strategy Learning in Multi-Agent Environment with Continuous State Space

Author(s):  
Jun-yuan Tao ◽  
De-sheng Li

Symmetry ◽
2018 ◽  
Vol 10 (10) ◽  
pp. 461 ◽  
Author(s):  
David Luviano-Cruz ◽  
Francesco Garcia-Luna ◽  
Luis Pérez-Domínguez ◽  
S. Gadi

A multi-agent system (MAS) can address tasks in a variety of domains without pre-programmed behaviors, which makes it well suited to problems involving mobile robots. Reinforcement learning (RL) is a successful approach for acquiring new behaviors in MASs; however, most RL methods maintain exact Q-values over small, discrete state and action spaces. This article presents a linearly fuzzified joint Q-function for a MAS with a continuous state space, which overcomes the dimensionality problem. The article also proves the existence and convergence of the solution produced by the proposed algorithm, and it discusses the numerical simulations and experimental results carried out to validate the approach.
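The listing does not spell out the fuzzified Q-function, but the general idea can be illustrated with a minimal single-agent Python sketch (names, triangular membership functions, and parameters are assumptions, not the authors' implementation): the continuous state fires several fuzzy rules, the Q-value is the membership-weighted combination of per-rule Q-values, and the temporal-difference error is credited back to each rule in proportion to its firing strength. In the multi-agent case the action index would range over joint actions.

```python
import numpy as np

# Illustrative sketch of a linearly fuzzified Q-function over a continuous
# (scalar) state. All names and hyperparameters are assumptions for the sketch.

def triangular_memberships(x, centers):
    """Membership degrees of x in evenly spaced triangular fuzzy sets."""
    width = centers[1] - centers[0]
    mu = np.maximum(0.0, 1.0 - np.abs(x - centers) / width)
    total = mu.sum()
    return mu / total if total > 0 else mu

class FuzzyQAgent:
    def __init__(self, centers, n_actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.centers = np.asarray(centers, dtype=float)   # fuzzy set centers
        self.q = np.zeros((len(centers), n_actions))      # per-rule Q-values
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def q_values(self, state):
        # Fuzzified Q(s, .): linear combination of per-rule Q-values
        mu = triangular_memberships(state, self.centers)
        return mu @ self.q

    def act(self, state):
        if np.random.rand() < self.epsilon:
            return np.random.randint(self.q.shape[1])
        return int(np.argmax(self.q_values(state)))

    def update(self, state, action, reward, next_state):
        mu = triangular_memberships(state, self.centers)
        target = reward + self.gamma * np.max(self.q_values(next_state))
        td_error = target - mu @ self.q[:, action]
        # Credit each rule by its firing strength
        self.q[:, action] += self.alpha * td_error * mu
```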


2021 ◽  
Vol 3 (6) ◽  
Author(s):  
Ogbonnaya Anicho ◽  
Philip B. Charlesworth ◽  
Gurvinder S. Baicher ◽  
Atulya K. Nagar

Abstract: This work analyses the performance of Reinforcement Learning (RL) versus Swarm Intelligence (SI) for coordinating multiple unmanned High Altitude Platform Stations (HAPS) for communications area coverage. It builds upon previous work that examined various elements of both algorithms. The main aim of this paper is to address the continuous state-space challenge in this setting by using partitioning to manage the high dimensionality. This made it possible to compare the performance of the classical forms of both RL and SI, establishing a baseline for future comparisons of improved versions. In previous work, SI was observed to perform better across various key performance indicators. However, even after tuning parameters and empirically choosing a suitable partitioning ratio for the RL state space, the SI algorithm maintained superior coordination capability, achieving higher mean overall user coverage (about 20% better than the RL algorithm) and faster convergence. Although the RL technique showed better average peak user coverage, its unpredictable coverage dips were a key weakness, making SI the more suitable algorithm within the context of this work.
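The abstract does not give implementation details, but the partitioning idea it describes (discretizing each continuous state dimension into a fixed number of bins so that tabular Q-learning remains tractable) can be sketched as follows. This Python sketch is illustrative only; the bin count (the "partitioning ratio"), state bounds, and learning parameters are assumptions rather than values taken from the paper.

```python
import numpy as np

# Illustrative sketch: make a continuous state space usable by tabular
# Q-learning by partitioning each dimension into a fixed number of bins.

class PartitionedQLearner:
    def __init__(self, low, high, bins_per_dim, n_actions,
                 alpha=0.1, gamma=0.9, epsilon=0.1):
        self.low = np.asarray(low, dtype=float)
        self.high = np.asarray(high, dtype=float)
        self.bins = bins_per_dim
        # One Q-table cell per partition index combination and action
        self.q = np.zeros((bins_per_dim,) * len(low) + (n_actions,))
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def discretize(self, state):
        """Map a continuous state vector to a tuple of partition indices."""
        ratio = (np.asarray(state, dtype=float) - self.low) / (self.high - self.low)
        idx = np.clip((ratio * self.bins).astype(int), 0, self.bins - 1)
        return tuple(idx)

    def act(self, state):
        if np.random.rand() < self.epsilon:
            return np.random.randint(self.q.shape[-1])
        return int(np.argmax(self.q[self.discretize(state)]))

    def update(self, state, action, reward, next_state):
        s, s2 = self.discretize(state), self.discretize(next_state)
        target = reward + self.gamma * np.max(self.q[s2])
        self.q[s + (action,)] += self.alpha * (target - self.q[s + (action,)])
```

The choice of bins per dimension trades resolution against table size, which is presumably why the paper reports tuning the partitioning ratio empirically.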


NeuroImage ◽  
2017 ◽  
Vol 162 ◽  
pp. 344-352 ◽  
Author(s):  
Jacob C.W. Billings ◽  
Alessio Medda ◽  
Sadia Shakil ◽  
Xiaohong Shen ◽  
Amrit Kashyap ◽  
...  
