Learning fuzzy classifier systems for multi-agent coordination

2001 · Vol 136 (1-4) · pp. 215-239
Author(s): Andrea Bonarini, Vito Trianni

2021 · Vol 9 (6) · pp. 560
Author(s): Sulemana Nantogma, Keyu Pan, Weilong Song, Renwei Luo, Yang Xu

Unmanned autonomous vehicles for various civilian and military applications have become a particularly interesting research area. Despite their many potential applications, a key technological challenge is realizing realistic coordinated autonomous control and decision making in complex, multi-agent environments. Machine learning approaches have largely been employed in simplified simulations to acquire intelligent control systems in multi-agent settings. However, the complexity of the physical environment, unrealistic assumptions, and the lack of abstract physical environments hinder the transition from simulation to real systems. This work presents a modular framework for automated data acquisition, training, and evaluation of controllers for multiple unmanned surface vehicles that facilitates prior-knowledge integration and human-guided learning in a closed loop. To realize this, we first present a digital maritime environment for multiple unmanned surface vehicles that abstracts the real-world dynamics of our application domain. Then, a behavior-driven, artificial-immune-inspired fuzzy classifier system approach capable of optimizing agents' behaviors and action selection in a multi-agent environment is presented. Evaluation scenarios based on different combat missions demonstrate the performance of the system. Simulation results show that the resulting controllers achieve an average winning rate between 52% and 98% across all test cases, indicating the effectiveness of the proposed approach and its feasibility for realizing adaptive controllers for efficient cooperative decision making among multiple unmanned systems. We believe that this system can facilitate the simulation, data acquisition, training, and evaluation of practical cooperative unmanned vehicle controllers in a closed loop.
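To make the immune-inspired action-selection idea concrete, the sketch below shows one closed-loop step of affinity-weighted behavior selection for a single agent. The behavior names, the affinity update rule, and all parameters are illustrative assumptions for this sketch, not the paper's actual fuzzy classifier system.

```python
# Minimal sketch of immune-inspired behavior selection for one agent.
# Behavior names, the affinity update rule, and parameters are
# illustrative assumptions, not the authors' implementation.
import random

class BehaviorAntibody:
    def __init__(self, name):
        self.name = name
        self.affinity = 1.0  # how well this behavior matched past situations

def select_behavior(antibodies):
    """Roulette-wheel selection weighted by affinity (clonal-selection style)."""
    total = sum(ab.affinity for ab in antibodies)
    pick = random.uniform(0, total)
    acc = 0.0
    for ab in antibodies:
        acc += ab.affinity
        if acc >= pick:
            return ab
    return antibodies[-1]

def reinforce(antibody, reward, lr=0.1):
    """Strengthen or weaken an antibody's affinity from mission feedback."""
    antibody.affinity = max(1e-3, antibody.affinity + lr * reward)

# Usage: one closed-loop step for a single USV agent.
repertoire = [BehaviorAntibody(n) for n in ("pursue", "evade", "regroup")]
chosen = select_behavior(repertoire)
reinforce(chosen, reward=1.0)  # the reward would come from the simulation
```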


2020 · Vol 16 (3) · pp. 255-269
Author(s): Enrico Bozzo, Paolo Vidoni, Massimo Franceschet

We study the stability of a time-aware version of the popular Massey method, previously introduced by Franceschet, Bozzo, and Vidoni (2017, "The Temporalized Massey's Method," Journal of Quantitative Analysis in Sports 13: 37–48), for rating teams in sport competitions. To this end, we embed the temporal Massey method in the theory of time-varying averaging algorithms, which are dynamic systems mainly used in control theory for multi-agent coordination. We also introduce a parametric family of Massey-type methods and show that the original and time-aware Massey versions are, in some sense, particular instances of it. Finally, we discuss the key features of this general family of rating procedures, focusing on inferential and predictive issues and on sensitivity to upsets and modifications of the schedule.
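For readers unfamiliar with the underlying rating model, the classical (static) Massey method solves a least-squares system built from point margins; the time-aware variant studied above additionally discounts older games. Below is a minimal sketch of the static computation; the game data and team labels are illustrative assumptions.

```python
# Minimal sketch of the classical (static) Massey rating method.
# Game data is made up for illustration.
import numpy as np

games = [  # (winner_index, loser_index, point_margin)
    (0, 1, 7), (1, 2, 3), (0, 2, 10), (2, 1, 4),
]
n = 3
M = np.zeros((n, n))  # Massey matrix: games played on the diagonal,
p = np.zeros(n)       # minus pairwise game counts off-diagonal
for w, l, margin in games:
    M[w, w] += 1; M[l, l] += 1
    M[w, l] -= 1; M[l, w] -= 1
    p[w] += margin; p[l] -= margin

# M is singular (its rows sum to zero); pin the ratings to sum to zero.
M[-1, :] = 1.0
p[-1] = 0.0
ratings = np.linalg.solve(M, p)
print(dict(zip("ABC", ratings)))
```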


Author(s): Daxue Liu, Jun Wu, Xin Xu

Multi-agent reinforcement learning (MARL) provides a useful and flexible framework for multi-agent coordination in uncertain dynamic environments. However, the generalization ability and scalability of algorithms to large problem sizes, already problematic in single-agent RL, are an even more formidable obstacle in MARL applications. In this paper, a new MARL method based on ordinal action selection and approximate policy iteration, called OAPI (Ordinal Approximate Policy Iteration), is presented to address the scalability issue of MARL algorithms in common-interest Markov games. In OAPI, an ordinal action selection and learning strategy is integrated with distributed approximate policy iteration, not only to simplify the policy space and eliminate conflicts in multi-agent coordination, but also to approximate near-optimal policies for Markov games with large state spaces. Based on the policy space simplified by ordinal action selection, the OAPI algorithm implements distributed approximate policy iteration using online least-squares policy iteration (LSPI). This results in multi-agent coordination with good convergence properties and reduced computational complexity. Simulation results on a coordinated multi-robot navigation task illustrate the feasibility and effectiveness of the proposed approach.
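As a rough illustration of the two ingredients, the sketch below pairs a rank-based pruning of the action set (a stand-in for ordinal action selection) with one LSTD-Q evaluation step, the inner loop of LSPI. The features, sample data, and pruning rule are illustrative assumptions, not the OAPI algorithm itself.

```python
# Sketch: rank-based action pruning plus one LSTD-Q evaluation step
# (the inner loop of LSPI). Features and data are toy assumptions.
import numpy as np

def ordinal_prune(actions, preference, k=2):
    """Keep only the k highest-ranked actions (ordinal-style selection)."""
    return sorted(actions, key=preference, reverse=True)[:k]

def lstdq(samples, phi, policy, gamma=0.95, reg=1e-3):
    """Solve A w = b for linear Q-weights from (s, a, r, s') samples."""
    d = phi(*samples[0][:2]).shape[0]
    A = reg * np.eye(d)  # ridge term keeps A invertible with few samples
    b = np.zeros(d)
    for s, a, r, s_next in samples:
        f = phi(s, a)
        f_next = phi(s_next, policy(s_next))
        A += np.outer(f, f - gamma * f_next)
        b += r * f
    return np.linalg.solve(A, b)

# Toy usage: 2 states, 2 actions, one-hot state-action features.
phi = lambda s, a: np.eye(4)[2 * s + a]
policy = lambda s: 0  # fixed policy being evaluated
samples = [(0, 0, 1.0, 1), (1, 0, 0.0, 0), (0, 1, 0.5, 1)]
w = lstdq(samples, phi, policy)
# Policy improvement step over the pruned action set:
pruned = ordinal_prune([0, 1], preference=lambda a: w @ phi(0, a), k=1)
print(w, pruned)
```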

