Cooperative Multi-Agent Learning: The State of the Art

2005 ◽  
Vol 11 (3) ◽  
pp. 387-434 ◽  
Author(s):  
Liviu Panait ◽  
Sean Luke
2020 ◽  
Vol 34 (05) ◽  
pp. 7253-7260 ◽  
Author(s):  
Yuhang Song ◽  
Andrzej Wojcicki ◽  
Thomas Lukasiewicz ◽  
Jianyi Wang ◽  
Abi Aryan ◽  
...  

Learning agents that are capable not only of taking tests but also of innovating are becoming a hot topic in AI. One of the most promising paths towards this vision is multi-agent learning, where agents act as the environment for each other, and improving each agent means proposing new problems for others. However, existing evaluation platforms are either not compatible with multi-agent settings or limited to a specific game. That is, there is not yet a general evaluation platform for research on multi-agent intelligence. To this end, we introduce Arena, a general evaluation platform for multi-agent intelligence with 35 games of diverse logics and representations. Furthermore, multi-agent intelligence is still at the stage where many problems remain unexplored. Therefore, we provide a building toolkit for researchers to easily invent and build novel multi-agent problems from the provided game set based on a GUI-configurable social tree and five basic multi-agent reward schemes. Finally, we provide Python implementations of five state-of-the-art deep multi-agent reinforcement learning baselines. Along with the baseline implementations, we release a set of 100 best agents/teams that we can train with different training schemes for each game, as the base for evaluating agents with population performance. As such, the research community can perform comparisons under a stable and uniform standard. All the implementations and accompanying tutorials have been open-sourced for the community at https://sites.google.com/view/arena-unity/.


Author(s):  
Yanchen Deng ◽  
Bo An

Incomplete GDL-based algorithms, including Max-sum and its variants, are important methods for multi-agent optimization. However, they face a significant scalability challenge because the computational overhead grows exponentially with the arity of each utility function. The Generic Domain Pruning (GDP) technique reduces the computational effort by performing a one-shot pruning to filter out suboptimal entries. Unfortunately, GDP can perform poorly when dealing with dense local utilities and ties, which exist in many domains. In this paper, we present several novel sorting-based acceleration algorithms that alleviate the effect of densely distributed local utilities. Specifically, instead of the one-shot pruning in GDP, we propose to integrate both search and pruning to iteratively reduce the search space. In addition, we cope with utility ties by organizing the search space of tied utilities into AND/OR trees to enable branch-and-bound. Finally, we propose a discretization mechanism that offers a tradeoff between the reconstruction overhead and the pruning efficiency. We demonstrate the superiority of our algorithms over the state of the art from both theoretical and experimental perspectives.
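
To make the scalability issue concrete, the following is a minimal sketch of an unpruned Max-sum message from a utility function to one of its variables. It is not the authors' algorithm and does not implement GDP. The function name `factor_to_variable_message`, its argument layout, and the toy utility are all assumptions made for illustration.

```python
from itertools import product


def factor_to_variable_message(utility, domains, target, incoming):
    """Unpruned Max-sum message from one utility function to `target`.

    utility:  callable taking a full assignment tuple (ordered like `domains`)
    domains:  dict of variable name -> list of values; key order fixes tuple order
    incoming: dict of variable name -> {value: message} for non-target variables
    """
    names = list(domains)
    idx = names.index(target)
    message = {value: float("-inf") for value in domains[target]}
    # Every joint assignment is enumerated, so the loop runs once per element
    # of the Cartesian product of the domains: exponential in the arity.
    for assignment in product(*(domains[n] for n in names)):
        total = utility(assignment)
        for i, n in enumerate(names):
            if i != idx:
                total += incoming[n][assignment[i]]
        value = assignment[idx]
        if total > message[value]:
            message[value] = total
    return message


# Hypothetical toy problem: three binary variables and all-zero incoming messages.
domains = {"x": [0, 1], "y": [0, 1], "z": [0, 1]}
zero = {0: 0.0, 1: 0.0}
msg = factor_to_variable_message(
    lambda a: a[0] + 2 * a[1] * a[2],  # illustrative utility only
    domains, "x", {"y": zero, "z": zero},
)
print(msg)
```

With three binary variables the loop visits 8 assignments; a utility function over k variables with d values each would visit d^k, which is the cost that pruning techniques aim to cut.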


Author(s):  
Pavel Surynek

We unify search-based and compilation-based approaches to multi-agent path finding (MAPF) through satisfiability modulo theories (SMT). The task in MAPF is to navigate agents in an undirected graph to given goal vertices so that they do not collide. We rephrase Conflict-Based Search (CBS), one of the state-of-the-art algorithms for optimal MAPF solving, in terms of SMT. This idea combines SAT-based solving known from MDD-SAT, a SAT-based optimal MAPF solver, at the low level with the conflict elimination of CBS at the high level. Where standard CBS branches the search after a conflict, we refine the propositional model with a disjunctive constraint. Our novel algorithm, called SMT-CBS, hence does not branch at the high level but incrementally extends the propositional model. We experimentally compare SMT-CBS with CBS, ICBS, and MDD-SAT.
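
Both CBS and the refinement described above start from a detected collision between agents' paths. The sketch below shows one way such a check might look, assuming the commonly used vertex and swap (edge) conflicts; the paper's exact movement rules may differ. The function name `first_conflict` and the tuple layout of its result are hypothetical.

```python
def first_conflict(paths):
    """Return the earliest conflict among the agents' paths, or None.

    paths: list of vertex lists, one per agent. An agent is assumed to wait
    at its last vertex once its path has ended.
    """
    horizon = max(len(p) for p in paths)

    def at(path, t):
        # Position at time t, staying at the final vertex after arrival.
        return path[min(t, len(path) - 1)]

    for t in range(horizon):
        for a in range(len(paths)):
            for b in range(a + 1, len(paths)):
                # Vertex conflict: two agents occupy the same vertex at time t.
                if at(paths[a], t) == at(paths[b], t):
                    return ("vertex", a, b, at(paths[a], t), t)
                # Swap conflict: two agents exchange positions between t and t + 1.
                if (t + 1 < horizon
                        and at(paths[a], t) == at(paths[b], t + 1)
                        and at(paths[a], t + 1) == at(paths[b], t)):
                    return ("edge", a, b, (at(paths[a], t), at(paths[a], t + 1)), t)
    return None


meet = first_conflict([["A", "B", "C"], ["C", "B", "A"]])
swap = first_conflict([["A", "B"], ["B", "A"]])
clear = first_conflict([["A", "B"], ["C", "D"]])
print(meet, swap, clear)
```

A standard CBS solver would branch on the returned conflict, creating one child per agent with a constraint forbidding that agent's move, whereas the approach in the abstract would instead add a disjunctive constraint to the propositional model.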


Author(s):  
Edward Lam ◽  
Pierre Le Bodic ◽  
Daniel D. Harabor ◽  
Peter J. Stuckey

There are currently two broad strategies for optimal Multi-agent Pathfinding (MAPF): (1) search-based methods, which model and solve MAPF directly, and (2) compilation-based solvers, which reduce MAPF to instances of well-known combinatorial problems and can thus benefit from advances in solver techniques. In this work, we present an optimal algorithm, BCP, that hybridizes both approaches using Branch-and-Cut-and-Price, a decomposition framework developed for mathematical optimization. We formalize BCP and compare it empirically against CBSH and CBSH-RM, two leading search-based solvers. Conclusive results on standard benchmarks indicate that its performance exceeds the state of the art: it solves more instances on smaller grids and scales reliably to 100 or more agents on larger game maps.


2020 ◽  
Vol 34 (09) ◽  
pp. 13534-13538
Author(s):  
Sarit Kraus ◽  
Amos Azaria ◽  
Jelena Fiosina ◽  
Maike Greve ◽  
Noam Hazon ◽  
...  

Explanation is necessary for humans to understand and accept decisions made by an AI system when the system's goal is known. It is even more important when the AI system makes decisions in multi-agent environments where the human does not know the system's goals, since they may depend on other agents' preferences. In such situations, explanations should aim to increase user satisfaction, taking into account the system's decision, the user's and the other agents' preferences, the environment settings, and properties such as fairness, envy and privacy. Generating explanations that will increase user satisfaction is very challenging; to this end, we propose a new research direction: Explainable decisions in Multi-Agent Environments (xMASE). We then review the state of the art and discuss research directions towards efficient methodologies and algorithms for generating explanations that will increase users' satisfaction with AI systems' decisions in multi-agent environments.


Author(s):  
Anton Filatov ◽  
Kirill Krinkin

Limited computational resources are a challenging problem for moving agents that run algorithms such as simultaneous localization and mapping (SLAM). To increase accuracy under limited resources, one may add more computing agents, which together might explore the environment more quickly than a single agent and thus decrease the load on each agent. In this article, the state of the art in multi-agent SLAM algorithms is presented, and an approach that extends laser 2D single-hypothesis SLAM to multiple agents is introduced. The article describes the problems that a developer of such an approach faces, including map merging, relative pose calculation, and the roles of agents.


2020 ◽  
Vol 12 (5) ◽  
pp. 1935
Author(s):  
Roberto Dominguez ◽  
Salvatore Cannella

In this paper, we review relevant literature on the development of multi-agent systems applications for supply chain management. We give a general picture of the state of the art, showing the main applications developed using this novel methodology for analyzing diverse problems in industry. We also analyze generic frameworks for supply chain modelling, showing their main characteristics. We discuss the main topics addressed with this technique and the degree of development of the contributions.


Author(s):  
Wei Qiu ◽  
Haipeng Chen ◽  
Bo An

Over the past decades, Electronic Toll Collection (ETC) systems have proven capable of alleviating traffic congestion in urban areas. Dynamic Electronic Toll Collection (DETC) was recently proposed to further improve the efficiency of ETC, where tolls are dynamically set based on traffic dynamics. However, computing the optimal DETC scheme is computationally difficult, and existing approaches are limited to small-scale or partial road networks, which significantly restricts the adoption of DETC. To this end, we propose a novel multi-agent reinforcement learning (RL) approach for DETC. We make several key contributions: i) an enhancement over the state-of-the-art RL-based method with a deep neural network representation of the policy and value functions and a temporal difference learning framework to accelerate the update of target values, ii) a novel edge-based graph convolutional neural network (eGCN) to extract the spatio-temporal correlations of the road network state features, iii) a novel cooperative multi-agent reinforcement learning (MARL) approach that divides the whole road network into partitions according to their geographic and economic characteristics and trains a tolling agent for each partition. Experimental results show that our approach can scale up to realistic-sized problems with robust performance and significantly outperform the state-of-the-art method.
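
The temporal difference learning mentioned in contribution i) can be illustrated with a tabular TD(0) value update. This is a deliberately simplified sketch: the abstract describes deep neural network value functions, not a table, and the function name `td0_update`, the state labels, and the values of `alpha` and `gamma` are assumptions chosen for illustration.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular TD(0) update in place and return the TD error.

    values: dict of state -> estimated value; unseen states default to 0.0
    alpha:  learning rate (assumed value)
    gamma:  discount factor (assumed value)
    """
    # Bootstrapped target: immediate reward plus discounted next-state estimate.
    target = reward + gamma * values.get(next_state, 0.0)
    td_error = target - values.get(state, 0.0)
    # Move the current estimate a fraction alpha towards the target.
    values[state] = values.get(state, 0.0) + alpha * td_error
    return td_error


values = {}
err = td0_update(values, "s0", 1.0, "s1")  # hypothetical states and reward
print(err, values)
```

Because the target is built from the current estimate of the next state rather than from a full episode return, the value can be updated after every single transition.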

