scholarly journals Multi-Agent Pathfinding with Continuous Time

Author(s):  
Anton Andreychuk ◽  
Konstantin Yakovlev ◽  
Dor Atzmon ◽  
Roni Stern

Multi-Agent Pathfinding (MAPF) is the problem of finding paths for multiple agents such that every agent reaches its goal and the agents do not collide. Most prior work on MAPF were on grids, assumed agents' actions have uniform duration, and that time is discretized into timesteps. In this work, we propose a MAPF algorithm that do not assume any of these assumptions, is complete, and provides provably optimal solutions. This algorithm is based on a novel combination of Safe Interval Path Planning (SIPP), a continuous time single agent planning algorithms, and Conflict-Based Search (CBS). We analyze this algorithm, discuss its pros and cons, and evaluate it experimentally on several standard benchmarks.

2021 ◽  
Vol 11 (11) ◽  
pp. 4948
Author(s):  
Lorenzo Canese ◽  
Gian Carlo Cardarilli ◽  
Luca Di Di Nunzio ◽  
Rocco Fazzolari ◽  
Daniele Giardino ◽  
...  

In this review, we present an analysis of the most used multi-agent reinforcement learning algorithms. Starting with the single-agent reinforcement learning algorithms, we focus on the most critical issues that must be taken into account in their extension to multi-agent scenarios. The analyzed algorithms were grouped according to their features. We present a detailed taxonomy of the main multi-agent approaches proposed in the literature, focusing on their related mathematical models. For each algorithm, we describe the possible application fields, while pointing out its pros and cons. The described multi-agent algorithms are compared in terms of the most important characteristics for multi-agent reinforcement learning applications—namely, nonstationarity, scalability, and observability. We also describe the most common benchmark environments used to evaluate the performances of the considered methods.


2000 ◽  
Vol 13 ◽  
pp. 189-226 ◽  
Author(s):  
R. M. Jensen ◽  
M. M. Veloso

Recently model checking representation and search techniques were shown to be efficiently applicable to planning, in particular to non-deterministic planning. Such planning approaches use Ordered Binary Decision Diagrams (OBDDs) to encode a planning domain as a non-deterministic finite automaton and then apply fast algorithms from model checking to search for a solution. OBDDs can effectively scale and can provide universal plans for complex planning domains. We are particularly interested in addressing the complexities arising in non-deterministic, multi-agent domains. In this article, we present UMOP, a new universal OBDD-based planning framework for non-deterministic, multi-agent domains. We introduce a new planning domain description language, NADL, to specify non-deterministic, multi-agent domains. The language contributes the explicit definition of controllable agents and uncontrollable environment agents. We describe the syntax and semantics of NADL and show how to build an efficient OBDD-based representation of an NADL description. The UMOP planning system uses NADL and different OBDD-based universal planning algorithms. It includes the previously developed strong and strong cyclic planning algorithms. In addition, we introduce our new optimistic planning algorithm that relaxes optimality guarantees and generates plausible universal plans in some domains where no strong nor strong cyclic solution exists. We present empirical results applying UMOP to domains ranging from deterministic and single-agent with no environment actions to non-deterministic and multi-agent with complex environment actions. UMOP is shown to be a rich and efficient planning system.


Author(s):  
Hang Ma ◽  
Wolfgang Hönig ◽  
T. K. Satish Kumar ◽  
Nora Ayanian ◽  
Sven Koenig

The Multi-Agent Pickup and Delivery (MAPD) problem models applications where a large number of agents attend to a stream of incoming pickup-and-delivery tasks. Token Passing (TP) is a recent MAPD algorithm that is efficient and effective. We make TP even more efficient and effective by using a novel combinatorial search algorithm, called Safe Interval Path Planning with Reservation Table (SIPPwRT), for single-agent path planning. SIPPwRT uses an advanced data structure that allows for fast updates and lookups of the current paths of all agents in an online setting. The resulting MAPD algorithm TP-SIPPwRT takes kinematic constraints of real robots into account directly during planning, computes continuous agent movements with given velocities that work on non-holonomic robots rather than discrete agent movements with uniform velocity, and is complete for wellformed MAPD instances. We demonstrate its benefits for automated warehouses using both an agent simulator and a standard robot simulator. For example, we demonstrate that it can compute paths for hundreds of agents and thousands of tasks in seconds and is more efficient and effective than existing MAPD algorithms that use a post-processing step to adapt their paths to continuous agent movements with given velocities.


Author(s):  
Sven Gronauer ◽  
Klaus Diepold

AbstractThe advances in reinforcement learning have recorded sublime success in various domains. Although the multi-agent domain has been overshadowed by its single-agent counterpart during this progress, multi-agent reinforcement learning gains rapid traction, and the latest accomplishments address problems with real-world complexity. This article provides an overview of the current developments in the field of multi-agent deep reinforcement learning. We focus primarily on literature from recent years that combines deep reinforcement learning methods with a multi-agent scenario. To survey the works that constitute the contemporary landscape, the main contents are divided into three parts. First, we analyze the structure of training schemes that are applied to train multiple agents. Second, we consider the emergent patterns of agent behavior in cooperative, competitive and mixed scenarios. Third, we systematically enumerate challenges that exclusively arise in the multi-agent domain and review methods that are leveraged to cope with these challenges. To conclude this survey, we discuss advances, identify trends, and outline possible directions for future work in this research area.


2021 ◽  
Vol 11 (21) ◽  
pp. 10227
Author(s):  
Asad Ali Shahid ◽  
Jorge Said Vidal Sesin ◽  
Damjan Pecioski ◽  
Francesco Braghin ◽  
Dario Piga ◽  
...  

Many real-world tasks require multiple agents to work together. When talking about multiple agents in robotics, it is usually referenced to multiple manipulators in collaboration to solve a given task, where each one is controlled by a single agent. However, due to the increasing development of modular and re-configurable robots, it is also important to investigate the possibility of implementing multi-agent controllers that learn how to manage the manipulator’s degrees of freedom (DoF) in separated clusters for the execution of a given application (e.g., being able to face faults or, partially, new kinematics configurations). Within this context, this paper focuses on the decentralization of the robot control action learning and (re)execution considering a generic multi-DoF manipulator. Indeed, the proposed framework employs a multi-agent paradigm and investigates how such a framework impacts the control action learning process. Multiple variations of the multi-agent framework have been proposed and tested in this research, comparing the achieved performance w.r.t. a centralized (i.e., single-agent) control action learning framework, previously proposed by some of the authors. As a case study, a manipulation task (i.e., grasping and lifting) of an unknown object (to the robot controller) has been considered for validation, employing a Franka EMIKA panda robot. The MuJoCo environment has been employed to implement and test the proposed multi-agent framework. The achieved results show that the proposed decentralized approach is capable of accelerating the learning process at the beginning with respect to the single-agent framework while also reducing the computational effort. In fact, when decentralizing the controller, it is shown that the number of variables involved in the action space can be efficiently separated into several groups and several agents. This simplifies the original complex problem into multiple ones, efficiently improving the task learning process.


Hematology ◽  
2009 ◽  
Vol 2009 (1) ◽  
pp. 578-586 ◽  
Author(s):  
A. Keith Stewart

AbstractTreatment of myeloma relapse needs to be individualized to reflect the effectiveness and toxicities of prior therapies, with consideration given to pragmatic issues such as the tempo of relapse, age of the patient, access to drugs and patient preference. In general, combination therapies have been associated with higher response rates and improved progression-free survival and may be preferable when a rapid response is required. Nevertheless, in a slower-tempo relapse it is unclear at this juncture whether sequencing of drugs or multi-agent combinations offer superior overall survival results. Fortunately, active novel agents that offer further possibilities for some myeloma patients have become available in clinical trials. In this review we will describe the various classes of novel drugs being tested and the pros and cons of preclinical testing, and will particularly focus on two agents with single-agent activity in myeloma: carfilzomib, a proteasome inhibitor, and pomalidomide, a member of the immunomodulatory class of drugs.


Sign in / Sign up

Export Citation Format

Share Document