Training Coordination Proxy Agents Using Reinforcement Learning

Multi Agent Systems ◽

Adjustable Autonomy ◽

Novel Approach ◽

Learning Techniques ◽

Multi Agent ◽

Mixed Initiative ◽

Learning Team

In heterogeneous multi-agent systems, where human and non-human agents coexist, intelligent proxy agents can help smooth out fundamental differences. In this context, delegating the coordination role to proxy agents can improve the overall outcome of a task at the expense of human cognitive overload due to switching subtasks. Stability and commitment are characteristics of human teamwork, but must not prevent the detection of better opportunities. In addition, coordination proxy agents must be trained from examples as a single agent, but must interact with multiple agents. We apply machine learning techniques to the task of learning team preferences from mixed-initiative interactions and compare the outcome results of different simulated user patterns. This chapter introduces a novel approach for the adjustable autonomy of coordination proxies based on the reinforcement learning of abstract actions. In conclusion, some consequences of the symbiotic relationship that such an approach suggests are discussed.

A Novel Heterogeneous Swarm Reinforcement Learning Method for Sequential Decision Making Problems

Machine Learning and Knowledge Extraction ◽

10.3390/make1020035 ◽

2019 ◽

Vol 1 (2) ◽

pp. 590-610

Author(s):

Zohreh Akbari ◽

Rainer Unland

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Single Agent ◽

Sequential Decision Making ◽

Multi Agent Systems ◽

Sequential Decision ◽

Agent Systems ◽

Novel Approach ◽

Markov Decision ◽

Sequential Decision Making Problems (SDMPs) that can be modeled as Markov Decision Processes can be solved using methods that combine Dynamic Programming (DP) and Reinforcement Learning (RL). Depending on the problem scenarios and the available Decision Makers (DMs), such RL algorithms may be designed for single-agent systems or multi-agent systems that either consist of agents with individual goals and decision making capabilities, which are influenced by other agent’s decisions, or behave as a swarm of agents that collaboratively learn a single objective. Many studies have been conducted in this area; however, when concentrating on available swarm RL algorithms, one obtains a clear view of the areas that still require attention. Most of the studies in this area focus on homogeneous swarms and so far, systems introduced as Heterogeneous Swarms (HetSs) merely include very few, i.e., two or three sub-swarms of homogeneous agents, which either, according to their capabilities, deal with a specific sub-problem of the general problem or exhibit different behaviors in order to reduce the risk of bias. This study introduces a novel approach that allows agents, which are originally designed to solve different problems and hence have higher degrees of heterogeneity, to behave as a swarm when addressing identical sub-problems. In fact, the affinity between two agents, which measures the compatibility of agents to work together towards solving a specific sub-problem, is used in designing a Heterogeneous Swarm RL (HetSRL) algorithm that allows HetSs to solve the intended SDMPs.

EFFECTS OF COMMUNICATION ON GROUP LEARNING RATES IN A MULTI-AGENT ENVIRONMENT

Advances in Complex Systems ◽

10.1142/s0219525903000979 ◽

2003 ◽

Vol 06 (03) ◽

pp. 405-426 ◽

Cited By ~ 1

Author(s):

PAUL DARBYSHIRE

Keyword(s):

Reinforcement Learning ◽

Cognitive Abilities ◽

Complex Adaptive System ◽

Simulation Techniques ◽

Learning Rates ◽

Learning Techniques ◽

Complex Adaptive ◽

Rate Of Learning ◽

Distillations utilize multi-agent based modeling and simulation techniques to study warfare as a complex adaptive system at the conceptual level. The focus is placed on the interactions between the agents to facilitate study of cause and effect between individual interactions and overall system behavior. Current distillations do not utilize machine-learning techniques to model the cognitive abilities of individual combatants but employ agent control paradigms to represent agents as highly instinctual entities. For a team of agents implementing a reinforcement-learning paradigm, the rate of learning is not sufficient for agents to adapt to this hostile environment. However, by allowing the agents to communicate their respective rewards for actions performed as the simulation progresses, the rate of learning can be increased sufficiently to significantly increase the teams chances of survival. This paper presents the results of trials to measure the success of a team-based approach to the reinforcement-learning problem in a distillation, using reward communication to increase learning rates.

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

Value Function Transfer for Deep Multi-Agent Reinforcement Learning Based on N-Step Returns

10.24963/ijcai.2019/65 ◽

2019 ◽

Cited By ~ 2

Author(s):

Yong Liu ◽

Yujing Hu ◽

Yang Gao ◽

Yingfeng Chen ◽

Changjie Fan

Keyword(s):

Reinforcement Learning ◽

Knowledge Transfer ◽

Value Function ◽

Single Agent ◽

Multi Agent Systems ◽

Agent Systems ◽

Markov Decision ◽

Dimensional State Space ◽

Multi Agent ◽

Function Transfer

Many real-world problems, such as robot control and soccer game, are naturally modeled as sparse-interaction multi-agent systems. Reutilizing single-agent knowledge in multi-agent systems with sparse interactions can greatly accelerate the multi-agent learning process. Previous works rely on bisimulation metric to define Markov decision process (MDP) similarity for controlling knowledge transfer. However, bisimulation metric is costly to compute and is not suitable for high-dimensional state space problems. In this work, we propose more scalable transfer learning methods based on a novel MDP similarity concept. We start by defining the MDP similarity based on the N-step return (NSR) values of an MDP. Then, we propose two knowledge transfer methods based on deep neural networks called direct value function transfer and NSR-based value function transfer. We conduct experiments in image-based grid world, multi-agent particle environment (MPE) and Ms. Pac-Man game. The results indicate that the proposed methods can significantly accelerate multi-agent reinforcement learning and meanwhile get better asymptotic performance.

Machine Learning for Agents and Multi-Agent Systems

Intelligent Agent Software Engineering ◽

10.4018/978-1-59140-046-2.ch001 ◽

2011 ◽

pp. 1-26

Author(s):

Daniel Kudenko ◽

Dimitar Kazakov ◽

Eduardo Alonso

Keyword(s):

Machine Learning ◽

Autonomous Agents ◽

Multi Agent Systems ◽

Learning Agents ◽

Agent Systems ◽

Learning Techniques ◽

Comprehensive Survey ◽

Multi Agent ◽

The Relationship

In order to be truly autonomous, agents need the ability to learn from and adapt to the environment and other agents. This chapter introduces key concepts of machine learning and how they apply to agent and multi-agent systems. Rather than present a comprehensive survey, we discuss a number of issues that we believe are important in the design of learning agents and multi-agent systems. Specifically, we focus on the challenges involved in adapting (originally disembodied) machine learning techniques to situated agents, the relationship between learning and communication, learning to collaborate and compete, learning of roles, evolution and natural selection, and distributed learning. In the second part of the chapter, we focus on some practicalities and present two case studies.

Machine Learning for Agents and Multi-Agent Systems

Intelligent Information Technologies ◽

10.4018/978-1-59904-941-0.ch023 ◽

2011 ◽

pp. 403-420

Author(s):

Daniel Kudenko ◽

Dimitar Kazakov ◽

Eduardo Alonso

Keyword(s):

Machine Learning ◽

Autonomous Agents ◽

Multi Agent Systems ◽

Learning Agents ◽

Agent Systems ◽

Learning Techniques ◽

Comprehensive Survey ◽

Multi Agent ◽

The Relationship

Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning

Autonomous Agents and Multi-Agent Systems ◽

10.1007/s10458-021-09506-w ◽

2021 ◽

Vol 35 (2) ◽

Author(s):

Jacopo Castellini ◽

Frans A. Oliehoek ◽

Rahul Savani ◽

Shimon Whiteson

Keyword(s):

Reinforcement Learning ◽

Multiagent Systems ◽

Autonomous Agents ◽

Coordination Mechanism ◽

Network Architectures ◽

Value Functions ◽

Multi Agent Systems ◽

Learning Techniques ◽

Coordination Requirements ◽

AbstractRecent years have seen the application of deep reinforcement learning techniques to cooperative multi-agent systems, with great empirical success. However, given the lack of theoretical insight, it remains unclear what the employed neural networks are learning, or how we should enhance their learning power to address the problems on which they fail. In this work, we empirically investigate the learning power of various network architectures on a series of one-shot games. Despite their simplicity, these games capture many of the crucial problems that arise in the multi-agent setting, such as an exponential number of joint actions or the lack of an explicit coordination mechanism. Our results extend those in Castellini et al. (Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, AAMAS’19.International Foundation for Autonomous Agents and Multiagent Systems, pp 1862–1864, 2019) and quantify how well various approaches can represent the requisite value functions, and help us identify the reasons that can impede good performance, like sparsity of the values or too tight coordination requirements.

Cooperative Multi-Agent Systems Using Distributed Reinforcement Learning Techniques

Procedia Computer Science ◽

10.1016/j.procs.2018.07.286 ◽

2018 ◽

Vol 126 ◽

pp. 517-526 ◽

Cited By ~ 2

Author(s):

Wiem Zemzem ◽

Moncef Tagina

Keyword(s):

Reinforcement Learning ◽

Multi Agent Systems ◽

Agent Systems ◽

Learning Techniques ◽

Multi Agent ◽

Distributed Reinforcement

Special Issue on Machine Learning for Robotics and Swarm Systems

Journal of Robotics and Mechatronics ◽

10.20965/jrm.2019.p0519 ◽

2019 ◽

Vol 31 (4) ◽

pp. 519-519

Author(s):

Masahito Yamamoto ◽

Takashi Kawakami ◽

Keitaro Naruse

Keyword(s):

Machine Learning ◽

Autonomous Agents ◽

Autonomous Robots ◽

Multi Agent Systems ◽

Special Issue ◽

Learning Techniques ◽

Machine Learning Applications ◽

Multi Agent ◽

The Relationship

In recent years, machine-learning applications have been rapidly expanding in the fields of robotics and swarm systems, including multi-agent systems. Swarm systems were developed in the field of robotics as a kind of distributed autonomous robotic systems, imbibing the concepts of the emergent methodology for extremely redundant systems. They typically consist of homogeneous autonomous robots, which resemble living animals that build swarms. Machine-learning techniques such as deep learning have played a remarkable role in controlling robotic behaviors in the real world or multi-agents in the simulation environment. In this special issue, we highlight five interesting papers that cover topics ranging from the analysis of the relationship between the congestion among autonomous robots and the task performances, to the decision making process among multiple autonomous agents. We thank the authors and reviewers of the papers and hope that this special issue encourages readers to explore recent topics and future studies in machine-learning applications for robotics and swarm systems.

Reinforcement Learning Based Hierarchical Multi-Agent Robotic Search Team in Uncertain Environment

Mehran University Research Journal of Engineering and Technology ◽

10.22581/muet1982.2103.17 ◽

2021 ◽

Vol 40 (3) ◽

pp. 645-662

Author(s):

Shahzaib Hamid ◽

Ali Nasir ◽

Yasir Saleem

Keyword(s):

Reinforcement Learning ◽

Multi Agent Systems ◽

Qualitative Comparison ◽

Q Learning ◽

Novel Approach ◽

Learning Agent ◽

Markov Decision ◽

Multi Agent ◽

Efficient Learning ◽

Prior Models

Field of robotics has been under the limelight because of recent advances in Artificial Intelligence (AI). Due to increased diversity in multi-agent systems, new models are being developed to handle complexity of such systems. However, most of these models do not address problems such as; uncertainty handling, efficient learning, agent coordination and fault detection. This paper presents a novel approach of implementing Reinforcement Learning (RL) on hierarchical robotic search teams. The proposed algorithm handles uncertainties in the system by implementing Q-learning and depicts enhanced efficiency as well as better time consumption compared to prior models. The reason for that is each agent can take action on its own thus there is less dependency on leader agent for RL policy. The performance of this algorithm is measured by introducing agents in an unknown environment with both Markov Decision Process (MDP) and RL policies at their disposal. Simulation-based comparison of the agent motion is presented using the results from of MDP and RL policies. Furthermore, qualitative comparison of the proposed model with prior models is also presented.

A Novel Ship Collision Avoidance Awareness Approach for Cooperating Ships Using Multi-Agent Deep Reinforcement Learning

Journal of Marine Science and Engineering ◽

10.3390/jmse9101056 ◽

2021 ◽

Vol 9 (10) ◽

pp. 1056

Author(s):

Chen Chen ◽

Feng Ma ◽

Xiaobin Xu ◽

Yuwang Chen ◽

Jin Wang

Keyword(s):

Reinforcement Learning ◽

Collision Avoidance ◽

Driving Forces ◽

Practical Significance ◽

Individual Agent ◽

Reward Function ◽

Learning Techniques ◽

International Regulations ◽

Ships are special machineries with large inertias and relatively weak driving forces. Simulating the manual operations of manipulating ships with artificial intelligence (AI) and machine learning techniques becomes more and more common, in which avoiding collisions in crowded waters may be the most challenging task. This research proposes a cooperative collision avoidance approach for multiple ships using a multi-agent deep reinforcement learning (MADRL) algorithm. Specifically, each ship is modeled as an individual agent, controlled by a Deep Q-Network (DQN) method and described by a dedicated ship motion model. Each agent observes the state of itself and other ships as well as the surrounding environment. Then, agents analyze the navigation situation and make motion decisions accordingly. In particular, specific reward function schemas are designed to simulate the degree of cooperation among agents. According to the International Regulations for Preventing Collisions at Sea (COLREGs), three typical scenarios of simulation, which are head-on, overtaking and crossing, are established to validate the proposed approach. With sufficient training of MADRL, the ship agents were capable of avoiding collisions through cooperation in narrow crowded waters. This method provides new insights for bionic modeling of ship operations, which is of important theoretical and practical significance.