Idempotent algebra models of single-agent and multi-agent dynamics

Author(s):  
Dmitry Nikolayev
2021 ◽  
Vol 11 (11) ◽  
pp. 4948
Author(s):  
Lorenzo Canese ◽  
Gian Carlo Cardarilli ◽  
Luca Di Nunzio ◽  
Rocco Fazzolari ◽  
Daniele Giardino ◽  
...  

In this review, we present an analysis of the most widely used multi-agent reinforcement learning algorithms. Starting with the single-agent reinforcement learning algorithms, we focus on the most critical issues that must be taken into account in their extension to multi-agent scenarios. The analyzed algorithms were grouped according to their features. We present a detailed taxonomy of the main multi-agent approaches proposed in the literature, focusing on their related mathematical models. For each algorithm, we describe the possible application fields, while pointing out its pros and cons. The described multi-agent algorithms are compared in terms of the most important characteristics for multi-agent reinforcement learning applications, namely nonstationarity, scalability, and observability. We also describe the most common benchmark environments used to evaluate the performance of the considered methods.
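As a minimal sketch of the single-agent baseline the review starts from (illustrative code, not from the review itself): a tabular Q-learning update. In a multi-agent extension, each agent running this same update sees the others as part of a changing environment, which is the nonstationarity issue compared above. All names here are mine.

```python
from collections import defaultdict

def q_update(Q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.99):
    """One temporal-difference update of Q(s, a) toward the bootstrapped target."""
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

# One update from an all-zero table: the new estimate is alpha * reward.
Q = defaultdict(float)
q_update(Q, state=0, action=1, reward=1.0, next_state=0, actions=[0, 1])
```

When several agents update such tables simultaneously, each agent's reward depends on the others' current policies, so the fixed-environment assumption behind this update no longer holds.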


Sensors ◽  
2020 ◽  
Vol 20 (10) ◽  
pp. 2789 ◽  
Author(s):  
Hang Qi ◽  
Hao Huang ◽  
Zhiqun Hu ◽  
Xiangming Wen ◽  
Zhaoming Lu

In order to meet the ever-increasing traffic demand of Wireless Local Area Networks (WLANs), channel bonding is introduced in the IEEE 802.11 standards. Although channel bonding effectively increases the transmission rate, the wider channel reduces the number of non-overlapping channels and is more susceptible to interference. Meanwhile, the traffic load differs from one access point (AP) to another and changes significantly depending on the time of day. Therefore, the primary channel and channel bonding bandwidth should be carefully selected to meet traffic demand and guarantee the performance gain. In this paper, we propose an On-Demand Channel Bonding (O-DCB) algorithm based on Deep Reinforcement Learning (DRL) for heterogeneous WLANs, in which the APs have different channel bonding capabilities, to reduce transmission delay. In this problem, the state space is continuous and the action space is discrete. However, with single-agent DRL the size of the action space increases exponentially with the number of APs, which severely slows learning. To accelerate learning, Multi-Agent Deep Deterministic Policy Gradient (MADDPG) is used to train O-DCB. Real traffic traces collected from a campus WLAN are used to train and test O-DCB. Simulation results reveal that the proposed algorithm converges well and achieves lower delay than the compared algorithms.
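The exponential growth of the joint action space can be made concrete with a small arithmetic sketch (the channel and bandwidth counts below are illustrative, not taken from the paper): if each AP independently picks one of C primary channels and one of B bonding widths, a single centralized DRL agent must choose among (C · B)^n joint actions for n APs.

```python
def joint_action_space(n_aps, channels=8, widths=3):
    """Size of the joint discrete action space a single-agent controller faces."""
    return (channels * widths) ** n_aps

# Per-AP choice count is 8 * 3 = 24; the joint space explodes with n_aps.
sizes = [joint_action_space(n) for n in (1, 2, 5)]
print(sizes)  # grows exponentially with the number of APs
```

Factoring the decision across agents, as MADDPG does, keeps each agent's action space at the per-AP size (24 here) regardless of the network scale.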


2017 ◽  
Vol 13 (1) ◽  
pp. 155014771668484 ◽  
Author(s):  
Huthiafa Q Qadori ◽  
Zuriati A Zulkarnain ◽  
Zurina Mohd Hanapi ◽  
Shamala Subramaniam

Recently, wireless sensor networks have employed the concept of mobile agents to reduce energy consumption and achieve effective data gathering. Typically, in mobile-agent-based data gathering, finding the optimal itinerary for the mobile agent is an important and essential step. However, single-agent itinerary planning suffers from two primary disadvantages: task delay and a growing agent size as the scale of the network expands. Multi-agent itinerary planning overcomes these drawbacks. Despite its advantages, finding the optimal number of distributed mobile agents, grouping the source nodes, and determining the optimal itinerary of each mobile agent for simultaneous data gathering are still regarded as critical issues in wireless sensor networks. Therefore, in this article, the existing algorithms that have been identified in the literature to address the above issues are reviewed. The review shows that most of the algorithms used a single parameter to determine the optimal number of mobile agents in multi-agent itinerary planning without utilizing other parameters. More importantly, the review shows that these algorithms did not take into account the security of the data gathered by the mobile agents. Accordingly, we indicate the limitations of each proposed algorithm and provide new directions for future research.
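One sub-step of the problem surveyed above, sketched hypothetically (this is a generic greedy heuristic of mine, not any specific reviewed algorithm): once source nodes have been grouped and assigned to a mobile agent, the agent still needs a visiting order, which can be approximated with a nearest-neighbour itinerary.

```python
import math

def greedy_itinerary(start, nodes):
    """Order the assigned source nodes by repeatedly visiting the nearest one."""
    itinerary, current, remaining = [], start, list(nodes)
    while remaining:
        nxt = min(remaining, key=lambda p: math.dist(current, p))
        itinerary.append(nxt)
        remaining.remove(nxt)
        current = nxt
    return itinerary

# Agent dispatched from the sink at (0, 0) to three source nodes.
route = greedy_itinerary((0, 0), [(5, 5), (1, 0), (2, 2)])
```

Such a heuristic illustrates why itinerary quality interacts with node grouping: a poor grouping forces long hops no visiting order can repair.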


2016 ◽  
Vol 24 (6) ◽  
pp. 446-463 ◽  
Author(s):  
Mansoor Shaukat ◽  
Mandar Chitre

In this paper, the role of adaptive group cohesion in a cooperative multi-agent source localization problem is investigated. A distributed source localization algorithm is presented for a homogeneous team of simple agents. An agent uses a single sensor to sense the gradient and two sensors to sense its neighbors. The algorithm is a set of individualistic and social behaviors, where the individualistic behavior is as simple as an agent keeping its previous heading and is not self-sufficient for localizing the source. Source localization is achieved as an emergent property through the agents' adaptive interactions with their neighbors and the environment. Since a single agent is incapable of localizing the source, maintaining team connectivity at all times is crucial. Two simple temporal sampling behaviors, intensity-based adaptation and connectivity-based adaptation, ensure an efficient localization strategy with minimal agent breakaways. The agent behaviors are simultaneously optimized using a two-phase evolutionary optimization process. The optimized behaviors are estimated with analytical models, and the resulting collective behavior is validated against the agents' sensor and actuator noise, strong multipath interference due to environment variability, sensitivity to initialization distance, and loss of the source signal.
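The mix of individualistic and social behaviors can be sketched as a heading update (the blending rule and weight below are my illustration of the general idea, not the paper's actual behaviors): the agent keeps its previous heading when isolated, and otherwise turns partway toward its sensed neighbours.

```python
import math

def blended_heading(prev_heading, neighbor_bearings, w_social=0.5):
    """Blend 'keep previous heading' with 'turn toward neighbours' (radians)."""
    if not neighbor_bearings:
        return prev_heading  # no neighbours sensed: purely individualistic
    # Average neighbour bearing via unit vectors, avoiding angle wrap-around.
    x = sum(math.cos(b) for b in neighbor_bearings)
    y = sum(math.sin(b) for b in neighbor_bearings)
    social = math.atan2(y, x)
    bx = (1 - w_social) * math.cos(prev_heading) + w_social * math.cos(social)
    by = (1 - w_social) * math.sin(prev_heading) + w_social * math.sin(social)
    return math.atan2(by, bx)
```

In the paper's setting, it is weights and thresholds of this kind that the two-phase evolutionary optimization would tune.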


Author(s):  
Daxue Liu ◽  
Jun Wu ◽  
Xin Xu

Multi-agent reinforcement learning (MARL) provides a useful and flexible framework for multi-agent coordination in uncertain dynamic environments. However, the generalization ability and scalability of algorithms to large problem sizes, already problematic in single-agent RL, are an even more formidable obstacle in MARL applications. In this paper, a new MARL method based on ordinal action selection and approximate policy iteration, called OAPI (Ordinal Approximate Policy Iteration), is presented to address the scalability issue of MARL algorithms in common-interest Markov games. In OAPI, an ordinal action selection and learning strategy is integrated with distributed approximate policy iteration, not only to simplify the policy space and eliminate conflicts in multi-agent coordination, but also to approximate near-optimal policies for Markov games with large state spaces. Based on the policy space simplified by ordinal action selection, the OAPI algorithm implements distributed approximate policy iteration using online least-squares policy iteration (LSPI). This results in multi-agent coordination with good convergence properties and reduced computational complexity. Simulation results on a coordinated multi-robot navigation task illustrate the feasibility and effectiveness of the proposed approach.
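The ordinal action-selection idea can be sketched as follows (an illustrative reading of the abstract, not the authors' code): each agent keeps only the rank order of its actions rather than their exact values, so coordination operates over a much smaller, discretized policy space.

```python
def ordinal_policy(q_values):
    """Map each action to its rank (0 = best), discarding Q-value magnitudes."""
    ranked = sorted(q_values, key=q_values.get, reverse=True)
    return {action: rank for rank, action in enumerate(ranked)}

# Only the ordering survives; 1.5 vs 0.9 vs 0.2 becomes ranks 0, 1, 2.
ranks = ordinal_policy({"left": 0.2, "right": 1.5, "stay": 0.9})
```

Because many distinct value functions induce the same ranking, agents that agree on rankings can coordinate without agreeing on exact values, which is one way such a simplification reduces conflicts.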


Author(s):  
Maryam Ebrahimi

The main purpose of this study is to describe and analyze an agent from a distributed multi-agent-based system (ABS) built according to the BDI architecture. This agent is capable of autonomous action: based on a set of rules, it proposes general technology strategies (TSs) for renewable energy SMEs and interacts with a core agent in the multi-agent ABS. The recognition of internal strengths and weaknesses as well as external opportunities and threats takes place on the basis of a technological SWOT analysis. Proposed TSs are categorized into four types: aggressive strategy, turnaround-oriented strategy, diversification strategy, and defensive strategy. The agent architecture is explained in terms of three abstraction layers, called psychological, theoretical, and implementation. After validation of the system by experts, sample program code and output results of this agent are presented. This system provides information to facilitate an effective TS planning process.
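The mapping from SWOT quadrants to the four strategy types can be sketched as a single rule (this is my minimal reading of the abstract's categorization, not the system's actual rule base, and the numeric scoring is an assumption):

```python
def technology_strategy(strengths, weaknesses, opportunities, threats):
    """Pick one of the four TS types from a coarse SWOT balance."""
    internal = strengths >= weaknesses      # internally strong?
    external = opportunities >= threats     # externally favourable?
    if internal and external:
        return "aggressive strategy"
    if not internal and external:
        return "turnaround-oriented strategy"
    if internal and not external:
        return "diversification strategy"
    return "defensive strategy"

# Strong internally but facing external threats -> diversify.
choice = technology_strategy(strengths=3, weaknesses=1, opportunities=2, threats=4)
```

A real rule base of the kind the abstract describes would refine each quadrant with many domain-specific rules rather than a single threshold per axis.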


Author(s):  
František Capkovic

A Petri net (PN)-based analytical approach to describing both single-agent behaviour and the cooperation of several agents in multi-agent systems (MAS) is presented. PNs make it possible to express agent behaviour and cooperation by means of a vector state equation in the form of a linear discrete system. Hence, a modular approach to building the MAS model can also be used successfully. Three different interconnections of modules (agents, interfaces, environment), expressed by PN subnets, are introduced. The approach makes it possible to use methods of linear algebra. Moreover, it can be successfully applied in system analysis (e.g. reachability of states), in testing system properties, and even in control synthesis.
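The linear discrete state equation mentioned above is the standard Petri net marking equation m_{k+1} = m_k + C·u_k, where C is the incidence matrix (places × transitions) and u_k the firing vector. A minimal sketch with a two-place, two-transition net of my own choosing:

```python
def next_marking(m, C, u):
    """Petri net state equation: m' = m + C @ u, in plain Python lists."""
    return [mi + sum(cij * uj for cij, uj in zip(row, u)) for mi, row in zip(m, C)]

# Transition t1 moves a token p1 -> p2; t2 moves it back p2 -> p1.
C = [[-1,  1],
     [ 1, -1]]
m1 = next_marking([1, 0], C, [1, 0])  # fire t1 once from marking (1, 0)
```

Because reachable markings are generated by this affine equation, reachability and other properties become questions about integer solutions of linear systems, which is exactly the linear-algebra leverage the abstract points to.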


2002 ◽  
pp. 98-108
Author(s):  
Rahul Singh ◽  
Mark A. Gill

Intelligent agents and multi-agent technologies are an emerging area of computing and communications that holds much promise for a wide variety of applications in information technology. Agent-based systems range from a simple, single-agent system performing tasks such as email filtering to a very complex, distributed system of multiple agents, each involved in individual and system-wide goal-oriented activity. With the tremendous growth of the Internet and Internet-based computing and the explosion of commercial activity on the Internet in recent years, intelligent agent-based systems are being applied in a wide variety of electronic commerce applications. In order to act autonomously in a market environment, agents must be able to establish and maintain trust relationships. Without trust, commerce will not take place. This research extends previous work on intelligent agents with a mechanism for handling the trust relationship and shows how agents can be fully used as intermediaries in commerce.


Author(s):  
Chengzhi Yuan

This paper addresses the problem of leader-following consensus control of general linear multi-agent systems (MASs) with diverse time-varying input delays under the integral quadratic constraint (IQC) framework. A novel exact-memory distributed output-feedback delay controller structure is proposed, which utilizes not only relative estimation-state information from neighboring agents but also local real-time information about the time delays and the associated dynamic IQC-induced states from the agent itself for feedback control. As a result, the distributed consensus problem can be decomposed into H∞ stabilization subproblems for a set of independent linear fractional transformation (LFT) systems, whose dimensions equal those of a single-agent plant plus the associated local IQC dynamics. New delay control synthesis conditions for each subproblem are fully characterized as linear matrix inequalities (LMIs). A numerical example is used to demonstrate the proposed approach.
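A much-simplified illustration of the underlying consensus mechanism (single integrators, no delays, no IQC dynamics; the gains and graph are mine): each follower applies u_i = k · Σ_j a_ij (x_j − x_i), while the leader, agent 0, holds its state. The paper treats the far harder general linear, time-varying-delay case; this sketch only shows the leader-following consensus rule itself.

```python
def consensus_step(x, adj, k=0.3):
    """One discrete step of leader-following consensus; agent 0 is the leader."""
    new = [x[0]]  # leader keeps its state
    for i in range(1, len(x)):
        u = k * sum(adj[i][j] * (x[j] - x[i]) for j in range(len(x)))
        new.append(x[i] + u)
    return new

x = [1.0, 0.0, 2.0]                       # leader starts (and stays) at 1.0
adj = [[0, 0, 0], [1, 0, 1], [1, 1, 0]]   # followers sense the leader and each other
for _ in range(50):
    x = consensus_step(x, adj)            # followers converge toward the leader
```

For this graph and gain, the follower error dynamics are a contraction, so both followers approach the leader's state; the paper's LMI conditions play the analogous stabilizing role in the general delayed setting.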

