Neurocomputational mechanism of controllability inference under a multi-agent setting

Controllability perception significantly influences motivated behavior and emotion and requires an estimation of one’s influence on an environment. Previous studies have shown that an agent can infer controllability by observing contingency between one’s own action and outcome if there are no other outcome-relevant agents in an environment. However, if there are multiple agents who can influence the outcome, estimation of one’s genuine controllability requires exclusion of other agents’ possible influence. Here, we first investigated a computational and neural mechanism of controllability inference in a multi-agent setting. Our novel multi-agent Bayesian controllability inference model showed that other people’s action-outcome contingency information is integrated with one’s own action-outcome contingency to infer controllability, which can be explained as a Bayesian inference. Model-based functional MRI analyses showed that multi-agent Bayesian controllability inference recruits the temporoparietal junction (TPJ) and striatum. Then, this inferred controllability information was leveraged to increase motivated behavior in the vmPFC. These results generalize the previously known role of the striatum and vmPFC in single-agent controllability to multi-agent controllability, and this generalized role requires the TPJ in addition to the striatum of single-agent controllability to integrate both self- and other-related information. Finally, we identified an innate positive bias toward the self during the multi-agent controllability inference, which facilitated behavioral adaptation under volatile controllability. Furthermore, low positive bias and high negative bias were associated with increased daily feelings of guilt. Our results provide a mechanism of how our sense of controllability fluctuates due to other people in our lives, which might be related to social learned helplessness and depression.
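To make the idea of combining self- and other-related contingency concrete, here is a toy Bayesian update in Python. It is not the authors' model. The two-hypothesis setup ("the outcome follows my action" versus "the outcome follows the other agent's action"), the function name, and the `reliability` value of 0.9 are all assumptions made for illustration.

```python
def update_self_control_belief(prior, own_action, other_action, outcome,
                               reliability=0.9):
    """Posterior probability that the outcome is driven by one's own action.

    Toy assumption: exactly one agent controls the outcome, and the outcome
    matches the controlling agent's action with probability `reliability`.
    """
    lik_self = reliability if outcome == own_action else 1.0 - reliability
    lik_other = reliability if outcome == other_action else 1.0 - reliability
    evidence = prior * lik_self + (1.0 - prior) * lik_other
    return prior * lik_self / evidence


# Each trial: (own action, other agent's action, observed outcome).
trials = [(1, 1, 1), (1, 0, 1), (0, 1, 0)]
belief = 0.5  # flat prior over "self controls" vs "other controls"
beliefs = []
for own, other, out in trials:
    belief = update_self_control_belief(belief, own, other, out)
    beliefs.append(belief)
```

In this sketch the first trial leaves the belief unchanged because both agents chose the same action, so the outcome cannot distinguish between them. Only trials where the actions differ are informative, which is one simple way to see why the other agent's contingency information matters.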

Multi-Agent Reinforcement Learning: A Review of Challenges and Applications

Applied Sciences ◽

10.3390/app11114948 ◽

2021 ◽

Vol 11 (11) ◽

pp. 4948

Author(s):

Lorenzo Canese ◽

Gian Carlo Cardarilli ◽

Luca Di Nunzio ◽

Rocco Fazzolari ◽

Daniele Giardino ◽

...

Keyword(s):

Reinforcement Learning ◽

Mathematical Models ◽

Learning Algorithms ◽

Single Agent ◽

Critical Issues ◽

Multi Agent ◽

Pros And Cons ◽

Application Fields

In this review, we present an analysis of the most used multi-agent reinforcement learning algorithms. Starting with the single-agent reinforcement learning algorithms, we focus on the most critical issues that must be taken into account in their extension to multi-agent scenarios. The analyzed algorithms were grouped according to their features. We present a detailed taxonomy of the main multi-agent approaches proposed in the literature, focusing on their related mathematical models. For each algorithm, we describe the possible application fields, while pointing out its pros and cons. The described multi-agent algorithms are compared in terms of the most important characteristics for multi-agent reinforcement learning applications—namely, nonstationarity, scalability, and observability. We also describe the most common benchmark environments used to evaluate the performances of the considered methods.

On-Demand Channel Bonding in Heterogeneous WLANs: A Multi-Agent Deep Reinforcement Learning Approach

Sensors ◽

10.3390/s20102789 ◽

2020 ◽

Vol 20 (10) ◽

pp. 2789 ◽

Cited By ~ 1

Author(s):

Hang Qi ◽

Hao Huang ◽

Zhiqun Hu ◽

Xiangming Wen ◽

Zhaoming Lu

Keyword(s):

Reinforcement Learning ◽

Transmission Rate ◽

Single Agent ◽

Time Of Day ◽

Action Space ◽

Traffic Load ◽

Traffic Demand ◽

Channel Bonding ◽

On Demand ◽

Multi Agent

In order to meet the ever-increasing traffic demand of Wireless Local Area Networks (WLANs), channel bonding is introduced in IEEE 802.11 standards. Although channel bonding effectively increases the transmission rate, the wider channel reduces the number of non-overlapping channels and is more susceptible to interference. Meanwhile, the traffic load differs from one access point (AP) to another and changes significantly depending on the time of day. Therefore, the primary channel and channel bonding bandwidth should be carefully selected to meet traffic demand and guarantee the performance gain. In this paper, we proposed an On-Demand Channel Bonding (O-DCB) algorithm based on Deep Reinforcement Learning (DRL) for heterogeneous WLANs to reduce transmission delay, where the APs have different channel bonding capabilities. In this problem, the state space is continuous and the action space is discrete. However, the size of action space increases exponentially with the number of APs by using single-agent DRL, which severely affects the learning rate. To accelerate learning, Multi-Agent Deep Deterministic Policy Gradient (MADDPG) is used to train O-DCB. Real traffic traces collected from a campus WLAN are used to train and test O-DCB. Simulation results reveal that the proposed algorithm has good convergence and lower delay than other algorithms.
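The scaling argument in this abstract can be checked with a few lines of Python. This is a sketch under stated assumptions, not the O-DCB implementation: it assumes every AP chooses from the same number of discrete options, and the figure of 4 options per AP is hypothetical.

```python
def joint_action_count(actions_per_ap, num_aps):
    """Size of the joint action space when one agent picks for all APs."""
    return actions_per_ap ** num_aps


def per_agent_action_total(actions_per_ap, num_aps):
    """Total number of action outputs when each AP is its own agent."""
    return actions_per_ap * num_aps


# Hypothetical setting: 4 channel/bandwidth options per AP.
for n in (2, 5, 10):
    print(n, joint_action_count(4, n), per_agent_action_total(4, n))
```

With one agent per AP, the number of outputs grows linearly with the number of APs, while the single-agent joint space grows exponentially. That gap is the motivation the abstract gives for a multi-agent formulation.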

Multi-mobile agent itinerary planning algorithms for data gathering in wireless sensor networks: A review paper

International Journal of Distributed Sensor Networks ◽

10.1177/1550147716684841 ◽

2017 ◽

Vol 13 (1) ◽

pp. 155014771668484 ◽

Cited By ~ 10

Author(s):

Huthiafa Q Qadori ◽

Zuriati A Zulkarnain ◽

Zurina Mohd Hanapi ◽

Shamala Subramaniam

Keyword(s):

Wireless Sensor Networks ◽

Sensor Networks ◽

Mobile Agent ◽

Mobile Agents ◽

Single Agent ◽

Data Gathering ◽

Optimal Number ◽

Wireless Sensor ◽

Itinerary Planning ◽

Multi Agent

Recently, wireless sensor networks have employed the concept of mobile agents to reduce energy consumption and obtain effective data gathering. Typically, in data gathering based on mobile agents, finding the optimal itinerary for the mobile agent is an essential step. However, single-agent itinerary planning suffers from two primary disadvantages: task delay and large size of the mobile agent as the scale of the network is expanded. Thus, using multi-agent itinerary planning overcomes the drawbacks of single-agent itinerary planning. Despite the advantages of multi-agent itinerary planning, finding the optimal number of distributed mobile agents, source nodes grouping, and optimal itinerary of each mobile agent for simultaneous data gathering are still regarded as critical issues in wireless sensor networks. Therefore, in this article, the existing algorithms that have been identified in the literature to address the above issues are reviewed. The review shows that most of the algorithms used one parameter to find the optimal number of mobile agents in multi-agent itinerary planning without utilizing other parameters. More importantly, the review shows that these algorithms did not take into account the security of the data gathered by the mobile agent. Accordingly, we indicate the limitations of each proposed algorithm and provide new directions for future research.

Dynamic and stable brain connectivity during movie watching as revealed by functional MRI

10.1101/2021.09.14.460293 ◽

2021 ◽

Author(s):

Xin Di ◽

Zhiguo Zhang ◽

Ting Xu ◽

Bharat B. Biswal

Keyword(s):

Functional Mri ◽

Brain Connectivity ◽

Brain Regions ◽

Posterior Cingulate Cortex ◽

Temporoparietal Junction ◽

Related Information ◽

Movie Clip ◽

Brain Changes ◽

Brain Processing ◽

Dynamic Connectivity

Spatially remote brain regions show synchronized activity as typically revealed by correlated functional MRI (fMRI) signals. An emerging line of research has focused on the temporal fluctuations of connectivity; however, its relationships with stable connectivity have not been clearly illustrated. We examined the stable and dynamic connectivity from fMRI data when the participants watched four different movie clips. Using inter-individual correlation, we were able to estimate functionally meaningful dynamic connectivity associated with different movies. Widespread consistent dynamic connectivity was observed for each movie clip as well as their differences between clips. A cartoon movie clip showed higher consistent dynamic connectivity with the posterior cingulate cortex and supramarginal gyrus, while a court drama clip showed higher dynamic connectivity with the auditory cortex and temporoparietal junction, which suggests the involvement of specific brain processing for different movie contents. In contrast, stable connectivity was highly similar among the movie clips, and showed fewer statistically significant differences. The patterns of dynamic connectivity had higher accuracy for classifications of different movie clips than the stable connectivity and regional activity. These results support the functional significance of dynamic connectivity in reflecting functional brain changes, which could provide more functionally related information than stable connectivity.

When honest people cheat, and cheaters are honest: Cognitive control processes override our moral default

10.1101/2020.01.23.907634 ◽

2020 ◽

Cited By ~ 1

Author(s):

Sebastian P.H. Speer ◽

Ale Smidts ◽

Maarten A.S. Boksem

Keyword(s):

Cognitive Control ◽

Inferior Frontal Gyrus ◽

Neural Mechanism ◽

Cingulate Cortex ◽

Posterior Cingulate Cortex ◽

Anterior Cingulate ◽

Temporoparietal Junction ◽

Self Image ◽

Trial Basis

Every day, we are faced with the conflict between the temptation to cheat for financial gains and maintaining a positive image of ourselves as being a ‘good person’. While it has been proposed that cognitive control is needed to mediate this conflict between reward and our moral self-image, the exact role of cognitive control in (dis)honesty remains elusive. Here, we identify this role, by investigating the neural mechanism underlying cheating. We developed a novel task which allows for inconspicuously measuring spontaneous cheating on a trial-by-trial basis in the MRI scanner. We found that activity in the Nucleus Accumbens promotes cheating, particularly for individuals who cheat a lot, while a network consisting of Posterior Cingulate Cortex, Temporoparietal Junction and Medial Prefrontal Cortex promotes honesty, particularly in individuals who are generally honest. Finally, activity in areas associated with Cognitive Control (Anterior Cingulate Cortex and Inferior Frontal Gyrus) helped dishonest participants to be honest, whereas it promoted cheating for honest participants. Thus, our results suggest that cognitive control is not needed to be honest or dishonest per se, but that it depends on an individual’s moral default.

Adaptive behaviors in multi-agent source localization using passive sensing

Adaptive Behavior ◽

10.1177/1059712316664120 ◽

2016 ◽

Vol 24 (6) ◽

pp. 446-463 ◽

Cited By ~ 4

Author(s):

Mansoor Shaukat ◽

Mandar Chitre

Keyword(s):

Source Localization ◽

Single Agent ◽

Analytical Models ◽

Localization Algorithm ◽

Two Phase ◽

Temporal Sampling ◽

Passive Sensing ◽

Source Signal ◽

Localization Strategy ◽

Multi Agent

In this paper, the role of adaptive group cohesion in a cooperative multi-agent source localization problem is investigated. A distributed source localization algorithm is presented for a homogeneous team of simple agents. An agent uses a single sensor to sense the gradient and two sensors to sense its neighbors. The algorithm is a set of individualistic and social behaviors where the individualistic behavior is as simple as an agent keeping its previous heading and is not self-sufficient in localizing the source. Source localization is achieved as an emergent property through agent’s adaptive interactions with the neighbors and the environment. Given a single agent is incapable of localizing the source, maintaining team connectivity at all times is crucial. Two simple temporal sampling behaviors, intensity-based-adaptation and connectivity-based-adaptation, ensure an efficient localization strategy with minimal agent breakaways. The agent behaviors are simultaneously optimized using a two phase evolutionary optimization process. The optimized behaviors are estimated with analytical models and the resulting collective behavior is validated against the agent’s sensor and actuator noise, strong multi-path interference due to environment variability, initialization distance sensitivity and loss of source signal.

Multi-agent reinforcement learning using ordinal action selection and approximate policy iteration

International Journal of Wavelets Multiresolution and Information Processing ◽

10.1142/s0219691316500533 ◽

2016 ◽

Vol 14 (06) ◽

pp. 1650053

Author(s):

Daxue Liu ◽

Jun Wu ◽

Xin Xu

Keyword(s):

Reinforcement Learning ◽

Single Agent ◽

Action Selection ◽

Policy Iteration ◽

Common Interest ◽

Policy Space ◽

Markov Games ◽

Approximate Policy Iteration ◽

Multi Agent ◽

Agent Coordination

Multi-agent reinforcement learning (MARL) provides a useful and flexible framework for multi-agent coordination in uncertain dynamic environments. However, the generalization ability and scalability of algorithms to large problem sizes, already problematic in single-agent RL, are an even more formidable obstacle in MARL applications. In this paper, a new MARL method based on ordinal action selection and approximate policy iteration, called OAPI (Ordinal Approximate Policy Iteration), is presented to address the scalability issue of MARL algorithms in common-interest Markov Games. In OAPI, an ordinal action selection and learning strategy is integrated with distributed approximate policy iteration not only to simplify the policy space and eliminate the conflicts in multi-agent coordination, but also to realize the approximation of near-optimal policies for Markov Games with large state spaces. Based on the simplified policy space using ordinal action selection, the OAPI algorithm implements distributed approximate policy iteration utilizing online least-squares policy iteration (LSPI). This results in multi-agent coordination with good convergence properties and reduced computational complexity. The simulation results of a coordinated multi-robot navigation task illustrate the feasibility and effectiveness of the proposed approach.

BDI Approach to Build a Single Agent of a Distributed Multi-Agent System

Advances in Business Information Systems and Analytics - Modeling and Simulation Techniques for Improved Business Processes ◽

10.4018/978-1-5225-3226-2.ch002 ◽

2018 ◽

pp. 24-49

Author(s):

Maryam Ebrahimi

Keyword(s):

Swot Analysis ◽

Planning Process ◽

Single Agent ◽

Defensive Strategy ◽

Diversification Strategy ◽

Agent Based ◽

Autonomous Action ◽

System Validation ◽

Multi Agent ◽

Technology Strategies

The main purpose of this study is to describe and analyze an agent from a distributed multi-agent based system (ABS) according to the BDI architecture. This agent is capable of autonomous action to propose general technology strategies (TSs) in renewable energy SMEs based on a set of rules, and it interacts with a core agent in the multi ABS. The recognition of internal strengths and weaknesses as well as external opportunities and threats takes place on the basis of technological SWOT analysis. Proposed TSs are categorized into four types: aggressive strategy, turnaround-oriented strategy, diversification strategy, and defensive strategy. The agent architecture is explained in terms of three abstraction layers: psychological, theoretical, and implementation. After system validation by experts, some program code and output results of this agent are presented. This system provides information to facilitate the TS planning process to be carried out effectively.

A System Approach to Describing, Analysing and Control of the Behaviour of Agents in MAS

Cybernetics and Systems Theory in Management ◽

10.4018/978-1-61520-668-1.ch014 ◽

2010 ◽

pp. 253-273

Author(s):

František Capkovic

Keyword(s):

System Analysis ◽

Single Agent ◽

State Equation ◽

Control Synthesis ◽

System Control ◽

Multi Agent Systems ◽

Linear Discrete System ◽

Multi Agent ◽

System Properties ◽

And Control

A Petri net (PN)-based analytical approach to describing both single-agent behaviour and the cooperation of several agents in MAS (multi-agent systems) is presented. PN make it possible to express agent behaviour and cooperation by means of a vector state equation in the form of a linear discrete system. Hence, a modular approach to creating the MAS model can also be used. Three different interconnections of modules (agents, interfaces, environment) expressed by PN subnets are introduced. The approach makes it possible to use methods of linear algebra. Moreover, it can be used in system analysis (e.g. the reachability of states), in testing the system properties, and even in system control synthesis.
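As a rough illustration of a Petri net state equation, the following Python sketch uses the common textbook form x_{k+1} = x_k + B u_k, where B = Post - Pre is the incidence matrix and u_k selects the fired transition. The notation, the function name `fire`, and the two-place "idle/busy" agent are assumptions for this example and may differ from the chapter's own formulation.

```python
def fire(marking, pre, post, transition):
    """Apply x_{k+1} = x_k + (post - pre) u_k for a single fired transition.

    `pre[p][t]` and `post[p][t]` are the arc weights between place p and
    transition t; u_k is implicitly the unit vector for `transition`.
    """
    places = range(len(marking))
    # A transition is enabled only if every input place holds enough tokens.
    if any(marking[p] < pre[p][transition] for p in places):
        raise ValueError("transition %d is not enabled" % transition)
    return [marking[p] + post[p][transition] - pre[p][transition]
            for p in places]


# Hypothetical agent with places [idle, busy] and
# transitions t0 (start work) and t1 (finish work).
PRE = [[1, 0],
       [0, 1]]
POST = [[0, 1],
        [1, 0]]

x0 = [1, 0]                  # one token in "idle"
x1 = fire(x0, PRE, POST, 0)  # start work
x2 = fire(x1, PRE, POST, 1)  # finish work
```

Because each step is a linear update of the marking vector, questions such as state reachability can be approached with linear-algebraic tools, which is the advantage the abstract points to.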

Intelligent Agents in a Trust Environment

Intelligent Support Systems ◽

10.4018/978-1-931777-00-1.ch007 ◽

2002 ◽

pp. 98-108

Author(s):

Rahul Singh ◽

Mark A. Gill

Keyword(s):

Distributed System ◽

Intelligent Agents ◽

Single Agent ◽

Intelligent Agent ◽

The Internet ◽

Commercial Activity ◽

Agent Based ◽

Trust Relationships ◽

Multi Agent ◽

Oriented Activity

Intelligent agents and multi-agent technologies are an emerging technology in computing and communications that hold much promise for a wide variety of applications in Information Technology. Agent-based systems range from the simple, single agent system performing tasks such as email filtering, to a very complex, distributed system of multiple agents each involved in individual and system wide goal-oriented activity. With the tremendous growth in the Internet and Internet-based computing and the explosion of commercial activity on the Internet in recent years, intelligent agent-based systems are being applied in a wide variety of electronic commerce applications. In order to be able to act autonomously in a market environment, agents must be able to establish and maintain trust relationships. Without trust, commerce will not take place. This research extends previous work in intelligent agents to include a mechanism for handling the trust relationship and shows how agents can be fully used as intermediaries in commerce.
