Inducing selfish agents towards social efficient solutions

2020
Author(s):
João Schapke
Ana Bazzan

Many multi-agent reinforcement learning (MARL) scenarios lead towards Nash equilibria, which are known to not always be socially efficient. In this study we aim to align the social optimization objective of the system with the individual objectives of the agents by adopting a central controller which can interact with the agents. In detail, our approach establishes a communication channel between reinforcement learning agents and a controller implemented with metaheuristics. The interaction benefits the convergence of both algorithms. Further, we evaluate our method in repeated games with a high price of anarchy and show that our approach is able to overcome many of the issues caused by the non-cooperative behaviour of the agents and the non-stationary effects they cause.
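
To make the gap between equilibrium and social optimum concrete, the sketch below enumerates the pure-strategy Nash equilibria of a two-player prisoner's dilemma and compares their welfare with the best achievable welfare. The payoff values and all function names are assumptions chosen for illustration, not taken from the study, and the ratio shown is the welfare-based form of the price of anarchy.

```python
from itertools import product

# Hypothetical prisoner's dilemma payoffs: PAYOFFS[(a0, a1)] = (u0, u1),
# with action 0 = cooperate and action 1 = defect.
PAYOFFS = {
    (0, 0): (3, 3),
    (0, 1): (0, 5),
    (1, 0): (5, 0),
    (1, 1): (1, 1),
}
ACTIONS = (0, 1)


def welfare(profile):
    """Social welfare of an action profile: the sum of both payoffs."""
    return sum(PAYOFFS[profile])


def is_pure_nash(profile):
    """True if no player can gain by unilaterally changing its action."""
    for player in (0, 1):
        for alt in ACTIONS:
            deviated = list(profile)
            deviated[player] = alt
            if PAYOFFS[tuple(deviated)][player] > PAYOFFS[profile][player]:
                return False
    return True


def price_of_anarchy():
    """Return the pure equilibria and optimal welfare / worst equilibrium welfare."""
    profiles = list(product(ACTIONS, repeat=2))
    equilibria = [p for p in profiles if is_pure_nash(p)]
    best = max(welfare(p) for p in profiles)
    worst_eq = min(welfare(p) for p in equilibria)
    return equilibria, best / worst_eq
```

With these assumed payoffs, mutual defection is the only pure equilibrium and the ratio is 3, which is the kind of inefficiency a coordinating controller would try to reduce.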

2021
Vol 0 (0)
Author(s):
Niklas Rach
Klaus Weber
Yuchi Yang
Stefan Ultes
Elisabeth André
...  

Persuasive argumentation depends on multiple aspects, which include not only the content of the individual arguments, but also the way they are presented. The presentation of arguments is crucial – in particular in the context of dialogical argumentation. However, the effects of different discussion styles on the listener are hard to isolate in human dialogues. In order to demonstrate and investigate various styles of argumentation, we propose a multi-agent system in which different aspects of persuasion can be modelled and investigated separately. Our system utilizes argument structures extracted from text-based reviews for which a minimal bias of the user can be assumed. The persuasive dialogue is modelled as a dialogue game for argumentation that was motivated by the objective to enable both natural and flexible interactions between the agents. In order to support a comparison of factual against affective persuasion approaches, we implemented two fundamentally different strategies for both agents: The logical policy utilizes deep Reinforcement Learning in a multi-agent setup to optimize the strategy with respect to the game formalism and the available argument. In contrast, the emotional policy selects the next move in compliance with an agent emotion that is adapted to user feedback to persuade on an emotional level. The resulting interaction is presented to the user via virtual avatars and can be rated through an intuitive interface.


Sensors
2020
Vol 20 (17)
pp. 4739
Author(s):
Ory Walker
Fernando Vanegas
Felipe Gonzalez

The problem of multi-agent remote sensing for the purposes of finding survivors or surveying points of interest in GPS-denied and partially observable environments remains a challenge. This paper presents a framework for multi-agent target-finding using a combination of online POMDP based planning and Deep Reinforcement Learning based control. The framework is implemented considering planning and control as two separate problems. The planning problem is defined as a decentralised multi-agent graph search problem and is solved using a modern online POMDP solver. The control problem is defined as a local continuous-environment exploration problem and is solved using modern Deep Reinforcement Learning techniques. The proposed framework combines the solution to both of these problems and testing shows that it enables multiple agents to find a target within large, simulated test environments in the presence of unknown obstacles and obstructions. The proposed approach could also be extended or adapted to a number of time sensitive remote-sensing problems, from searching for multiple survivors during a disaster to surveying points of interest in a hazardous environment by adjusting the individual model definitions.
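
Online POMDP planners of the kind mentioned here reason over a belief, that is, a probability distribution over the hidden state. The sketch below shows one Bayes update of a target-location belief after an agent searches a single graph node. It is only an illustration under stated assumptions: the function name, the node labels, and the sensor model (detection probability `p_detect`, no false positives) are hypothetical and are not taken from the paper or its solver.

```python
def update_belief(belief, searched, detected, p_detect=0.9):
    """Bayes update of a target-location belief after searching one node.

    belief:   dict mapping node -> prior probability that the target is there
    searched: the node the agent just inspected
    detected: True if the sensor reported the target
    p_detect: assumed probability of detecting a target that is present
              (the assumed sensor never reports a target that is absent)
    """
    posterior = {}
    for node, prior in belief.items():
        if detected:
            likelihood = p_detect if node == searched else 0.0
        else:
            likelihood = (1.0 - p_detect) if node == searched else 1.0
        posterior[node] = prior * likelihood

    total = sum(posterior.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    # Normalise so the posterior is again a probability distribution.
    return {node: p / total for node, p in posterior.items()}
```

For example, starting from a uniform belief over four nodes, an unsuccessful search of one node shifts probability mass to the three unsearched nodes, which is what steers a planner towards unexplored regions.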


Author(s):  
Takayuki Osogami
Rudy Raymond

We study reinforcement learning for controlling multiple agents in a collaborative manner. In some of those tasks, it is not sufficient for the individual agents to take relevant actions; those actions should also be diverse. We propose the approach of using the determinant of a positive semidefinite matrix to approximate the action-value function in reinforcement learning, where we learn the matrix in such a way that it represents the relevance and diversity of the actions. Experimental results show that the proposed approach allows the agents to learn a nearly optimal policy approximately ten times faster than baseline approaches in benchmark tasks of multi-agent reinforcement learning. The proposed approach is also shown to achieve performance that cannot be achieved with conventional approaches in partially observable environments with exponentially large action spaces.
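
The intuition for why a determinant can capture both relevance and diversity can be seen in the two-action case. The sketch below builds a 2x2 positive semidefinite kernel whose diagonal encodes relevance and whose off-diagonal encodes similarity, in the style of determinantal point processes. The relevance scores `q1`, `q2`, the feature vectors, and the function names are assumptions for illustration; the matrix learned in the paper is not reproduced here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def joint_score(q1, f1, q2, f2):
    """Determinant of the kernel L[i][j] = q_i * q_j * cos(f_i, f_j).

    q1, q2: assumed relevance scores of the two agents' actions
    f1, f2: assumed feature vectors describing each action
    The determinant equals (q1 * q2)**2 * (1 - s**2), where s is the
    cosine similarity, so it grows with relevance and shrinks as the
    two actions become more alike.
    """
    s = cosine(f1, f2)
    l11, l22, l12 = q1 * q1, q2 * q2, q1 * q2 * s
    return l11 * l22 - l12 * l12
```

Two equally relevant but identical actions score zero, while orthogonal actions score the product of their squared relevances, so maximizing the determinant favours joint actions that are both useful and different from one another.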


2021
Author(s):
Nikolaos Al. Papadopoulos
Marti Sanchez-Fibla

Multi-Agent Reinforcement Learning reductionist simulations can provide a spectrum of opportunities towards the modeling and understanding of complex social phenomena such as common-pool appropriation. In this paper, a multiplayer variant of Battle-of-the-Exes is suggested as appropriate for experimentation regarding fair and efficient coordination and turn-taking among selfish agents. Going beyond literature’s fairness and efficiency, a novel measure is proposed for turn-taking coordination evaluation, robust to the number of agents and episodes of a system. Six variants of this measure are defined, entitled Alternation Measures or ALT. ALT measures were found sufficient to capture the desired properties (alternation, fair and efficient distribution) in comparison to state-of-the-art measures, thus they were benchmarked and tested through a series of experiments with Reinforcement Learning agents, aspiring to contribute novel tools for a deeper understanding of emergent social outcomes.


2020
Vol 8 (3)
Author(s):
Korosh Mahmoodi
Bruce J West
Cleotilde Gonzalez

We propose a model for demonstrating spontaneous emergence of collective intelligent behaviour (i.e. adaptation and resilience of a social system) from selfish individual agents. Agents’ behaviour is modelled using our proposed selfish algorithm ($SA$) with three learning mechanisms: reinforced learning ($SAL$), trust ($SAT$) and connection ($SAC$). Each of these mechanisms provides a distinctly different way an agent can increase the individual benefit accrued through playing the prisoner’s dilemma game ($PDG$) with other agents. $SAL$ generates adaptive reciprocity between the agents with a level of mutual cooperation that depends on the temptation of the individuals to cheat. Adding $SAT$ or $SAC$ to $SAL$ improves the adaptive reciprocity between selfish agents, raising the level of mutual cooperation. Importantly, the mechanisms in the $SA$ are self-tuned by the internal dynamics that depend only on the change in the agent’s own payoffs. This is in contrast to any pre-established reciprocity mechanism (e.g. predefined connections among agents) or awareness of the behaviour or payoffs of other agents. Also, we study adaptation and resilience of the social systems utilizing $SA$ by turning some of the agents into zealots to show that agents reconstruct the reciprocity structure in such a way as to prevent the zealots from taking advantage of a cooperative environment. The implications and applications of the $SA$ are discussed.


2010
pp. 1969-1986
Author(s):
Rex Oleson
D.J. Kaup
Thomas L. Clarke
Linda C. Malone
Ladislau Bölöni

The “Social Potential”, which the authors will refer to as the SP, is the name given to a technique of implementing multi-agent movement in simulations by representing behaviors, goals, and motivations as artificial social forces. These forces then determine the movement of the individual agents. Several SP models, including the Flocking, Helbing-Molnar–Farkas-Visek (HMFV), and Lakoba-Kaup-Finkelstein (LKF) models, are commonly used to describe pedestrian movement. A systematic procedure is described here, whereby one can construct and use these and other SP models. The theories behind these models are discussed along with the application of the procedure. Through the use of these techniques, it has been possible to represent schools of fish swimming, flocks of birds flying, crowds exiting rooms, crowds walking through hallways, and individuals wandering in open fields. Once one has an understanding of these models, more complex and specific scenarios could be constructed by applying additional constraints and parameters. The models along with the procedure give a guideline for understanding and implementing simulations using SP techniques.
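
The general idea of movement driven by artificial social forces can be sketched in a few lines. The example below combines a constant-strength attraction towards each agent's goal with an inverse-square repulsion from other agents, then advances positions with one explicit Euler step. The force laws, the constants `k_goal` and `k_rep`, and the function names are assumptions for illustration only; this is not the Flocking, HMFV, or LKF model.

```python
import math


def social_force(pos, goal, others, k_goal=1.0, k_rep=0.5):
    """Net 2-D force on one agent from its goal and from the other agents."""
    fx = fy = 0.0

    # Attraction: a force of magnitude k_goal pointing at the goal.
    gx, gy = goal[0] - pos[0], goal[1] - pos[1]
    dist = math.hypot(gx, gy)
    if dist > 0:
        fx += k_goal * gx / dist
        fy += k_goal * gy / dist

    # Repulsion: magnitude k_rep / d**2, directed away from each neighbour.
    for ox, oy in others:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        if d > 0:
            fx += k_rep * dx / d**3
            fy += k_rep * dy / d**3
    return fx, fy


def step(positions, goals, dt=0.1):
    """Advance all agents one Euler step, treating the force as a velocity."""
    new_positions = []
    for i, pos in enumerate(positions):
        others = positions[:i] + positions[i + 1:]
        fx, fy = social_force(pos, goals[i], others)
        new_positions.append((pos[0] + dt * fx, pos[1] + dt * fy))
    return new_positions
```

Calling `step` repeatedly moves each agent towards its goal while keeping nearby agents apart; richer behaviours would come from adding further force terms and constraints.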


1999
Vol 58 (3)
pp. 201-206
Author(s):  
Claude Flament

This paper is concerned with a possible articulation between the diversity of individual opinions and the existence of consensus in social representations. It postulates the existence of consensual normative boundaries framing the individual opinions. A questionnaire study about the social representations of the development of intelligence gives support to this notion.


2013
Vol 5 (1)
pp. 131-137
Author(s):
Roxanne Christensen
LaSonia Barlow
Demetrius E. Ford

Three personal reflections provided by doctoral students of the Michigan School of Professional Psychology (Farmington Hills, Michigan) address identification of individual perspectives on the tragic events surrounding Trayvon Martin’s death. The historical ramifications of a culture-in-context and the way civil rights, racism, and community traumatization play a role in the social construction of criminals are explored. A justice orientation is applied to both the community and the individual via internal reflection about the unique individual and collective roles social justice plays in the outcome of these events. Finally, the personal and professional responses of a practitioner who is also a mother of minority young men bring to light the need to educate against stereotypes, assist a community to heal, and simultaneously manage the direct effects of such events on youth in society. In all three essays, common themes of community and growth are addressed from varying viewpoints. As worlds collided, a historical division has given rise to a present unity geared toward breaking the cycle of violence and trauma. The authors plead that if there is no other service in the name of this tragedy, let it at least contribute to the actualization of a society toward growth and healing.

