Autocratic strategies for iterated games with arbitrary action spaces

2016 ◽  
Vol 113 (13) ◽  
pp. 3573-3578 ◽  
Author(s):  
Alex McAvoy ◽  
Christoph Hauert

The recent discovery of zero-determinant strategies for the iterated prisoner’s dilemma sparked a surge of interest in the surprising fact that a player can exert unilateral control over iterated interactions. These remarkable strategies, however, are known to exist only in games in which players choose between two alternative actions such as “cooperate” and “defect.” Here we introduce a broader class of autocratic strategies by extending zero-determinant strategies to iterated games with more general action spaces. We use the continuous donation game as an example, which represents an instance of the prisoner’s dilemma that intuitively extends to a continuous range of cooperation levels. Surprisingly, despite the fact that the opponent has infinitely many donation levels from which to choose, a player can devise an autocratic strategy to enforce a linear relationship between his or her payoff and that of the opponent even when restricting his or her actions to merely two discrete levels of cooperation. In particular, a player can use such a strategy to extort an unfair share of the payoffs from the opponent. Therefore, although the action space of the continuous donation game dwarfs that of the classic prisoner’s dilemma, players can still devise relatively simple autocratic and, in particular, extortionate strategies.
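The linear payoff relationship mentioned in this abstract can be checked numerically in the classic two-action setting. The sketch below is not the construction from the paper for continuous action spaces; it uses the well-known extortionate zero-determinant strategy of Press and Dyson for the ordinary iterated prisoner's dilemma. The payoff values (T, R, P, S) = (5, 3, 1, 0), the extortion factor chi = 3, the scale phi = 1/26, and all function names are illustrative assumptions, not taken from the abstract.

```python
# Payoffs in the classic prisoner's dilemma (conventional illustrative values).
T, R, P, S = 5.0, 3.0, 1.0, 0.0

# States are ordered from X's point of view: CC, CD, DC, DD (X's move first).
PAYOFF_X = (R, S, T, P)
PAYOFF_Y = (R, T, S, P)
# Index of the same state as seen by Y (Y's move first): CD and DC swap.
Y_VIEW = (0, 2, 1, 3)


def extortionate_strategy(chi, phi):
    """Memory-one strategy for X that enforces s_X - P = chi * (s_Y - P)."""
    p = []
    for s in range(4):
        p_tilde = phi * ((PAYOFF_X[s] - P) - chi * (PAYOFF_Y[s] - P))
        # X cooperated in the previous round in states CC and CD, so add 1 there.
        p.append(p_tilde + (1.0 if s < 2 else 0.0))
    if not all(0.0 <= x <= 1.0 for x in p):
        raise ValueError("phi is too large for these payoffs and chi")
    return tuple(p)


def long_run_payoffs(p, q, steps=20000):
    """Iterate the 4-state Markov chain and return approximate stationary payoffs.

    p and q are cooperation probabilities after CC, CD, DC, DD, each indexed
    from the respective player's own point of view.
    """
    dist = [0.25, 0.25, 0.25, 0.25]
    for _ in range(steps):
        new = [0.0, 0.0, 0.0, 0.0]
        for s, weight in enumerate(dist):
            px = p[s]          # probability that X cooperates next
            py = q[Y_VIEW[s]]  # probability that Y cooperates next
            new[0] += weight * px * py
            new[1] += weight * px * (1.0 - py)
            new[2] += weight * (1.0 - px) * py
            new[3] += weight * (1.0 - px) * (1.0 - py)
        dist = new
    s_x = sum(w * u for w, u in zip(dist, PAYOFF_X))
    s_y = sum(w * u for w, u in zip(dist, PAYOFF_Y))
    return s_x, s_y


if __name__ == "__main__":
    chi = 3.0
    p = extortionate_strategy(chi=chi, phi=1.0 / 26.0)
    # Three arbitrary opponent strategies (hypothetical values).
    for q in [(0.7, 0.2, 0.9, 0.4), (0.5, 0.5, 0.5, 0.5), (0.9, 0.1, 0.6, 0.3)]:
        s_x, s_y = long_run_payoffs(p, q)
        residual = (s_x - P) - chi * (s_y - P)
        print(q, round(s_x, 4), round(s_y, 4), "residual:", round(residual, 9))
```

Whatever memory-one strategy the opponent picks, the printed residual should be zero up to numerical error, which is the sense in which the focal player exerts unilateral control. Plain power iteration is used instead of a linear-algebra library to keep the sketch dependency-free; it assumes the opponent's probabilities are strictly between 0 and 1 so that the chain has a unique stationary distribution.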

2019 ◽  
Vol 56 (3) ◽  
pp. 810-829
Author(s):  
János Flesch ◽  
Dries Vermeulen ◽  
Anna Zseleva

Abstract. We consider decision problems with arbitrary action spaces, deterministic transitions, and infinite time horizon. In the usual setup when probability measures are countably additive, a general version of Kuhn’s theorem implies under fairly general conditions that for every mixed strategy of the decision maker there exists an equivalent behavior strategy. We examine to what extent this remains valid when probability measures are only assumed to be finitely additive. Under the classical approach of Dubins and Savage (2014), we prove the following statements: (1) If the action space is finite, every mixed strategy has an equivalent behavior strategy. (2) Even if the action space is infinite, at least one optimal mixed strategy has an equivalent behavior strategy. The approach by Dubins and Savage turns out to be essentially maximal: these two statements are no longer valid if we take any extension of their approach that considers all singleton plays.


2021 ◽  
Author(s):  
Zhaoyang Cheng ◽  
Guanpu Chen ◽  
Yiguang Hong

Abstract. Zero-determinant (ZD) strategies have attracted wide attention in Iterated Prisoner’s Dilemma (IPD) games, since a player equipped with a ZD strategy can unilaterally enforce a linear relation between the two players’ expected utilities. On the other hand, uncertainties, which may be caused by misperception, inevitably occur in IPD in practical circumstances. To better understand the situation, we consider the influence of misperception on ZD strategies in IPD, where the two players, player X and player Y, have different cognitions, but player X detects the misperception and is believed by player Y to adopt ZD strategies. We provide a necessary and sufficient condition for the ZD strategies in IPD with misperception, under which there is also a linear relationship between the players’ utilities in player X’s cognition. We then explore bounds on the deviation of the players’ expected utilities from a linear relationship in player X’s cognition while player X also improves its own utility.


Author(s):  
Zhou Fan ◽  
Rui Su ◽  
Weinan Zhang ◽  
Yong Yu

In this paper we propose a hybrid architecture of actor-critic algorithms for reinforcement learning in parameterized action space, which consists of multiple parallel sub-actor networks to decompose the structured action space into simpler action spaces along with a critic network to guide the training of all sub-actor networks. While this paper is mainly focused on parameterized action space, the proposed architecture, which we call hybrid actor-critic, can be extended to more general action spaces that have a hierarchical structure. We present an instance of the hybrid actor-critic architecture based on proximal policy optimization (PPO), which we refer to as hybrid proximal policy optimization (H-PPO). Our experiments test H-PPO on a collection of tasks with parameterized action space, where H-PPO demonstrates superior performance over previous methods of parameterized action reinforcement learning.


1999 ◽  
Vol 30 (2/3) ◽  
pp. 179-193 ◽  
Author(s):  
Beate Schuster

Summary: The sociometric status and the victimization status of 5th to 11th graders were assessed, the status of hypothetical interaction partners and their alleged choices were varied, and the responses in the prisoner's dilemma were recorded. The responses were determined both by the experimentally specified choices and by the expected choices of the interaction partners: cooperative moves tended to be answered cooperatively, and competitive moves competitively. In addition, victims of bullying avoided competitive moves, whereas two subgroups of the rejected showed opposite strategy preferences: participants who experience both rejection and bullying ("victimized-rejected") behaved especially cooperatively, whereas rejected participants who are not victimized ("nonvictimized-rejected") behaved comparatively competitively. The cooperative choices of victimized participants were not reciprocated: participants responded more competitively toward the victimized than the victimized themselves behaved toward their interaction partners. These findings confirm the need to distinguish two subgroups among the "rejected" on the basis of the victimization dimension. The findings are further discussed against the background of the hypothesis that the submissiveness of potential victims contributes to their experience of victimization.


Author(s):  
Laura Mieth ◽  
Raoul Bell ◽  
Axel Buchner

Abstract. The present study serves to test how positive and negative appearance-based expectations affect cooperation and punishment. Participants played a prisoner’s dilemma game with partners who either cooperated or defected. Then they were given a costly punishment option: They could spend money to decrease the payoffs of their partners. Aggregated over trials, participants spent more money for punishing the defection of likable-looking and smiling partners compared to punishing the defection of unlikable-looking and nonsmiling partners, but only because participants were more likely to cooperate with likable-looking and smiling partners, which provided the participants with more opportunities for moralistic punishment. When expressed as a conditional probability, moralistic punishment did not differ as a function of the partners’ facial likability. Smiling had no effect on the probability of moralistic punishment, but punishment was milder for smiling in comparison to nonsmiling partners.


2020 ◽  
Author(s):  
M Testori ◽  
M Kempf ◽  
RB Hoyle ◽  
Hedwig Eisenbarth

© 2019 Hogrefe Publishing. Personality traits have long been recognized to have a strong impact on human decision-making. In this study, a sample of 314 participants took part in an online game to investigate the impact of psychopathic traits on cooperative behavior in an iterated Prisoner's dilemma game. We found that disinhibition decreased the maintenance of cooperation in successive plays, but had no effect on moving toward cooperation after a previous defection or on the overall level of cooperation over rounds. Furthermore, our results underline the crucial importance of a good model selection procedure, showing how a poor choice of statistical model can provide misleading results.

