Planning Algorithms for Zero-Sum Games with Exponential Action Spaces: A Unifying Perspective

Author(s):  
Levi H. S. Lelis

In this paper we review several planning algorithms developed for zero-sum games with exponential action spaces, i.e., spaces that grow exponentially with the number of game components that can act simultaneously at a given game state. For example, real-time strategy games have exponential action spaces because the number of available actions grows exponentially with the number of units the player controls. We also present a unifying perspective in which several existing algorithms can be described as instantiations of a variant of NaiveMCTS. In addition to describing several existing planning algorithms for exponential action spaces, we show that other instantiations of this variant of NaiveMCTS represent novel and promising algorithms to be studied in future work.
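
As context for the review (and not taken from the paper itself), the naive-sampling idea behind NaiveMCTS can be sketched in a few lines: instead of treating every joint action as a separate bandit arm, each unit keeps its own local bandit, so the statistics maintained grow linearly rather than exponentially with the number of units. The class below is a hypothetical illustration; the ε-greedy scheme and the names are assumptions, not the paper's implementation.

```python
import random
from collections import defaultdict

class NaiveSampler:
    """Illustrative sketch of naive sampling over a factored action space.

    unit_actions is a list with one list of legal actions per unit.
    Each unit's choice is treated as an independent local bandit, and a
    joint action's reward updates every local arm that contributed to it.
    """

    def __init__(self, unit_actions, epsilon=0.3):
        self.unit_actions = unit_actions
        self.epsilon = epsilon
        self.totals = [defaultdict(float) for _ in unit_actions]
        self.counts = [defaultdict(int) for _ in unit_actions]

    def sample_joint_action(self):
        joint = []
        for i, actions in enumerate(self.unit_actions):
            if random.random() < self.epsilon or not self.counts[i]:
                joint.append(random.choice(actions))  # explore this unit's actions
            else:
                # exploit: pick the action with the best average reward so far
                joint.append(max(actions,
                                 key=lambda a: self.totals[i][a] / max(self.counts[i][a], 1)))
        return tuple(joint)

    def update(self, joint_action, reward):
        for i, a in enumerate(joint_action):
            self.totals[i][a] += reward
            self.counts[i][a] += 1
```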

Author(s):  
João P. Hespanha

This chapter explores the concept of mixed policies and how the notions developed for pure policies can be adapted to this more general type of policy. A pure policy consists of choices of particular actions (perhaps based on some observation), whereas a mixed policy involves choosing a probability distribution from which to select actions (perhaps as a function of observations). The idea behind mixed policies is that the players select their actions randomly according to a previously selected probability distribution. The chapter first considers the rock-paper-scissors game as an example of a mixed policy before discussing mixed action spaces, mixed security policies and saddle-point equilibria, mixed saddle-point equilibria versus average security levels, and general zero-sum games. It concludes with practice exercises with corresponding solutions and an additional exercise.
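
As a small worked example in the spirit of the chapter (the code itself is not from the book), the sketch below computes the security level of a mixed policy in rock-paper-scissors under the convention that player P1 minimizes the outcome; the uniform mixed policy achieves the mixed saddle-point value of 0.

```python
# Outcome matrix from P1's (minimizing) perspective:
# rows = P1's action, columns = P2's action, order: rock, paper, scissors.
# A win for P1 is -1, a loss is +1, a draw is 0.
A = [[ 0,  1, -1],
     [-1,  0,  1],
     [ 1, -1,  0]]

def mixed_security_level(y):
    """Worst-case expected outcome when P1 plays the mixed policy y."""
    expected = [sum(y[i] * A[i][j] for i in range(3)) for j in range(3)]
    return max(expected)  # P2 responds with the column worst for P1

uniform = [1/3, 1/3, 1/3]
print(mixed_security_level(uniform))  # 0.0, the mixed saddle-point value
```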


Author(s):  
Anderson Rocha Tavares ◽  
Sivasubramanian Anbalagan ◽  
Leandro Soriano Marcolino ◽  
Luiz Chaimowicz

Large state and action spaces pose a significant challenge for reinforcement learning. However, in many domains there is a set of algorithms available, each of which estimates the best action given a state. Hence, agents can either learn a performance-maximizing mapping directly from states to actions, or from states to algorithms. We investigate several aspects of this dilemma, showing sufficient conditions for learning over algorithms to outperform learning over actions within a finite number of training iterations. We present synthetic experiments to further study such systems. Finally, we propose a function approximation approach, demonstrating the effectiveness of learning over algorithms in real-time strategy games.
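
A minimal illustration of the states-to-algorithms alternative (a tabular simplification, not the authors' function-approximation approach; the env interface with reset() and step() is an assumption for the sketch):

```python
import random

def learn_over_algorithms(env, algorithms, episodes=1000,
                          alpha=0.1, gamma=0.95, epsilon=0.1):
    """ε-greedy Q-learning whose decision space is the set of algorithms.

    Each algorithm maps a state to a primitive action, so the learner
    chooses among len(algorithms) options instead of the full (possibly
    exponential) primitive action space.
    """
    Q = {}  # (state, algorithm index) -> value estimate
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            if random.random() < epsilon:
                k = random.randrange(len(algorithms))
            else:
                k = max(range(len(algorithms)),
                        key=lambda i: Q.get((state, i), 0.0))
            action = algorithms[k](state)  # delegate to the chosen algorithm
            next_state, reward, done = env.step(action)
            best_next = max(Q.get((next_state, i), 0.0)
                            for i in range(len(algorithms)))
            q = Q.get((state, k), 0.0)
            Q[(state, k)] = q + alpha * (reward + gamma * best_next * (not done) - q)
            state = next_state
    return Q
```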


Author(s):  
João P. Hespanha

This chapter defines a number of key concepts for non-zero-sum games involving two players. It begins by considering a two-player game G in which two players P₁ and P₂ are allowed to select policies within action spaces Γ₁ and Γ₂, respectively. Each player wants to minimize their own outcome, and does not care about the outcome of the other player. The chapter proceeds by discussing the security policy and Nash equilibrium for two-player non-zero-sum games, bimatrix games, admissible Nash equilibrium, and mixed policy. It also explores the order interchangeability property for Nash equilibria in best-response equivalent games before concluding with practice exercises and their corresponding solutions, along with additional exercises.
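
To make the bimatrix setting concrete (a hypothetical 2×2 example, not taken from the chapter), the sketch below brute-forces the pure Nash equilibria of a bimatrix game in which, following the chapter's convention, both players minimize their own outcome; mixed policies, which the chapter also covers, are outside this sketch.

```python
def pure_nash_equilibria(A, B):
    """Enumerate pure Nash equilibria of a bimatrix game.

    A[i][j] and B[i][j] are the outcomes of P1 and P2 (both minimizing)
    when P1 plays row i and P2 plays column j. A pair (i, j) is an
    equilibrium when neither player can lower their own outcome by
    deviating unilaterally.
    """
    n, m = len(A), len(A[0])
    return [(i, j)
            for i in range(n) for j in range(m)
            if A[i][j] <= min(A[k][j] for k in range(n))    # P1 cannot improve
            and B[i][j] <= min(B[i][l] for l in range(m))]  # P2 cannot improve

# Hypothetical 2x2 game with two pure equilibria, (0, 0) and (1, 1).
A = [[1, 3], [4, 2]]
B = [[1, 4], [3, 2]]
print(pure_nash_equilibria(A, B))  # [(0, 0), (1, 1)]
```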


1976 ◽  
Vol 39 (1) ◽  
pp. 55-61
Author(s):  
Shaul Fox

According to Messick and McClintock (1968), differences in choice behavior in non-zero-sum strategy games may be explained mainly by three motives: the individualistic, the competitive, and the cooperative. The researchers' operational definitions of these motives are based on the payoffs in the game matrices. This article critically examines Messick and McClintock's exposition and demonstrates that payoff considerations cannot be the sole criterion for identifying motivational goals: disregarding the opponent's choice may lead to mistaken conclusions about the participant's motive as inferred from his decision. To remedy this oversight, the article's proposal for measuring the three motives is based on the following principles: (1) a pre-programmed plan for one participant in the game, in order to standardize the situation the subjects face; (2) a large number of trials, in order to ensure the subject's awareness of the opponent's fixed strategy; (3) the combination of (1) and (2) with appropriate payoff values, which enables the construction of the conflict situation confronting the subject.


2020 ◽  
pp. 1087724X2098158
Author(s):  
Camilo Benitez-Avila ◽  
Andreas Hartmann ◽  
Geert Dewulf

The process management literature is skeptical about creating legitimacy and a sense of partnership when implementing concessional Public-Private Partnerships. Within such organizational arrangements, managerial interaction often resembles a zero-sum game. To explore the possibility of (re)creating a sense of partnership in concessional PPPs, we developed the "3P challenge" serious game. Two gaming sessions, with a mixed group of practitioners and a team of public project managers, showed that the game cycle recreates adversarial situations in which players can enact contractual obligations with higher or lower levels of subjectivity. Reflecting on the gaming experience, practitioners point out that PPP contracts can be creatively enacted by managers who act as brokers of diverse interests. As they become aware of each other's stakes, they can blend contractual dispositions or place brackets around some contractual clauses to reach agreement. By doing so, they can (re)create a sense of partnership, clarity, and fairness in the PPP contract.

