Stochastic Policies for Games in Extensive Form

Author(s):  
João P. Hespanha

This chapter discusses two types of stochastic policy for extensive form game representation as well as the existence and computation of saddle-point equilibrium. For games in extensive form, a mixed policy corresponds to selecting a pure policy at random, based on a previously selected probability distribution, before the game starts, and then playing that policy throughout the game. It is assumed that the random selections by both players are statistically independent and that the players try to optimize the expected outcome of the game. After providing an overview of mixed policies and saddle-point equilibria, the chapter considers the behavioral policy for games in extensive form. It also explores behavioral saddle-point equilibrium, behavioral vs. mixed policy, recursive computation of equilibria for feedback games, mixed vs. behavioral order interchangeability, and non-feedback games. It concludes with practice exercises and their corresponding solutions, along with additional exercises.
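As a rough illustration of the difference between the two types of stochastic policy, the following sketch counts the free parameters of each for one player. The information sets and action labels are hypothetical and are not taken from the chapter; a mixed policy is treated as one distribution over all pure policies, and a behavioral policy as one independent distribution per information set.

```python
from itertools import product

# Hypothetical player with two information sets and the actions available at each.
info_sets = {"alpha": ["L", "R"], "beta": ["T", "M", "B"]}

# A pure policy assigns exactly one action to every information set.
names = list(info_sets)
pure_policies = [
    dict(zip(names, choice))
    for choice in product(*(info_sets[name] for name in names))
]

# Mixed policy: one probability distribution over all pure policies,
# drawn once before the game starts (probabilities sum to 1).
mixed_dim = len(pure_policies) - 1

# Behavioral policy: one independent distribution per information set.
behavioral_dim = sum(len(actions) - 1 for actions in info_sets.values())
```

With these assumed information sets there are 6 pure policies, so a mixed policy has 5 free parameters while a behavioral policy has only 3.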

Author(s):  
João P. Hespanha

This chapter discusses a number of key concepts for extensive form game representation. It first considers a matrix that defines a zero-sum matrix game for which the minimizer has two actions and the maximizer has three actions and shows that the matrix description, by itself, does not capture the information structure of the game and, in fact, other information structures are possible. It then describes an extensive form representation of a zero-sum two-person game, which is a decision tree, the extensive form representation of multi-stage games, and the notions of security policy, security level, and saddle-point equilibrium for a game in extensive form. It also explores the matrix form for games in extensive form, recursive computation of equilibria for single-stage games, feedback games, feedback saddle-point for multi-stage games, and recursive computation of equilibria for multi-stage games. It concludes with a practice exercise with the corresponding solution, along with additional exercises.
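The notions of security policy and security level can be sketched for a matrix in which the minimizer has two actions and the maximizer has three. The entries below are hypothetical (the abstract does not give the chapter's matrix), and the convention that the minimizer selects the row is an assumption.

```python
# Hypothetical 2x3 outcome matrix: the minimizer picks a row,
# the maximizer picks a column, and the entry is the game outcome.
A = [
    [1, 3, 0],
    [6, 2, 7],
]

# Minimizer: guard against the worst case (largest entry) in each row,
# then pick the row whose worst case is smallest.
row_worst = [max(row) for row in A]
min_security_level = min(row_worst)
min_security_policy = row_worst.index(min_security_level)

# Maximizer: guard against the worst case (smallest entry) in each column,
# then pick the column whose worst case is largest.
col_worst = [min(A[i][j] for i in range(len(A))) for j in range(len(A[0]))]
max_security_level = max(col_worst)
max_security_policy = col_worst.index(max_security_level)

# A pure saddle-point equilibrium exists exactly when the two levels coincide.
has_pure_saddle_point = min_security_level == max_security_level
```

For this assumed matrix the minimizer's security level is 3 and the maximizer's is 2, so the levels differ and no saddle point exists in pure policies.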


2021 ◽  
Author(s):  
Philipp E. Otto

The Monty Hall game is one of the most discussed decision problems, yet a convincing behavioral explanation of the systematic deviations from probability theory is still lacking. That most people do not change their initial choice, even though switching is beneficial under information updating, demands further explanation. Trust and the incentive to prolong the game in an interesting way for the audience can explain this kind of behavior, but the strategic setting can also be modeled in a more sophisticated way. When the aim is to increase the odds of winning while Monty’s incentives are unknown, not switching doors can be considered the most secure strategy, and it avoids a sure loss when Monty’s guiding aim is not to give away the prize. Understanding and modeling the Monty Hall game can be regarded as an ideal teaching example for fundamental statistical understanding.
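The contrast between the two readings of the game can be checked by exact enumeration. This is a minimal sketch under two assumed models of Monty, neither taken from the paper itself: a standard Monty who always opens a goat door and offers the switch, and an adversarial Monty who offers the switch only when the initial pick is correct (otherwise the contestant keeps the wrong door and loses).

```python
from fractions import Fraction

DOORS = (0, 1, 2)

def win_probability(switch, offer_only_when_right=False):
    """Exact probability of winning with three doors, uniform prize and pick."""
    total = Fraction(0)
    for prize in DOORS:
        for pick in DOORS:
            weight = Fraction(1, 9)  # prize and pick are independent and uniform
            if offer_only_when_right and pick != prize:
                # Assumed adversarial Monty: no offer is made, the pick stays wrong.
                continue
            if switch:
                # Monty has opened a goat door, so switching wins iff the pick was wrong.
                wins = pick != prize
            else:
                wins = pick == prize
            if wins:
                total += weight
    return total

stay_standard = win_probability(switch=False)
switch_standard = win_probability(switch=True)
stay_adversarial = win_probability(switch=False, offer_only_when_right=True)
switch_adversarial = win_probability(switch=True, offer_only_when_right=True)
```

Under the standard model switching wins with probability 2/3 against 1/3 for staying; under the assumed adversarial model staying still wins with probability 1/3 while switching never wins, which is the sense in which not switching is the secure choice.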


Author(s):  
João P. Hespanha

This chapter explores the concept of mixed policies and how the notions for pure policies can be adapted to this more general type of policies. A pure policy consists of choices of particular actions (perhaps based on some observation), whereas a mixed policy involves choosing a probability distribution to select actions (perhaps as a function of observations). The idea behind mixed policies is that the players select their actions randomly according to a previously selected probability distribution. The chapter first considers the rock-paper-scissors game as an example of mixed policy before discussing mixed action spaces, mixed security policy and saddle-point equilibrium, mixed saddle-point equilibrium vs. average security levels, and general zero-sum games. It concludes with practice exercises with corresponding solutions and an additional exercise.
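For the rock-paper-scissors example, the expected outcome under mixed policies can be sketched as follows. The payoff convention (+1 when the row player wins, -1 when it loses, 0 for a draw) and all names are assumptions made for illustration, not the chapter's notation.

```python
from fractions import Fraction

# Assumed payoff matrix from the row player's viewpoint.
# Rows and columns are ordered rock, paper, scissors.
A = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

def expected_outcome(y, A, z):
    """Expected payoff sum_i sum_j y[i] * A[i][j] * z[j] for independent random choices."""
    return sum(
        y[i] * A[i][j] * z[j]
        for i in range(len(y))
        for j in range(len(z))
    )

third = Fraction(1, 3)
uniform = [third, third, third]

# The three pure policies, written as degenerate probability distributions.
pure = [
    [Fraction(1), Fraction(0), Fraction(0)],
    [Fraction(0), Fraction(1), Fraction(0)],
    [Fraction(0), Fraction(0), Fraction(1)],
]

# Outcome of the uniform mixed policy against each pure policy of the opponent.
uniform_vs_pure = [expected_outcome(uniform, A, z) for z in pure]
uniform_vs_uniform = expected_outcome(uniform, A, uniform)
```

Against the uniform mixed policy every opposing action yields an expected outcome of 0, so no deviation by the opponent improves on it.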


Entropy ◽  
2019 ◽  
Vol 21 (10) ◽  
pp. 945
Author(s):  
Karim Banawan ◽  
Sennur Ulukus

We investigate the secure degrees of freedom (s.d.o.f.) of three new channel models: broadcast channel with combating helpers, interference channel with selfish users, and multiple access wiretap channel with deviating users. The goal of introducing these channel models is to investigate various malicious interactions that arise in networks, including active adversaries. That is in contrast with the common assumption in the literature that the users follow a certain protocol altruistically and transmit both message-carrying and cooperative jamming signals in an optimum manner. In the first model, over a classical broadcast channel with confidential messages (BCCM), there are two helpers, each associated with one of the receivers. In the second model, over a classical interference channel with confidential messages (ICCM), there is a helper and users are selfish. By casting each problem as an extensive-form game and applying recursive real interference alignment, we show that, for the first model, the combating intentions of the helpers are neutralized and the full s.d.o.f. is retained; for the second model, selfishness precludes secure communication and no s.d.o.f. is achieved. In the third model, we consider the multiple access wiretap channel (MAC-WTC), where multiple legitimate users wish to have secure communication with a legitimate receiver in the presence of an eavesdropper. We consider the case when a subset of users deviate from the optimum protocol that attains the exact s.d.o.f. of this channel. We consider two kinds of deviation: when some of the users stop transmitting cooperative jamming signals, and when a user starts sending intentional jamming signals. For the first scenario, we investigate possible responses of the remaining users to counteract such deviation. For the second scenario, we use an extensive-form game formulation for the interactions of the deviating and well-behaving users. We prove that a deviating user can drive the s.d.o.f. to zero; however, the remaining users can exploit its intentional jamming signals as cooperative jamming signals against the eavesdropper and achieve an optimum s.d.o.f.

