State-Feedback Zero-Sum Differential Games

Noncooperative Game Theory ◽

10.23943/princeton/9780691175218.003.0018 ◽

2017 ◽

Author(s):

João P. Hespanha

Keyword(s):

Dynamic Programming ◽

Saddle Point ◽

Continuous Time ◽

Information Structure ◽

Dynamic Games ◽

State Feedback ◽

Feedback Information ◽

Time Dynamic ◽

Zero Sum ◽

Saddle Point Equilibrium

This chapter focuses on the computation of saddle-point equilibria in state-feedback policies for zero-sum continuous-time dynamic games. It begins by considering two-player zero-sum dynamic games in continuous time, assuming a finite-horizon integral cost that Player 1 wants to minimize and Player 2 wants to maximize, under a state-feedback information structure. It shows how continuous-time dynamic programming can be used to construct saddle-point equilibria in state-feedback policies. The discussion then turns to continuous-time linear quadratic dynamic games and to the use of dynamic programming to construct a saddle-point equilibrium in state-feedback policies for a two-player zero-sum differential game with variable termination time. The chapter also describes pursuit-evasion games before concluding with a practice exercise and the corresponding solution.
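For orientation, the continuous-time dynamic programming condition mentioned in this abstract is commonly written as the Hamilton-Jacobi-Bellman-Isaacs equation. The notation below is assumed for illustration and is not taken from the abstract: dynamics $\dot x = f(x,u,d)$, running cost $g$, terminal cost $q$, horizon $T$, with Player 1 choosing $u$ and Player 2 choosing $d$.

```latex
% Assumed setup (illustrative notation, not from the abstract):
%   \dot x = f(x,u,d), \quad J = \int_0^T g(x,u,d)\,dt + q(x(T)),
%   Player 1 picks u to minimize J, Player 2 picks d to maximize J.
\begin{align*}
-\frac{\partial V}{\partial t}(t,x)
  &= \min_{u}\,\max_{d}\Big[\, g(x,u,d) + \frac{\partial V}{\partial x}(t,x)\, f(x,u,d) \Big] \\
  &= \max_{d}\,\min_{u}\Big[\, g(x,u,d) + \frac{\partial V}{\partial x}(t,x)\, f(x,u,d) \Big],
  \qquad V(T,x) = q(x).
\end{align*}
```

When a sufficiently smooth $V$ satisfies both equalities, the pair of actions attaining the min and the max at each $(t,x)$ is the usual candidate for a state-feedback saddle-point equilibrium.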

State-Feedback Zero-Sum Dynamic Games

Noncooperative Game Theory ◽

10.23943/princeton/9780691175218.003.0017 ◽

2017 ◽

Author(s):

João P. Hespanha

Keyword(s):

Discrete Time ◽

Information Structure ◽

Dynamic Games ◽

State Feedback ◽

Linear Quadratic ◽

Feedback Information ◽

Time Dynamic ◽

Finite State ◽

Solution Methods ◽

Zero Sum

This chapter focuses on the computation of saddle-point equilibria in state-feedback policies for zero-sum discrete-time dynamic games. It begins by considering solution methods for two-player zero-sum dynamic games in discrete time, assuming a finite-horizon stage-additive cost that Player 1 wants to minimize and Player 2 wants to maximize, under a state-feedback information structure. The discussion then turns to discrete-time dynamic programming, the use of MATLAB to solve zero-sum games with finite state spaces and finite action spaces, and discrete-time linear quadratic dynamic games. The chapter concludes with a practice exercise that requires computing the cost-to-go for each state of the tic-tac-toe game, along with the corresponding solution.
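The backward recursion behind discrete-time dynamic programming for a finite zero-sum game can be sketched as follows. This is a minimal Python illustration rather than the chapter's MATLAB code; the function name, the toy dynamics, and the zero terminal cost are assumptions. It handles only the case where a saddle point exists in pure actions at every stage and state, and raises an error otherwise.

```python
def zero_sum_cost_to_go(states, U, D, step, stage_cost, horizon):
    """Backward recursion for a finite zero-sum game (hypothetical helper).

    Player 1 picks u in U to minimize, Player 2 picks d in D to maximize.
    Assumes a zero terminal cost and a pure saddle point at every state.
    """
    V = {x: 0 for x in states}  # cost-to-go after the last stage
    for k in reversed(range(horizon)):
        V_new = {}
        for x in states:
            def q(u, d):
                # stage cost plus cost-to-go from the next state
                return stage_cost(k, x, u, d) + V[step(k, x, u, d)]
            minmax = min(max(q(u, d) for d in D) for u in U)
            maxmin = max(min(q(u, d) for u in U) for d in D)
            if minmax != maxmin:
                raise ValueError(f"no pure saddle point at stage {k}, state {x}")
            V_new[x] = minmax
        V = V_new
    return V


# Toy game (assumed for illustration): the state lives on {0,...,4};
# Player 1 pushes it down, Player 2 pushes it up, and each stage costs x.
states = range(5)
moves = (-1, 0, 1)
clamp = lambda x: min(max(x, 0), 4)
V0 = zero_sum_cost_to_go(
    states, moves, moves,
    step=lambda k, x, u, d: clamp(x + u + d),
    stage_cost=lambda k, x, u, d: x,
    horizon=3,
)
```

In this toy game the two players cancel each other out, so the state stays put and the cost-to-go from state x over three stages is 3x. Games without pure saddle points would need a mixed-policy matrix game solved at each state instead of the min-max comparison.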

One-Player Differential Games

Noncooperative Game Theory ◽

10.23943/princeton/9780691175218.003.0016 ◽

2017 ◽

Author(s):

João P. Hespanha

Keyword(s):

Dynamical System ◽

Differential Games ◽

Continuous Time ◽

Information Structure ◽

Dynamic Games ◽

State Feedback ◽

Open Loop ◽

Linear Quadratic ◽

Time Dynamic ◽

Termination Time

This chapter focuses on one-player continuous-time dynamic games, that is, the optimal control of a continuous-time dynamical system. It begins by considering a one-player continuous-time differential game in which the (only) player wants to minimize a cost using either an open-loop policy or a state-feedback policy. It then discusses the continuous-time cost-to-go, with the following conclusion: regardless of the information structure considered (open loop, state feedback, or other), it is not possible to obtain a cost lower than the cost-to-go. It also explores continuous-time dynamic programming, linear quadratic dynamic games, and differential games with variable termination time before concluding with a practice exercise and the corresponding solution.

One-Player Dynamic Games

Noncooperative Game Theory ◽

10.23943/princeton/9780691175218.003.0015 ◽

2017 ◽

Author(s):

João P. Hespanha

Keyword(s):

Dynamic Programming ◽

Discrete Time ◽

Information Structure ◽

Dynamic Games ◽

Open Loop ◽

Linear Quadratic ◽

Time Dynamic ◽

Discrete Time Dynamical System ◽

Solution Methods ◽

The Cost

This chapter focuses on one-player discrete-time dynamic games, that is, the optimal control of a discrete-time dynamical system. It first considers solution methods for one-player dynamic games, which are simple optimizations, before discussing the discrete-time cost-to-go. It shows that, regardless of the information structure (open loop, state feedback, or other), it is not possible to obtain a cost lower than the cost-to-go. Dynamic programming is a computationally efficient recursive technique for computing the cost-to-go. After providing an overview of discrete-time dynamic programming, the chapter explores the complexity of computing the cost-to-go at all stages, the use of MATLAB to solve finite one-player games, and linear quadratic dynamic games. It concludes with a practice exercise and the corresponding solution, along with an additional exercise.
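The recursive computation of the cost-to-go described in this abstract can be sketched in a few lines. The Python below is an illustrative sketch, not the chapter's MATLAB code; the function name, the toy system, and the zero terminal cost are assumptions.

```python
def cost_to_go(states, actions, step, stage_cost, horizon):
    """Backward dynamic programming for a finite one-player game.

    Returns the cost-to-go at stage 0 and a state-feedback policy per stage.
    Assumes a zero terminal cost (hypothetical choice for this sketch).
    """
    V = {x: 0 for x in states}  # cost-to-go after the last stage
    policy = []
    for k in reversed(range(horizon)):
        V_new, mu = {}, {}
        for x in states:
            # pick the action minimizing stage cost plus future cost-to-go
            best = min(actions, key=lambda u: stage_cost(k, x, u) + V[step(k, x, u)])
            mu[x] = best
            V_new[x] = stage_cost(k, x, best) + V[step(k, x, best)]
        V = V_new
        policy.insert(0, mu)  # keep policies ordered by stage
    return V, policy


# Toy system (assumed for illustration): the state lives on {0,...,4},
# the action shifts it by -1, 0, or +1, and each stage costs x.
states = range(5)
actions = (-1, 0, 1)
V0, policy = cost_to_go(
    states, actions,
    step=lambda k, x, u: min(max(x + u, 0), 4),
    stage_cost=lambda k, x, u: x,
    horizon=3,
)
```

Starting from state 4, the minimizing policy steps down at every stage and pays 4 + 3 + 2 = 9. Each stage costs on the order of (number of states) × (number of actions) evaluations, which is far cheaper than enumerating all open-loop action sequences.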

Robust Adaptive Dynamic Programming of Two-Player Zero-Sum Games for Continuous-Time Linear Systems

IEEE Transactions on Neural Networks and Learning Systems ◽

10.1109/tnnls.2015.2461452 ◽

2015 ◽

Vol 26 (12) ◽

pp. 3314-3319 ◽

Cited By ~ 22

Author(s):

Yue Fu ◽

Jun Fu ◽

Tianyou Chai

Keyword(s):

Dynamic Programming ◽

Linear Systems ◽

Continuous Time ◽

Adaptive Dynamic Programming ◽

Adaptive Dynamic ◽

Zero Sum Games ◽

Robust Adaptive ◽

Zero Sum ◽

Time Linear

Mixed Policies

Noncooperative Game Theory ◽

10.23943/princeton/9780691175218.003.0004 ◽

2017 ◽

Author(s):

João P. Hespanha

Keyword(s):

Probability Distribution ◽

Saddle Point ◽

Security Policy ◽

General Type ◽

Zero Sum Games ◽

Security Levels ◽

Mixed Action ◽

Zero Sum ◽

Saddle Point Equilibrium ◽

Action Spaces

This chapter explores the concept of mixed policies and how the notions developed for pure policies can be adapted to this more general type of policy. A pure policy consists of choices of particular actions (perhaps based on some observation), whereas a mixed policy involves choosing a probability distribution from which actions are selected (perhaps as a function of observations). The idea behind mixed policies is that the players select their actions randomly according to a previously selected probability distribution. The chapter first considers the rock-paper-scissors game as an example of a mixed policy before discussing mixed action spaces, mixed security policies and saddle-point equilibria, mixed saddle-point equilibria vs. average security levels, and general zero-sum games. It concludes with practice exercises with corresponding solutions and an additional exercise.
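The rock-paper-scissors example can be made concrete with a short calculation of security levels. The Python below is an illustrative sketch; the cost-matrix convention (rows are Player 1, the minimizer, with +1 for a loss and -1 for a win) and the function names are assumptions, not taken from the chapter.

```python
from fractions import Fraction

# A[i][j]: cost to Player 1 (minimizer) when P1 plays i and P2 plays j,
# with actions ordered rock, paper, scissors (assumed convention).
A = [[0, 1, -1],
     [-1, 0, 1],
     [1, -1, 0]]


def worst_case_cost(y, A):
    """Largest expected cost P1 can suffer when playing the mixed policy y."""
    return max(sum(y[i] * A[i][j] for i in range(len(A))) for j in range(len(A[0])))


def worst_case_gain(z, A):
    """Smallest expected cost P2 can be held to when playing the mixed policy z."""
    return min(sum(A[i][j] * z[j] for j in range(len(A[0]))) for i in range(len(A)))


uniform = [Fraction(1, 3)] * 3
always_rock = [1, 0, 0]

pure_level = worst_case_cost(always_rock, A)   # P2 answers with paper
mixed_level_p1 = worst_case_cost(uniform, A)
mixed_level_p2 = worst_case_gain(uniform, A)
```

Any pure policy can be exploited for a cost of 1, whereas the uniform mixed policy guarantees an expected cost of 0 for both players, so under this convention the uniform pair is a mixed saddle-point equilibrium with value 0.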

Continuous-time dynamic games for the Cournot adjustment process for competing oligopolists

Applied Mathematics and Computation ◽

10.1016/j.amc.2012.12.078 ◽

2013 ◽

Vol 219 (12) ◽

pp. 6400-6409 ◽

Cited By ~ 4

Author(s):

Brooke C. Snyder ◽

Robert A. Van Gorder ◽

K. Vajravelu

Keyword(s):

Continuous Time ◽

Dynamic Games ◽

Adjustment Process ◽

Time Dynamic

Finite‐Difference Methods for Continuous‐Time Dynamic Programming

Computational Methods for the Study of Dynamic Economies ◽

10.1093/0199248273.003.0008 ◽

2001 ◽

pp. 172-194 ◽

Cited By ~ 7

Author(s):

Graham V. Candler

Keyword(s):

Dynamic Programming ◽

Finite Difference ◽

Continuous Time ◽

Finite Difference Methods ◽

Time Dynamic ◽

Difference Methods

A zero-sum stopping game in a continuous-time dynamic fuzzy system

Mathematical and Computer Modelling ◽

10.1016/s0895-7177(01)00086-3 ◽

2001 ◽

Vol 34 (5-6) ◽

pp. 603-614

Author(s):

Y. Yoshida

Keyword(s):

Continuous Time ◽

Fuzzy System ◽

Time Dynamic ◽

Stopping Game ◽

Zero Sum

A Dynamic Multi-Objective Duopoly Game with Capital Accumulation and Pollution

Mathematics ◽

10.3390/math9161983 ◽

2021 ◽

Vol 9 (16) ◽

pp. 1983

Author(s):

Bertrand Crettez ◽

Naila Hayek ◽

Peter M. Kort

Keyword(s):

Information Structure ◽

Capital Accumulation ◽

Production Capacity ◽

Open Loop ◽

Feedback Nash Equilibrium ◽

Feedback Information ◽

Time Dynamic ◽

Clean Environment ◽

Time Differential ◽

Dynamic Duopoly

This paper studies a discrete-time dynamic duopoly game with homogeneous goods. Both firms must decide how much to invest; investment increases production capacity, allowing a firm to put a larger quantity on the market. The downside, however, is that a larger quantity raises pollution. The firms have multiple objectives in the sense that each one maximizes its discounted profit stream and also values a clean environment. We obtain some surprising results. First, whereas the continuous-time differential game literature finds that firms invest more under a feedback information structure than under an open-loop one, we detect scenarios where the opposite holds. Second, in a feedback Nash equilibrium, capital stock is more sensitive to environmental appreciation than in the open-loop case.
