One-Player Dynamic Games

Author(s):  
João P. Hespanha

This chapter focuses on one-player discrete-time dynamic games, that is, the optimal control of a discrete-time dynamical system. It first considers solution methods for one-player dynamic games, which reduce to simple optimizations, before discussing the discrete-time cost-to-go. It shows that, regardless of the information structure (open loop, state feedback, or other), it is not possible to obtain a cost lower than the cost-to-go. Dynamic programming is a computationally efficient recursive technique for computing the cost-to-go. After providing an overview of discrete-time dynamic programming, the chapter explores the complexity of computing the cost-to-go at all stages, the use of MATLAB to solve finite one-player games, and linear quadratic dynamic games. It concludes with a practice exercise and the corresponding solution, along with an additional exercise.
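The backward recursion behind the cost-to-go can be illustrated with a minimal Python sketch (the chapter itself uses MATLAB). It assumes a finite state space, a finite action space, and a stage-additive cost with a terminal term. The function name `cost_to_go` and the lookup tables `next_state`, `stage_cost`, and `terminal_cost` are hypothetical names chosen for illustration; they do not come from the chapter.

```python
def cost_to_go(next_state, stage_cost, terminal_cost, horizon):
    """Backward dynamic programming for a finite one-player game.

    All arguments are illustrative assumptions, not the chapter's notation:
      next_state[x][u]  state reached from state x under action u
      stage_cost[x][u]  cost paid at state x when action u is used
      terminal_cost[x]  cost paid at the final stage
    Returns (V, policy): V[k][x] is the cost-to-go from state x at stage k,
    and policy[k][x] is one minimizing action (a state-feedback policy).
    """
    n_states = len(terminal_cost)
    V = [None] * (horizon + 1)
    policy = [None] * horizon
    V[horizon] = list(terminal_cost)

    # Sweep backward in time: stage k only needs the values at stage k + 1.
    for k in range(horizon - 1, -1, -1):
        V[k] = [0] * n_states
        policy[k] = [0] * n_states
        for x in range(n_states):
            candidates = [
                stage_cost[x][u] + V[k + 1][next_state[x][u]]
                for u in range(len(next_state[x]))
            ]
            best_u = min(range(len(candidates)), key=candidates.__getitem__)
            V[k][x] = candidates[best_u]
            policy[k][x] = best_u
    return V, policy


# Toy example: 3 states, 2 actions, 2 stages.
next_state = [[0, 1], [1, 2], [2, 2]]
stage_cost = [[1, 2], [1, 3], [0, 0]]
terminal_cost = [10, 5, 0]
V, policy = cost_to_go(next_state, stage_cost, terminal_cost, horizon=2)
print(V[0])       # cost-to-go from each state at the initial stage
print(policy[0])  # one optimal first-stage action per state
```

Each stage visits every state-action pair once, so the work grows linearly with the horizon rather than exponentially with the number of action sequences.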

Author(s):  
João P. Hespanha

This chapter focuses on the computation of saddle-point equilibria of zero-sum discrete-time dynamic games in state-feedback policies. It begins by considering solution methods for two-player zero-sum dynamic games in discrete time, assuming a finite-horizon stage-additive cost that Player 1 wants to minimize and Player 2 wants to maximize, and taking into account a state-feedback information structure. The discussion then turns to discrete-time dynamic programming, the use of MATLAB to solve zero-sum games with finite state spaces and finite action spaces, and discrete-time linear quadratic dynamic games. The chapter concludes with a practice exercise that requires computing the cost-to-go for each state of the tic-tac-toe game, and the corresponding solution.
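For finite state and action spaces, the zero-sum recursion can be sketched in Python as follows (again, the chapter uses MATLAB). The sketch assumes both players act simultaneously at each stage and restricts attention to pure policies: at every state it compares the min-max and max-min values of the stage game and records whether they coincide. The function name and the tables `next_state`, `stage_cost`, and `terminal_cost` are hypothetical. When the two values differ, no pure saddle point exists at that state, the stored number is only Player 1's security level, and mixed policies would be needed, which this sketch does not handle.

```python
def zero_sum_cost_to_go(next_state, stage_cost, terminal_cost, horizon):
    """Backward recursion for a finite zero-sum game in pure policies.

    Illustrative assumptions (not the chapter's notation):
      next_state[x][u][d]  state reached when P1 plays u and P2 plays d
      stage_cost[x][u][d]  stage cost (P1 minimizes, P2 maximizes)
      terminal_cost[x]     cost paid at the final stage
    Returns (V, has_pure_saddle). V[k][x] is the min-max value at stage k;
    it is a saddle-point value only if has_pure_saddle is True.
    """
    V = [list(terminal_cost)]
    has_pure_saddle = True
    for _ in range(horizon):
        V_next = V[0]
        V_now = []
        for x in range(len(terminal_cost)):
            # Stage matrix game: rows are P1 actions, columns are P2 actions.
            A = [
                [stage_cost[x][u][d] + V_next[next_state[x][u][d]]
                 for d in range(len(next_state[x][u]))]
                for u in range(len(next_state[x]))
            ]
            upper = min(max(row) for row in A)        # min over u of max over d
            lower = max(min(col) for col in zip(*A))  # max over d of min over u
            if upper != lower:
                has_pure_saddle = False
            V_now.append(upper)
        V.insert(0, V_now)
    return V, has_pure_saddle


# Toy example: 2 states, 2 actions per player, 1 stage.
next_state = [[[0, 1], [0, 1]], [[1, 1], [0, 0]]]
stage_cost = [[[1, 3], [0, 5]], [[2, 2], [1, 3]]]
terminal_cost = [0, 4]
V, ok = zero_sum_cost_to_go(next_state, stage_cost, terminal_cost, horizon=1)
print(V[0], ok)
```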


Author(s):  
João P. Hespanha

This chapter focuses on one-player continuous-time dynamic games, that is, the optimal control of a continuous-time dynamical system. It begins by considering a one-player continuous-time differential game in which the (only) player wants to minimize a cost using either an open-loop policy or a state-feedback policy. It then discusses the continuous-time cost-to-go, with the following conclusion: regardless of the information structure considered (open loop, state feedback, or other), it is not possible to obtain a cost lower than the cost-to-go. It also explores continuous-time dynamic programming, linear quadratic dynamic games, and differential games with variable termination time before concluding with a practice exercise and the corresponding solution.
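As a reminder of what continuous-time dynamic programming involves, the cost-to-go is typically characterized by the Hamilton-Jacobi-Bellman equation. The statement below is a standard form written under assumed notation, which need not match the chapter's: dynamics $\dot x = f(x,u,t)$, integral cost $g$, terminal cost $q$, horizon $T$, and a continuously differentiable cost-to-go $V$.

```latex
% Assumed setup (notation is illustrative):
%   dynamics:  \dot x(t) = f(x(t), u(t), t),   t \in [0, T]
%   cost:      J = \int_0^T g(x(t), u(t), t)\, dt + q(x(T))
\begin{align}
  -\frac{\partial V}{\partial t}(t, x)
    &= \min_{u} \left[ g(x, u, t)
       + \frac{\partial V}{\partial x}(t, x)\, f(x, u, t) \right],
    \qquad t \in [0, T], \\
  V(T, x) &= q(x).
\end{align}
```

Under these assumptions, a state-feedback policy that attains the minimum at every $(t, x)$ achieves the cost $V(0, x(0))$.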


Author(s):  
João P. Hespanha

This chapter focuses on the computation of saddle-point equilibria of zero-sum continuous-time dynamic games in state-feedback policies. It begins by considering the solution of two-player zero-sum dynamic games in continuous time, assuming a finite-horizon integral cost that Player 1 wants to minimize and Player 2 wants to maximize, and taking into account a state-feedback information structure. Continuous-time dynamic programming can also be used to construct saddle-point equilibria in state-feedback policies. The discussion then turns to continuous-time linear quadratic dynamic games and the use of dynamic programming to construct a saddle-point equilibrium in state-feedback policies for a two-player zero-sum differential game with variable termination time. The chapter also describes pursuit-evasion games before concluding with a practice exercise and the corresponding solution.
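In the two-player case, the dynamic programming condition is usually written as the Hamilton-Jacobi-Isaacs equation. The form below is a standard statement under assumed notation, not necessarily the chapter's: dynamics $\dot x = f(x,u,d,t)$ with Player 1 choosing $u$ and Player 2 choosing $d$, integral cost $g$, terminal cost $q$, and a continuously differentiable $V$.

```latex
% Assumed setup (notation is illustrative):
%   dynamics:  \dot x(t) = f(x(t), u(t), d(t), t),   t \in [0, T]
%   cost:      J = \int_0^T g(x(t), u(t), d(t), t)\, dt + q(x(T))
\begin{align}
  -\frac{\partial V}{\partial t}(t, x)
    &= \min_{u} \max_{d} \left[ g(x, u, d, t)
       + \frac{\partial V}{\partial x}(t, x)\, f(x, u, d, t) \right] \\
    &= \max_{d} \min_{u} \left[ g(x, u, d, t)
       + \frac{\partial V}{\partial x}(t, x)\, f(x, u, d, t) \right],
    \qquad t \in [0, T], \\
  V(T, x) &= q(x).
\end{align}
```

The requirement that the min-max and max-min coincide is what allows the pointwise optimizers to be assembled into a pair of state-feedback policies forming a saddle point.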


2021, Vol 2021 (1)
Author(s):  
Kehan Si, Zhen Wu

Abstract: This paper studies a controlled backward-forward linear-quadratic-Gaussian (LQG) large-population system in Stackelberg games. The leader agent's state is of backward type and the follower agents' states are of forward type. The leader agent is dominant in that its state enters the states of the follower agents. On the other hand, the state average of all follower agents affects the cost functional of the leader agent. In reality, the leader and the followers may represent two typical types of participants involved in market price formation: the supplier and producers. This differs from the standard mean-field game (MFG) literature and is mainly due to the Stackelberg structure here. By variational analysis, the consistency condition system can be represented by fully coupled backward-forward stochastic differential equations (BFSDEs) with a high-dimensional block structure in an open-loop sense. Next, we discuss the well-posedness of such a BFSDE system by virtue of the contraction mapping method. Consequently, we obtain decentralized strategies for the leader and follower agents, which are proved to satisfy the ε-Nash equilibrium property.

