Dynamic Set Values for Nonzero-Sum Games with Multiple Equilibriums

Nonzero-sum games typically have multiple Nash equilibriums (or no equilibrium), and unlike the zero-sum case, they may have different values at different equilibriums. Instead of focusing on the existence of individual equilibriums, we study the set of values over all equilibriums, which we call the set value of the game. The set value is unique by nature and always exists (with possible value ∅). Similar to the standard value function in the control literature, it enjoys many nice properties, such as regularity, stability, and, more importantly, the dynamic programming principle. Two features are essential for obtaining the dynamic programming principle: (i) we must use closed-loop controls (instead of open-loop controls); and (ii) we must allow for path-dependent controls, even if the problem is in a state-dependent (Markovian) setting. We consider both discrete and continuous time models with finite time horizon. For the latter, we also provide a duality approach through certain standard PDEs (or path-dependent PDEs), which is quite efficient for numerically computing the set value of the game.
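The idea of a set value can be illustrated on a toy case. The following is a minimal sketch, assuming a static two-player bimatrix game and pure strategies only, which is far simpler than the dynamic, closed-loop setting of the paper. The function name pure_nash_set_value and the example payoff matrices are hypothetical and chosen for illustration.

```python
from itertools import product


def pure_nash_set_value(payoff_1, payoff_2):
    """Collect the payoff pairs attained at pure-strategy Nash equilibria.

    payoff_1[i][j] and payoff_2[i][j] are the payoffs of players 1 and 2
    when player 1 picks row i and player 2 picks column j.
    """
    n_rows = len(payoff_1)
    n_cols = len(payoff_1[0])
    values = set()
    for i, j in product(range(n_rows), range(n_cols)):
        # Player 1 cannot gain by switching rows against column j.
        best_row = all(payoff_1[i][j] >= payoff_1[k][j] for k in range(n_rows))
        # Player 2 cannot gain by switching columns against row i.
        best_col = all(payoff_2[i][j] >= payoff_2[i][m] for m in range(n_cols))
        if best_row and best_col:
            values.add((payoff_1[i][j], payoff_2[i][j]))
    return values


# Coordination-type game: two equilibria with different values.
coordination = pure_nash_set_value([[2, 0], [0, 1]], [[1, 0], [0, 2]])

# Matching pennies: no pure equilibrium, so the set is empty.
pennies = pure_nash_set_value([[1, -1], [-1, 1]], [[-1, 1], [1, -1]])
```

In this sketch the first game yields two distinct value pairs and the second yields the empty set, which mirrors the two situations the abstract mentions: several equilibriums with different values, or no equilibrium at all.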

A BSDE Approach to Stochastic Differential Games with Regime Switching

Mathematical Problems in Engineering ◽

10.1155/2021/9930142 ◽

2021 ◽

Vol 2021 ◽

pp. 1-17

Author(s):

J. Y. Li ◽

M. N. Tang

Keyword(s):

Dynamic Programming ◽

Regime Switching ◽

Time Horizon ◽

Backward Stochastic Differential Equation ◽

Stochastic Differential Games ◽

Dynamic Programming Principle ◽

Value Functions ◽

Finite Time Horizon ◽

Hamilton Jacobi Bellman ◽

Zero Sum

In this paper, we study a two-player zero-sum stochastic differential game with regime switching in the framework of forward-backward stochastic differential equations on a finite time horizon. By means of backward stochastic differential equation methods, in particular the notion of stochastic backward semigroups, we prove a dynamic programming principle for both the upper and the lower value functions of the game. Based on the dynamic programming principle, the upper and the lower value functions are shown to be the unique viscosity solutions of the associated upper and lower Hamilton–Jacobi–Bellman–Isaacs equations.
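The distinction between upper and lower values can be seen in a static analogue. The sketch below assumes a finite zero-sum matrix game in pure strategies, not the continuous-time regime-switching game of the paper; the function names lower_value and upper_value and the payoff matrices are hypothetical.

```python
def lower_value(payoff):
    """Max-min value: the maximizer (rows) commits first, the minimizer replies."""
    return max(min(row) for row in payoff)


def upper_value(payoff):
    """Min-max value: the minimizer (columns) commits first, the maximizer replies."""
    return min(max(col) for col in zip(*payoff))


# Matching pennies: the two values differ in pure strategies.
pennies = [[1, -1], [-1, 1]]
pennies_values = (lower_value(pennies), upper_value(pennies))

# A matrix with a saddle point: the two values coincide.
saddle = [[3, 1], [4, 2]]
saddle_values = (lower_value(saddle), upper_value(saddle))
```

The lower value never exceeds the upper value; when they coincide, as in the second matrix, the game has a value in the usual sense.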

A Weak Dynamic Programming Principle for Zero-Sum Stochastic Differential Games with Unbounded Controls

SIAM Journal on Control and Optimization ◽

10.1137/120897638 ◽

2013 ◽

Vol 51 (3) ◽

pp. 2036-2080 ◽

Cited By ~ 12

Author(s):

Erhan Bayraktar ◽

Song Yao

Keyword(s):

Dynamic Programming ◽

Differential Games ◽

Stochastic Differential Games ◽

Dynamic Programming Principle ◽

Unbounded Controls ◽

Zero Sum

A Constrained Markovian Diffusion Model for Controlling the Pollution Accumulation

Mathematics ◽

10.3390/math9131466 ◽

2021 ◽

Vol 9 (13) ◽

pp. 1466

Author(s):

Beatris Adriana Escobedo-Trujillo ◽

José Daniel López-Barrientos ◽

Javier Garrido-Meléndez

Keyword(s):

Dynamic Programming ◽

Dirichlet Problem ◽

Stochastic Control ◽

Finite Time ◽

Time Horizon ◽

Closed Loop ◽

Programming Techniques ◽

Pollution Accumulation ◽

Finite Time Horizon ◽

The Cost

This work studies a finite-time-horizon stochastic control problem with restrictions on both the reward and the cost functions. To this end, it uses standard dynamic programming techniques and an extension of the classic Lagrange multipliers approach. The coefficients considered here are supposed to be unbounded, and the obtained strategies are of non-stationary closed-loop type. The driving thread of the paper is a sequence of examples on a pollution accumulation model, which is used to present three algorithms for replicating the results. There, the reader can find a result on the interchangeability of limits in a Dirichlet problem.

On the dynamic programming principle for uniformly nondegenerate stochastic differential games in domains and the Isaacs equations

Probability Theory and Related Fields ◽

10.1007/s00440-013-0495-y ◽

2013 ◽

Vol 158 (3-4) ◽

pp. 751-783 ◽

Cited By ~ 7

Author(s):

N. V. Krylov

Keyword(s):

Dynamic Programming ◽

Differential Games ◽

Stochastic Differential Games ◽

Dynamic Programming Principle ◽

Isaacs Equations

Event-Triggered Control of Discrete-Time Zero-Sum Games via Deterministic Policy Gradient Adaptive Dynamic Programming

IEEE Transactions on Systems Man and Cybernetics Systems ◽

10.1109/tsmc.2021.3105663 ◽

2021 ◽

pp. 1-13

Author(s):

Yongwei Zhang ◽

Bo Zhao ◽

Derong Liu ◽

Shunchao Zhang

Keyword(s):

Dynamic Programming ◽

Discrete Time ◽

Adaptive Dynamic Programming ◽

Adaptive Dynamic ◽

Zero Sum Games ◽

Time Zero ◽

Policy Gradient ◽

Zero Sum ◽

Event Triggered

Adaptive Dynamic Programming for Discrete-Time Zero-Sum Games

IEEE Transactions on Neural Networks and Learning Systems ◽

10.1109/tnnls.2016.2638863 ◽

2018 ◽

Vol 29 (4) ◽

pp. 957-969 ◽

Cited By ~ 44

Author(s):

Qinglai Wei ◽

Derong Liu ◽

Qiao Lin ◽

Ruizhuo Song

Keyword(s):

Dynamic Programming ◽

Discrete Time ◽

Adaptive Dynamic Programming ◽

Adaptive Dynamic ◽

Zero Sum Games ◽

Time Zero ◽

Zero Sum

Iterative Adaptive Dynamic Programming for Solving Unknown Nonlinear Zero-Sum Game Based on Online Data

IEEE Transactions on Neural Networks and Learning Systems ◽

10.1109/tnnls.2016.2561300 ◽

2017 ◽

Vol 28 (3) ◽

pp. 714-725 ◽

Cited By ~ 46

Author(s):

Yuanheng Zhu ◽

Dongbin Zhao ◽

Xiangjun Li

Keyword(s):

Dynamic Programming ◽

Adaptive Dynamic Programming ◽

Online Data ◽

Adaptive Dynamic ◽

Zero Sum

A robust adaptive dynamic programming principle for sensorimotor control with signal-dependent noise

Journal of Systems Science and Complexity ◽

10.1007/s11424-015-3310-2 ◽

2015 ◽

Vol 28 (2) ◽

pp. 261-288 ◽

Cited By ~ 6

Author(s):

Yu Jiang ◽

Zhong-Ping Jiang

Keyword(s):

Dynamic Programming ◽

Sensorimotor Control ◽

Adaptive Dynamic Programming ◽

Dynamic Programming Principle ◽

Adaptive Dynamic ◽

Robust Adaptive

Robust Adaptive Dynamic Programming of Two-Player Zero-Sum Games for Continuous-Time Linear Systems

IEEE Transactions on Neural Networks and Learning Systems ◽

10.1109/tnnls.2015.2461452 ◽

2015 ◽

Vol 26 (12) ◽

pp. 3314-3319 ◽

Cited By ~ 22

Author(s):

Yue Fu ◽

Jun Fu ◽

Tianyou Chai

Keyword(s):

Dynamic Programming ◽

Linear Systems ◽

Continuous Time ◽

Adaptive Dynamic Programming ◽

Adaptive Dynamic ◽

Zero Sum Games ◽

Robust Adaptive ◽

Zero Sum ◽

Time Linear

Nonlinear Stochastic Optimal Control of MDOF Partially Observable Linear Systems Excited by Combined Harmonic and Wide-Band Noises

International Journal of Structural Stability and Dynamics ◽

10.1142/s0219455419500196 ◽

2019 ◽

Vol 19 (03) ◽

pp. 1950019 ◽

Cited By ~ 3

Author(s):

R. C. Hu ◽

X. F. Wang ◽

X. D. Gu ◽

R. H. Huan

Keyword(s):

Optimal Control ◽

Dynamic Programming ◽

Control Problem ◽

Linear Systems ◽

Control Strategy ◽

Stochastic Optimal Control ◽

Wide Band ◽

Dynamic Programming Principle ◽

Stochastic Dynamic ◽

Partially Observable

In this paper, nonlinear stochastic optimal control of multi-degree-of-freedom (MDOF) partially observable linear systems subjected to combined harmonic and wide-band random excitations is investigated. Based on the separation principle, the control problem of a partially observable system is converted into a completely observable one. The dynamic programming equation for the completely observable control problem is then set up based on the stochastic averaging method and the stochastic dynamic programming principle, from which the nonlinear optimal control law is derived. To illustrate the feasibility and efficiency of the proposed control strategy, the responses of the uncontrolled and optimally controlled systems are obtained by solving the associated Fokker–Planck–Kolmogorov (FPK) equation. Numerical results show that the proposed control strategy can dramatically reduce the response of stochastic systems subjected to both harmonic and wide-band random excitations.
