A Proximal/Gradient Approach for Computing the Nash Equilibrium in Controllable Markov Games

Author(s): Julio B. Clempner

2020, Vol 34 (05), pp. 7325-7332
Author(s): Haifeng Zhang, Weizhe Chen, Zeren Huang, Minne Li, Yaodong Yang, ...

Coordination is one of the essential problems in multi-agent systems. Typically, multi-agent reinforcement learning (MARL) methods treat agents equally, and the goal is to solve the Markov game to an arbitrary Nash equilibrium (NE) when multiple equilibria exist, thus lacking a solution for NE selection. In this paper, we treat agents unequally and consider the Stackelberg equilibrium as a potentially better convergence point than the Nash equilibrium in terms of Pareto superiority, especially in cooperative environments. Under Markov games, we formally define the bi-level reinforcement learning problem of finding a Stackelberg equilibrium. We propose a novel bi-level actor-critic learning method that allows agents to have different knowledge bases (thus intelligent), while their actions can still be executed simultaneously and in a distributed manner. A convergence proof is given, and the resulting learning algorithm is tested against the state of the art. We found that the proposed bi-level actor-critic algorithm successfully converged to the Stackelberg equilibria in matrix games and found an asymmetric solution in a highway merge environment.
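The NE selection issue described in this abstract can be illustrated on a small matrix game. The sketch below is not the paper's bi-level actor-critic method; it is a brute-force enumeration over pure strategies, and the payoff matrices `A` (leader) and `B` (follower), the function names, and the tie-breaking rule in favor of the leader are all assumptions made for illustration.

```python
from itertools import product

def pure_nash(A, B):
    """Return all pure-strategy Nash equilibria (row, col) of a bimatrix game."""
    rows, cols = len(A), len(A[0])
    equilibria = []
    for i, j in product(range(rows), range(cols)):
        # Row player cannot gain by switching rows, given column j.
        row_best = all(A[i][j] >= A[k][j] for k in range(rows))
        # Column player cannot gain by switching columns, given row i.
        col_best = all(B[i][j] >= B[i][l] for l in range(cols))
        if row_best and col_best:
            equilibria.append((i, j))
    return equilibria

def pure_stackelberg(A, B):
    """Return the pure Stackelberg outcome when the row player leads.

    Assumption: the follower breaks ties in the leader's favor.
    """
    best = None
    for i in range(len(A)):
        top = max(B[i])
        # Follower's best response to the leader's committed row i.
        j = max((l for l in range(len(B[i])) if B[i][l] == top),
                key=lambda l: A[i][l])
        if best is None or A[i][j] > A[best[0]][best[1]]:
            best = (i, j)
    return best

# Hypothetical cooperative coordination game with two pure Nash equilibria.
A = [[2, 0], [0, 1]]  # leader payoffs
B = [[2, 0], [0, 1]]  # follower payoffs

nash = pure_nash(A, B)
stack = pure_stackelberg(A, B)
print("Pure Nash equilibria:", nash)
print("Stackelberg outcome:", stack)
```

In this hypothetical game both (0, 0) and (1, 1) are Nash equilibria, while the leader-follower ordering picks out (0, 0), the Pareto-superior one.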


2011, pp. 65-87
Author(s): A. Rubinstein

The article considers some aspects of the theory of patronized goods with respect to efficient and inefficient equilibria. The author analyzes specific features of patronized goods and their connection with market failures, and conjectures that they are related to the emergence of Pareto-inefficient Nash equilibria. The key problem is to analyze the opportunities for transforming an inefficient Nash equilibrium into a Pareto-optimal Nash equilibrium for patronized goods by modifying the institutional environment. The paper analyzes the social motivation for institutional modernization and the equilibrium conditions in the generalized Wicksell-Lindahl model for patronized goods. The author also considers some applications of the theory of patronized goods to social policy issues.

