On the Convergence of Model Free Learning in Mean Field Games

2020
Vol 34 (05)
pp. 7143-7150
Author(s):  
Romuald Elie ◽  
Julien Pérolat ◽  
Mathieu Laurière ◽  
Matthieu Geist ◽  
Olivier Pietquin

Learning from experience in Multi-Agent Systems (MAS) is a difficult and exciting task, due to the lack of stationarity of the environment, whose dynamics evolve as the population learns. In order to design scalable algorithms for systems with a large population of interacting agents (e.g., swarms), this paper focuses on Mean Field MAS, where the number of agents is asymptotically infinite. Recently, a very active line of research has studied how diverse reinforcement learning algorithms behave when agents with no prior information on a stationary Mean Field Game (MFG) learn their policies through repeated experience. We adopt a high-level perspective on this problem and analyze in full generality the convergence of an iterative fictitious play scheme using any single-agent learning algorithm at each step. We quantify the quality of the computed approximate Nash equilibrium in terms of the accumulated errors arising at each learning iteration step. Notably, we show for the first time convergence of model free learning algorithms towards non-stationary MFG equilibria, relying only on classical assumptions on the MFG dynamics. We illustrate our theoretical results with a numerical experiment in a continuous action-space environment, where the approximate best response of the iterative fictitious play scheme is computed with a deep RL algorithm.
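
The core of the iterative scheme described above can be illustrated with a minimal sketch. Everything below is our own toy instance, not the paper's model: a one-shot game with two states, a crowd-aversion cost equal to the occupation level of the chosen state, and a deterministic best response standing in for the generic single-agent learner.

```python
# Toy fictitious play for a one-shot mean field game (illustrative sketch only;
# the 2-state crowd-aversion cost and the exact best response replacing the
# paper's generic RL learner are our own assumptions).

def best_response(mu):
    """Deterministic best response: put all mass on the cheapest state.
    The cost of occupying state s is the current crowd level mu[s]."""
    s_star = min(range(len(mu)), key=lambda s: mu[s])
    br = [0.0] * len(mu)
    br[s_star] = 1.0
    return br

def fictitious_play(n_states=2, n_iters=500):
    """Average the best responses into a running empirical distribution."""
    mu_bar = [1.0] + [0.0] * (n_states - 1)   # degenerate initial guess
    for k in range(1, n_iters + 1):
        br = best_response(mu_bar)
        mu_bar = [(k * m + b) / (k + 1) for m, b in zip(mu_bar, br)]
    return mu_bar

mu = fictitious_play()   # approaches the uniform equilibrium [0.5, 0.5]
```

With crowd-averse costs the unique equilibrium spreads the population uniformly, and the averaged play converges to it at the classical O(1/k) fictitious play rate.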

2017
Vol 27 (01)
pp. 75-113
Author(s):  
Yves Achdou ◽  
Martino Bardi ◽  
Marco Cirant

This paper introduces and analyzes some models in the framework of mean field games (MFGs) describing interactions between two populations motivated by the studies on urban settlements and residential choice by Thomas Schelling. For static games, a large population limit is proved. For differential games with noise, the existence of solutions is established for the systems of partial differential equations of MFG theory, in the stationary and in the evolutive case. Numerical methods are proposed with several simulations. In the examples and in the numerical results, particular emphasis is put on the phenomenon of segregation between the populations.


Energies
2021
Vol 14 (24)
pp. 8517
Author(s):  
Samuel M. Muhindo ◽  
Roland P. Malhamé ◽  
Geza Joos

We develop a strategy, with concepts from Mean Field Games (MFG), to coordinate the charging of a large population of battery electric vehicles (BEVs) in a parking lot powered by solar energy and managed by an aggregator. A yearly parking fee is charged for each BEV irrespective of the amount of energy extracted. The goal is to share the available energy so as to minimize the standard deviation (STD) of the state of charge (SOC) of the batteries when the BEVs leave the parking lot, while maintaining some fairness and decentralization criteria. The MFG charging laws correspond to the Nash equilibrium induced by quadratic cost functions based on an inverse Nash equilibrium concept and designed to favor the batteries with the lower SOCs upon arrival. While the MFG charging laws are strictly decentralized, they guarantee that the mean of the instantaneous charging powers to the BEVs follows a trajectory based on the solar energy forecast for the day. That day-ahead forecast is broadcast to the BEVs, which then gauge the necessary SOC upon leaving their home. We illustrate the advantages of the MFG strategy for a typical sunny day and a typical cloudy day when compared to more straightforward strategies: first-come-first-served and equal sharing. The behavior of the charging strategies is contrasted under conditions of random arrivals and random departures of the BEVs in the parking lot.
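
The qualitative effect of such a decentralized, low-SOC-favoring charging law can be sketched as follows. This is our own heavy simplification, not the paper's MFG control law: each BEV applies a purely local proportional rule steering its SOC toward a broadcast target trajectory, so batteries that arrive with lower SOCs draw more power and the fleet's SOC spread shrinks over time.

```python
# Illustrative sketch (our own simplification, not the paper's quadratic-cost
# MFG law): a decentralized proportional charging rule in which each BEV only
# sees its own SOC and a broadcast target trajectory.

def charging_power(soc, soc_target, gain=0.5, p_max=1.0):
    """Power drawn by one BEV: proportional to its SOC deficit, capped at p_max."""
    return max(0.0, min(p_max, gain * (soc_target - soc)))

def simulate(socs, soc_targets, dt=1.0, capacity=10.0):
    """Advance all BEVs along the broadcast target trajectory."""
    socs = list(socs)
    for target in soc_targets:
        socs = [s + charging_power(s, target) * dt / capacity for s in socs]
    return socs

# BEVs arriving with unequal SOCs converge toward one another:
final = simulate([0.2, 0.5, 0.8], [1.0] * 200)
```

Because each update pulls a BEV's SOC a fixed fraction of the way toward the common target, the gap between any two vehicles contracts geometrically, which is the STD-reduction behavior the abstract describes.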


Author(s):  
Joseph Frédéric Bonnans ◽  
Pierre Lavigne ◽  
Laurent Pfeiffer

We propose and investigate a discrete-time mean field game model involving risk-averse agents. The model under study is a coupled system of dynamic programming equations with a Kolmogorov equation. The agents' risk aversion is modeled by composite risk measures. The existence of a solution to the coupled system is obtained with a fixed point approach. The corresponding feedback control makes it possible to construct an approximate Nash equilibrium for a related dynamic game with finitely many players.
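
The skeleton of such a fixed-point approach can be sketched for a risk-neutral toy version (the paper's composite risk measures are replaced here by a plain expectation, and all model data below are our own choices): solve the dynamic programming equations backward given a guess of the population flow, push the initial distribution forward through the resulting policy (the Kolmogorov step), and iterate with damping.

```python
# Skeleton of the fixed-point iteration for a discrete-time finite-state MFG.
# Toy model (our own assumptions): 2 states, deterministic stay/switch actions,
# cost = crowd level of the state you end up in, plus a switching cost.

T, S = 20, 2                      # horizon and number of states
SWITCH_COST = 0.1

def backward_dp(flow):
    """Dynamic programming given the population flow (one distribution per time)."""
    V = [0.0] * S
    policy = []
    for t in reversed(range(T)):
        newV, pol = [0.0] * S, [0] * S
        for s in range(S):
            stay = flow[t][s] + V[s]
            switch = flow[t][1 - s] + SWITCH_COST + V[1 - s]
            if switch < stay:
                newV[s], pol[s] = switch, 1
            else:
                newV[s], pol[s] = stay, 0
        V = newV
        policy.insert(0, pol)
    return policy

def forward_kolmogorov(policy, mu0):
    """Kolmogorov step: transport the initial distribution through the policy."""
    flow = [mu0]
    for t in range(T - 1):
        mu = [0.0] * S
        for s in range(S):
            dest = 1 - s if policy[t][s] else s
            mu[dest] += flow[-1][s]
        flow.append(mu)
    return flow

def fixed_point(mu0, damping=0.5, n_iters=200):
    """Damped fixed-point iteration on the flow of distributions."""
    flow = [mu0] * T
    for _ in range(n_iters):
        policy = backward_dp(flow)
        new_flow = forward_kolmogorov(policy, mu0)
        flow = [[(1 - damping) * a + damping * b for a, b in zip(m, n)]
                for m, n in zip(flow, new_flow)]
    return flow

flow = fixed_point([0.9, 0.1])
```

The damping step is a common practical device to stabilize the iteration; the paper's existence argument rests on a fixed point theorem rather than on this numerical scheme.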


2020
Vol 45 (4)
pp. 1596-1620
Author(s):  
Naci Saldi ◽  
Tamer Başar ◽  
Maxim Raginsky

In this paper, we study a class of discrete-time mean-field games under the infinite-horizon risk-sensitive optimality criterion. Risk sensitivity is introduced for each agent (player) via an exponential utility function. In this game model, each agent is coupled with the rest of the population through the empirical distribution of the states, which affects both the agent’s individual cost and its state dynamics. Under mild assumptions, we establish the existence of a mean-field equilibrium in the infinite-population limit as the number of agents (N) goes to infinity, and we then show that the policy obtained from the mean-field equilibrium constitutes an approximate Nash equilibrium when N is sufficiently large.
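
The exponential utility criterion mentioned above replaces the expected cost E[C] by the certainty equivalent (1/λ) log E[exp(λC)], which penalizes cost variability for λ > 0. A small numerical sketch (the two-point cost distribution is our own toy choice):

```python
import math

# Risk-sensitive (exponential utility) criterion:
#   J_lambda = (1 / lambda) * log E[exp(lambda * C)].
# The sample of costs below is our own toy example.

def risk_sensitive_cost(costs, lam):
    """Exponential-utility certainty equivalent of an empirical cost sample."""
    a = [lam * c for c in costs]
    m = max(a)  # log-sum-exp shift for numerical stability
    lse = m + math.log(sum(math.exp(x - m) for x in a) / len(a))
    return lse / lam

# For a cost that is 0 or 1 with equal probability (mean 0.5), a risk-averse
# agent (lam > 0) evaluates it as strictly worse than its mean:
j = risk_sensitive_cost([0.0, 1.0], 2.0)
```

As λ → 0 the criterion recovers the risk-neutral mean, while larger λ weights the bad outcomes more heavily; this is the sense in which each agent in the game model is risk-sensitive.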


2019
Vol 65
pp. 84-113
Author(s):  
Andrea Angiuli ◽  
Christy V. Graves ◽  
Houzhi Li ◽  
Jean-François Chassagneux ◽  
François Delarue ◽  
...  

This project investigates numerical methods for solving fully coupled forward-backward stochastic differential equations (FBSDEs) of McKean-Vlasov type. Having numerical solvers for such mean field FBSDEs is of interest because of the potential application of these equations to optimization problems over a large population, for instance mean field games (MFG) and optimal mean field control problems. The theory for this kind of problem has met with great success since the early works on mean field games by Lasry and Lions, see [29], and by Huang, Caines, and Malhamé, see [26]. Generally speaking, the purpose is to understand the continuum limit of optimizers or of equilibria (say in the Nash sense) as the number of underlying players tends to infinity. When approached from the probabilistic viewpoint, solutions to these control problems (or games) can be described by coupled mean field FBSDEs, meaning that the coefficients depend upon the marginal laws of the solution itself. In this note, we detail two methods for solving such FBSDEs, which we implement and apply to five benchmark problems. The first method uses a tree structure to represent the pathwise laws of the solution, whereas the second method uses a grid discretization to represent the time marginal laws of the solutions. Both are based on a Picard scheme; importantly, we combine each of them with a generic continuation method that makes it possible to extend the time horizon (or equivalently the coupling strength between the two equations) for which the Picard iteration converges.
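
The Picard structure underlying both solvers can be sketched on a heavily simplified example, our own toy rather than anything from the paper: no noise (so the backward SDE becomes a backward ODE) and no mean field coupling, leaving the fully coupled system x' = -y with x(0) = x0, and y' = -x with terminal condition y(T) = x(T). Each Picard sweep integrates x forward given the current guess of y, then y backward given x.

```python
# Toy Picard scheme for a fully coupled forward-backward system (our own
# simplification: deterministic, no mean field term):
#   forward:  x' = -y,  x(0) = x0
#   backward: y' = -x,  y(T) = x(T)

def picard_fbode(x0=1.0, T=0.5, n=100, n_picard=50):
    dt = T / n
    y = [0.0] * (n + 1)                 # initial guess for the backward path
    for _ in range(n_picard):
        # Forward Euler pass for x given the current guess of y.
        x = [x0]
        for i in range(n):
            x.append(x[-1] - y[i] * dt)
        # Backward Euler pass for y given x, starting from the terminal condition.
        y_new = [0.0] * (n + 1)
        y_new[n] = x[n]
        for i in reversed(range(n)):
            y_new[i] = y_new[i + 1] + x[i + 1] * dt   # y_i = y_{i+1} - (-x) * dt
        y = y_new
    return x, y

x, y = picard_fbode()
```

For a short horizon the sweep is a contraction and the iterates converge; for longer horizons it can diverge, which is exactly why the paper pairs the Picard scheme with a continuation method that grows the horizon (or coupling strength) step by step, warm-starting each solve from the previous one.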


2020
Author(s):  
René Carmona ◽  
Peiqi Wang

We use the recently developed probabilistic analysis of mean field games with finitely many states in the weak formulation to set up a principal/agent contract theory model where the principal faces a large population of agents interacting in a mean field manner. We reduce the problem to the optimal control of dynamics of the McKean-Vlasov type, and we solve this problem explicitly for a class of models with concave rewards. The paper concludes with a numerical example demonstrating the power of the results when applied to an example of epidemic containment. This paper was accepted by Baris Ata, stochastic models and simulation.


2021
Vol 11 (11)
pp. 4948
Author(s):  
Lorenzo Canese ◽  
Gian Carlo Cardarilli ◽  
Luca Di Nunzio ◽  
Rocco Fazzolari ◽  
Daniele Giardino ◽  
...  

In this review, we present an analysis of the most widely used multi-agent reinforcement learning algorithms. Starting with the single-agent reinforcement learning algorithms, we focus on the most critical issues that must be taken into account in their extension to multi-agent scenarios. The analyzed algorithms were grouped according to their features. We present a detailed taxonomy of the main multi-agent approaches proposed in the literature, focusing on their related mathematical models. For each algorithm, we describe the possible application fields, while pointing out its pros and cons. The described multi-agent algorithms are compared in terms of the most important characteristics for multi-agent reinforcement learning applications—namely, nonstationarity, scalability, and observability. We also describe the most common benchmark environments used to evaluate the performance of the considered methods.

