On the Convergence of Model Free Learning in Mean Field Games

2020
Vol 34 (05)
pp. 7143-7150
Author(s):  
Romuald Elie ◽  
Julien Pérolat ◽  
Mathieu Laurière ◽  
Matthieu Geist ◽  
Olivier Pietquin

Learning from experience in Multi-Agent Systems (MAS) is a difficult and exciting task, due to the lack of stationarity of the environment, whose dynamics evolve as the population learns. In order to design scalable algorithms for systems with a large population of interacting agents (e.g., swarms), this paper focuses on Mean Field MAS, where the number of agents is asymptotically infinite. Recently, a very active line of research has studied how diverse reinforcement learning algorithms behave when agents with no prior information on a stationary Mean Field Game (MFG) learn their policies through repeated experience. We adopt a high-level perspective on this problem and analyze in full generality the convergence of an iterative fictitious play scheme using any single-agent learning algorithm at each step. We quantify the quality of the computed approximate Nash equilibrium in terms of the accumulated errors arising at each learning iteration step. Notably, we show for the first time convergence of model free learning algorithms towards non-stationary MFG equilibria, relying only on classical assumptions on the MFG dynamics. We illustrate our theoretical results with a numerical experiment in a continuous action-space environment, where the approximate best response of the iterative fictitious play scheme is computed with a deep RL algorithm.
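
The core of the iterative scheme described above can be illustrated with a minimal sketch. Everything below is our own toy instance, not the paper's model: a one-shot game with two states, a crowd-aversion cost equal to the occupation level of the chosen state, and a deterministic best response standing in for the generic single-agent learner.

```python
# Toy fictitious play for a one-shot mean field game (illustrative sketch only;
# the 2-state crowd-aversion cost and the exact best response replacing the
# paper's generic RL learner are our own assumptions).

def best_response(mu):
    """Deterministic best response: put all mass on the cheapest state.
    The cost of occupying state s is the current crowd level mu[s]."""
    s_star = min(range(len(mu)), key=lambda s: mu[s])
    br = [0.0] * len(mu)
    br[s_star] = 1.0
    return br

def fictitious_play(n_states=2, n_iters=500):
    """Average the best responses into a running empirical distribution."""
    mu_bar = [1.0] + [0.0] * (n_states - 1)   # degenerate initial guess
    for k in range(1, n_iters + 1):
        br = best_response(mu_bar)
        mu_bar = [(k * m + b) / (k + 1) for m, b in zip(mu_bar, br)]
    return mu_bar

mu = fictitious_play()   # approaches the uniform equilibrium [0.5, 0.5]
```

With crowd-averse costs the unique equilibrium spreads the population uniformly, and the averaged play converges to it at the classical O(1/k) fictitious play rate.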

2017
Vol 27 (01)
pp. 75-113
Author(s):  
Yves Achdou ◽  
Martino Bardi ◽  
Marco Cirant

This paper introduces and analyzes some models in the framework of mean field games (MFGs) describing interactions between two populations motivated by the studies on urban settlements and residential choice by Thomas Schelling. For static games, a large population limit is proved. For differential games with noise, the existence of solutions is established for the systems of partial differential equations of MFG theory, in the stationary and in the evolutive case. Numerical methods are proposed with several simulations. In the examples and in the numerical results, particular emphasis is put on the phenomenon of segregation between the populations.


Energies
2021
Vol 14 (24)
pp. 8517
Author(s):  
Samuel M. Muhindo ◽  
Roland P. Malhamé ◽  
Geza Joos

We develop a strategy, with concepts from Mean Field Games (MFG), to coordinate the charging of a large population of battery electric vehicles (BEVs) in a parking lot powered by solar energy and managed by an aggregator. A yearly parking fee is charged for each BEV irrespective of the amount of energy extracted. The goal is to share the available energy so as to minimize the standard deviation (STD) of the state of charge (SOC) of the batteries when the BEVs leave the parking lot, while maintaining some fairness and decentralization criteria. The MFG charging laws correspond to the Nash equilibrium induced by quadratic cost functions based on an inverse Nash equilibrium concept and designed to favor the batteries with the lower SOCs upon arrival. While the MFG charging laws are strictly decentralized, they guarantee that the mean of the instantaneous charging powers to the BEVs follows a trajectory based on the solar energy forecast for the day. That day-ahead forecast is broadcast to the BEVs, which then gauge the necessary SOC upon leaving their home. We illustrate the advantages of the MFG strategy for a typical sunny day and a typical cloudy day when compared to more straightforward strategies: first-come-first-served and equal sharing. The behavior of the charging strategies is contrasted under conditions of random arrivals and random departures of the BEVs in the parking lot.
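
The qualitative effect of such a decentralized, low-SOC-favoring charging law can be sketched as follows. This is our own heavy simplification, not the paper's MFG control law: each BEV applies a purely local proportional rule steering its SOC toward a broadcast target trajectory, so batteries that arrive with lower SOCs draw more power and the fleet's SOC spread shrinks over time.

```python
# Illustrative sketch (our own simplification, not the paper's quadratic-cost
# MFG law): a decentralized proportional charging rule in which each BEV only
# sees its own SOC and a broadcast target trajectory.

def charging_power(soc, soc_target, gain=0.5, p_max=1.0):
    """Power drawn by one BEV: proportional to its SOC deficit, capped at p_max."""
    return max(0.0, min(p_max, gain * (soc_target - soc)))

def simulate(socs, soc_targets, dt=1.0, capacity=10.0):
    """Advance all BEVs along the broadcast target trajectory."""
    socs = list(socs)
    for target in soc_targets:
        socs = [s + charging_power(s, target) * dt / capacity for s in socs]
    return socs

# BEVs arriving with unequal SOCs converge toward one another:
final = simulate([0.2, 0.5, 0.8], [1.0] * 200)
```

Because each update pulls a BEV's SOC a fixed fraction of the way toward the common target, the gap between any two vehicles contracts geometrically, which is the STD-reduction behavior the abstract describes.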


Author(s):  
Joseph Frédéric Bonnans ◽  
Pierre Lavigne ◽  
Laurent Pfeiffer

We propose and investigate a discrete-time mean field game model involving risk-averse agents. The model under study is a coupled system of dynamic programming equations with a Kolmogorov equation. The agents' risk aversion is modeled by composite risk measures. The existence of a solution to the coupled system is obtained with a fixed point approach. The corresponding feedback control makes it possible to construct an approximate Nash equilibrium for a related dynamic game with finitely many players.
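
The skeleton of such a fixed-point approach can be sketched for a risk-neutral toy version (the paper's composite risk measures are replaced here by a plain expectation, and all model data below are our own choices): solve the dynamic programming equations backward given a guess of the population flow, push the initial distribution forward through the resulting policy (the Kolmogorov step), and iterate with damping.

```python
# Skeleton of the fixed-point iteration for a discrete-time finite-state MFG.
# Toy model (our own assumptions): 2 states, deterministic stay/switch actions,
# cost = crowd level of the state you end up in, plus a switching cost.

T, S = 20, 2                      # horizon and number of states
SWITCH_COST = 0.1

def backward_dp(flow):
    """Dynamic programming given the population flow (one distribution per time)."""
    V = [0.0] * S
    policy = []
    for t in reversed(range(T)):
        newV, pol = [0.0] * S, [0] * S
        for s in range(S):
            stay = flow[t][s] + V[s]
            switch = flow[t][1 - s] + SWITCH_COST + V[1 - s]
            if switch < stay:
                newV[s], pol[s] = switch, 1
            else:
                newV[s], pol[s] = stay, 0
        V = newV
        policy.insert(0, pol)
    return policy

def forward_kolmogorov(policy, mu0):
    """Kolmogorov step: transport the initial distribution through the policy."""
    flow = [mu0]
    for t in range(T - 1):
        mu = [0.0] * S
        for s in range(S):
            dest = 1 - s if policy[t][s] else s
            mu[dest] += flow[-1][s]
        flow.append(mu)
    return flow

def fixed_point(mu0, damping=0.5, n_iters=200):
    """Damped fixed-point iteration on the flow of distributions."""
    flow = [mu0] * T
    for _ in range(n_iters):
        policy = backward_dp(flow)
        new_flow = forward_kolmogorov(policy, mu0)
        flow = [[(1 - damping) * a + damping * b for a, b in zip(m, n)]
                for m, n in zip(flow, new_flow)]
    return flow

flow = fixed_point([0.9, 0.1])
```

The damping step is a common practical device to stabilize the iteration; the paper's existence argument rests on a fixed point theorem rather than on this numerical scheme.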


2020
Vol 45 (4)
pp. 1596-1620
Author(s):  
Naci Saldi ◽  
Tamer Başar ◽  
Maxim Raginsky

In this paper, we study a class of discrete-time mean-field games under the infinite-horizon risk-sensitive optimality criterion. Risk sensitivity is introduced for each agent (player) via an exponential utility function. In this game model, each agent is coupled with the rest of the population through the empirical distribution of the states, which affects both the agent’s individual cost and its state dynamics. Under mild assumptions, we establish the existence of a mean-field equilibrium in the infinite-population limit as the number of agents (N) goes to infinity, and we then show that the policy obtained from the mean-field equilibrium constitutes an approximate Nash equilibrium when N is sufficiently large.
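
The exponential utility criterion mentioned above replaces the expected cost E[C] by the certainty equivalent (1/λ) log E[exp(λC)], which penalizes cost variability for λ > 0. A small numerical sketch (the two-point cost distribution is our own toy choice):

```python
import math

# Risk-sensitive (exponential utility) criterion:
#   J_lambda = (1 / lambda) * log E[exp(lambda * C)].
# The sample of costs below is our own toy example.

def risk_sensitive_cost(costs, lam):
    """Exponential-utility certainty equivalent of an empirical cost sample."""
    a = [lam * c for c in costs]
    m = max(a)  # log-sum-exp shift for numerical stability
    lse = m + math.log(sum(math.exp(x - m) for x in a) / len(a))
    return lse / lam

# For a cost that is 0 or 1 with equal probability (mean 0.5), a risk-averse
# agent (lam > 0) evaluates it as strictly worse than its mean:
j = risk_sensitive_cost([0.0, 1.0], 2.0)
```

As λ → 0 the criterion recovers the risk-neutral mean, while larger λ weights the bad outcomes more heavily; this is the sense in which each agent in the game model is risk-sensitive.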


2019
Vol 65
pp. 84-113
Author(s):  
Andrea Angiuli ◽  
Christy V. Graves ◽  
Houzhi Li ◽  
Jean-François Chassagneux ◽  
François Delarue ◽  
...  

This project investigates numerical methods for solving fully coupled forward-backward stochastic differential equations (FBSDEs) of McKean-Vlasov type. Having numerical solvers for such mean field FBSDEs is of interest because of the potential application of these equations to optimization problems over a large population, for instance mean field games (MFG) and optimal mean field control problems. The theory for this kind of problem has met with great success since the early works on mean field games by Lasry and Lions, see [29], and by Huang, Caines, and Malhamé, see [26]. Generally speaking, the purpose is to understand the continuum limit of optimizers or of equilibria (say in the Nash sense) as the number of underlying players tends to infinity. When approached from the probabilistic viewpoint, solutions to these control problems (or games) can be described by coupled mean field FBSDEs, meaning that the coefficients depend upon the marginal laws of the solution itself. In this note, we detail two methods for solving such FBSDEs, which we implement and apply to five benchmark problems. The first method uses a tree structure to represent the pathwise laws of the solution, whereas the second method uses a grid discretization to represent the time marginal laws of the solutions. Both are based on a Picard scheme; importantly, we combine each of them with a generic continuation method that makes it possible to extend the time horizon (or equivalently the coupling strength between the two equations) for which the Picard iteration converges.
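
The Picard structure underlying both solvers can be sketched on a heavily simplified example, our own toy rather than anything from the paper: no noise (so the backward SDE becomes a backward ODE) and no mean field coupling, leaving the fully coupled system x' = -y with x(0) = x0, and y' = -x with terminal condition y(T) = x(T). Each Picard sweep integrates x forward given the current guess of y, then y backward given x.

```python
# Toy Picard scheme for a fully coupled forward-backward system (our own
# simplification: deterministic, no mean field term):
#   forward:  x' = -y,  x(0) = x0
#   backward: y' = -x,  y(T) = x(T)

def picard_fbode(x0=1.0, T=0.5, n=100, n_picard=50):
    dt = T / n
    y = [0.0] * (n + 1)                 # initial guess for the backward path
    for _ in range(n_picard):
        # Forward Euler pass for x given the current guess of y.
        x = [x0]
        for i in range(n):
            x.append(x[-1] - y[i] * dt)
        # Backward Euler pass for y given x, starting from the terminal condition.
        y_new = [0.0] * (n + 1)
        y_new[n] = x[n]
        for i in reversed(range(n)):
            y_new[i] = y_new[i + 1] + x[i + 1] * dt   # y_i = y_{i+1} - (-x) * dt
        y = y_new
    return x, y

x, y = picard_fbode()
```

For a short horizon the sweep is a contraction and the iterates converge; for longer horizons it can diverge, which is exactly why the paper pairs the Picard scheme with a continuation method that grows the horizon (or coupling strength) step by step, warm-starting each solve from the previous one.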


2020
Author(s):  
René Carmona ◽  
Peiqi Wang

We use the recently developed probabilistic analysis of mean field games with finitely many states in the weak formulation to set up a principal/agent contract theory model where the principal faces a large population of agents interacting in a mean field manner. We reduce the problem to the optimal control of dynamics of the McKean-Vlasov type, and we solve this problem explicitly for a class of models with concave rewards. The paper concludes with a numerical example demonstrating the power of the results when applied to an example of epidemic containment. This paper was accepted by Baris Ata, stochastic models and simulation.


2021
Vol 11 (11)
pp. 4948
Author(s):  
Lorenzo Canese ◽  
Gian Carlo Cardarilli ◽  
Luca Di Nunzio ◽  
Rocco Fazzolari ◽  
Daniele Giardino ◽  
...  

In this review, we present an analysis of the most widely used multi-agent reinforcement learning algorithms. Starting with the single-agent reinforcement learning algorithms, we focus on the most critical issues that must be taken into account in their extension to multi-agent scenarios. The analyzed algorithms were grouped according to their features. We present a detailed taxonomy of the main multi-agent approaches proposed in the literature, focusing on their related mathematical models. For each algorithm, we describe the possible application fields, while pointing out its pros and cons. The described multi-agent algorithms are compared in terms of the most important characteristics for multi-agent reinforcement learning applications—namely, nonstationarity, scalability, and observability. We also describe the most common benchmark environments used to evaluate the performance of the considered methods.

