Inducing selfish agents towards socially efficient solutions
Many multi-agent reinforcement learning (MARL) scenarios lead towards Nash equilibria, which are known to not always be socially efficient. In this study, we aim to align the social optimization objective of the system with the individual objectives of the agents by adopting a central controller that can interact with the agents. In detail, our approach establishes a communication channel between the reinforcement learning agents and a controller implemented with metaheuristics. The interaction benefits the convergence of both algorithms. Further, we evaluate our method in repeated games with a high price of anarchy and show that our approach is able to overcome many of the issues caused by the non-cooperative behaviour of the agents and the non-stationary effects they introduce.
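To make the gap between a Nash equilibrium and a socially efficient outcome concrete, the following minimal sketch enumerates the pure Nash equilibria of a two-player prisoner's dilemma and computes its price of anarchy, taken here as the ratio of the best achievable social welfare to the welfare of the worst pure equilibrium. The payoff values and all function names are illustrative assumptions and are not drawn from the games or the method evaluated in this study.

```python
from itertools import product

# Hypothetical payoff table for a two-player prisoner's dilemma.
# Actions: 0 = cooperate, 1 = defect. Values are (row payoff, column payoff).
PAYOFFS = {
    (0, 0): (3, 3),
    (0, 1): (0, 5),
    (1, 0): (5, 0),
    (1, 1): (1, 1),
}
ACTIONS = (0, 1)


def is_pure_nash(profile):
    """Return True if no player gains by unilaterally deviating."""
    for player in range(2):
        current = PAYOFFS[profile][player]
        for alt in ACTIONS:
            deviated = list(profile)
            deviated[player] = alt
            if PAYOFFS[tuple(deviated)][player] > current:
                return False
    return True


def social_welfare(profile):
    """Sum of both players' payoffs for a joint action."""
    return sum(PAYOFFS[profile])


def price_of_anarchy():
    """Return the pure Nash equilibria and the welfare-based price of anarchy."""
    profiles = list(product(ACTIONS, repeat=2))
    equilibria = [p for p in profiles if is_pure_nash(p)]
    best = max(social_welfare(p) for p in profiles)
    worst_eq = min(social_welfare(p) for p in equilibria)
    return equilibria, best / worst_eq
```

Under these assumed payoffs, mutual defection is the only pure equilibrium even though mutual cooperation yields three times the total welfare, which is the kind of inefficiency a central controller would aim to reduce.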