Exploration-exploitation in multi-agent learning: Catastrophe theory meets game theory

2021 ◽  
pp. 103653
Author(s):  
Stefanos Leonardos ◽  
Georgios Piliouras
IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Giuseppe Caso ◽  
Ozgu Alay ◽  
Guido Carlo Ferrante ◽  
Luca De Nardis ◽  
Maria-Gabriella Di Benedetto ◽  
...  

Author(s):  
Wolfram Barfuss

Abstract: A dynamical systems perspective on multi-agent learning, based on the link between evolutionary game theory and reinforcement learning, provides an improved, qualitative understanding of the emerging collective learning dynamics. However, confusion exists with respect to how this dynamical systems account of multi-agent learning should be interpreted. In this article, I propose to embed the dynamical systems description of multi-agent learning into different abstraction levels of cognitive analysis. The purpose of this work is to make the connections between these levels explicit in order to gain improved insight into multi-agent learning. I demonstrate the usefulness of this framework with the general and widespread class of temporal-difference reinforcement learning. I find that its deterministic dynamical systems description follows a minimum free-energy principle and unifies a boundedly rational account of game theory with decision-making under uncertainty. I then propose an on-line sample-batch temporal-difference algorithm which is characterized by the combination of applying a memory-batch and separated state-action value estimation. I find that this algorithm serves as a micro-foundation of the deterministic learning equations by showing that its learning trajectories approach the ones of the deterministic learning equations under large batch sizes. Ultimately, this framework of embedding a dynamical systems description into different abstraction levels gives guidance on how to unleash the full potential of the dynamical systems approach to multi-agent learning.


2021 ◽  
Vol 54 (5) ◽  
pp. 1-35
Author(s):  
Shubham Pateria ◽  
Budhitama Subagdja ◽  
Ah-hwee Tan ◽  
Chai Quek

Hierarchical Reinforcement Learning (HRL) enables autonomous decomposition of challenging long-horizon decision-making tasks into simpler subtasks. During the past years, the landscape of HRL research has grown profoundly, resulting in copious approaches. A comprehensive overview of this vast landscape is necessary to study HRL in an organized manner. We provide a survey of the diverse HRL approaches concerning the challenges of learning hierarchical policies, subtask discovery, transfer learning, and multi-agent learning using HRL. The survey is presented according to a novel taxonomy of the approaches. Based on the survey, a set of important open problems is proposed to motivate the future research in HRL. Furthermore, we outline a few suitable task domains for evaluating the HRL approaches and a few interesting examples of the practical applications of HRL in the Supplementary Material.


2018 ◽  
Vol 19 (1) ◽  
pp. 154-175 ◽  
Author(s):  
Animesh DEBNATH ◽  
Abhirup BANDYOPADHYAY ◽  
Jagannath ROY ◽  
Samarjit KAR

A novel methodology based on evolutionary game theory is proposed to study the long-term evolution of multi-agent multi-criteria decision making (MCDM) and to obtain sustainable decisions. In this paper, multi-agent MCDM is represented as an evolutionary game, and the evolutionary strategies are defined as sustainable decisions. Here we consider the problem of decision making in the Indian tea industry. The agents in this game are essentially Indian tea estate owners and the Indian Tea Board. The replicator dynamics of the evolutionary game are studied to obtain evolutionary strategies, which could be defined as sustainable strategies. Multi-agent MCDM in the Indian tea industry is considered under different socio-political and corporate social responsibility scenarios and groups of the Indian tea industry. In addition, the impacts of imprecision and market volatility on the outcome of some strategies (decisions) are studied. The imprecision in the impact of the strategies is modelled as fuzzy numbers, whereas market volatility is taken into account as white noise. Hence the MCDM problem for the Indian tea industry is modelled as a hybrid evolutionary game. The probabilities of strategies are obtained by solving the hybrid evolutionary game and can be represented as a Dempster-Shafer belief structure. The simulation results help decision makers choose strategies (decisions) under different types of uncertainty.


2017 ◽  
Vol 4 (3) ◽  
pp. 155-169 ◽  
Author(s):  
Trevor R. Caskey ◽  
James S. Wasek ◽  
Anna Y. Franz
