Convergent multiple-timescales reinforcement learning algorithms in normal form games

In this review, we analyze the most widely used multi-agent reinforcement learning algorithms. Starting from single-agent reinforcement learning algorithms, we focus on the most critical issues that must be taken into account when extending them to multi-agent scenarios. The analyzed algorithms are grouped according to their features. We present a detailed taxonomy of the main multi-agent approaches proposed in the literature, focusing on their underlying mathematical models. For each algorithm, we describe its possible application fields and point out its pros and cons. The described multi-agent algorithms are compared in terms of the most important characteristics for multi-agent reinforcement learning applications, namely nonstationarity, scalability, and observability. We also describe the most common benchmark environments used to evaluate the performance of the considered methods.

On the existence of Pareto undominated mixed-strategy Nash equilibrium in normal-form games with infinite actions

Economics Letters ◽ 10.1016/j.econlet.2021.109771 ◽ 2021 ◽ Vol 201 ◽ pp. 109771

Author(s): Haifeng Fu

Keyword(s): Nash Equilibrium ◽ Normal Form ◽ Mixed Strategy ◽ Normal Form Games ◽ Mixed Strategy Nash Equilibrium ◽ Strategy Nash Equilibrium

Benchmarking reinforcement learning algorithms for demand response applications

2020 IEEE PES Innovative Smart Grid Technologies Europe (ISGT-Europe) ◽ 10.1109/isgt-europe47291.2020.9248800 ◽ 2020

Author(s): Brida V. Mbuwir ◽ Carlo Manna ◽ Fred Spiessens ◽ Geert Deconinck

Keyword(s): Reinforcement Learning ◽ Demand Response ◽ Learning Algorithms

Reinforcement Learning Algorithms: Analysis and Applications

10.1007/978-3-030-41188-6 ◽ 2021

Keyword(s): Reinforcement Learning ◽ Learning Algorithms

Experimental evaluation of model-free reinforcement learning algorithms for continuous HVAC control

Applied Energy ◽ 10.1016/j.apenergy.2021.117164 ◽ 2021 ◽ Vol 298 ◽ pp. 117164

Author(s): Marco Biemann ◽ Fabian Scheller ◽ Xiufeng Liu ◽ Lizhen Huang

Keyword(s): Reinforcement Learning ◽ Experimental Evaluation ◽ Learning Algorithms ◽ Model Free ◽ Hvac Control

Synthetic Experiences for Accelerating DQN Performance in Discrete Non-Deterministic Environments

Algorithms ◽ 10.3390/a14080226 ◽ 2021 ◽ Vol 14 (8) ◽ pp. 226

Author(s): Wenzel Pilar von Pilchau ◽ Anthony Stein ◽ Jörg Hähner

Keyword(s): Reinforcement Learning ◽ State Of The Art ◽ Learning Algorithms ◽ Weighted Average ◽ Up States ◽ Experience Replay

State-of-the-art deep reinforcement learning algorithms such as DQN and DDPG use a replay buffer, a concept known as Experience Replay. By default, the buffer contains only the experiences gathered during runtime. We propose a method called Interpolated Experience Replay that uses stored (real) transitions to create synthetic ones to assist the learner. In this first approach to the field, we limit ourselves to discrete and non-deterministic environments and use a simple equally weighted average of the reward in combination with observed follow-up states. We demonstrate a significantly improved overall mean average compared to a DQN network with vanilla Experience Replay on the discrete and non-deterministic FrozenLake8x8-v0 environment.
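
The interpolation idea in this abstract can be illustrated with a minimal Python sketch. It is one plausible reading of the abstract, not the authors' implementation: for every stored state-action pair it takes the equally weighted average of the observed rewards and pairs that average with each distinct follow-up state that was actually observed. The names `Transition` and `synthesize` are hypothetical, and terminal flags, buffer capacity, and sampling are left out for brevity.

```python
from collections import defaultdict
from typing import Hashable, Iterable, List, NamedTuple


class Transition(NamedTuple):
    """A single (state, action, reward, next_state) experience."""
    state: Hashable
    action: Hashable
    reward: float
    next_state: Hashable


def synthesize(real_transitions: Iterable[Transition]) -> List[Transition]:
    """Create synthetic transitions from stored real ones (illustrative only).

    Assumption: one synthetic transition is emitted per distinct follow-up
    state observed for a given (state, action) pair, carrying the plain
    arithmetic mean of all rewards observed for that pair.
    """
    rewards = defaultdict(list)    # (state, action) -> list of observed rewards
    followups = defaultdict(list)  # (state, action) -> distinct next states, in order seen

    for t in real_transitions:
        key = (t.state, t.action)
        rewards[key].append(t.reward)
        if t.next_state not in followups[key]:
            followups[key].append(t.next_state)

    synthetic = []
    for (state, action), observed in rewards.items():
        mean_reward = sum(observed) / len(observed)  # equally weighted average
        for next_state in followups[(state, action)]:
            synthetic.append(Transition(state, action, mean_reward, next_state))
    return synthetic


# Example: a slippery grid cell where action 1 in state 0 led to two
# different follow-up states across three real experiences.
real = [
    Transition(0, 1, 0.0, 1),
    Transition(0, 1, 1.0, 8),
    Transition(0, 1, 0.0, 1),
]
synthetic = synthesize(real)
```

In this example the three real experiences yield two synthetic transitions, one per observed follow-up state (1 and 8), each with the averaged reward 1/3. In a DQN setup such transitions would presumably be sampled alongside the real ones during training; how the two buffers are mixed is not specified in the abstract.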
