Provably Efficient Reinforcement Learning in Decentralized General-Sum Markov Games

Solving Cyber Alert Allocation Markov Games with Deep Reinforcement Learning

Lecture Notes in Computer Science - Decision and Game Theory for Security ◽

10.1007/978-3-030-32430-8_11 ◽

2019 ◽

pp. 164-183 ◽

Cited By ~ 1

Author(s):

Noah Dunstatter ◽

Alireza Tahsini ◽

Mina Guirguis ◽

Jelena Tešić

Keyword(s):

Reinforcement Learning ◽

Markov Games

Multi-agent reinforcement learning using ordinal action selection and approximate policy iteration

International Journal of Wavelets Multiresolution and Information Processing ◽

10.1142/s0219691316500533 ◽

2016 ◽

Vol 14 (06) ◽

pp. 1650053

Author(s):

Daxue Liu ◽

Jun Wu ◽

Xin Xu

Keyword(s):

Reinforcement Learning ◽

Single Agent ◽

Action Selection ◽

Policy Iteration ◽

Common Interest ◽

Policy Space ◽

Markov Games ◽

Approximate Policy Iteration ◽

Multi Agent ◽

Agent Coordination

Multi-agent reinforcement learning (MARL) provides a useful and flexible framework for multi-agent coordination in uncertain dynamic environments. However, the generalization ability and scalability of algorithms to large problem sizes, already problematic in single-agent RL, are an even more formidable obstacle in MARL applications. In this paper, a new MARL method based on ordinal action selection and approximate policy iteration, called OAPI (Ordinal Approximate Policy Iteration), is presented to address the scalability issue of MARL algorithms in common-interest Markov games. In OAPI, an ordinal action selection and learning strategy is integrated with distributed approximate policy iteration, not only to simplify the policy space and eliminate conflicts in multi-agent coordination, but also to approximate near-optimal policies for Markov games with large state spaces. Based on the policy space simplified by ordinal action selection, the OAPI algorithm implements distributed approximate policy iteration using online least-squares policy iteration (LSPI). This results in multi-agent coordination with good convergence properties and reduced computational complexity. The simulation results of a coordinated multi-robot navigation task illustrate the feasibility and effectiveness of the proposed approach.

Network Selection in 5G Networks Based on Markov Games and Friend-or-Foe Reinforcement Learning

2020 IEEE Wireless Communications and Networking Conference Workshops (WCNCW) ◽

10.1109/wcncw48565.2020.9124723 ◽

2020 ◽

Cited By ~ 1

Author(s):

Alessandro Giuseppi ◽

Emanuele De Santis ◽

Francesco Delli Priscoli ◽

Seok Ho Won ◽

Taesang Choi ◽

...

Keyword(s):

Reinforcement Learning ◽

Network Selection ◽

5G Networks ◽

Markov Games

Markov games as a framework for multi-agent reinforcement learning

Machine Learning Proceedings 1994 ◽

10.1016/b978-1-55860-335-6.50027-1 ◽

1994 ◽

pp. 157-163 ◽

Cited By ~ 531

Author(s):

Michael L. Littman

Keyword(s):

Reinforcement Learning ◽

Markov Games ◽

Multi Agent

A Unified Analysis of Value-Function-Based Reinforcement-Learning Algorithms

Neural Computation ◽

10.1162/089976699300016070 ◽

1999 ◽

Vol 11 (8) ◽

pp. 2017-2060 ◽

Cited By ~ 70

Author(s):

Csaba Szepesvári ◽

Michael L. Littman

Keyword(s):

Reinforcement Learning ◽

Value Function ◽

Learning Algorithm ◽

Learning Algorithms ◽

Sequential Decision ◽

Q Learning ◽

Markov Games ◽

Optimal Behavior ◽

Risk Sensitive ◽

Optimal Value

Reinforcement learning is the problem of generating optimal behavior in a sequential decision-making environment given the opportunity of interacting with it. Many algorithms for solving reinforcement-learning problems work by computing improved estimates of the optimal value function. We extend prior analyses of reinforcement-learning algorithms and present a powerful new theorem that can provide a unified analysis of such value-function-based reinforcement-learning algorithms. The usefulness of the theorem lies in how it allows the convergence of a complex asynchronous reinforcement-learning algorithm to be proved by verifying that a simpler synchronous algorithm converges. We illustrate the application of the theorem by analyzing the convergence of Q-learning, model-based reinforcement learning, Q-learning with multistate updates, Q-learning for Markov games, and risk-sensitive reinforcement learning.

Model-Based Reinforcement Learning for Alternating Markov Games

Lecture Notes in Computer Science - AI 2003: Advances in Artificial Intelligence ◽

10.1007/978-3-540-24581-0_44 ◽

2003 ◽

pp. 520-531

Author(s):

Drew Mellor

Keyword(s):

Reinforcement Learning ◽

Markov Games ◽

Model Based

QL2, a simple reinforcement learning scheme for two-player zero-sum Markov games

Neurocomputing ◽

10.1016/j.neucom.2008.12.022 ◽

2009 ◽

Vol 72 (7-9) ◽

pp. 1494-1507

Author(s):

Benoît Frénay ◽

Marco Saerens

Keyword(s):

Reinforcement Learning ◽

Markov Games ◽

Learning Scheme ◽

Zero Sum

On Reinforcement Learning for Turn-based Zero-sum Markov Games

Proceedings of the 2020 ACM-IMS on Foundations of Data Science Conference ◽

10.1145/3412815.3416888 ◽

2020 ◽

Author(s):

Devavrat Shah ◽

Varun Somani ◽

Qiaomin Xie ◽

Zhi Xu

Keyword(s):

Reinforcement Learning ◽

Markov Games ◽

Zero Sum

Traffic-signal control reinforcement learning approach for continuous-time Markov games

Engineering Applications of Artificial Intelligence ◽

10.1016/j.engappai.2019.103415 ◽

2020 ◽

Vol 89 ◽

pp. 103415

Author(s):

Román Aragon-Gómez ◽

Julio B. Clempner

Keyword(s):

Reinforcement Learning ◽

Continuous Time ◽

Traffic Signal ◽

Signal Control ◽

Traffic Signal Control ◽

Learning Approach ◽

Markov Games

Reinforcement Learning with an Extended Classifier System in Zero-sum Markov Games

2019 IEEE International Conference on Agents (ICA) ◽

10.1109/agents.2019.8929148 ◽

2019 ◽

Cited By ~ 1

Author(s):

Chang Wang ◽

Hao Chen ◽

Chao Yan ◽

Xiaojia Xiang

Keyword(s):

Reinforcement Learning ◽

Markov Games ◽

Classifier System ◽

Zero Sum
