Algorithms or Actions? A Study in Large-Scale Reinforcement Learning

Author(s):  
Anderson Rocha Tavares ◽  
Sivasubramanian Anbalagan ◽  
Leandro Soriano Marcolino ◽  
Luiz Chaimowicz

Large state and action spaces are very challenging for reinforcement learning. However, in many domains there is a set of algorithms available, each of which estimates the best action given a state. Hence, agents can learn a performance-maximizing mapping either directly from states to actions, or from states to algorithms. We investigate several aspects of this dilemma, showing sufficient conditions under which learning over algorithms outperforms learning over actions for a finite number of training iterations. We present synthetic experiments to further study such systems. Finally, we propose a function approximation approach, demonstrating the effectiveness of learning over algorithms in real-time strategy games.
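
For concreteness, here is a minimal tabular sketch of the second option, learning over algorithms: the agent's choices are calls to pre-built heuristics rather than primitive actions, which shrinks the effective action space from |actions| to |algorithms|. The environment interface, `heuristic_value`, integer states, and the two example algorithms are illustrative assumptions, not the paper's setup.

```python
import random
from collections import defaultdict

N_ACTIONS = 100  # a large primitive action space

def heuristic_value(state, action):
    # stand-in scoring function (integer states for simplicity);
    # a real domain would plug in its own estimator
    return -abs(action - (state % N_ACTIONS))

# the available "algorithms": each maps a state to a primitive action
ALGORITHMS = {
    "greedy": lambda s: max(range(N_ACTIONS), key=lambda a: heuristic_value(s, a)),
    "random": lambda s: random.randrange(N_ACTIONS),
}

Q = defaultdict(float)  # Q[(state, algorithm_name)]

def select_algorithm(state, eps=0.1):
    """Epsilon-greedy over the small algorithm set instead of the
    large primitive action set."""
    if random.random() < eps:
        return random.choice(list(ALGORITHMS))
    return max(ALGORITHMS, key=lambda name: Q[(state, name)])

def q_update(state, algo, reward, next_state, done, alpha=0.1, gamma=0.99):
    """Standard Q-learning update, applied over (state, algorithm) pairs."""
    best_next = 0.0 if done else max(Q[(next_state, n)] for n in ALGORITHMS)
    Q[(state, algo)] += alpha * (reward + gamma * best_next - Q[(state, algo)])
```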

2013 ◽  
Vol 2013 ◽  
pp. 1-10
Author(s):  
Víctor Uc-Cetina

We introduce a reinforcement learning architecture designed for problems with an infinite number of states, where each state can be seen as a vector of real numbers, and a finite number of actions, where each action requires a vector of real numbers as parameters. The main objective of this architecture is to distribute the work required to learn the final policy between two actors: one actor decides which action must be performed, while a second actor determines the right parameters for the selected action. We tested our architecture and an algorithm based on it on the robot dribbling problem, a challenging robot control problem taken from the RoboCup competitions. Our experimental work with three different function approximators provides strong evidence that the proposed architecture can be used to implement fast, robust, and reliable reinforcement learning algorithms.
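
A minimal sketch of the two-actor split described above, assuming linear function approximators; the dimensions, the linear heads, and the action semantics are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

STATE_DIM, N_ACTIONS, PARAM_DIM = 4, 3, 2
rng = np.random.default_rng(0)

# actor 1: scores the finite set of discrete actions
W_action = rng.normal(scale=0.1, size=(N_ACTIONS, STATE_DIM))
# actor 2: one parameter head per action, emitting real-valued parameters
W_params = rng.normal(scale=0.1, size=(N_ACTIONS, PARAM_DIM, STATE_DIM))

def act(state):
    """Actor 1 picks which action to perform; actor 2 supplies the
    real-valued parameters for the chosen action."""
    scores = W_action @ state        # shape (N_ACTIONS,)
    a = int(np.argmax(scores))
    params = W_params[a] @ state     # shape (PARAM_DIM,)
    return a, params

state = rng.normal(size=STATE_DIM)
action, params = act(state)          # e.g. (kick, [direction, power])
```

Splitting the policy this way lets each actor learn in a smaller space: a discrete classification-like problem for actor 1, and a continuous regression-like problem, conditioned on the chosen action, for actor 2.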


Author(s):  
Levi H. S. Lelis

In this paper we review several planning algorithms developed for zero-sum games with exponential action spaces, i.e., spaces that grow exponentially with the number of game components that can act simultaneously at a given game state. For example, real-time strategy games have exponential action spaces because the number of available actions grows exponentially with the number of units the player controls. We also present a unifying perspective in which several existing algorithms can be described as instantiations of a variant of NaiveMCTS. In addition to describing several existing planning algorithms for exponential action spaces, we show that other instantiations of this variant of NaiveMCTS represent novel and promising algorithms for future study.
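
Below is a minimal sketch of naive sampling, the idea underlying NaiveMCTS: each unit's move is chosen by its own local bandit, ignoring interactions between units (the "naive" assumption), so a joint action is sampled in time linear in the number of units rather than by enumerating the exponential joint space. The data structures and the epsilon-greedy local policy are illustrative assumptions, and the global bandit over previously tried joint actions that full NaiveMCTS also maintains is omitted for brevity.

```python
import random
from collections import defaultdict

def make_stats(n_units):
    # local_stats[u][move] = [total_reward, visits] for unit u
    return [defaultdict(lambda: [0.0, 0]) for _ in range(n_units)]

def mean(stat):
    total, visits = stat
    return total / visits if visits else float("inf")  # prefer untried moves

def naive_sample(legal_moves, local_stats, eps=0.3):
    """Pick a joint action by choosing each unit's move from its own
    local bandit, independently of the other units' choices."""
    joint = []
    for u, moves in enumerate(legal_moves):
        if random.random() < eps:
            joint.append(random.choice(moves))
        else:
            joint.append(max(moves, key=lambda m: mean(local_stats[u][m])))
    return tuple(joint)

def update(local_stats, joint_action, reward):
    """Credit the playout reward to every unit's chosen move."""
    for u, move in enumerate(joint_action):
        local_stats[u][move][0] += reward
        local_stats[u][move][1] += 1
```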


2015 ◽  
Vol 54 ◽  
pp. 257-264 ◽  
Author(s):  
Harshit Sethy ◽  
Amit Patel ◽  
Vineet Padmanabhan

2009 ◽  
Vol 23 (9) ◽  
pp. 855-871 ◽  
Author(s):  
Kresten Toftgaard Andersen ◽  
Yifeng Zeng ◽  
Dennis Dahl Christensen ◽  
Dung Tran

2020 ◽  
Vol 34 (04) ◽  
pp. 6672-6679 ◽  
Author(s):  
Deheng Ye ◽  
Zhao Liu ◽  
Mingfei Sun ◽  
Bei Shi ◽  
Peilin Zhao ◽  
...  

We study the reinforcement learning problem of complex action control in Multi-player Online Battle Arena (MOBA) 1v1 games. This problem involves far more complicated state and action spaces than those of traditional 1v1 games, such as Go and Atari, which makes it very difficult to find policies with human-level performance. In this paper, we present a deep reinforcement learning framework that tackles this problem from both the system and algorithm perspectives. Our system features low coupling and high scalability, enabling efficient exploration at large scale. Our algorithm includes several novel strategies, among them control dependency decoupling, action mask, target attention, and dual-clip PPO, with which our proposed actor-critic network can be trained effectively in our system. Tested on the MOBA game Honor of Kings, the trained AI agents can defeat top professional human players in full 1v1 games.
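
Of the strategies listed, dual-clip PPO is the easiest to show in isolation: when the advantage is negative, the standard PPO clipped objective min(rA, clip(r, 1-ε, 1+ε)A) is unbounded below as the probability ratio r grows, so a second clip with a constant c > 1 caps it at cA. A sketch in PyTorch follows; the tensor names are illustrative and this is not the authors' released code.

```python
import torch

def dual_clip_ppo_loss(ratio, advantage, eps=0.2, c=3.0):
    """ratio = pi_new(a|s) / pi_old(a|s); advantage = estimated A(s, a)."""
    clipped = torch.clamp(ratio, 1 - eps, 1 + eps)
    standard = torch.min(ratio * advantage, clipped * advantage)  # vanilla PPO
    # the second clip only matters when the advantage is negative,
    # where `standard` alone would be unbounded below
    dual = torch.max(standard, c * advantage)
    objective = torch.where(advantage < 0, dual, standard)
    return -objective.mean()  # negate: optimizers minimize
```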


2021 ◽  
Author(s):  
Nanda Kishore Sreenivas ◽  
Shrisha Rao

In toy environments like video games, a reinforcement learning agent is deployed and operates within the same state space in which it was trained. However, in robotics applications such as industrial systems or autonomous vehicles, this cannot be guaranteed. A robot can be pushed out of its training space by some unforeseen perturbation, which may leave it in an unknown state from which it has not been trained to move towards its goal. While most prior work on RL safety focuses on ensuring safety during the training phase, this paper focuses on the safe deployment of a robot that has already been trained to operate within a safe space. This work defines a condition on the state and action spaces that, if satisfied, guarantees the robot's independent recovery to safety. We also propose a strategy and design that facilitate this recovery within a finite number of steps after a perturbation. This is implemented and tested against a standard RL model, and the results indicate much-improved performance.
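
A minimal sketch of such a deployment-time recovery loop, under assumed interfaces: `safe_set.contains` / `safe_set.nearest`, the env's `step(action) -> (state, done)` signature, and the proportional recovery controller are all illustrative, not the paper's design or its recovery condition.

```python
import numpy as np

def run_with_recovery(env, policy, safe_set, recovery_budget=50):
    """Deploy a trained policy; if a perturbation pushes the state outside
    the trained safe set, steer back toward the nearest safe state within
    a bounded number of steps."""
    state, done = env.reset(), False
    steps_left = recovery_budget
    while not done:
        if safe_set.contains(state):
            action = policy(state)           # normal trained behavior
            steps_left = recovery_budget     # reset the budget once safe
        else:
            if steps_left == 0:
                raise RuntimeError("recovery failed within step budget")
            steps_left -= 1
            target = safe_set.nearest(state)              # nearest known-safe state
            action = np.clip(target - state, -1.0, 1.0)   # proportional pull back
        state, done = env.step(action)
```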

