Q-Table compression for reinforcement learning

Author(s):  
Leonardo Amado ◽  
Felipe Meneguzzi

Reinforcement learning (RL) algorithms are often used to compute agents capable of acting in environments without prior knowledge of the environment dynamics. However, these algorithms struggle to converge in environments with large branching factors and the correspondingly large state spaces that result. In this work, we develop an approach to compress the number of entries in a Q-value table using a deep auto-encoder, along with a set of techniques to mitigate the large-branching-factor problem. We apply these techniques to a real-time strategy (RTS) game, where both the state space and the branching factor are problematic, empirically evaluate an implementation of the technique for controlling agents in an RTS scenario where classical RL fails, and identify a number of avenues for further work on this problem.
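
The abstract does not give the architecture, but the core idea can be sketched as training an auto-encoder on the rows of a learned Q-table and keeping only the latent codes, decoding a row back on demand for action selection. The sizes, layer widths, and training loop below are illustrative assumptions (in PyTorch), not the paper's implementation:

```python
# Minimal sketch: compress Q-table rows with a deep auto-encoder (assumed sizes).
import torch
import torch.nn as nn

N_STATES, N_ACTIONS, LATENT = 10_000, 64, 8   # illustrative, not from the paper

q_table = torch.rand(N_STATES, N_ACTIONS)     # stand-in for a learned Q-table

class QAutoEncoder(nn.Module):
    def __init__(self, n_actions, latent):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_actions, 32), nn.ReLU(),
                                     nn.Linear(32, latent))
        self.decoder = nn.Sequential(nn.Linear(latent, 32), nn.ReLU(),
                                     nn.Linear(32, n_actions))
    def forward(self, x):
        return self.decoder(self.encoder(x))

model = QAutoEncoder(N_ACTIONS, LATENT)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(200):                      # reconstruction training loop
    opt.zero_grad()
    loss = loss_fn(model(q_table), q_table)
    loss.backward()
    opt.step()

# Store only the latent codes; decode a row on demand for action selection.
codes = model.encoder(q_table).detach()       # N_STATES x LATENT
recovered_q = model.decoder(codes[42])        # approximate Q-values of state 42
best_action = recovered_q.argmax().item()
```

In this sketch, storing `codes` instead of the full table trades some approximation error in the recovered Q-values for an 8x reduction in row width.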

Author(s):  
Tianyu Liu ◽  
Zijie Zheng ◽  
Hongchang Li ◽  
Kaigui Bian ◽  
Lingyang Song

Game AI is of great importance, as games are simulations of reality. Recent research on game AI has made much progress in various kinds of games, such as console games, board games and MOBA games. However, RTS games remain a challenge because of their huge state spaces, imperfect information, sparse rewards and diverse strategies. Moreover, typical card-based RTS games have complex card features and still lack effective solutions. We present a deep model, SEAT (selection-attention), to play card-based RTS games. The SEAT model has two parts, a selection part for card choice and an attention part for card usage, and it learns from scratch via deep reinforcement learning. Comprehensive experiments are performed on Clash Royale, a popular mobile card-based RTS game. Empirical results show that the SEAT agent reaches a high winning rate against both rule-based and decision-tree-based agents.
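
A minimal sketch of what a two-headed selection-attention policy of this kind might look like is shown below; the feature sizes, layer widths, and tensor shapes are all assumptions, since the abstract does not describe the architecture:

```python
# Hypothetical sketch of a SEAT-style policy: select a card, then place it.
import torch
import torch.nn as nn

N_CARDS, CARD_FEAT, STATE_FEAT, GRID = 4, 16, 32, 18 * 32  # assumed sizes

class SeatPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        self.state_enc = nn.Linear(STATE_FEAT, 64)
        self.card_enc = nn.Linear(CARD_FEAT, 64)
        self.place_head = nn.Linear(64 + 64, GRID)   # attention over board cells

    def forward(self, state, cards):
        s = torch.relu(self.state_enc(state))              # (B, 64)
        c = torch.relu(self.card_enc(cards))               # (B, N_CARDS, 64)
        # Selection head: score each card in hand against the state encoding.
        select_logits = torch.einsum('bd,bnd->bn', s, c)   # (B, N_CARDS)
        # Attention head: where to deploy the (greedily) chosen card.
        chosen = c[torch.arange(c.size(0)), select_logits.argmax(-1)]
        place_logits = self.place_head(torch.cat([s, chosen], dim=-1))
        return select_logits, place_logits

policy = SeatPolicy()
sel, place = policy(torch.rand(1, STATE_FEAT), torch.rand(1, N_CARDS, CARD_FEAT))
```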


Author(s):  
Lin Sun ◽  
Peng Jiao ◽  
Kai Xu ◽  
Quanjun Yin ◽  
Yabing Zha

Real-time strategy (RTS) games pose many challenges for AI research because of their large state spaces, enormous branching factors, limited decision time and dynamic adversarial environments. To tackle these problems, Adversarial Hierarchical Task Network planning (AHTN) has been proposed and achieves favorable performance. However, the HTN description it uses cannot express complex relationships among tasks or the impact of the environment on tasks. Moreover, AHTN cannot handle task failures during plan execution. In this paper, we propose a modified AHTN planning algorithm named AHTNR. The algorithm introduces three elements, essential task, phase and exit condition, to extend the HTN description. To deal with possible task failures, AHTNR first uses the extended HTN description to identify failed tasks, and then applies a novel task-repair strategy based on historical information to maintain the validity of the previous plan. Finally, empirical results are presented for the μRTS game, comparing AHTNR to state-of-the-art search algorithms for RTS games.
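
A toy sketch of how the three new elements might drive execution-time repair is given below; the `Task` fields, the placeholder executor, and the `repair` callback are hypothetical stand-ins for the paper's extended HTN description:

```python
# Hypothetical sketch: exit conditions and task repair in an AHTNR-style plan.
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Task:
    name: str
    essential: bool                          # plan fails if this task fails
    exit_condition: Callable[[dict], bool]   # true when the task is satisfied
    phase: int = 0                           # ordering constraint among siblings

def run(task: Task, world: dict) -> bool:
    """Placeholder executor: a task succeeds when its exit condition holds."""
    return task.exit_condition(world)

def execute_plan(plan: List[Task], world: dict,
                 repair: Callable[[Task, dict], Optional[List[Task]]]) -> bool:
    for task in sorted(plan, key=lambda t: t.phase):   # respect phase order
        if run(task, world):
            continue
        if not task.essential:
            continue                         # tolerate non-essential failures
        patched = repair(task, world)        # history-based repair of the subplan
        if patched is None or not execute_plan(patched, world, repair):
            return False                     # essential task irreparable: replan
    return True
```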


Author(s):  
Levi H. S. Lelis

In this paper we introduce Stratified Strategy Selection (SSS), a novel search algorithm for micromanaging units in real-time strategy (RTS) games. SSS uses a type system to partition the player's units into types and assumes that units of the same type must follow the same strategy. SSS searches the state space induced by the type system to select, from a pool of options, a strategy for each unit. Empirical results on a simulator of an RTS game show that SSS, employing either fixed or adaptive type systems, substantially outperforms state-of-the-art search-based algorithms in combat scenarios with up to 100 units.
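
The key saving is that the search space shrinks from one strategy choice per unit to one per type. The sketch below enumerates that reduced space exhaustively for clarity (the published SSS searches it more cleverly); `type_of` and `evaluate` are hypothetical callbacks:

```python
# Hypothetical sketch: one strategy per unit *type* instead of per unit.
from itertools import product

def sss(units, strategies, type_of, evaluate):
    """Search the type-induced space for the best strategy assignment.

    units:      list of unit ids
    strategies: pool of candidate strategies for each type
    type_of:    maps a unit to its type under the chosen type system
    evaluate:   playout-based score for a full unit->strategy assignment
    """
    types = sorted({type_of(u) for u in units})
    best_score, best_assignment = float('-inf'), None
    # The space is |strategies|^|types|, not |strategies|^|units|.
    for combo in product(strategies, repeat=len(types)):
        by_type = dict(zip(types, combo))
        assignment = {u: by_type[type_of(u)] for u in units}
        score = evaluate(assignment)
        if score > best_score:
            best_score, best_assignment = score, assignment
    return best_assignment
```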


2020 ◽  
Vol 34 (10) ◽  
pp. 13849-13850
Author(s):  
Donghyeon Lee ◽  
Man-Je Kim ◽  
Chang Wook Ahn

In the real-time strategy (RTS) game StarCraft II, players need to know the consequences of their actions before making a decision in combat. We propose a combat outcome predictor that utilizes terrain information as well as squad information. To train the model, we generated a StarCraft II combat dataset by simulating diverse, large-scale combat situations. The overall accuracy of our model was 89.7%. Our predictor can be integrated into artificial intelligence agents for RTS games as a short-term decision-making module.
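
The abstract does not detail the network, but one plausible shape for a predictor that fuses terrain and squad information is a small convolutional encoder for the terrain map concatenated with an MLP over squad features; everything below (input sizes, layer widths) is assumed for illustration:

```python
# Hypothetical sketch: terrain + squad combat-outcome predictor.
import torch
import torch.nn as nn

TERRAIN_HW, SQUAD_FEAT = 32, 20   # assumed input sizes

class CombatPredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.terrain = nn.Sequential(            # terrain as a 1-channel map
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten())
        self.squads = nn.Sequential(             # both sides' squad features
            nn.Linear(2 * SQUAD_FEAT, 64), nn.ReLU())
        flat = 16 * (TERRAIN_HW // 2) ** 2
        self.head = nn.Linear(flat + 64, 1)      # P(side A wins the combat)

    def forward(self, terrain, squads):
        x = torch.cat([self.terrain(terrain), self.squads(squads)], dim=-1)
        return torch.sigmoid(self.head(x))

model = CombatPredictor()
p_win = model(torch.rand(1, 1, TERRAIN_HW, TERRAIN_HW),
              torch.rand(1, 2 * SQUAD_FEAT))
```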


Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3864
Author(s):  
Tarek Ghoul ◽  
Tarek Sayed

Speed advisories are used on highways to inform vehicles of upcoming changes in traffic conditions and to apply a variable speed limit that reduces traffic conflicts and delays. This study applies a similar concept to signalized intersections, using connected vehicles to provide dynamic speed advisories in real time that guide vehicles towards an optimum speed. Real-time safety evaluation models for signalized intersections that depend on dynamic traffic parameters, such as traffic volume and shock wave characteristics, were used for this purpose. The proposed algorithm combines a rule-based approach with Deep Deterministic Policy Gradient (DDPG) reinforcement learning to assign ideal speeds to connected vehicles at intersections and improve safety. The system was tested on two intersections using real-world data and yielded an average reduction in traffic conflicts ranging from 9% to 23%. Further analysis showed that the algorithm yields tangible results even at lower market penetration rates (MPR). The algorithm was also tested on the same intersection under different traffic volume conditions, as well as on another intersection with different physical constraints and characteristics. The proposed algorithm provides a low-cost approach that is not computationally intensive and optimizes for safety by reducing rear-end traffic conflicts.
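
A minimal sketch of the hybrid idea, a learned DDPG actor whose advised speed is bounded by explicit rules, might look like the following; the state features, speed bounds, and the rule shown are assumptions rather than the study's actual design:

```python
# Hypothetical sketch: rule-based guard around a DDPG actor's speed advisory.
import torch
import torch.nn as nn

STATE_DIM = 8                 # assumed: volume, shock-wave speed, phase, ...
V_MIN, V_MAX = 20.0, 60.0     # assumed legal speed bounds (km/h)

actor = nn.Sequential(        # DDPG actor: state -> action in [-1, 1]
    nn.Linear(STATE_DIM, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, 1), nn.Tanh())

def advise_speed(state: torch.Tensor, red_phase: bool, queue_ahead: bool) -> float:
    a = actor(state).item()                          # learned action in [-1, 1]
    v = V_MIN + (a + 1.0) / 2.0 * (V_MAX - V_MIN)    # rescale to speed range
    # Rule-based layer: override the learned advisory in clear-cut cases
    # (the specific rule here is an invented example).
    if red_phase and queue_ahead:
        v = min(v, 0.6 * V_MAX)    # slow the approach toward a standing queue
    return v

v = advise_speed(torch.rand(STATE_DIM), red_phase=True, queue_ahead=True)
```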


2021 ◽  
Vol 3 (6) ◽  
Author(s):  
Ogbonnaya Anicho ◽  
Philip B. Charlesworth ◽  
Gurvinder S. Baicher ◽  
Atulya K. Nagar

This work analyses the performance of Reinforcement Learning (RL) versus Swarm Intelligence (SI) for coordinating multiple unmanned High Altitude Platform Stations (HAPS) for communications area coverage. It builds upon previous work that examined various elements of both algorithms. The main aim of this paper is to address the continuous state-space challenge by using partitioning to manage the high-dimensionality problem. This enabled a comparison of the classical cases of both RL and SI, establishing a baseline for future comparisons of improved versions. In previous work, SI was observed to perform better across various key performance indicators. However, even after tuning parameters and empirically choosing a suitable partitioning ratio for the RL state space, the SI algorithm maintained superior coordination capability, achieving higher mean overall user coverage (about 20% better than the RL algorithm) in addition to faster convergence rates. Though the RL technique showed better average peak user coverage, its unpredictable coverage dips were a key weakness, making SI the more suitable algorithm within the context of this work.
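
Partitioning a continuous state space for tabular RL typically means discretizing each state dimension into a fixed number of bins, the "partitioning ratio" tuned empirically in the paper. A generic sketch of the idea follows; the HAPS state bounds and bin counts here are invented for illustration:

```python
# Hypothetical sketch: partitioning a continuous state for tabular Q-learning.
import numpy as np

BINS_PER_DIM = 10                      # the empirically chosen partitioning ratio
STATE_LOW = np.array([0.0, 0.0])       # assumed bounds, e.g. (x, y) of a HAPS
STATE_HIGH = np.array([100.0, 100.0])
N_ACTIONS = 5

q_table = np.zeros((BINS_PER_DIM,) * len(STATE_LOW) + (N_ACTIONS,))

def discretize(state: np.ndarray) -> tuple:
    """Map a continuous state to a Q-table cell index."""
    ratio = (state - STATE_LOW) / (STATE_HIGH - STATE_LOW)
    idx = (ratio * BINS_PER_DIM).astype(int)
    return tuple(np.clip(idx, 0, BINS_PER_DIM - 1))

s = discretize(np.array([37.2, 81.5]))
best_action = int(q_table[s].argmax())
```

Coarser partitions keep the Q-table small but blur distinct states together; finer partitions recover resolution at the cost of slower convergence, which is the trade-off the paper tunes.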

