Playing Card-Based RTS Games with Deep Reinforcement Learning

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/631 ◽

2019 ◽

Author(s):

Tianyu Liu ◽

Zijie Zheng ◽

Hongchang Li ◽

Kaigui Bian ◽

Lingyang Song

Keyword(s):

Reinforcement Learning ◽

Decision Tree ◽

State Space ◽

Imperfect Information ◽

Rule Based ◽

Playing Card ◽

Board Games ◽

Deep Model ◽

Game Ai ◽

Rts Game

Game AI is of great importance as games are simulations of reality. Recent research on game AI has shown much progress in various kinds of games, such as console games, board games and MOBA games. However, the exploration in RTS games remains a challenge for their huge state space, imperfect information, sparse rewards and various strategies. Besides, the typical card-based RTS games have complex card features and are still lacking solutions. We present a deep model SEAT (selection-attention) to play card-based RTS games. The SEAT model includes two parts, a selection part for card choice and an attention part for card usage, and it learns from scratch via deep reinforcement learning. Comprehensive experiments are performed on Clash Royale, a popular mobile card-based RTS game. Empirical results show that the SEAT model agent makes it to reach a high winning rate against rule-based agents and decision-tree-based agent.

Get full-text (via PubEx)

Self Organizing Decision Tree Based on Reinforcement Learning and its Application on State Space Partition

2006 IEEE International Conference on Systems, Man and Cybernetics ◽

10.1109/icsmc.2006.385115 ◽

2006 ◽

Cited By ~ 3

Author(s):

Kao-Shing Hwang ◽

Tsung-Wen Yang ◽

Chia-Ju Lin

Keyword(s):

Reinforcement Learning ◽

Decision Tree ◽

State Space ◽

Space Partition ◽

Self Organizing

Get full-text (via PubEx)

Reinforcement learning vs. rule-based adaptive traffic signal control: A Fourier basis linear function approximation for traffic signal control

AI Communications ◽

10.3233/aic-201580 ◽

2021 ◽

pp. 1-15

Author(s):

Theresa Ziemke ◽

Lucas N. Alegre ◽

Ana L.C. Bazzan

Keyword(s):

Reinforcement Learning ◽

State Space ◽

Function Approximation ◽

Traffic Signals ◽

The State ◽

Signal Control ◽

Traffic Signal Control ◽

Rule Based ◽

Fourier Basis ◽

Linear Function Approximation

Reinforcement learning is an efficient, widely used machine learning technique that performs well when the state and action spaces have a reasonable size. This is rarely the case regarding control-related problems, as for instance controlling traffic signals. Here, the state space can be very large. In order to deal with the curse of dimensionality, a rough discretization of such space can be employed. However, this is effective just up to a certain point. A way to mitigate this is to use techniques that generalize the state space such as function approximation. In this paper, a linear function approximation is used. Specifically, SARSA ( λ ) with Fourier basis features is implemented to control traffic signals in the agent-based transport simulation MATSim. The results are compared not only to trivial controllers such as fixed-time, but also to state-of-the-art rule-based adaptive methods. It is concluded that SARSA ( λ ) with Fourier basis features is able to outperform such methods, especially in scenarios with varying traffic demands or unexpected events.

Get full-text (via PubEx)

Improving RTS Game AI by Supervised Policy Learning, Tactical Search, and Deep Reinforcement Learning

IEEE Computational Intelligence Magazine ◽

10.1109/mci.2019.2919363 ◽

2019 ◽

Vol 14 (3) ◽

pp. 8-18 ◽

Cited By ~ 4

Author(s):

Nicolas A. Barriga ◽

Marius Stanescu ◽

Felipe Besoain ◽

Michael Buro

Keyword(s):

Reinforcement Learning ◽

Policy Learning ◽

Game Ai ◽

Rts Game

Get full-text (via PubEx)

Q-Table compression for reinforcement learning

The Knowledge Engineering Review ◽

10.1017/s0269888918000280 ◽

2018 ◽

Vol 33 ◽

Cited By ~ 1

Author(s):

Leonardo Amado ◽

Felipe Meneguzzi

Keyword(s):

Reinforcement Learning ◽

State Space ◽

Real Time ◽

Prior Knowledge ◽

Q Value ◽

State Spaces ◽

Factor Problem ◽

Branching Factor ◽

Rts Game

AbstractReinforcement learning (RL) algorithms are often used to compute agents capable of acting in environments without prior knowledge of the environment dynamics. However, these algorithms struggle to converge in environments with large branching factors and their large resulting state-spaces. In this work, we develop an approach to compress the number of entries in a Q-value table using a deep auto-encoder. We develop a set of techniques to mitigate the large branching factor problem. We present the application of such techniques in the scenario of a real-time strategy (RTS) game, where both state space and branching factor are a problem. We empirically evaluate an implementation of the technique to control agents in an RTS game scenario where classical RL fails and provide a number of possible avenues of further work on this problem.

Get full-text (via PubEx)

Model-Based Reinforcement Learning for Partially Observable Games with Sampling-Based State Estimation

Neural Computation ◽

10.1162/neco.2007.19.11.3051 ◽

2007 ◽

Vol 19 (11) ◽

pp. 3051-3087 ◽

Cited By ~ 5

Author(s):

Hajime Fujita ◽

Shin Ishii

Keyword(s):

Reinforcement Learning ◽

State Space ◽

Imperfect Information ◽

Large Scale ◽

Computational Cost ◽

Sampling Technique ◽

Partial Observability ◽

Estimation And Prediction ◽

Model Based ◽

Partially Observable

Games constitute a challenging domain of reinforcement learning (RL) for acquiring strategies because many of them include multiple players and many unobservable variables in a large state space. The difficulty of solving such realistic multiagent problems with partial observability arises mainly from the fact that the computational cost for the estimation and prediction in the whole state space, including unobservable variables, is too heavy. To overcome this intractability and enable an agent to learn in an unknown environment, an effective approximation method is required with explicit learning of the environmental model. We present a model-based RL scheme for large-scale multiagent problems with partial observability and apply it to a card game, hearts. This game is a well-defined example of an imperfect information game and can be approximately formulated as a partially observable Markov decision process (POMDP) for a single learning agent. To reduce the computational cost, we use a sampling technique in which the heavy integration required for the estimation and prediction can be approximated by a plausible number of samples. Computer simulation results show that our method is effective in solving such a difficult, partially observable multiagent problem.

Get full-text (via PubEx)

Transfer Reinforcement Learning for Autonomous Driving

ACM Transactions on Modeling and Computer Simulation ◽

10.1145/3449356 ◽

2021 ◽

Vol 31 (3) ◽

pp. 1-26

Author(s):

Aravind Balakrishnan ◽

Jaeyoung Lee ◽

Ashish Gaurav ◽

Krzysztof Czarnecki ◽

Sean Sedwards

Keyword(s):

Decision Making ◽

Reinforcement Learning ◽

Transfer Problem ◽

Autonomous Driving ◽

High Fidelity ◽

Rule Based ◽

High Level ◽

Real Vehicle

Reinforcement learning (RL) is an attractive way to implement high-level decision-making policies for autonomous driving, but learning directly from a real vehicle or a high-fidelity simulator is variously infeasible. We therefore consider the problem of transfer reinforcement learning and study how a policy learned in a simple environment using WiseMove can be transferred to our high-fidelity simulator, W ise M ove . WiseMove is a framework to study safety and other aspects of RL for autonomous driving. W ise M ove accurately reproduces the dynamics and software stack of our real vehicle. We find that the accurately modelled perception errors in W ise M ove contribute the most to the transfer problem. These errors, when even naively modelled in WiseMove , provide an RL policy that performs better in W ise M ove than a hand-crafted rule-based policy. Applying domain randomization to the environment in WiseMove yields an even better policy. The final RL policy reduces the failures due to perception errors from 10% to 2.75%. We also observe that the RL policy has significantly less reliance on velocity compared to the rule-based policy, having learned that its measurement is unreliable.

Get full-text (via PubEx)

Reinforcement learning versus swarm intelligence for autonomous multi-HAPS coordination

SN Applied Sciences ◽

10.1007/s42452-021-04658-6 ◽

2021 ◽

Vol 3 (6) ◽

Author(s):

Ogbonnaya Anicho ◽

Philip B. Charlesworth ◽

Gurvinder S. Baicher ◽

Atulya K. Nagar

Keyword(s):

Reinforcement Learning ◽

State Space ◽

Swarm Intelligence ◽

Performance Indicators ◽

Convergence Rates ◽

Tuning Parameters ◽

Continuous State Space ◽

Continuous State ◽

User Coverage ◽

Better Than

AbstractThis work analyses the performance of Reinforcement Learning (RL) versus Swarm Intelligence (SI) for coordinating multiple unmanned High Altitude Platform Stations (HAPS) for communications area coverage. It builds upon previous work which looked at various elements of both algorithms. The main aim of this paper is to address the continuous state-space challenge within this work by using partitioning to manage the high dimensionality problem. This enabled comparing the performance of the classical cases of both RL and SI establishing a baseline for future comparisons of improved versions. From previous work, SI was observed to perform better across various key performance indicators. However, after tuning parameters and empirically choosing suitable partitioning ratio for the RL state space, it was observed that the SI algorithm still maintained superior coordination capability by achieving higher mean overall user coverage (about 20% better than the RL algorithm), in addition to faster convergence rates. Though the RL technique showed better average peak user coverage, the unpredictable coverage dip was a key weakness, making SI a more suitable algorithm within the context of this work.

Get full-text (via PubEx)