Enhanced Reinforcement Learning Method Combining One-Hot Encoding-Based Vectors for CNN-Based Alternative High-Level Decisions

Gomoku is a two-player board game that originated in ancient China. Gomoku-playing programs have been developed with various artificial intelligence techniques, such as genetic algorithms and tree search algorithms. Alpha-Gomoku, a Gomoku AI built with Alpha-Go’s algorithm, defines all possible situations on the Gomoku board using Monte-Carlo tree search (MCTS) and minimizes the probability of learning other correct answers in duplicated board situations. However, the accuracy of tree search algorithms drops because the classification criteria are set manually. In this paper, we propose an improved reinforcement learning-based approach to high-level decisions using convolutional neural networks (CNNs). The proposed algorithm expresses each state as One-Hot Encoding-based vectors and determines the state of the Gomoku board by combining similar states of these vectors. For cases where the position chosen by the CNN is already occupied or a stone cannot be placed there, we suggest a method for selecting an alternative. We verify the proposed Gomoku AI in GuPyEngine, a Python-based 3D simulation platform.
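The abstract does not give the encoding or the fallback rule in detail, so the following is only a minimal sketch of the two ideas it names: a one-hot representation of each board cell, and choosing an alternative when the preferred position is unavailable. The cell codes, the function names `one_hot_board` and `select_move`, and the `scores` grid (standing in for a CNN's per-cell output) are all assumptions made for illustration. The fallback shown here is a plain "best legal cell" rule, not the paper's method of combining similar one-hot states.

```python
# Hypothetical cell codes; the paper does not specify its encoding.
EMPTY, BLACK, WHITE = 0, 1, 2


def one_hot_board(board):
    """Encode every cell as a 3-element one-hot vector [empty, black, white]."""
    return [
        [[1 if cell == code else 0 for code in (EMPTY, BLACK, WHITE)] for cell in row]
        for row in board
    ]


def select_move(board, scores):
    """Return the highest-scoring empty (row, col), or None if the board is full.

    `scores` is a 2D grid with the same shape as `board`; it stands in for a
    CNN's per-cell preference. Occupied cells are skipped, so the next-best
    legal cell acts as the alternative choice.
    """
    best_move = None
    best_score = float("-inf")
    for r, row in enumerate(board):
        for c, cell in enumerate(row):
            if cell == EMPTY and scores[r][c] > best_score:
                best_move, best_score = (r, c), scores[r][c]
    return best_move


# Tiny 3x3 example: the top-scoring cell (1, 1) is already occupied.
example_board = [
    [EMPTY, EMPTY, EMPTY],
    [EMPTY, BLACK, EMPTY],
    [EMPTY, EMPTY, WHITE],
]
example_scores = [
    [0.05, 0.30, 0.05],
    [0.05, 0.40, 0.05],
    [0.05, 0.04, 0.01],
]
encoded = one_hot_board(example_board)
chosen = select_move(example_board, example_scores)
```

In this example the occupied centre cell is skipped and the next-best empty cell is chosen instead.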


A reinforcement learning application of a guided Monte Carlo Tree Search algorithm for beam orientation selection in radiation therapy

Machine Learning: Science and Technology ◽ 10.1088/2632-2153/abe528 ◽ 2021

Author(s): Azar Sadeghnejad Barkousaraie ◽ Gyanendra Bohara ◽ Steve B Jiang ◽ Dan Nguyen

Keyword(s): Monte Carlo ◽ Radiation Therapy ◽ Reinforcement Learning ◽ Search Algorithm ◽ Tree Search ◽ Orientation Selection ◽ Monte Carlo Tree Search ◽ Beam Orientation ◽ Tree Search Algorithm


Deep learning inspired routing in ICN using Monte Carlo Tree Search algorithm

Journal of Parallel and Distributed Computing ◽ 10.1016/j.jpdc.2020.12.014 ◽ 2021

Author(s): Nitul Dutta ◽ Shobhit K. Patel ◽ Vadim Samusenkov ◽ Vigneswaran D.

Keyword(s): Monte Carlo ◽ Deep Learning ◽ Search Algorithm ◽ Tree Search ◽ Monte Carlo Tree Search ◽ Tree Search Algorithm


Gesture commands for controlling high-level UAV behavior

SN Applied Sciences ◽ 10.1007/s42452-021-04583-8 ◽ 2021 ◽ Vol 3 (6)

Author(s): John Akagi ◽ T. Devon Morris ◽ Brady Moon ◽ Xingguang Chen ◽ Cameron K. Peterson

Keyword(s): Search Algorithm ◽ Dynamic Environment ◽ List Type ◽ Hardware In The Loop ◽ Monte Carlo Tree Search ◽ Natural Interface ◽ Constrained Environments ◽ Novel Variant ◽ High Level ◽ Tree Search Algorithm

Abstract

Directing groups of unmanned air vehicles (UAVs) is a task that typically requires the full attention of several operators. This can be prohibitive in situations where an operator must pay attention to their surroundings. In this paper we present a gesture device that assists operators in commanding UAVs in focus-constrained environments. The operator influences the UAVs’ behavior by using intuitive hand gestures. Gestures are captured using an accelerometer and gyroscope and then classified using a logistic regression model. Ten gestures were chosen to provide behaviors for a group of fixed-wing UAVs. These behaviors specified various searching, following, and tracking patterns that could be used in a dynamic environment. A novel variant of the Monte Carlo Tree Search algorithm was developed to autonomously plan the paths of the cooperating UAVs. These autonomy algorithms were executed when their corresponding gesture was recognized by the gesture device. The gesture device was trained to classify the ten gestures and accurately identified them 95% of the time. Each of the behaviors associated with the gestures was tested in hardware-in-the-loop simulations, and the ability to dynamically switch between them was demonstrated. The results show that the system can be used as a natural interface to assist an operator in directing a fleet of UAVs.

Article highlights

A gesture device was created that enables operators to command a group of UAVs in focus-constrained environments.

Each gesture triggers high-level commands that direct a UAV group to execute complex behaviors.

Software simulations and hardware-in-the-loop testing show the device is effective in directing UAV groups.
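The gesture-to-behavior pipeline in this abstract (sensor features, logistic regression, then a triggered behavior) can be illustrated with a small sketch. Everything specific below is assumed for illustration and does not come from the paper: the two gesture labels, the behavior names, the weights, the two-element feature vector, and the confidence threshold. The classifier is written as multinomial logistic regression (a softmax over linear scores), which is one common form of the model the abstract names.

```python
import math


def softmax(scores):
    """Convert raw linear scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_gesture(features, weights, biases, labels):
    """Return (label, probability) of the most likely gesture.

    `weights` holds one row of coefficients per gesture class; in practice
    these would be learned from labeled accelerometer and gyroscope data.
    """
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]


# Hypothetical gesture-to-behavior table (names invented for this sketch).
BEHAVIORS = {"circle": "search_area", "point": "follow_target"}


def behavior_for(features, weights, biases, labels, min_confidence=0.6):
    """Map sensor features to a behavior, or None if the classifier is unsure."""
    label, prob = classify_gesture(features, weights, biases, labels)
    return BEHAVIORS[label] if prob >= min_confidence else None


# Toy parameters: two features, two gesture classes.
demo_labels = ["circle", "point"]
demo_weights = [[1.0, 0.0], [0.0, 1.0]]
demo_biases = [0.0, 0.0]
demo_label, demo_prob = classify_gesture([2.0, 0.5], demo_weights, demo_biases, demo_labels)
demo_behavior = behavior_for([2.0, 0.5], demo_weights, demo_biases, demo_labels)
```

Returning `None` below a confidence threshold is a design choice of this sketch, so that an ambiguous gesture does not switch the group's behavior; the abstract does not say how the authors handle low-confidence classifications.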
