HeuRL: A Heuristically Initialized Reinforcement Learning Method for Autonomous Driving Control Task

Author(s): Jiaxuan Xu, Jian Yuan

Author(s): Lixin Zou

Sequential decision-making in cost-sensitive tasks is prohibitively daunting, especially for problems that have a significant impact on people's daily lives, such as malaria control and treatment recommendation. The main challenge faced by policymakers is to learn a policy from scratch by interacting with a complex environment in a few trials. This work introduces a practical, data-efficient policy learning method, named Variance-Bonus Monte Carlo Tree Search (VB-MCTS), which can cope with very little data and facilitates learning from scratch in only a few trials. Specifically, the solution is a model-based reinforcement learning method. To avoid model bias, we apply Gaussian Process (GP) regression to estimate the transitions explicitly. With the GP world model, we propose a variance-bonus reward that measures uncertainty about the world. Adding this reward to MCTS planning results in more efficient and effective exploration. Furthermore, the derived polynomial sample complexity indicates that VB-MCTS is sample-efficient. Finally, outstanding performance in a competitive world-level RL competition and extensive experimental results verify its advantage over the state of the art on the challenging malaria control task.
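The variance-bonus idea lends itself to a compact illustration. Below is a minimal sketch, assuming a scikit-learn GP regressor as the world model and a simple additive bonus with a hypothetical weight beta; the kernel choice, the (state, action) interface, and the bonus weighting are illustrative assumptions rather than the authors' exact formulation.

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

class GPWorldModel:
    """GP regression over (state, action) -> next state, as in a model-based RL loop."""

    def __init__(self):
        # Kernel choice is an illustrative assumption, not the paper's exact setup.
        kernel = RBF(length_scale=1.0) + WhiteKernel(noise_level=1e-3)
        self.gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)

    def fit(self, states, actions, next_states):
        # states: (N, ds), actions: (N, da), next_states: (N, ds)
        self.gp.fit(np.hstack([states, actions]), next_states)

    def predict(self, state, action):
        # Returns the predicted next state and the mean predictive std,
        # which serves as the model-uncertainty signal.
        x = np.hstack([state, action]).reshape(1, -1)
        mean, std = self.gp.predict(x, return_std=True)
        return mean.ravel(), float(np.mean(std))

def variance_bonus_reward(env_reward, predictive_std, beta=1.0):
    # Augment the environment reward with an uncertainty bonus so that
    # MCTS planning is drawn toward poorly modelled regions; beta is a
    # hypothetical exploration weight.
    return env_reward + beta * predictive_std

In a planning rollout, each simulated transition would query the world model, add the bonus to the simulated reward, and back the augmented return up through the search tree.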


2009, Vol. 129 (7), pp. 1253-1263
Author(s): Toru Eguchi, Takaaki Sekiai, Akihiro Yamada, Satoru Shimizu, Masayuki Fukai

Author(s): Gokhan Demirkiran, Ozcan Erdener, Onay Akpinar, Pelin Demirtas, M. Yagiz Arik, ...

2021, Vol. 11 (3), pp. 1291
Author(s): Bonwoo Gu, Yunsick Sung

Gomoku is a two-player board game that originated in ancient China. Gomoku AI has been developed with various artificial intelligence techniques, such as genetic algorithms and tree search algorithms. Alpha-Gomoku, a Gomoku AI built with AlphaGo's algorithm, defines all possible situations on the Gomoku board using Monte Carlo tree search (MCTS) and minimizes the probability of learning other correct answers in duplicated board situations. However, the accuracy of the tree search algorithm drops because the classification criteria are set manually. In this paper, we propose an improved reinforcement learning-based high-level decision approach using convolutional neural networks (CNNs). The proposed algorithm expresses each state as a one-hot-encoded vector and determines the state of the Gomoku board by combining similar one-hot-encoded state vectors. For cases where the stone position determined by the CNN has already been occupied or cannot be played, we suggest a method for selecting an alternative move. We verify the proposed Gomoku AI method in GuPyEngine, a Python-based 3D simulation platform.
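As an illustration of the board encoding and the fallback move selection described above, here is a minimal sketch in Python/NumPy; the 15x15 board size, the three-channel layout, and the select_move helper are assumptions for illustration, not the paper's published implementation.

import numpy as np

BOARD_SIZE = 15  # a common Gomoku board size; the paper's setting may differ

def encode_board(board, current_player):
    # board: (15, 15) array with 0 = empty, 1 = black, 2 = white.
    # Returns a (3, 15, 15) one-hot tensor: own stones, opponent stones, empty cells.
    own = (board == current_player).astype(np.float32)
    opponent = ((board != 0) & (board != current_player)).astype(np.float32)
    empty = (board == 0).astype(np.float32)
    return np.stack([own, opponent, empty])

def select_move(move_probs, board):
    # move_probs: (15, 15) CNN output over board positions.
    # If the highest-scoring cell is already occupied (or otherwise unplayable),
    # fall back to the best remaining empty cell.
    legal = (board.reshape(-1) == 0)
    masked = np.where(legal, move_probs.reshape(-1), -np.inf)
    idx = int(np.argmax(masked))
    return divmod(idx, BOARD_SIZE)  # (row, col)

The one-hot channels feed the CNN, and masking occupied positions before the argmax plays the role of the alternative-selection step the abstract describes.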

