Combining reinforcement learning with rule-based controllers for transparent and general decision-making in autonomous driving

2020 ◽  
Vol 131 ◽  
pp. 103568
Author(s):  
Amarildo Likmeta ◽  
Alberto Maria Metelli ◽  
Andrea Tirinzoni ◽  
Riccardo Giol ◽  
Marcello Restelli ◽  
...  
2021 ◽  
Vol 31 (3) ◽  
pp. 1-26
Author(s):  
Aravind Balakrishnan ◽  
Jaeyoung Lee ◽  
Ashish Gaurav ◽  
Krzysztof Czarnecki ◽  
Sean Sedwards

Reinforcement learning (RL) is an attractive way to implement high-level decision-making policies for autonomous driving, but learning directly from a real vehicle or a high-fidelity simulator is variously infeasible. We therefore consider the problem of transfer reinforcement learning and study how a policy learned in a simple environment using WiseMove can be transferred to our high-fidelity simulator, WiseSim. WiseMove is a framework to study safety and other aspects of RL for autonomous driving. WiseSim accurately reproduces the dynamics and software stack of our real vehicle. We find that the accurately modelled perception errors in WiseSim contribute the most to the transfer problem. These errors, even when naively modelled in WiseMove, yield an RL policy that performs better in WiseSim than a hand-crafted rule-based policy. Applying domain randomization to the environment in WiseMove yields an even better policy. The final RL policy reduces the failures due to perception errors from 10% to 2.75%. We also observe that the RL policy relies significantly less on velocity than the rule-based policy, having learned that its measurement is unreliable.
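
The paper's tooling is not reproduced here, but the domain-randomization step it describes can be sketched as an observation wrapper that resamples a perception-error model every episode. A minimal sketch assuming a gymnasium-style environment; the wrapper name, the Gaussian error model, and `max_sigma` are illustrative assumptions, not WiseMove's actual perception model.

```python
import numpy as np
import gymnasium as gym

class PerceptionNoiseWrapper(gym.ObservationWrapper):
    """Corrupts observations with Gaussian noise whose scale is
    resampled each episode (domain randomization)."""

    def __init__(self, env, max_sigma=0.1):
        super().__init__(env)
        self.max_sigma = max_sigma
        self.sigma = 0.0

    def reset(self, **kwargs):
        # Draw a fresh noise level per episode so the policy cannot
        # overfit to any single perception-error model.
        self.sigma = np.random.uniform(0.0, self.max_sigma)
        return super().reset(**kwargs)

    def observation(self, obs):
        # Perturb the state estimate the way a noisy perception
        # stack (e.g. an unreliable velocity measurement) might.
        return obs + np.random.normal(0.0, self.sigma, size=np.shape(obs))
```

Training the RL policy inside such a wrapper is one plausible way to obtain the reduced reliance on unreliable measurements that the authors report.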


Author(s):  
Zhenhai Gao ◽  
Xiangtong Yan ◽  
Fei Gao ◽  
Lei He

Decision-making is one of the key components of research on longitudinal autonomous driving, and accounting for the behavior of human drivers when designing decision-making strategies is a current research hotspot. Among longitudinal decision-making strategies, traditional rule-based approaches are difficult to apply to complex scenarios. Current methods based on reinforcement learning and deep reinforcement learning construct reward functions around safety, comfort, and economy, yet the resulting decision strategies still differ considerably from those of human drivers. To address these problems, this paper uses driver behavior data to design the reward function of a deep reinforcement learning algorithm by fitting a BP (backpropagation) neural network, and applies the DQN and DDPG algorithms to build two driver-like longitudinal autonomous driving decision-making models. A simulation experiment compares the decisions of the two models against recorded driver curves. The results show that both algorithms can realize driver-like decision-making, and that the DDPG algorithm is more consistent with human driver behavior than the DQN algorithm and therefore performs better overall.
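
As a rough illustration of the reward-design step, the sketch below fits a small feedforward (BP) network to recorded driver data and exposes it as a reward for a DRL agent. The state layout (ego speed, gap, relative speed), the scalar driver-likeness target, and all sizes are assumptions for illustration, not the paper's specification.

```python
import torch
import torch.nn as nn

class RewardNet(nn.Module):
    """Maps a longitudinal driving state and an acceleration action to a
    scalar reward, fitted to driver behaviour data by backpropagation."""

    def __init__(self, state_dim=3, action_dim=1, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

def fit_reward(model, states, actions, targets, epochs=200, lr=1e-3):
    # Plain supervised fitting: this is the "BP neural network" part.
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(states, actions), targets).backward()
        opt.step()
    return model
```

Once fitted, `RewardNet` would replace a hand-designed safety/comfort/economy reward inside a standard DQN or DDPG training loop.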


Author(s):  
Junfeng Zhang ◽  
Qing Xue

In a tactical wargame, the decisions of the artificial intelligence (AI) commander are critical to the final combat result. Because of the fog of war, AI commanders face unknown and invisible information on the battlefield and an incomplete understanding of the situation, which makes it difficult to devise appropriate tactical strategies. Traditional knowledge- and rule-based decision-making methods lack flexibility and autonomy, so making flexible, autonomous decisions in complex battlefield situations remains a difficult problem. This paper aims to solve the AI commander's decision-making problem using deep reinforcement learning (DRL). We develop a tactical wargame as the research environment; it contains a built-in scripted AI and supports a machine-versus-machine combat mode. On this basis, we design an end-to-end actor-critic framework for commander decision-making that uses a convolutional neural network to represent the battlefield situation, and apply reinforcement learning to try different tactical strategies. Finally, we carry out a combat experiment between a DRL-based agent and a rule-based agent in a jungle terrain scenario. The results show that the AI commander using the actor-critic method successfully learns to achieve a higher score in the tactical wargame, and that the DRL-based agent attains a higher winning ratio than the rule-based agent.
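
The end-to-end design described above can be sketched as a shared convolutional encoder over a grid representation of the battlefield with separate actor and critic heads. Channel counts, grid size, and the discrete action set below are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class CommanderActorCritic(nn.Module):
    """Conv encoder over a (channels, H, W) situation map, with a policy
    head (action logits) and a value head (state value)."""

    def __init__(self, in_channels=4, grid=32, n_actions=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(),
        )
        feat = 64 * (grid // 4) * (grid // 4)
        self.actor = nn.Linear(feat, n_actions)   # tactical action logits
        self.critic = nn.Linear(feat, 1)          # situation value estimate

    def forward(self, situation):                 # situation: (B, C, H, W)
        z = self.encoder(situation)
        return self.actor(z), self.critic(z)
```

The actor's logits are sampled to try different tactical strategies, while the critic's value estimate drives the advantage-based policy update.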


2020 ◽  
Vol 3 (4) ◽  
pp. 374-385
Author(s):  
Guofa Li ◽  
Shenglong Li ◽  
Shen Li ◽  
Yechen Qin ◽  
Dongpu Cao ◽  
...  

2020 ◽  
Vol 5 (2) ◽  
pp. 294-305 ◽  
Author(s):  
Carl-Johan Hoel ◽  
Katherine Driggs-Campbell ◽  
Krister Wolff ◽  
Leo Laine ◽  
Mykel J. Kochenderfer

Author(s):  
Akifumi Wachi

We examine the problem of adversarial reinforcement learning in multi-agent domains that include a rule-based agent. Rule-based algorithms are required in safety-critical applications so that systems behave properly across a wide range of situations; hence, every effort is made to find failure scenarios during the development phase. However, as the software becomes more complicated, finding failure cases becomes difficult, and in multi-agent domains such as autonomous driving environments it is much harder still to find useful failure scenarios that help improve the algorithm. We propose a method for efficiently finding failure scenarios: it trains adversarial agents with multi-agent reinforcement learning so that the tested rule-based agent fails. We demonstrate the effectiveness of the proposed method in a simple environment and in an autonomous driving simulator.
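
The training scheme can be sketched as an episode loop in which the rule-based agent acts with its fixed policy while adversarial agents are rewarded precisely when it fails. The multi-agent environment interface, the `ego_failure` flag, and the reward constants are hypothetical; the paper's own simulator and MARL update are not shown.

```python
def adversarial_episode(env, rule_agent, adversaries, learner):
    """One episode of failure-scenario search against a fixed
    rule-based agent. All interfaces here are illustrative."""
    obs = env.reset()
    transitions, failed, done = [], False, False
    while not done:
        actions = {"ego": rule_agent.act(obs["ego"])}   # agent under test, fixed
        actions.update({k: adv.act(obs[k]) for k, adv in adversaries.items()})
        next_obs, _, done, info = env.step(actions)     # gym-like multi-agent step
        failed = failed or info.get("ego_failure", False)
        # Reward adversaries only on failure; the small step cost
        # pushes them toward scenarios that break the agent quickly.
        r = 1.0 if failed else -0.01
        for k in adversaries:
            transitions.append((obs[k], actions[k], r, next_obs[k], done))
        obs = next_obs
    learner.update(transitions)                         # any MARL update rule
    return failed
```

Episodes that return `True` are exactly the failure scenarios developers would otherwise have to find by hand.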


Author(s):  
Ruolan Zhang ◽  
Masao Furusho

Due to quality issues and errors in the data itself, historical automatic identification system (AIS) data is insufficient for predicting navigation risk at sea, but it is adequate for training decision-making neural networks. This paper presents a real AIS ship navigation environment with both rule-based and neural-based decision processes operating on frame-by-frame motion, and trains the decision network using a deep reinforcement learning algorithm. Rule-based decision-making has applications in adaptive systems, expert systems, and decision support systems; it also covers general ship navigation, which is regulated by the Convention on the International Regulations for Preventing Collisions at Sea (COLREGs). However, to achieve fully unmanned ship navigation on the open sea without any remote control, a rule-based decision-making system cannot be implemented alone. With growing data volumes, complex sea environments, and varied collision scenarios, agent-based decision-making has come to play an important role in transportation; for ships, combining rule-based and neural-based decision-making is the only viable option, and satisfying the development requirements of autonomous decision-making has become progressively more challenging. This study uses deep reinforcement learning to evaluate decision-making efficiency under different AIS data input shapes. The results show that the decision neural network trained with AIS data is robust and highly capable of achieving collision avoidance. Furthermore, the same methodology offers instructive guidance for processing radar, camera, and ENC data to address different risk-perception tasks in different scenarios. These findings have important implications for fully unmanned navigation.
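
As a small illustration of the input-shape question the study evaluates, the sketch below packs the most recent AIS frames for the nearest ships into a fixed-shape tensor for the decision network. The field layout and dimensions are assumptions for illustration, not the paper's encoding.

```python
import numpy as np

def ais_frames_to_input(tracks, n_ships=8, n_frames=4):
    """Builds a fixed-shape network input from decoded AIS data.
    `tracks` is a list of frames; each frame is a list of ship rows
    (lat, lon, speed over ground, course over ground)."""
    frames = np.zeros((n_frames, n_ships, 4), dtype=np.float32)
    for t, frame in enumerate(tracks[-n_frames:]):   # most recent frames
        for i, ship in enumerate(frame[:n_ships]):   # nearest ships first
            frames[t, i] = ship
    return frames  # e.g. flatten for an MLP, or keep 3-D for a conv net
```

Varying `n_ships`, `n_frames`, or the flattening scheme yields the different input shapes whose decision-making efficiency the study compares.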

