Tactical Decision-Making in Autonomous Driving by Reinforcement Learning with Uncertainty Estimation

Author(s):  
Carl-Johan Hoel ◽  
Krister Wolff ◽  
Leo Laine
2020 ◽  
Vol 5 (2) ◽  
pp. 294-305 ◽  
Author(s):  
Carl-Johan Hoel ◽  
Katherine Driggs-Campbell ◽  
Krister Wolff ◽  
Leo Laine ◽  
Mykel J. Kochenderfer

2021 ◽  
Vol 31 (3) ◽  
pp. 1-26
Author(s):  
Aravind Balakrishnan ◽  
Jaeyoung Lee ◽  
Ashish Gaurav ◽  
Krzysztof Czarnecki ◽  
Sean Sedwards

Reinforcement learning (RL) is an attractive way to implement high-level decision-making policies for autonomous driving, but learning directly from a real vehicle or a high-fidelity simulator is variously infeasible. We therefore consider the problem of transfer reinforcement learning and study how a policy learned in a simple environment using WiseMove can be transferred to our high-fidelity simulator, WiseSim. WiseMove is a framework to study safety and other aspects of RL for autonomous driving. WiseSim accurately reproduces the dynamics and software stack of our real vehicle. We find that the accurately modelled perception errors in WiseSim contribute the most to the transfer problem. These errors, even when naively modelled in WiseMove, yield an RL policy that performs better in WiseSim than a hand-crafted rule-based policy. Applying domain randomization to the environment in WiseMove yields an even better policy. The final RL policy reduces the failures due to perception errors from 10% to 2.75%. We also observe that the RL policy relies significantly less on velocity than the rule-based policy does, having learned that its measurement is unreliable.
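The domain randomization step mentioned in this abstract can be pictured as an observation wrapper that re-samples its perception-noise parameters at the start of every episode, so the policy never overfits to one noise regime. The sketch below is a minimal illustration assuming a Gym-style reset/step interface; the DomainRandomizedPerception name, the noise model, and the parameter ranges are illustrative assumptions, not the paper's implementation.

# Minimal sketch of domain-randomized perception noise (assumptions noted above).
import numpy as np

class DomainRandomizedPerception:
    """Wraps an environment and corrupts observations with noise whose
    parameters are re-sampled at every episode (domain randomization)."""

    def __init__(self, env, sigma_range=(0.0, 0.5), dropout_range=(0.0, 0.1)):
        self.env = env
        self.sigma_range = sigma_range      # Gaussian noise on continuous features
        self.dropout_range = dropout_range  # probability of losing a measurement

    def reset(self):
        # Sample a new noise regime for this episode.
        self.sigma = np.random.uniform(*self.sigma_range)
        self.p_drop = np.random.uniform(*self.dropout_range)
        return self._corrupt(self.env.reset())

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        return self._corrupt(obs), reward, done, info

    def _corrupt(self, obs):
        # Add Gaussian noise, then randomly drop individual measurements.
        obs = obs + np.random.normal(0.0, self.sigma, size=obs.shape)
        mask = np.random.random(obs.shape) < self.p_drop
        obs[mask] = 0.0  # a dropped measurement reads as zero
        return obs

Training against many sampled noise regimes in the simple environment is what lets the policy tolerate the accurately modelled perception errors of the high-fidelity simulator.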


Author(s):  
Zhenhai Gao ◽  
Xiangtong Yan ◽  
Fei Gao ◽  
Lei He

Decision-making is one of the key parts of research on longitudinal autonomous driving. Taking the behavior of human drivers into account when designing autonomous driving decision-making strategies is a current research hotspot. Among longitudinal decision-making strategies, traditional rule-based approaches are difficult to apply to complex scenarios. Existing methods based on reinforcement learning and deep reinforcement learning construct reward functions around safety, comfort, and economy, yet the resulting decision strategies still differ considerably from those of human drivers. To address these problems, this paper uses driver behavior data to design the reward function of a deep reinforcement learning algorithm, fitted with a BP neural network, and applies the deep reinforcement learning DQN and DDPG algorithms to establish two driver-like longitudinal autonomous driving decision-making models. Simulation experiments compare the decisions of the two models against the recorded driver curves. The results show that both algorithms can realize driver-like decision-making, that the DDPG algorithm is more consistent with human driver behavior than the DQN algorithm, and that the DDPG algorithm therefore performs better than the DQN algorithm.
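As a rough illustration of the approach this abstract describes, the sketch below fits a small feed-forward ("BP") network to recorded driver accelerations and then scores an agent's action by how closely it matches the fitted driver model. The state features, network size, and reward shaping are assumptions chosen for illustration, not the paper's actual design.

# Minimal sketch: fit a driver model by backpropagation, use it as a reward.
import torch
import torch.nn as nn

class DriverModel(nn.Module):
    def __init__(self, n_features=3):
        super().__init__()
        # Assumed state features: (ego speed, gap to lead vehicle, relative speed).
        self.net = nn.Sequential(
            nn.Linear(n_features, 32), nn.Tanh(),
            nn.Linear(32, 32), nn.Tanh(),
            nn.Linear(32, 1),  # predicted driver acceleration
        )

    def forward(self, state):
        return self.net(state)

def fit_driver_model(model, states, driver_accels, epochs=200, lr=1e-3):
    # states: (N, n_features) tensor; driver_accels: (N, 1) tensor of recorded data.
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(states), driver_accels)
        loss.backward()  # classic backpropagation fit of the reward model
        opt.step()
    return model

def driver_like_reward(model, state, action):
    # Reward is high when the agent's acceleration matches what the fitted
    # driver model would have done in the same state.
    with torch.no_grad():
        predicted = model(state)
    return -float((action - predicted).abs())

A reward of this shape can then be plugged into either DQN (with a discretized acceleration set) or DDPG (with continuous acceleration), which is the comparison the paper reports.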


2020 ◽  
Vol 3 (4) ◽  
pp. 374-385
Author(s):  
Guofa Li ◽  
Shenglong Li ◽  
Shen Li ◽  
Yechen Qin ◽  
Dongpu Cao ◽  
...  

2020 ◽  
Vol 131 ◽  
pp. 103568
Author(s):  
Amarildo Likmeta ◽  
Alberto Maria Metelli ◽  
Andrea Tirinzoni ◽  
Riccardo Giol ◽  
Marcello Restelli ◽  
...  

2020 ◽  

At present, with the development of Intelligent Vehicle Infrastructure Cooperative Systems (IVICS), decision-making for automated vehicles under connected environment conditions has attracted increasing attention. Reliability, efficiency, and generalization performance are the basic requirements for a vehicle decision-making system. This paper therefore proposes a decision-making method for connected autonomous driving based on the Wasserstein Generative Adversarial Imitation Learning-Deep Deterministic Policy Gradient (WGAIL-DDPG) algorithm. The key component of the reinforcement learning (RL) model, the reward function, is designed from the perspective of vehicle serviceability, covering safety, ride comfort, and handling stability. To reduce the complexity of the proposed model, an imitation learning strategy is introduced to improve the RL training process. Meanwhile, a model training strategy based on cloud computing effectively addresses the insufficient computing resources of the vehicle-mounted system. Test results show that the proposed method improves the efficiency of the RL training process, delivers reliable decision-making performance, and exhibits excellent generalization capability.
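A serviceability-style reward of the kind this abstract outlines might combine penalties for unsafe gaps, uncomfortable jerk, and unstable lateral dynamics. The function below is a minimal sketch; the chosen features (time-to-collision, jerk, lateral acceleration), thresholds, and weights are illustrative assumptions rather than the paper's formulation.

# Minimal sketch of a composite serviceability reward (assumed features/weights).
def serviceability_reward(ttc, jerk, lat_accel,
                          w_safe=1.0, w_comfort=0.3, w_handling=0.3,
                          ttc_safe=4.0, jerk_max=3.0, lat_max=4.0):
    # Safety: penalize short time-to-collision (seconds).
    r_safe = -max(0.0, (ttc_safe - ttc) / ttc_safe)
    # Ride comfort: penalize longitudinal jerk beyond a comfort bound (m/s^3).
    r_comfort = -min(1.0, abs(jerk) / jerk_max)
    # Handling stability: penalize large lateral acceleration (m/s^2).
    r_handling = -min(1.0, abs(lat_accel) / lat_max)
    return w_safe * r_safe + w_comfort * r_comfort + w_handling * r_handling

Weighting the three terms separately makes the trade-off between safety, comfort, and handling explicit, which is the design intent the abstract attributes to its reward function.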


IEEE Access ◽  
2021 ◽  
pp. 1-1
Author(s):  
Shuwei Zhang ◽  
Yutian Wu ◽  
Harutoshi Ogai ◽  
Hiroshi Inujima ◽  
Shigeyuki Tateno
