Weakly Supervised Reinforcement Learning for Autonomous Highway Driving via Virtual Safety Cages

Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2032
Author(s):  
Sampo Kuutti ◽  
Richard Bowden ◽  
Saber Fallah

The use of neural networks and reinforcement learning has become increasingly popular in autonomous vehicle control. However, the opaqueness of the resulting control policies presents a significant barrier to deploying neural network-based control in autonomous vehicles. In this paper, we present a reinforcement learning based approach to autonomous vehicle longitudinal control, where rule-based safety cages provide enhanced safety for the vehicle as well as weak supervision to the reinforcement learning agent. By guiding the agent towards meaningful states and actions, this weak supervision improves convergence during training and enhances the safety of the final trained policy. The rule-based supervisory controller has the further advantage of being fully interpretable, thereby enabling traditional validation and verification approaches to ensure the safety of the vehicle. We compare models with and without safety cages, as well as models with optimal and constrained model parameters, and show that the weak supervision consistently improves the safety of exploration, speed of convergence, and model performance. Additionally, we show that when the model parameters are constrained or sub-optimal, the safety cages can enable a model to learn a safe driving policy even when the model could not be trained to drive through reinforcement learning alone.
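The safety-cage mechanism can be pictured as an interpretable action filter between the learned policy and the vehicle. The sketch below is a minimal illustration under assumed names and thresholds (a headway rule with a hard-brake override), not the paper's actual cage design; the override flag is what would be fed back to the agent as a weak-supervision signal.

```python
def safety_cage(ego_speed, gap, rl_accel, min_gap=10.0, max_brake=-4.0):
    """Rule-based longitudinal safety cage (illustrative).

    If the headway gap to the lead vehicle falls below a threshold,
    override the RL acceleration command with a hard brake; otherwise
    pass the RL action through unchanged. Returns (action, overridden)
    so the override event can also serve as a weak-supervision penalty.
    """
    if gap < min_gap:            # cage triggered: too close to lead vehicle
        return max_brake, True   # interpretable rule overrides the policy
    return rl_accel, False       # RL action passes through unchanged

# e.g. safety_cage(20.0, 5.0, 1.5) overrides the policy and brakes hard
```

Because the rule is explicit, it can be validated with conventional verification methods independently of the neural policy.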

2021 ◽  
Vol 11 (4) ◽  
pp. 1514 ◽  
Author(s):  
Quang-Duy Tran ◽  
Sang-Hoon Bae

To reduce the impact of congestion, it is necessary to improve our overall understanding of the influence of autonomous vehicles. Recently, deep reinforcement learning has become an effective means of solving complex control tasks. Accordingly, we apply an advanced deep reinforcement learning method to investigate how leading autonomous vehicles affect an urban network in a mixed-traffic environment. We also suggest a set of hyperparameters for achieving better performance. Firstly, we feed the set of hyperparameters into our deep reinforcement learning agents. Secondly, we investigate the leading-autonomous-vehicle experiment in the urban network at different autonomous vehicle penetration rates. Thirdly, the advantage of leading autonomous vehicles is evaluated against all-manual-vehicle and leading-manual-vehicle experiments. Finally, proximal policy optimization with a clipped objective is compared to proximal policy optimization with an adaptive Kullback–Leibler penalty to verify the superiority of the proposed hyperparameters. We demonstrate that full-automation traffic increased the average speed by a factor of 1.27 compared with the all-manual-vehicle experiment. Our proposed method becomes significantly more effective at higher autonomous vehicle penetration rates. Furthermore, the leading autonomous vehicles could help to mitigate traffic congestion.
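The two PPO variants compared above differ only in how they constrain the policy update: the clipped objective truncates the probability ratio, while the adaptive-KL variant adds a penalty weighted by a coefficient. A minimal per-sample sketch of both objectives (function names and default coefficients are illustrative assumptions):

```python
def ppo_clip_loss(ratio, advantage, eps=0.2):
    """PPO clipped surrogate: -min(r*A, clip(r, 1-eps, 1+eps)*A),
    where r is the new/old policy probability ratio and A the advantage."""
    clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
    return -min(ratio * advantage, clipped * advantage)

def ppo_kl_penalty_loss(ratio, advantage, kl, beta=1.0):
    """PPO with KL penalty: -(r*A) + beta*KL(old || new); beta is
    adapted between updates to keep the KL near a target value."""
    return -(ratio * advantage) + beta * kl
```

With `ratio=1.5`, `advantage=1.0`, the clipped loss caps the improvement term at `1.2 * advantage`, which is what prevents destructively large policy steps.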


Author(s):  
Óscar Pérez-Gil ◽  
Rafael Barea ◽  
Elena López-Guillén ◽  
Luis M. Bergasa ◽  
Carlos Gómez-Huélamo ◽  
...  

Nowadays, Artificial Intelligence (AI) is growing by leaps and bounds in almost all fields of technology, and Autonomous Vehicles (AV) research is one of them. This paper proposes using algorithms based on Deep Learning (DL) in the control layer of an autonomous vehicle. More specifically, Deep Reinforcement Learning (DRL) algorithms such as Deep Q-Network (DQN) and Deep Deterministic Policy Gradient (DDPG) are implemented in order to compare their results. The aim of this work is to obtain a trained model, applying a DRL algorithm, that is able to send control commands to the vehicle so that it navigates properly and efficiently along a given route. In addition, for each of the algorithms, several agents are presented as solutions, each using different data sources to derive the vehicle control commands. For this purpose, the open-source CARLA simulator is used, providing the system with the ability to perform a multitude of tests without any risk in a hyper-realistic urban simulation environment, something that is unthinkable in the real world. The results obtained show that both DQN and DDPG reach the goal, but DDPG obtains better performance. DDPG produces trajectories very similar to those of a classic controller such as LQR. In both cases the RMSE is lower than 0.1 m when following trajectories of 180–700 m. To conclude, some conclusions and future works are discussed.
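The key practical difference between the two compared algorithms is their action interface: DQN selects among discrete actions via Q-values, while DDPG's actor outputs a continuous command. A minimal sketch of that distinction (noise scale and the [-1, 1] steering range are illustrative assumptions, not the paper's settings):

```python
import random

def dqn_action(q_values):
    """DQN-style discrete control: pick the action index with the
    highest estimated Q-value (greedy selection)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])

def ddpg_action(actor_output, noise_scale=0.1):
    """DDPG-style continuous control: deterministic actor output plus
    Gaussian exploration noise, clipped to a valid command range."""
    noisy = actor_output + random.gauss(0.0, noise_scale)
    return max(-1.0, min(1.0, noisy))
```

This is why DDPG is the more natural fit for smooth steering: it emits arbitrary values in the command range rather than jumping between a fixed action set.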


In this paper, we propose a method to automatically segment the road area from input road images to support the safe driving of autonomous vehicles. In the proposed method, a semantic segmentation network (SSN) is trained using deep learning, and the road area is segmented by the SSN. The SSN uses weights initialized from the VGG-16 network to create a SegNet network. To shorten the training time and obtain results quickly, the classes are simplified so that the trained SegNet network distinguishes only two classes: the road area and the non-road area. To improve the accuracy of the road segmentation result, the boundary line of the road region with a straight-line component is detected through the Hough transform, and the accurate road region is obtained by combining this boundary with the segmentation result of the SSN. The proposed method can support safe driving by automatically classifying the road area during operation, and it can be applied to a road-area departure warning system.
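The Hough-transform refinement step votes each boundary pixel into a (ρ, θ) accumulator and keeps the strongest bin as the straight road-boundary line. A toy, pure-Python sketch of that voting scheme (the bin resolution and single-line output are simplifying assumptions; a real pipeline would use an image-library implementation on the segmentation mask's edges):

```python
import math

def hough_dominant_line(points, thetas=180, rho_step=1.0):
    """Toy Hough transform: vote each edge point (x, y) into (rho, theta)
    bins via rho = x*cos(theta) + y*sin(theta), then return the (rho,
    theta) pair with the most votes, i.e. the dominant straight line."""
    votes = {}
    for x, y in points:
        for t in range(thetas):
            theta = math.pi * t / thetas
            rho = round((x * math.cos(theta) + y * math.sin(theta)) / rho_step)
            votes[(rho, t)] = votes.get((rho, t), 0) + 1
    (rho, t), _ = max(votes.items(), key=lambda kv: kv[1])
    return rho * rho_step, math.pi * t / thetas
```

Points sampled along a vertical edge at x = 5, for instance, vote most strongly for the bin near ρ = 5, θ = 0, recovering the straight boundary that is then intersected with the SSN's pixel-wise result.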


2020 ◽  
Vol 10 (16) ◽  
pp. 5722 ◽  
Author(s):  
Duy Quang Tran ◽  
Sang-Hoon Bae

Advanced deep reinforcement learning shows promise as an approach to addressing continuous control tasks, especially in mixed-autonomy traffic. In this study, we present a deep reinforcement-learning-based model that considers the effectiveness of leading autonomous vehicles in mixed-autonomy traffic at a non-signalized intersection. This model integrates the Flow framework, the Simulation of Urban MObility (SUMO) simulator, and a reinforcement learning library. We also propose a set of proximal policy optimization hyperparameters to obtain reliable simulation performance. First, the leading autonomous vehicles at the non-signalized intersection are considered with varying autonomous vehicle penetration rates that range from 10% to 100% in 10% increments. Second, the proximal policy optimization hyperparameters are input into the multilayer perceptron algorithm for the leading autonomous vehicle experiment. Finally, the superiority of the proposed model is evaluated using all human-driven vehicle and leading human-driven vehicle experiments. We demonstrate that full-autonomy traffic can improve the average speed and delay time by 1.38 times and 2.55 times, respectively, compared with the all human-driven vehicle experiment. Our proposed method generates more positive effects as the autonomous vehicle penetration rate increases. Additionally, the leading autonomous vehicle experiment can be used to dissipate stop-and-go waves at a non-signalized intersection.
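The abstract does not list the proposed hyperparameter values. For orientation only, a PPO configuration of the kind fed to such a training setup might look like the following; every value here is a commonly used generic default, an assumption, and not the paper's tuned set:

```python
# Generic PPO hyperparameters (illustrative defaults, NOT the values
# proposed in the paper, which are not given in the abstract).
ppo_hparams = {
    "gamma": 0.999,          # discount factor for future rewards
    "lambda_gae": 0.97,      # GAE smoothing coefficient
    "clip_range": 0.2,       # PPO clipping epsilon
    "learning_rate": 5e-5,   # optimizer step size
    "num_sgd_iter": 10,      # SGD passes per training batch
    "train_batch_size": 4000 # environment steps per update
}
```

In frameworks like Flow paired with an RL library, a dictionary of this shape is typically passed to the trainer alongside the SUMO scenario configuration.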


Energies ◽  
2021 ◽  
Vol 14 (2) ◽  
pp. 517
Author(s):  
Dániel Fényes ◽  
Balázs Németh ◽  
Péter Gáspár

This paper presents a novel modeling method for the control design of autonomous vehicle systems. The goal of the method is to provide a control-oriented model in a predefined Linear Parameter Varying (LPV) structure. The scheduling variables of the LPV model are selected through machine-learning-based methods using a big dataset. Moreover, the LPV model parameters are computed through an optimization algorithm, with which an accurate fit to the dataset is achieved. The proposed method is illustrated on the nonlinear modeling of the lateral vehicle dynamics. The resulting LPV-based vehicle model is used for the control design of the path-following functionality of autonomous vehicles. The effectiveness of the modeling and control design methods is illustrated through comprehensive simulation examples based on high-fidelity simulation software.
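An LPV model keeps a linear state-space structure while its matrices vary with a measured scheduling variable ρ (for lateral dynamics, typically the longitudinal velocity). A minimal discrete-time sketch of evaluating such a model (the state dimension and matrix functions are illustrative, not the paper's identified model):

```python
def lpv_step(x, u, rho, A_fn, B_fn):
    """One step of a discrete-time LPV model
        x_{k+1} = A(rho) * x_k + B(rho) * u_k,
    where rho is the scheduling variable selected (in the paper's
    method) via machine learning, and A_fn/B_fn return the
    parameter-dependent system matrices."""
    A, B = A_fn(rho), B_fn(rho)
    return [sum(A[i][j] * x[j] for j in range(len(x))) + B[i] * u
            for i in range(len(x))]

# Illustrative 2-state model: position/velocity with input gain scaled by rho
A_fn = lambda rho: [[1.0, 0.1], [0.0, 1.0]]
B_fn = lambda rho: [0.0, 0.1 * rho]
```

Because each frozen ρ yields an ordinary linear model, standard LPV control-synthesis tools can be applied directly to the fitted structure.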


Author(s):  
Hongbo Gao ◽  
Guanya Shi ◽  
Kelong Wang ◽  
Guotao Xie ◽  
Yuchao Liu

Purpose: Over the past decades, there has been significant research effort dedicated to the development of autonomous vehicles. The decision-making system, which is responsible for driving safety, is one of the most important technologies for autonomous vehicles. The purpose of this study is to use a reinforcement learning method combined with car-following data collected in a driving simulator to obtain an explainable car-following algorithm and establish an anthropomorphic car-following model. Design/methodology/approach: This paper proposes a car-following method based on reinforcement learning for autonomous vehicle decision-making. An approximator is used to approximate the value function by determining the state space, the action space and the state transition relationship. A gradient descent method is used to solve for the parameters. Findings: The effect of car-following under certain driving styles is initially achieved through the simulation of step conditions. The results initially show that the reinforcement learning system is more adaptive to car following and has a degree of explainability and stability, based on the explicit calculation of the reward R. Originality/value: The simulation results show that the car-following method based on reinforcement learning for autonomous vehicle decision-making realizes reliable car-following decisions and has the advantages of simple sampling, a small amount of data, a simple algorithm and good robustness.
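A value-function approximator whose parameters are solved by gradient descent can be sketched as a linear TD(0) update; the feature encoding, step size, and discount below are illustrative assumptions, not the paper's specific design:

```python
def td_update(w, features, reward, next_features, alpha=0.1, gamma=0.9):
    """One gradient-descent step on a linear value approximator
    V(s) = w . phi(s). The TD(0) error delta = r + gamma*V(s') - V(s)
    drives the parameter update w <- w + alpha * delta * phi(s)."""
    v = sum(wi * f for wi, f in zip(w, features))
    v_next = sum(wi * f for wi, f in zip(w, next_features))
    delta = reward + gamma * v_next - v          # TD error
    return [wi + alpha * delta * f for wi, f in zip(w, features)]
```

For car-following, phi(s) would encode quantities such as gap, relative speed, and ego speed; because the weights and the reward are explicit, the learned behaviour stays inspectable, matching the explainability claim above.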


Author(s):  
I-Ming Chen ◽  
Ching-Yao Chan

Path tracking is an essential task for autonomous vehicles (AV), for which controllers are designed to issue commands so that the AV will follow the planned path properly to ensure operational safety, comfort, and efficiency. While solving the time-varying nonlinear vehicle dynamic problem is still challenging today, deep neural network (NN) methods, with their capability to deal with nonlinear systems, provide an alternative approach to tackle the difficulties. This study explores the potential of using deep reinforcement learning (DRL) for vehicle control and applies it to the path tracking task. In this study, proximal policy optimization (PPO) is selected as the DRL algorithm and is combined with the conventional pure pursuit (PP) method to structure the vehicle controller architecture. The PP method is used to generate a baseline steering control command, and the PPO is used to derive a correction command to mitigate the inaccuracy associated with the baseline from PP. The blend of the two controllers makes the overall operation more robust and adaptive and attains optimality, improving tracking performance. In this paper, the structure, settings and training process of the PPO are described. Simulation experiments are carried out based on the proposed methodology, and the results show that the path tracking capability in a low-speed driving condition is significantly enhanced.
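A sketch of this controller architecture: pure pursuit supplies the baseline steering angle from the standard geometric formula, and the DRL output is added as a bounded correction. The wheelbase value and the correction bound below are assumptions for illustration, not the paper's settings:

```python
import math

def pure_pursuit_steer(alpha, lookahead, wheelbase=2.7):
    """Baseline pure pursuit steering angle:
        delta = atan(2 * L * sin(alpha) / ld),
    where alpha is the heading angle to the lookahead point and ld
    the lookahead distance."""
    return math.atan2(2.0 * wheelbase * math.sin(alpha), lookahead)

def blended_steer(alpha, lookahead, rl_correction, corr_limit=0.1):
    """PP baseline plus a bounded PPO correction term; clipping the
    correction keeps the learned part from overriding the geometric
    baseline entirely (the bound here is an illustrative assumption)."""
    corr = max(-corr_limit, min(corr_limit, rl_correction))
    return pure_pursuit_steer(alpha, lookahead) + corr
```

Bounding the learned correction is a common way to make such a blend safe during early training: at worst the controller degrades gracefully to near-pure-pursuit behaviour.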


Author(s):  
Emmanuel Ifeanyi Iroegbu ◽  
Devaraj Madhavi

Deep reinforcement learning has been successful in solving common autonomous driving tasks such as lane-keeping by simply using pixel data from the front-view camera as input. However, raw pixel data constitutes a very high-dimensional observation that affects the learning quality of the agent due to the complexity imposed by a 'realistic' urban environment. We therefore investigate how compressing the raw pixel data from a high-dimensional state to a low-dimensional latent space offline, using a variational autoencoder, can significantly improve the training of a deep reinforcement learning agent. We evaluated our method on a simulated autonomous vehicle in CARLA (Car Learning to Act) and compared our results with several baselines, including deep deterministic policy gradient, proximal policy optimization, and soft actor-critic. The results show that the method greatly accelerates training and yields a remarkable improvement in the quality of the deep reinforcement learning agent.
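The offline compression step can be sketched with the VAE's reparameterization trick, which turns the encoder's per-image statistics (mean and log-variance) into a low-dimensional latent sample that replaces raw pixels as the agent's observation; the two-dimensional latent here is purely illustrative:

```python
import math, random

def encode_to_latent(mu, log_var, rng=random):
    """VAE reparameterization trick: sample z = mu + sigma * eps with
    eps ~ N(0, 1), where sigma = exp(0.5 * log_var). The resulting
    low-dimensional z (rather than the raw pixel observation) is what
    the DRL agent receives as its state."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]
```

Since the autoencoder is trained offline, the expensive perception learning is paid once, and the RL agent then trains against a compact, smoother state space.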


Author(s):  
László Orgován ◽  
Tamás Bécsi ◽  
Szilárd Aradi

Autonomous vehicles, or self-driving cars, are prevalent nowadays; many vehicle manufacturers and other tech companies are trying to develop them. One major goal of self-driving algorithms is to perform manoeuvres safely, even when some anomaly arises. Artificial Intelligence and Machine Learning methods are used to solve such complex issues. One of these motion planning problems arises when the tires lose their grip on the road; an autonomous vehicle should be able to handle this situation. Thus, the paper provides an autonomous drifting algorithm using Reinforcement Learning. The algorithm is based on a model-free learning algorithm, Twin Delayed Deep Deterministic Policy Gradient (TD3). The model is trained on six different tracks in a simulator developed specifically for autonomous driving systems, namely CARLA.
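TD3's central mechanism, relevant to learning a stable policy in a regime as unstable as drifting, is clipped double-Q learning: the Bellman target takes the minimum of two target critics to curb value overestimation. A minimal sketch of the target computation (function name and discount value are assumptions):

```python
def td3_target(reward, q1_next, q2_next, gamma=0.99, done=False):
    """TD3 Bellman target: clipped double-Q learning uses the minimum
    of the two target critics' estimates, y = r + gamma * min(Q1', Q2'),
    to curb the overestimation bias of a single critic."""
    q_min = min(q1_next, q2_next)
    return reward + (0.0 if done else gamma * q_min)
```

TD3 additionally delays actor updates relative to critic updates and smooths target actions with clipped noise; both tricks further stabilize training on continuous control tasks like this one.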


2021 ◽  
Author(s):  
Nanda Kishore Sreenivas ◽  
Shrisha Rao

In toy environments like video games, a reinforcement learning agent is deployed and operates within the same state space in which it was trained. However, in robotics applications such as industrial systems or autonomous vehicles, this cannot be guaranteed. A robot can be pushed out of its training space by some unforeseen perturbation, which may send it into an unknown state from which it has not been trained to move towards its goal. While most prior work in the area of RL safety focuses on ensuring safety during the training phase, this paper focuses on ensuring the safe deployment of a robot that has already been trained to operate within a safe space. This work defines a condition on the state and action spaces that, if satisfied, guarantees the robot's recovery to safety on its own. We also propose a strategy and design that facilitate this recovery within a finite number of steps after a perturbation. This is implemented and tested against a standard RL model, and the results indicate much-improved performance.
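The deployment-time guard described above can be sketched as a simple dispatch between the trained policy and a recovery behaviour; representing the safe (trained) space as an explicit set of discrete states is a toy simplification of the paper's condition, which is stated over the state and action spaces:

```python
def select_action(state, policy, safe_set, recovery_policy):
    """Deployment-time dispatch: if a perturbation has pushed the robot
    outside the safe state space it was trained on, switch to a
    recovery policy that drives it back within finitely many steps;
    otherwise use the trained RL policy as normal."""
    if state in safe_set:
        return policy(state)
    return recovery_policy(state)
```

The paper's contribution is the condition under which such a recovery policy is guaranteed to exist and terminate; this sketch only shows where that guarantee plugs into the control loop.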

