Weakly Supervised Reinforcement Learning for Autonomous Highway Driving via Virtual Safety Cages

Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2032
Author(s):  
Sampo Kuutti ◽  
Richard Bowden ◽  
Saber Fallah

The use of neural networks and reinforcement learning has become increasingly popular in autonomous vehicle control. However, the opaqueness of the resulting control policies presents a significant barrier to deploying neural network-based control in autonomous vehicles. In this paper, we present a reinforcement learning based approach to autonomous vehicle longitudinal control, where rule-based safety cages provide enhanced safety for the vehicle as well as weak supervision to the reinforcement learning agent. By guiding the agent towards meaningful states and actions, this weak supervision improves convergence during training and enhances the safety of the final trained policy. The rule-based supervisory controller has the further advantage of being fully interpretable, thereby enabling traditional validation and verification approaches to ensure the safety of the vehicle. We compare models with and without safety cages, as well as models with optimal and constrained model parameters, and show that the weak supervision consistently improves the safety of exploration, speed of convergence, and model performance. Additionally, we show that when the model parameters are constrained or sub-optimal, the safety cages can enable a model to learn a safe driving policy even when the model could not be trained to drive through reinforcement learning alone.
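The safety-cage mechanism can be pictured as an interpretable action filter between the learned policy and the vehicle. The sketch below is a minimal illustration under assumed names and thresholds (a headway rule with a hard-brake override), not the paper's actual cage design; the override flag is what would be fed back to the agent as a weak-supervision signal.

```python
def safety_cage(ego_speed, gap, rl_accel, min_gap=10.0, max_brake=-4.0):
    """Rule-based longitudinal safety cage (illustrative).

    If the headway gap to the lead vehicle falls below a threshold,
    override the RL acceleration command with a hard brake; otherwise
    pass the RL action through unchanged. Returns (action, overridden)
    so the override event can also serve as a weak-supervision penalty.
    """
    if gap < min_gap:            # cage triggered: too close to lead vehicle
        return max_brake, True   # interpretable rule overrides the policy
    return rl_accel, False       # RL action passes through unchanged

# e.g. safety_cage(20.0, 5.0, 1.5) overrides the policy and brakes hard
```

Because the rule is explicit, it can be validated with conventional verification methods independently of the neural policy.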

2021 ◽  
Vol 11 (4) ◽  
pp. 1514 ◽  
Author(s):  
Quang-Duy Tran ◽  
Sang-Hoon Bae

To reduce the impact of congestion, it is necessary to improve our overall understanding of the influence of autonomous vehicles. Recently, deep reinforcement learning has become an effective means of solving complex control tasks. Accordingly, we apply an advanced deep reinforcement learning method to investigate how leading autonomous vehicles affect an urban network in a mixed-traffic environment. We also suggest a set of hyperparameters for achieving better performance. Firstly, we feed the set of hyperparameters into our deep reinforcement learning agents. Secondly, we investigate the leading-autonomous-vehicle experiment in the urban network at different autonomous vehicle penetration rates. Thirdly, the advantage of leading autonomous vehicles is evaluated against all-manual-vehicle and leading-manual-vehicle experiments. Finally, proximal policy optimization with a clipped objective is compared to proximal policy optimization with an adaptive Kullback–Leibler penalty to verify the superiority of the proposed hyperparameters. We demonstrate that full-automation traffic increased the average speed by a factor of 1.27 compared with the all-manual-vehicle experiment. Our proposed method becomes significantly more effective at higher autonomous vehicle penetration rates. Furthermore, the leading autonomous vehicles could help to mitigate traffic congestion.
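The two PPO variants compared above differ only in how they constrain the policy update: the clipped objective truncates the probability ratio, while the adaptive-KL variant adds a penalty weighted by a coefficient. A minimal per-sample sketch of both objectives (function names and default coefficients are illustrative assumptions):

```python
def ppo_clip_loss(ratio, advantage, eps=0.2):
    """PPO clipped surrogate: -min(r*A, clip(r, 1-eps, 1+eps)*A),
    where r is the new/old policy probability ratio and A the advantage."""
    clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
    return -min(ratio * advantage, clipped * advantage)

def ppo_kl_penalty_loss(ratio, advantage, kl, beta=1.0):
    """PPO with KL penalty: -(r*A) + beta*KL(old || new); beta is
    adapted between updates to keep the KL near a target value."""
    return -(ratio * advantage) + beta * kl
```

With `ratio=1.5`, `advantage=1.0`, the clipped loss caps the improvement term at `1.2 * advantage`, which is what prevents destructively large policy steps.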


Author(s):  
Óscar Pérez-Gil ◽  
Rafael Barea ◽  
Elena López-Guillén ◽  
Luis M. Bergasa ◽  
Carlos Gómez-Huélamo ◽  
...  

Nowadays, Artificial Intelligence (AI) is growing by leaps and bounds in almost all fields of technology, and Autonomous Vehicles (AV) research is one of them. This paper proposes using algorithms based on Deep Learning (DL) in the control layer of an autonomous vehicle. More specifically, Deep Reinforcement Learning (DRL) algorithms such as Deep Q-Network (DQN) and Deep Deterministic Policy Gradient (DDPG) are implemented in order to compare their results. The aim of this work is to obtain a trained model, applying a DRL algorithm, that is able to send control commands to the vehicle so that it navigates properly and efficiently along a given route. In addition, for each of the algorithms, several agents are presented as solutions, each using different data sources to derive the vehicle control commands. For this purpose, the open-source CARLA simulator is used, providing the system with the ability to perform a multitude of tests without any risk in a hyper-realistic urban simulation environment, something that is unthinkable in the real world. The results obtained show that both DQN and DDPG reach the goal, but DDPG obtains better performance. DDPG produces trajectories very similar to those of a classic controller such as LQR. In both cases the RMSE is lower than 0.1 m when following trajectories of 180–700 m. To conclude, some conclusions and future works are discussed.
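The key practical difference between the two compared algorithms is their action interface: DQN selects among discrete actions via Q-values, while DDPG's actor outputs a continuous command. A minimal sketch of that distinction (noise scale and the [-1, 1] steering range are illustrative assumptions, not the paper's settings):

```python
import random

def dqn_action(q_values):
    """DQN-style discrete control: pick the action index with the
    highest estimated Q-value (greedy selection)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])

def ddpg_action(actor_output, noise_scale=0.1):
    """DDPG-style continuous control: deterministic actor output plus
    Gaussian exploration noise, clipped to a valid command range."""
    noisy = actor_output + random.gauss(0.0, noise_scale)
    return max(-1.0, min(1.0, noisy))
```

This is why DDPG is the more natural fit for smooth steering: it emits arbitrary values in the command range rather than jumping between a fixed action set.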


In this paper, we propose a method to automatically segment the road area from input road images to support the safe driving of autonomous vehicles. In the proposed method, a semantic segmentation network (SSN) is trained using deep learning, and the road area is segmented by the SSN. The SSN uses weights initialized from the VGG-16 network to create a SegNet network. To shorten the training time and obtain results quickly, the classes are simplified so that the trained SegNet network distinguishes only two classes: the road area and the non-road area. To improve the accuracy of the road segmentation result, the boundary line of the road region with a straight-line component is detected through the Hough transform, and the accurate road region is obtained by combining this boundary with the segmentation result of the SSN. The proposed method can support safe driving by automatically classifying the road area during operation, and it can be applied to a road-area departure warning system.
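The Hough-transform refinement step votes each boundary pixel into a (ρ, θ) accumulator and keeps the strongest bin as the straight road-boundary line. A toy, pure-Python sketch of that voting scheme (the bin resolution and single-line output are simplifying assumptions; a real pipeline would use an image-library implementation on the segmentation mask's edges):

```python
import math

def hough_dominant_line(points, thetas=180, rho_step=1.0):
    """Toy Hough transform: vote each edge point (x, y) into (rho, theta)
    bins via rho = x*cos(theta) + y*sin(theta), then return the (rho,
    theta) pair with the most votes, i.e. the dominant straight line."""
    votes = {}
    for x, y in points:
        for t in range(thetas):
            theta = math.pi * t / thetas
            rho = round((x * math.cos(theta) + y * math.sin(theta)) / rho_step)
            votes[(rho, t)] = votes.get((rho, t), 0) + 1
    (rho, t), _ = max(votes.items(), key=lambda kv: kv[1])
    return rho * rho_step, math.pi * t / thetas
```

Points sampled along a vertical edge at x = 5, for instance, vote most strongly for the bin near ρ = 5, θ = 0, recovering the straight boundary that is then intersected with the SSN's pixel-wise result.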


2020 ◽  
Vol 10 (16) ◽  
pp. 5722 ◽  
Author(s):  
Duy Quang Tran ◽  
Sang-Hoon Bae

Advanced deep reinforcement learning shows promise as an approach to addressing continuous control tasks, especially in mixed-autonomy traffic. In this study, we present a deep reinforcement-learning-based model that considers the effectiveness of leading autonomous vehicles in mixed-autonomy traffic at a non-signalized intersection. This model integrates the Flow framework, the Simulation of Urban MObility (SUMO) simulator, and a reinforcement learning library. We also propose a set of proximal policy optimization hyperparameters to obtain reliable simulation performance. First, the leading autonomous vehicles at the non-signalized intersection are considered with varying autonomous vehicle penetration rates that range from 10% to 100% in 10% increments. Second, the proximal policy optimization hyperparameters are input into the multilayer perceptron algorithm for the leading autonomous vehicle experiment. Finally, the superiority of the proposed model is evaluated using all human-driven vehicle and leading human-driven vehicle experiments. We demonstrate that full-autonomy traffic can improve the average speed and delay time by 1.38 times and 2.55 times, respectively, compared with the all human-driven vehicle experiment. Our proposed method generates more positive effects as the autonomous vehicle penetration rate increases. Additionally, the leading autonomous vehicle experiment can be used to dissipate stop-and-go waves at a non-signalized intersection.
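The abstract does not list the proposed hyperparameter values. For orientation only, a PPO configuration of the kind fed to such a training setup might look like the following; every value here is a commonly used generic default, an assumption, and not the paper's tuned set:

```python
# Generic PPO hyperparameters (illustrative defaults, NOT the values
# proposed in the paper, which are not given in the abstract).
ppo_hparams = {
    "gamma": 0.999,          # discount factor for future rewards
    "lambda_gae": 0.97,      # GAE smoothing coefficient
    "clip_range": 0.2,       # PPO clipping epsilon
    "learning_rate": 5e-5,   # optimizer step size
    "num_sgd_iter": 10,      # SGD passes per training batch
    "train_batch_size": 4000 # environment steps per update
}
```

In frameworks like Flow paired with an RL library, a dictionary of this shape is typically passed to the trainer alongside the SUMO scenario configuration.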


Energies ◽  
2021 ◽  
Vol 14 (2) ◽  
pp. 517
Author(s):  
Dániel Fényes ◽  
Balázs Németh ◽  
Péter Gáspár

This paper presents a novel modeling method for the control design of autonomous vehicle systems. The goal of the method is to provide a control-oriented model in a predefined Linear Parameter Varying (LPV) structure. The scheduling variables of the LPV model are selected through machine-learning-based methods using a big dataset. Moreover, the LPV model parameters are computed through an optimization algorithm, with which an accurate fit to the dataset is achieved. The proposed method is illustrated on the nonlinear modeling of the lateral vehicle dynamics. The resulting LPV-based vehicle model is used for the control design of the path-following functionality of autonomous vehicles. The effectiveness of the modeling and control design methods is illustrated through comprehensive simulation examples based on high-fidelity simulation software.
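An LPV model keeps a linear state-space structure while its matrices vary with a measured scheduling variable ρ (for lateral dynamics, typically the longitudinal velocity). A minimal discrete-time sketch of evaluating such a model (the state dimension and matrix functions are illustrative, not the paper's identified model):

```python
def lpv_step(x, u, rho, A_fn, B_fn):
    """One step of a discrete-time LPV model
        x_{k+1} = A(rho) * x_k + B(rho) * u_k,
    where rho is the scheduling variable selected (in the paper's
    method) via machine learning, and A_fn/B_fn return the
    parameter-dependent system matrices."""
    A, B = A_fn(rho), B_fn(rho)
    return [sum(A[i][j] * x[j] for j in range(len(x))) + B[i] * u
            for i in range(len(x))]

# Illustrative 2-state model: position/velocity with input gain scaled by rho
A_fn = lambda rho: [[1.0, 0.1], [0.0, 1.0]]
B_fn = lambda rho: [0.0, 0.1 * rho]
```

Because each frozen ρ yields an ordinary linear model, standard LPV control-synthesis tools can be applied directly to the fitted structure.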


Author(s):  
Hongbo Gao ◽  
Guanya Shi ◽  
Kelong Wang ◽  
Guotao Xie ◽  
Yuchao Liu

Purpose: Over the past decades, there has been significant research effort dedicated to the development of autonomous vehicles. The decision-making system, which is responsible for driving safety, is one of the most important technologies for autonomous vehicles. The purpose of this study is to use a reinforcement learning method combined with car-following data collected in a driving simulator to obtain an explainable car-following algorithm and establish an anthropomorphic car-following model. Design/methodology/approach: This paper proposes a car-following method based on reinforcement learning for autonomous vehicle decision-making. An approximator is used to approximate the value function by determining the state space, the action space and the state transition relationship. A gradient descent method is used to solve for the parameters. Findings: The effect of car-following under certain driving styles is initially achieved through the simulation of step conditions. The results initially show that the reinforcement learning system is more adaptive to car following and has a degree of explainability and stability, based on the explicit calculation of the reward R. Originality/value: The simulation results show that the car-following method based on reinforcement learning for autonomous vehicle decision-making realizes reliable car-following decisions and has the advantages of simple sampling, a small amount of data, a simple algorithm and good robustness.
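A value-function approximator whose parameters are solved by gradient descent can be sketched as a linear TD(0) update; the feature encoding, step size, and discount below are illustrative assumptions, not the paper's specific design:

```python
def td_update(w, features, reward, next_features, alpha=0.1, gamma=0.9):
    """One gradient-descent step on a linear value approximator
    V(s) = w . phi(s). The TD(0) error delta = r + gamma*V(s') - V(s)
    drives the parameter update w <- w + alpha * delta * phi(s)."""
    v = sum(wi * f for wi, f in zip(w, features))
    v_next = sum(wi * f for wi, f in zip(w, next_features))
    delta = reward + gamma * v_next - v          # TD error
    return [wi + alpha * delta * f for wi, f in zip(w, features)]
```

For car-following, phi(s) would encode quantities such as gap, relative speed, and ego speed; because the weights and the reward are explicit, the learned behaviour stays inspectable, matching the explainability claim above.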


Author(s):  
I-Ming Chen ◽  
Ching-Yao Chan

Path tracking is an essential task for autonomous vehicles (AV), for which controllers are designed to issue commands so that the AV will follow the planned path properly to ensure operational safety, comfort, and efficiency. While solving the time-varying nonlinear vehicle dynamic problem is still challenging today, deep neural network (NN) methods, with their capability to deal with nonlinear systems, provide an alternative approach to tackle the difficulties. This study explores the potential of using deep reinforcement learning (DRL) for vehicle control and applies it to the path tracking task. In this study, proximal policy optimization (PPO) is selected as the DRL algorithm and is combined with the conventional pure pursuit (PP) method to structure the vehicle controller architecture. The PP method is used to generate a baseline steering control command, and the PPO is used to derive a correction command to mitigate the inaccuracy associated with the baseline from PP. The blend of the two controllers makes the overall operation more robust and adaptive and attains optimality, improving tracking performance. In this paper, the structure, settings and training process of the PPO are described. Simulation experiments are carried out based on the proposed methodology, and the results show that the path tracking capability in a low-speed driving condition is significantly enhanced.
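A sketch of this controller architecture: pure pursuit supplies the baseline steering angle from the standard geometric formula, and the DRL output is added as a bounded correction. The wheelbase value and the correction bound below are assumptions for illustration, not the paper's settings:

```python
import math

def pure_pursuit_steer(alpha, lookahead, wheelbase=2.7):
    """Baseline pure pursuit steering angle:
        delta = atan(2 * L * sin(alpha) / ld),
    where alpha is the heading angle to the lookahead point and ld
    the lookahead distance."""
    return math.atan2(2.0 * wheelbase * math.sin(alpha), lookahead)

def blended_steer(alpha, lookahead, rl_correction, corr_limit=0.1):
    """PP baseline plus a bounded PPO correction term; clipping the
    correction keeps the learned part from overriding the geometric
    baseline entirely (the bound here is an illustrative assumption)."""
    corr = max(-corr_limit, min(corr_limit, rl_correction))
    return pure_pursuit_steer(alpha, lookahead) + corr
```

Bounding the learned correction is a common way to make such a blend safe during early training: at worst the controller degrades gracefully to near-pure-pursuit behaviour.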


Author(s):  
Emmanuel Ifeanyi Iroegbu ◽  
Devaraj Madhavi

Deep reinforcement learning has been successful in solving common autonomous driving tasks such as lane-keeping by simply using pixel data from the front-view camera as input. However, raw pixel data constitutes a very high-dimensional observation that affects the learning quality of the agent due to the complexity imposed by a 'realistic' urban environment. We therefore investigate how compressing the raw pixel data from a high-dimensional state to a low-dimensional latent space offline, using a variational autoencoder, can significantly improve the training of a deep reinforcement learning agent. We evaluated our method on a simulated autonomous vehicle in CARLA (Car Learning to Act) and compared our results with several baselines, including deep deterministic policy gradient, proximal policy optimization, and soft actor-critic. The results show that the method greatly accelerates training and yields a remarkable improvement in the quality of the deep reinforcement learning agent.
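The offline compression step can be sketched with the VAE's reparameterization trick, which turns the encoder's per-image statistics (mean and log-variance) into a low-dimensional latent sample that replaces raw pixels as the agent's observation; the two-dimensional latent here is purely illustrative:

```python
import math, random

def encode_to_latent(mu, log_var, rng=random):
    """VAE reparameterization trick: sample z = mu + sigma * eps with
    eps ~ N(0, 1), where sigma = exp(0.5 * log_var). The resulting
    low-dimensional z (rather than the raw pixel observation) is what
    the DRL agent receives as its state."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]
```

Since the autoencoder is trained offline, the expensive perception learning is paid once, and the RL agent then trains against a compact, smoother state space.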


Author(s):  
László Orgován ◽  
Tamás Bécsi ◽  
Szilárd Aradi

Autonomous vehicles, or self-driving cars, are prevalent nowadays; many vehicle manufacturers and other tech companies are trying to develop them. One major goal of self-driving algorithms is to perform manoeuvres safely, even when some anomaly arises. Artificial Intelligence and Machine Learning methods are used to solve such complex issues. One of these motion planning problems arises when the tires lose their grip on the road; an autonomous vehicle should be able to handle this situation. Thus, the paper provides an autonomous drifting algorithm using Reinforcement Learning. The algorithm is based on a model-free learning algorithm, Twin Delayed Deep Deterministic Policy Gradient (TD3). The model is trained on six different tracks in a simulator developed specifically for autonomous driving systems, namely CARLA.
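TD3's central mechanism, relevant to learning a stable policy in a regime as unstable as drifting, is clipped double-Q learning: the Bellman target takes the minimum of two target critics to curb value overestimation. A minimal sketch of the target computation (function name and discount value are assumptions):

```python
def td3_target(reward, q1_next, q2_next, gamma=0.99, done=False):
    """TD3 Bellman target: clipped double-Q learning uses the minimum
    of the two target critics' estimates, y = r + gamma * min(Q1', Q2'),
    to curb the overestimation bias of a single critic."""
    q_min = min(q1_next, q2_next)
    return reward + (0.0 if done else gamma * q_min)
```

TD3 additionally delays actor updates relative to critic updates and smooths target actions with clipped noise; both tricks further stabilize training on continuous control tasks like this one.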


2021 ◽  
Author(s):  
Nanda Kishore Sreenivas ◽  
Shrisha Rao

In toy environments like video games, a reinforcement learning agent is deployed and operates within the same state space in which it was trained. However, in robotics applications such as industrial systems or autonomous vehicles, this cannot be guaranteed. A robot can be pushed out of its training space by some unforeseen perturbation, which may send it into an unknown state from which it has not been trained to move towards its goal. While most prior work in the area of RL safety focuses on ensuring safety during the training phase, this paper focuses on ensuring the safe deployment of a robot that has already been trained to operate within a safe space. This work defines a condition on the state and action spaces that, if satisfied, guarantees the robot's recovery to safety on its own. We also propose a strategy and design that facilitate this recovery within a finite number of steps after a perturbation. This is implemented and tested against a standard RL model, and the results indicate much-improved performance.
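The deployment-time guard described above can be sketched as a simple dispatch between the trained policy and a recovery behaviour; representing the safe (trained) space as an explicit set of discrete states is a toy simplification of the paper's condition, which is stated over the state and action spaces:

```python
def select_action(state, policy, safe_set, recovery_policy):
    """Deployment-time dispatch: if a perturbation has pushed the robot
    outside the safe state space it was trained on, switch to a
    recovery policy that drives it back within finitely many steps;
    otherwise use the trained RL policy as normal."""
    if state in safe_set:
        return policy(state)
    return recovery_policy(state)
```

The paper's contribution is the condition under which such a recovery policy is guaranteed to exist and terminate; this sketch only shows where that guarantee plugs into the control loop.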

