Pedestrian and Vehicle Detection in Autonomous Vehicle Perception Systems—A Review

Sensors, 2021, Vol. 21(21), pp. 7267
Author(s): Luiz G. Galvao, Maysam Abbod, Tatiana Kalganova, Vasile Palade, Md Nazmul Huda

Autonomous Vehicles (AVs) have the potential to solve many traffic problems, such as accidents, congestion and pollution. However, there are still challenges to overcome; for instance, AVs need to perceive their environment accurately to navigate safely in busy urban scenarios. The aim of this paper is to review recent articles on computer vision techniques that can be used to build an AV perception system. AV perception systems need to detect non-static objects accurately and predict their behaviour, as well as to detect static objects and recognise the information they provide. This paper focuses in particular on the computer vision techniques used to detect pedestrians and vehicles. There have been many papers and reviews on pedestrian and vehicle detection so far; however, most of them reviewed pedestrian or vehicle detection separately. This review presents an overview of AV systems in general, and then reviews and investigates several computer vision detection techniques for pedestrians and vehicles. The review concludes that both traditional and Deep Learning (DL) techniques have been used for pedestrian and vehicle detection, with DL techniques showing the best results. Although good detection results have been achieved for pedestrians and vehicles, the current algorithms still struggle to detect small, occluded, and truncated objects. In addition, there is limited research on how to improve detection performance in difficult light and weather conditions. Most of the algorithms have been tested on well-recognised datasets such as Caltech and KITTI; however, these datasets have their own limitations. Therefore, this paper recommends that future work be evaluated on newer, more challenging datasets, such as PIE and BDD100K.
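As a minimal illustration of the kind of DL detection baseline the review surveys, the sketch below runs a pretrained detector and keeps only the pedestrian and vehicle classes. The model choice, COCO class ids, and score threshold are our assumptions for illustration, not the review's recommendation.

```python
# Hedged sketch: pretrained detector filtered to pedestrian/vehicle classes.
import torch
from torchvision.models import detection

model = detection.fasterrcnn_resnet50_fpn(weights="DEFAULT").eval()
COCO_PERSON, COCO_CAR = 1, 3   # COCO category ids (person, car)

@torch.no_grad()
def detect_pedestrians_vehicles(image, score_thresh=0.5):
    """image: 3xHxW float tensor in [0, 1]; returns kept boxes and labels."""
    out = model([image])[0]
    keep = (out["scores"] > score_thresh) & (
        (out["labels"] == COCO_PERSON) | (out["labels"] == COCO_CAR))
    return out["boxes"][keep], out["labels"][keep]
```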

2021, Vol. 336, pp. 07004
Author(s): Ruoyu Fang, Cheng Cai

Obstacle detection and target tracking are two major issues for intelligent autonomous vehicles. This paper proposes a new scheme to achieve real-time obstacle detection and target tracking based on computer vision. A ResNet-18 deep neural network is used for obstacle detection, and a YOLOv3 deep neural network is employed for real-time target tracking. These two trained models can be deployed on an autonomous vehicle equipped with an NVIDIA Jetson Nano board. The autonomous vehicle uses its camera to avoid obstacles and follow tracked targets. Steering and movement are adjusted with a PID controller during motion, which helps the proposed vehicle achieve stable and precise tracking.
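The paper names a PID algorithm for the steering adjustment. Below is a minimal sketch of such a controller; the gains, timestep, and the pixel-offset error signal are illustrative assumptions, not values from the paper.

```python
# Minimal PID steering sketch (gains and error source are assumptions).
class PID:
    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error, dt):
        """Return a control output from the tracking error (e.g. horizontal
        offset of the tracked target from the image centre, in pixels)."""
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Example: steer toward a target detected at pixel x = 400 in a 640-px frame.
pid = PID(kp=0.005, ki=0.0001, kd=0.002)   # hypothetical gains
steering = pid.update(error=400 - 320, dt=0.05)
```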


2019, Vol. 8(12), pp. 549
Author(s): Mohamed Ibrahim, James Haworth, Tao Cheng

Extracting information related to weather and visual conditions at a given time and place is indispensable for scene awareness, which strongly impacts our behaviours, from simply walking in a city to riding a bike, driving a car, or autonomous drive-assistance. Despite the significance of this subject, it has not yet been fully addressed by machine intelligence: deep learning and computer vision have not been applied to detecting multiple labels of weather and visual conditions with a unified method that can easily be used in practice. What has been achieved to date are rather sectorial models that address a limited number of labels and do not cover the wide spectrum of weather and visual conditions; moreover, weather and visual conditions are often addressed individually. In this paper, we introduce a novel framework to automatically extract this information from street-level images, relying on deep learning and computer vision in a unified method without any pre-defined constraints on the processed images. A pipeline of four deep convolutional neural network (CNN) models, called WeatherNet, is trained with residual learning using the ResNet50 architecture to extract various weather and visual conditions: dawn/dusk, day and night for time detection; glare for lighting conditions; and clear, rainy, snowy, and foggy for weather conditions. WeatherNet shows strong performance in extracting this information from user-defined images or video streams, with applications including, but not limited to, autonomous vehicles and drive-assistance systems, tracking behaviours, safety-related research, and better understanding cities through images for policy-makers.
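A sketch of one ResNet50-based classifier of the kind the WeatherNet pipeline chains together follows. Only "four CNN models with a ResNet50 backbone" comes from the abstract; the head design and the exact split of tasks across models are our assumptions.

```python
# Hedged sketch of one condition classifier in a WeatherNet-style pipeline.
import torch.nn as nn
from torchvision import models

def make_condition_classifier(num_classes: int) -> nn.Module:
    # ResNet50 backbone (residual learning), with a task-specific final layer.
    backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
    backbone.fc = nn.Linear(backbone.fc.in_features, num_classes)
    return backbone

# Task split below is an assumption; the paper's pipeline has four models.
time_net    = make_condition_classifier(3)   # dawn/dusk, day, night
glare_net   = make_condition_classifier(2)   # glare / no glare
weather_net = make_condition_classifier(4)   # clear, rainy, snowy, foggy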


Author(s): Li Tang, Yunpeng Shi, Qing He, Adel W. Sadek, Chunming Qiao

This paper analyzes the performance of the Light Detection and Ranging (lidar) sensor in detecting pedestrians under different weather conditions. Lidar is a key sensor in autonomous vehicles and can provide high-resolution object information, so it is important to analyze its performance. In this study, an autonomous bus performed several pedestrian detection tests in a parking lot at the University at Buffalo. Comparing the pedestrian detection results on rainy days with those on sunny days shows that rain can cause unstable performance and even failures of lidar sensors to detect pedestrians in time. After analyzing the test data, three logit models are built to estimate the probability of lidar detection failure. Rainy weather plays an important role in degrading lidar detection performance; moreover, the distance between the vehicle and the pedestrian, as well as the autonomous vehicle's velocity, are also important. This paper suggests ways to improve lidar detection performance in autonomous vehicles.
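The sketch below shows a logit model in the spirit of the paper's analysis: failure probability as a function of rain, distance, and vehicle speed. The toy data and the use of scikit-learn's (regularized) logistic regression are our assumptions, not the paper's dataset or estimation procedure.

```python
# Hedged sketch of a logit model for lidar detection failure (toy data).
import numpy as np
from sklearn.linear_model import LogisticRegression

# Columns: rain (0/1), distance to pedestrian (m), vehicle speed (m/s).
X = np.array([[0, 10, 2.0], [1, 25, 4.5], [0, 15, 3.0], [1, 30, 5.0],
              [1, 12, 4.0], [1, 28, 3.5], [0, 20, 2.5], [0, 8, 1.5]])
y = np.array([0, 1, 0, 1, 0, 1, 0, 0])   # 1 = lidar failed to detect in time

model = LogisticRegression().fit(X, y)
p_fail = model.predict_proba([[1, 25.0, 4.0]])[0, 1]   # rainy, 25 m, 4 m/s
print(f"estimated failure probability: {p_fail:.2f}")
```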


This article considers modern fast architectures of detection neural networks and analyzes the structural peculiarities of each selected architecture. An experiment is carried out based on a potentially dangerous situation during autonomous vehicle movement: in the selected experimental environment, a set of architectures for the computer vision system of an autonomous vehicle is analyzed, the traffic safety of the autonomous vehicle is estimated under various weather conditions, and the computing time required to apply additional control and analysis algorithms is evaluated. The experimental results are analyzed with the aim of enabling a reasoned selection of neural network architectures for the object recognition needed to support autonomous vehicle traffic in varying conditions. A conclusion is drawn about the applicability of the considered neural network architectures under the conditions of a particular project.
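A minimal sketch of the kind of per-architecture timing comparison described here follows; the model list, input size, and timing loop are our assumptions, not the article's experimental setup.

```python
# Hedged sketch: compare inference latency of two detection architectures.
import time
import torch
from torchvision.models import detection

candidates = {
    "ssdlite": detection.ssdlite320_mobilenet_v3_large(weights="DEFAULT"),
    "fasterrcnn": detection.fasterrcnn_mobilenet_v3_large_fpn(weights="DEFAULT"),
}
image = [torch.rand(3, 480, 640)]   # one synthetic camera frame

for name, model in candidates.items():
    model.eval()
    with torch.no_grad():
        model(image)                          # warm-up pass
        t0 = time.perf_counter()
        for _ in range(10):
            model(image)
        dt = (time.perf_counter() - t0) / 10  # mean seconds per frame
    print(f"{name}: {dt * 1000:.1f} ms/frame")
```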


2020, Vol. 19(1), pp. 85-88
Author(s): A. S. J. Cervera, F. J. Alonso, F. S. García, A. D. Alvarez

Roundabouts provide safe and fast circulation as well as many environmental advantages, but drivers adopting unsafe behaviours while circulating through them may cause accidents. In this paper we propose a way of training an autonomous vehicle to behave in a human-like and safe way when entering a roundabout. By placing a number of cameras on our vehicle and processing their video feeds through a series of algorithms, including machine learning, we build a representation of the state of the surrounding environment. Then, we use another set of deep learning algorithms to analyze the data and determine the safest way of circulating through the roundabout given the current state of the environment, including nearby vehicles with their estimated positions, speeds and accelerations. By watching multiple attempts of a human entering a roundabout with both safe and unsafe behaviours, our second set of algorithms learns to mimic the human's good attempts and act in the same way, which is key to a safe implementation of autonomous vehicles. This work details the steps we took, from building the representation of our environment to acting on it, in order to attain safe entry into single-lane roundabouts.
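Learning to mimic a human's good attempts is, in essence, behavioural cloning. The sketch below shows that idea under stated assumptions: the state encoding, network size, and action space are illustrative, and the placeholder tensors stand in for logged demonstrations of safe roundabout entries.

```python
# Hedged behavioural-cloning sketch: map environment state -> driving action.
import torch
import torch.nn as nn

state_dim = 16    # assumed: positions/speeds/accelerations of nearby vehicles
action_dim = 2    # assumed: throttle, steering

policy = nn.Sequential(
    nn.Linear(state_dim, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, action_dim),
)
optim = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# Placeholder data; in practice these come from "good" human demonstrations.
states = torch.randn(256, state_dim)
actions = torch.randn(256, action_dim)

for _ in range(100):
    optim.zero_grad()
    loss = loss_fn(policy(states), actions)   # imitate the demonstrated action
    loss.backward()
    optim.step()
```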


Author(s): Jay Rodge, Swati Jaiswal

Deep learning and artificial intelligence (AI) have been trending these days due to their capability and the state-of-the-art results they provide. They have replaced some highly skilled professionals with neural-network-powered AI, also known as deep learning algorithms. Deep learning works mainly on neural networks. This chapter discusses the working of a neuron, the unit component of a neural network. There are numerous techniques that can be incorporated while designing a neural network, such as activation functions and training procedures, to improve its performance; these are explained in detail. Deep learning also has challenges, such as overfitting, which are difficult to avoid but can be overcome using the proper techniques and steps discussed in the chapter. The chapter will help academicians, researchers, and practitioners to further investigate deep learning and its applications in the autonomous vehicle industry.
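The unit component the chapter describes can be written in a few lines: a neuron computes a weighted sum of its inputs plus a bias and passes it through an activation function. The weights and inputs below are illustrative.

```python
# A single artificial neuron: weighted sum + bias, then an activation.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    return sigmoid(np.dot(w, x) + b)

x = np.array([0.5, -1.2, 3.0])   # inputs
w = np.array([0.4, 0.1, -0.6])   # learned weights
b = 0.2                          # bias
print(neuron(x, w, b))           # activation of the neuron
```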


2019, Vol. 2019, pp. 1-9
Author(s): Hai Wang, Xinyu Lou, Yingfeng Cai, Yicheng Li, Long Chen

Vehicle detection is one of the most important environment perception tasks for autonomous vehicles. Traditional vision-based vehicle detection methods are not accurate enough, especially for small and occluded targets, while light detection and ranging (lidar)-based methods are good at detecting obstacles but are time-consuming and have a low classification rate for different target types. To address these shortcomings and make full use of the depth information of lidar and the obstacle classification ability of vision, this work proposes a real-time vehicle detection algorithm that fuses vision and lidar point cloud information. Firstly, obstacles are detected by the grid projection method using the lidar point cloud. Then, the obstacles are mapped to the image to get several separated regions of interest (ROIs). After that, the ROIs are expanded based on a dynamic threshold and merged to generate the final ROI. Finally, a deep learning method named You Only Look Once (YOLO) is applied to the ROI to detect vehicles. Experimental results on the KITTI dataset demonstrate that the proposed algorithm has high detection accuracy and good real-time performance. Compared with detection based only on YOLO, the mean average precision (mAP) is increased by 17%.
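The projection step, mapping lidar obstacle points into the image to obtain ROIs before running the detector, can be sketched as below. The calibration matrices, the fixed ROI margin (the paper uses a dynamic threshold), and the detector call are our assumptions.

```python
# Hedged sketch: lidar points -> image ROI -> detector on the crop.
import numpy as np

def lidar_to_image(points_lidar, T_cam_lidar, K):
    """Project Nx3 lidar points into pixel coordinates.
    T_cam_lidar: 4x4 lidar->camera transform; K: 3x3 camera intrinsics."""
    pts = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    cam = (T_cam_lidar @ pts.T).T[:, :3]     # lidar frame -> camera frame
    cam = cam[cam[:, 2] > 0]                 # keep points in front of camera
    uv = (K @ cam.T).T
    return uv[:, :2] / uv[:, 2:3]            # perspective divide

def roi_from_points(uv, margin=10):
    """Bounding box around projected obstacle points, padded by a margin."""
    u0, v0 = uv.min(axis=0) - margin
    u1, v1 = uv.max(axis=0) + margin
    return int(u0), int(v0), int(u1), int(v1)

# u0, v0, u1, v1 = roi_from_points(lidar_to_image(obstacle_pts, T_cam_lidar, K))
# detections = yolo(image[v0:v1, u0:u1])     # detector applied to the ROI only
```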


Author(s): Keke Geng, Wei Zou, Guodong Yin, Yang Li, Zihao Zhou, ...

Environment perception is a basic and necessary technology for autonomous vehicles to ensure safe and reliable driving. Many studies have focused on ideal environments, while much less work has been done on the perception of low-observable targets, whose features may not be obvious in a complex environment. However, it is inevitable for autonomous vehicles to drive in conditions such as rain, snow and night-time, in which target features are not obvious and detection models trained on images with significant features fail to detect low-observable targets. This article studies efficient and intelligent recognition algorithms for low-observable targets in complex environments, focuses on developing an engineering method for dual-modal (color–infrared) image recognition of low-observable targets, and explores the applications of infrared and color imaging for an intelligent perception system in autonomous vehicles. A dual-modal deep neural network is established to fuse the color and infrared images and detect low-observable targets in dual-modal images. A manually labeled color–infrared image dataset of low-observable targets is built, and the deep neural network is trained to optimize its internal parameters so that the system is capable of both pedestrian and vehicle recognition in complex environments. The experimental results indicate that the dual-modal deep neural network performs better at low-observable target detection and recognition in complex environments than traditional methods.
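A two-branch fusion backbone in the spirit of the dual-modal network described here is sketched below. The backbone choice, fusion by feature concatenation, and treating the infrared input as a 3-channel tensor are all our assumptions; the paper's architecture may differ.

```python
# Hedged sketch of a dual-modal (colour + infrared) fusion classifier.
import torch
import torch.nn as nn
from torchvision import models

class DualModalNet(nn.Module):
    def __init__(self, num_classes=2):   # e.g. pedestrian, vehicle
        super().__init__()
        self.rgb_branch = models.resnet18(weights="DEFAULT")
        self.ir_branch = models.resnet18(weights="DEFAULT")
        self.rgb_branch.fc = nn.Identity()   # expose 512-d features
        self.ir_branch.fc = nn.Identity()
        self.head = nn.Linear(512 * 2, num_classes)  # fuse by concatenation

    def forward(self, rgb, ir):
        feats = torch.cat([self.rgb_branch(rgb), self.ir_branch(ir)], dim=1)
        return self.head(feats)

net = DualModalNet()
# Infrared replicated to 3 channels here for the pretrained backbone.
logits = net(torch.rand(1, 3, 224, 224), torch.rand(1, 3, 224, 224))
```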


2019, Vol. 1(3), pp. 177-184
Author(s): Chao Duan, Steffen Junginger, Jiahao Huang, Kairong Jin, Kerstin Thurow

Visual SLAM (Simultaneous Localization and Mapping) is a solution for achieving localization and mapping of robots simultaneously. Significant achievements have been made during the past decades, and geometry-based methods are becoming more and more successful in dealing with static environments; however, they still cannot handle challenging environments. With the great achievements of deep learning in the field of computer vision, there is a trend of applying deep learning methods to visual SLAM. In this paper, the latest research progress on deep learning applied to visual SLAM is reviewed. Outstanding research results in deep learning visual odometry and deep learning loop closure detection are summarized. Finally, future development directions of visual SLAM based on deep learning are discussed.
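One common pattern in the deep-learning loop-closure work this review covers is comparing learned frame embeddings against past keyframes. The sketch below illustrates that idea; the backbone, similarity measure, and threshold are our assumptions, not a method from the review.

```python
# Hedged illustration of embedding-based loop-closure detection.
import torch
import torch.nn.functional as F
from torchvision import models

backbone = models.resnet18(weights="DEFAULT")
backbone.fc = torch.nn.Identity()   # use 512-d features as the frame embedding
backbone.eval()

@torch.no_grad()
def embed(frame):                   # frame: 1x3xHxW tensor
    return F.normalize(backbone(frame), dim=1)

def is_loop_closure(current, keyframes, threshold=0.9):
    """Flag a loop closure if the current frame closely matches any keyframe."""
    e = embed(current)
    sims = [F.cosine_similarity(e, embed(k)).item() for k in keyframes]
    return max(sims, default=0.0) > threshold
```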


2021, Vol. 14(4), pp. 1-17
Author(s): Dilawar Ali, Steven Verstockt, Nico Van De Weghe

Rephotography is the process of recapturing a photograph of a location from the same perspective in which it was captured earlier. A rephotographed image is the best presentation for visualizing and studying the social changes of a location over time. Traditionally, only expert artists and photographers are capable of generating the rephotograph of a specific location. The manual editing or human-eye judgment used to generate rephotographs normally requires a lot of precision and effort and is not always accurate. In the era of computer science and deep learning, computer vision techniques make it easier and faster to perform precise operations on an image. Many research methodologies have been proposed for rephotography, but none of them is fully automatic: some require manual input by the user or need multiple images of the same location with 3D point cloud data, while others only offer suggestions to the user on how to perform rephotography. In historical records and archives, most of the time only one 2D image of a certain location can be found. Computational rephotography is challenging when only one image of a location, captured at a different timestamp, is available, because it is difficult to recover the accurate perspective of a single 2D historical image. Moreover, in building rephotography it is necessary to maintain alignment and regular shape. The features of a building may change over time, and in most cases it is not possible to use a feature detection algorithm to detect the key features. In this research paper, we propose a methodology for rephotographing house images by combining deep learning and traditional computer vision techniques. The purpose of this research is to rephotograph an image of the past based on a single image. This research will be helpful not only for computer scientists but also for history and cultural heritage scholars studying the social changes of a location during a specific time period, and it will allow users to go back in time to see how a specific place looked in the past. We have achieved good, fully automatic rephotographed results based on façade segmentation using only a single image.
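The final alignment step in a rephotography pipeline can be sketched as a perspective warp between corresponding façade corners. In the sketch below the corner points are supplied by hand and the filenames are placeholders; the paper derives such correspondences automatically from façade segmentation.

```python
# Hedged sketch: warp a modern photo into a historical perspective
# given four corresponding facade corners (illustrative values).
import cv2
import numpy as np

modern = cv2.imread("modern.jpg")      # placeholder filename
w, h = 600, 800                        # size of the historical image

src = np.float32([[120, 80], [510, 95], [500, 700], [110, 690]])  # modern corners
dst = np.float32([[100, 60], [520, 60], [520, 720], [100, 720]])  # historical corners

H = cv2.getPerspectiveTransform(src, dst)
rephoto = cv2.warpPerspective(modern, H, (w, h))
cv2.imwrite("rephotograph.jpg", rephoto)
```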

