Real-Time Semantic Segmentation of 3D Point Cloud for Autonomous Driving

Autonomous vehicles perceive objects through various sensors. Cameras, radar, and LiDAR are the sensors generally used, and each has its own characteristics: cameras provide a high-level understanding of a scene, radar provides weather-resistant distance perception, and LiDAR provides accurate distance measurement. Camera-based scene understanding has improved dramatically with the recent development of deep learning. In addition, technologies that emulate other sensors using a single sensor are being developed. Therefore, in this study, a LiDAR-based scene understanding method was developed using deep learning. Deep learning approaches to processing LiDAR data are mainly divided into point-based, projection-based, and voxel-based methods. This study applies a projection method to achieve real-time performance. Convolutional neural networks designed for conventional camera images can be readily applied to the projected data. In addition, an adaptive breakpoint detector, a method used for conventional 2D LiDAR data, is used to correct the misclassification caused by the conversion from 2D into 3D. The results of this study are evaluated through a comparison with other methods.
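The abstract does not say which projection is used. One common choice for projection-based LiDAR methods is a spherical projection onto a 2D range image, which a standard image CNN can then consume. The following is a minimal sketch under that assumption; the function name, image size, and vertical field of view are illustrative values, not taken from the paper.

```python
import math

def spherical_project(points, width=1024, height=64,
                      fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project (x, y, z) LiDAR points onto a height x width range image.

    Each pixel holds the range of the closest point that falls into it,
    or None if no point does. All default parameters are assumptions.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down  # total vertical field of view

    image = [[None] * width for _ in range(height)]
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # skip degenerate returns at the sensor origin
        yaw = math.atan2(y, x)    # horizontal angle in [-pi, pi]
        pitch = math.asin(z / r)  # vertical angle

        # Map the angles to fractional pixel coordinates.
        u = 0.5 * (1.0 - yaw / math.pi) * width
        v = (1.0 - (pitch - fov_down) / fov) * height

        # Clamp to the image bounds.
        col = min(width - 1, max(0, int(u)))
        row = min(height - 1, max(0, int(v)))

        # Keep the nearest return when several points share a pixel.
        if image[row][col] is None or r < image[row][col]:
            image[row][col] = r
    return image
```

Per-pixel class labels predicted on such an image have to be mapped back to the 3D points, and points that share a pixel but lie at different depths can inherit the wrong label. This is the kind of misclassification the abstract attributes to the 2D-to-3D conversion.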

On the Performance of One-Stage and Two-Stage Object Detectors in Autonomous Vehicles Using Camera Data

Remote Sensing, 2020, Vol 13 (1), pp. 89. DOI: 10.3390/rs13010089

Author(s): Manuel Carranza-García, Jesús Torres-Mateo, Pedro Lara-Benítez, Jorge García-Gutiérrez

Keyword(s): Deep Learning, Real Time, Autonomous Vehicles, Remote Sensing Data, Autonomous Driving, Two Stage, Detection Systems, One Stage, Time Requirements, Speed Accuracy

Object detection using remote sensing data is a key task of the perception systems of self-driving vehicles. While many generic deep learning architectures have been proposed for this problem, there is little guidance on their suitability when using them in a particular scenario such as autonomous driving. In this work, we aim to assess the performance of existing 2D detection systems on a multi-class problem (vehicles, pedestrians, and cyclists) with images obtained from the on-board camera sensors of a car. We evaluate several one-stage (RetinaNet, FCOS, and YOLOv3) and two-stage (Faster R-CNN) deep learning meta-architectures under different image resolutions and feature extractors (ResNet, ResNeXt, Res2Net, DarkNet, and MobileNet). These models are trained using transfer learning and compared in terms of both precision and efficiency, with special attention to the real-time requirements of this context. For the experimental study, we use the Waymo Open Dataset, which is the largest existing benchmark. Despite the rising popularity of one-stage detectors, our findings show that two-stage detectors still provide the most robust performance. Faster R-CNN models outperform one-stage detectors in accuracy, being also more reliable in the detection of minority classes. Faster R-CNN Res2Net-101 achieves the best speed/accuracy tradeoff but needs lower resolution images to reach real-time speed. Furthermore, the anchor-free FCOS detector is a slightly faster alternative to RetinaNet, with similar precision and lower memory usage.

Real-Time Semantic Segmentation with Dual Encoder and Self-Attention Mechanism for Autonomous Driving

Sensors, 2021, Vol 21 (23), pp. 8072. DOI: 10.3390/s21238072

Author(s): Yu-Bang Chang, Chieh Tsai, Chang-Hong Lin, Poki Chen

Keyword(s): Deep Learning, Real Time, Network Architecture, Semantic Segmentation, Autonomous Driving, Attention Mechanism, Trade Off, Segmentation Methods, General Semantic, Deep Learning Model

As the techniques of autonomous driving become increasingly valued and universal, real-time semantic segmentation has become very popular and challenging in the field of deep learning and computer vision in recent years. However, in order to apply the deep learning model to edge devices accompanying sensors on vehicles, we need to design a structure that has the best trade-off between accuracy and inference time. In previous works, several methods sacrificed accuracy to obtain a faster inference time, while others aimed to find the best accuracy under the condition of real time. Nevertheless, the accuracies of previous real-time semantic segmentation methods still have a large gap compared to general semantic segmentation methods. As a result, we propose a network architecture based on a dual encoder and a self-attention mechanism. Compared with preceding works, we achieved a 78.6% mIoU with a speed of 39.4 FPS with a 1024 × 2048 resolution on a Cityscapes test submission.
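The accuracy figure above is a mean intersection over union (mIoU). As a rough illustration of the metric, the sketch below computes it from two flat lists of per-pixel class labels. The function name and the input format are assumptions made for this example. It does not reproduce the official Cityscapes evaluation, which accumulates counts over the whole test set and excludes ignored labels.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union for two equal-length label sequences.

    Classes that appear in neither sequence are left out of the mean.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0
```

For example, with predictions `[0, 0, 1, 1]` and ground truth `[0, 1, 1, 1]`, class 0 has an IoU of 1/2 and class 1 has an IoU of 2/3, so the mean is 7/12.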

Development of environment design support mixed reality system capable of environment estimation using deep learning

Impact, 2020, Vol 2020 (2), pp. 9-11. DOI: 10.21820/23987073.2020.2.9

Author(s): Tomohiro Fukuda

Keyword(s): Deep Learning, Real Time, Computer Games, Construction Projects, Mixed Reality, Semantic Segmentation, Environment Design, Aviation Training, Architecture And Design, World Environment

Mixed reality (MR) is rapidly becoming a vital tool, not just in gaming, but also in education, medicine, construction and environmental management. The term refers to systems in which computer-generated content is superimposed over objects in a real-world environment across one or more sensory modalities. Although most of us have heard of the use of MR in computer games, it also has applications in military and aviation training, as well as tourism, healthcare and more. In addition, it has the potential for use in architecture and design, where buildings can be superimposed on existing locations to render plans in 3D. However, one major challenge that remains in MR development is real-time occlusion, that is, hiding 3D virtual objects behind real objects. Dr Tomohiro Fukuda, who is based at the Division of Sustainable Energy and Environmental Engineering, Graduate School of Engineering at Osaka University in Japan, is an expert in this field. Researchers led by Dr Fukuda are tackling the issue of occlusion in MR. They are currently developing an MR system that achieves real-time occlusion by using deep-learning-based semantic segmentation, with the aim of simulating outdoor landscape designs. This methodology can be used to automatically estimate the visual environment before and after construction projects.

Collaborative Autonomous Driving—A Survey of Solution Approaches and Future Challenges

Sensors, 2021, Vol 21 (11), pp. 3783. DOI: 10.3390/s21113783

Author(s): Sumbal Malik, Manzoor Ahmed Khan, Hesham El-Sayed

Keyword(s): Autonomous Vehicles, Communication Technologies, Autonomous Driving, Use Cases, Cooperative Driving, Vehicle To Vehicle, Future Challenges, Vehicle To Infrastructure, High Level, Intersection Management

Sooner than expected, roads will be populated with a plethora of connected and autonomous vehicles serving diverse mobility needs. Rather than being stand-alone, vehicles will be required to cooperate and coordinate with each other to execute mobility tasks properly, which is referred to as cooperative driving. Cooperative driving leverages Vehicle to Vehicle (V2V) and Vehicle to Infrastructure (V2I) communication technologies to carry out cooperative functionalities: (i) cooperative sensing and (ii) cooperative maneuvering. To better equip readers with background knowledge on the topic, we first provide a detailed taxonomy section describing the underlying concepts and various aspects of cooperation in cooperative driving. In this survey, we review the current solution approaches in cooperation for autonomous vehicles, based on various cooperative driving applications, i.e., smart car parking, lane change and merge, intersection management, and platooning. The role and functionality of such cooperation become more crucial in platooning use-cases, which is why we cover platooning use-cases in more detail and focus on one of their challenges, electing a leader in high-level platooning. Finally, we highlight a crucial range of research gaps and open challenges that need to be addressed before cooperative autonomous vehicles hit the roads. We believe that this survey will assist researchers in better understanding vehicular cooperation, its various scenarios, solution approaches, and challenges.

Deep-Framework: A Distributed, Scalable, and Edge-Oriented Framework for Real-Time Analysis of Video Streams

Sensors, 2021, Vol 21 (12), pp. 4045. DOI: 10.3390/s21124045

Author(s): Alessandro Sassu, Jose Francisco Saenz-Cogollo, Maurizio Agelli

Keyword(s): Deep Learning, Real Time, Video Data, Video Analytics, Web Based, Real Time Analysis, Open Source Framework, Cluster Configuration, Time Requirements, High Level

Edge computing is the best approach for meeting the exponential demand and the real-time requirements of many video analytics applications. Since most of the recent advances in extracting information from images and video rely on computation-heavy deep learning algorithms, there is a growing need for solutions that allow the deployment and use of new models on scalable and flexible edge architectures. In this work, we present Deep-Framework, a novel open source framework for developing edge-oriented real-time video analytics applications based on deep learning. Deep-Framework has a scalable multi-stream architecture based on Docker and hides from the user the complexity of cluster configuration, service orchestration, and GPU resource allocation. It provides Python interfaces for integrating deep learning models developed with the most popular frameworks, and high-level APIs based on standard HTTP and WebRTC interfaces for consuming the extracted video data on clients running in browsers or on any other web-based platform.

Computer vision based obstacle detection and target tracking for autonomous vehicles

MATEC Web of Conferences, 2021, Vol 336, pp. 07004. DOI: 10.1051/matecconf/202133607004

Author(s): Ruoyu Fang, Cheng Cai

Keyword(s): Neural Network, Computer Vision, Deep Learning, Target Tracking, Real Time, Autonomous Vehicles, Autonomous Vehicle, Obstacle Detection, Pid Algorithm, Deep Learning Neural Network

Obstacle detection and target tracking are two major issues for intelligent autonomous vehicles. This paper proposes a new scheme to achieve target tracking and real-time obstacle detection based on computer vision. A ResNet-18 deep learning neural network is used for obstacle detection, and a YOLOv3 deep learning neural network is used for real-time target tracking. The two trained models can be deployed on an autonomous vehicle equipped with an NVIDIA Jetson Nano board. Using its camera, the vehicle moves to avoid obstacles and follow tracked targets. Adjusting the steering and movement of the vehicle with the PID algorithm while it moves helps the proposed vehicle achieve stable and precise tracking.
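The abstract names the PID algorithm but gives neither the gains nor the error signal. The sketch below shows a generic discrete PID controller, assuming the error is the horizontal offset of the tracked target from the image center. The class name, gains, and time step are all hypothetical.

```python
class PID:
    """Textbook discrete PID controller (illustrative, not the paper's code)."""

    def __init__(self, kp, ki, kd):
        self.kp = kp
        self.ki = ki
        self.kd = kd
        self.integral = 0.0
        self.prev_error = None  # no derivative term on the first update

    def update(self, error, dt):
        """Return the control output for the current error and time step dt."""
        if dt <= 0:
            raise ValueError("dt must be positive")
        self.integral += error * dt
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)


# Hypothetical usage: steer toward a target seen 40 pixels right of center.
steering_pid = PID(kp=0.01, ki=0.001, kd=0.002)
steering_command = steering_pid.update(error=40.0, dt=0.05)
```

The proportional term reacts to the current offset, the integral term removes steady-state drift, and the derivative term damps overshoot, which together help produce the stable tracking the abstract describes.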

Fast Drivable Areas Estimation with Multi-Task Learning for Real-Time Autonomous Driving Assistant

Applied Sciences, 2021, Vol 11 (22), pp. 10713. DOI: 10.3390/app112210713

Author(s): Dong-Gyu Lee

Keyword(s): Real Time, Computational Efficiency, Autonomous Driving, Learning Approaches, Practical Applications, Task Learning, Lane Line, High Level, Time Autonomous

Autonomous driving is a safety-critical application that requires high-level scene understanding through computer vision with real-time inference. In this study, we focus on computational efficiency, an important factor for practical applications, by improving the running time and performing multiple tasks simultaneously. We propose a fast and accurate multi-task learning-based architecture for joint segmentation of the drivable area and lane lines and classification of the scene. An encoder-decoder architecture efficiently handles input frames through a shared representation. Generalization and regularization across the different tasks improve the comprehensive understanding of the driving environment. The proposed method learns end-to-end through multi-task learning on the challenging Berkeley Deep Drive dataset and shows its robustness on three tasks in autonomous driving. Experimental results show that the proposed method outperforms other multi-task learning approaches in both speed and accuracy. The method ran at over 93.81 fps at inference, enabling real-time execution.

Automated Testing of Ultrawideband Positioning for Autonomous Driving

Journal of Robotics, 2020, Vol 2020, pp. 1-15. DOI: 10.1155/2020/9345360

Author(s): Benjamin Vedder, Bo Joel Svensson, Jonny Vinter, Magnus Jonsson

Keyword(s): Fault Tolerance, Real Time, Autonomous Vehicles, Fault Injection, Autonomous Driving, Automated Testing, Test Cases, Positioning System, Novel Approach, Driving Model

Autonomous vehicles need accurate and dependable positioning, and these systems need to be tested extensively. We have evaluated positioning based on ultrawideband (UWB) ranging with our self-driving model car using a highly automated approach. Random drivable trajectories were generated, and the UWB position was compared against the Real-Time Kinematic Satellite Navigation (RTK-SN) positioning system with which our model car is also equipped. Fault injection was used to study the fault tolerance of the UWB positioning system. The challenges addressed are automatically generating test cases for real-time hardware, restoring the state between tests, and maintaining safety by preventing collisions. We were able to automatically generate and carry out hundreds of experiments on the model car in real time and rerun them consistently with and without fault injection enabled. We thereby demonstrate a novel approach to automated testing on complex real-time hardware.

Business Applications of Deep Learning

Deep Learning and Neural Networks, 2020, pp. 942-964. DOI: 10.4018/978-1-7998-0414-7.ch052

Author(s): Armando Vieira

Keyword(s): Deep Learning, Language Processing, Video Processing, Autonomous Vehicles, Video Annotation, Business Applications, Personal Assistants, Efficient Learning, High Level, Almost All

Deep Learning (DL) took Artificial Intelligence (AI) by storm and has infiltrated business at an unprecedented rate. Access to vast amounts of data, extensive computational power, and a new wave of efficient learning algorithms helped Artificial Neural Networks to achieve state-of-the-art results in almost all AI challenges. DL is the cornerstone technology behind products for image recognition and video annotation, voice recognition, personal assistants, automated translation and autonomous vehicles. DL works similarly to the brain by extracting high-level, complex abstractions from data in a hierarchical and discriminative or generative way. The implications of DL-supported AI for business are tremendous, shaking many industries to their foundations. In this chapter, I present the most significant algorithms and applications, including Natural Language Processing (NLP), image and video processing and finance.

Road-Aware Trajectory Prediction for Autonomous Driving on Highways

Sensors, 2020, Vol 20 (17), pp. 4703. DOI: 10.3390/s20174703

Author(s): Yookhyun Yoon, Taeyeon Kim, Ho Lee, Jahnghyon Park

Keyword(s): Deep Learning, Autonomous Vehicles, Prediction Method, Autonomous Driving, Structural Constraints, High Definition, Trajectory Prediction, The Road, Road Geometry, Efficient Learning

To drive safely and comfortably, autonomous vehicles need long-term trajectory prediction of surrounding vehicles. Deep-learning-based approaches have previously been proposed to handle the uncertain nature of trajectory prediction. An on-road vehicle must obey road geometry, i.e., it should run within the constraint of the road shape. Herein, we present a novel road-aware trajectory prediction method that combines high-definition maps with a deep learning network. We developed a data-efficient learning framework for the trajectory prediction network in the curvilinear coordinate system of the road, together with a lane assignment for the surrounding vehicles. We then proposed a novel output-constrained sequence-to-sequence trajectory prediction network to incorporate the structural constraints of the road. Our method uses these structural constraints as prior knowledge for the prediction network. This prior knowledge is not only used as an input to the trajectory prediction network, but is also included in the constrained loss function of the maneuver recognition network. Accordingly, the proposed method can predict a feasible and realistic driver intention and trajectory. Our method has been evaluated using a real traffic dataset, and the results show that it is data-efficient and can predict reasonable trajectories at merging sections.
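A curvilinear road coordinate system expresses a position as a distance along the road and a signed lateral offset from it. The abstract does not describe how the coordinates are computed, so the sketch below is only one simple way to do it: it assumes the road centerline is given as a polyline of (x, y) vertices, as might be extracted from a high-definition map. The function name and the sign convention (positive offset to the left of the travel direction) are assumptions.

```python
import math

def to_curvilinear(point, centerline):
    """Convert a Cartesian point to (s, d) relative to a polyline centerline.

    s is the arc length along the centerline to the closest point on it;
    d is the signed lateral distance (positive to the left of travel).
    """
    px, py = point
    best = None     # (distance, s, d) of the closest segment so far
    s_start = 0.0   # arc length at the start of the current segment

    for (x0, y0), (x1, y1) in zip(centerline, centerline[1:]):
        dx, dy = x1 - x0, y1 - y0
        seg_len = math.hypot(dx, dy)
        if seg_len == 0.0:
            continue  # skip duplicate vertices

        # Parameter of the orthogonal projection, clamped to the segment.
        t = ((px - x0) * dx + (py - y0) * dy) / (seg_len * seg_len)
        t = min(1.0, max(0.0, t))
        cx, cy = x0 + t * dx, y0 + t * dy
        dist = math.hypot(px - cx, py - cy)

        # The 2D cross product tells which side of the segment the point is on.
        cross = dx * (py - y0) - dy * (px - x0)
        d = dist if cross >= 0 else -dist

        if best is None or dist < best[0]:
            best = (dist, s_start + t * seg_len, d)
        s_start += seg_len

    if best is None:
        raise ValueError("centerline needs at least one non-degenerate segment")
    return best[1], best[2]
```

In these coordinates a vehicle following a curved lane moves mostly along s with a nearly constant d, so the road shape is factored out of the data, which is one plausible reason such a representation helps data efficiency.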
