Multi-objective Evaluation of Deep Learning Based Semantic Segmentation for Autonomous Driving Systems

Author(s):  
Cynthia Olvera ◽  
Yoshio Rubio ◽  
Oscar Montiel


Sensors ◽
2021 ◽  
Vol 21 (23) ◽  
pp. 8072
Author(s):  
Yu-Bang Chang ◽  
Chieh Tsai ◽  
Chang-Hong Lin ◽  
Poki Chen

As autonomous driving techniques become increasingly valued and widespread, real-time semantic segmentation has become a popular and challenging topic in deep learning and computer vision in recent years. However, to deploy deep learning models on the edge devices that accompany vehicle sensors, we need to design a structure with the best trade-off between accuracy and inference time. In previous works, several methods sacrificed accuracy to obtain a faster inference time, while others aimed for the best accuracy under real-time constraints. Nevertheless, the accuracy of previous real-time semantic segmentation methods still lags far behind that of general semantic segmentation methods. We therefore propose a network architecture based on a dual encoder and a self-attention mechanism. Compared with preceding works, we achieved 78.6% mIoU at 39.4 FPS on 1024 × 2048 inputs in a Cityscapes test submission.
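To make the dual-encoder pattern concrete, here is a minimal PyTorch sketch of the general design the abstract describes: a shallow high-resolution spatial branch, a deep low-resolution context branch with a self-attention block applied where the feature map is small, and a fusion head. All layer sizes and the fusion scheme are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttention2d(nn.Module):
    """Non-local style self-attention over spatial positions."""
    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, 1)
        self.key = nn.Conv2d(channels, channels // 8, 1)
        self.value = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)  # (B, HW, C/8)
        k = self.key(x).flatten(2)                    # (B, C/8, HW)
        attn = torch.softmax(q @ k, dim=-1)           # (B, HW, HW)
        v = self.value(x).flatten(2)                  # (B, C, HW)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                   # residual connection

class DualEncoderSeg(nn.Module):
    def __init__(self, num_classes=19):
        super().__init__()
        # Spatial branch: mild downsampling, preserves boundary detail.
        self.spatial = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True))
        # Context branch: aggressive downsampling for a large receptive
        # field; attention is applied here, where the map is cheap to attend.
        self.context = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=4, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=4, padding=1), nn.ReLU(inplace=True),
            SelfAttention2d(128))
        self.head = nn.Conv2d(256, num_classes, 1)

    def forward(self, x):
        s = self.spatial(x)                           # 1/4 resolution
        c = self.context(x)                           # 1/16 resolution
        c = F.interpolate(c, size=s.shape[2:], mode="bilinear",
                          align_corners=False)
        logits = self.head(torch.cat([s, c], dim=1))  # fuse both branches
        return F.interpolate(logits, size=x.shape[2:], mode="bilinear",
                             align_corners=False)
```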


Author(s):  
Yang Zhao ◽  
Wei Tian ◽  
Hong Cheng

With the fast development of deep learning models in the field of autonomous driving, research on the uncertainty estimation of deep learning models has also prevailed. Herein, a pyramid Bayesian deep learning method is proposed for the model uncertainty evaluation of semantic segmentation. Semantic segmentation is one of the most important perception problems in understanding the visual scene, which is critical for autonomous driving. This study aims to optimize Bayesian SegNet for uncertainty evaluation. The paper first simplifies the network structure of Bayesian SegNet by reducing the number of MC-Dropout layers and then introduces the pyramid pooling module to improve the performance of Bayesian SegNet. mIoU and mPAvPU are used as evaluation metrics to test the proposed method on the public Cityscapes dataset. The experimental results show that the proposed method improves the sampling effect of Bayesian SegNet, shortens the sampling time, and improves the network performance.
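A minimal PyTorch sketch of the MC-Dropout sampling loop that Bayesian SegNet relies on: dropout layers stay stochastic at test time, the network is sampled T times, and the per-pixel predictive entropy serves as the uncertainty map. Fewer dropout layers and fewer samples directly shorten this loop, which is the bottleneck the paper targets. The helper below assumes a generic segmentation model containing nn.Dropout or nn.Dropout2d layers; it is not the paper's exact procedure.

```python
import torch

@torch.no_grad()
def mc_dropout_predict(model, image, num_samples=20):
    model.eval()
    # Re-enable only the dropout layers so sampling remains stochastic.
    for m in model.modules():
        if isinstance(m, (torch.nn.Dropout, torch.nn.Dropout2d)):
            m.train()
    probs = torch.stack([
        torch.softmax(model(image), dim=1) for _ in range(num_samples)
    ])                                     # (T, B, C, H, W)
    mean_probs = probs.mean(dim=0)         # Monte Carlo estimate of p(y|x)
    entropy = -(mean_probs * (mean_probs + 1e-12).log()).sum(dim=1)
    return mean_probs.argmax(dim=1), entropy  # labels, per-pixel uncertainty
```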


Electronics ◽  
2021 ◽  
Vol 10 (16) ◽  
pp. 1960
Author(s):  
Dongwan Kang ◽  
Anthony Wong ◽  
Banghyon Lee ◽  
Jungha Kim

Autonomous vehicles perceive objects through various sensors. Cameras, radar, and LiDAR are generally used as vehicle sensors, each of which has its own characteristics. For example, cameras are used for high-level scene understanding, radar is applied to weather-resistant distance perception, and LiDAR is used for accurate distance recognition. The ability of a camera to understand a scene has increased dramatically with the recent development of deep learning. In addition, technologies that emulate other sensors using a single sensor are being developed. Therefore, in this study, a LiDAR data-based scene understanding method was developed through deep learning. Deep learning approaches to LiDAR data are mainly divided into point-based, projection-based, and voxel-based methods. The purpose of this study is to apply a projection method to secure real-time performance. Convolutional neural network methods developed for conventional camera images can be readily applied to the projected data. In addition, an adaptive break point detector method used for conventional 2D LiDAR information is utilized to resolve the misclassification caused by the conversion from 2D into 3D. The results of this study are evaluated through a comparison with other technologies.
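For reference, a minimal NumPy sketch of the projection idea described above: the 3D point cloud is mapped onto a 2D range image via spherical coordinates, after which any 2D CNN can segment it. The field-of-view values are assumptions typical of a 64-beam sensor, not the paper's exact configuration.

```python
import numpy as np

def spherical_projection(points, h=64, w=1024,
                         fov_up_deg=3.0, fov_down_deg=-25.0):
    """points: (N, 3) array of x, y, z. Returns an (h, w) range image."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)                       # azimuth in [-pi, pi]
    pitch = np.arcsin(z / np.maximum(r, 1e-8))   # elevation angle
    fov_up, fov_down = np.radians(fov_up_deg), np.radians(fov_down_deg)
    u = (0.5 * (1.0 - yaw / np.pi) * w).astype(int)              # column
    v = ((fov_up - pitch) / (fov_up - fov_down) * h).astype(int)  # row
    u, v = np.clip(u, 0, w - 1), np.clip(v, 0, h - 1)
    image = np.zeros((h, w))
    # Write farthest points first so the closest return per pixel wins.
    order = np.argsort(-r)
    image[v[order], u[order]] = r[order]
    return image
```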


Author(s):  
Desire Mulindwa Burume ◽  
Shengzhi Du

Beyond semantic segmentation, 3D instance segmentation (a process that delineates objects of interest and also classifies them into a set of categories) is gaining more and more interest among researchers, since numerous computer vision applications need accurate segmentation processes (autonomous driving, indoor navigation, and even virtual or augmented reality systems…). This paper gives an overview and a technical comparison of the existing deep learning architectures for handling unstructured Euclidean data in the rapidly developing field of 3D instance segmentation. First, the authors divide the 3D point cloud-based instance segmentation techniques into two major categories: proposal-based methods and proposal-free methods. Then, they introduce and compare the datasets most commonly used for 3D instance segmentation. Furthermore, they compare and analyze the performance of these techniques (speed, accuracy, response to noise…). Finally, the paper reviews possible future directions of deep learning for 3D sensor-based information and provides insight into the most promising areas for prospective research.
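As an illustration of the proposal-free paradigm the survey describes, here is a minimal sketch: a network predicts per-point semantic labels and an offset toward each point's instance center, and a clustering step groups the shifted points into instances. The offsets, the DBSCAN clusterer, and its parameters are illustrative assumptions standing in for whatever embedding and grouping a given method uses.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def proposal_free_instances(points, offsets, semantic_labels, thing_class):
    """points, offsets: (N, 3); semantic_labels: (N,).
    Returns per-point instance ids (-1 for points outside the class)."""
    instance_ids = np.full(len(points), -1)
    mask = semantic_labels == thing_class    # cluster one class at a time
    if mask.sum() == 0:
        return instance_ids
    shifted = points[mask] + offsets[mask]   # points vote for their center
    clusters = DBSCAN(eps=0.3, min_samples=10).fit_predict(shifted)
    instance_ids[mask] = clusters            # DBSCAN noise stays -1
    return instance_ids
```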


Electronics ◽  
2021 ◽  
Vol 10 (4) ◽  
pp. 471
Author(s):  
Zhiyang Guo ◽  
Yingping Huang ◽  
Xing Hu ◽  
Hongjian Wei ◽  
Baigan Zhao

As a prerequisite for autonomous driving, scene understanding has attracted extensive research. With the rise of convolutional neural network (CNN)-based deep learning techniques, research on scene understanding has achieved significant progress. This paper aims to provide a comprehensive survey of deep learning-based approaches for scene understanding in autonomous driving. We categorize these works into four work streams: object detection, full-scene semantic segmentation, instance segmentation, and lane line segmentation. We discuss and analyze these works according to their characteristics, advantages and disadvantages, and basic frameworks. We also summarize the benchmark datasets and evaluation criteria used in the research community and compare the performance of some of the latest works. Lastly, we summarize the review and discuss the future challenges of the research domain.
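Since most of the surveyed segmentation works report mean intersection-over-union (mIoU), a short NumPy sketch of that criterion may be useful: per-class IoU is computed from a confusion matrix and averaged over the classes that actually occur.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """pred, target: integer label arrays of the same shape."""
    conf = np.bincount(
        num_classes * target.reshape(-1) + pred.reshape(-1),
        minlength=num_classes ** 2,
    ).reshape(num_classes, num_classes)      # rows: truth, cols: prediction
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    valid = union > 0                        # skip classes absent from both
    return (intersection[valid] / union[valid]).mean()
```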


Impact ◽  
2020 ◽  
Vol 2020 (2) ◽  
pp. 9-11
Author(s):  
Tomohiro Fukuda

Mixed reality (MR) is rapidly becoming a vital tool, not just in gaming, but also in education, medicine, construction and environmental management. The term refers to systems in which computer-generated content is superimposed over objects in a real-world environment across one or more sensory modalities. Although most of us have heard of the use of MR in computer games, it also has applications in military and aviation training, as well as tourism, healthcare and more. In addition, it has the potential for use in architecture and design, where buildings can be superimposed in existing locations to render 3D generations of plans. However, one major challenge that remains in MR development is the issue of real-time occlusion. This refers to hiding 3D virtual objects behind real objects. Dr Tomohiro Fukuda, who is based at the Division of Sustainable Energy and Environmental Engineering, Graduate School of Engineering at Osaka University in Japan, is an expert in this field. Researchers led by Dr Fukuda are tackling the issue of occlusion in MR. They are currently developing an MR system that achieves real-time occlusion by harnessing deep learning, using a semantic segmentation technique for outdoor landscape design simulation. This methodology can be used to automatically estimate the visual environment before and after construction projects.
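The core mechanism, compositing virtual content behind real objects identified by a segmentation network, can be sketched in a few lines of NumPy. The array names and the set of occluder classes are assumptions for illustration, not the group's implementation.

```python
import numpy as np

def composite_with_occlusion(camera_rgb, virtual_rgb, virtual_alpha,
                             seg_labels, occluder_classes):
    """camera_rgb, virtual_rgb: (H, W, 3); virtual_alpha: (H, W) in [0, 1];
    seg_labels: (H, W) integer class map from the segmentation network."""
    occluded = np.isin(seg_labels, occluder_classes)   # real-object mask
    # Zero the virtual layer's alpha wherever a real occluder was detected,
    # so the camera image shows through and virtual objects sit behind it.
    alpha = np.where(occluded, 0.0, virtual_alpha)[..., None]
    blended = alpha * virtual_rgb + (1.0 - alpha) * camera_rgb
    return blended.astype(camera_rgb.dtype)
```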


IEEE Access ◽  
2020 ◽  
pp. 1-1
Author(s):  
Jeremy M. Webb ◽  
Duane D. Meixner ◽  
Shaheeda A. Adusei ◽  
Eric C. Polley ◽  
Mostafa Fatemi ◽  
...  

Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 437
Author(s):  
Yuya Onozuka ◽  
Ryosuke Matsumi ◽  
Motoki Shino

Detection of traversable areas is essential to the navigation of autonomous personal mobility systems in unknown pedestrian environments. However, traffic rules may recommend or require driving in specified areas, such as sidewalks, in environments where roadways and sidewalks coexist. Therefore, it is necessary for such autonomous mobility systems to estimate the areas that are mechanically traversable and recommended by traffic rules, and to navigate based on this estimation. In this paper, we propose a method for weakly supervised recommended traversable area segmentation in environments with no edges, using automatically labeled images based on paths selected by humans. This approach is based on the idea that a human-selected driving path accurately reflects both mechanical traversability and human understanding of traffic rules and visual information. In addition, we propose a data augmentation method and a loss-weighting method for detecting the appropriate recommended traversable area from a single human-selected path. Evaluation showed that the proposed learning methods are effective for recommended traversable area detection, and that weakly supervised semantic segmentation using human-selected path information is useful in environments with no edges.
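A minimal PyTorch sketch of the loss-weighting idea: pixels on the human-selected path are strong positives, pixels clearly far from it are trusted negatives, and everything in between is down-weighted so that a single driven path does not over-constrain the full traversable area. The weighting scheme shown is an assumption, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def path_weighted_bce(logits, path_mask, far_mask, weak_weight=0.1):
    """logits: (B, 1, H, W); path_mask, far_mask: binary (B, 1, H, W)
    marking the human-selected path and clearly off-path regions."""
    target = path_mask.float()                     # path pixels: traversable
    weight = torch.full_like(target, weak_weight)  # uncertain pixels
    weight[path_mask.bool()] = 1.0                 # trust the selected path
    weight[far_mask.bool()] = 1.0                  # trust far-off regions
    return F.binary_cross_entropy_with_logits(logits, target, weight=weight)
```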

