Real-Time Semantic Image Segmentation with Deep Learning for Autonomous Driving: A Survey

2021 ◽  
Vol 11 (19) ◽  
pp. 8802
Author(s):  
Ilias Papadeas ◽  
Lazaros Tsochatzidis ◽  
Angelos Amanatiadis ◽  
Ioannis Pratikakis

Semantic image segmentation for autonomous driving is a challenging task because it requires both effectiveness and efficiency. Recent developments in deep learning have delivered substantial gains in accuracy. In this paper, we present a comprehensive overview of state-of-the-art semantic image segmentation methods that use deep learning techniques and aim to operate in real time, so that they can efficiently support an autonomous driving scenario. To this end, the overview places particular emphasis on approaches that reduce inference time, analyses the existing methods in terms of their end-to-end functionality, and provides a comparative study that relies on a consistent evaluation framework. Finally, a discussion is presented that offers key insights into current trends and future research directions in real-time semantic image segmentation with deep learning for autonomous driving.
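A minimal sketch of the kind of real-time evaluation the survey is concerned with: timing the forward pass of a segmentation network and reporting mean latency and frames per second. The model choice (torchvision's MobileNetV3-backed DeepLab v3), input resolution, and run counts are illustrative assumptions, not taken from the survey.

```python
# Hedged sketch: measuring inference latency of a segmentation model,
# the kind of check a real-time autonomous-driving setting requires.
import time
import torch
from torchvision.models.segmentation import deeplabv3_mobilenet_v3_large

device = "cuda" if torch.cuda.is_available() else "cpu"
model = deeplabv3_mobilenet_v3_large(num_classes=19).to(device).eval()

# A Cityscapes-like input resolution (batch of 1, 3-channel RGB), values are dummies.
x = torch.randn(1, 3, 512, 1024, device=device)
runs = 50

with torch.no_grad():
    for _ in range(10):                       # warm-up iterations
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(runs):
        model(x)
    if device == "cuda":
        torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

print(f"mean latency: {1000 * elapsed / runs:.1f} ms (~{runs / elapsed:.1f} FPS)")
```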

Sensors ◽  
2020 ◽  
Vol 20 (8) ◽  
pp. 2272 ◽  
Author(s):  
Faisal Khan ◽  
Saqib Salahuddin ◽  
Hossein Javidnia

Monocular depth estimation from Red-Green-Blue (RGB) images is a well-studied, ill-posed problem in computer vision that has been investigated intensively over the past decade using Deep Learning (DL) approaches. Recent approaches to monocular depth estimation mostly rely on Convolutional Neural Networks (CNNs). Estimating depth from two-dimensional images plays an important role in various applications, including scene reconstruction, 3D object detection, robotics and autonomous driving. This survey provides a comprehensive overview of this research topic, including the problem representation and a short description of traditional methods for depth estimation. Relevant datasets and 13 state-of-the-art deep learning-based approaches for monocular depth estimation are reviewed, evaluated and discussed. We conclude the paper with a perspective on open challenges in monocular depth estimation that require further investigation.
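As an illustration of the formulation reviewed here (dense regression from an RGB image to a per-pixel depth map), the following sketch defines a toy encoder-decoder and runs one training step. The architecture, loss, and tensor shapes are assumptions for the example and do not correspond to any specific surveyed method.

```python
# Toy monocular depth network: RGB in, single-channel depth map out.
import torch
import torch.nn as nn

class TinyDepthNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(            # downsample and extract features
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(            # upsample back to input resolution
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, rgb):
        return torch.relu(self.decoder(self.encoder(rgb)))   # depth is non-negative

model = TinyDepthNet()
rgb = torch.randn(2, 3, 128, 256)                # dummy batch of RGB images
pred = model(rgb)                                # (2, 1, 128, 256) predicted depth
target = torch.rand(2, 1, 128, 256)              # placeholder ground-truth depth
loss = nn.functional.l1_loss(pred, target)       # common per-pixel regression loss
loss.backward()
print(pred.shape, float(loss))
```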


2019 ◽  
Vol 10 (11) ◽  
pp. 3145-3154 ◽  
Author(s):  
Swarnendu Ghosh ◽  
Anisha Pal ◽  
Shourya Jaiswal ◽  
K. C. Santosh ◽  
Nibaran Das ◽  
...  

Author(s):  
Zainab Oufqir ◽  
Lamiae Binan ◽  
Abdellatif EL ABDERRAHMANI ◽  
Khalid Satori

In this article, we give a comprehensive overview of recent deep learning methods for object detection and their uses in augmented reality. The objective is to present a complete understanding of these algorithms and of how augmented reality functions and services can be improved by integrating them. We discuss in detail the characteristics of each approach and their influence on real-time detection performance. Experimental analyses are provided to compare the performance of each method and to draw meaningful conclusions about their use in augmented reality. Two-stage detectors generally provide better detection performance, while single-stage detectors are significantly more time-efficient and more applicable to real-time object detection. Finally, we discuss several future directions to facilitate and stimulate future research on object detection in augmented reality. Keywords: object detection, deep learning, convolutional neural network, augmented reality.
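A hedged sketch of the single-stage versus two-stage trade-off discussed above: timing torchvision's Faster R-CNN (two-stage) against RetinaNet (one-stage) on the same dummy frame. The models, input size, and run count are illustrative; absolute numbers depend on hardware and implementation.

```python
# Rough latency comparison of a two-stage and a one-stage detector (CPU, random weights).
import time
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn, retinanet_resnet50_fpn

def time_detector(model, image, runs=20):
    model.eval()
    with torch.no_grad():
        model([image])                            # warm-up pass
        start = time.perf_counter()
        for _ in range(runs):
            model([image])
    return (time.perf_counter() - start) / runs

image = torch.rand(3, 480, 640)                   # dummy RGB frame in [0, 1]
for name, builder in [("Faster R-CNN (two-stage)", fasterrcnn_resnet50_fpn),
                      ("RetinaNet (one-stage)", retinanet_resnet50_fpn)]:
    model = builder()                             # default (random) weights: timing only
    print(f"{name}: {1000 * time_detector(model, image):.1f} ms per frame")
```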


2020 ◽  
Vol 13 (1) ◽  
pp. 89
Author(s):  
Manuel Carranza-García ◽  
Jesús Torres-Mateo ◽  
Pedro Lara-Benítez ◽  
Jorge García-Gutiérrez

Object detection using remote sensing data is a key task of the perception systems of self-driving vehicles. While many generic deep learning architectures have been proposed for this problem, there is little guidance on their suitability when using them in a particular scenario such as autonomous driving. In this work, we aim to assess the performance of existing 2D detection systems on a multi-class problem (vehicles, pedestrians, and cyclists) with images obtained from the on-board camera sensors of a car. We evaluate several one-stage (RetinaNet, FCOS, and YOLOv3) and two-stage (Faster R-CNN) deep learning meta-architectures under different image resolutions and feature extractors (ResNet, ResNeXt, Res2Net, DarkNet, and MobileNet). These models are trained using transfer learning and compared in terms of both precision and efficiency, with special attention to the real-time requirements of this context. For the experimental study, we use the Waymo Open Dataset, which is the largest existing benchmark. Despite the rising popularity of one-stage detectors, our findings show that two-stage detectors still provide the most robust performance. Faster R-CNN models outperform one-stage detectors in accuracy, being also more reliable in the detection of minority classes. Faster R-CNN Res2Net-101 achieves the best speed/accuracy tradeoff but needs lower resolution images to reach real-time speed. Furthermore, the anchor-free FCOS detector is a slightly faster alternative to RetinaNet, with similar precision and lower memory usage.
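A brief sketch of the transfer-learning setup described above, assuming torchvision's COCO-pretrained Faster R-CNN: the box-prediction head is replaced to match the three-class problem (vehicles, pedestrians, cyclists) plus background before fine-tuning. Dataset loading and the training loop are omitted; the weight specifier and class mapping are assumptions for the example.

```python
# Swap the detection head of a pretrained Faster R-CNN for a 3-class driving problem.
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

num_classes = 4  # vehicle, pedestrian, cyclist + background
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")

# Replace the classification/regression head so it matches the new label set.
in_features = model.roi_heads.box_predictor.cls_score.in_features
model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)

# From here, fine-tune on camera images with a standard detection training loop
# (lists of image tensors plus target dicts containing 'boxes' and 'labels').
```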


2021 ◽  
Vol 2010 (1) ◽  
pp. 012128
Author(s):  
Yuting Liang ◽  
Tangtian Hang ◽  
Jie Chen ◽  
Lei Liu

2020 ◽  
Vol 12 (6) ◽  
pp. 959 ◽  
Author(s):  
Mohammad Pashaei ◽  
Hamid Kamangir ◽  
Michael J. Starek ◽  
Philippe Tissot

Deep learning has already proven to be a powerful state-of-the-art technique for many image understanding tasks in computer vision and other applications, including remote sensing (RS) image analysis. Unmanned aircraft systems (UASs) offer a viable and economical alternative to conventional sensors and platforms for acquiring high spatial and high temporal resolution data with high operational flexibility. Coastal wetlands are among the most challenging and complex ecosystems for land cover prediction and mapping tasks because land cover targets often show high intra-class and low inter-class variances. In recent years, several deep convolutional neural network (CNN) architectures have been proposed for pixel-wise image labeling, commonly called semantic image segmentation. In this paper, some of the more recent deep CNN architectures proposed for semantic image segmentation are reviewed, and each model’s training efficiency and classification performance are evaluated by training it on a limited labeled image set. Training samples are provided using hyper-spatial resolution UAS imagery over a wetland area, and the required ground truth images are prepared by manual image labeling. Experimental results demonstrate that deep CNNs have great potential for accurate land cover prediction using UAS hyper-spatial resolution images. Some simple deep learning architectures perform comparably to, or even better than, complex and very deep architectures with remarkably fewer training epochs. This performance is especially valuable when limited training samples are available, which is a common case in most RS applications.
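To make the limited-label setting concrete, the sketch below trains a deliberately small per-pixel classifier on a handful of labeled patches. The patch size, class count, and architecture are assumptions for illustration and are not the models evaluated in the paper.

```python
# Training a small fully convolutional classifier on a tiny labeled patch set.
import torch
import torch.nn as nn

num_classes = 5                                   # e.g., a few wetland land-cover classes
model = nn.Sequential(                            # deliberately shallow architecture
    nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
    nn.Conv2d(32, num_classes, 1),                # per-pixel class scores
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# Tiny labeled set: 8 patches of 3x128x128 imagery with per-pixel class labels.
images = torch.rand(8, 3, 128, 128)
masks = torch.randint(0, num_classes, (8, 128, 128))

for epoch in range(20):                           # few epochs, small data
    optimizer.zero_grad()
    loss = criterion(model(images), masks)
    loss.backward()
    optimizer.step()
print("final training loss:", float(loss))
```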


Symmetry ◽  
2020 ◽  
Vol 12 (3) ◽  
pp. 427 ◽  
Author(s):  
Sanxing Zhang ◽  
Zhenhuan Ma ◽  
Gang Zhang ◽  
Tao Lei ◽  
Rui Zhang ◽  
...  

Semantic image segmentation, as one of the most popular tasks in computer vision, has been widely used in autonomous driving, robotics and other fields. Currently, deep convolutional neural networks (DCNNs) are driving major advances in semantic segmentation due to their powerful feature representation. However, DCNNs extract high-level feature representations by strided convolution, which makes it difficult to segment foreground objects precisely, especially when locating object boundaries. This paper presents a novel semantic segmentation algorithm that combines DeepLab v3+ with the quick shift superpixel segmentation algorithm. DeepLab v3+ is employed to generate a class-indexed score map for the input image, while quick shift is applied to segment the input image into superpixels. Their outputs are then fed into a class voting module to refine the semantic segmentation results. Extensive experiments are performed on the PASCAL VOC 2012 dataset, and the results show that the proposed method provides a more efficient solution.
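A hedged sketch of the described pipeline, with torchvision's DeepLab v3 standing in for DeepLab v3+ and scikit-image's quickshift providing the superpixels: per-pixel labels from the score map are refined by majority voting inside each superpixel. The model weights, input image, and quick shift parameters below are placeholders, not the paper's settings.

```python
# Dense score map (DeepLab) + quick shift superpixels + per-superpixel class voting.
import numpy as np
import torch
from torchvision.models.segmentation import deeplabv3_resnet50
from skimage.segmentation import quickshift

model = deeplabv3_resnet50(num_classes=21).eval()             # random weights for the sketch

image = np.random.rand(240, 320, 3)                           # placeholder RGB image in [0, 1]
tensor = torch.from_numpy(image).permute(2, 0, 1).unsqueeze(0).float()

with torch.no_grad():
    scores = model(tensor)["out"][0]                          # (classes, H, W) score map
labels = scores.argmax(0).numpy()                             # per-pixel class index

superpixels = quickshift(image, kernel_size=3, max_dist=6, ratio=0.5)

# Class voting: every pixel in a superpixel takes that superpixel's majority label.
refined = labels.copy()
for sp in np.unique(superpixels):
    mask = superpixels == sp
    refined[mask] = np.bincount(labels[mask]).argmax()
```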


2019 ◽  
Vol 5 (1) ◽  
pp. 399-426 ◽  
Author(s):  
Thomas Serre

Artificial vision has often been described as one of the key remaining challenges to be solved before machines can act intelligently. Recent developments in a branch of machine learning known as deep learning have catalyzed impressive gains in machine vision—giving a sense that the problem of vision is getting closer to being solved. The goal of this review is to provide a comprehensive overview of recent deep learning developments and to critically assess actual progress toward achieving human-level visual intelligence. I discuss the implications of the successes and limitations of modern machine vision algorithms for biological vision and the prospect for neuroscience to inform the design of future artificial vision systems.


Electronics ◽  
2020 ◽  
Vol 9 (9) ◽  
pp. 1339
Author(s):  
Willy Anugrah Cahyadi ◽  
Yeon Ho Chung ◽  
Zabih Ghassemlooy ◽  
Navid Bani Hassan

Optical wireless communications (OWC) are emerging as a cost-effective and practical alternative to congested radio frequency-based wireless technologies. As part of OWC, optical camera communications (OCC) have become very attractive, considering recent developments in cameras and the use of fitted cameras in smart devices. OCC, together with visible light communications (VLC), is considered within the framework of the IEEE 802.15.7m standardization. OCC systems based on both organic and inorganic light sources, as well as cameras, are being considered for low-rate transmission and localization in indoor and outdoor short-range applications. This paper introduces the underlying principles of OCC and gives a comprehensive overview of this emerging technology and of recent standardization activities. It also outlines key technical issues such as mobility, coverage, interference, and performance enhancement. Future research directions and open issues are also presented.

