SEMANTIC SEGMENTATION USING OPTICAL SENSORS IN THE TASK OF AUTONOMOUS DRIVING

Author(s):  
I. V. Sgibnev ◽  
B. V. Vishnyakov

This paper is devoted to the problem of image semantic segmentation for the machine vision system of an off-road autonomous robotic vehicle. Most modern convolutional neural networks require large computing resources that go beyond the capabilities of many robotic platforms: their main drawback is the extremely high complexity of the networks used, whereas tasks in real applications must be performed in real time on devices with limited resources. This paper focuses on the practical application of modern lightweight architectures to the task of semantic segmentation on mobile robotic systems. The article discusses backbones based on ResNet18, ResNet34, MobileNetV2, ShuffleNetV2 and EfficientNet-B0, and decoders based on U-Net, DeepLabV3 and DeepLabV3+, as well as additional components that can increase segmentation accuracy and reduce inference time. We propose a model using a ResNet34 encoder and a DeepLabV3+ decoder with Squeeze & Excitation blocks that proved optimal in terms of inference time and accuracy. We also present our off-road dataset and a simulated dataset for semantic segmentation. Furthermore, pretraining on the simulated dataset increased the mIoU metric on our off-road dataset by 2.6% compared with pretraining on Cityscapes. Moreover, we achieved 76.1% mIoU on the Cityscapes validation set and 85.4% mIoU on our off-road validation set at 37 FPS (frames per second) for a 1024×1024 input image on one NVIDIA GeForce RTX 2080 card using the NVIDIA TensorRT inference framework.
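The mIoU figures quoted above follow the standard definition: per-class intersection-over-union averaged over the classes present. A minimal NumPy sketch (the toy label maps are illustrative, not from the paper's datasets):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union between two integer label maps.

    Classes absent from both prediction and ground truth are skipped
    so they do not distort the average.
    """
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:                      # class not present at all: skip
            continue
        inter = np.logical_and(pred_c, target_c).sum()
        ious.append(inter / union)
    return float(np.mean(ious))

# Toy 2x2 maps: class 0 overlaps on 2 of 3 pixels, class 1 on 1 of 2
pred   = np.array([[0, 0], [1, 1]])
target = np.array([[0, 0], [0, 1]])
score = mean_iou(pred, target, num_classes=2)   # (2/3 + 1/2) / 2
```

The per-class averaging is what makes mIoU sensitive to small or rare classes, unlike plain pixel accuracy.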

Author(s):  
I. Sgibnev ◽  
A. Sorokin ◽  
B. Vishnyakov ◽  
Y. Vizilter

Abstract. This paper is devoted to the problem of image semantic segmentation for the machine vision system of an off-road autonomous robotic vehicle. Most modern convolutional neural networks require large computing resources that go beyond the capabilities of many robotic platforms: their main drawback is the extremely high complexity of the networks used, whereas tasks in real applications must be performed in real time on devices with limited resources. This paper focuses on the practical application of modern lightweight architectures to the task of semantic segmentation on mobile robotic systems. The article discusses backbones based on ResNet18, ResNet34, MobileNetV2, ShuffleNetV2 and EfficientNet-B0, and decoders based on U-Net and DeepLabV3, as well as additional components that can increase segmentation accuracy and reduce inference time. We propose a model using a ResNet34 encoder and a DeepLabV3 decoder with Squeeze & Excitation blocks that proved optimal in terms of inference time and accuracy. We also present our off-road dataset and a simulated dataset for semantic segmentation. Furthermore, we show that pre-training on the simulated dataset increases mIoU on our off-road dataset by 2.7% compared with pre-training on Cityscapes. Moreover, we achieve 75.6% mIoU on the Cityscapes validation set and 85.2% mIoU on our off-road validation set at 37 FPS for a 1024×1024 input on one NVIDIA GeForce RTX 2080 card using NVIDIA TensorRT.
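The Squeeze & Excitation block mentioned in both abstracts recalibrates channel responses: a global average pool ("squeeze") feeds a two-layer bottleneck whose sigmoid output gates each channel ("excitation"). A minimal NumPy sketch (the weights, reduction ratio and shapes are illustrative, not the authors' configuration):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(x, w1, w2):
    """Squeeze-and-Excitation on a (C, H, W) feature map.

    w1: (C//r, C) reduction weights; w2: (C, C//r) expansion weights,
    where r is the channel-reduction ratio.
    """
    s = x.mean(axis=(1, 2))             # squeeze: global average pool -> (C,)
    z = np.maximum(w1 @ s, 0.0)         # excitation: FC + ReLU -> (C//r,)
    gates = sigmoid(w2 @ z)             # FC + sigmoid -> per-channel gates in (0, 1)
    return x * gates[:, None, None]     # rescale each channel by its gate

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))      # C=4 channels, reduction ratio r=2
w1 = rng.standard_normal((2, 4))
w2 = rng.standard_normal((4, 2))
y = se_block(x, w1, w2)
```

Because the gates lie in (0, 1), the block can only attenuate channels, which is why it adds accuracy at near-negligible inference cost.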


Symmetry ◽  
2020 ◽  
Vol 12 (3) ◽  
pp. 427 ◽  
Author(s):  
Sanxing Zhang ◽  
Zhenhuan Ma ◽  
Gang Zhang ◽  
Tao Lei ◽  
Rui Zhang ◽  
...  

Semantic image segmentation, one of the most popular tasks in computer vision, is widely used in autonomous driving, robotics and other fields. Currently, deep convolutional neural networks (DCNNs) are driving major advances in semantic segmentation thanks to their powerful feature representation. However, DCNNs extract high-level feature representations through strided convolution, which makes it impossible to segment foreground objects precisely, especially when locating object boundaries. This paper presents a novel semantic segmentation algorithm that combines DeepLab v3+ with the quick shift superpixel segmentation algorithm. DeepLab v3+ generates a class-indexed score map for the input image, while quick shift segments the input image into superpixels. Their outputs are then fed into a class voting module to refine the semantic segmentation results. Extensive experiments performed on the PASCAL VOC 2012 dataset show that the proposed method provides a more efficient solution.
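The class voting module described here can be sketched as a per-superpixel majority vote over the DCNN's pixel-wise class map (the quick shift superpixel labels are taken as given; names and toy shapes are illustrative):

```python
import numpy as np

def superpixel_vote(class_map, sp_labels):
    """Refine a pixel-wise class map by majority vote within each superpixel.

    class_map: (H, W) integer class per pixel from the DCNN.
    sp_labels: (H, W) integer superpixel id per pixel from quick shift.
    """
    refined = class_map.copy()
    for sp in np.unique(sp_labels):
        mask = sp_labels == sp
        votes = np.bincount(class_map[mask])   # class histogram inside the superpixel
        refined[mask] = votes.argmax()         # assign the winning class everywhere
    return refined

class_map = np.array([[0, 0, 1],
                      [0, 1, 1],
                      [2, 1, 1]])
sp_labels = np.array([[0, 0, 1],
                      [0, 0, 1],
                      [0, 1, 1]])
refined = superpixel_vote(class_map, sp_labels)
```

Because superpixel boundaries follow image edges, the vote snaps the DCNN's coarse predictions onto object boundaries, which is exactly where strided convolution loses precision.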


Technologies ◽  
2020 ◽  
Vol 8 (2) ◽  
pp. 35
Author(s):  
Marco Toldo ◽  
Andrea Maracani ◽  
Umberto Michieli ◽  
Pietro Zanuttigh

The aim of this paper is to give an overview of the recent advancements in the Unsupervised Domain Adaptation (UDA) of deep networks for semantic segmentation. This task is attracting wide interest since semantic segmentation models require a huge amount of labeled data, and the lack of data fitting specific requirements is the main limitation in the deployment of these techniques. The field has been explored only recently and has rapidly grown, with a large number of ad-hoc approaches. This motivates us to build a comprehensive overview of the proposed methodologies and to provide a clear categorization. In this paper, we start by introducing the problem, its formulation and the various scenarios that can be considered. Then, we introduce the different levels at which adaptation strategies may be applied: namely, at the input (image) level, at the level of the internal feature representation and at the output level. Furthermore, we present a detailed overview of the literature in the field, dividing previous methods into the following (non-mutually-exclusive) categories: adversarial learning, generative-based, analysis of the classifier discrepancies, self-teaching, entropy minimization, curriculum learning and multi-task learning. Novel research directions are also briefly introduced to give a hint of interesting open problems in the field. Finally, a comparison of the performance of the various methods in the widely used autonomous driving scenario is presented.
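Of the categories listed, entropy minimization is the simplest to state: the adaptation loss penalizes high-entropy (uncertain) softmax predictions on unlabeled target-domain images, pushing the decision boundary away from dense data regions. A minimal NumPy sketch (shapes illustrative, not any particular paper's implementation):

```python
import numpy as np

def softmax(logits, axis=0):
    e = np.exp(logits - logits.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def entropy_loss(logits):
    """Mean per-pixel Shannon entropy of (C, H, W) segmentation logits.

    Minimizing this on unlabeled target images encourages confident,
    low-entropy predictions without needing target labels.
    """
    p = softmax(logits, axis=0)
    ent = -(p * np.log(p + 1e-12)).sum(axis=0)   # (H, W) entropy map
    return float(ent.mean())

confident = np.zeros((3, 2, 2))
confident[0] += 10.0                 # near one-hot predictions -> low entropy
uniform = np.zeros((3, 2, 2))        # uniform predictions -> maximal entropy
```

In practice this term is added, with a small weight, to the supervised source-domain loss.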


Author(s):  
B. Vishnyakov ◽  
I. Sgibnev ◽  
V. Sheverdin ◽  
A. Sorokin ◽  
P. Masalov ◽  
...  

Abstract. In this paper we present a semantic SLAM method based on a bundle of deep convolutional neural networks. It provides real-time dense semantic scene reconstruction for the autonomous driving system of an off-road robotic vehicle. Most state-of-the-art neural networks require large computing resources that go beyond the capabilities of many robotic platforms. We propose an architecture for 3D semantic scene reconstruction built on recent progress in computer vision, integrating SuperPoint, SuperGlue, Bi3D, DeepLabV3+, RTM3D and an additional module, with pre-processing, inference and post-processing operations performed on the GPU. We also updated our simulated dataset for semantic segmentation and added disparity images.
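A core step in such a reconstruction is back-projecting each semantically labeled pixel into 3D from the stereo disparity. A minimal pinhole-model sketch (the focal length, baseline and toy maps are illustrative, not the authors' calibration or pipeline):

```python
import numpy as np

def backproject(disparity, labels, f, baseline, cx, cy):
    """Turn a disparity map plus per-pixel class labels into labeled 3D points.

    Stereo depth: Z = f * baseline / disparity (rectified pinhole model).
    Returns (N, 4) rows [X, Y, Z, class] for pixels with valid disparity.
    """
    h, w = disparity.shape
    v, u = np.mgrid[0:h, 0:w]              # pixel row/column grids
    valid = disparity > 0                  # zero disparity = no stereo match
    z = f * baseline / disparity[valid]
    x = (u[valid] - cx) * z / f
    y = (v[valid] - cy) * z / f
    return np.stack([x, y, z, labels[valid].astype(float)], axis=1)

disp = np.array([[2.0, 0.0],
                 [4.0, 1.0]])              # one invalid pixel
labels = np.array([[1, 0],
                   [2, 3]])
pts = backproject(disp, labels, f=100.0, baseline=0.5, cx=0.5, cy=0.5)
```

Accumulating such labeled points over tracked camera poses is what turns per-frame segmentation into a dense semantic map.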


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 437
Author(s):  
Yuya Onozuka ◽  
Ryosuke Matsumi ◽  
Motoki Shino

Detection of traversable areas is essential to navigation of autonomous personal mobility systems in unknown pedestrian environments. However, traffic rules may recommend or require driving in specified areas, such as sidewalks, in environments where roadways and sidewalks coexist. Therefore, it is necessary for such autonomous mobility systems to estimate the areas that are both mechanically traversable and recommended by traffic rules, and to navigate based on this estimation. In this paper, we propose a method for weakly-supervised recommended traversable area segmentation in environments with no edges, using images automatically labeled from paths selected by humans. This approach is based on the idea that a human-selected driving path accurately reflects both mechanical traversability and human understanding of traffic rules and visual information. In addition, we propose a data augmentation method and a loss weighting method for detecting the appropriate recommended traversable area from a single human-selected path. Evaluation showed that the proposed learning methods are effective for recommended traversable area detection, and that weakly-supervised semantic segmentation using human-selected path information is useful for recommended area detection in environments with no edges.
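A loss weighting scheme of the kind proposed can be sketched as per-pixel weighted cross-entropy, where the automatically labeled path pixels receive a different weight from the background (the weighting map and toy values below are illustrative, not the paper's scheme):

```python
import numpy as np

def weighted_pixel_ce(probs, target, weights):
    """Per-pixel weighted cross-entropy for segmentation.

    probs:   (C, H, W) predicted class probabilities.
    target:  (H, W) integer labels (e.g. auto-labeled from a human path).
    weights: (H, W) per-pixel weights, e.g. larger on the selected path.
    """
    h, w = target.shape
    # probability assigned to the true class at every pixel
    p_true = probs[target, np.arange(h)[:, None], np.arange(w)]
    ce = -np.log(p_true + 1e-12)
    return float((weights * ce).sum() / weights.sum())

# Toy case: uniform 2-class predictions, all pixels labeled class 0
probs = np.full((2, 2, 2), 0.5)
target = np.zeros((2, 2), dtype=int)
weights = np.ones((2, 2))
loss = weighted_pixel_ce(probs, target, weights)
```

Up-weighting path pixels is one way to counteract the extreme imbalance that comes from labeling only a single narrow path per image.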


2021 ◽  
Vol 13 (16) ◽  
pp. 3065
Author(s):  
Libo Wang ◽  
Rui Li ◽  
Dongzhi Wang ◽  
Chenxi Duan ◽  
Teng Wang ◽  
...  

Semantic segmentation from very fine resolution (VFR) urban scene images plays a significant role in several application scenarios including autonomous driving, land cover classification, urban planning, etc. However, the tremendous detail contained in VFR images, especially the considerable variations in the scale and appearance of objects, severely limits the potential of existing deep learning approaches. Addressing such issues represents a promising research direction in the remote sensing community, which paves the way for scene-level landscape pattern analysis and decision making. In this paper, we propose a Bilateral Awareness Network (BANet) which contains a dependency path and a texture path to fully capture the long-range relationships and fine-grained details in VFR images. Specifically, the dependency path is built on ResT, a novel Transformer backbone with memory-efficient multi-head self-attention, while the texture path is built on stacked convolution operations. In addition, a feature aggregation module based on the linear attention mechanism is designed to effectively fuse the dependency features and texture features. Extensive experiments conducted on three large-scale urban scene image segmentation datasets, i.e., the ISPRS Vaihingen, ISPRS Potsdam, and UAVid datasets, demonstrate the effectiveness of our BANet. Specifically, a 64.6% mIoU is achieved on the UAVid dataset.
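The linear attention mentioned here replaces softmax attention's O(N²) pairwise score matrix with a kernel feature map, so the product can be associated as φ(Q)(φ(K)ᵀV) at O(N) cost. A minimal NumPy sketch with the common elu(x)+1 feature map (the feature map choice and shapes are illustrative, not necessarily BANet's exact formulation):

```python
import numpy as np

def feature_map(x):
    """elu(x) + 1: a strictly positive kernel feature map for linear attention."""
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(q, k, v):
    """O(N) attention: phi(Q) (phi(K)^T V) with row-wise normalization.

    q, k: (N, d) queries/keys; v: (N, d_v) values.
    """
    qf, kf = feature_map(q), feature_map(k)
    kv = kf.T @ v                    # (d, d_v): computed once, independent of N^2
    z = qf @ kf.sum(axis=0)          # (N,) per-query normalizers
    return (qf @ kv) / z[:, None]

rng = np.random.default_rng(1)
q = rng.standard_normal((6, 4))
k = rng.standard_normal((6, 4))
v = rng.standard_normal((6, 3))
out = linear_attention(q, k, v)
```

Since the attention weights are positive and sum to one, each output row is a convex combination of the value rows, just as in softmax attention, but at linear cost in the number of pixels.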


2020 ◽  
pp. 151-182
Author(s):  
Hazem Rashed ◽  
Senthil Yogamani ◽  
Ahmad El-Sallab ◽  
Mohamed Elhelw ◽  
Mahmoud Hassaballah

2020 ◽  
Vol 58 (6) ◽  
pp. 433-438
Author(s):  
Ill-Joo Lee ◽  
Seung-Chan Hong ◽  
Byung-Sam Kim ◽  
Jae-Kyung Cheon

Technologies for pedestrian safety are increasingly emphasized by automakers in advance of autonomous driving vehicles. A Night Vision System mounted behind the front grille can reduce fatal accidents, especially at night; however, consumers may hesitate to adopt such systems on account of their high price. High-cost germanium is used in commercial Night Vision System windows, so replacing it with a cheaper infrared window material can lead to a more affordable system. To achieve this, Zinc Sulfide (ZnS), which has about 70% transmittance in the Long-Wavelength Infrared region of 8~12 μm, was selected as the window substrate material. In this study, we designed, fabricated and characterized a cost-effective single-layer anti-reflection coating of Calcium Fluoride (CaF2) on a ZnS window substrate. The CaF2 coating was fabricated by E-beam evaporation at quarter-wavelength anti-reflection (QAR) thickness and characterized by FT-IR, SEM and a thermal camera test module. We found that the ZnS window coated on both sides with CaF2 exhibited about 10~15% higher transmittance than the bare ZnS substrate. In addition, the CaF2 coating bonded stably to the ZnS substrate without any internal defects. A thermal camera based window test, evaluated from the output voltage of the microbolometer thermal sensor, also showed better detection performance with the CaF2 coating than with a bare ZnS window.
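The quarter-wavelength condition fixes the coating's physical thickness so that its optical thickness equals λ/4, i.e. t = λ / (4·n_coating); an ideal single-layer AR coating also wants n_coating ≈ √n_substrate. A quick check with representative LWIR indices (n ≈ 1.3 for CaF2 and n ≈ 2.2 for ZnS are rough literature values, not the paper's measurements):

```python
import math

def qar_thickness(wavelength_um, n_coating):
    """Physical thickness of a quarter-wave anti-reflection layer, in um."""
    return wavelength_um / (4.0 * n_coating)

n_caf2, n_zns = 1.3, 2.2          # assumed representative LWIR refractive indices
t = qar_thickness(10.0, n_caf2)   # design wavelength 10 um, mid-LWIR band
ideal_n = math.sqrt(n_zns)        # ideal single-layer AR index for a ZnS substrate
```

With these assumed values the layer comes out to roughly 1.9 μm, and CaF2's index sits close enough to √n_ZnS ≈ 1.48 to explain the reported transmittance gain from a single layer.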

