A Comparative Study for Open Set Semantic Segmentation Methods

Typical semantic segmentation methods do not recognize unknown pixels during the test or deployment stage. This capability is critical for open-world environment applications where unseen objects appear all the time. Recently, to solve those limitations, Open Set Semantic Segmentation (OSSS) was introduced. This task aims to produce known and unknown pixels semantic segments. However, due to its recent introduction, few works are found in the literature, and consequently, few datasets are publicly available. This work carried out a comparative study between the existing OSSS methods on a new synthetic dataset of images and the well-known PASCAL VOC 2012 dataset. The compared methods include SoftMax-T, OpenMax-based, and OpenIPCS. The results are encouraging and show some of the advantages and main limitations of each technique. However, in general, they demonstrate that the problem of OSSS remains open and demands further research aiming at real applications, such as autonomous driving and robotics.

Download Full-text

Real-Time Semantic Segmentation with Dual Encoder and Self-Attention Mechanism for Autonomous Driving

Sensors ◽

10.3390/s21238072 ◽

2021 ◽

Vol 21 (23) ◽

pp. 8072

Author(s):

Yu-Bang Chang ◽

Chieh Tsai ◽

Chang-Hong Lin ◽

Poki Chen

Keyword(s):

Deep Learning ◽

Real Time ◽

Network Architecture ◽

Semantic Segmentation ◽

Autonomous Driving ◽

Attention Mechanism ◽

Trade Off ◽

Segmentation Methods ◽

General Semantic ◽

Deep Learning Model

As the techniques of autonomous driving become increasingly valued and universal, real-time semantic segmentation has become very popular and challenging in the field of deep learning and computer vision in recent years. However, in order to apply the deep learning model to edge devices accompanying sensors on vehicles, we need to design a structure that has the best trade-off between accuracy and inference time. In previous works, several methods sacrificed accuracy to obtain a faster inference time, while others aimed to find the best accuracy under the condition of real time. Nevertheless, the accuracies of previous real-time semantic segmentation methods still have a large gap compared to general semantic segmentation methods. As a result, we propose a network architecture based on a dual encoder and a self-attention mechanism. Compared with preceding works, we achieved a 78.6% mIoU with a speed of 39.4 FPS with a 1024 × 2048 resolution on a Cityscapes test submission.

Download Full-text

A Comparative Study of Real-Time Semantic Segmentation for Autonomous Driving

2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) ◽

10.1109/cvprw.2018.00101 ◽

2018 ◽

Cited By ~ 17

Author(s):

Mennatullah Siam ◽

Mostafa Gamal ◽

Moemen Abdel-Razek ◽

Senthil Yogamani ◽

Martin Jagersand ◽

...

Keyword(s):

Comparative Study ◽

Real Time ◽

Semantic Segmentation ◽

Autonomous Driving

Download Full-text

Development of environment design support mixed reality system capable of environment estimation using deep learning

Impact ◽

10.21820/23987073.2020.2.9 ◽

2020 ◽

Vol 2020 (2) ◽

pp. 9-11

Author(s):

Tomohiro Fukuda

Keyword(s):

Deep Learning ◽

Real Time ◽

Computer Games ◽

Construction Projects ◽

Mixed Reality ◽

Semantic Segmentation ◽

Environment Design ◽

Aviation Training ◽

Architecture And Design ◽

World Environment

Mixed reality (MR) is rapidly becoming a vital tool, not just in gaming, but also in education, medicine, construction and environmental management. The term refers to systems in which computer-generated content is superimposed over objects in a real-world environment across one or more sensory modalities. Although most of us have heard of the use of MR in computer games, it also has applications in military and aviation training, as well as tourism, healthcare and more. In addition, it has the potential for use in architecture and design, where buildings can be superimposed in existing locations to render 3D generations of plans. However, one major challenge that remains in MR development is the issue of real-time occlusion. This refers to hiding 3D virtual objects behind real articles. Dr Tomohiro Fukuda, who is based at the Division of Sustainable Energy and Environmental Engineering, Graduate School of Engineering at Osaka University in Japan, is an expert in this field. Researchers, led by Dr Tomohiro Fukuda, are tackling the issue of occlusion in MR. They are currently developing a MR system that realises real-time occlusion by harnessing deep learning to achieve an outdoor landscape design simulation using a semantic segmentation technique. This methodology can be used to automatically estimate the visual environment prior to and after construction projects.

Download Full-text

Weakly-Supervised Recommended Traversable Area Segmentation Using Automatically Labeled Images for Autonomous Driving in Pedestrian Environment with No Edges

Sensors ◽

10.3390/s21020437 ◽

2021 ◽

Vol 21 (2) ◽

pp. 437

Author(s):

Yuya Onozuka ◽

Ryosuke Matsumi ◽

Motoki Shino

Keyword(s):

Visual Information ◽

Data Augmentation ◽

Semantic Segmentation ◽

Autonomous Driving ◽

Weighting Method ◽

Personal Mobility ◽

Human Understanding ◽

Autonomous Mobility ◽

Weakly Supervised ◽

Traffic Rules

Detection of traversable areas is essential to navigation of autonomous personal mobility systems in unknown pedestrian environments. However, traffic rules may recommend or require driving in specified areas, such as sidewalks, in environments where roadways and sidewalks coexist. Therefore, it is necessary for such autonomous mobility systems to estimate the areas that are mechanically traversable and recommended by traffic rules and to navigate based on this estimation. In this paper, we propose a method for weakly-supervised recommended traversable area segmentation in environments with no edges using automatically labeled images based on paths selected by humans. This approach is based on the idea that a human-selected driving path more accurately reflects both mechanical traversability and human understanding of traffic rules and visual information. In addition, we propose a data augmentation method and a loss weighting method for detecting the appropriate recommended traversable area from a single human-selected path. Evaluation of the results showed that the proposed learning methods are effective for recommended traversable area detection and found that weakly-supervised semantic segmentation using human-selected path information is useful for recommended area detection in environments with no edges.

Download Full-text

Transformer Meets Convolution: A Bilateral Awareness Network for Semantic Segmentation of Very Fine Resolution Urban Scene Images

Remote Sensing ◽

10.3390/rs13163065 ◽

2021 ◽

Vol 13 (16) ◽

pp. 3065

Author(s):

Libo Wang ◽

Rui Li ◽

Dongzhi Wang ◽

Chenxi Duan ◽

Teng Wang ◽

...

Keyword(s):

Large Scale ◽

Texture Features ◽

Semantic Segmentation ◽

Autonomous Driving ◽

Research Field ◽

Learning Approaches ◽

Fine Grained ◽

Urban Scene ◽

Fine Resolution ◽

With Memory

Semantic segmentation from very fine resolution (VFR) urban scene images plays a significant role in several application scenarios including autonomous driving, land cover classification, urban planning, etc. However, the tremendous details contained in the VFR image, especially the considerable variations in scale and appearance of objects, severely limit the potential of the existing deep learning approaches. Addressing such issues represents a promising research field in the remote sensing community, which paves the way for scene-level landscape pattern analysis and decision making. In this paper, we propose a Bilateral Awareness Network which contains a dependency path and a texture path to fully capture the long-range relationships and fine-grained details in VFR images. Specifically, the dependency path is conducted based on the ResT, a novel Transformer backbone with memory-efficient multi-head self-attention, while the texture path is built on the stacked convolution operation. In addition, using the linear attention mechanism, a feature aggregation module is designed to effectively fuse the dependency features and texture features. Extensive experiments conducted on the three large-scale urban scene image segmentation datasets, i.e., ISPRS Vaihingen dataset, ISPRS Potsdam dataset, and UAVid dataset, demonstrate the effectiveness of our BANet. Specifically, a 64.6% mIoU is achieved on the UAVid dataset.

Download Full-text

Learning Uncertainty for Safety-Oriented Semantic Segmentation in Autonomous Driving

10.1109/icip42928.2021.9506719 ◽

2021 ◽

Author(s):

Victor Besnier ◽

David Picard ◽

Alexandre Briot

Keyword(s):

Semantic Segmentation ◽

Autonomous Driving

Download Full-text

Self-Attention in Reconstruction Bias U-Net for Semantic Segmentation of Building Rooftops in Optical Remote Sensing Images

Remote Sensing ◽

10.3390/rs13132524 ◽

2021 ◽

Vol 13 (13) ◽

pp. 2524

Author(s):

Ziyi Chen ◽

Dilong Li ◽

Wentao Fan ◽

Haiyan Guan ◽

Cheng Wang ◽

...

Keyword(s):

Remote Sensing ◽

Deep Learning ◽

Semantic Segmentation ◽

Extraction Methods ◽

The Self ◽

Optical Remote Sensing ◽

Building Extraction ◽

Learning Models ◽

Remote Sensing Images ◽

Segmentation Methods

Deep learning models have brought great breakthroughs in building extraction from high-resolution optical remote-sensing images. Among recent research, the self-attention module has called up a storm in many fields, including building extraction. However, most current deep learning models loading with the self-attention module still lose sight of the reconstruction bias’s effectiveness. Through tipping the balance between the abilities of encoding and decoding, i.e., making the decoding network be much more complex than the encoding network, the semantic segmentation ability will be reinforced. To remedy the research weakness in combing self-attention and reconstruction-bias modules for building extraction, this paper presents a U-Net architecture that combines self-attention and reconstruction-bias modules. In the encoding part, a self-attention module is added to learn the attention weights of the inputs. Through the self-attention module, the network will pay more attention to positions where there may be salient regions. In the decoding part, multiple large convolutional up-sampling operations are used for increasing the reconstruction ability. We test our model on two open available datasets: the WHU and Massachusetts Building datasets. We achieve IoU scores of 89.39% and 73.49% for the WHU and Massachusetts Building datasets, respectively. Compared with several recently famous semantic segmentation methods and representative building extraction methods, our method’s results are satisfactory.

Download Full-text

Visual Scene Understanding for Autonomous Driving Using Semantic Segmentation

Explainable AI: Interpreting, Explaining and Visualizing Deep Learning - Lecture Notes in Computer Science ◽

10.1007/978-3-030-28954-6_15 ◽

2019 ◽

pp. 285-296 ◽

Cited By ~ 1

Author(s):

Markus Hofmarcher ◽

Thomas Unterthiner ◽

José Arjona-Medina ◽

Günter Klambauer ◽

Sepp Hochreiter ◽

...

Keyword(s):

Scene Understanding ◽

Semantic Segmentation ◽

Autonomous Driving ◽

Visual Scene ◽

Visual Scene Understanding

Download Full-text

Deep Semantic Segmentation in Autonomous Driving

Deep Learning in Computer Vision ◽

10.1201/9781351003827-6 ◽

2020 ◽

pp. 151-182

Author(s):

Hazem Rashed ◽

Senthil Yogamani ◽

Ahmad El-Sallab ◽

Mohamed Elhelw ◽

Mahmoud Hassaballah

Keyword(s):

Semantic Segmentation ◽

Autonomous Driving

Download Full-text

EVALUATION OF SEMANTIC SEGMENTATION METHODS FOR DEFORESTATION DETECTION IN THE AMAZON

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xliii-b3-2020-1497-2020 ◽

2020 ◽

Vol XLIII-B3-2020 ◽

pp. 1497-1505

Author(s):

R. B. Andrade ◽

G. A. O. P. Costa ◽

G. L. A. Mota ◽

M. X. Ortega ◽

R. Q. Feitosa ◽

...

Keyword(s):

Global Climate ◽

Biodiversity Loss ◽

Semantic Segmentation ◽

Training Sample ◽

Training Data ◽

Detection Methods ◽

Amazon Forest ◽

Convolutional Network ◽

Detection Approach ◽

Segmentation Methods

Abstract. Deforestation is a wide-reaching problem, responsible for serious environmental issues, such as biodiversity loss and global climate change. Containing approximately ten percent of all biomass on the planet and home to one tenth of the known species, the Amazon biome has faced important deforestation pressure in the last decades. Devising efficient deforestation detection methods is, therefore, key to combat illegal deforestation and to aid in the conception of public policies directed to promote sustainable development in the Amazon. In this work, we implement and evaluate a deforestation detection approach which is based on a Fully Convolutional, Deep Learning (DL) model: the DeepLabv3+. We compare the results obtained with the devised approach to those obtained with previously proposed DL-based methods (Early Fusion and Siamese Convolutional Network) using Landsat OLI-8 images acquired at different dates, covering a region of the Amazon forest. In order to evaluate the sensitivity of the methods to the amount of training data, we also evaluate them using varying training sample set sizes. The results show that all tested variants of the proposed method significantly outperform the other DL-based methods in terms of overall accuracy and F1-score. The gains in performance were even more substantial when limited amounts of samples were used in training the evaluated methods.

Download Full-text