The Fishyscapes Benchmark: Measuring Blind Spots in Semantic Segmentation

Author(s):  
Hermann Blum ◽  
Paul-Edouard Sarlin ◽  
Juan Nieto ◽  
Roland Siegwart ◽  
Cesar Cadena

Deep learning has enabled impressive progress in the accuracy of semantic segmentation. Yet, the ability to estimate uncertainty and detect failure is key for safety-critical applications like autonomous driving. Existing uncertainty estimates have mostly been evaluated on simple tasks, and it is unclear whether these methods generalize to more complex scenarios. We present Fishyscapes, the first public benchmark for anomaly detection in a real-world task of semantic segmentation for urban driving. It evaluates pixel-wise uncertainty estimates towards the detection of anomalous objects. We adapt state-of-the-art methods to recent semantic segmentation models and compare uncertainty estimation approaches based on softmax confidence, Bayesian learning, density estimation, image resynthesis, as well as supervised anomaly detection methods. Our results show that anomaly detection is far from solved even for ordinary situations, while our benchmark allows measuring advancements beyond the state-of-the-art. Results, data and submission information can be found at https://fishyscapes.com/.
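As a rough illustration of the softmax-confidence family of approaches named in this abstract, the sketch below scores each pixel by one minus its maximum softmax probability. This is a generic baseline, not the benchmark's own code; the function names and the toy logits are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def anomaly_score(logits):
    """Assumed scoring rule: 1 - max softmax probability (higher = less confident)."""
    return 1.0 - max(softmax(logits))

def score_map(logit_image):
    """Apply the score to every pixel of an H x W grid of per-class logit vectors."""
    return [[anomaly_score(px) for px in row] for row in logit_image]

# Hypothetical 1 x 2 image with 3 classes: one confident pixel, one ambiguous pixel.
image = [[[8.0, 0.0, 0.0], [1.0, 1.0, 1.0]]]
scores = score_map(image)
```

Thresholding such a score map gives a binary anomaly mask; the ambiguous pixel here scores far higher than the confident one.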

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Aryan Mobiny ◽  
Pengyu Yuan ◽  
Supratik K. Moulik ◽  
Naveen Garg ◽  
Carol C. Wu ◽  
...  

Deep neural networks (DNNs) have achieved state-of-the-art performance in many important domains, including medical diagnosis, security, and autonomous driving. In domains where safety is highly critical, an erroneous decision can result in serious consequences. While perfect prediction accuracy is not always achievable, recent work on Bayesian deep networks shows that it is possible to know when DNNs are more likely to make mistakes. Knowing what DNNs do not know is desirable to increase the safety of deep learning technology in sensitive applications; Bayesian neural networks attempt to address this challenge. Traditional approaches are computationally intractable and do not scale well to large, complex neural network architectures. In this paper, we develop a theoretical framework to approximate Bayesian inference for DNNs by imposing a Bernoulli distribution on the model weights. This method, called Monte Carlo DropConnect (MC-DropConnect), gives us a tool to represent the model uncertainty with little change in the overall model structure or computational cost. We extensively validate the proposed algorithm on multiple network architectures and datasets for classification and semantic segmentation tasks. We also propose new metrics to quantify uncertainty estimates. This enables an objective comparison between MC-DropConnect and prior approaches. Our empirical results demonstrate that the proposed framework yields significant improvement in both prediction accuracy and uncertainty estimation quality compared to the state of the art.
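The idea of placing a Bernoulli distribution on the weights can be sketched for a single linear layer: sample a random mask over the weights at test time, repeat the forward pass several times, and average the predictions. This is a minimal sketch under assumptions, not the paper's implementation; the keep probability, the rescaling by `keep_prob`, the use of predictive entropy as the uncertainty measure, and all names are illustrative choices.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def dropconnect_forward(x, weights, keep_prob, rng):
    """One stochastic pass: each weight is kept with probability keep_prob."""
    logits = []
    for row in weights:  # one row of weights per output class
        z = 0.0
        for w, xi in zip(row, x):
            if rng.random() < keep_prob:
                # Assumption: rescale kept weights so the expected logit is unchanged.
                z += (w / keep_prob) * xi
        logits.append(z)
    return softmax(logits)

def mc_dropconnect_predict(x, weights, keep_prob=0.5, samples=100, seed=0):
    """Average class probabilities over several masked passes; return mean and entropy."""
    rng = random.Random(seed)
    mean = [0.0] * len(weights)
    for _ in range(samples):
        probs = dropconnect_forward(x, weights, keep_prob, rng)
        mean = [m + p / samples for m, p in zip(mean, probs)]
    entropy = -sum(p * math.log(p) for p in mean if p > 0.0)
    return mean, entropy

# Hypothetical 2-class layer and input.
weights = [[2.0, -1.0], [-1.0, 2.0]]
mean_probs, entropy = mc_dropconnect_predict([1.0, 0.2], weights)
```

A higher entropy of the averaged prediction indicates that the sampled networks disagree more, which is the signal used to flag likely mistakes.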


2019 ◽  
Vol 11 (1) ◽  
Author(s):  
Vivian Rowoli Igenewari ◽  
Zakwan Skaf ◽  
Ian K. Jennions

Safety enhancement is a major goal of the aviation industry owing to the predicted increase in air travel. There is also the need to prevent fatalities, increase reliability and reduce the monetary costs of the delays and accidents that still occur. Accidents today are complex, resulting from many causal factors that sometimes act alone but more often act in combination with other contributing factors. To tackle this trend, proactive measures have been put in place to find hazardous combinations that occur during flights and mitigate them before accidents occur. Flight Anomaly Detection (AD) methods aim to highlight abnormal occurrences in a flight that differ from the norm. As an improvement on the current state-of-the-art method, previous works have proposed different AD techniques for detecting previously unknown flight risks such as component faults, aircraft operational inefficiencies and some abnormal crew behaviour. However, current AD methods individually have limitations that prevent them from detecting certain significant anomalies in flight data. This paper surveys current flight AD approaches and their strengths and limitations, and brings to light the benefits of a hybrid AD method that extends previous work to find safety-critical events, particularly those related to abnormal crew activity: a class of events known to account for a substantial number of accidents/incidents today. It also highlights another emerging AD application opportunity, its challenges and how AD is beneficial in addressing them.


Agriculture ◽  
2021 ◽  
Vol 11 (10) ◽  
pp. 997
Author(s):  
Yun Peng ◽  
Aichen Wang ◽  
Jizhan Liu ◽  
Muhammad Faheem

Accurate fruit segmentation in images is a prerequisite and key step for precision agriculture. In this article, aiming at the segmentation of grape clusters of different varieties, three state-of-the-art semantic segmentation networks, i.e., Fully Convolutional Network (FCN), U-Net, and DeepLabv3+, applied to six different datasets were studied. We investigated: (1) the difference in segmentation performance among the three networks; (2) the impact of different input representations on segmentation performance; (3) the effect of an image enhancement method in compensating for poor illumination and further improving segmentation performance; (4) the impact of the distance between grape clusters and the camera on segmentation performance. The experimental results show that, compared with FCN and U-Net, DeepLabv3+ combined with transfer learning is more suitable for the task, with an intersection over union (IoU) of 84.26%. Five different input representations, namely RGB, HSV, L*a*b, HHH, and YCrCb, obtained different IoU values, ranging from 81.5% to 88.44%. Among them, L*a*b achieved the highest IoU. In addition, the adopted Histogram Equalization (HE) image enhancement method improved the model’s robustness against poor illumination conditions. With HE preprocessing, the IoU on the enhanced dataset increased by 3.88 percentage points, from 84.26% to 88.14%. The distance between the target and the camera also affects segmentation performance: in every dataset, the closer the distance, the better the segmentation performance. In summary, the conclusions of this research provide some meaningful suggestions for the study of grape or other fruit segmentation.
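For readers unfamiliar with the IoU figures quoted in this abstract, the following minimal sketch computes intersection over union for two binary masks. The masks are toy data, and returning 1.0 when both masks are empty is an assumed convention rather than something the article specifies.

```python
def iou(pred, target):
    """Intersection over union for two binary masks given as nested lists of 0/1."""
    inter = 0
    union = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumption: two empty masks count as a perfect match.
    return inter / union if union else 1.0

# Hypothetical 2 x 3 masks: 2 pixels overlap, 4 pixels are set in at least one mask.
pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
score = iou(pred, target)
```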


Technologies ◽  
2020 ◽  
Vol 8 (2) ◽  
pp. 35
Author(s):  
Marco Toldo ◽  
Andrea Maracani ◽  
Umberto Michieli ◽  
Pietro Zanuttigh

The aim of this paper is to give an overview of the recent advancements in the Unsupervised Domain Adaptation (UDA) of deep networks for semantic segmentation. This task is attracting a wide interest since semantic segmentation models require a huge amount of labeled data and the lack of data fitting specific requirements is the main limitation in the deployment of these techniques. This field has been recently explored and has rapidly grown with a large number of ad-hoc approaches. This motivates us to build a comprehensive overview of the proposed methodologies and to provide a clear categorization. In this paper, we start by introducing the problem, its formulation and the various scenarios that can be considered. Then, we introduce the different levels at which adaptation strategies may be applied: namely, at the input (image) level, at the internal features representation and at the output level. Furthermore, we present a detailed overview of the literature in the field, dividing previous methods based on the following (non mutually exclusive) categories: adversarial learning, generative-based, analysis of the classifier discrepancies, self-teaching, entropy minimization, curriculum learning and multi-task learning. Novel research directions are also briefly introduced to give a hint of interesting open problems in the field. Finally, a comparison of the performance of the various methods in the widely used autonomous driving scenario is presented.


2018 ◽  
Vol 2018 ◽  
pp. 1-10 ◽  
Author(s):  
Zhongmin Liu ◽  
Zhicai Chen ◽  
Zhanming Li ◽  
Wenjin Hu

In recent years, techniques based on deep detection models have achieved overwhelming improvements in detection accuracy, which makes them well suited to applications such as pedestrian detection. However, speed and accuracy are conflicting goals that have long puzzled researchers. How to achieve a good trade-off between them is a problem we must consider when designing detectors. To this end, we apply the general detector YOLOv2, a state-of-the-art method in general detection tasks, to pedestrian detection. We then modify the network parameters and structures according to the characteristics of pedestrians, making this method more suitable for detecting them. Experimental results on the INRIA pedestrian detection dataset show that it has a fairly high detection speed with a small precision gap compared with state-of-the-art pedestrian detection methods. Furthermore, we add weak semantic segmentation networks after the shared convolution layers to illuminate pedestrians and employ a scale-aware structure in our model to account for the wide size range in the Caltech pedestrian detection dataset, which yields further progress on top of the original improvement.


2021 ◽  
Vol 14 (10) ◽  
pp. 1717-1729
Author(s):  
Paul Boniol ◽  
John Paparrizos ◽  
Themis Palpanas ◽  
Michael J. Franklin

With the increasing demand for real-time analytics and decision making, anomaly detection methods need to operate over streams of values and handle drifts in data distribution. Unfortunately, existing approaches have severe limitations: they either require prior domain knowledge or become cumbersome and expensive to use in situations with recurrent anomalies of the same type. In addition, subsequence anomaly detection methods usually require access to the entire dataset and are not able to learn and detect anomalies in streaming settings. To address these problems, we propose SAND, a novel online method suitable for domain-agnostic anomaly detection. SAND aims to detect anomalies based on their distance to a model that represents normal behavior. SAND relies on a novel streaming methodology to incrementally update this model, which adapts to distribution drifts and omits obsolete data. The experimental results on several real-world datasets demonstrate that SAND correctly identifies single and recurrent anomalies without prior knowledge of the characteristics of these anomalies. SAND outperforms the current state-of-the-art algorithms by a large margin in terms of accuracy while achieving orders of magnitude speedups.


Author(s):  
Bo Chen ◽  
Hua Zhang ◽  
Yonglong Li ◽  
Shuang Wang ◽  
Huaifang Zhou ◽  
...  

An increasing number of detection methods based on computer vision are applied to detect cracks in water conservancy infrastructure. However, most studies directly use existing feature extraction networks, which were proposed for open-source datasets, to extract crack information. As the crack distribution and pixel features differ from these data, the extracted crack information is incomplete. In this paper, a deep learning-based network for dam surface crack detection is proposed, which mainly addresses the semantic segmentation of cracks on the dam surface. In particular, we design a shallow encoding network to extract features of crack images based on the statistical analysis of cracks. Further, to enhance the relevance of contextual information, we introduce an attention module into the decoding network. During training, we use the sum of Cross-Entropy and Dice Loss as the loss function to overcome data imbalance. The quantitative information of cracks is extracted by the imaging principle after using morphological algorithms to extract the morphological features of the predicted result. We built a manually annotated dataset containing 1577 images to verify the effectiveness of the proposed method. This method achieves state-of-the-art performance on our dataset. Specifically, the precision, recall, IoU, F1_measure, and accuracy reach 90.81%, 81.54%, 75.23%, 85.93%, and 99.76%, respectively, and the quantization error of cracks is less than 4%.
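The loss described in this abstract, the sum of cross-entropy and Dice loss, can be sketched for the binary case as follows. This is an illustrative version under assumptions, not the authors' code: the smoothing constant, the clipping epsilon, the equal weighting of the two terms, and the toy data are all hypothetical.

```python
import math

def cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross-entropy over flattened pixels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)

def dice_loss(probs, labels, smooth=1.0):
    """1 - soft Dice coefficient; the smooth term is an assumed stabiliser."""
    inter = sum(p * y for p, y in zip(probs, labels))
    denom = sum(probs) + sum(labels)
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)

def combined_loss(probs, labels):
    """Unweighted sum of the two terms."""
    return cross_entropy(probs, labels) + dice_loss(probs, labels)

# Hypothetical imbalanced example: one crack pixel among eight.
labels = [0, 0, 0, 0, 0, 0, 0, 1]
finds_crack = [0.1] * 7 + [0.9]
misses_crack = [0.1] * 8
loss_found = combined_loss(finds_crack, labels)
loss_missed = combined_loss(misses_crack, labels)
```

Because the Dice term depends on overlap with the foreground rather than on per-pixel averages, missing the single crack pixel is penalised heavily even though background pixels dominate, which is the usual motivation for combining it with cross-entropy on imbalanced data.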


2020 ◽  
Author(s):  
Stefan Mayr ◽  
Igor Klein ◽  
Claudia Künzer ◽  
Martin Rutzinger

Large-scale remote sensing products offer opportunities to address questions of global societal relevance. One of the most vital resources of our planet is fresh water. To monitor its dynamics, the application of water surface time-series has proven to be an effective tool, but to access reliable information, validation efforts are essential. Furthermore, remote sensing time-series products are increasingly used in modelling applications, where uncertainty estimation of input datasets is typically required. Especially for large-scale remote sensing products with high temporal resolution, common validation approaches such as comparison to in situ data or intercomparison with similar products are hardly viable. Here we propose the use of supervised and unsupervised outlier detection methods to yield pixel-wise uncertainty estimates in an internal validation. To this end, several algorithms are applied to a global, MODIS (Moderate Resolution Imaging Spectroradiometer) based, daily accessible water surface product (DLR Global WaterPack). Two main sources of uncertainty have been identified in the binary classification of cloud-free observations. As mixed pixels (water/non-water) and water impurities contribute to changes in the RED-NIR profile, we evaluate their effects by utilizing classified Landsat 8 images to determine water subpixel fractions and identify turbid water. Results are analyzed and compared in initial test regions across the globe.


Electronics ◽  
2021 ◽  
Vol 11 (1) ◽  
pp. 11
Author(s):  
Xing Xie ◽  
Lin Bai ◽  
Xinming Huang

LiDAR has been widely used in autonomous driving systems to provide high-precision 3D geometric information about the vehicle’s surroundings for perception, localization, and path planning. LiDAR-based point cloud semantic segmentation is an important task with a critical real-time requirement. However, most existing convolutional neural network (CNN) models for 3D point cloud semantic segmentation are very complex and can hardly run in real time on an embedded platform. In this study, a lightweight CNN structure was proposed for projection-based LiDAR point cloud semantic segmentation with only 1.9 M parameters, an 87% reduction compared to the state-of-the-art networks. When evaluated on a GPU, the processing time was 38.5 ms per frame, and it achieved a 47.9% mIoU score on the Semantic-KITTI dataset. In addition, the proposed CNN was targeted to an FPGA using an NVDLA architecture, which resulted in a 2.74x speedup over the GPU implementation with a 46 times improvement in power efficiency.


Author(s):  
Yang Zhao ◽  
Wei Tian ◽  
Hong Cheng

With the fast development of deep learning models in the field of autonomous driving, research on the uncertainty estimation of deep learning models has also flourished. Herein, a pyramid Bayesian deep learning method is proposed for the model uncertainty evaluation of semantic segmentation. Semantic segmentation is one of the most important perception problems in visual scene understanding, which is critical for autonomous driving. This study aims to optimize Bayesian SegNet for uncertainty evaluation. This paper first simplifies the network structure of Bayesian SegNet by reducing the number of MC-Dropout layers and then introduces the pyramid pooling module to improve the performance of Bayesian SegNet. mIoU and mPAvPU are used as evaluation metrics to test the proposed method on the public Cityscapes dataset. The experimental results show that the proposed method improves the sampling effect of the Bayesian SegNet, shortens the sampling time, and improves the network performance.

