Looking for Change? Roll the Dice and Demand Attention

2021 ◽  
Vol 13 (18) ◽  
pp. 3707
Author(s):  
Foivos I. Diakogiannis ◽  
François Waldner ◽  
Peter Caccetta

Change detection, i.e., the per-pixel identification of changes for some classes of interest from a set of bi-temporal co-registered images, is a fundamental task in the field of remote sensing. It remains challenging due to unrelated forms of change that appear at different times in input images. Here, we propose a deep learning framework for the task of semantic change detection in very high-resolution aerial images. Our framework consists of a new loss function, a new attention module, new feature extraction building blocks, and a new backbone architecture that is tailored for the task of semantic change detection. Specifically, we define a new form of set similarity that is based on an iterative evaluation of a variant of the Dice coefficient. We use this similarity metric to define a new loss function as well as a new, memory-efficient, spatial and channel convolution attention layer: the FracTAL. We introduce two new efficient self-contained feature extraction convolution units: the CEECNet and FracTALResNet units. Further, we propose a new encoder/decoder scheme, a network macro-topology, that is tailored for the task of change detection. The key insight in our approach is to facilitate the use of relative attention between two convolution layers in order to fuse them. We validate our approach by showing excellent performance and achieving state-of-the-art scores (F1 and Intersection over Union, hereafter IoU) on two building change detection datasets, namely, the LEVIRCD (F1: 0.918, IoU: 0.848) and the WHU (F1: 0.938, IoU: 0.882) datasets.
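For orientation, the two standard set-similarity measures that Dice-style losses build on can be sketched as follows. This is a minimal illustration, not the authors' iterative variant or the FracTAL layer: the function names, the `eps` smoothing term, and the flat-list inputs are assumptions made for the example.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[float], label: Sequence[float],
                     eps: float = 1e-7) -> float:
    """Soft Dice coefficient: 2 * <pred, label> / (sum(pred) + sum(label))."""
    intersection = sum(p * l for p, l in zip(pred, label))
    # eps (an assumed smoothing term) avoids division by zero on empty masks.
    return (2.0 * intersection + eps) / (sum(pred) + sum(label) + eps)


def tanimoto_coefficient(pred: Sequence[float], label: Sequence[float],
                         eps: float = 1e-7) -> float:
    """Tanimoto coefficient: <p, l> / (<p, p> + <l, l> - <p, l>)."""
    intersection = sum(p * l for p, l in zip(pred, label))
    denominator = (sum(p * p for p in pred)
                   + sum(l * l for l in label)
                   - intersection)
    return (intersection + eps) / (denominator + eps)


def dice_loss(pred: Sequence[float], label: Sequence[float]) -> float:
    """A similarity in [0, 1] becomes a loss by taking 1 - similarity."""
    return 1.0 - dice_coefficient(pred, label)


# Toy per-pixel change mask: two pixels predicted as changed, one truly changed.
pred = [1.0, 1.0, 0.0, 0.0]
label = [1.0, 0.0, 0.0, 0.0]
print(dice_coefficient(pred, label), tanimoto_coefficient(pred, label))
```

Both measures equal 1 for a perfect match and fall toward 0 as the overlap shrinks, which is why one minus the similarity is a natural loss for per-pixel change masks.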

Author(s):  
Kunping Yang ◽  
Gui-Song Xia ◽  
Zicheng Liu ◽  
Bo Du ◽  
Wen Yang ◽  
...  

2005 ◽  
Vol 33 (1) ◽  
pp. 2-17 ◽  
Author(s):  
D. Colbry ◽  
D. Cherba ◽  
J. Luchini

Commercial databases containing images of tire tread patterns are currently used by product designers, forensic specialists and product application personnel to identify whether a given tread pattern matches an existing tire. Currently, this pattern matching process is almost entirely manual, requiring visual searches of extensive libraries of tire tread patterns. Our work explores a first step toward automating this pattern matching process by building on feature analysis techniques from computer vision and image processing to develop a new method for extracting and classifying features from tire tread patterns and automatically locating candidate matches from a database of existing tread pattern images. Our method begins with a selection of tire tread images obtained from multiple sources (including manufacturers' literature, Web site images, and Tire Guides, Inc.), which are preprocessed and normalized using Two-Dimensional Fast Fourier Transforms (2D-FFT). The results of this preprocessing are feature-rich images that are further analyzed using feature extraction algorithms drawn from research in computer vision. A new feature extraction algorithm is developed based on the geometry of the 2D-FFT images of the tire. The resulting FFT-based analysis allows independent classification of the tire images along two dimensions, specifically by separating “rib” and “lug” features of the tread pattern. A dimensionality of (0,0) indicates a smooth treaded tire with no pattern; dimensionalities of (1,0) and (0,1) indicate purely rib and purely lug tires; and a dimensionality of (1,1) indicates an all-season pattern. This analysis technique allows a candidate tire to be classified according to the features of its tread pattern, and other tires with similar features and tread pattern classifications can be automatically retrieved from the database.
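The idea of reading directional structure off a 2D Fourier spectrum can be illustrated with a toy sketch. It is not the algorithm from the paper: the helper names, the energy threshold, and the order of the returned pair are hypothetical, and a naive DFT stands in for a library FFT so that the example needs only the standard library.

```python
import cmath
from typing import List, Tuple

Image = List[List[float]]


def dft2_magnitude(img: Image) -> List[List[float]]:
    """Magnitude of the 2D discrete Fourier transform (naive O(N^4) version)."""
    rows, cols = len(img), len(img[0])
    mag = [[0.0] * cols for _ in range(rows)]
    for u in range(rows):          # frequency index along the row (y) direction
        for v in range(cols):      # frequency index along the column (x) direction
            acc = 0j
            for y in range(rows):
                for x in range(cols):
                    angle = -2.0 * cmath.pi * (u * y / rows + v * x / cols)
                    acc += img[y][x] * cmath.exp(1j * angle)
            mag[u][v] = abs(acc)
    return mag


def axis_energies(mag: List[List[float]]) -> Tuple[float, float]:
    """Non-DC energy on the two frequency axes: (variation along x, variation along y)."""
    rows, cols = len(mag), len(mag[0])
    along_x = sum(mag[0][v] for v in range(1, cols))
    along_y = sum(mag[u][0] for u in range(1, rows))
    return along_x, along_y


def classify_pattern(img: Image, threshold: float = 1.0) -> Tuple[int, int]:
    """Hypothetical two-flag label: 1 where an axis carries energy above threshold."""
    along_x, along_y = axis_energies(dft2_magnitude(img))
    return (int(along_x > threshold), int(along_y > threshold))


size = 8
flat = [[1.0] * size for _ in range(size)]                      # no pattern
stripes = [[float(x % 2) for x in range(size)] for _ in range(size)]
grid = [[float(x % 2) + float(y % 2) for x in range(size)] for y in range(size)]
print(classify_pattern(flat), classify_pattern(stripes), classify_pattern(grid))
```

In this toy setup a flat image yields no energy on either axis, stripes yield energy on one axis only, and a grid yields energy on both, which mirrors the two-flag classification described in the abstract.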


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2803
Author(s):  
Rabeea Jaffari ◽  
Manzoor Ahmed Hashmani ◽  
Constantino Carlos Reyes-Aldasoro

The segmentation of power lines (PLs) from aerial images is a crucial task for the safe navigation of unmanned aerial vehicles (UAVs) operating at low altitudes. Despite the advances in deep learning-based approaches for PL segmentation, these models are still vulnerable to the class imbalance present in the data. The PLs occupy only a minimal portion (1–5%) of the aerial images as compared to the background region (95–99%). Generally, this class imbalance problem is addressed via the use of PL-specific detectors in conjunction with the popular class balanced cross entropy (BBCE) loss function. However, these PL-specific detectors do not work outside their application areas, and a BBCE loss requires hyperparameter tuning for class-wise weights, which is not trivial. Moreover, the BBCE loss results in low dice scores and precision values and thus fails to achieve an optimal trade-off between dice scores, model accuracy, and precision–recall values. In this work, we propose a generalized focal loss function based on the Matthews correlation coefficient (MCC), or Phi coefficient, to address the class imbalance problem in PL segmentation while utilizing a generic deep segmentation architecture. We evaluate our loss function by improving the vanilla U-Net model with an additional convolutional auxiliary classifier head (ACU-Net) for better learning and faster model convergence. The evaluation on two PL datasets, namely the Mendeley Power Line Dataset and the Power Line Dataset of Urban Scenes (PLDU), where PLs occupy around 1% and 2% of the aerial image area, respectively, reveals that our proposed loss function outperforms the popular BBCE loss by 16% in PL dice scores on both datasets, by 19% in precision and false detection rate (FDR) values for the Mendeley PL dataset, and by 15% in precision and FDR values for the PLDU, with a minor degradation in the accuracy and recall values. Moreover, our proposed ACU-Net outperforms the baseline vanilla U-Net for the characteristic evaluation parameters in the range of 1–10% for both PL datasets. Thus, our proposed loss function with ACU-Net achieves an optimal trade-off for the characteristic evaluation parameters without any bells and whistles. Our code is available on GitHub.
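To make the role of the Matthews correlation coefficient concrete, here is a minimal sketch of a plain MCC-based loss computed from soft confusion-matrix counts. It is not the generalized focal form proposed by the authors; the function names and the `eps` term are assumptions for illustration.

```python
import math
from typing import Sequence, Tuple


def soft_confusion(probs: Sequence[float],
                   labels: Sequence[float]) -> Tuple[float, float, float, float]:
    """Soft TP, TN, FP, FN from per-pixel foreground probabilities and 0/1 labels."""
    tp = sum(p * y for p, y in zip(probs, labels))
    tn = sum((1.0 - p) * (1.0 - y) for p, y in zip(probs, labels))
    fp = sum(p * (1.0 - y) for p, y in zip(probs, labels))
    fn = sum((1.0 - p) * y for p, y in zip(probs, labels))
    return tp, tn, fp, fn


def mcc(tp: float, tn: float, fp: float, fn: float, eps: float = 1e-7) -> float:
    """Matthews correlation coefficient, in [-1, 1]."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # eps (an assumed smoothing term) keeps the ratio defined when a factor is zero.
    return numerator / (denominator + eps)


def mcc_loss(probs: Sequence[float], labels: Sequence[float]) -> float:
    """Plain 1 - MCC loss: about 0 for a perfect mask, 1 for an uninformative one."""
    return 1.0 - mcc(*soft_confusion(probs, labels))


labels = [1.0, 0.0, 0.0, 0.0]                     # one foreground pixel in four
print(mcc_loss([1.0, 0.0, 0.0, 0.0], labels))     # perfect prediction
print(mcc_loss([0.0, 0.0, 0.0, 0.0], labels))     # predicts background everywhere
```

Because MCC uses all four confusion-matrix entries, a model that predicts only background gains nothing from the many true negatives, which is the property that makes it attractive under heavy class imbalance.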


Author(s):  
Zhenzhen Yang ◽  
Pengfei Xu ◽  
Yongpeng Yang ◽  
Bing-Kun Bao

The U-Net has become the most popular structure in medical image segmentation in recent years. Although its performance for medical image segmentation is outstanding, a large number of experiments demonstrate that the classical U-Net network architecture seems to be insufficient when the size of segmentation targets changes and when the target and background are imbalanced in different forms of segmentation. To improve the U-Net network architecture, we develop a new architecture named densely connected U-Net (DenseUNet) network in this article. The proposed DenseUNet network adopts a dense block to improve the feature extraction capability and employs a multi-feature fuse block that fuses feature maps of different levels to increase the accuracy of feature extraction. In addition, in view of the advantages of the cross entropy and the dice loss functions, a new loss function for the DenseUNet network is proposed to deal with the imbalance between target and background. Finally, we test the proposed DenseUNet network and compare it with the multi-resolutional U-Net (MultiResUNet) and the classic U-Net networks on three different datasets. The experimental results show that the DenseUNet network performs significantly better than the MultiResUNet and the classic U-Net networks.
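One common way to combine the two losses named above is a weighted sum, sketched below. The abstract does not give the authors' exact formulation, so the weight `alpha`, the `eps` clamping, and the function names are all assumptions.

```python
import math
from typing import Sequence


def binary_cross_entropy(probs: Sequence[float], labels: Sequence[float],
                         eps: float = 1e-7) -> float:
    """Mean binary cross entropy; probabilities are clamped to avoid log(0)."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return total / len(probs)


def dice_loss(probs: Sequence[float], labels: Sequence[float],
              eps: float = 1e-7) -> float:
    """1 - soft Dice coefficient."""
    intersection = sum(p * y for p, y in zip(probs, labels))
    return 1.0 - (2.0 * intersection + eps) / (sum(probs) + sum(labels) + eps)


def combined_loss(probs: Sequence[float], labels: Sequence[float],
                  alpha: float = 0.5) -> float:
    """Weighted sum of the two losses; alpha is a hypothetical mixing weight."""
    return (alpha * binary_cross_entropy(probs, labels)
            + (1.0 - alpha) * dice_loss(probs, labels))


labels = [1.0, 0.0, 0.0, 0.0]
print(combined_loss([1.0, 0.0, 0.0, 0.0], labels))   # perfect prediction
print(combined_loss([0.5, 0.5, 0.5, 0.5], labels))   # uninformative prediction
```

The cross-entropy term gives a per-pixel gradient signal, while the Dice term scores overlap on the target region as a whole, so the sum is less dominated by a large background.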


2021 ◽  
Vol 180 ◽  
pp. 108098
Author(s):  
Supriya Supriya ◽  
Siuly Siuly ◽  
Hua Wang ◽  
Yanchun Zhang

2021 ◽  
Vol 7 (5) ◽  
pp. 89
Author(s):  
George K. Sidiropoulos ◽  
Polixeni Kiratsa ◽  
Petros Chatzipetrou ◽  
George A. Papakostas

This paper aims to provide a brief review of the feature extraction methods applied for finger vein recognition. The presented study is designed in a systematic way in order to shed light on the scientific interest in biometric systems based on finger vein biometric features. The analysis spans a period of 13 years (from 2008 to 2020). The examined feature extraction algorithms are clustered into five categories and are presented in a qualitative manner by focusing mainly on the techniques applied to represent the features of the finger veins that uniquely prove a human’s identity. In addition, the case of non-handcrafted features learned in a deep learning framework is also examined. The conducted literature analysis revealed the increased interest in finger vein biometric systems as well as the high diversity of different feature extraction methods proposed over the past several years. However, last year this interest shifted to the application of Convolutional Neural Networks following the general trend of applying deep learning models in a range of disciplines. Finally, yet importantly, this work highlights the limitations of the existing feature extraction methods and describes the research actions needed to face the identified challenges.


Author(s):  
Ke Wang ◽  
Qingwen Xue ◽  
Jian John Lu

Identifying high-risk drivers before an accident happens is necessary for traffic accident control and prevention. Due to the class-imbalanced nature of driving data, high-risk samples, as the minority class, are usually ill-treated by standard classification algorithms. Instead of applying preset sampling or cost-sensitive learning, this paper proposes a novel automated machine learning framework that simultaneously and automatically searches for the optimal sampling, cost-sensitive loss function, and probability calibration to handle the class-imbalance problem in the recognition of risky drivers. The hyperparameters that control sampling ratio and class weight, along with other hyperparameters, are optimized by Bayesian optimization. To demonstrate the performance of the proposed automated learning framework, we establish a risky driver recognition model as a case study, using video-extracted vehicle trajectory data of 2427 private cars on a German highway. Based on rear-end collision risk evaluation, only 4.29% of all drivers are labeled as risky drivers. The inputs of the recognition model are the discrete Fourier transform coefficients of the target vehicle’s longitudinal speed, lateral speed, and the gap between the target vehicle and its preceding vehicle. Among 12 sampling methods, 2 cost-sensitive loss functions, and 2 probability calibration methods, the result of automated machine learning is consistent with manual searching but much more computationally efficient. We find that the combination of Support Vector Machine-based Synthetic Minority Oversampling TEchnique (SVMSMOTE) sampling, a cost-sensitive cross-entropy loss function, and isotonic regression can significantly improve the recognition ability and reduce the error of the predicted probability.
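As an illustration of how a time series such as longitudinal speed can be turned into discrete Fourier transform features, consider the sketch below. The abstract does not say how many coefficients are kept or whether magnitudes or complex values are used, so keeping the first `k` magnitudes is an assumption, as are the function name and the synthetic signal.

```python
import cmath
import math
from typing import List, Sequence


def dft_features(series: Sequence[float], k: int) -> List[float]:
    """Magnitudes of the first k discrete Fourier transform coefficients."""
    n = len(series)
    features = []
    for freq in range(k):
        coeff = sum(value * cmath.exp(-2j * cmath.pi * freq * t / n)
                    for t, value in enumerate(series))
        features.append(abs(coeff))
    return features


# Synthetic stand-in for a speed profile: a sinusoid with 2 cycles over 16 samples.
n_samples = 16
speed = [math.sin(2.0 * math.pi * 2 * t / n_samples) for t in range(n_samples)]
print([round(f, 3) for f in dft_features(speed, k=4)])
```

A fixed-length vector of low-frequency coefficients gives every trajectory the same input dimension regardless of the detailed shape of its speed profile, which is convenient for standard classifiers.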


2016 ◽  
Vol 8 (12) ◽  
pp. 1030 ◽  
Author(s):  
Shouji Du ◽  
Yunsheng Zhang ◽  
Rongjun Qin ◽  
Zhihua Yang ◽  
Zhengrong Zou ◽  
...  

1998 ◽  
Author(s):  
Ajay Divakaran ◽  
Hiroshi Ito ◽  
Huifang Sun ◽  
Tommy Poon
