MS-AFF: A Novel Semantic Segmentation Approach for Buried Object Based on Multi-scale Attentional Feature Fusion

Author(s):  
Chao Lu ◽  
Fansheng Chen ◽  
Xiaofeng Su ◽  
Dan Zeng

Abstract Infrared technology is widely used in precision guidance and mine detection because it can capture the heat radiated by a target object. We use infrared (IR) thermography to obtain infrared images of buried objects. Compared with visible images, infrared images have poor resolution, low contrast, and a blurred visual appearance, which makes it difficult to segment the target object, especially against complex backgrounds. Under these conditions, traditional segmentation methods perform poorly on infrared images because they are easily disturbed by noise and non-target objects. With the advance of deep convolutional neural networks (CNNs), deep learning-based methods have made significant improvements in semantic segmentation. However, few of them address infrared image semantic segmentation, which is a more challenging scenario than visible-image segmentation. Moreover, the lack of an infrared image dataset is also a problem for current deep learning-based methods. To address these problems, we propose a multi-scale attentional feature fusion (MS-AFF) module for infrared image semantic segmentation. Specifically, we integrate a series of feature maps from different levels using an atrous spatial pyramid structure, which gives the model a rich representation of infrared images. In addition, a global spatial information attention module is employed so that the model focuses on the target region and is less disturbed by the background of infrared images. We also build an infrared segmentation dataset using an infrared thermal imaging system. Extensive experiments on this dataset show the superiority of our method.
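
To illustrate the idea behind an atrous spatial pyramid, the following is a minimal 1D sketch, not the authors' implementation. It assumes a single shared kernel, zero padding, and the hypothetical dilation rates (1, 2, 4); the abstract does not state the rates, kernel sizes, or how the branches are fused. Here the branches are simply stacked per position.

```python
def dilated_conv1d(signal, kernel, rate):
    """Same-length 1D cross-correlation with dilation `rate` and zero padding."""
    k = len(kernel)
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Taps are spaced `rate` samples apart around position i.
            idx = i + (j - k // 2) * rate
            if 0 <= idx < n:  # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def atrous_pyramid(signal, kernel, rates=(1, 2, 4)):
    """Run parallel dilated branches and stack their outputs per position.

    The rates are illustrative assumptions, not values from the abstract.
    """
    branches = [dilated_conv1d(signal, kernel, r) for r in rates]
    return [list(values) for values in zip(*branches)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]
    for position, feats in enumerate(atrous_pyramid(impulse, [1, 1, 1])):
        print(position, feats)
```

With an impulse input, each branch responds at positions spaced by its own rate, which shows how larger rates widen the context seen by a fixed-size kernel.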

2021 ◽  
Vol 13 (18) ◽  
pp. 3650
Author(s):  
Ru Luo ◽  
Jin Xing ◽  
Lifu Chen ◽  
Zhouhao Pan ◽  
Xingmin Cai ◽  
...  

Although deep learning has achieved great success in aircraft detection from SAR imagery, its blackbox behavior has been criticized for low comprehensibility and interpretability. Such challenges have impeded the trustworthiness and wide application of deep learning techniques in SAR image analytics. In this paper, we propose an innovative eXplainable Artificial Intelligence (XAI) framework to glassbox deep neural networks (DNN) by using aircraft detection as a case study. This framework is composed of three parts: hybrid global attribution mapping (HGAM) for backbone network selection, path aggregation network (PANet), and class-specific confidence scores mapping (CCSM) for visualization of the detector. HGAM integrates the local and global XAI techniques to evaluate the effectiveness of DNN feature extraction; PANet provides advanced feature fusion to generate multi-scale prediction feature maps; while CCSM relies on visualization methods to examine the detection performance with given DNN and input SAR images. This framework can select the optimal backbone DNN for aircraft detection and map the detection performance for better understanding of the DNN. We verify its effectiveness with experiments using Gaofen-3 imagery. Our XAI framework offers an explainable approach to design, develop, and deploy DNN for SAR image analytics.


2020 ◽  
Author(s):  
Fengli Lu ◽  
Chengcai Fu ◽  
Guoying Zhang ◽  
Jie Shi

Abstract Accurate segmentation of fractures in coal rock CT images is important for safe production and the development of coalbed methane. However, coal rock fractures are formed through natural geological evolution and are complex, low in contrast, and varied in scale. Furthermore, there is no published coal rock dataset. In this paper, we propose an adaptive multi-scale feature fusion based residual U-net (AMSFFR-U-net) for fracture segmentation in coal rock CT images. Dilated residual blocks (DResBlock) with dilation rates (1,2,3) are embedded into the encoding branch of the U-net structure, which improves the network's feature extraction ability and captures fractures at different scales. Furthermore, feature maps of different sizes in the encoding branch are concatenated by an adaptive multi-scale feature fusion (AMSFF) module. AMSFF not only captures fractures at different scales but also improves the restoration of spatial information. To alleviate the lack of coal rock fracture training data, we apply a set of comprehensive data augmentation operations to increase the diversity of training samples. Our network, U-net, and Res-U-net are tested on our test set of coal rock CT images, with coal rock samples from five different regions. The experimental results show that our proposed approach improves the average Dice coefficient by 2.9%, the average precision by 7.2%, and the average recall by 9.1%. Therefore, AMSFFR-U-net achieves better segmentation of coal rock fractures and has stronger generalization ability and robustness.
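
As a rough illustration of why dilation helps capture fractures at different scales, the sketch below computes the receptive field of a stack of stride-1 convolutions. It assumes 3x3 kernels and that the three rates (1,2,3) are applied one after another; the abstract does not confirm either assumption, so this is not a description of the actual DResBlock.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (per axis) of stacked stride-1 dilated convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    field = 1
    for d in dilations:
        field += (kernel_size - 1) * d
    return field


if __name__ == "__main__":
    # Assumed 3x3 kernels; rates (1, 2, 3) are the ones quoted in the abstract.
    print("dilated (1,2,3):", receptive_field(3, (1, 2, 3)))
    print("plain   (1,1,1):", receptive_field(3, (1, 1, 1)))
```

Under these assumptions the dilated stack covers 13 pixels per axis versus 7 for three ordinary 3x3 layers, with the same number of weights.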


Author(s):  
Tao Hu ◽  
Pengwan Yang ◽  
Chiliang Zhang ◽  
Gang Yu ◽  
Yadong Mu ◽  
...  

Few-shot learning is a nascent research topic, motivated by the fact that traditional deep learning methods require tremendous amounts of data. The scarcity of annotated data becomes even more challenging in semantic segmentation, since pixel-level annotation for segmentation tasks is more labor-intensive to acquire. To tackle this issue, we propose an Attention-based Multi-Context Guiding (A-MCG) network, which consists of three branches: the support branch, the query branch, and the feature fusion branch. A key differentiator of A-MCG is the integration of multi-scale context features between the support and query branches, enforcing better guidance from the support set. In addition, we adopt spatial attention along the fusion branch to highlight context information from several scales, enhancing self-supervision in one-shot learning. To address the fusion problem in multi-shot learning, Conv-LSTM is adopted to collaboratively integrate the sequential support features and improve the final accuracy. Our architecture obtains state-of-the-art results on unseen classes in a variant of the PASCAL VOC12 dataset and performs favorably against previous work, with gains of 1.1% and 1.4% in mIoU in the 1-shot and 5-shot settings, respectively.


2021 ◽  
Vol 13 (2) ◽  
pp. 38
Author(s):  
Yao Xu ◽  
Qin Yu

Great achievements have been made in pedestrian detection through deep learning. For detectors based on deep learning, making better use of features has become the key to their detection effect. While current pedestrian detectors have made efforts in feature utilization to improve their detection performance, the feature utilization is still inadequate. To solve the problem of inadequate feature utilization, we proposed the Multi-Level Feature Fusion Module (MFFM) and its Multi-Scale Feature Fusion Unit (MFFU) sub-module, which connect feature maps of the same scale and different scales by using horizontal and vertical connections and shortcut structures. All of these connections are accompanied by weights that can be learned; thus, they can be used as adaptive multi-level and multi-scale feature fusion modules to fuse the best features. Then, we built a complete pedestrian detector, the Adaptive Feature Fusion Detector (AFFDet), which is an anchor-free one-stage pedestrian detector that can make full use of features for detection. As a result, compared with other methods, our method has better performance on the challenging Caltech Pedestrian Detection Benchmark (Caltech) and has quite competitive speed. It is the current state-of-the-art one-stage pedestrian detection method.
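
The notion of connections "accompanied by weights that can be learned" can be sketched as a weighted sum of same-shaped feature maps. The example below is only an assumption about the general technique: it normalizes the raw weights with a softmax, which the abstract does not specify, and the names `softmax` and `fuse_features` are hypothetical helpers, not part of MFFM or MFFU.

```python
import math


def softmax(raw_weights):
    """Numerically stable softmax so the fusion weights sum to 1."""
    peak = max(raw_weights)
    exps = [math.exp(w - peak) for w in raw_weights]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_features(features, raw_weights):
    """Weighted sum of equally sized (flattened) feature maps.

    `raw_weights` stand in for parameters a network would learn by
    gradient descent; here they are fixed numbers for illustration.
    """
    alphas = softmax(raw_weights)
    length = len(features[0])
    return [
        sum(a * feat[i] for a, feat in zip(alphas, features))
        for i in range(length)
    ]


if __name__ == "__main__":
    shallow = [1.0, 2.0]
    deep = [3.0, 4.0]
    print(fuse_features([shallow, deep], [0.0, 0.0]))  # equal weights: average
    print(fuse_features([shallow, deep], [2.0, 0.0]))  # favors the first map
```

During training, the raw weights would shift toward whichever input level or scale contributes most to the detection loss.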


2020 ◽  
Author(s):  
Fengli Lu ◽  
Chengcai Fu ◽  
Guoying Zhang ◽  
Jie Shi

Abstract Accurate segmentation of fractures in coal rock CT images is important for safe production and the development of coalbed methane. However, segmenting coal rock fractures accurately faces the following challenges: 1) coal rock CT images have high background noise, sparse targets, weak boundary information, uneven gray levels, low contrast, etc.; 2) there is no public dataset of coal rock CT images; 3) coal rock CT image samples are limited. In this paper, we propose an adaptive multi-scale feature fusion based residual U-net (AMSFFRU-net) for fracture segmentation in coal rock CT images to address these issues. To reduce the loss of tiny and weak fractures, dilated residual blocks (DResBlock) are embedded into the U-net structure; they expand the receptive field and extract fracture information at different scales. Furthermore, to reduce the loss of spatial information during down-sampling, feature maps of different sizes in the encoding branch are concatenated by an adaptive multi-scale feature fusion module, whose output serves as the input of the first up-sampling step in the decoding branch. We also apply a set of comprehensive data augmentation operations to increase the diversity of training samples. Our network, U-net, and ResU-net are tested on our dataset of coal rock CT images with 5 different textures. The experimental results show that, compared with U-net and ResU-net, our proposed approach improves the average Dice coefficient by 5.1% and 2.9% and the average accuracy by 4.5% and 2%, respectively. Therefore, AMSFFRU-net achieves better segmentation of coal rock fractures and has stronger generalization ability and robustness.
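
For reference, the two reported metrics can be computed from binary masks as in the sketch below. It uses the standard definitions of the Dice coefficient and pixel accuracy on flattened 0/1 masks; the toy masks are invented for illustration and are not data from the paper, and the paper's exact averaging procedure is not stated in the abstract.

```python
def dice_and_accuracy(pred, truth):
    """Dice coefficient and pixel accuracy for flattened binary masks."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("masks must be non-empty and of equal length")
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    denom = 2 * tp + fp + fn
    # Two empty masks agree perfectly, so Dice is defined as 1 in that case.
    dice = 2 * tp / denom if denom else 1.0
    accuracy = (tp + tn) / len(pred)
    return dice, accuracy


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0, 0, 0]  # toy fracture mask (1 = fracture)
    reference = [1, 0, 0, 1, 1, 0, 0, 0]
    dice, accuracy = dice_and_accuracy(predicted, reference)
    print(f"Dice = {dice:.3f}, accuracy = {accuracy:.3f}")
```

Because fractures are sparse, accuracy is dominated by background pixels, which is one reason Dice is usually reported alongside it.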


2021 ◽  
Vol 13 (2) ◽  
pp. 328
Author(s):  
Wenkai Liang ◽  
Yan Wu ◽  
Ming Li ◽  
Yice Cao ◽  
Xin Hu

The classification of high-resolution (HR) synthetic aperture radar (SAR) images is of great importance for SAR scene interpretation and application. However, intricate spatial structural patterns and complex statistical properties make SAR image classification a challenging task, especially when labeled SAR data are limited. This paper proposes a novel HR SAR image classification method using a multi-scale deep feature fusion network and a covariance pooling manifold network (MFFN-CPMN). MFFN-CPMN combines the advantages of local spatial features and global statistical properties and considers the multi-feature information fusion of SAR images in representation learning. First, we propose a Gabor-filtering-based multi-scale feature fusion network (MFFN) to capture the spatial pattern and obtain discriminative features of SAR images. The MFFN is a deep convolutional neural network (CNN). To make full use of a large amount of unlabeled data, the weights of each MFFN layer are optimized by an unsupervised denoising dual-sparse encoder. Moreover, the feature fusion strategy in MFFN can effectively exploit the complementary information between different levels and different scales. Second, we use a covariance pooling manifold network to further extract the global second-order statistics of SAR images over the fused feature maps. Finally, the obtained covariance descriptor is more distinct for various land covers. Experimental results on four HR SAR images demonstrate the effectiveness of the proposed method and show promising results compared with other related algorithms.
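
The second-order statistics mentioned above can be illustrated with plain covariance pooling: each spatial position of a feature map is treated as one sample of a C-dimensional channel vector, and the C x C sample covariance summarizes the whole map. This is a minimal sketch of that general operation only; the manifold network that processes the descriptor in the paper is not covered, and the toy feature values are invented.

```python
def covariance_pool(features):
    """Sample covariance across positions.

    `features` is a list of N positions, each a list of C channel values.
    Returns a C x C matrix (list of lists) using the 1 / (N - 1) estimator.
    """
    n = len(features)
    if n < 2:
        raise ValueError("need at least two positions")
    c = len(features[0])
    mean = [sum(f[j] for f in features) / n for j in range(c)]
    centered = [[f[j] - mean[j] for j in range(c)] for f in features]
    return [
        [sum(row[a] * row[b] for row in centered) / (n - 1) for b in range(c)]
        for a in range(c)
    ]


if __name__ == "__main__":
    # Three spatial positions, two channels (toy values).
    toy_map = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
    for row in covariance_pool(toy_map):
        print(row)
```

The descriptor has a fixed size regardless of the spatial resolution of the map, which is what makes it usable as a global image representation.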


2021 ◽  
Vol 10 (3) ◽  
pp. 125
Author(s):  
Junqing Huang ◽  
Liguo Weng ◽  
Bingyu Chen ◽  
Min Xia

Analyzing land cover using remote sensing images has broad prospects, and the precise segmentation of land cover is the key to applying this technology. Nowadays, the Convolutional Neural Network (CNN) is widely used in many image semantic segmentation tasks. However, existing CNN models often exhibit poor generalization ability and low segmentation accuracy when dealing with land cover segmentation tasks. To solve this problem, this paper proposes the Dual Function Feature Aggregation Network (DFFAN). This method combines image context information, gathers image spatial information, and extracts and fuses features. DFFAN uses residual neural networks as its backbone to obtain feature information of remote sensing images at different dimensions through multiple downsampling steps. This work designs an Affinity Matrix Module (AMM) to obtain the context of each feature map and proposes a Boundary Feature Fusion Module (BFF) to fuse the context information and spatial information of an image to determine the location distribution of each category in the image. Compared with existing methods, the proposed method is significantly more accurate. Its mean intersection over union (MIoU) on the LandCover dataset reaches 84.81%.
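
The MIoU figure quoted above refers to a standard metric, which can be computed from per-pixel class labels as sketched below. The label sequences are toy data rather than LandCover results, and skipping classes absent from both prediction and reference is one common convention that the paper may or may not follow.

```python
def mean_iou(pred, truth, num_classes):
    """Mean intersection over union for flattened per-pixel class labels."""
    ious = []
    for cls in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == cls and t == cls)
        union = sum(1 for p, t in zip(pred, truth) if p == cls or t == cls)
        if union:  # skip classes absent from both prediction and reference
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    predicted = [0, 0, 1, 1, 2, 2]  # toy labels for six pixels
    reference = [0, 1, 1, 1, 2, 0]
    print(f"MIoU = {mean_iou(predicted, reference, 3):.3f}")
```

In this toy case the per-class IoUs are 1/3, 2/3, and 1/2, so the mean is 0.5.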


2021 ◽  
Vol 63 (9) ◽  
pp. 529-533
Author(s):  
Jiali Zhang ◽  
Yupeng Tian ◽  
LiPing Ren ◽  
Jiaheng Cheng ◽  
JinChen Shi

Reflection in images is common and the removal of complex noise such as image reflection is still being explored. The problem is difficult and ill-posed, not only because there is no mixing function but also because there are no constraints in the output space (the processed image). When it comes to detecting defects on metal surfaces using infrared thermography, reflection from smooth metal surfaces can easily affect the final detection results. Therefore, it is essential to remove the reflection interference in infrared images. With the continuous application and expansion of neural networks in the field of image processing, researchers have tried to apply neural networks to remove image reflection. However, they have mainly focused on reflection interference removal in visible images and it is believed that no researchers have applied neural networks to remove reflection interference in infrared images. In this paper, the authors introduce the concept of a conditional generative adversarial network (cGAN) and propose an end-to-end trained network based on this with two types of loss: perceptual loss and adversarial loss. A self-built infrared reflection image dataset from an infrared camera is used. The experimental results demonstrate the effectiveness of this GAN for removing infrared image reflection.

