SDFCNv2: An Improved FCN Framework for Remote Sensing Images Semantic Segmentation

2021 ◽  
Vol 13 (23) ◽  
pp. 4902
Author(s):  
Guanzhou Chen ◽  
Xiaoliang Tan ◽  
Beibei Guo ◽  
Kun Zhu ◽  
Puyun Liao ◽  
...  

Semantic segmentation is a fundamental task in remote sensing image analysis (RSIA). Fully convolutional networks (FCNs) have achieved state-of-the-art performance in the semantic segmentation of natural scene images. However, due to distinctive differences between natural scene images and remotely sensed (RS) images, FCN-based semantic segmentation methods from the field of computer vision cannot achieve promising performance on RS images without modification. In previous work, we proposed an RS image semantic segmentation framework, SDFCNv1, combined with a majority-voting postprocessing method. Nevertheless, it still has some drawbacks, such as a small receptive field and a large number of parameters. In this paper, we propose an improved semantic segmentation framework, SDFCNv2, based on SDFCNv1, to conduct optimal semantic segmentation on RS images. We first construct a novel FCN model with hybrid basic convolutional (HBC) blocks and spatial-channel-fusion squeeze-and-excitation (SCFSE) modules, which has a larger receptive field and fewer model parameters. We also put forward a spectral-specific stochastic-gamma-transform-based (SSSGT-based) data augmentation method for the model training process to improve the generalizability of our model. Besides, we design a mask-weighted voting decision fusion postprocessing algorithm for segmenting overlarge RS images. We conducted several comparative experiments on two public datasets and a real surveying and mapping dataset. Extensive experimental results demonstrate that, compared with the SDFCNv1 framework, our SDFCNv2 framework can increase the mIoU metric by up to 5.22% while using only about half the parameters.
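The abstract does not spell out the SSSGT augmentation in detail; a minimal numpy sketch of the stated idea — an independent random gamma exponent per spectral band — might look like the following. The gamma range (0.5, 2.0) and the [0, 1] intensity normalization are illustrative assumptions, not values from the paper.

```python
import numpy as np

def sssgt_augment(image, rng, gamma_range=(0.5, 2.0)):
    """Spectral-specific stochastic gamma transform (sketch).

    image: float array of shape (H, W, bands) with values in [0, 1].
    Each band receives its own randomly sampled gamma exponent, so the
    spectral bands are perturbed independently of one another.
    """
    out = np.empty_like(image)
    for b in range(image.shape[-1]):
        gamma = rng.uniform(*gamma_range)   # independent gamma per band
        out[..., b] = image[..., b] ** gamma
    return out

rng = np.random.default_rng(0)
img = rng.random((4, 4, 3))                 # toy 3-band patch
aug = sssgt_augment(img, rng)               # same shape, per-band gamma shift
```

In practice such a transform would be applied on the fly inside the training data loader, alongside geometric augmentations.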

2021 ◽  
Vol 26 (1) ◽  
pp. 200-215
Author(s):  
Muhammad Alam ◽  
Jian-Feng Wang ◽  
Cong Guangpei ◽  
LV Yunrong ◽  
Yuanfang Chen

In recent years, the success of deep learning in natural scene image processing has boosted its application to the analysis of remote sensing images. In this paper, we apply Convolutional Neural Networks (CNNs) to the semantic segmentation of remote sensing images. We improve the encoder-decoder CNN structures SegNet, with index pooling, and U-Net to make them suitable for multi-target semantic segmentation of remote sensing images. The results show that the two models have their own advantages and disadvantages in the segmentation of different objects. In addition, we propose an integrated algorithm that combines the two models. Experimental results show that the presented integrated algorithm can exploit the advantages of both models for multi-target segmentation and achieve better segmentation than either model alone.
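The abstract does not give the exact integration rule; one plausible decision-level sketch, in numpy, fuses the per-pixel softmax maps of the two models, optionally weighting each class toward whichever model segments it better (e.g., using per-class validation IoU). The weighting scheme here is an assumption for illustration, not the paper's algorithm.

```python
import numpy as np

def integrate_predictions(prob_a, prob_b, weights=None):
    """Fuse two models' softmax maps into one label map (sketch).

    prob_a, prob_b: (H, W, C) per-pixel class probability maps.
    weights: optional per-class weights (C,) in [0, 1]; weight w_c gives
    model A's probability for class c a share w_c and model B 1 - w_c.
    """
    if weights is None:
        fused = 0.5 * (prob_a + prob_b)     # plain averaging
    else:
        w = np.asarray(weights)[None, None, :]
        fused = w * prob_a + (1.0 - w) * prob_b
    return fused.argmax(axis=-1)            # hard label per pixel

a = np.array([[[0.8, 0.2]]])                # model A: confident class 0
b = np.array([[[0.4, 0.6]]])                # model B: leans class 1
labels = integrate_predictions(a, b)        # averaged: class 0 wins here
```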


2014 ◽  
Vol 5 (2) ◽  
pp. 1-21 ◽  
Author(s):  
Arpita Sharma ◽  
Samiksha Goel

This paper proposes two novel nature-inspired decision-level fusion techniques, Cuckoo Search Decision Fusion (CSDF) and Improved Cuckoo Search Decision Fusion (ICSDF), for enhanced and refined extraction of terrain features from remote sensing data. The developed techniques derive their basis from the recently introduced bio-inspired meta-heuristic Cuckoo Search and modify it suitably for use as a fusion technique. The algorithms are validated on remote sensing satellite images acquired by multispectral sensors, namely a LISS3 sensor image of the Alwar region in Rajasthan, India, and a LANDSAT sensor image of the Delhi region, India. Overall accuracies obtained are substantially better than those of the four individual terrain classifiers used for fusion. Results are also compared with majority voting and average weighting policy fusion strategies. A notable achievement of the proposed fusion techniques is that two difficult-to-identify terrains, namely barren and urban, are identified with high accuracies similar to those of other well-identified land cover types, which was not possible with single analyzers.
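For reference, the majority-voting baseline against which the cuckoo-search fusion is compared can be sketched in a few lines of numpy; class labels are assumed to be consecutive integers starting at 0, which is an assumption of this sketch rather than a detail from the paper.

```python
import numpy as np

def majority_vote(label_maps):
    """Decision-level fusion by per-pixel majority vote (sketch).

    label_maps: list of (H, W) integer class maps, one per classifier
    (the paper fuses four terrain classifiers this way as a baseline).
    """
    stack = np.stack(label_maps)                       # (K, H, W)
    n_classes = int(stack.max()) + 1
    # Count, for each class, how many classifiers voted for it per pixel.
    counts = np.stack([(stack == c).sum(axis=0) for c in range(n_classes)])
    return counts.argmax(axis=0)                       # most-voted class

maps = [np.array([[0, 1]]), np.array([[1, 1]]), np.array([[0, 1]])]
fused = majority_vote(maps)                            # per-pixel winner
```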



Sensors ◽  
2021 ◽  
Vol 21 (20) ◽  
pp. 6873
Author(s):  
Chuan Chen ◽  
Huilin Zhao ◽  
Wei Cui ◽  
Xin He

Traditional pixel-based semantic segmentation methods for road extraction take each pixel as the recognition unit. They are therefore constrained by a restricted receptive field, in which pixels do not receive global road information, and this greatly affects the accuracy of road extraction. To overcome the limited receptive field, a non-local neural network can be used to let each pixel receive global information. However, its spatial complexity is enormous, and it introduces considerable information redundancy in road extraction. To reduce the spatial complexity, the Crisscross Network (CCNet), with a crisscross-shaped attention area, is applied. The key component of CCNet is the Crisscross Attention (CCA) module. Compared with non-local neural networks, CCNet lets each pixel perceive only the correlation information from the horizontal and vertical directions. However, when using CCNet for road extraction from remote sensing (RS) images, the directionality of its attention area is insufficient, being restricted to the horizontal and vertical directions. Due to the recurrent mechanism, the similarity of some pixel pairs in oblique directions cannot be calculated correctly and is strongly diluted. To address these problems, we propose a special attention module for road extraction called the Dual Crisscross Attention (DCCA) module, which consists of the CCA module, a Rotated Crisscross Attention (RCCA) module and a Self-adaptive Attention Fusion (SAF) module. The DCCA module is embedded into the Dual Crisscross Network (DCNet). In the CCA and RCCA modules, the similarities of pixel pairs are represented by an energy map. To remove the influence of the heterogeneous part, a heterogeneous filter function (HFF) is used to filter the energy map. The SAF module then distributes the weights of the CCA and RCCA modules according to the actual road shape.
The DCCA module output is the fusion of the CCA and RCCA modules with the help of the SAF module, which lets pixels perceive local information and eight-direction non-local information. This geometric information about roads improves the accuracy of road extraction. The experimental results show that DCNet with the DCCA module improves road IoU by 4.66% compared to CCNet with a single CCA module, and by 3.47% compared to CCNet with a single RCCA module.
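The crisscross attention area can be illustrated with a minimal numpy sketch that computes the energies (dot-product similarities) of one pixel against the pixels in its row and column. The real CCA module learns query/key projections, runs recurrently, and the RCCA variant applies the same operation on diagonally rotated features; this sketch omits all of that and shows only the attention footprint.

```python
import numpy as np

def crisscross_energy(feat, i, j):
    """Energies of pixel (i, j) over its crisscross attention area (sketch).

    feat: (H, W, C) feature map. Returns the raw dot-product similarities
    of the query pixel with every pixel in its row and in its column —
    the horizontal and vertical strips that form the crisscross area.
    """
    q = feat[i, j]                 # query vector, shape (C,)
    row = feat[i, :, :] @ q        # (W,) energies along the row
    col = feat[:, j, :] @ q        # (H,) energies along the column
    return row, col

feat = np.arange(8, dtype=float).reshape(2, 2, 2)   # toy 2x2 feature map
row, col = crisscross_energy(feat, 0, 0)
```

In the full module these energies would be softmax-normalized into an energy map and used to aggregate values; oblique pixels are only reached indirectly via the recurrence, which motivates the rotated RCCA branch described above.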


Author(s):  
Chandra Pal Kushwah ◽  
Kuruna Markam

In recent years, the performance of deep learning in natural scene image processing has promoted its use in remote sensing image analysis. In this paper, we apply deep neural networks (DNNs) to the semantic segmentation of remote sensing images. To make them suitable for multi-target semantic segmentation of remote sensing images, we enhance the SegNet encoder-decoder CNN structure with index pooling and U-Net. The findings reveal that each model has its own benefits and drawbacks for the segmentation of different objects. Furthermore, we provide an integrated algorithm that incorporates the two models. The test results indicate that the proposed integrated algorithm takes advantage of both multi-target segmentation models and obtains improved segmentation relative to either model alone.


2021 ◽  
Vol 13 (3) ◽  
pp. 528
Author(s):  
Zhiying Cao ◽  
Wenhui Diao ◽  
Xian Sun ◽  
Xiaode Lyu ◽  
Menglong Yan ◽  
...  

Semantic segmentation of multi-modal remote sensing images is an important branch of remote sensing image interpretation. Multi-modal data has been proven to provide rich complementary information for dealing with complex scenes. In recent years, semantic segmentation based on deep learning methods has made remarkable achievements. It is common to simply concatenate multi-modal data or to use parallel branches to extract multi-modal features separately. However, most existing works ignore the effects of noise and redundant features from different modalities, which may not lead to satisfactory results. On the one hand, existing networks neither learn the complementary information of different modalities nor suppress the mutual interference between them, which may decrease segmentation accuracy. On the other hand, the introduction of multi-modal data greatly increases the running time of pixel-level dense prediction. In this work, we propose an efficient C3Net that strikes a balance between speed and accuracy. More specifically, C3Net contains several backbones for extracting the features of the different modalities. Then, a plug-and-play module is designed to effectively recalibrate and aggregate the multi-modal features. To reduce the number of model parameters while retaining model performance, we redesign the semantic contextual extraction module based on lightweight convolutional groups. In addition, a multi-level knowledge distillation strategy is proposed to improve the performance of the compact model. Experiments on the ISPRS Vaihingen dataset demonstrate the superior performance of C3Net, with 15× fewer FLOPs than the state-of-the-art baseline network while providing comparable overall accuracy.
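The multi-level knowledge distillation strategy is described only at a high level. A common formulation, sketched here in plain numpy under the assumption of temperature-softened KL distillation applied at each chosen level, is the following; the temperature value and the KL form are standard distillation conventions, not details confirmed by the abstract.

```python
import numpy as np

def softmax(x, t=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = np.exp((x - x.max(axis=-1, keepdims=True)) / t)
    return z / z.sum(axis=-1, keepdims=True)

def kd_loss(student_logits, teacher_logits, t=2.0):
    """KL divergence between softened teacher and student distributions.

    In a multi-level scheme, one such term is computed at each selected
    level (intermediate predictions and final output) and the terms are
    summed, so the compact student mimics the larger teacher throughout.
    """
    p = softmax(teacher_logits, t)              # soft teacher targets
    q = softmax(student_logits, t)              # student distribution
    kl = (p * (np.log(p + 1e-12) - np.log(q + 1e-12))).sum(axis=-1)
    return float(kl.mean() * t * t)             # t^2 rescales gradients

logits = np.array([[1.0, 2.0, 3.0]])
zero_gap = kd_loss(logits, logits)              # identical nets: ~0 loss
```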


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Yiqin Wang

A remote sensing image semantic segmentation algorithm based on an improved ENet network is proposed to improve segmentation accuracy. First, dilated convolution and decomposition convolution are introduced in the encoding stage. They are used in conjunction with ordinary convolution to increase the receptive field of the model, so that each convolution output contains a larger range of image information. Second, in the decoding stage, image information at different scales is obtained through upsampling and then passed through the squeeze, excitation, and reweighting operations of the Squeeze-and-Excitation (SE) module. The weight of each feature channel is recalibrated to improve the accuracy of the network. Finally, the Softmax activation function and the Argmax function are used to obtain the final segmentation result. Experiments show that our algorithm can significantly improve the accuracy of remote sensing image semantic segmentation.
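The squeeze, excitation, and reweighting steps of the SE module can be sketched in numpy as follows; the bottleneck weight matrices `w1` and `w2` stand in for the learned fully connected layers, and the channels-last layout is a convenience of this sketch.

```python
import numpy as np

def se_recalibrate(feat, w1, w2):
    """SE-style channel recalibration (sketch).

    feat: (H, W, C) feature map.
    w1: (C, C // r) and w2: (C // r, C) bottleneck weights (reduction r).
    """
    s = feat.mean(axis=(0, 1))                  # squeeze: global avg pool -> (C,)
    h = np.maximum(s @ w1, 0.0)                 # excitation: bottleneck + ReLU
    g = 1.0 / (1.0 + np.exp(-(h @ w2)))         # sigmoid gate per channel (C,)
    return feat * g[None, None, :]              # reweight each channel

feat = np.ones((2, 2, 4))
w1 = np.zeros((4, 2))                           # zero weights -> gate = 0.5
w2 = np.zeros((2, 4))
out = se_recalibrate(feat, w1, w2)              # every channel scaled by 0.5
```

With trained weights, channels that the excitation branch deems informative receive gates near 1 and the rest are suppressed, which is the recalibration effect the abstract describes.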

