Dual Crisscross Attention Module for Road Extraction from Remote Sensing Images

Sensors ◽  
2021 ◽  
Vol 21 (20) ◽  
pp. 6873
Author(s):  
Chuan Chen ◽  
Huilin Zhao ◽  
Wei Cui ◽  
Xin He

Traditional pixel-based semantic segmentation methods for road extraction take each pixel as the recognition unit. They are therefore constrained by a restricted receptive field, in which pixels do not receive global road information, and this greatly affects the accuracy of road extraction. To enlarge the receptive field, non-local neural networks let each pixel receive global information. However, their spatial complexity is enormous, and they introduce considerable information redundancy in road extraction. To reduce the spatial complexity, the Crisscross Network (CCNet), with a crisscross-shaped attention area, is applied. The key component of CCNet is the Crisscross Attention (CCA) module. Compared with non-local neural networks, CCNet lets each pixel perceive correlation information only from the horizontal and vertical directions. However, when CCNet is used for road extraction from remote sensing (RS) images, the directionality of its attention area is insufficient because it is restricted to the horizontal and vertical directions. Due to the recurrent mechanism, the similarity of some pixel pairs in oblique directions cannot be calculated correctly and will be intensely dilated. To address these problems, we propose a special attention module for road extraction called the Dual Crisscross Attention (DCCA) module, which consists of the CCA module, a Rotated Crisscross Attention (RCCA) module and a Self-adaptive Attention Fusion (SAF) module. The DCCA module is embedded into the Dual Crisscross Network (DCNet). In the CCA and RCCA modules, the similarities of pixel pairs are represented by an energy map. To remove the influence of the heterogeneous part, a heterogeneous filter function (HFF) is used to filter the energy map. The SAF module then distributes the weights of the CCA and RCCA modules according to the actual road shape. The DCCA module output is the fusion of the CCA and RCCA modules with the help of the SAF module, which lets pixels perceive local information and eight-direction non-local information. The geometric information of roads improves the accuracy of road extraction. The experimental results show that DCNet with the DCCA module improves the road IoU by 4.66% compared to CCNet with a single CCA module and by 3.47% compared to CCNet with a single RCCA module.
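
To make the attention areas described in this abstract concrete, the following minimal sketch enumerates which grid positions a single pixel can attend to under a crisscross pattern (same row and column) and under a rotated pattern. It assumes that the rotated pattern follows the two diagonals through the pixel, which is an interpretation of "oblique directions" and "eight-direction" rather than a detail confirmed by the abstract. All function names are hypothetical, and the energy map, HFF and SAF weighting are not modeled.

```python
def crisscross_positions(row, col, height, width):
    """Positions sharing a row or a column with (row, col), itself included."""
    horizontal = {(row, c) for c in range(width)}
    vertical = {(r, col) for r in range(height)}
    return horizontal | vertical


def rotated_crisscross_positions(row, col, height, width):
    """Positions on the two diagonals through (row, col), itself included.

    Assumption: the 'rotated' pattern is the crisscross turned by 45 degrees.
    """
    positions = set()
    for r in range(height):
        for c in range(width):
            if r - c == row - col or r + c == row + col:
                positions.add((r, c))
    return positions


def dual_crisscross_positions(row, col, height, width):
    """Union of both patterns: eight directions around the pixel."""
    return (crisscross_positions(row, col, height, width)
            | rotated_crisscross_positions(row, col, height, width))


if __name__ == "__main__":
    h, w = 5, 5
    print("non-local:", h * w)
    print("crisscross:", len(crisscross_positions(2, 2, h, w)))
    print("rotated:", len(rotated_crisscross_positions(2, 2, h, w)))
    print("dual:", len(dual_crisscross_positions(2, 2, h, w)))
```

On a 5 x 5 grid the center pixel attends to 9 positions under either single pattern and 17 under their union, compared with all 25 for a full non-local block, which illustrates the trade-off between coverage and cost that the abstract discusses.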

2021 ◽  
Vol 10 (4) ◽  
pp. 245
Author(s):  
Cheng Ding ◽  
Liguo Weng ◽  
Min Xia ◽  
Haifeng Lin

Building and road extraction from remote sensing images is of great significance to urban planning. At present, most building and road extraction models adopt deep learning semantic segmentation methods. However, existing semantic segmentation methods do not pay enough attention to the feature information between hidden layers, so the category of context pixels is neglected in pixel classification, which results in two problems: large-scale misjudgment of buildings and disconnection of extracted roads. To solve these problems, this paper proposes a Non-Local Feature Search Network (NFSNet) that improves the segmentation accuracy of buildings and roads in remote sensing images and helps achieve accurate urban planning. By strengthening the exploration of hidden-layer feature information, it effectively reduces large-area misclassification of buildings and road disconnection during segmentation. First, a Self-Attention Feature Transfer (SAFT) module is proposed, which searches the importance of hidden layers along the channel dimension to obtain the correlation between channels. Second, a Global Feature Refinement (GFR) module is introduced to integrate the features extracted from the backbone network and the SAFT module; it enhances the semantic information of the feature map and yields more detailed segmentation output. Comparative experiments demonstrate that the proposed method outperforms state-of-the-art methods and has the lowest model complexity.


2019 ◽  
Vol 2019 ◽  
pp. 1-9
Author(s):  
Aziguli Wulamu ◽  
Zuxian Shi ◽  
Dezheng Zhang ◽  
Zheyu He

Recent advances in convolutional neural networks (CNNs) have shown impressive results in semantic segmentation. Among the successful CNN-based methods, U-Net has achieved exciting performance. In this paper, we propose a novel network architecture based on U-Net and atrous spatial pyramid pooling (ASPP) for the road extraction task in the remote sensing field. On the one hand, the U-Net structure can effectively extract valuable features; on the other hand, ASPP can utilize multiscale context information in remote sensing images. Compared to the baseline, the proposed model improves the pixelwise mean Intersection over Union (mIoU) by 3 points. Experimental results show that the proposed network architecture can handle different types of road surface extraction tasks under various terrains in Yinchuan city, solves the road connectivity problem to some extent, and has a certain tolerance to shadows and occlusion.
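
For readers unfamiliar with the metric reported above, here is a minimal sketch of pixelwise mean Intersection over Union. It assumes flat lists of integer class labels and skips classes that appear in neither the prediction nor the ground truth; the abstract does not state how the authors handle that case, and the function name is hypothetical.

```python
def mean_iou(pred, target, num_classes):
    """Pixelwise mean IoU over flat lists of integer class labels.

    Assumption: classes absent from both pred and target are skipped
    rather than counted as IoU = 0 or 1.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for cls in range(num_classes):
        intersection = sum(1 for p, t in zip(pred, target)
                           if p == cls and t == cls)
        union = sum(1 for p, t in zip(pred, target)
                    if p == cls or t == cls)
        if union > 0:
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # 0 = background, 1 = road
    prediction = [1, 1, 0, 0]
    ground_truth = [1, 0, 0, 0]
    print(mean_iou(prediction, ground_truth, num_classes=2))
```

In this toy case the background IoU is 2/3 and the road IoU is 1/2, so the mean is 7/12.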


2021 ◽  
Vol 13 (23) ◽  
pp. 4902
Author(s):  
Guanzhou Chen ◽  
Xiaoliang Tan ◽  
Beibei Guo ◽  
Kun Zhu ◽  
Puyun Liao ◽  
...  

Semantic segmentation is a fundamental task in remote sensing image analysis (RSIA). Fully convolutional networks (FCNs) have achieved state-of-the-art performance in the semantic segmentation of natural scene images. However, due to distinctive differences between natural scene images and remotely sensed (RS) images, FCN-based semantic segmentation methods from the field of computer vision cannot achieve promising performance on RS images without modifications. In previous work, we proposed an RS image semantic segmentation framework, SDFCNv1, combined with a majority voting postprocessing method. Nevertheless, it still has some drawbacks, such as a small receptive field and a large number of parameters. In this paper, we propose an improved semantic segmentation framework, SDFCNv2, based on SDFCNv1, to conduct optimal semantic segmentation on RS images. We first construct a novel FCN model with hybrid basic convolutional (HBC) blocks and spatial-channel-fusion squeeze-and-excitation (SCFSE) modules, which has a larger receptive field and fewer model parameters. We also put forward a spectral-specific stochastic-gamma-transform-based (SSSGT-based) data augmentation method, applied during model training, to improve the generalizability of our model. In addition, we design a mask-weighted voting decision fusion postprocessing algorithm for image segmentation on overlarge RS images. We conducted several comparative experiments on two public datasets and a real surveying and mapping dataset. Extensive experimental results demonstrate that, compared with the SDFCNv1 framework, our SDFCNv2 framework can increase the mIoU metric by up to 5.22% while using only about half of the parameters.
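
As a rough illustration of the kind of augmentation named above, the sketch below applies an independently sampled gamma transform to each spectral band of an image whose pixel values are normalized to [0, 1]. The abstract does not give the sampling distribution, the gamma range, or how "spectral-specific" is defined, so the uniform sampling, the default range (0.5, 2.0), the per-band interpretation, and the function name are all assumptions made only for this example.

```python
import random


def stochastic_gamma_transform(image, gamma_range=(0.5, 2.0), rng=None):
    """Apply a randomly sampled gamma curve to each band independently.

    image: list of bands; each band is a list of rows of floats in [0, 1].
    gamma_range: assumed sampling interval (not taken from the paper).
    rng: optional random.Random instance for reproducibility.
    """
    rng = rng or random.Random()
    transformed = []
    for band in image:
        gamma = rng.uniform(*gamma_range)  # one gamma per spectral band
        transformed.append([[pixel ** gamma for pixel in row] for row in band])
    return transformed


if __name__ == "__main__":
    two_band_image = [
        [[0.0, 0.25], [0.5, 1.0]],
        [[0.1, 0.2], [0.3, 0.4]],
    ]
    augmented = stochastic_gamma_transform(two_band_image, rng=random.Random(0))
    for band in augmented:
        print(band)
```

Because a gamma curve maps 0 to 0 and 1 to 1 and is monotonic on [0, 1], the transform changes brightness and contrast within each band without reordering pixel intensities.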


2021 ◽  
Vol 14 (1) ◽  
pp. 102
Author(s):  
Xin Li ◽  
Tao Li ◽  
Ziqi Chen ◽  
Kaiwen Zhang ◽  
Runliang Xia

Semantic segmentation has been a fundamental task in interpreting remote sensing imagery (RSI) for various downstream applications. Due to high intra-class variance and inter-class similarity, inflexibly transferring natural image-specific networks to RSI is inadvisable. To enhance the distinguishability of learnt representations, attention modules were developed and applied to RSI, resulting in satisfactory improvements. However, these designs capture contextual information by handling all pixels equally, regardless of whether they lie around edges. Blurry boundaries are therefore generated, raising high uncertainty in classifying the many adjacent pixels. Here, we propose an edge distribution attention module (EDA) to highlight the edge distributions of learnt feature maps in a self-attentive fashion. In this module, we first formulate and model column-wise and row-wise edge attention maps based on covariance matrix analysis. Furthermore, a hybrid attention module (HAM) that emphasizes the edge distributions and position-wise dependencies is devised by combining EDA with a non-local block. Consequently, a conceptually end-to-end neural network, termed EDENet, is proposed to integrate HAM hierarchically for the detailed strengthening of multi-level representations. EDENet implicitly learns representative and discriminative features, providing available and reasonable cues for dense prediction. The experimental results on the ISPRS Vaihingen, Potsdam and DeepGlobe datasets show its efficacy and superiority over state-of-the-art methods in overall accuracy (OA) and mean intersection over union (mIoU). In addition, the ablation study further validates the effects of EDA.


2021 ◽  
Vol 11 (1) ◽  
pp. 9
Author(s):  
Shengfu Li ◽  
Cheng Liao ◽  
Yulin Ding ◽  
Han Hu ◽  
Yang Jia ◽  
...  

Efficient and accurate road extraction from remote sensing imagery is important for applications related to navigation and Geographic Information System updating. Existing data-driven methods based on semantic segmentation recognize roads from images pixel by pixel, generally using only local spatial information, which causes discontinuous extraction and jagged boundary recognition. To address these problems, we propose a cascaded attention-enhanced architecture to extract boundary-refined roads from remote sensing images. Our proposed architecture uses spatial attention residual blocks on multi-scale features to capture long-distance relations and introduces channel attention layers to optimize multi-scale feature fusion. Furthermore, a lightweight encoder-decoder network is connected to adaptively optimize the boundaries of the extracted roads. Our experiments showed that the proposed method outperformed existing methods and achieved state-of-the-art results on the Massachusetts dataset. In addition, our method achieved competitive results on more recent benchmark datasets, e.g., the DeepGlobe and the Huawei Cloud road extraction challenge.


2019 ◽  
Vol 11 (21) ◽  
pp. 2499
Author(s):  
Jiang Xin ◽  
Xinchang Zhang ◽  
Zhiqiang Zhang ◽  
Wu Fang

Road network extraction is an important task for disaster emergency response, intelligent transportation systems, and real-time road network updating. Road extraction based on high-resolution remote sensing images has become a hot topic. At present, most research is based on traditional machine learning algorithms, which are complex and computationally intensive because of impervious surfaces such as roads and buildings that are discernible in the images. Given the above problems, we propose a new method to extract the road network from remote sensing images using a DenseUNet model with few parameters and robust characteristics. DenseUNet consists of dense connection units and skip connections, which strengthen the fusion of different scales through connections at various network layers. The performance of the proposed method is validated on two datasets of high-resolution images by comparison with three classical semantic segmentation methods. The experimental results show that the method can be used for road extraction in complex scenes.


2021 ◽  
Vol 10 (1) ◽  
pp. 39
Author(s):  
Kai Zhou ◽  
Yan Xie ◽  
Zhan Gao ◽  
Fang Miao ◽  
Lei Zhang

Road semantic segmentation is unique and difficult. Road extraction from remote sensing imagery often produces fragmented road segments, leading to road network disconnection, because of occlusion by trees, buildings, shadows, clouds, etc. In this paper, we propose a novel fusion network (FuNet) that fuses remote sensing imagery with location data, which plays an important role in road connectivity reasoning. A universal iteration reinforcement (IteR) module is embedded into FuNet to enhance the learning ability of the network. We designed the IteR formula to repeatedly integrate original information and prediction information, and designed the reinforcement loss function to control the accuracy of the road prediction output. Another contribution of this paper is the use of histogram equalization as data pre-processing to enhance image contrast, which improves accuracy by nearly 1%. We take the excellent D-LinkNet as the backbone network and design experiments based on an open dataset. The experimental results show that our method improves over the compared advanced road extraction methods: it not only increases the accuracy of road extraction but also improves road topological connectivity.
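
The histogram equalization step mentioned above can be sketched as follows. This is the textbook global variant for a single-channel image with integer intensities. The abstract does not say which variant the authors used (global, per-channel, or adaptive), so this is only an illustrative assumption, and the function name is hypothetical.

```python
def equalize_histogram(image, levels=256):
    """Global histogram equalization for a single-channel image.

    image: list of rows of ints in [0, levels - 1].
    Returns a new image with intensities remapped through the
    normalized cumulative histogram.
    """
    histogram = [0] * levels
    for row in image:
        for value in row:
            histogram[value] += 1

    total = sum(histogram)
    if total == 0:
        return [row[:] for row in image]  # empty image: nothing to do

    # Cumulative distribution of intensities.
    cdf = []
    running = 0
    for count in histogram:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        return [row[:] for row in image]  # constant image: leave unchanged

    # Map the lowest occurring intensity to 0 and the highest to levels - 1.
    lookup = [
        max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
        for c in cdf
    ]
    return [[lookup[value] for value in row] for row in image]


if __name__ == "__main__":
    low_contrast = [[50, 50], [100, 150]]
    print(equalize_histogram(low_contrast))
```

After equalization the darkest occurring intensity maps to 0 and the brightest to 255, stretching a low-contrast patch across the full range while preserving the ordering of intensities.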

