A refined marker controlled watershed for building extraction from DSM and imagery

Deep-learning technologies, especially convolutional neural networks (CNNs), have achieved great success in building extraction from areal images. However, shape details are often lost during the down-sampling process, which results in discontinuous segmentation or inaccurate segmentation boundary. In order to compensate for the loss of shape information, two shape-related auxiliary tasks (i.e., boundary prediction and distance estimation) were jointly learned with building segmentation task in our proposed network. Meanwhile, two consistency constraint losses were designed based on the multi-task network to exploit the duality between the mask prediction and two shape-related information predictions. Specifically, an atrous spatial pyramid pooling (ASPP) module was appended to the top of the encoder of a U-shaped network to obtain multi-scale features. Based on the multi-scale features, one regression loss and two classification losses were used for predicting the distance-transform map, segmentation, and boundary. Two inter-task consistency-loss functions were constructed to ensure the consistency between distance maps and masks, and the consistency between masks and boundary maps. Experimental results on three public aerial image data sets showed that our method achieved superior performance over the recent state-of-the-art models.

Download Full-text

Building Extraction from Stereo Aerial Imagery Using Dynamic Programming

Arabian Journal for Science and Engineering ◽

10.1007/s13369-020-05224-9 ◽

2021 ◽

Vol 46 (5) ◽

pp. 5089-5103

Author(s):

H. A. Sadeq

Keyword(s):

Dynamic Programming ◽

Aerial Imagery ◽

Building Extraction

Download Full-text

Mask R‐CNN‐based building extraction from VHR satellite data in operational humanitarian action: An example related to Covid‐19 response in Khartoum, Sudan

Transactions in GIS ◽

10.1111/tgis.12766 ◽

2021 ◽

Author(s):

Dirk Tiede ◽

Gina Schwendemann ◽

Ahmad Alobaidi ◽

Lorenz Wendt ◽

Stefan Lang

Keyword(s):

Satellite Data ◽

Building Extraction ◽

Humanitarian Action

Download Full-text

Building Multi-Feature Fusion Refined Network for Building Extraction from High-Resolution Remote Sensing Images

Remote Sensing ◽

10.3390/rs13142794 ◽

2021 ◽

Vol 13 (14) ◽

pp. 2794

Author(s):

Shuhao Ran ◽

Xianjun Gao ◽

Yuanwei Yang ◽

Shaohua Li ◽

Guangbin Zhang ◽

...

Keyword(s):

Feature Fusion ◽

Small Scale ◽

Automatic Extraction ◽

Learning Approaches ◽

Building Extraction ◽

Visual Interpretation ◽

Learning Capacity ◽

Convolutional Networks ◽

Fully Convolutional Networks ◽

Great Progress

Deep learning approaches have been widely used in building automatic extraction tasks and have made great progress in recent years. However, the missing detection and wrong detection causing by spectrum confusion is still a great challenge. The existing fully convolutional networks (FCNs) cannot effectively distinguish whether the feature differences are from one building or the building and its adjacent non-building objects. In order to overcome the limitations, a building multi-feature fusion refined network (BMFR-Net) was presented in this paper to extract buildings accurately and completely. BMFR-Net is based on an encoding and decoding structure, mainly consisting of two parts: the continuous atrous convolution pyramid (CACP) module and the multiscale output fusion constraint (MOFC) structure. The CACP module is positioned at the end of the contracting path and it effectively minimizes the loss of effective information in multiscale feature extraction and fusion by using parallel continuous small-scale atrous convolution. To improve the ability to aggregate semantic information from the context, the MOFC structure performs predictive output at each stage of the expanding path and integrates the results into the network. Furthermore, the multilevel joint weighted loss function effectively updates parameters well away from the output layer, enhancing the learning capacity of the network for low-level abstract features. The experimental results demonstrate that the proposed BMFR-Net outperforms the other five state-of-the-art approaches in both visual interpretation and quantitative evaluation.

Download Full-text

Boundary-Assisted Learning for Building Extraction from Optical Remote Sensing Imagery

Remote Sensing ◽

10.3390/rs13040760 ◽

2021 ◽

Vol 13 (4) ◽

pp. 760

Author(s):

Sheng He ◽

Wanshou Jiang

Keyword(s):

Remote Sensing ◽

Receptive Fields ◽

Morphological Characteristics ◽

Learning Task ◽

Aerial Image ◽

Model Parameters ◽

Optical Remote Sensing ◽

Building Extraction ◽

Convolutional Network ◽

Remote Sensing Imagery

Deep learning methods have been shown to significantly improve the performance of building extraction from optical remote sensing imagery. However, keeping the morphological characteristics, especially the boundaries, is still a challenge that requires further study. In this paper, we propose a novel fully convolutional network (FCN) for accurately extracting buildings, in which a boundary learning task is embedded to help maintain the boundaries of buildings. Specifically, in the training phase, our framework simultaneously learns the extraction of buildings and boundary detection and only outputs extraction results while testing. In addition, we introduce spatial variation fusion (SVF) to establish an association between the two tasks, thus coupling them and making them share the latent semantics and interact with each other. On the other hand, we utilize separable convolution with a larger kernel to enlarge the receptive fields while reducing the number of model parameters and adopt the convolutional block attention module (CBAM) to boost the network. The proposed framework was extensively evaluated on the WHU Building Dataset and the Inria Aerial Image Labeling Dataset. The experiments demonstrate that our method achieves state-of-the-art performance on building extraction. With the assistance of boundary learning, the boundary maintenance of buildings is ameliorated.

Download Full-text

Self-Attention in Reconstruction Bias U-Net for Semantic Segmentation of Building Rooftops in Optical Remote Sensing Images

Remote Sensing ◽

10.3390/rs13132524 ◽

2021 ◽

Vol 13 (13) ◽

pp. 2524

Author(s):

Ziyi Chen ◽

Dilong Li ◽

Wentao Fan ◽

Haiyan Guan ◽

Cheng Wang ◽

...

Keyword(s):

Remote Sensing ◽

Deep Learning ◽

Semantic Segmentation ◽

Extraction Methods ◽

The Self ◽

Optical Remote Sensing ◽

Building Extraction ◽

Learning Models ◽

Remote Sensing Images ◽

Segmentation Methods

Deep learning models have brought great breakthroughs in building extraction from high-resolution optical remote-sensing images. Among recent research, the self-attention module has called up a storm in many fields, including building extraction. However, most current deep learning models loading with the self-attention module still lose sight of the reconstruction bias’s effectiveness. Through tipping the balance between the abilities of encoding and decoding, i.e., making the decoding network be much more complex than the encoding network, the semantic segmentation ability will be reinforced. To remedy the research weakness in combing self-attention and reconstruction-bias modules for building extraction, this paper presents a U-Net architecture that combines self-attention and reconstruction-bias modules. In the encoding part, a self-attention module is added to learn the attention weights of the inputs. Through the self-attention module, the network will pay more attention to positions where there may be salient regions. In the decoding part, multiple large convolutional up-sampling operations are used for increasing the reconstruction ability. We test our model on two open available datasets: the WHU and Massachusetts Building datasets. We achieve IoU scores of 89.39% and 73.49% for the WHU and Massachusetts Building datasets, respectively. Compared with several recently famous semantic segmentation methods and representative building extraction methods, our method’s results are satisfactory.

Download Full-text

Multiscale Semantic Feature Optimization and Fusion Network for Building Extraction Using High-Resolution Aerial Images and LiDAR Data

Remote Sensing ◽

10.3390/rs13132473 ◽

2021 ◽

Vol 13 (13) ◽

pp. 2473

Author(s):

Qinglie Yuan ◽

Helmi Zulhaidi Mohd Shafri ◽

Aidi Hizami Alias ◽

Shaiful Jahari Hashim

Keyword(s):

High Resolution ◽

Large Scale ◽

Spatial Information ◽

Feature Fusion ◽

Aerial Images ◽

Semantic Gap ◽

Superior Performance ◽

Lidar Data ◽

Building Extraction ◽

Hierarchical Features

Automatic building extraction has been applied in many domains. It is also a challenging problem because of the complex scenes and multiscale. Deep learning algorithms, especially fully convolutional neural networks (FCNs), have shown robust feature extraction ability than traditional remote sensing data processing methods. However, hierarchical features from encoders with a fixed receptive field perform weak ability to obtain global semantic information. Local features in multiscale subregions cannot construct contextual interdependence and correlation, especially for large-scale building areas, which probably causes fragmentary extraction results due to intra-class feature variability. In addition, low-level features have accurate and fine-grained spatial information for tiny building structures but lack refinement and selection, and the semantic gap of across-level features is not conducive to feature fusion. To address the above problems, this paper proposes an FCN framework based on the residual network and provides the training pattern for multi-modal data combining the advantage of high-resolution aerial images and LiDAR data for building extraction. Two novel modules have been proposed for the optimization and integration of multiscale and across-level features. In particular, a multiscale context optimization module is designed to adaptively generate the feature representations for different subregions and effectively aggregate global context. A semantic guided spatial attention mechanism is introduced to refine shallow features and alleviate the semantic gap. Finally, hierarchical features are fused via the feature pyramid network. Compared with other state-of-the-art methods, experimental results demonstrate superior performance with 93.19 IoU, 97.56 OA on WHU datasets and 94.72 IoU, 97.84 OA on the Boston dataset, which shows that the proposed network can improve accuracy and achieve better performance for building extraction.

Download Full-text

Boundary Enhancement Semantic Segmentation for Building Extraction From Remote Sensed Image

IEEE Transactions on Geoscience and Remote Sensing ◽

10.1109/tgrs.2021.3108781 ◽

2021 ◽

pp. 1-12

Author(s):

Hoin Jung ◽

Han-Soo Choi ◽

Myungjoo Kang

Keyword(s):

Semantic Segmentation ◽

Building Extraction

Download Full-text