EU-Net: An Efficient Fully Convolutional Network for Building Extraction from Optical Remote Sensing Images

Automatic building extraction from high-resolution remote sensing images has many practical applications, such as urban planning and supervision. However, the fine details and varied scales of building structures in high-resolution images bring new challenges to building extraction. An increasing number of neural network-based models have been proposed to handle these issues, but they are not efficient enough and still suffer from erroneous ground-truth labels. To this end, we propose an efficient end-to-end model, EU-Net. We first design dense spatial pyramid pooling (DSPP) to extract dense and multi-scale features simultaneously, which facilitates the extraction of buildings at all scales. Then, the focal loss is used in reverse to suppress the impact of erroneous labels in the ground truth, making the training stage more stable. To assess the universality of the proposed model, we tested it on three public aerial remote sensing datasets: the WHU aerial imagery dataset, the Massachusetts buildings dataset, and the Inria aerial image labeling dataset. Experimental results show that the proposed EU-Net is superior to the state-of-the-art models on all three datasets and increases prediction efficiency by two to four times.
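The abstract does not spell out how the focal loss is "used in reverse". For orientation only, the sketch below shows the standard per-pixel binary focal loss next to one hypothetical reading of a reversed variant, in which the modulating factor is flipped so that hard (possibly mislabeled) pixels are down-weighted instead of emphasized. The function names, the `gamma` default, and the reversed formula are assumptions for illustration and are not taken from the paper.

```python
import math


def focal_loss(p, y, gamma=2.0, eps=1e-7):
    """Standard binary focal loss for one pixel, without alpha weighting.

    p is the predicted building probability, y is the label (0 or 1).
    """
    p = min(max(p, eps), 1.0 - eps)      # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p       # probability assigned to the labeled class
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def reversed_focal_loss(p, y, gamma=2.0, eps=1e-7):
    """Hypothetical 'reversed' variant (an assumption, not the paper's formula).

    The weight p_t**gamma shrinks the loss of pixels the model strongly
    disagrees with, which is one way to limit the influence of wrong labels.
    """
    p = min(max(p, eps), 1.0 - eps)
    p_t = p if y == 1 else 1.0 - p
    return -(p_t ** gamma) * math.log(p_t)
```

With `gamma=0` both functions reduce to ordinary cross-entropy, which makes the effect of the modulating factor easy to compare in isolation.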

Building Extraction from High-Resolution Aerial Imagery Using a Generative Adversarial Network with Spatial and Channel Attention Mechanisms

Remote Sensing ◽ 10.3390/rs11080917 ◽ 2019 ◽ Vol 11 (8) ◽ pp. 917 ◽ Cited By ~ 15

Author(s): Xuran Pan ◽ Fan Yang ◽ Lianru Gao ◽ Zhengchao Chen ◽ Bing Zhang ◽ ...

Keyword(s): Remote Sensing ◽ High Resolution ◽ Ground Truth ◽ Semantic Segmentation ◽ Aerial Image ◽ Remote Sensing Images ◽ Generative Adversarial Network ◽ Practical Applications ◽ Image Labeling ◽ Adversarial Network

Segmentation of high-resolution remote sensing images is an important challenge with wide practical applications. The increasing spatial resolution provides fine details for image segmentation but also incurs segmentation ambiguities. In this paper, we propose a generative adversarial network with spatial and channel attention mechanisms (GAN-SCA) for the robust segmentation of buildings in remote sensing images. The segmentation network (generator) of the proposed framework is composed of the well-known semantic segmentation architecture U-Net and the spatial and channel attention mechanisms (SCA). The adoption of SCA enables the segmentation network to selectively enhance the more useful features at specific positions and channels, yielding results closer to the ground truth. The discriminator is an adversarial network with channel attention mechanisms that can properly discriminate between the outputs of the generator and the ground truth maps. The segmentation network and adversarial network are trained in an alternating fashion on the Inria aerial image labeling dataset and the Massachusetts buildings dataset. Experimental results show that the proposed GAN-SCA outperforms several state-of-the-art approaches: on the Inria aerial image labeling dataset, the overall accuracy and intersection over union are 96.61% and 77.75%, respectively, and on the Massachusetts buildings dataset, the F1-measure is 96.36%.
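The three scores quoted above (overall accuracy, intersection over union, and F1-measure) can all be derived from the same pixel counts. The sketch below is a minimal illustration for binary building masks given as flat lists of 0 and 1; the function name `binary_metrics` and the convention of returning 1.0 when no building pixel appears in either mask are assumptions made here, not details from the paper.

```python
def binary_metrics(pred, truth):
    """Overall accuracy, IoU and F1 for two flat binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = len(pred) - tp - fp - fn
    union = tp + fp + fn
    return {
        "overall_accuracy": (tp + tn) / len(pred),
        # Assumed convention: two empty masks count as a perfect match.
        "iou": tp / union if union else 1.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if union else 1.0,
    }


scores = binary_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])
```

Because true negatives enter only the overall accuracy, that score is typically much higher than IoU on imagery where most pixels are background.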

Boundary-Assisted Learning for Building Extraction from Optical Remote Sensing Imagery

Remote Sensing ◽ 10.3390/rs13040760 ◽ 2021 ◽ Vol 13 (4) ◽ pp. 760

Author(s): Sheng He ◽ Wanshou Jiang

Keyword(s): Remote Sensing ◽ Receptive Fields ◽ Morphological Characteristics ◽ Learning Task ◽ Aerial Image ◽ Model Parameters ◽ Optical Remote Sensing ◽ Building Extraction ◽ Convolutional Network ◽ Remote Sensing Imagery

Deep learning methods have been shown to significantly improve the performance of building extraction from optical remote sensing imagery. However, preserving morphological characteristics, especially boundaries, is still a challenge that requires further study. In this paper, we propose a novel fully convolutional network (FCN) for accurately extracting buildings, in which a boundary learning task is embedded to help maintain the boundaries of buildings. Specifically, in the training phase, our framework simultaneously learns building extraction and boundary detection, and it outputs only the extraction results at test time. In addition, we introduce spatial variation fusion (SVF) to establish an association between the two tasks, thus coupling them so that they share latent semantics and interact with each other. Furthermore, we utilize separable convolution with a larger kernel to enlarge the receptive fields while reducing the number of model parameters, and we adopt the convolutional block attention module (CBAM) to boost the network. The proposed framework was extensively evaluated on the WHU Building Dataset and the Inria Aerial Image Labeling Dataset. The experiments demonstrate that our method achieves state-of-the-art performance on building extraction. With the assistance of boundary learning, the preservation of building boundaries is improved.
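The claim that a larger separable kernel can still reduce the parameter count is easy to check with arithmetic. The sketch below assumes "separable convolution" means the common depthwise separable form (a per-channel k x k filter followed by a 1 x 1 pointwise convolution) and ignores bias terms; the kernel sizes and the 64-channel width are hypothetical values chosen for illustration, not settings from the paper.

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k, c_in, c_out):
    """Weights in a depthwise separable convolution (bias ignored).

    Depthwise step: one k x k filter per input channel.
    Pointwise step: a 1 x 1 convolution mixing c_in channels into c_out.
    """
    return k * k * c_in + c_in * c_out


standard_3x3 = conv_params(3, 64, 64)              # 36864 weights
separable_7x7 = separable_conv_params(7, 64, 64)   # 7232 weights
```

Under these assumptions, a 7 x 7 separable layer uses roughly a fifth of the weights of a standard 3 x 3 layer while covering a much larger neighborhood per layer.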

Full-Level Domain Adaptation for Building Extraction in Very-High-Resolution Optical Remote-Sensing Images

IEEE Transactions on Geoscience and Remote Sensing ◽ 10.1109/tgrs.2021.3093004 ◽ 2021 ◽ pp. 1-17

Author(s): Daifeng Peng ◽ Haiyan Guan ◽ Yufu Zang ◽ Lorenzo Bruzzone

Keyword(s): Remote Sensing ◽ High Resolution ◽ Domain Adaptation ◽ Optical Remote Sensing ◽ Building Extraction ◽ Remote Sensing Images ◽ Very High

Self-Attention in Reconstruction Bias U-Net for Semantic Segmentation of Building Rooftops in Optical Remote Sensing Images

Remote Sensing ◽ 10.3390/rs13132524 ◽ 2021 ◽ Vol 13 (13) ◽ pp. 2524

Author(s): Ziyi Chen ◽ Dilong Li ◽ Wentao Fan ◽ Haiyan Guan ◽ Cheng Wang ◽ ...

Keyword(s): Remote Sensing ◽ Deep Learning ◽ Semantic Segmentation ◽ Extraction Methods ◽ The Self ◽ Optical Remote Sensing ◽ Building Extraction ◽ Learning Models ◽ Remote Sensing Images ◽ Segmentation Methods

Deep learning models have brought great breakthroughs in building extraction from high-resolution optical remote-sensing images. In recent research, the self-attention module has attracted great attention in many fields, including building extraction. However, most current deep learning models equipped with the self-attention module still overlook the effectiveness of reconstruction bias. By tipping the balance between the encoding and decoding abilities, i.e., making the decoding network much more complex than the encoding network, the semantic segmentation ability is reinforced. To address the lack of research on combining self-attention and reconstruction-bias modules for building extraction, this paper presents a U-Net architecture that combines the two. In the encoding part, a self-attention module is added to learn the attention weights of the inputs. Through the self-attention module, the network pays more attention to positions where there may be salient regions. In the decoding part, multiple large convolutional up-sampling operations are used to increase the reconstruction ability. We test our model on two openly available datasets: the WHU and Massachusetts Building datasets. We achieve IoU scores of 89.39% and 73.49% for the WHU and Massachusetts Building datasets, respectively. Compared with several recent well-known semantic segmentation methods and representative building extraction methods, our method's results are satisfactory.
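As a rough illustration of how a self-attention module produces attention weights over input positions, the sketch below implements plain scaled dot-product self-attention over a list of feature vectors. It is a simplification under stated assumptions: queries, keys, and values are the inputs themselves (no learned projections), and the function names are hypothetical. The module used in the paper may differ.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x.

    x is a list of n feature vectors (one per position), each of length d.
    Returns the attended features and the n x n attention weights.
    """
    d = len(x[0])
    scores = [
        [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        for q in x
    ]
    weights = [softmax(row) for row in scores]
    out = [
        [sum(w * v[j] for w, v in zip(w_row, x)) for j in range(d)]
        for w_row in weights
    ]
    return out, weights


features, attn = self_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
```

Each row of `attn` sums to 1, so every output position is a weighted average of all input positions, with larger weights on positions whose features are most similar to its own.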

Efficient Patch-Wise Semantic Segmentation for Large-Scale Remote Sensing Images

Sensors ◽ 10.3390/s18103232 ◽ 2018 ◽ Vol 18 (10) ◽ pp. 3232 ◽ Cited By ~ 17

Author(s): Yan Liu ◽ Qirui Ren ◽ Jiahui Geng ◽ Meng Ding ◽ Jiangyun Li

Keyword(s): Remote Sensing ◽ High Resolution ◽ Large Scale ◽ Semantic Segmentation ◽ Remote Sensing Image ◽ Training Data ◽ Land Resources ◽ Remote Sensing Images ◽ Training Strategy ◽ The Impact

Efficient and accurate semantic segmentation is a key technique for automatic remote sensing image analysis. While there have been many segmentation methods based on traditional hand-crafted feature extractors, it is still challenging to process high-resolution and large-scale remote sensing images. In this work, a novel patch-wise semantic segmentation method with a new training strategy based on fully convolutional networks is presented to segment common land resources. First, to handle high-resolution images, the images are split into local patches and a patch-wise network is built. Second, the training data are preprocessed in several ways to account for the specific characteristics of remote sensing images, i.e., color imbalance, object rotation variations, and lens distortion. Third, a multi-scale training strategy is developed to solve the severe scale variation problem. In addition, the impact of the conditional random field (CRF) is studied to improve the precision. The proposed method was evaluated on a dataset collected from a capital city in West China with the Gaofen-2 satellite. The dataset contains ten common land resources (Grassland, Road, etc.). The experimental results show that the proposed algorithm achieves 54.96% in terms of mean intersection over union (MIoU) and outperforms other state-of-the-art methods in remote sensing image segmentation.
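The first step, splitting a large image into local patches, can be sketched as a sliding window. The example below is a minimal illustration on a 2D list of pixel values; the function name, the patch size, and the stride are hypothetical, and the abstract does not say whether the authors use overlapping patches or how they treat image borders.

```python
def split_into_patches(image, patch, stride):
    """Cut a 2D image (list of rows) into square patches.

    Returns a list of ((top, left), tile) pairs. Rows or columns that do not
    fill a complete patch at the bottom or right edge are skipped in this
    sketch; a real pipeline would usually pad or shift the last window.
    """
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - patch + 1, stride):
        for left in range(0, width - patch + 1, stride):
            tile = [row[left:left + patch] for row in image[top:top + patch]]
            patches.append(((top, left), tile))
    return patches


# A 4 x 6 toy image whose pixel value encodes its position.
toy_image = [[r * 6 + c for c in range(6)] for r in range(4)]
tiles = split_into_patches(toy_image, patch=2, stride=2)
```

Keeping the `(top, left)` offset with each tile makes it straightforward to stitch per-patch predictions back into a full-size map. A stride smaller than the patch size yields overlapping patches.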

Multi-layer Feature Extraction Network for Military Ship Detection from High-resolution Optical Remote Sensing Images

IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing ◽ 10.1109/jstars.2021.3123080 ◽ 2021 ◽ pp. 1-1

Author(s): Peng Qin ◽ Yulin Cai ◽ Jia Liu ◽ Puran Fan

Keyword(s): Remote Sensing ◽ Feature Extraction ◽ High Resolution ◽ Optical Remote Sensing ◽ Remote Sensing Images ◽ Ship Detection

CMGFNet: A deep cross-modal gated fusion network for building extraction from very high-resolution remote sensing images

ISPRS Journal of Photogrammetry and Remote Sensing ◽ 10.1016/j.isprsjprs.2021.12.007 ◽ 2022 ◽ Vol 184 ◽ pp. 96-115

Author(s): Hamidreza Hosseinpour ◽ Farhad Samadzadegan ◽ Farzaneh Dadrass Javan

Keyword(s): Remote Sensing ◽ High Resolution ◽ Building Extraction ◽ Remote Sensing Images ◽ Very High

BRRNet: A Fully Convolutional Neural Network for Automatic Building Extraction From High-Resolution Remote Sensing Images

Remote Sensing ◽ 10.3390/rs12061050 ◽ 2020 ◽ Vol 12 (6) ◽ pp. 1050 ◽ Cited By ~ 5

Author(s): Zhenfeng Shao ◽ Penghao Tang ◽ Zhongyuan Wang ◽ Nayyer Saleem ◽ Sarath Yam ◽ ...

Keyword(s): Remote Sensing ◽ High Resolution ◽ Input Image ◽ Economic Forecast ◽ Building Extraction ◽ Remote Sensing Images ◽ Global Features ◽ Prediction Module ◽ Deep Learning Network ◽ The One

Building extraction from high-resolution remote sensing images is of great significance in urban planning, population statistics, and economic forecasting. However, automatic building extraction from high-resolution remote sensing images remains challenging. On the one hand, the extraction results are often partially missing and incomplete due to the variation of hue and texture within a building, especially when the building is large. On the other hand, the extracted footprints of buildings with complex shapes are often inaccurate. To this end, we propose a new deep learning network, termed Building Residual Refine Network (BRRNet), for accurate and complete building extraction. BRRNet consists of two parts: the prediction module and the residual refinement module. The prediction module, based on an encoder–decoder structure, introduces atrous convolutions with different dilation rates to extract more global features by gradually increasing the receptive field during feature extraction. Once the prediction module outputs the preliminary building extraction results for the input image, the residual refinement module takes that output as its input. It further refines the residual between the result of the prediction module and the real result, thus improving the accuracy of building extraction. In addition, we use Dice loss as the loss function during training, which effectively alleviates the problem of data imbalance and further improves the accuracy of building extraction. The experimental results on the Massachusetts Building Dataset show that our method outperforms five other state-of-the-art methods in terms of the integrity of buildings and the accuracy of complex building footprints.
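To make the role of Dice loss concrete, the sketch below computes the commonly used soft Dice loss for a flat list of predicted building probabilities against binary labels. The function name and the smoothing constant `smooth=1.0` are assumptions for illustration; the abstract does not give the exact formulation used in BRRNet.

```python
def dice_loss(probs, labels, smooth=1.0):
    """Soft Dice loss: 1 - (2 * overlap + s) / (sum(probs) + sum(labels) + s).

    probs are predicted probabilities in [0, 1]; labels are 0 or 1.
    The smoothing term s avoids division by zero on empty masks.
    """
    if len(probs) != len(labels):
        raise ValueError("inputs must have the same number of pixels")
    overlap = sum(p * t for p, t in zip(probs, labels))
    total = sum(probs) + sum(labels)
    return 1.0 - (2.0 * overlap + smooth) / (total + smooth)


perfect = dice_loss([1.0, 1.0, 0.0, 0.0], [1, 1, 0, 0])    # 0.0
opposite = dice_loss([0.0, 0.0, 1.0, 1.0], [1, 1, 0, 0])   # 0.8
```

Because the loss depends on the overlap relative to the combined size of the two masks and not on the number of background pixels, it is less dominated by the majority class than a per-pixel average, which is the usual argument for using it on imbalanced data.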

Correction to "Scene-Driven Multitask Parallel Attention Network for Building Extraction in High-Resolution Remote Sensing Images"

IEEE Transactions on Geoscience and Remote Sensing ◽ 10.1109/tgrs.2020.3021422 ◽ 2020 ◽ pp. 1-1

Author(s): Haonan Guo ◽ Qian Shi ◽ Bo Du ◽ Liangpei Zhang ◽ Dongzhi Wang ◽ ...

Keyword(s): Remote Sensing ◽ High Resolution ◽ Building Extraction ◽ Remote Sensing Images ◽ Attention Network

Building Extraction from High-Resolution Remote Sensing Images Based on GrabCut with Automatic Selection of Foreground and Background Samples

Photogrammetric Engineering & Remote Sensing ◽ 10.14358/pers.86.4.235 ◽ 2020 ◽ Vol 86 (4) ◽ pp. 235-245 ◽ Cited By ~ 1

Author(s): Ka Zhang ◽ Hui Chen ◽ Wen Xiao ◽ Yehua Sheng ◽ Dong Su ◽ ...

Keyword(s): Remote Sensing ◽ High Resolution ◽ Geodesic Distance ◽ Extraction Methods ◽ Building Extraction ◽ Remote Sensing Images ◽ Canny Operator ◽ Contour Lines ◽ Average Accuracy ◽ Seed Points

This article proposes a new GrabCut-based method for extracting buildings from high-resolution remote sensing images, which automatically selects foreground and background samples under the constraints of building elevation contour lines. First, the image is rotated according to the direction of pixel displacement calculated by the rational function model. Second, the Canny operator, combined with morphology and the Hough transform, is used to extract the building's elevation contour lines. Third, seed points and interesting points of the building are selected under the constraint of the contour line and the geodesic distance, and foreground and background samples are then obtained from these points. Fourth, GrabCut and geometric features are used to carry out image segmentation and extract buildings. Finally, WorldView satellite images are used to verify the proposed method. Experimental results show that the average accuracy can reach 86.34%, which is 15.12% higher than other building extraction methods.
