Building Extraction and Number Statistics in WUI Areas Based on UNet Structure and Ensemble Learning

Following the advancement and progression of urbanization, management problems of the wildland–urban interface (WUI) have become increasingly serious. WUI regional governance issues involve many factors including climate, humanities, etc., and have attracted attention and research from all walks of life. Building research plays a vital part in the WUI area. Building location is closely related with the planning and management of the WUI area, and the number of buildings is related to the rescue arrangement. There are two major methods to obtain this building information: one is to obtain them from relevant agencies, which is slow and lacks timeliness, while the other approach is to extract them from high-resolution remote sensing images, which is relatively inexpensive and offers improved timeliness. Inspired by the recent successful application of deep learning, in this paper, we propose a method for extracting building information from high-resolution remote sensing images based on deep learning, which is combined with ensemble learning to extract the building location. Further, we use the idea of image anomaly detection to estimate the number of buildings. After verification on two datasets, we obtain superior semantic segmentation results and achieve better building contour extraction and number estimation.

Download Full-text

Self-Attention in Reconstruction Bias U-Net for Semantic Segmentation of Building Rooftops in Optical Remote Sensing Images

Remote Sensing ◽

10.3390/rs13132524 ◽

2021 ◽

Vol 13 (13) ◽

pp. 2524

Author(s):

Ziyi Chen ◽

Dilong Li ◽

Wentao Fan ◽

Haiyan Guan ◽

Cheng Wang ◽

...

Keyword(s):

Remote Sensing ◽

Deep Learning ◽

Semantic Segmentation ◽

Extraction Methods ◽

The Self ◽

Optical Remote Sensing ◽

Building Extraction ◽

Learning Models ◽

Remote Sensing Images ◽

Segmentation Methods

Deep learning models have brought great breakthroughs in building extraction from high-resolution optical remote-sensing images. Among recent research, the self-attention module has called up a storm in many fields, including building extraction. However, most current deep learning models loading with the self-attention module still lose sight of the reconstruction bias’s effectiveness. Through tipping the balance between the abilities of encoding and decoding, i.e., making the decoding network be much more complex than the encoding network, the semantic segmentation ability will be reinforced. To remedy the research weakness in combing self-attention and reconstruction-bias modules for building extraction, this paper presents a U-Net architecture that combines self-attention and reconstruction-bias modules. In the encoding part, a self-attention module is added to learn the attention weights of the inputs. Through the self-attention module, the network will pay more attention to positions where there may be salient regions. In the decoding part, multiple large convolutional up-sampling operations are used for increasing the reconstruction ability. We test our model on two open available datasets: the WHU and Massachusetts Building datasets. We achieve IoU scores of 89.39% and 73.49% for the WHU and Massachusetts Building datasets, respectively. Compared with several recently famous semantic segmentation methods and representative building extraction methods, our method’s results are satisfactory.

Download Full-text

U-net Network for Building Information Extraction of Remote-Sensing Imagery

International Journal of Online and Biomedical Engineering (iJOE) ◽

10.3991/ijoe.v14i12.9335 ◽

2018 ◽

Vol 14 (12) ◽

pp. 179

Author(s):

Jingtan Li ◽

Maolin Xu ◽

Hongling Xiu

Keyword(s):

Remote Sensing ◽

High Resolution ◽

Information Extraction ◽

Image Data ◽

Semantic Segmentation ◽

Remote Sensing Image ◽

Remote Sensing Images ◽

Training Set ◽

Building Information ◽

The Face

With the resolution of remote sensing images is getting higher and higher, high-resolution remote sensing images are widely used in many areas. Among them, image information extraction is one of the basic applications of remote sensing images. In the face of massive high-resolution remote sensing image data, the traditional method of target recognition is difficult to cope with. Therefore, this paper proposes a remote sensing image extraction based on U-net network. Firstly, the U-net semantic segmentation network is used to train the training set, and the validation set is used to verify the training set at the same time, and finally the test set is used for testing. The experimental results show that U-net can be applied to the extraction of buildings.

Download Full-text

Building Extraction from Very-High-Resolution Remote Sensing Images Using Semi-Supervised Semantic Edge Detection

Remote Sensing ◽

10.3390/rs13112187 ◽

2021 ◽

Vol 13 (11) ◽

pp. 2187

Author(s):

Liegang Xia ◽

Xiongbo Zhang ◽

Junxia Zhang ◽

Haiping Yang ◽

Tingting Chen

Keyword(s):

Remote Sensing ◽

Deep Learning ◽

High Resolution ◽

Edge Detection ◽

Network Architecture ◽

Semantic Segmentation ◽

Google Earth ◽

Remote Sensing Images ◽

Building Roof ◽

Social Applications

The automated detection of buildings in remote sensing images enables understanding the distribution information of buildings, which is indispensable for many geographic and social applications, such as urban planning, change monitoring and population estimation. The performance of deep learning in images often depends on a large number of manually labeled samples, the production of which is time-consuming and expensive. Thus, this study focuses on reducing the number of labeled samples used and proposing a semi-supervised deep learning approach based on an edge detection network (SDLED), which is the first to introduce semi-supervised learning to the edge detection neural network for extracting building roof boundaries from high-resolution remote sensing images. This approach uses a small number of labeled samples and abundant unlabeled images for joint training. An expert-level semantic edge segmentation model is trained based on labeled samples, which guides unlabeled images to generate pseudo-labels automatically. The inaccurate label sets and manually labeled samples are used to update the semantic edge model together. Particularly, we modified the semantic segmentation network D-LinkNet to obtain high-quality pseudo-labels. Specifically, the main network architecture of D-LinkNet is retained while the multi-scale fusion is added in its second half to improve its performance on edge detection. The SDLED was tested on high-spatial-resolution remote sensing images taken from Google Earth. Results show that the SDLED performs better than the fully supervised method. Moreover, when the trained models were used to predict buildings in the neighboring counties, our approach was superior to the supervised way, with line IoU improvement of at least 6.47% and F1 score improvement of at least 7.49%.

Download Full-text

Improved Anchor-Free Instance Segmentation for Building Extraction from High-Resolution Remote Sensing Images

Remote Sensing ◽

10.3390/rs12182910 ◽

2020 ◽

Vol 12 (18) ◽

pp. 2910

Author(s):

Tong Wu ◽

Yuan Hu ◽

Ling Peng ◽

Ruonan Chen

Keyword(s):

Remote Sensing ◽

High Resolution ◽

Semantic Segmentation ◽

Aerial Images ◽

Building Extraction ◽

Remote Sensing Images ◽

Highly Sensitive ◽

Segmentation Methods ◽

Speed And Accuracy ◽

Instance Segmentation

Building extraction from high-resolution remote sensing images plays a vital part in urban planning, safety supervision, geographic databases updates, and some other applications. Several researches are devoted to using convolutional neural network (CNN) to extract buildings from high-resolution satellite/aerial images. There are two major methods, one is the CNN-based semantic segmentation methods, which can not distinguish different objects of the same category and may lead to edge connection. The other one is CNN-based instance segmentation methods, which rely heavily on pre-defined anchors, and result in the highly sensitive, high computation/storage cost and imbalance between positive and negative samples. Therefore, in this paper, we propose an improved anchor-free instance segmentation method based on CenterMask with spatial and channel attention-guided mechanisms and improved effective backbone network for accurate extraction of buildings in high-resolution remote sensing images. Then we analyze the influence of different parameters and network structure on the performance of the model, and compare the performance for building extraction of Mask R-CNN, Mask Scoring R-CNN, CenterMask, and the improved CenterMask in this paper. Experimental results show that our improved CenterMask method can successfully well-balanced performance in terms of speed and accuracy, which achieves state-of-the-art performance at real-time speed.

Download Full-text

A Stacking Ensemble Deep Learning Model for Building Extraction from Remote Sensing Images

Remote Sensing ◽

10.3390/rs13193898 ◽

2021 ◽

Vol 13 (19) ◽

pp. 3898

Author(s):

Duanguang Cao ◽

Hanfa Xing ◽

Man Sing Wong ◽

Mei-Po Kwan ◽

Huaqiao Xing ◽

...

Keyword(s):

Remote Sensing ◽

Deep Learning ◽

Optimization Method ◽

Learning Model ◽

Ensemble Model ◽

Building Extraction ◽

Remote Sensing Images ◽

Basic Model ◽

Building Information ◽

Deep Learning Model

Automatically extracting buildings from remote sensing images with deep learning is of great significance to urban planning, disaster prevention, change detection, and other applications. Various deep learning models have been proposed to extract building information, showing both strengths and weaknesses in capturing the complex spectral and spatial characteristics of buildings in remote sensing images. To integrate the strengths of individual models and obtain fine-scale spatial and spectral building information, this study proposed a stacking ensemble deep learning model. First, an optimization method for the prediction results of the basic model is proposed based on fully connected conditional random fields (CRFs). On this basis, a stacking ensemble model (SENet) based on a sparse autoencoder integrating U-NET, SegNet, and FCN-8s models is proposed to combine the features of the optimized basic model prediction results. Utilizing several cities in Hebei Province, China as a case study, a building dataset containing attribute labels is established to assess the performance of the proposed model. The proposed SENet is compared with three individual models (U-NET, SegNet and FCN-8s), and the results show that the accuracy of SENet is 0.954, approximately 6.7%, 6.1%, and 9.8% higher than U-NET, SegNet, and FCN-8s models, respectively. The identification of building features, including colors, sizes, shapes, and shadows, is also evaluated, showing that the accuracy, recall, F1 score, and intersection over union (IoU) of the SENet model are higher than those of the three individual models. This suggests that the proposed ensemble model can effectively depict the different features of buildings and provides an alternative approach to building extraction with higher accuracy.

Download Full-text

Convolutional Neural Network for the Semantic Segmentation of Remote Sensing Images

Mobile Networks and Applications ◽

10.1007/s11036-020-01703-3 ◽

2021 ◽

Vol 26 (1) ◽

pp. 200-215

Author(s):

Muhammad Alam ◽

Jian-Feng Wang ◽

Cong Guangpei ◽

LV Yunrong ◽

Yuanfang Chen

Keyword(s):

Neural Network ◽

Remote Sensing ◽

Neural Networks ◽

Image Processing ◽

Deep Learning ◽

Semantic Segmentation ◽

Natural Scene ◽

Remote Sensing Images ◽

Advantages And Disadvantages ◽

Target Segmentation

AbstractIn recent years, the success of deep learning in natural scene image processing boosted its application in the analysis of remote sensing images. In this paper, we applied Convolutional Neural Networks (CNN) on the semantic segmentation of remote sensing images. We improve the Encoder- Decoder CNN structure SegNet with index pooling and U-net to make them suitable for multi-targets semantic segmentation of remote sensing images. The results show that these two models have their own advantages and disadvantages on the segmentation of different objects. In addition, we propose an integrated algorithm that integrates these two models. Experimental results show that the presented integrated algorithm can exploite the advantages of both the models for multi-target segmentation and achieve a better segmentation compared to these two models.

Download Full-text

Conditional Generative Adversarial Network-Based Training Sample Set Improvement Model for the Semantic Segmentation of High-Resolution Remote Sensing Images

IEEE Transactions on Geoscience and Remote Sensing ◽

10.1109/tgrs.2020.3033816 ◽

2020 ◽

pp. 1-17

Author(s):

Xin Pan ◽

Jian Zhao ◽

Jun Xu

Keyword(s):

Remote Sensing ◽

High Resolution ◽

Semantic Segmentation ◽

Training Sample ◽

Remote Sensing Images ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Sample Set

Download Full-text

Semantic Segmentation of High Resolution Remote Sensing Images with Extra Context Attention Mechanism

2020 IEEE 20th International Conference on Communication Technology (ICCT) ◽

10.1109/icct50939.2020.9295814 ◽

2020 ◽

Author(s):

Weifu Fu ◽

Qing Peng ◽

Yanxiang Gong ◽

Mei Xie ◽

Shicheng Wang ◽

...

Keyword(s):

Remote Sensing ◽

High Resolution ◽

Semantic Segmentation ◽

Attention Mechanism ◽

Remote Sensing Images

Download Full-text

HRCNet: High-Resolution Context Extraction Network for Semantic Segmentation of Remote Sensing Images

Remote Sensing ◽

10.3390/rs13010071 ◽

2020 ◽

Vol 13 (1) ◽

pp. 71

Author(s):

Zhiyong Xu ◽

Weicun Zhang ◽

Tianxiang Zhang ◽

Jiangyun Li

Keyword(s):

Remote Sensing ◽

Feature Extraction ◽

High Resolution ◽

Spatial Information ◽

Semantic Segmentation ◽

Context Information ◽

Remote Sensing Images ◽

Global Context ◽

Boundary Information ◽

Extraction Stage

Semantic segmentation is a significant method in remote sensing image (RSIs) processing and has been widely used in various applications. Conventional convolutional neural network (CNN)-based semantic segmentation methods are likely to lose the spatial information in the feature extraction stage and usually pay little attention to global context information. Moreover, the imbalance of category scale and uncertain boundary information meanwhile exists in RSIs, which also brings a challenging problem to the semantic segmentation task. To overcome these problems, a high-resolution context extraction network (HRCNet) based on a high-resolution network (HRNet) is proposed in this paper. In this approach, the HRNet structure is adopted to keep the spatial information. Moreover, the light-weight dual attention (LDA) module is designed to obtain global context information in the feature extraction stage and the feature enhancement feature pyramid (FEFP) structure is promoted and employed to fuse the contextual information of different scales. In addition, to achieve the boundary information, we design the boundary aware (BA) module combined with the boundary aware loss (BAloss) function. The experimental results evaluated on Potsdam and Vaihingen datasets show that the proposed approach can significantly improve the boundary and segmentation performance up to 92.0% and 92.3% on overall accuracy scores, respectively. As a consequence, it is envisaged that the proposed HRCNet model will be an advantage in remote sensing images segmentation.

Download Full-text

Analysis of Encoder-Decoder Based Deep Learning Architectures for Semantic Segmentation in Remote Sensing Images

Advances in Intelligent Systems and Computing - Intelligent Systems Design and Applications ◽

10.1007/978-3-030-16660-1_33 ◽

2019 ◽

pp. 332-341

Author(s):

R. Sivagami ◽

J. Srihari ◽

K. S. Ravichandran

Keyword(s):

Remote Sensing ◽

Deep Learning ◽

Semantic Segmentation ◽

Remote Sensing Images ◽

Learning Architectures

Download Full-text