Boundary-Aware Refined Network for Automatic Building Extraction in Very High-Resolution Urban Aerial Images

2021 ◽  
Vol 13 (4) ◽  
pp. 692
Author(s):  
Yuwei Jin ◽  
Wenbo Xu ◽  
Ce Zhang ◽  
Xin Luo ◽  
Haitao Jia

Convolutional Neural Networks (CNNs), such as U-Net, have shown competitive performance in the automatic extraction of buildings from Very High-Resolution (VHR) aerial images. However, due to unstable multi-scale context aggregation, insufficient combination of multi-level features, and a lack of consideration of the semantic boundary, most existing CNNs produce incomplete segmentation for large-scale buildings and yield predictions with high uncertainty at building boundaries. This paper presents a novel network with a dedicated boundary-aware loss embedded, called the Boundary-Aware Refined Network (BARNet), to address these gaps. The distinguishing components of the proposed BARNet are the gated-attention refined fusion unit, the denser atrous spatial pyramid pooling module, and the boundary-aware loss. The performance of the BARNet is tested on two popular data sets that include various urban scenes and diverse patterns of buildings. Experimental results demonstrate that the proposed method outperforms several state-of-the-art approaches in both visual interpretation and quantitative evaluation.
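
The exact form of the BARNet boundary-aware loss is not given in the abstract; the sketch below only illustrates one common way such a loss can be realized, assuming the boundary band is derived from the ground-truth mask with a morphological gradient and used to up-weight the per-pixel cross-entropy. The function names and the weighting factor are illustrative, not the authors' formulation.

```python
# Hedged sketch of a boundary-aware segmentation loss (not the exact BARNet loss):
# pixels near ground-truth building boundaries receive a larger weight in the
# binary cross-entropy, so errors at edges are penalized more heavily.
import torch
import torch.nn.functional as F

def boundary_weight_map(mask, kernel_size=3, boundary_weight=5.0):
    """Derive a per-pixel weight map from a binary ground-truth mask (B, 1, H, W).

    The boundary band is approximated by a morphological gradient implemented
    with max pooling: dilation(mask) - erosion(mask).
    """
    pad = kernel_size // 2
    dilated = F.max_pool2d(mask, kernel_size, stride=1, padding=pad)
    eroded = -F.max_pool2d(-mask, kernel_size, stride=1, padding=pad)
    boundary = (dilated - eroded).clamp(0, 1)
    return 1.0 + boundary_weight * boundary  # interior pixels keep weight 1

def boundary_aware_bce(logits, mask):
    """Binary cross-entropy weighted toward boundary pixels."""
    weights = boundary_weight_map(mask)
    return F.binary_cross_entropy_with_logits(logits, mask, weight=weights)

# Example usage: logits = model(image); loss = boundary_aware_bce(logits, gt_mask)
```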

2021 ◽  
Vol 13 (13) ◽  
pp. 2473
Author(s):  
Qinglie Yuan ◽  
Helmi Zulhaidi Mohd Shafri ◽  
Aidi Hizami Alias ◽  
Shaiful Jahari Hashim

Automatic building extraction has been applied in many domains. It remains a challenging problem because of complex scenes and the multiscale nature of buildings. Deep learning algorithms, especially fully convolutional neural networks (FCNs), have shown more robust feature extraction ability than traditional remote sensing data processing methods. However, hierarchical features from encoders with a fixed receptive field have a weak ability to capture global semantic information. Local features in multiscale subregions cannot construct contextual interdependence and correlation, especially for large-scale building areas, which often causes fragmentary extraction results due to intra-class feature variability. In addition, low-level features carry accurate, fine-grained spatial information for tiny building structures but lack refinement and selection, and the semantic gap between cross-level features is not conducive to feature fusion. To address these problems, this paper proposes an FCN framework based on the residual network and provides a training pattern for multi-modal data that combines the advantages of high-resolution aerial images and LiDAR data for building extraction. Two novel modules are proposed for the optimization and integration of multiscale and cross-level features. In particular, a multiscale context optimization module is designed to adaptively generate feature representations for different subregions and effectively aggregate global context. A semantic-guided spatial attention mechanism is introduced to refine shallow features and alleviate the semantic gap, as sketched below. Finally, hierarchical features are fused via a feature pyramid network. Compared with other state-of-the-art methods, experimental results demonstrate superior performance, with 93.19% IoU and 97.56% OA on the WHU dataset and 94.72% IoU and 97.84% OA on the Boston dataset, showing that the proposed network improves accuracy and achieves better performance for building extraction.
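
The abstract does not spell out the semantic-guided spatial attention mechanism; the following is a minimal sketch of the general idea only, assuming the deep, semantically rich features are upsampled and converted into a gate that reweights the shallow features before fusion. Module and parameter names are hypothetical.

```python
# Hedged sketch (not the authors' exact module) of semantic-guided spatial
# attention: a deep feature map is upsampled and turned into a per-pixel gate
# that reweights the shallow, high-resolution features before fusion.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticGuidedSpatialAttention(nn.Module):
    def __init__(self, high_channels, low_channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(high_channels, low_channels, kernel_size=1),
            nn.BatchNorm2d(low_channels),
            nn.Sigmoid(),  # attention weights in (0, 1)
        )

    def forward(self, high_feat, low_feat):
        # Upsample the deep features to the spatial size of the shallow ones.
        high_up = F.interpolate(high_feat, size=low_feat.shape[2:],
                                mode="bilinear", align_corners=False)
        attention = self.gate(high_up)
        # Refine the shallow features and keep a residual path.
        return low_feat * attention + low_feat
```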


2019 ◽  
Vol 11 (5) ◽  
pp. 482 ◽  
Author(s):  
Qi Bi ◽  
Kun Qin ◽  
Han Zhang ◽  
Ye Zhang ◽  
Zhili Li ◽  
...  

Building extraction plays a significant role in many high-resolution remote sensing image applications. Many current building extraction methods require training samples, and it is well known that different samples often lead to different generalization ability. The morphological building index (MBI), which represents the morphological features of building regions in an index form, can effectively extract building regions, especially in Chinese urban regions, without any training samples and has drawn much attention. However, problems such as the heavy computation cost of multi-scale and multi-direction morphological operations still exist. In this paper, a multi-scale filtering building index (MFBI) is proposed to overcome these drawbacks and to deal with the increasing noise in very high-resolution remote sensing images. The profile of multi-scale average filtering is averaged and normalized to generate this index. Moreover, to fully utilize the relatively limited spectral information in very high-resolution remote sensing images, two scenarios to generate the multi-channel multi-scale filtering building index (MMFBI) are proposed. Because no very high-resolution remote sensing image building extraction dataset is currently open to the public, and the existing datasets usually contain samples from North American or European regions, we offer a very high-resolution remote sensing image building extraction dataset whose samples contain multiple building styles from multiple Chinese regions. The proposed MFBI and MMFBI outperform the MBI and a currently used object-based segmentation method on this dataset, with high recall and F-score. Meanwhile, the computation times of the MFBI and MBI are compared on three large-scale very high-resolution satellite images, and a sensitivity analysis demonstrates the robustness of the proposed method.
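
The exact construction of the MFBI is given in the paper, not here; the sketch below only illustrates the general pattern of a multi-scale filtering index, under the assumption that the profile is formed from the differences between a brightness band and its mean-filtered versions at several window sizes, then averaged and normalized. Window sizes and the thresholding step are illustrative.

```python
# Rough sketch of a multi-scale filtering index in the spirit of the MFBI
# (assumed formulation, not the paper's exact definition).
import numpy as np
from scipy.ndimage import uniform_filter

def multi_scale_filtering_index(band, window_sizes=(3, 7, 11, 15, 19)):
    """band: 2-D brightness image (e.g., maximum of the spectral bands)."""
    profile = []
    for w in window_sizes:
        smoothed = uniform_filter(band.astype(np.float64), size=w)
        # Bright, compact structures such as buildings deviate from the local mean.
        profile.append(np.abs(band - smoothed))
    index = np.mean(profile, axis=0)
    # Normalize to [0, 1].
    return (index - index.min()) / (index.max() - index.min() + 1e-12)

# A building map can then be obtained by thresholding the index, e.g.
# buildings = multi_scale_filtering_index(brightness) > 0.35   (illustrative threshold)
```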


2020 ◽  
Vol 5 (1) ◽  
pp. 1
Author(s):  
Minakshi Kumar ◽  
Ashutosh Bhardwaj

The availability of very high resolution (VHR) satellite imagery (&lt;1 m) has opened new vistas in large-scale mapping and information management in urban environments. Buildings are the most essential dynamic incremental factor in the urban environment, and hence their extraction is a particularly challenging activity. Extracting urban features, particularly buildings, using traditional pixel-based classification approaches driven by spectral tonal values produces relatively less accurate results for such VHR imagery. The present study demonstrates building extraction using Pleiades panchromatic (PAN) and multispectral stereo satellite datasets of highly planned and dense urban areas in parts of Chandigarh, India. The stereo datasets were processed in a photogrammetric environment to obtain the digital elevation model (DEM) and corresponding orthoimages. DEMs were generated at 0.5 m and 2.0 m from the stereo PAN and multispectral datasets, respectively. The orthoimages thus generated were segmented using object-based image analysis (OBIA) tools. Object primitives such as the scale parameter, shape, textural parameters, and DEM derivatives were used for segmentation and subsequently to determine threshold values for constructing fuzzy rules for building extraction and classification. The rule-based classification was carried out with decision rules defined on these object primitives and fuzzy rules. Two different methods were utilized for the performance evaluation of the proposed automatic building extraction approach. Overall accuracy, correctness, and completeness were evaluated for the extracted buildings. It was observed that overall accuracy was higher (&gt;93%) in sparsely built-up areas with larger buildings than in densely built-up areas with smaller buildings.
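
The study's object primitives, fuzzy memberships, and threshold values are not listed in the abstract; the sketch below only shows what object-level rule-based labeling of this kind can look like, with hypothetical attribute names and cut-off values.

```python
# Minimal sketch of object-level rule-based building classification.
# The attribute names and threshold values below are illustrative only,
# not the thresholds derived in the study.
from dataclasses import dataclass

@dataclass
class SegmentFeatures:
    mean_ndsm: float        # mean object height above ground (m), from the DEM/DSM
    mean_ndvi: float        # mean vegetation index of the segment
    rectangular_fit: float  # shape primitive in [0, 1]
    area_m2: float          # segment area in square metres

def is_building(seg: SegmentFeatures) -> bool:
    """Label a segment as 'building' when it is elevated, non-vegetated,
    reasonably rectangular, and large enough (hypothetical thresholds)."""
    return (seg.mean_ndsm > 2.5 and
            seg.mean_ndvi < 0.2 and
            seg.rectangular_fit > 0.6 and
            seg.area_m2 > 30.0)

# Example usage: labels = ["building" if is_building(s) else "other" for s in segments]
```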


2021 ◽  
Vol 13 (14) ◽  
pp. 2656
Author(s):  
Furong Shi ◽  
Tong Zhang

Deep-learning technologies, especially convolutional neural networks (CNNs), have achieved great success in building extraction from aerial images. However, shape details are often lost during the down-sampling process, which results in discontinuous segmentation or inaccurate segmentation boundaries. To compensate for the loss of shape information, two shape-related auxiliary tasks (i.e., boundary prediction and distance estimation) were jointly learned with the building segmentation task in our proposed network. Meanwhile, two consistency-constraint losses were designed on top of the multi-task network to exploit the duality between the mask prediction and the two shape-related predictions. Specifically, an atrous spatial pyramid pooling (ASPP) module was appended to the top of the encoder of a U-shaped network to obtain multi-scale features. Based on these multi-scale features, one regression loss and two classification losses were used to predict the distance-transform map, the segmentation mask, and the boundary. Two inter-task consistency-loss functions were constructed to ensure consistency between distance maps and masks, and between masks and boundary maps. Experimental results on three public aerial image data sets showed that our method achieved superior performance over recent state-of-the-art models.
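
The precise loss formulation is not given in the abstract; the following is a minimal sketch of how one regression loss, two classification losses, and two inter-task consistency terms could be combined. The particular consistency formulations and the weighting factor are chosen for illustration only.

```python
# Hedged sketch of a multi-task objective of the kind described above
# (not the paper's exact losses).
import torch
import torch.nn.functional as F

def multitask_loss(pred_mask_logits, pred_boundary_logits, pred_distance,
                   gt_mask, gt_boundary, gt_distance, w_consistency=0.1):
    # Primary task losses: two classifications and one regression.
    l_mask = F.binary_cross_entropy_with_logits(pred_mask_logits, gt_mask)
    l_boundary = F.binary_cross_entropy_with_logits(pred_boundary_logits, gt_boundary)
    l_distance = F.mse_loss(pred_distance, gt_distance)

    # Consistency between mask and distance map: outside the predicted building
    # mask, the predicted distance-transform values should be close to zero.
    pred_mask = torch.sigmoid(pred_mask_logits)
    l_mask_dist = F.mse_loss(pred_distance * (1 - pred_mask),
                             torch.zeros_like(pred_distance))

    # Consistency between mask and boundary: the spatial gradient magnitude of
    # the predicted mask should be high exactly where the boundary branch fires.
    grad_x = pred_mask[..., :, 1:] - pred_mask[..., :, :-1]
    grad_y = pred_mask[..., 1:, :] - pred_mask[..., :-1, :]
    edge = F.pad(grad_x.abs(), (0, 1)) + F.pad(grad_y.abs(), (0, 0, 0, 1))
    l_mask_bound = F.mse_loss(edge, torch.sigmoid(pred_boundary_logits))

    return (l_mask + l_boundary + l_distance
            + w_consistency * (l_mask_dist + l_mask_bound))
```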


Land ◽  
2018 ◽  
Vol 7 (4) ◽  
pp. 118 ◽  
Author(s):  
Myroslava Lesiv ◽  
Linda See ◽  
Juan Laso Bayas ◽  
Tobias Sturn ◽  
Dmitry Schepaschenko ◽  
...  

Very high resolution (VHR) satellite imagery from Google Earth and Microsoft Bing Maps is increasingly being used in a variety of applications from computer sciences to arts and humanities. In the field of remote sensing, one use of this imagery is to create reference data sets through visual interpretation, e.g., to complement existing training data or to aid in the validation of land-cover products. Through new applications such as Collect Earth, this imagery is also being used for monitoring purposes in the form of statistical surveys obtained through visual interpretation. However, little is known about where VHR satellite imagery exists globally or the dates of the imagery. Here we present a global overview of the spatial and temporal distribution of VHR satellite imagery in Google Earth and Microsoft Bing Maps. The results show an uneven availability globally, with biases in certain areas such as the USA, Europe and India, and with clear discontinuities at political borders. We also show that the availability of VHR imagery is currently not adequate for monitoring protected areas and deforestation, but is better suited for monitoring changes in cropland or urban areas using visual interpretation.


2018 ◽  
Vol 10 (11) ◽  
pp. 1768 ◽  
Author(s):  
Hui Yang ◽  
Penghai Wu ◽  
Xuedong Yao ◽  
Yanlan Wu ◽  
Biao Wang ◽  
...  

Building extraction from very high resolution (VHR) imagery plays an important role in urban planning, disaster management, navigation, updating geographic databases, and several other geospatial applications. Compared with traditional building extraction approaches, deep learning networks have recently shown outstanding performance in this task by using both high-level and low-level feature maps. However, it is difficult to utilize features from different levels rationally with present deep learning networks. To tackle this problem, a novel network based on DenseNets and the attention mechanism, called the dense-attention network (DAN), was proposed. The DAN contains an encoder part and a decoder part, which are composed of lightweight DenseNets and a spatial attention fusion module, respectively. The proposed encoder–decoder architecture can strengthen feature propagation and effectively use higher-level feature information to suppress low-level features and noise. Experimental results on the public International Society for Photogrammetry and Remote Sensing (ISPRS) datasets, using only red–green–blue (RGB) images, demonstrated that the proposed DAN achieved higher scores (96.16% overall accuracy (OA), 92.56% F1 score, and 90.56% mean intersection over union (MIoU)), with less training and response time and a higher quality value, than other deep learning methods.
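
The OA, F1, and mIoU figures quoted above follow the standard definitions; for reference, the sketch below shows how they are computed from a binary prediction and ground truth. This is generic evaluation code, not code from the DAN paper, and it assumes both classes occur in the image.

```python
# Standard binary segmentation metrics: overall accuracy, F1 score, and mean IoU.
import numpy as np

def binary_metrics(pred, gt):
    """pred, gt: boolean arrays of the same shape (True = building)."""
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    tn = np.logical_and(~pred, ~gt).sum()

    oa = (tp + tn) / (tp + tn + fp + fn)                # overall accuracy
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)  # F1 score
    iou_building = tp / (tp + fp + fn)
    iou_background = tn / (tn + fp + fn)
    miou = (iou_building + iou_background) / 2          # mean IoU over both classes
    return oa, f1, miou
```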


Author(s):  
Zhao Sun ◽  
Yifu Wang ◽  
Lei Pan ◽  
Yunhong Xie ◽  
Bo Zhang ◽  
...  

Pine wilt disease (PWD) is currently one of the main causes of large-scale forest destruction. To control the spread of PWD, it is essential to detect affected pine trees quickly. This study investigated the feasibility of using an object-oriented multi-scale segmentation algorithm to identify trees discolored by PWD. We used an unmanned aerial vehicle (UAV) platform equipped with an RGB digital camera to obtain high spatial resolution images, and multi-scale segmentation was applied to delineate tree crowns, coupled with object-oriented classification to classify trees discolored by PWD. The optimal segmentation scale was determined using the estimation of scale parameter (ESP2) plug-in. The feature space of the segmentation results was optimized, and appropriate features were selected for classification. The results showed that the optimal scale, shape, and compactness values of the tree crown segmentation algorithm were 56, 0.5, and 0.8, respectively. The producer's accuracy (PA), user's accuracy (UA), and F1 score were 0.722, 0.605, and 0.658, respectively. There were no significant classification errors in the final results, and the relatively low accuracy was attributed to an undercount of objects caused by incorrect segmentation. The multi-scale segmentation and object-oriented classification method could accurately identify trees discolored by PWD with straightforward and rapid processing. This study provides a technical method for monitoring the occurrence of PWD and identifying discolored trees using UAV-based high-resolution images.
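
The reported F1 score is consistent with the producer's and user's accuracies, since F1 is their harmonic mean (PA playing the role of recall and UA the role of precision); a quick check of the figures quoted above:

```python
# Verify that the reported F1 score follows from PA (recall) and UA (precision).
pa, ua = 0.722, 0.605
f1 = 2 * pa * ua / (pa + ua)
print(round(f1, 3))  # -> 0.658, matching the reported value
```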

