Efficient Building Extraction for High Spatial Resolution Images Based on Dual Attention Network

Author(s):  
Dandong Zhao ◽  
Haishi Zhao ◽  
Renchu Guan ◽  
Chen Yang

Building extraction from high spatial resolution images has become an important research topic in computer vision for urban-related applications. Because high spatial resolution images contain rich detail and complex texture features, buildings are unevenly distributed and differ markedly in scale. General methods often confuse buildings with other ground objects. In this paper, a building extraction framework based on a deep residual neural network with a self-attention mechanism is proposed. The mechanism contains two parts: a spatial attention module, which aggregates and relates the local and global features at each position (short- and long-distance context information) of buildings, and a channel attention module, which improves the representation of comprehensive features (including color, texture, geometric, and high-level semantic features). The combination of the dual attention modules allows buildings to be extracted from complex backgrounds. The effectiveness of our method is validated by experiments conducted on wide-range high spatial resolution imagery, i.e., Jilin-1 Gaofen 02A imagery. Compared with several state-of-the-art segmentation methods, i.e., DeepLab-v3+, PSPNet, and PSANet, the proposed dual attention network-based method achieved high accuracy and intersection-over-union for extraction performance and showed the best recognition integrity of buildings.
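The two attention branches described above can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulation, so everything here is an assumption: the projection matrices `w_q`, `w_k`, `w_v` stand in for 1x1 convolutions, the scalars `gamma` and `beta` stand in for learnable residual weights, and the two branches are fused by a plain element-wise sum. It is not the authors' implementation.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def spatial_attention(feat, w_q, w_k, w_v, gamma=1.0):
    """Relate every position to every other position.

    feat: (C, H, W) feature map; w_q, w_k: (C_r, C); w_v: (C, C).
    The projection matrices are hypothetical stand-ins for 1x1 convolutions.
    """
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)                 # (C, N) with N = H*W positions
    q, k, v = w_q @ x, w_k @ x, w_v @ x        # (C_r, N), (C_r, N), (C, N)
    attn = softmax(q.T @ k, axis=-1)           # (N, N): row i weights all positions j
    out = v @ attn.T                           # (C, N): weighted sum over positions
    return (gamma * out + x).reshape(c, h, w)  # residual connection


def channel_attention(feat, beta=1.0):
    """Relate every channel to every other channel."""
    c, h, w = feat.shape
    x = feat.reshape(c, h * w)                 # (C, N)
    attn = softmax(x @ x.T, axis=-1)           # (C, C) channel affinity
    out = attn @ x                             # (C, N): re-weighted channels
    return (beta * out + x).reshape(c, h, w)   # residual connection


def dual_attention(feat, w_q, w_k, w_v):
    """Fuse both branches by element-wise sum (an assumed, common choice)."""
    return spatial_attention(feat, w_q, w_k, w_v) + channel_attention(feat)


# Toy input: 8 channels on a 4x4 grid, random weights for illustration only.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 4, 4))
w_q = rng.standard_normal((2, 8))
w_k = rng.standard_normal((2, 8))
w_v = rng.standard_normal((8, 8))
fused = dual_attention(feat, w_q, w_k, w_v)
print(fused.shape)  # (8, 4, 4)
```

Note that the spatial branch builds an N x N attention matrix, so its memory cost grows quadratically with the number of positions; in practice such modules are usually applied to a downsampled feature map.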

Sensors ◽  
2020 ◽  
Vol 20 (24) ◽  
pp. 7241
Author(s):  
Dengji Zhou ◽  
Guizhou Wang ◽  
Guojin He ◽  
Tengfei Long ◽  
Ranyu Yin ◽  
...  

Building extraction from high spatial resolution remote sensing images is a hot topic in remote sensing applications and computer vision. This paper presents a supervised semantic segmentation model named the Pyramid Self-Attention Network (PISANet). Its structure is simple, containing only two parts: the backbone of the network, which learns the local features (short-distance context information around the pixel) of buildings from the image, and the pyramid self-attention module, which obtains the global features (long-distance context information with other pixels in the image) and the comprehensive features (including color, texture, geometric, and high-level semantic features) of the building. The network is an end-to-end approach. In the training stage, the input is the remote sensing image and its corresponding label, and the output is a probability map (the probability that each pixel is or is not a building). In the prediction stage, the input is the remote sensing image, and the output is the building extraction result. The complexity of the network structure was reduced so that it is easy to implement. The proposed PISANet was tested on two datasets. The results show that the overall accuracy reached 94.50% and 96.15%, the intersection-over-union reached 77.45% and 87.97%, and the F1 index reached 87.27% and 93.55%, respectively. Across the different datasets, PISANet obtained high overall accuracy, a low error rate, and improved integrity of individual buildings.
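The three scores quoted above (overall accuracy, intersection-over-union, and F1) follow from the pixel-level confusion counts between a predicted building mask and its label. The sketch below uses their standard definitions. The 0.5 threshold for turning the probability map into a mask and the toy values are assumptions for illustration, not details from the paper.

```python
def threshold(prob_map, t=0.5):
    """Turn per-pixel building probabilities into 0/1 labels (t is an assumption)."""
    return [1 if p >= t else 0 for p in prob_map]


def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN for flat sequences of 0/1 labels (1 = building)."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def segmentation_metrics(pred, truth):
    """Overall accuracy, IoU and F1 for the building class."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    total = tp + fp + fn + tn
    union = tp + fp + fn
    return {
        "overall_accuracy": (tp + tn) / total if total else 0.0,
        "iou": tp / union if union else 0.0,
        "f1": 2 * tp / (2 * tp + fp + fn) if union else 0.0,
    }


# Toy example with 8 pixels (made-up values, not data from the paper).
prob = [0.9, 0.8, 0.4, 0.7, 0.1, 0.2, 0.3, 0.05]
truth = [1, 1, 1, 0, 0, 0, 0, 0]
pred = threshold(prob)            # [1, 1, 0, 1, 0, 0, 0, 0]
metrics = segmentation_metrics(pred, truth)
print(metrics)                    # OA = 0.75, IoU = 0.5, F1 = 2/3
```

For a single confusion matrix, F1 and IoU are tied by F1 = 2·IoU / (1 + IoU), so the two scores always rank results in the same order.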


2019 ◽  
Vol 11 (24) ◽  
pp. 2912 ◽  
Author(s):  
Wei Liu ◽  
MengYuan Yang ◽  
Meng Xie ◽  
Zihui Guo ◽  
ErZhu Li ◽  
...  

Accurate extraction of buildings using high spatial resolution imagery is essential to a wide range of urban applications. However, it is difficult to extract semantic features from a variety of complex scenes (e.g., suburban, urban and urban village areas) because various complex man-made objects usually appear heterogeneous with large intra-class and low inter-class variations. The automatic extraction of buildings is thus extremely challenging. The fully convolutional neural networks (FCNs) developed in recent years have performed well in the extraction of urban man-made objects due to their ability to learn state-of-the-art features and to label pixels end-to-end. One of the most successful FCNs used in building extraction is U-net. However, the commonly used skip connection and feature fusion refinement modules in U-net often ignore the problem of feature selection, and the ability to extract smaller buildings and refine building boundaries needs to be improved. In this paper, we propose a trainable chain fully convolutional neural network (CFCN), which fuses high spatial resolution unmanned aerial vehicle (UAV) images and the digital surface model (DSM) for building extraction. Multilevel features are obtained from the fusion data, and an improved U-net is used for the coarse extraction of the building. To solve the problem of incomplete extraction of building boundaries, a second U-net is chained to the first to introduce a coarse building boundary constraint, fill holes, and remove "speckle". Typical areas such as suburban, urban, and urban villages were selected for building extraction experiments. The results show that the CFCN achieved recall of 98.67%, 98.62%, and 99.52% and intersection over union (IoU) of 96.23%, 96.43%, and 95.76% in suburban, urban, and urban village areas, respectively. Compared with U-net, the CFCN improved the IoU by 6.61%, 5.31%, and 6.45% in suburban, urban, and urban village areas, respectively. The proposed method can extract buildings with higher accuracy and with clearer and more complete boundaries.
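The abstract does not say how the UAV image and the DSM are fused. One simple possibility is early fusion, where the DSM is rescaled and appended to the RGB bands as a fourth input channel. The sketch below illustrates only that assumption; the function name `fuse_rgb_dsm`, the 8-bit image range, and the min-max scaling are all hypothetical choices, not the authors' method.

```python
import numpy as np


def fuse_rgb_dsm(rgb, dsm):
    """Stack an RGB image (H, W, 3) and a DSM (H, W) into one (H, W, 4) array.

    Both inputs are scaled to [0, 1]. Assumes 8-bit imagery and a DSM that is
    already co-registered on the same pixel grid.
    """
    if rgb.shape[:2] != dsm.shape:
        raise ValueError("RGB image and DSM must share the same pixel grid")
    rgb01 = rgb.astype(np.float32) / 255.0
    dsm = dsm.astype(np.float32)
    span = float(dsm.max() - dsm.min())
    if span > 0:
        dsm01 = (dsm - dsm.min()) / span       # min-max scaling of heights
    else:
        dsm01 = np.zeros_like(dsm)             # flat terrain: no height signal
    return np.concatenate([rgb01, dsm01[..., None]], axis=-1)


# Toy 2x3 tile: black image, heights in metres (made-up values).
rgb = np.zeros((2, 3, 3), dtype=np.uint8)
dsm = np.array([[100.0, 102.0, 104.0],
                [100.0, 110.0, 120.0]])
fused = fuse_rgb_dsm(rgb, dsm)
print(fused.shape)  # (2, 3, 4)
```

Scaling the DSM per tile, as done here, discards absolute elevation; a dataset-wide scale would keep heights comparable across tiles.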


2019 ◽  
Vol 11 (3) ◽  
Author(s):  
Jefferson Francisco Soares ◽  
Gláucia Miranda Ramirez ◽  
Mirléia Aparecida de Carvalho ◽  
Marcelo de Carvalho Alves ◽  
Christiany Mattioli Sarmiento ◽  
...  

The maintenance of riparian forests is considered one of the main vegetative practices for mitigating the degradation of water resources and is mandated by law. However, in Brazil these areas are still progressively and constantly losing their original character. Given this reality, more research is needed to identify the changes taking place and to provide efficient, fast, and low-cost solutions. Remote sensing techniques show great application potential in characterizing natural resources. The objective of this work was to map and characterize the land use and occupation of the Permanent Preservation Areas of the Funil Hydroelectric Power Plant reservoir, located between the municipalities of Lavras, Perdões, Bom Sucesso, Ibituruna, Ijací and Itumirim, in the state of Minas Gerais, and to identify the best method for classifying high spatial resolution images of these areas. The methods used to classify the high spatial resolution image from the Quickbird satellite were visual, object-oriented and pixel-by-pixel. Results showed that the best method for mapping land use and occupation of the study area was object-oriented classification using the K-nearest neighbor algorithm, with a kappa coefficient of 0.88 and global accuracy of 91.40%.
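The kappa coefficient and global accuracy reported above are both derived from a classification confusion matrix. The sketch below uses the standard definitions: global accuracy is the observed agreement, and Cohen's kappa corrects it for the agreement expected by chance. The matrix values are made up for illustration and are not data from the study.

```python
def global_accuracy(matrix):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def cohen_kappa(matrix):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement and p_e the agreement expected by chance,
    computed from the row and column totals.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    p_o = global_accuracy(matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_o - p_e) / (1 - p_e)


# Illustrative 2-class matrix: rows = reference class, columns = mapped class.
example = [[45, 5],
           [10, 40]]
accuracy = global_accuracy(example)   # 0.85
kappa = cohen_kappa(example)          # chance agreement 0.5, so kappa is about 0.7
print(accuracy, round(kappa, 3))
```

Because kappa discounts chance agreement, it is always lower than the global accuracy unless the classification is perfect, which is consistent with the 0.88 and 91.40% pair quoted above.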


2019 ◽  
Vol 11 (2) ◽  
pp. 108 ◽  
Author(s):  
Lu Xu ◽  
Dongping Ming ◽  
Wen Zhou ◽  
Hanqing Bao ◽  
Yangyang Chen ◽  
...  

Extracting farmland from high spatial resolution remote sensing images is a basic task for agricultural information management. According to Tobler’s first law of geography, closer objects have a stronger relation. Meanwhile, due to the scale effect, objects of different kinds differ in both spatial and attribute scale. Thus, it is not appropriate to segment images with a single set of fixed parameters for different kinds of objects. In view of this, this paper presents a stratified object-based farmland extraction method, which includes two key processes: image region division on a rough scale, and scale parameter pre-estimation within local regions. First, the image in RGB color space is converted into HSV color space, and the texture features of the hue layer are calculated using the grey level co-occurrence matrix method. The whole image can then be divided into different regions based on texture features such as the mean and homogeneity. Second, within local regions, the optimal spatial scale segmentation parameter is pre-estimated from the average local variance and its first-order and second-order rates of change. The optimal attribute scale segmentation parameter can be estimated from the histogram of local variance. Through stratified regionalization and local estimation of segmentation parameters, fine farmland segmentation can be achieved. GF-2 and Quickbird images were used in this paper, and mean-shift and multi-resolution segmentation algorithms were applied as examples to verify the validity of the proposed method. The experimental results show that the stratified processing method can alleviate under-segmentation and over-segmentation to a certain extent, which ultimately benefits accurate farmland information extraction.
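The first stage described above (RGB to HSV conversion, then grey level co-occurrence matrix texture on the hue layer) can be sketched with the Python standard library. The quantization to 8 hue levels, the single horizontal pixel offset, and the squared-difference form of homogeneity are assumptions chosen for illustration; the abstract does not specify them.

```python
import colorsys
from collections import Counter


def hue_layer(rgb_image, levels=8):
    """Convert RGB pixels (0-255) to HSV and quantize hue into `levels` bins."""
    return [
        [min(int(colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[0] * levels),
             levels - 1)
         for (r, g, b) in row]
        for row in rgb_image
    ]


def glcm(gray, dx=1, dy=0):
    """Normalized (non-symmetric) co-occurrence of pixel pairs offset by (dx, dy)."""
    counts = Counter()
    h, w = len(gray), len(gray[0])
    for y in range(h):
        for x in range(w):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                counts[(gray[y][x], gray[yy][xx])] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {pair: c / total for pair, c in counts.items()}


def glcm_mean(p):
    """GLCM mean: expected grey level of the reference pixel."""
    return sum(i * prob for (i, _), prob in p.items())


def glcm_homogeneity(p):
    """Homogeneity: large when co-occurring levels are similar."""
    return sum(prob / (1 + (i - j) ** 2) for (i, j), prob in p.items())


# Toy 2x2 image with a red column next to a green column (made-up pixels).
RED, GREEN = (255, 0, 0), (0, 255, 0)
image = [[RED, GREEN],
         [RED, GREEN]]
hue = hue_layer(image)            # [[0, 2], [0, 2]]
p = glcm(hue)                     # {(0, 2): 1.0}
print(glcm_mean(p), glcm_homogeneity(p))  # 0.0 0.2
```

In a region-division step, these two statistics would be computed in a sliding window over the hue layer, giving one texture value per pixel that can then be clustered or thresholded into regions.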

