Accurate Building Extraction from Fused DSM and UAV Images Using a Chain Fully Convolutional Neural Network

2019 ◽  
Vol 11 (24) ◽  
pp. 2912 ◽  
Author(s):  
Wei Liu ◽  
MengYuan Yang ◽  
Meng Xie ◽  
Zihui Guo ◽  
ErZhu Li ◽  
...  

Accurate extraction of buildings from high spatial resolution imagery is essential to a wide range of urban applications. However, it is difficult to extract semantic features from a variety of complex scenes (e.g., suburban, urban, and urban village areas) because complex man-made objects usually appear heterogeneous, with large intra-class and low inter-class variations; the automatic extraction of buildings is thus extremely challenging. The fully convolutional networks (FCNs) developed in recent years have performed well in extracting urban man-made objects because of their ability to learn rich features and to label pixels end to end. One of the most successful FCNs used in building extraction is U-net. However, the skip connection and feature fusion refinement modules commonly used in U-net often neglect feature selection, and its ability to extract smaller buildings and refine building boundaries needs improvement. In this paper, we propose a trainable chain fully convolutional neural network (CFCN), which fuses high spatial resolution unmanned aerial vehicle (UAV) images with a digital surface model (DSM) for building extraction. Multilevel features are obtained from the fused data, and an improved U-net is used for coarse extraction of buildings. To address incomplete extraction of building boundaries, a second U-net is chained to the first; it imposes a coarse building boundary constraint and performs hole filling and "speckle" removal. Typical suburban, urban, and urban village areas were selected for building extraction experiments. The results show that the CFCN achieved recall of 98.67%, 98.62%, and 99.52% and intersection over union (IoU) of 96.23%, 96.43%, and 95.76% in suburban, urban, and urban village areas, respectively. In terms of IoU, the CFCN improved on U-net by 6.61%, 5.31%, and 6.45% in suburban, urban, and urban village areas, respectively. The proposed method can extract buildings with higher accuracy and with clearer, more complete boundaries.
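To make the chained design concrete, here is a minimal PyTorch sketch of the idea: one U-net produces a coarse building map from early-fused UAV RGB and DSM channels, and a second U-net, fed the fused input plus the coarse map, refines it. The layer sizes, depth, and concatenation-based fusion are illustrative assumptions, not the authors' exact CFCN architecture.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True))

class TinyUNet(nn.Module):
    """A two-level U-net stand-in; the real network would be deeper."""
    def __init__(self, in_ch, out_ch=1):
        super().__init__()
        self.enc1 = conv_block(in_ch, 32)
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool2d(2)
        self.up = nn.ConvTranspose2d(64, 32, 2, stride=2)
        self.dec = conv_block(64, 32)         # skip connection doubles channels
        self.head = nn.Conv2d(32, out_ch, 1)

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        d = self.dec(torch.cat([self.up(e2), e1], dim=1))
        return self.head(d)

class ChainedFCN(nn.Module):
    """First U-net: coarse extraction; second U-net: boundary refinement."""
    def __init__(self):
        super().__init__()
        self.coarse = TinyUNet(in_ch=4)       # 3 RGB bands + 1 DSM band
        self.refine = TinyUNet(in_ch=5)       # fused input + coarse map

    def forward(self, rgb, dsm):
        fused = torch.cat([rgb, dsm], dim=1)  # early fusion of UAV image and DSM
        coarse = torch.sigmoid(self.coarse(fused))
        refined = self.refine(torch.cat([fused, coarse], dim=1))
        return coarse, torch.sigmoid(refined)

# Shape check on a 256x256 tile.
rgb, dsm = torch.rand(1, 3, 256, 256), torch.rand(1, 1, 256, 256)
coarse, refined = ChainedFCN()(rgb, dsm)
print(coarse.shape, refined.shape)  # torch.Size([1, 1, 256, 256]) twice
```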

Sensors ◽  
2020 ◽  
Vol 20 (5) ◽  
pp. 1465 ◽  
Author(s):  
Lili Zhang ◽  
Jisen Wu ◽  
Yu Fan ◽  
Hongmin Gao ◽  
Yehong Shao

In this paper, we consider building extraction from high spatial resolution remote sensing images. At present, most building extraction methods are based on hand-crafted features. However, the diversity and complexity of buildings mean that building extraction still faces great challenges, so methods based on deep learning have recently been proposed. In this paper, a building extraction framework based on a convolutional neural network and an edge detection algorithm is proposed, called Mask R-CNN Fusion Sobel. Because of Mask R-CNN's outstanding record in image segmentation, we improve it and apply it to building extraction from remote sensing images. Our method consists of three parts. First, the convolutional neural network performs rough localization and pixel-level classification, and the problem of false and missed extractions is addressed by automatically learned semantic features. Second, the Sobel edge detection algorithm is used to delineate building edges accurately, compensating for the weak edge extraction and incomplete object boundaries of deep convolutional neural networks in semantic segmentation. Third, buildings are extracted by fusing the two outputs. We applied the proposed framework to building extraction from high-resolution images of the Chinese GF-2 satellite; in the experiments, the proposed method achieved an average IoU (intersection over union) of 88.7% and an average Kappa of 87.8%. Our method can therefore be applied to the recognition and segmentation of complex buildings and is more accurate than classical methods.
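The abstract does not spell out the fusion rule, so the following is only an illustrative sketch, under assumed rules, of how a CNN-predicted building mask might be combined with Sobel edge responses: compute the gradient magnitude, cut the mask along strong edges, and clean the result with a morphological opening. The threshold and the fusion rule itself are assumptions, not the paper's algorithm.

```python
import cv2
import numpy as np

def sobel_edges(gray, ksize=3):
    """Gradient magnitude from horizontal and vertical Sobel filters."""
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0, ksize=ksize)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1, ksize=ksize)
    mag = cv2.magnitude(gx, gy)
    return mag / (mag.max() + 1e-8)           # normalize to [0, 1]

def fuse_mask_with_edges(mask, gray, edge_thresh=0.4):
    """Hypothetical fusion: erase mask pixels on strong image edges,
    then remove the resulting thin debris with a 3x3 opening."""
    edges = sobel_edges(gray)
    fused = mask.copy()
    fused[edges > edge_thresh] = 0            # cut the mask along edges
    kernel = np.ones((3, 3), np.uint8)
    return cv2.morphologyEx(fused, cv2.MORPH_OPEN, kernel)

# Toy example: a synthetic gray tile and an overshooting "building" mask.
gray = np.zeros((128, 128), np.float32)
gray[32:96, 32:96] = 1.0                      # bright rooftop on dark ground
mask = np.zeros((128, 128), np.uint8)
mask[28:100, 28:100] = 1                      # CNN mask overshoots the roof
refined = fuse_mask_with_edges(mask, gray)
print(mask.sum(), refined.sum())              # edge cut removes boundary pixels
```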


2020 ◽  
Author(s):  
Wenmei Li ◽  
Juan Wang ◽  
Ziteng Wang ◽  
Yu Wang ◽  
Yan Jia ◽  
...  

The deep convolutional neural network (DeCNN) is considered one of the most promising techniques for classifying high spatial resolution remote sensing (HSRRS) scenes because of its powerful feature extraction capability. It is well known that large, high-quality labeled datasets are required to achieve good classification performance and prevent over-fitting when training a DeCNN model. However, the lack of such datasets often limits the application of DeCNNs. To address this problem, we propose an HSRRS image scene classification method that combines transfer learning with a DeCNN (TL-DeCNN) for few-shot HSRRS scene samples. Specifically, the convolutional-layer weights of three typical DeCNNs (VGG19, ResNet50, and InceptionV3) pre-trained on ImageNet 2015 are transferred to the TL-DeCNN. The TL-DeCNN then only needs to fine-tune its classification module on the few-shot HSRRS scene samples over a few epochs. Experimental results indicate that the proposed TL-DeCNN clearly outperforms VGG19, ResNet50, and InceptionV3 trained directly on the few-shot samples, without over-fitting.
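A minimal PyTorch sketch of this transfer-learning recipe, using VGG19 as the pretrained backbone: freeze the ImageNet-trained convolutional layers, swap in a new classification head, and fine-tune only that head on the small labeled set. The class count and the toy batch are placeholders, not the paper's dataset.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 10                                    # assumed number of scene classes

model = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
for p in model.features.parameters():               # transfer: freeze conv weights
    p.requires_grad = False
model.classifier[6] = nn.Linear(4096, NUM_CLASSES)  # new classification module

# Optimize only the trainable (classifier) parameters for a few epochs.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One toy fine-tuning step on a fake few-shot batch.
x = torch.rand(4, 3, 224, 224)                      # four 224x224 RGB scene tiles
y = torch.randint(0, NUM_CLASSES, (4,))
loss = criterion(model(x), y)
loss.backward()
optimizer.step()
print(float(loss))
```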


2020 ◽  
Author(s):  
Seungtaek Jeong ◽  
Jonghan Ko ◽  
Gwanyong Jeong ◽  
Myungjin Choi

Satellite image-based classification of crop types can provide information on arable land area and its changes over time. The classified information is also useful as a base dataset for various geospatial projects that retrieve crop growth and production processes over wide areas. Convolutional neural network (CNN) algorithms based on deep neural network techniques have frequently been applied to land cover classification using satellite images with high spatial resolution, producing consistent classification outcomes. However, it is still challenging to adopt coarse-resolution images, such as those from the Moderate Resolution Imaging Spectroradiometer (MODIS), for classification purposes, mainly because of the uncertainty from mixed pixels, which makes it difficult to collect and label actual land cover data. Nevertheless, using coarse images is a very efficient approach to obtaining continuous land spectral information at high temporal resolution for comparatively extensive areas (e.g., those at national and continental scales). In this study, we will classify paddy fields by applying a CNN algorithm to MODIS images of Northeast Asia. Time series features of vegetation indices that appear only in paddy fields will be rendered as 2-dimensional images and used as inputs to the classification algorithm. We will use high spatial resolution reference land cover maps of Korea and Japan as training and test datasets, with field-identified data employed for validation. This research effort suggests that a CNN-based classification approach using coarse spatial resolution images can be applicable and reliable for land cover classification at a continental scale, pointing toward a remedy for the errors that arise in satellite images with low spatial resolution.
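A minimal sketch, under assumed dimensions, of the input construction described here: a per-pixel annual vegetation index series (e.g., 48 MODIS 8-day composites) is folded into a small 2-D image and classified paddy/non-paddy by a compact CNN. The 6x8 grid shape and the network size are illustrative assumptions, not the authors' design.

```python
import torch
import torch.nn as nn

def series_to_image(ndvi_series):
    """Fold a 48-step annual NDVI series (8-day composites) into a 6x8 grid."""
    return ndvi_series.reshape(1, 6, 8)        # (channels, height, width)

class PaddyCNN(nn.Module):
    """Compact CNN over the 2-D time-series image of one pixel."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, 2))                  # paddy vs. non-paddy logits

    def forward(self, x):
        return self.net(x)

# Toy batch: 4 pixels, each with a 48-step NDVI series in [0, 1].
series = torch.rand(4, 48)
images = torch.stack([series_to_image(s) for s in series])  # (4, 1, 6, 8)
logits = PaddyCNN()(images)
print(logits.shape)                            # torch.Size([4, 2])
```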


Author(s):  
Dandong Zhao ◽  
Haishi Zhao ◽  
Renchu Guan ◽  
Chen Yang

Building extraction from high spatial resolution images has become an important research topic in computer vision for urban applications. Because of the rich detail and complex texture features present in high spatial resolution images, the distribution of buildings is non-uniform and their differences in scale are pronounced, so general methods often confuse buildings with other ground objects. In this paper, a building extraction framework based on a deep residual neural network with a self-attention mechanism is proposed. The mechanism contains two parts: a spatial attention module, which aggregates and relates the local and global features (short- and long-distance context information) at each position of a building, and a channel attention module, which improves the representation of comprehensive features (color, texture, geometric, and high-level semantic features). The combination of the two attention modules allows buildings to be extracted from complex backgrounds. The effectiveness of our method is validated by experiments on a wide-area high spatial resolution image, i.e., Jilin-1 Gaofen 02A imagery. Compared with state-of-the-art segmentation methods, i.e., the DeepLab-v3+, PSPNet, and PSANet algorithms, the proposed dual-attention-network-based method achieved higher accuracy and intersection over union in extraction performance and showed the best recognition integrity of buildings.
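A minimal PyTorch sketch in the spirit of the dual attention modules described here: a position (spatial) attention branch relates features across all locations, and a channel attention branch reweights feature maps, each with a residual connection. The channel-reduction factor and the summation of the two branches are illustrative assumptions, not the authors' exact design.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Each position attends to all others (long-range context)."""
    def __init__(self, ch):
        super().__init__()
        self.q = nn.Conv2d(ch, ch // 8, 1)
        self.k = nn.Conv2d(ch, ch // 8, 1)
        self.v = nn.Conv2d(ch, ch, 1)
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)       # (B, HW, C/8)
        k = self.k(x).flatten(2)                       # (B, C/8, HW)
        attn = torch.softmax(q @ k, dim=-1)            # (B, HW, HW)
        v = self.v(x).flatten(2)                       # (B, C, HW)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                    # residual connection

class ChannelAttention(nn.Module):
    """Reweight channels using inter-channel affinities."""
    def __init__(self):
        super().__init__()
        self.gamma = nn.Parameter(torch.zeros(1))

    def forward(self, x):
        b, c, h, w = x.shape
        f = x.flatten(2)                               # (B, C, HW)
        attn = torch.softmax(f @ f.transpose(1, 2), dim=-1)  # (B, C, C)
        out = (attn @ f).view(b, c, h, w)
        return self.gamma * out + x

# The two branches are applied to the backbone features and summed.
feat = torch.rand(1, 64, 32, 32)
fused = SpatialAttention(64)(feat) + ChannelAttention()(feat)
print(fused.shape)                                     # torch.Size([1, 64, 32, 32])
```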

