Automatic Building Extraction from High-Resolution Satellite Images Using Deep Learning Techniques

Author(s): Mayank Dixit ◽ Kuldeep Chaurasia ◽ Vipul Kumar Mishra

2019 ◽ Vol 11 (4) ◽ pp. 403
Author(s): Weijia Li ◽ Conghui He ◽ Jiarui Fang ◽ Juepeng Zheng ◽ Haohuan Fu ◽ ...

Automatic extraction of building footprints from high-resolution satellite imagery has become an important and challenging research issue, receiving increasing attention. Many recent studies have explored deep learning-based semantic segmentation methods for improving the accuracy of building extraction. However, although public geographic information system (GIS) map datasets record substantial land cover and land use information (e.g., buildings, roads, and water), they have rarely been utilized to improve building extraction results in existing studies. In this research, we propose a U-Net-based semantic segmentation method for extracting building footprints from high-resolution multispectral satellite images, using the SpaceNet building dataset provided in the DeepGlobe Satellite Challenge of the IEEE Conference on Computer Vision and Pattern Recognition 2018 (CVPR 2018). We explore the potential of multiple public GIS map datasets (OpenStreetMap, Google Maps, and MapWorld) through integration with WorldView-3 satellite images of four cities (Las Vegas, Paris, Shanghai, and Khartoum). Several strategies are designed and combined with the U-Net-based semantic segmentation model, including data augmentation, post-processing, and integration of the GIS map data with the satellite images. The proposed method achieves a total F1-score of 0.704, an improvement of 1.1% to 12.5% over the top three solutions in the SpaceNet Building Detection Competition and of 3.0% to 9.2% over the standard U-Net-based method. Moreover, the effect of each proposed strategy and the possible reasons for the building footprint extraction results are analyzed in detail, considering the actual situation of the four cities.
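The abstract does not specify how the GIS map data are fused with the imagery; a minimal, hypothetical sketch of one common integration strategy (rasterized GIS map layers stacked as extra input channels for a U-Net; all names, shapes, and the channel-stacking choice are illustrative assumptions, not the authors' code) could look like:

```python
import numpy as np

def stack_gis_channels(ms_image: np.ndarray, gis_masks: list) -> np.ndarray:
    """Concatenate rasterized GIS map layers (e.g., building or road masks
    derived from OpenStreetMap) with the multispectral bands, so a U-Net
    receives them as additional input channels.

    ms_image:  (H, W, B) multispectral satellite tile
    gis_masks: list of (H, W) binary/label rasters from GIS map data
    """
    layers = [ms_image] + [m[..., np.newaxis].astype(ms_image.dtype) for m in gis_masks]
    return np.concatenate(layers, axis=-1)  # shape: (H, W, B + len(gis_masks))

# Illustrative usage: an 8-band WorldView-3 tile plus two rasterized map layers
ms = np.random.rand(256, 256, 8).astype(np.float32)
osm_buildings = (np.random.rand(256, 256) > 0.9).astype(np.float32)
osm_roads = (np.random.rand(256, 256) > 0.95).astype(np.float32)
x = stack_gis_channels(ms, [osm_buildings, osm_roads])
print(x.shape)  # (256, 256, 10) -> feed to a U-Net configured for 10 input channels
```

Under this assumption, the only model change is widening the U-Net's first convolution to accept the extra channels; the data augmentation and post-processing strategies described in the abstract would sit on either side of this step.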


2020 ◽ Vol 12 (3) ◽ pp. 458
Author(s): Ugur Alganci ◽ Mehmet Soydas ◽ Elif Sertel

Object detection from satellite images has been a challenging problem for many years. With the development of effective deep learning algorithms and advances in hardware, higher accuracies have been achieved in the detection of various objects from very high-resolution (VHR) satellite images. This article provides a comparative evaluation of state-of-the-art convolutional neural network (CNN)-based object detection models, namely Faster R-CNN, Single Shot MultiBox Detector (SSD), and You Only Look Once-v3 (YOLO-v3), to cope with a limited amount of labeled data and to automatically detect airplanes in VHR satellite images. Data augmentation with rotation, rescaling, and cropping was applied to artificially increase the number of training samples derived from the satellite images. Moreover, a non-maximum suppression (NMS) algorithm was introduced at the end of the SSD and YOLO-v3 pipelines to eliminate multiple detections of the same object in overlapping areas. The trained networks were applied to five independent VHR test images covering airports and their surroundings to evaluate their performance objectively. Accuracy assessment of the test regions showed that the Faster R-CNN architecture provided the highest accuracy according to the F1-scores, average precision (AP) metrics, and visual inspection of the results. YOLO-v3 ranked second, with slightly lower performance but a balanced trade-off between accuracy and speed. SSD provided the lowest detection performance but was better at object localization. The results were also evaluated in terms of object size, which showed that large- and medium-sized airplanes were detected with higher accuracy.
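The NMS step named in the abstract is the standard greedy algorithm: keep the highest-scoring box, discard remaining boxes that overlap it beyond an IoU threshold, and repeat. A minimal NumPy sketch of that general algorithm (a reference implementation, not the authors' code; the corner-format boxes and the 0.5 threshold are assumptions) is:

```python
import numpy as np

def nms(boxes: np.ndarray, scores: np.ndarray, iou_threshold: float = 0.5) -> list:
    """Greedy non-maximum suppression.

    boxes:  (N, 4) array of [x1, y1, x2, y2] corner coordinates
    scores: (N,) confidence scores
    Returns the indices of the boxes to keep.
    """
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (x2 - x1) * (y2 - y1)
    order = scores.argsort()[::-1]  # indices sorted by descending confidence
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of the current top box with every remaining box
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Suppress boxes that overlap the kept box too strongly
        order = order[1:][iou <= iou_threshold]
    return keep

# Illustrative usage: two overlapping detections of one airplane plus a distant one
boxes = np.array([[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 140, 140]], dtype=float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))  # [0, 2] -- the duplicate detection (index 1) is suppressed
```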

