2D/3D information fusion for building extraction from high-resolution satellite stereo images using kernel graph cuts

2019 ◽ Vol 40 (15) ◽ pp. 5835-5860
Author(s): Hamid Mohammadi ◽ Farhad Samadzadegan ◽ Peter Reinartz

Author(s): T. Krauss ◽ P. d'Angelo ◽ G. Kuschk ◽ J. Tian ◽ T. Partovi

In this paper we show the pre-processing of very high resolution (VHR) satellite stereo imagery, such as that from WorldView-2 or Pléiades with ground sampling distances (GSD) of half a metre to a metre, and its potential for environmental applications. To process such data, a dense digital surface model (DSM) first has to be generated. From this, a digital terrain model (DTM) representing the ground and a so-called normalized digital elevation model (nDEM) representing off-ground objects are derived. Combining these elevation-based data with a spectral classification allows detection and extraction of objects from the satellite scenes. Besides object extraction, the DSM and DTM can also be used directly for simulation and monitoring of environmental issues. Examples are the simulation of flooding, building-volume and population estimation, simulation of noise from roads, wave propagation for cellphones, wind and light for estimating renewable energy sources, 3D change detection, earthquake preparedness and crisis relief, urban development and the sprawl of informal settlements, and much more. Outside urban areas, volume information likewise brings literally a new dimension to earth observation tasks such as estimating the volume of forests and of illegal logging, the volume of (illegal) open-pit mining activities, flooding or tsunami risks, dike planning, etc. We present the preprocessing from the original level-1 satellite data to digital surface models (DSMs), corresponding VHR ortho images and derived digital terrain models (DTMs). From these components we show how monitoring and decision-fusion-based 3D change detection can be realized using different acquisitions. The results are analyzed and assessed to derive quality parameters for the presented method. Finally, the usability of 3D information fusion from VHR satellite imagery is discussed and evaluated.
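
The DSM, DTM and nDEM relationship described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the clamping of negative heights to zero, the use of NDVI as a stand-in for the spectral classification, and both thresholds (`min_height`, `max_ndvi`) are assumptions chosen for illustration only.

```python
# Illustrative sketch only: tiny hand-made grids, hypothetical thresholds.

def normalized_dem(dsm, dtm):
    """Per-pixel height above ground: nDEM = DSM - DTM.

    Negative differences are clamped to 0 (an assumption, to absorb
    small DTM interpolation errors).
    """
    return [[max(s - t, 0.0) for s, t in zip(srow, trow)]
            for srow, trow in zip(dsm, dtm)]


def building_candidates(ndem, ndvi, min_height=2.5, max_ndvi=0.3):
    """Mark pixels that are elevated (nDEM) and not vegetated (low NDVI).

    NDVI stands in here for the spectral classification; both thresholds
    are hypothetical values, not taken from the paper.
    """
    return [[h >= min_height and v < max_ndvi for h, v in zip(hrow, vrow)]
            for hrow, vrow in zip(ndem, ndvi)]


# Heights in metres for a 2 x 3 pixel patch.
dsm = [[102.0, 110.5, 111.0],
       [101.5, 110.0, 108.0]]
dtm = [[101.8, 102.0, 102.1],
       [101.6, 102.0, 102.0]]
ndvi = [[0.10, 0.05, 0.08],
        [0.20, 0.07, 0.65]]

ndem = normalized_dem(dsm, dtm)
mask = building_candidates(ndem, ndvi)
print(mask)
```

In this toy patch the last pixel of the second row is elevated but has a high NDVI, so it is treated as vegetation rather than as a building candidate.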


2021 ◽ Vol 13 (13) ◽ pp. 2473
Author(s): Qinglie Yuan ◽ Helmi Zulhaidi Mohd Shafri ◽ Aidi Hizami Alias ◽ Shaiful Jahari Hashim

Automatic building extraction has been applied in many domains. It remains a challenging problem because of complex scenes and the multiple scales involved. Deep learning algorithms, especially fully convolutional neural networks (FCNs), have shown more robust feature extraction ability than traditional remote sensing data processing methods. However, hierarchical features from encoders with a fixed receptive field are weak at capturing global semantic information. Local features in multiscale subregions cannot establish contextual interdependence and correlation, especially for large-scale building areas, which can lead to fragmentary extraction results because of intra-class feature variability. In addition, low-level features carry accurate and fine-grained spatial information for tiny building structures but lack refinement and selection, and the semantic gap between features at different levels hinders feature fusion. To address these problems, this paper proposes an FCN framework based on the residual network and provides a training pattern for multi-modal data that combines the advantages of high-resolution aerial images and LiDAR data for building extraction. Two novel modules are proposed for the optimization and integration of multiscale and cross-level features. In particular, a multiscale context optimization module is designed to adaptively generate feature representations for different subregions and to aggregate global context effectively. A semantic-guided spatial attention mechanism is introduced to refine shallow features and reduce the semantic gap. Finally, hierarchical features are fused via the feature pyramid network. Compared with other state-of-the-art methods, experimental results demonstrate superior performance, with 93.19 IoU and 97.56 OA on the WHU datasets and 94.72 IoU and 97.84 OA on the Boston dataset, which shows that the proposed network can improve accuracy for building extraction.
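
The two metrics reported in this abstract, intersection over union (IoU) and overall accuracy (OA), can be sketched for binary building masks as follows. This is a generic illustration of the standard definitions, not the authors' evaluation code; the function names and the toy masks are assumptions.

```python
# Illustrative sketch only: standard IoU and OA on small binary masks.

def confusion_counts(pred, truth):
    """Count true/false positives/negatives over two equally sized masks."""
    tp = fp = fn = tn = 0
    for prow, trow in zip(pred, truth):
        for p, t in zip(prow, trow):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif not p and t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def iou(tp, fp, fn):
    """IoU of the building class: TP / (TP + FP + FN)."""
    denom = tp + fp + fn
    # Returning 0.0 when no building pixel exists in either mask is a
    # convention chosen here, not a universal one.
    return tp / denom if denom else 0.0


def overall_accuracy(tp, fp, fn, tn):
    """OA: share of all pixels labelled correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


# 1 = building, 0 = background.
pred = [[1, 1, 0, 0],
        [1, 0, 0, 0]]
truth = [[1, 1, 1, 0],
         [0, 0, 0, 0]]

tp, fp, fn, tn = confusion_counts(pred, truth)
print(iou(tp, fp, fn), overall_accuracy(tp, fp, fn, tn))
```

Because OA also rewards correctly labelled background, it is typically higher than IoU when buildings cover a small share of the scene, which matches the pattern in the reported figures.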

