Automatic Building Footprint Extraction from Multi-Resolution Remote Sensing Images Using a Hybrid FCN

2019, Vol 8 (4), pp. 191
Author(s): Philipp Schuegraf, Ksenia Bittner

Recent technical developments have made it possible to supply large-scale satellite image coverage, which poses the challenge of analysing this imagery efficiently. One important task in applications such as urban planning and reconstruction is the automatic extraction of building footprints. Integrating different sources of information, now achievable thanks to the availability of high-resolution remote sensing data, makes it possible to improve the quality of the extracted building outlines. Recently, deep neural networks were extended from image-level to pixel-level labelling, allowing dense prediction of semantic labels. Building on these advances, we propose an end-to-end U-shaped neural network that efficiently merges depth and spectral information within two parallel streams combined at a late stage for binary building mask generation. Moreover, since satellites usually provide high-resolution panchromatic images but only low-resolution multi-spectral images, we tackle this issue with a residual neural network block. It fuses images of different spatial resolution at an early stage, before passing the fused information to the U-Net stream responsible for processing spectral information. In a parallel stream, a stereo digital surface model (DSM) is also processed by a U-Net. Additionally, we demonstrate that our method generalizes to cities not included in the training data.
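As a rough illustration of the early-fusion idea, the sketch below upsamples low-resolution multi-spectral bands to panchromatic resolution and stacks them with the pan band so a downstream network could learn a residual correction. The shapes, the nearest-neighbour upsampling, and the function names are illustrative assumptions, not the authors' residual block.

```python
import numpy as np

def upsample_nn(ms, factor):
    """Nearest-neighbour upsampling of a (C, H, W) multi-spectral stack."""
    return ms.repeat(factor, axis=1).repeat(factor, axis=2)

def early_fusion(pan, ms, factor=4):
    """Stack a high-res panchromatic band with upsampled low-res
    multi-spectral bands into one (C+1, H, W) input tensor."""
    ms_up = upsample_nn(ms, factor)            # (C, H, W) at pan resolution
    return np.concatenate([pan[None], ms_up])  # pan first, then MS bands

# Toy data: an 8x8 pan band and a 4-band 2x2 multi-spectral image.
pan = np.ones((8, 8))
ms = np.zeros((4, 2, 2))
fused = early_fusion(pan, ms)
print(fused.shape)  # (5, 8, 8)
```

A real implementation would replace the plain concatenation with learned convolutions, but the shape bookkeeping is the same.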

Sensors, 2018, Vol 18 (10), pp. 3232
Author(s): Yan Liu, Qirui Ren, Jiahui Geng, Meng Ding, Jiangyun Li

Efficient and accurate semantic segmentation is the key technique for automatic remote sensing image analysis. While there have been many segmentation methods based on traditional hand-crafted feature extractors, it is still challenging to process high-resolution and large-scale remote sensing images. In this work, a novel patch-wise semantic segmentation method with a new training strategy based on fully convolutional networks is presented to segment common land resources. First, to handle high-resolution images, they are split into local patches and a patch-wise network is built. Second, the training data are preprocessed in several ways to meet the specific characteristics of remote sensing images, i.e., color imbalance, object rotation variations and lens distortion. Third, a multi-scale training strategy is developed to address the severe scale variation problem. In addition, the impact of conditional random fields (CRF) on precision is studied. The proposed method was evaluated on a dataset collected from a capital city in West China with the Gaofen-2 satellite. The dataset contains ten common land resources (grassland, road, etc.). The experimental results show that the proposed algorithm achieves 54.96% mean intersection over union (MIoU) and outperforms other state-of-the-art methods in remote sensing image segmentation.
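The headline metric here, mean intersection over union, can be computed per class from a pair of label maps. A minimal numpy sketch, with toy data (the dataset and label values below are illustrative, not the paper's):

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union (MIoU) of two integer label maps.

    For each class c: IoU = |pred==c AND gt==c| / |pred==c OR gt==c|.
    Classes absent from both maps are skipped so they do not skew the mean.
    """
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))

gt   = np.array([[0, 0], [1, 1]])
pred = np.array([[0, 1], [1, 1]])
# class 0: inter 1 / union 2 = 0.5; class 1: inter 2 / union 3 = 0.667
print(round(mean_iou(pred, gt, 2), 3))  # 0.583
```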


Author(s): C. Xiao, R. Qin, X. Huang, J. Li

Abstract. Individual tree detection and counting are critical for forest inventory management. In almost all methods based on remote sensing data, treetop detection is the most important and essential part. However, due to the diversity of tree attributes, such as crown size and branch distribution, it is hard to find a universal treetop detector, and most current detectors need to be carefully designed based on heuristics or prior knowledge. Hence, to find an efficient and versatile detector, we apply a deep neural network to extract and learn high-level semantic treetop features. In contrast to using manually labelled training data, we innovatively train the network with pseudo labels that come from the results of conventional unsupervised treetop detectors, which may not be robust in different scenarios. In this study, we use a DSM (digital surface model) derived from multi-view high-resolution satellite imagery and a multispectral orthophoto as data, and apply the top-hat by reconstruction (THR) operation to find treetops as pseudo labels. An FCN (fully convolutional network) is adopted as a pixel-level classification network to segment the input image into treetop and non-treetop pixels. Our experiments show that the FCN-based treetop detector achieves a detection accuracy of 99.7% in a prairie area and 66.3% in a complicated town area, outperforming THR in various scenarios. This study demonstrates that, without manual labels, the FCN treetop detector can be trained with pseudo labels generated by an unsupervised detector and achieve better and more robust results in different scenarios.
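A much simpler stand-in for the THR pseudo-label step is to mark strict local maxima of the DSM above a height threshold as treetops. The sketch below does exactly that; the window size and `min_height` parameter are illustrative assumptions, not the paper's THR operator.

```python
import numpy as np

def local_maxima(dsm, min_height=2.0):
    """Pseudo-label treetop pixels as strict 3x3 local maxima of a DSM
    that also exceed min_height (a stand-in for top-hat by reconstruction)."""
    h, w = dsm.shape
    tops = np.zeros_like(dsm, dtype=bool)
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            patch = dsm[i-1:i+2, j-1:j+2]
            # strict maximum: the centre is the unique largest value
            if dsm[i, j] >= min_height and dsm[i, j] == patch.max() \
               and (patch == patch.max()).sum() == 1:
                tops[i, j] = True
    return tops

dsm = np.zeros((5, 5))
dsm[2, 2] = 5.0                         # a single crown apex
print(np.argwhere(local_maxima(dsm)))   # [[2 2]]
```

Labels produced this way could then supervise a pixel-level classifier such as an FCN, which is the paper's central idea.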


2020, Vol 12 (21), pp. 3501
Author(s): Qingsong Xu, Xin Yuan, Chaojun Ouyang, Yue Zeng

Unlike conventional natural (RGB) images, the inherent large scale and complex structures of remote sensing images pose major challenges, such as spatial object distribution diversity and spectral information extraction, when existing models are directly applied to image classification. In this study, we develop an attention-based pyramid network for segmentation and classification of remote sensing datasets. Attention mechanisms are used to develop the following modules: (i) a novel and robust attention-based multi-scale fusion method that effectively fuses useful spatial or spectral information at the same and across different scales; (ii) a region pyramid attention mechanism using region-based attention that addresses the target geometric size diversity in large-scale remote sensing images; and (iii) cross-scale attention in our adaptive atrous spatial pyramid pooling network that adapts to varied contents in a feature-embedded space. Different forms of feature fusion pyramid frameworks are established by combining these attention-based modules. First, a novel segmentation framework, called the heavy-weight spatial feature fusion pyramid network (FFPNet), is proposed to address the spatial problem of high-resolution remote sensing images. Second, an end-to-end spatial-spectral FFPNet is presented for classifying hyperspectral images. Experiments conducted on the ISPRS Vaihingen and ISPRS Potsdam high-resolution datasets demonstrate the competitive segmentation accuracy achieved by the proposed heavy-weight spatial FFPNet. Furthermore, experiments on the Indian Pines and University of Pavia hyperspectral datasets indicate that the proposed spatial-spectral FFPNet outperforms the current state-of-the-art methods in hyperspectral image classification.
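The core of attention-based multi-scale fusion is a learned, normalized weighting of feature maps before they are summed. A minimal numpy sketch with fixed scores standing in for learned attention logits (all names and shapes are illustrative, not the FFPNet modules):

```python
import numpy as np

def softmax(x, axis=0):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_fuse(features, scores):
    """Fuse same-sized feature maps from different scales with a softmax
    over per-scale attention scores (a weighted sum across the scale axis)."""
    w = softmax(scores)                    # (S,) weights, sum to 1
    stack = np.stack(features)             # (S, H, W)
    return np.tensordot(w, stack, axes=1)  # (H, W) fused map

f1, f2 = np.ones((4, 4)), np.zeros((4, 4))
fused = attention_fuse([f1, f2], np.array([2.0, 0.0]))
print(round(float(fused[0, 0]), 3))  # weight of f1 = e^2 / (e^2 + 1) ≈ 0.881
```

In the paper the scores are produced by learned layers rather than given constants, but the normalization-then-weighted-sum structure is the same.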


2019, Vol 11 (7), pp. 755
Author(s): Xiaodong Zhang, Kun Zhu, Guanzhou Chen, Xiaoliang Tan, Lifei Zhang, ...

Object detection in very-high-resolution (VHR) remote sensing imagery has attracted a lot of attention in the field of automatic image interpretation. Region-based convolutional neural networks (CNNs) are widely used in this domain; they first generate candidate regions and then accurately classify and locate the objects within them. However, oversized images, complex image backgrounds and the uneven size and quantity distribution of training samples make detection challenging, especially for small and dense objects. To solve these problems, an effective region-based VHR remote sensing imagery object detection framework named Double Multi-scale Feature Pyramid Network (DM-FPN) is proposed in this paper, which utilizes inherent multi-scale pyramidal features and combines strong-semantic, low-resolution features with weak-semantic, high-resolution features. DM-FPN consists of a multi-scale region proposal network and a multi-scale object detection network; these two modules share convolutional layers and can be trained end-to-end. We propose several multi-scale training strategies to increase the diversity of training data and overcome the size restrictions of the input images. We also propose multi-scale inference and adaptive categorical non-maximum suppression (ACNMS) strategies to improve detection performance, especially for small and dense objects. Extensive experiments and comprehensive evaluations on the large-scale DOTA dataset demonstrate the effectiveness of the proposed framework, which achieves a mean average precision (mAP) of 0.7927 on the validation set and a best mAP of 0.793 on the testing set.
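Non-maximum suppression applied independently per class, with a per-class IoU threshold, echoes the "adaptive categorical" idea. A minimal greedy sketch with toy boxes (the per-class thresholds and data below are illustrative, not the paper's ACNMS):

```python
import numpy as np

def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def nms_per_class(boxes, scores, classes, thresh):
    """Greedy NMS run independently per class; `thresh` maps a class name
    to its IoU threshold (0.5 used for unlisted classes)."""
    keep, suppressed = [], set()
    order = np.argsort(scores)[::-1]          # highest score first
    for i in order:
        if i in suppressed:
            continue
        keep.append(int(i))
        for j in order:
            if j != i and j not in suppressed and classes[j] == classes[i] \
               and iou(boxes[i], boxes[j]) > thresh.get(classes[i], 0.5):
                suppressed.add(int(j))
    return keep

boxes = np.array([[0, 0, 10, 10], [1, 1, 10, 10], [20, 20, 30, 30]])
scores = np.array([0.9, 0.8, 0.7])
classes = ["plane", "plane", "ship"]
print(nms_per_class(boxes, scores, classes, {"plane": 0.5}))  # [0, 2]
```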


Author(s): Y. Han, S. Wang, D. Gong, Y. Wang, Y. Wang, ...

Abstract. Optical satellite imaging sensors operating around the clock now collect data in abundance. Besides being more suitable for large-scale mapping, multi-view high-resolution satellite images (HRSI) are cheaper than Light Detection and Ranging (LiDAR) data and aerial remote sensing images, making them a more accessible source for digital surface modelling and updating. Digital surface model (DSM) generation is one of the most critical steps for mapping, 3D modelling, and semantic interpretation. Computing DSMs from such data is relatively new, and several commercial and open-source solutions exist, but their performance has not yet been comprehensively analyzed. Although some works and challenges have focused on the DSM generation pipeline and the geometric accuracy of the generated DSMs, the evaluations do not consider the latest solutions, given the fast development in this domain. In this work, we discuss the pipelines of the considered commercial and open-source solutions and assess the accuracy of multi-view satellite-image-based DSM generation methods with a LiDAR-derived DSM as ground truth. Three solutions, Satellite Stereo Pipeline (S2P), PCI Geomatica, and Agisoft Metashape, are evaluated quantitatively and qualitatively on a WorldView-3 multi-view satellite dataset against the LiDAR ground truth. Our comparison and findings are presented in the experimental section.
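DSM accuracy against LiDAR truth is typically summarized with the mean error (bias), RMSE, and the outlier-robust NMAD. A small numpy sketch with toy heights (the sample values are illustrative; the abstract does not specify which statistics the authors report):

```python
import numpy as np

def dsm_errors(dsm, lidar):
    """Height-error statistics of a generated DSM against LiDAR truth.

    Returns (bias, rmse, nmad); NMAD = 1.4826 * median(|d - median(d)|)
    is robust against gross matching outliers, unlike RMSE.
    """
    d = dsm - lidar
    bias = d.mean()
    rmse = np.sqrt((d ** 2).mean())
    nmad = 1.4826 * np.median(np.abs(d - np.median(d)))
    return bias, rmse, nmad

dsm   = np.array([10.0, 11.0, 12.0, 30.0])   # last value: a gross outlier
lidar = np.array([10.0, 10.0, 12.0, 12.0])
bias, rmse, nmad = dsm_errors(dsm, lidar)
print(bias, rmse, nmad)  # RMSE is dominated by the outlier, NMAD is not
```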


2021, Vol 13 (16), pp. 3243
Author(s): Pengfei Shi, Qigang Jiang, Chao Shi, Jing Xi, Guofang Tao, ...

Oil is an important resource for the development of modern society. Accurate detection of oil wells is of great significance to the investigation of oil exploitation status and the formulation of exploitation plans. However, detecting small objects such as oil wells in large-scale, high-resolution remote sensing images is a challenging task due to their large numbers, the few pixels they occupy, and complex backgrounds. To overcome these problems, we first create our own oil well dataset for the experiments, given the lack of a public one. Second, we provide a comparative assessment of two state-of-the-art object detection algorithms, SSD and YOLO v4, for oil well detection on our dataset. The results show that both perform well, but YOLO v4 is more accurate for oil well detection because of its better feature extraction capability for small objects. Because small objects remain difficult to detect in large-scale, high-resolution remote sensing images, this article proposes an improved algorithm based on YOLO v4 with sliding slices and discarded edges. The algorithm effectively solves the problems of repeated detection and inaccurate positioning in large-scale, high-resolution remote sensing images, and detection accuracy increases considerably. In summary, this study identifies an appropriate algorithm for oil well detection, improves it, and achieves an excellent result on a large-scale, high-resolution satellite image. It provides a new idea for small-object detection in large-scale, high-resolution remote sensing images.
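The sliding-slices idea amounts to covering the large image with overlapping tiles, so that an object cut off at one tile's border (whose detection would be discarded) falls fully inside a neighbouring tile. A minimal sketch of the tiling step; the tile and overlap sizes are illustrative assumptions, not the paper's settings:

```python
def sliding_slices(width, height, tile=512, overlap=64):
    """Top-left corners of overlapping tiles covering a large image.

    The overlap lets detections near tile edges be discarded and picked up
    again by a neighbouring tile, avoiding duplicated or truncated objects.
    """
    step = tile - overlap
    xs = list(range(0, max(width - tile, 0) + 1, step))
    ys = list(range(0, max(height - tile, 0) + 1, step))
    # make sure the right and bottom borders are covered
    if xs[-1] + tile < width:
        xs.append(width - tile)
    if ys[-1] + tile < height:
        ys.append(height - tile)
    return [(x, y) for y in ys for x in xs]

corners = sliding_slices(1024, 1024)
print(len(corners))  # 9 tiles cover a 1024x1024 image
```

Detections from each tile are then mapped back to global coordinates, edge detections dropped, and the remainder merged (e.g., with NMS).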


2021
Author(s): Benjamin Kellenberger, Thor Veen, Eelke Folmer, Devis Tuia

Recently, Unmanned Aerial Vehicles (UAVs) equipped with high-resolution imaging sensors have become a viable alternative for ecologists to conduct wildlife censuses, compared to foot surveys. They cause less disturbance by sensing remotely, they provide coverage of otherwise inaccessible areas, and their images can be reviewed and double-checked in controlled screening sessions. However, the amount of data they generate often makes this photo-interpretation stage prohibitively time-consuming.

In this work, we automate the detection process with deep learning [4]. We focus on counting coastal seabirds on sand islands off the West African coast, where species like the African Royal Tern are at the top of the food chain [5]. Monitoring their abundance provides invaluable insights into biodiversity in this area [7]. In a first step, we obtained orthomosaics from nadir-looking UAVs over six sand islands at 1 cm resolution. We then fully labelled one of them with points for four seabird species, which took five annotators three weeks and yielded over 21,000 individuals. Next, we labelled the other five orthomosaics in an incomplete manner, aiming for a low number of only 200 points per species. These points, together with a few background polygons, served as training data for our ResNet-based [2] detection model. The low number of points required multiple strategies to obtain stable predictions, including curriculum learning [1] and post-processing with a Markov random field [6]. In the end, our model was able to accurately predict the 21,000 birds of the test image with 90% precision at 90% recall (Fig. 1) [3]. Furthermore, the model required a mere 4.5 hours from creating training data to the final prediction, a fraction of the three weeks needed for manual labelling. Inference takes only a few minutes, which makes the model scale favourably to many more islands.

In sum, the combination of UAVs and machine-learning-based detectors provides census possibilities with unprecedentedly high accuracy and comparably minuscule execution time.

Fig. 1: Our model is able to predict over 21,000 birds in high-resolution UAV images in a fraction of the time compared to weeks of manual labelling.

References

1. Bengio, Yoshua, et al. "Curriculum learning." Proceedings of the 26th Annual International Conference on Machine Learning. 2009.
2. He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
3. Kellenberger, Benjamin, et al. "21,000 Birds in 4.5 Hours: Efficient Large-scale Seabird Detection with Machine Learning." Remote Sensing in Ecology and Conservation. Under review.
4. LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." Nature 521.7553 (2015): 436-444.
5. Parsons, Matt, et al. "Seabirds as indicators of the marine environment." ICES Journal of Marine Science 65.8 (2008): 1520-1526.
6. Tuia, Devis, Michele Volpi, and Gabriele Moser. "Decision fusion with multiple spatial supports by conditional random fields." IEEE Transactions on Geoscience and Remote Sensing 56.6 (2018): 3277-3289.
7. Veen, Jan, Hanneke Dallmeijer, and Thor Veen. "Selecting piscivorous bird species for monitoring environmental change in the Banc d'Arguin, Mauritania." Ardea 106.1 (2018): 5-18.
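Precision and recall for point detections like these are usually computed by matching each predicted point to an unmatched ground-truth point within a distance tolerance. A minimal greedy sketch; the matching radius and sample coordinates are illustrative assumptions, not the authors' evaluation protocol:

```python
def match_points(preds, gts, radius=5.0):
    """Greedily match predicted points to ground-truth points.

    A prediction is a true positive if an unmatched ground-truth point lies
    within `radius` pixels. Returns (precision, recall).
    """
    unmatched = list(gts)
    tp = 0
    for p in preds:
        best, best_d = None, radius
        for g in unmatched:
            d = ((p[0] - g[0]) ** 2 + (p[1] - g[1]) ** 2) ** 0.5
            if d <= best_d:
                best, best_d = g, d
        if best is not None:
            unmatched.remove(best)
            tp += 1
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(gts) if gts else 0.0
    return precision, recall

preds = [(10, 10), (50, 50), (90, 90)]   # one spurious detection
gts = [(11, 10), (52, 49)]
print(match_points(preds, gts))           # precision 2/3, recall 1.0
```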


2010, Vol 11 (2), pp. 253-275
Author(s): Justin Sheffield, Eric F. Wood, Francisco Munoz-Arriola

Abstract. The development and evaluation of a long-term high-resolution dataset of potential and actual evapotranspiration for Mexico based on remote sensing data are described. Evapotranspiration is calculated using a modified version of the Penman–Monteith algorithm, with input radiation and meteorological data from the International Satellite Cloud Climatology Project (ISCCP) and vegetation distribution derived from Advanced Very High Resolution Radiometer (AVHRR) products. The ISCCP data are downscaled to ⅛° resolution using statistical relationships with data from the North American Regional Reanalysis (NARR). The final product is available at ⅛°, daily, for 1984–2006 for all of Mexico. Comparisons are made with the NARR offline land surface model and measurements from approximately 1800 pan stations. The remote sensing estimate follows the seasonal cycle and spatial pattern of the comparison datasets well, with a peak in late summer at the height of the North American monsoon and highest values in low-lying and coastal regions. The spatial average over Mexico is biased low by about 0.3 mm day−1, with a monthly RMSE of about 0.5 mm day−1. The underestimation may be related to the lack of a model for canopy evaporation, which is estimated to be up to 30% of total evapotranspiration. Uncertainties in the remote sensing–based estimates (because of input data uncertainties) and in the true value of evapotranspiration (represented by the spread in the comparison datasets) are up to 0.5 and 1.2 mm day−1, respectively. This study is a first step in quantifying the long-term variation in global land evapotranspiration from remote sensing data.
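The Penman–Monteith combination equation at the core of this dataset expresses latent heat flux from radiation, vapour pressure deficit, and two resistances. A minimal sketch of the standard form; the abstract describes a modified version, and all constants and sample inputs below are illustrative assumptions:

```python
def penman_monteith(Rn, G, delta, vpd, rho_a=1.2, cp=1013.0,
                    ra=50.0, rs=70.0, gamma=0.066):
    """Latent heat flux (W m-2) from the Penman-Monteith equation.

    Rn, G  : net radiation and ground heat flux (W m-2)
    delta  : slope of the saturation vapour pressure curve (kPa K-1)
    vpd    : vapour pressure deficit e_s - e_a (kPa)
    rho_a  : air density (kg m-3); cp: specific heat of air (J kg-1 K-1)
    ra, rs : aerodynamic and surface resistances (s m-1)
    gamma  : psychrometric constant (kPa K-1)
    """
    return (delta * (Rn - G) + rho_a * cp * vpd / ra) / \
           (delta + gamma * (1.0 + rs / ra))

# Illustrative mid-day inputs at roughly 20 degrees C.
le = penman_monteith(Rn=400.0, G=40.0, delta=0.145, vpd=1.0)
mm_per_day = le / 2.45e6 * 86400.0   # W m-2 sustained for a day -> mm day-1
print(round(le, 1), round(mm_per_day, 2))
```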

