2D Image-To-3D Model: Knowledge-Based 3D Building Reconstruction (3DBR) Using Single Aerial Images and Convolutional Neural Networks (CNNs)

2019 ◽  
Vol 11 (19) ◽  
pp. 2219 ◽  
Author(s):  
Fatemeh Alidoost ◽  
Hossein Arefi ◽  
Federico Tombari

In this study, a deep learning (DL)-based approach is proposed for the detection and reconstruction of buildings from a single aerial image. The knowledge required to reconstruct the 3D shapes of buildings, including the height data as well as the linear elements of individual roofs, is derived from the RGB image using an optimized multi-scale convolutional–deconvolutional network (MSCDN). The proposed network is composed of two feature extraction levels that first predict coarse features and then automatically refine them. The predicted features include the normalized digital surface models (nDSMs) and the linear elements of roofs in three classes: eave, ridge, and hip lines. The prismatic models of buildings are then generated by analyzing the eave lines, and the parametric models of individual roofs are reconstructed using the predicted ridge and hip lines. The experiments show that, even in the presence of noise in the height values, the proposed method performs well on the 3D reconstruction of buildings with different shapes and complexities. The average root mean square error (RMSE) and normalized median absolute deviation (NMAD) of the predicted nDSM are about 3.43 m and 1.13 m, respectively. Moreover, the quality of the extracted linear elements is about 91.31% and 83.69% for the Potsdam and Zeebrugge test data, respectively. Unlike state-of-the-art methods, the proposed approach does not need any additional or auxiliary data: it employs a single image to reconstruct the 3D models of buildings with competitive precision, with horizontal and vertical RMSEs of about 1.2 m and 0.8 m over the Potsdam data and about 3.9 m and 2.4 m over the Zeebrugge test data.
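As an illustration of the two height-accuracy metrics quoted above, here is a minimal sketch computing RMSE and NMAD between a predicted and a reference nDSM; the function and array names are placeholders, not the authors' evaluation code.

```python
import numpy as np

def ndsm_errors(predicted: np.ndarray, reference: np.ndarray):
    """Return (RMSE, NMAD) in metres over the valid pixels of two nDSM rasters."""
    diff = (predicted - reference).ravel()
    diff = diff[~np.isnan(diff)]                      # ignore no-data pixels
    rmse = np.sqrt(np.mean(diff ** 2))                # sensitive to outliers
    # NMAD = 1.4826 * median(|dh - median(dh)|); robust to gross height errors
    nmad = 1.4826 * np.median(np.abs(diff - np.median(diff)))
    return rmse, nmad
```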

2021 ◽  
Vol 13 (14) ◽  
pp. 2656
Author(s):  
Furong Shi ◽  
Tong Zhang

Deep-learning technologies, especially convolutional neural networks (CNNs), have achieved great success in building extraction from aerial images. However, shape details are often lost during the down-sampling process, which results in discontinuous segmentation or inaccurate segmentation boundaries. In order to compensate for the loss of shape information, two shape-related auxiliary tasks (i.e., boundary prediction and distance estimation) were jointly learned with the building segmentation task in our proposed network. Meanwhile, two consistency-constraint losses were designed on top of the multi-task network to exploit the duality between the mask prediction and the two shape-related predictions. Specifically, an atrous spatial pyramid pooling (ASPP) module was appended to the top of the encoder of a U-shaped network to obtain multi-scale features. Based on the multi-scale features, one regression loss and two classification losses were used for predicting the distance-transform map, the segmentation mask, and the boundary. Two inter-task consistency-loss functions were constructed to ensure the consistency between distance maps and masks, and between masks and boundary maps. Experimental results on three public aerial image data sets showed that our method achieved superior performance over recent state-of-the-art models.
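Below is a hedged PyTorch sketch of such a multi-task objective: one regression loss (distance map), two classification losses (mask and boundary), and two inter-task consistency terms. The consistency formulations are illustrative choices rather than the paper's exact losses, and a signed distance transform (positive inside buildings, negative outside) is assumed.

```python
import torch
import torch.nn.functional as F

def multitask_loss(mask_logits, boundary_logits, dist_pred,
                   mask_gt, boundary_gt, dist_gt, lam=0.1):
    # Main and auxiliary supervised losses.
    seg_loss = F.binary_cross_entropy_with_logits(mask_logits, mask_gt)
    bnd_loss = F.binary_cross_entropy_with_logits(boundary_logits, boundary_gt)
    dist_loss = F.l1_loss(dist_pred, dist_gt)

    # Consistency 1: a sharply thresholded signed distance map should
    # agree with the predicted mask probabilities.
    mask_prob = torch.sigmoid(mask_logits)
    cons_dist_mask = F.l1_loss(torch.sigmoid(10.0 * dist_pred), mask_prob)

    # Consistency 2: the spatial gradient magnitude of the mask should be
    # high exactly where boundaries are predicted.
    dy = (mask_prob[..., 1:, :] - mask_prob[..., :-1, :]).abs()
    dx = (mask_prob[..., :, 1:] - mask_prob[..., :, :-1]).abs()
    edge_from_mask = F.pad(dy, (0, 0, 0, 1)) + F.pad(dx, (0, 1, 0, 0))
    cons_mask_bnd = F.l1_loss(edge_from_mask, torch.sigmoid(boundary_logits))

    return seg_loss + bnd_loss + dist_loss + lam * (cons_dist_mask + cons_mask_bnd)
```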


Author(s):  
F. Alidoost ◽  
H. Arefi ◽  
F. Tombari

Abstract. Automatic detection and extraction of buildings from aerial images are considerable challenges in many applications, including disaster management, navigation, urbanization monitoring, emergency response, 3D city mapping and reconstruction. The most important problem, however, is to precisely localize buildings from single aerial images where no additional information, such as LiDAR point cloud data or high-resolution Digital Surface Models (DSMs), is available. In this paper, a Deep Learning (DL)-based approach is proposed to localize buildings, estimate the relative height information, and extract the buildings’ boundaries using a single aerial image. In order to detect buildings and extract the bounding boxes, a Fully Connected Convolutional Neural Network (FC-CNN) is trained to classify building and non-building objects. We also introduce a novel Multi-Scale Convolutional-Deconvolutional Network (MS-CDN) including skip connection layers to predict normalized DSMs (nDSMs) from a single image. The extracted bounding boxes as well as the predicted nDSMs are then employed by an Active Contour Model (ACM) to provide precise building boundaries. The experiments show that, even with noise in the predicted nDSMs, the proposed method performs well on single aerial images with different building shapes. The quality rate for building detection is about 86% and the RMSE for nDSM prediction is about 4 m. Also, the accuracy of boundary extraction is about 68%. Since the proposed framework is based on a single image, it could be employed for real-time applications.
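As a rough illustration of the boundary-refinement step, the sketch below evolves a morphological Chan–Vese active contour (the ACM variant shipped with scikit-image, used here as a stand-in for the paper's ACM) over an nDSM crop initialized slightly inside a detected bounding box; the function name and box format are illustrative.

```python
import numpy as np
from skimage.segmentation import morphological_chan_vese

def refine_building_boundary(ndsm: np.ndarray, box, n_iter: int = 100):
    """box = (row0, col0, row1, col1) from the detector; returns a binary mask."""
    r0, c0, r1, c1 = box
    crop = ndsm[r0:r1, c0:c1].astype(float)
    crop = (crop - crop.min()) / (np.ptp(crop) + 1e-6)  # normalise heights to [0, 1]
    init = np.zeros_like(crop, dtype=np.int8)
    init[2:-2, 2:-2] = 1                                # start just inside the box
    return morphological_chan_vese(crop, n_iter, init_level_set=init)
```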


Author(s):  
X. Zhuo ◽  
F. Kurz ◽  
P. Reinartz

Manned aircraft have long been used for capturing large-scale aerial images, yet high costs and weather dependence restrict their availability in emergency situations. In recent years, the MAV (Micro Aerial Vehicle) has emerged as a novel modality for aerial image acquisition. Its maneuverability and flexibility enable a rapid awareness of the scene of interest. Since these two platforms deliver scene information at different scales and from different views, it makes sense to fuse these two types of complementary imagery to achieve a quick, accurate and detailed description of the scene, which is the main concern of real-time situation awareness. This paper proposes a method to fuse multi-view and multi-scale aerial imagery by establishing a common reference frame. In particular, common features among MAV images and geo-referenced airplane images can be extracted by a scale-invariant feature detector like SIFT. From the tie points of the geo-referenced images we derive the coordinates of the corresponding ground points, which are then utilized as ground control points in the global bundle adjustment of the MAV images. In this way, the MAV block is aligned to the reference frame. Experimental results show that this method can achieve fully automatic geo-referencing of MAV images even if GPS/IMU acquisition has dropouts, and the orientation accuracy is improved compared to GPS/IMU-based georeferencing. The concept for a subsequent 3D classification method is also described in this paper.
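A minimal OpenCV sketch of the tie-point step reads as follows: SIFT features are matched between an MAV image and a geo-referenced airplane image, and Lowe's ratio test keeps the reliable correspondences. The image paths are placeholders, and the subsequent derivation of ground coordinates is not shown.

```python
import cv2

mav = cv2.imread("mav_image.jpg", cv2.IMREAD_GRAYSCALE)
ref = cv2.imread("georeferenced_airplane_image.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(mav, None)
kp2, des2 = sift.detectAndCompute(ref, None)

matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]  # ratio test

# Each surviving match links an MAV pixel to a geo-referenced pixel whose
# ground coordinates can serve as a control point in the MAV bundle adjustment.
tie_points = [(kp1[m.queryIdx].pt, kp2[m.trainIdx].pt) for m in good]
```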


Author(s):  
A. Tscharf ◽  
M. Rumpler ◽  
F. Fraundorfer ◽  
G. Mayer ◽  
H. Bischof

During the last decades, photogrammetric computer vision systems have become well established in scientific and commercial applications. In particular, the increasing affordability of unmanned aerial vehicles (UAVs), in conjunction with automated multi-view processing pipelines, has resulted in an easy way of acquiring spatial data and creating realistic and accurate 3D models. With multicopter UAVs, it is possible to record highly overlapping images, from near-terrestrial camera positions to oblique and nadir aerial views, owing to their ability to navigate slowly, hover, and capture images at nearly any position. Multicopter UAVs thus bridge the gap between terrestrial and traditional aerial image acquisition and are therefore ideally suited to enable easy and safe data collection and inspection tasks in complex or hazardous environments. In this paper we present a fully automated processing pipeline for precise, metric and geo-accurate 3D reconstructions of complex geometries using various imaging platforms. Our workflow allows for georeferencing of UAV imagery based on GPS measurements of camera stations from an on-board GPS receiver as well as tie and control point information. Ground control points (GCPs) are integrated directly in the bundle adjustment to refine the georegistration and correct for systematic distortions of the image block. We discuss our approach based on three different case studies for applications in mining and archaeology and present several accuracy-related analyses investigating georegistration, camera network configuration and ground sampling distance. Our approach is furthermore suited for seamlessly matching and integrating images from different viewpoints and cameras (aerial and terrestrial as well as inside views) into a single reconstruction. Together with aerial images from a UAV, we are able to enrich 3D models by combining terrestrial images as well as inside views of an object through joint image processing to generate highly detailed, accurate and complete reconstructions.
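The following is a strongly simplified sketch of the idea of integrating GCPs directly in the bundle adjustment: tie-point reprojection residuals and weighted GCP residuals are minimised jointly. Real pipelines model camera rotations, lens distortion and Jacobian sparsity; the four-parameter camera, the rotation-free pinhole projection and all names here are illustrative simplifications.

```python
import numpy as np
from scipy.optimize import least_squares

def project(pts, cams):
    """Toy pinhole projection; cams: (n, 4) = [tx, ty, tz, f], rotation omitted."""
    p = pts - cams[:, :3]
    return cams[:, 3:4] * p[:, :2] / p[:, 2:3]

def residuals(params, n_cams, obs_cam, obs_pt, obs_uv, gcp_idx, gcp_xyz, w_gcp=10.0):
    cams = params[:n_cams * 4].reshape(n_cams, 4)
    pts = params[n_cams * 4:].reshape(-1, 3)
    r_img = (project(pts[obs_pt], cams[obs_cam]) - obs_uv).ravel()  # tie points
    r_gcp = w_gcp * (pts[gcp_idx] - gcp_xyz).ravel()                # GCP constraints
    return np.concatenate([r_img, r_gcp])

# x0 stacks initial camera and point parameters; least_squares refines both:
# solution = least_squares(residuals, x0, args=(n_cams, obs_cam, obs_pt,
#                                               obs_uv, gcp_idx, gcp_xyz))
```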


2020 ◽  
Vol 12 (13) ◽  
pp. 2161 ◽  
Author(s):  
Guang Yang ◽  
Qian Zhang ◽  
Guixu Zhang

Deep learning methods have been used to extract buildings from remote sensing images and have achieved state-of-the-art performance. Most previous work has emphasized the multi-scale fusion of features or the enlargement of receptive fields to obtain global features, rather than focusing on low-level details such as edges. In this work, we propose a novel end-to-end edge-aware network, the EANet, and an edge-aware loss for extracting accurate buildings from aerial images. Specifically, the architecture is composed of image segmentation networks and edge perception networks that, respectively, take charge of building prediction and edge investigation. The International Society for Photogrammetry and Remote Sensing (ISPRS) Potsdam segmentation benchmark and the Wuhan University (WHU) building benchmark were used to evaluate our approach, which achieved 90.19% and 93.33% intersection-over-union on them, respectively, and top performance without using additional datasets, data augmentation, or post-processing. The EANet is effective in extracting buildings from aerial images, which shows that the quality of image segmentation can be improved by focusing on edge details.
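A hedged PyTorch sketch of an edge-aware loss in this spirit is shown below: the ordinary segmentation loss is augmented by up-weighting pixels near ground-truth edges. This is an illustrative formulation, not the paper's exact EANet loss.

```python
import torch.nn.functional as F

def edge_aware_loss(logits, target, edge_weight=5.0):
    """logits, target: (B, 1, H, W); target is a binary building mask."""
    # Ground-truth edge map from the local gradients of the mask.
    dy = (target[..., 1:, :] - target[..., :-1, :]).abs()
    dx = (target[..., :, 1:] - target[..., :, :-1]).abs()
    edges = (F.pad(dy, (0, 0, 0, 1)) + F.pad(dx, (0, 1, 0, 0))).clamp(0, 1)

    per_pixel = F.binary_cross_entropy_with_logits(logits, target, reduction="none")
    weights = 1.0 + edge_weight * edges               # emphasise boundary pixels
    return (weights * per_pixel).mean()
```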


2020 ◽  
Vol 12 (22) ◽  
pp. 3750
Author(s):  
Wei Guo ◽  
Weihong Li ◽  
Zhenghao Li ◽  
Weiguo Gong ◽  
Jinkai Cui ◽  
...  

Object detection is one of the core technologies in aerial image processing and analysis. Although existing deep-learning-based methods for object detection in aerial images have made some progress, some problems remain: (1) most existing methods fail to simultaneously consider the multi-scale and multi-shape characteristics of objects in aerial images, which may lead to missing or false detections; (2) high-precision detection generally requires a large and complex network structure, which makes it difficult to achieve high detection efficiency and to deploy the network on resource-constrained devices for practical applications. To solve these problems, we propose a slimmer network for more efficient object detection in aerial images. First, we design a polymorphic module (PM) for simultaneously learning multi-scale and multi-shape object features, so as to better detect the widely varying objects in aerial images. Then, we design a group attention module (GAM) to better exploit the diverse concatenated features in the network. By combining multiple detection heads with adaptive anchors and the above two modules, we propose a one-stage network called PG-YOLO that achieves higher detection accuracy. Based on the proposed network, we further propose a more efficient channel pruning method, which slims the network from 63.7 million (M) parameters to 3.3 M, a 94.8% reduction, and thus significantly improves detection efficiency for real-time applications. Finally, we conduct comparative experiments on three public aerial datasets; the results show that the proposed method outperforms the state-of-the-art methods.
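The paper's exact pruning criterion is not reproduced here, but the sketch below shows the general family it belongs to: L1-norm structured channel pruning using PyTorch's built-in utilities, which zero out the output channels with the smallest filter norms. A real deployment would then rebuild a physically slimmer network around the surviving channels.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

def prune_conv_channels(model: nn.Module, amount: float = 0.5) -> nn.Module:
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            # Zero out `amount` of the output channels (dim=0), ranked by
            # the L1 norm (n=1) of each filter.
            prune.ln_structured(module, name="weight", amount=amount, n=1, dim=0)
            prune.remove(module, "weight")  # bake the zeroed weights in
    return model
```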


2020 ◽  
Vol 12 (15) ◽  
pp. 2350 ◽  
Author(s):  
Jingjing Ma ◽  
Linlin Wu ◽  
Xu Tang ◽  
Fang Liu ◽  
Xiangrong Zhang ◽  
...  

Semantic segmentation is an important and challenging task in the aerial image community, since it extracts target-level information for understanding aerial images. As a practical application of aerial image semantic segmentation, building extraction attracts constant research attention, as buildings are a key land-cover class in aerial images. There are two key points for building extraction from aerial images. One is learning global and local features to fully describe buildings with diverse shapes. The other is mining multi-scale information to discover buildings at different resolutions. Taking these two key points into account, we propose a new method named the global multi-scale encoder-decoder network (GMEDN) in this paper. Based on the encoder-decoder framework, GMEDN is developed with a local and global encoder and a distilling decoder. The local and global encoder aims at learning representative features from the aerial images for describing the buildings, while the distilling decoder focuses on exploring multi-scale information for the final segmentation masks. Combining them, building extraction is accomplished in an end-to-end manner. The effectiveness of our method is validated by experiments conducted on two public aerial image datasets. Compared with existing methods, our model achieves better performance.
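As a minimal sketch of the "local and global" idea (not the GMEDN architecture itself), the block below fuses local convolutional features with a globally pooled context vector that is broadcast back over the spatial grid.

```python
import torch
import torch.nn as nn

class LocalGlobalBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.local = nn.Conv2d(channels, channels, 3, padding=1)
        self.global_fc = nn.Linear(channels, channels)

    def forward(self, x):
        local = self.local(x)                        # local detail
        ctx = self.global_fc(x.mean(dim=(2, 3)))     # global context vector
        return torch.relu(local + ctx[:, :, None, None])  # fuse and activate
```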


2019 ◽  
Vol 11 (10) ◽  
pp. 1157 ◽  
Author(s):  
Jorge Fuentes-Pacheco ◽  
Juan Torres-Olivares ◽  
Edgar Roman-Rangel ◽  
Salvador Cervantes ◽  
Porfirio Juarez-Lopez ◽  
...  

Crop segmentation is an important task in Precision Agriculture, where the use of aerial robots with an on-board camera has contributed to the development of new solution alternatives. We address the problem of fig plant segmentation in top-view RGB (Red-Green-Blue) images of a crop grown under difficult open-field circumstances: complex lighting conditions and non-ideal crop maintenance practices defined by local farmers. We present a Convolutional Neural Network (CNN) with an encoder-decoder architecture that classifies each pixel as crop or non-crop using only raw colour images as input. Our approach achieves a mean accuracy of 93.85% despite the complexity of the background and the highly variable visual appearance of the leaves. We make our CNN code available to the research community, as well as the aerial image data set and a hand-made, pixel-precise ground-truth segmentation, to facilitate comparison among different algorithms.
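The reported figure is presumably the mean per-class accuracy over the two classes; under that assumption, a minimal sketch of the metric looks as follows, with prediction and ground-truth masks as NumPy arrays.

```python
import numpy as np

def mean_class_accuracy(pred: np.ndarray, gt: np.ndarray) -> float:
    """Mean of per-class accuracies; 0 = non-crop, 1 = crop (assumed labels)."""
    accs = [(pred[gt == cls] == cls).mean() for cls in (0, 1)]
    return float(np.mean(accs))
```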


Author(s):  
Lucas Silva ◽  
Dalson Figueiredo Filho

Abstract We employ the Newcomb–Benford law (NBL) to evaluate the reliability of COVID-19 figures in Brazil. Using official data from February 25 to September 15, we apply a first-digit test to a national aggregate dataset of total cases and cumulative deaths. We find strong evidence that the Brazilian reports do not conform to the NBL's theoretical expectations. These results are robust to different goodness-of-fit measures (chi-square, mean absolute deviation and distortion factor) and data sources (Johns Hopkins University and Our World in Data). Despite the growing appreciation for evidence-based policymaking, which requires valid and reliable data, we show that the Brazilian epidemiological surveillance system fails to provide trustworthy data on the COVID-19 epidemic under the NBL assumption.
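A minimal sketch of the first-digit test reads as follows: the empirical leading-digit distribution of a positive series (e.g., cumulative deaths) is compared with the Benford expectation P(d) = log10(1 + 1/d) via a chi-square test and the mean absolute deviation (MAD); the input array is a placeholder.

```python
import numpy as np
from scipy.stats import chisquare

def first_digit_test(values):
    digits = np.array([int(str(abs(v)).lstrip("0.")[0]) for v in values if v > 0])
    observed = np.array([(digits == d).mean() for d in range(1, 10)])
    benford = np.log10(1 + 1 / np.arange(1, 10))     # P(d) = log10(1 + 1/d)
    chi2, p = chisquare(observed * len(digits), benford * len(digits))
    mad = np.abs(observed - benford).mean()          # MAD against Benford
    return chi2, p, mad
```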


Author(s):  
Raul E. Avelar ◽  
Karen Dixon ◽  
Boniphace Kutela ◽  
Sam Klump ◽  
Beth Wemple ◽  
...  

The calibration of safety performance functions (SPFs) is a mechanism included in the Highway Safety Manual (HSM) to adjust its SPFs for use in intended jurisdictions. Critically, the quality of the calibration procedure must be assessed before using the calibrated SPFs. Multiple resources to aid practitioners in calibrating SPFs have been developed in the years following the publication of the HSM 1st edition. Similarly, the literature suggests multiple ways to assess the goodness-of-fit (GOF) of a calibrated SPF to a data set from a given jurisdiction. This paper uses the calibration results of multiple intersection SPFs on a large Mississippi safety database to examine the relationships among multiple GOF metrics. The goal is to develop a sensible single index that leverages the joint information from multiple GOF metrics to assess the overall quality of a calibration. A factor analysis applied to the calibration results revealed three underlying factors explaining 76% of the variability in the data. From these results, the authors developed an index and performed a sensitivity analysis. The key metrics were found to be, in descending order: the deviation of the cumulative residual (CURE) plot from the 95% confidence area, the mean absolute deviation, the modified R-squared, and the value of the calibration factor. This paper also presents comparisons between the index and alternative scoring strategies, as well as an effort to verify the results using synthetic data. The developed index is recommended for comprehensively assessing the quality of calibrated intersection SPFs.
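Two of the metrics named above are easy to state concretely. Under standard definitions (the HSM calibration factor as the ratio of observed to predicted crash totals, and the CURE plot as cumulative residuals sorted by a covariate with a Hauer–Bamfo confidence envelope), a hedged sketch looks as follows; all names are illustrative.

```python
import numpy as np

def calibration_factor(observed, predicted):
    """HSM calibration factor C = sum(observed) / sum(predicted)."""
    return observed.sum() / predicted.sum()

def cure_plot_values(observed, predicted, covariate):
    """Cumulative residuals sorted by a covariate, with a 95% envelope."""
    order = np.argsort(covariate)                    # sort sites by covariate
    resid = (observed - predicted)[order]
    cure = np.cumsum(resid)                          # cumulative residuals
    sq = np.cumsum(resid ** 2)
    sigma = np.sqrt(sq * (1 - sq / sq[-1]))          # Hauer-Bamfo envelope std
    return cure, 1.96 * sigma, -1.96 * sigma         # curve, upper, lower
```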

