IMG2nDSM: Height Estimation from Single Airborne RGB Images with Deep Learning

Estimating the height of buildings and vegetation in single aerial images is a challenging problem. A task-focused Deep Learning (DL) model that combines architectural features from successful DL models (U-NET and Residual Networks) and learns the mapping from a single aerial imagery to a normalized Digital Surface Model (nDSM) was proposed. The model was trained on aerial images whose corresponding DSM and Digital Terrain Models (DTM) were available and was then used to infer the nDSM of images with no elevation information. The model was evaluated with a dataset covering a large area of Manchester, UK, as well as the 2018 IEEE GRSS Data Fusion Contest LiDAR dataset. The results suggest that the proposed DL architecture is suitable for the task and surpasses other state-of-the-art DL approaches by a large margin.

Download Full-text

Towards Scalable Economic Photovoltaic Potential Analysis Using Aerial Images and Deep Learning

Energies ◽

10.3390/en14133800 ◽

2021 ◽

Vol 14 (13) ◽

pp. 3800

Author(s):

Sebastian Krapf ◽

Nils Kemmerzell ◽

Syed Khawaja Haseeb Khawaja Haseeb Uddin ◽

Manuel Hack Hack Vázquez ◽

Fabian Netzler ◽

...

Keyword(s):

Deep Learning ◽

System Analysis ◽

State Of The Art ◽

Critical Role ◽

Semantic Segmentation ◽

Energy System ◽

Aerial Images ◽

Potential Analysis ◽

3D Data ◽

Challenges And Opportunities

Roof-mounted photovoltaic systems play a critical role in the global transition to renewable energy generation. An analysis of roof photovoltaic potential is an important tool for supporting decision-making and for accelerating new installations. State of the art uses 3D data to conduct potential analyses with high spatial resolution, limiting the study area to places with available 3D data. Recent advances in deep learning allow the required roof information from aerial images to be extracted. Furthermore, most publications consider the technical photovoltaic potential, and only a few publications determine the photovoltaic economic potential. Therefore, this paper extends state of the art by proposing and applying a methodology for scalable economic photovoltaic potential analysis using aerial images and deep learning. Two convolutional neural networks are trained for semantic segmentation of roof segments and superstructures and achieve an Intersection over Union values of 0.84 and 0.64, respectively. We calculated the internal rate of return of each roof segment for 71 buildings in a small study area. A comparison of this paper’s methodology with a 3D-based analysis discusses its benefits and disadvantages. The proposed methodology uses only publicly available data and is potentially scalable to the global level. However, this poses a variety of research challenges and opportunities, which are summarized with a focus on the application of deep learning, economic photovoltaic potential analysis, and energy system analysis.

Download Full-text

Colorisation of archival aerial imagery using deep learning

10.5194/egusphere-egu21-11925 ◽

2021 ◽

Author(s):

Ryusei Ishii ◽

Patrice Carbonneau ◽

Hitoshi Miyamoto

Keyword(s):

Deep Learning ◽

Urban Expansion ◽

Research Work ◽

Fine Tuning ◽

Aerial Images ◽

Model Parameters ◽

Style Transfer ◽

Freeze Layer ◽

Rgb Images ◽

Japanese Rivers

<p>Archival imagery dating back to the mid-twentieth century holds information that pre-dates urban expansion and the worst impacts of climate change.&#160; In this research, we examine deep learning colorisation methods applied to historical aerial images in Japan.&#160; Specifically, we attempt to colorize monochrome images of river basins by applying the method of Neural Style Transfer (NST).&#160;&#160;&#160; First, we created RGB orthomosaics (1m) for reaches of 3 Japanese rivers, the Kurobe, Ishikari, and Kinu rivers.&#160; From the orthomosaics, we extract 60 thousand image tiles of `100 x100` pixels in order to train the CNN used in NST.&#160; The Image tiles were classified into 6 classes: urban, river, forest, tree, grass, and paddy field.&#160; Second, we use the VGG16 model pre-trained on ImageNet data in a transfer learning approach where we freeze a variable number of layers.&#160; We fine-tuned the training epochs, learning rate, and frozen layers in VGG16 in order to derive the optimal CNN used in NST.&#160; The fine tuning resulted in the F-measure accuracy of 0.961, 0.947, and 0.917 for the freeze layer in 7,11,15, respectively.&#160; Third, we colorize monochrome aerial images by the NST with the retrained model weights.&#160; Here used RGB images for 7 Japanese rivers and the corresponding grayscale versions to evaluate the present NST colorization performance.&#160; The RMSE between the RGB and resultant colorized images showed the best performance with the model parameters of lower content layer (6), shallower freeze layer (7), and larger style/content weighting ratio (1.0 x10&#8309;).&#160; The NST hyperparameter analysis indicated that the colorized images became rougher when the content layer selected deeper in the VGG model.&#160; This is because the deeper the layer, the more features were extracted from the original image.&#160; It was also confirmed that the Kurobe and Ishikari rivers indicated higher accuracy in colorisation.&#160; It might come from the fact that the training dataset of the fine tuning was extracted from these river images.&#160; Finally, we colorized historical monochrome images of Kurobe river with the best NST parameters, resulting in quality high enough compared with the RGB images.&#160; The result indicated that the fine tuning of the NST model could achieve high performance to proceed further land cover classification in future research work.</p>

Download Full-text

Photogrammetric surveying forests and woodlands with UAVs: techniques for automatic removal of vegetation and digital terrain model production for hydrological applications

Journal of Unmanned Vehicle Systems ◽

10.1139/juvs-2016-0023 ◽

2019 ◽

Vol 7 (1) ◽

pp. 1-20

Author(s):

Fotis Giagkas ◽

Petros Patias ◽

Charalampos Georgiadis

Keyword(s):

Vegetation Type ◽

Digital Terrain Model ◽

Point Clouds ◽

Aerial Images ◽

Surface Model ◽

Terrain Model ◽

Vegetation Height ◽

Digital Terrain ◽

Dense Point ◽

Height Model

The purpose of this study is the photogrammetric survey of a forested area using unmanned aerial vehicles (UAV), and the estimation of the digital terrain model (DTM) of the area, based on the photogrammetrically produced digital surface model (DSM). Furthermore, through the classification of the height difference between a DSM and a DTM, a vegetation height model is estimated, and a vegetation type map is produced. Finally, the generated DTM was used in a hydrological analysis study to determine its suitability compared to the usage of the DSM. The selected study area was the forest of Seih-Sou (Thessaloniki). The DTM extraction methodology applies classification and filtering of point clouds, and aims to produce a surface model including only terrain points (DTM). The method yielded a DTM that functioned satisfactorily as a basis for the hydrological analysis. Also, by classifying the DSM–DTM difference, a vegetation height model was generated. For the photogrammetric survey, 495 aerial images were used, taken by a UAV from a height of ∼200 m. A total of 44 ground control points were measured with an accuracy of 5 cm. The accuracy of the aerial triangulation was approximately 13 cm. The produced dense point cloud, counted 146 593 725 points.

Download Full-text

SeDAR: Reading Floorplans Like a Human—Using Deep Learning to Enable Human-Inspired Localisation

International Journal of Computer Vision ◽

10.1007/s11263-019-01239-4 ◽

2019 ◽

Vol 128 (5) ◽

pp. 1286-1310 ◽

Cited By ~ 3

Author(s):

Oscar Mendez ◽

Simon Hadfield ◽

Nicolas Pugeault ◽

Richard Bowden

Keyword(s):

Deep Learning ◽

Semantic Information ◽

State Of The Art ◽

Depth Information ◽

Semantic Maps ◽

Novel Method ◽

Rgb Images ◽

High Level ◽

Robotic Tasks ◽

And Robotics

Abstract The use of human-level semantic information to aid robotic tasks has recently become an important area for both Computer Vision and Robotics. This has been enabled by advances in Deep Learning that allow consistent and robust semantic understanding. Leveraging this semantic vision of the world has allowed human-level understanding to naturally emerge from many different approaches. Particularly, the use of semantic information to aid in localisation and reconstruction has been at the forefront of both fields. Like robots, humans also require the ability to localise within a structure. To aid this, humans have designed high-level semantic maps of our structures called floorplans. We are extremely good at localising in them, even with limited access to the depth information used by robots. This is because we focus on the distribution of semantic elements, rather than geometric ones. Evidence of this is that humans are normally able to localise in a floorplan that has not been scaled properly. In order to grant this ability to robots, it is necessary to use localisation approaches that leverage the same semantic information humans use. In this paper, we present a novel method for semantically enabled global localisation. Our approach relies on the semantic labels present in the floorplan. Deep Learning is leveraged to extract semantic labels from RGB images, which are compared to the floorplan for localisation. While our approach is able to use range measurements if available, we demonstrate that they are unnecessary as we can achieve results comparable to state-of-the-art without them.

Download Full-text

BUILDING OUTLINE EXTRACTION FROM AERIAL IMAGERY AND DIGITAL SURFACE MODEL WITH A FRAME FIELD LEARNING FRAMEWORK

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xliii-b2-2021-487-2021 ◽

2021 ◽

Vol XLIII-B2-2021 ◽

pp. 487-493

Author(s):

X. Sun ◽

W. Zhao ◽

R. V. Maretto ◽

C. Persello

Keyword(s):

Deep Learning ◽

Semantic Segmentation ◽

Aerial Images ◽

Surface Model ◽

Digital Surface Model ◽

Frame Field ◽

Learning Framework ◽

Elevation Data ◽

3D Information ◽

Direction Information

Abstract. Deep learning-based semantic segmentation models for building delineation face the challenge of producing precise and regular building outlines. Recently, a building delineation method based on frame field learning was proposed by Girard et al. (2020) to extract regular building footprints as vector polygons directly from aerial RGB images. A fully convolution network (FCN) is trained to learn simultaneously the building mask, contours, and frame field followed by a polygonization method. With the direction information of the building contours stored in the frame field, the polygonization algorithm produces regular outlines accurately detecting edges and corners. This paper investigated the contribution of elevation data from the normalized digital surface model (nDSM) to extract accurate and regular building polygons. The 3D information provided by the nDSM overcomes the aerial images’ limitations and contributes to distinguishing the buildings from the background more accurately. Experiments conducted in Enschede, the Netherlands, demonstrate that the nDSM improves building outlines’ accuracy, resulting in better-aligned building polygons and prevents false positives. The investigated deep learning approach (fusing RGB + nDSM) results in a mean intersection over union (IOU) of 0.70 in the urban area. The baseline method (using RGB only) results in an IOU of 0.58 in the same area. A qualitative analysis of the results shows that the investigated model predicts more precise and regular polygons for large and complex structures.

Download Full-text

Using high‐resolution aerial imagery and deep learning to detect tree spatio-temporal dynamics at the treeline

10.5194/egusphere-egu21-14548 ◽

2021 ◽

Author(s):

Mirela Beloiu ◽

Dimitris Poursanidis ◽

Samuel Hoffmann ◽

Nektarios Chrysoulakis ◽

Carl Beierkuhnlein

Keyword(s):

Deep Learning ◽

High Resolution ◽

Object Detection ◽

Temporal Dynamics ◽

Tree Density ◽

Aerial Imagery ◽

Aerial Images ◽

Remote Areas ◽

Spatio Temporal ◽

Study Sites

<p>Recent advances in deep learning techniques for object detection and the availability of high-resolution images facilitate the analysis of both temporal and spatial vegetation patterns in remote areas. High-resolution satellite imagery has been used successfully to detect trees in small areas with homogeneous rather than heterogeneous forests, in which single tree species have a strong contrast compared to their neighbors and landscape. However, no research to date has detected trees at the treeline in the remote and complex heterogeneous landscape of Greece using deep learning methods. We integrated high-resolution aerial images, climate data, and topographical characteristics to study the treeline dynamic over 70 years in the Samaria National Park on the Mediterranean island of Crete, Greece. We combined mapping techniques with deep learning approaches to detect and analyze spatio-temporal dynamics in treeline position and tree density. We use visual image interpretation to detect single trees on high-resolution aerial imagery from 1945, 2008, and 2015. Using the RGB aerial images from 2008 and 2015 we test a Convolution Neural Networks (CNN)-object detection approach (SSD) and a CNN-based segmentation technique (U-Net). Based on the mapping and deep learning approach, we have not detected a shift in treeline elevation over the last 70 years, despite warming, although tree density has increased. However, we show that CNN approach accurately detects and maps tree position and density at the treeline. We also reveal that the treeline elevation on Crete varies with topography. Treeline elevation decreases from the southern to the northern study sites. We explain these differences between study sites by the long-term interaction between topographical characteristics and meteorological factors. The study highlights the feasibility of using deep learning and high-resolution imagery as a promising technique for monitoring forests in remote areas.</p>

Download Full-text

Multipass hierarchical stereo matching for generation of digital terrain models from aerial images

Machine Vision and Applications ◽

10.1007/s001380050079 ◽

1998 ◽

Vol 10 (5-6) ◽

pp. 280-291 ◽

Cited By ~ 17

Author(s):

Yi-Ping Hung ◽

Chu-Song Chen ◽

Kuan-Chung Hung ◽

Yong-Sheng Chen ◽

Chiou-Shann Fuh

Keyword(s):

Stereo Matching ◽

Aerial Images ◽

Digital Terrain Models ◽

Digital Terrain ◽

Terrain Models

Download Full-text

Effects of Jpeg Compression on the Accuracy of Digital Terrain Models Automatically Derived from Digital Aerial Images

The Photogrammetric Record ◽

10.1111/0031-868x.00187 ◽

2001 ◽

Vol 17 (98) ◽

pp. 331-342 ◽

Cited By ~ 8

Author(s):

Kent W. Lam ◽

Zhilin Li ◽

Xiuxiao Yuan

Keyword(s):

Jpeg Compression ◽

Aerial Images ◽

Digital Terrain Models ◽

Digital Terrain ◽

Terrain Models

Download Full-text

USE OF HISTORICAL AERIAL IMAGES FOR 3D MODELLING OF GLACIERS IN THE PROVINCE OF TRENTO

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xliii-b2-2020-1151-2020 ◽

2020 ◽

Vol XLIII-B2-2020 ◽

pp. 1151-1158

Author(s):

D. Poli ◽

C. Casarotto ◽

M. Strudl ◽

E. Bollmann ◽

K. Moe ◽

...

Keyword(s):

Image Matching ◽

Aerial Images ◽

Visual Interpretation ◽

Data Set ◽

Digital Terrain ◽

Area Of Interest ◽

Terrain Models ◽

Dense Image ◽

Acceptable Quality ◽

The Impact

Abstract. Historical aerial images represent a source of information of great value for glacier monitoring, as they cover the area of interest at a well-defined epoch and allow for visual interpretation and metric analysis. Typically, the aerial images are used to produce orthophotos and manually digitize the perimeters of the glaciers for analysis of the surface extent of the glaciers, while the extraction of height information is more challenging due to data quality and characteristics. This article discusses the potential of historical aerial images for glacier modelling. More specifically, it analyses the impact of their coverage, radiometric- and geometric accuracy, state of preservation and completeness on the photogrammetric workflow. The data set used consists of scans of 300 (analog) aerial images acquired between August and October 1954 by the U.S. Air Force with a Fairchild KF7660 camera over the entire Province of Trento. For the modelling of the glaciers, different techniques such as manual stereoscopic measurement and dense image matching were tested on sample glaciers and the results were analysed in detail. Due to local radiometric saturation in a large part of the glacial surfaces and other disturbances affecting the historical images (e.g. scratches, scanning errors, dark shadows), dense image matching did not produce any valuable results, and stereo plotting could be used only on images (or image parts) with acceptable quality. The derived Digital Terrain Models (DTMs) were compared with a reference DTM obtained with dense image matching from digital aerial images acquired in September 2015 with an UltraCam Eagle sensor, and, for some glaciers, to a DTM obtained with dense image matching from scanned aerial images acquired in September 1983 with a RC30 analog camera. The differences between 1954 and 2015 DTMs showed values up to 70–80 m in height and a behaviour that is confirmed by the models employed by the glaciology experts in Trento.

Download Full-text

Updating Geospatial Data by Creating a High Resolution Digital Surface Model

Journal of Applied Engineering Sciences ◽

10.2478/jaes-2018-0018 ◽

2018 ◽

Vol 8 (2) ◽

pp. 51-58 ◽

Cited By ~ 1

Author(s):

Iuliana Adriana Cuibac Picu

Keyword(s):

Laser Scanning ◽

Smart Cities ◽

Geographic Information ◽

Geospatial Data ◽

Aerial Images ◽

Aerial Photographs ◽

Planning System ◽

Surface Model ◽

Digital Surface Model ◽

Digital Terrain

Abstract Smart Cities are no longer just an aspiration, they are a necessity. For a city to be smart, accurate data collection or improvement the existing ones is needed, also an infrastructure that allows the integration of heterogeneous geographic information and sensor networks at a common technological point. Over the past two decades, laser scanning technology, also known as LiDAR (Light Detection and Ranging), has become a very important measurement method, providing high accuracy data and information on land topography, vegetation, buildings, and so on. Proving to be a great way to create Digital Terrain Models. The digital terrain model is a statistical representation of the terrain surface, including in its dataset the elements on its surface, such as construction or vegetation. The data use in the following article is from the LAKI II project “Services for producing a digital model of land by aerial scanning, aerial photographs and production of new maps and orthophotomaps for approximately 50 000 sqKm in 6 counties: Bihor, Arad, Hunedoara, Alba, Mures, Harghita including the High Risk Flood Zone (the border area with the Republic of Hungary in Arad and Bihor)”, which are obtained through LiDAR technology with a point density of 8 points per square meter. The purpose of this article is to update geospatial data with a higher resolution digital surface model and to demonstrate the differences between a digital surface models obtain by aerial images and one obtain by LiDAR technology. The digital surface model will be included in the existing geographic information system of the city Marghita in Bihor County, and it will be used to help develop studies on land use, transport planning system and geological applications. It could also be used to detect changes over time to archaeological sites, to create countur lines maps, flight simulation programs, or other viewing and modelling applications.

Download Full-text