UNSUPERVISED MULTI-CONSTRAINT DEEP NEURAL NETWORK FOR DENSE IMAGE MATCHING

Author(s):  
W. Yuan ◽  
Z. Fan ◽  
X. Yuan ◽  
J. Gong ◽  
R. Shibasaki

Abstract. Dense image matching is essential to photogrammetry applications, including Digital Surface Model (DSM) generation, three-dimensional (3D) reconstruction, and object detection and recognition. Developing an efficient and robust method for dense image matching remains a technical challenge, because aerial images of large areas exhibit high variations in illumination and ground features. With the development of deep learning technology, deep neural network-based algorithms now outperform traditional methods on a variety of tasks such as object detection, semantic segmentation and stereo matching. The proposed network comprises cost-volume computation, cost-volume aggregation, and disparity prediction. It starts with a pre-trained VGG-16 network as the backbone and uses a nine-layer U-Net architecture for feature map extraction and a correlation layer for cost-volume calculation; a guided-filter-based cost aggregation is then adopted for cost-volume filtering, and finally the soft Argmax function is used for disparity prediction. Experiments conducted on a UAV dataset demonstrated that the proposed method achieved a reprojection-error RMSE (root mean square error) better than 1 pixel in image coordinates and an in-ground positioning accuracy within 2.5 ground sample distances. Comparison experiments on the KITTI 2015 dataset show that the proposed unsupervised method performs comparably even with supervised methods.
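The cost-volume-to-disparity step described above (soft Argmax over candidate disparities) can be sketched as follows. This is an illustrative NumPy version, not the paper's implementation; the (D, H, W) cost-volume layout is an assumption:

```python
import numpy as np

def soft_argmax_disparity(cost_volume):
    """Differentiable disparity prediction from a matching cost volume.

    cost_volume: array of shape (D, H, W) holding matching costs for
    each candidate disparity d = 0..D-1 at every pixel. Lower cost
    means a better match, so costs are negated before the softmax.
    """
    D = cost_volume.shape[0]
    # Softmax over the disparity axis turns costs into a probability
    # distribution per pixel (subtract the max for numerical stability).
    logits = -cost_volume
    logits -= logits.max(axis=0, keepdims=True)
    prob = np.exp(logits)
    prob /= prob.sum(axis=0, keepdims=True)
    # Expected disparity: sum_d d * p(d) -- the "soft" argmax.
    disparities = np.arange(D).reshape(D, 1, 1)
    return (disparities * prob).sum(axis=0)
```

Because the output is an expectation over all disparity levels rather than a hard argmax, it is differentiable with respect to the costs, which is what allows the matching network to be trained end to end.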

Author(s):  
N. Yastikli ◽  
H. Bayraktar ◽  
Z. Erisir

Digital surface models (DSM) are the most popular products for describing the visible surface of the Earth, including all non-terrain objects such as vegetation, forest, and man-made constructions. Airborne light detection and ranging (LiDAR) is the preferred technique for high-resolution DSM generation over local coverage. Automatic generation of high-resolution DSMs is also possible with stereo image matching using aerial images. Image matching algorithms for DSM generation usually rely on feature-based matching: feature points are first extracted, and corresponding features are then searched for in the overlapping images. These algorithms face problems in areas with repetitive patterns, such as urban structures and forest.

Recent innovations in camera technology and image matching algorithms have enabled automatic dense DSM generation for large-scale city and environment modelling. The new pixel-wise matching approaches generate very high-resolution DSMs whose resolution corresponds to the ground sample distance (GSD) of the original images. A number of research institutes and photogrammetric software vendors are currently developing software tools for dense DSM generation from aerial images. This new approach can be used for high-resolution DSM generation for large cities, rural areas and forests, even in nation-wide applications. In this study, the performance of high-resolution DSMs generated by pixel-wise dense image matching was validated in part of Istanbul. The study area includes different land classes, such as open areas, forest and built-up areas, to test the performance of dense image matching in each of them.
The results obtained from this performance validation in the Istanbul test area showed that a high-resolution DSM corresponding to the GSD of the original aerial images can be generated successfully by pixel-wise dense image matching using both commercial and research institutions' software.
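A per-land-class validation of the kind described can be sketched as follows; this is a minimal illustration assuming co-registered height grids and a land-class raster (the function and class names are hypothetical):

```python
import numpy as np

def dsm_rmse_by_class(dsm, reference, class_mask, classes):
    """Per-land-class vertical accuracy of a generated DSM.

    dsm, reference: 2-D height grids of identical shape (metres).
    class_mask: integer grid labelling each cell with a land class.
    classes: mapping of class id -> name, e.g. {0: 'open', 1: 'forest'}.
    Returns {name: rmse} over valid (non-NaN) cells of each class.
    """
    diff = dsm - reference
    result = {}
    for cid, name in classes.items():
        sel = (class_mask == cid) & ~np.isnan(diff)
        result[name] = float(np.sqrt(np.mean(diff[sel] ** 2)))
    return result
```

Splitting the error statistics by land class is what reveals the behaviour reported in the study: matching quality differs markedly between open areas, forest, and built-up areas.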


2019 ◽  
Author(s):  
Peng Sun

With the widespread usage of many different types of sensors in recent years, large amounts of diverse and complex sensor data have been generated and analyzed to extract useful information. This dissertation focuses on two types of data: aerial images and physiological sensor data. Several new methods based on deep learning techniques have been proposed to advance the state of the art in analyzing these data. For aerial images, a new method for designing effective loss functions for training deep neural networks for object detection, called adaptive salience biased loss (ASBL), is proposed. In addition, several state-of-the-art deep neural network models for object detection, including RetinaNet, U-Net, YOLO, etc., have been adapted and modified to achieve improved performance on a new set of real-world aerial images for bird detection. For physiological sensor data, a deep learning method for alcohol usage detection, called Deep ADA, is proposed to improve the automatic detection of alcohol usage (ADA) system, a statistical data analysis pipeline that detects drinking episodes from wearable physiological sensor data collected from real subjects.

Object detection in aerial images remains a challenging problem due to low image resolutions, complex backgrounds, and variations in the sizes and orientations of objects. The new ASBL method trains deep neural network object detectors to achieve improved performance. ASBL can be implemented at the image level (image-based ASBL) or at the anchor level (anchor-based ASBL). The method computes saliency information for the input images and for the anchors generated by the detector, and weights training examples and anchors differently according to their saliency measurements: complex images and difficult targets receive more weight during training. In experiments on two of the largest public benchmark datasets of aerial images, DOTA and NWPU VHR-10, the existing RetinaNet was trained with ASBL to produce a one-stage detector, ASBL-RetinaNet. ASBL-RetinaNet significantly outperformed the original RetinaNet, by 3.61 mAP and 12.5 mAP on the two datasets, respectively, and also outperformed 10 other state-of-the-art object detection methods.

To improve bird detection in aerial images, the Little Birds in Aerial Imagery (LBAI) dataset was created from real-life aerial imagery. LBAI contains various flocks and species of birds that are small in size, ranging from 10 by 10 to 40 by 40 pixels. The dataset was labeled and divided into two subsets, Easy and Hard, based on the complexity of the background. Some of the best deep learning models were applied to and improved on LBAI, including object detection techniques such as YOLOv3, SSD, and RetinaNet, and semantic segmentation techniques such as U-Net and Mask R-CNN. Experimental results show that RetinaNet performed the best overall, outperforming the other models by 1.4 and 4.9 F1 points on the Easy and Hard subsets, respectively.

For physiological sensor data analysis, Deep ADA extracts features from physiological signals and predicts the alcohol usage of real subjects in their daily lives. Features are extracted using convolutional neural networks without any human intervention, and a large amount of unlabeled data is used in an unsupervised manner to improve the quality of the learned features. The method outperformed traditional feature extraction methods by up to 19% higher accuracy.
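The exact form of the ASBL weighting is not given in the abstract; purely as a hedged illustration, an image-level scheme in its spirit could weight per-image losses by a saliency score (the weighting formula and the `alpha` parameter below are assumptions, not the dissertation's definition):

```python
import numpy as np

def saliency_biased_loss(losses, saliency, alpha=1.0):
    """Illustrative image-level weighting in the spirit of ASBL.

    losses:   per-image detection losses, shape (N,).
    saliency: per-image saliency scores in [0, 1]; higher means a more
              complex image / harder targets.
    alpha:    strength of the bias (assumed hyper-parameter).
    Harder examples receive larger weights; weights are renormalised so
    the weighted loss stays on the same scale as the plain mean.
    """
    w = 1.0 + alpha * saliency          # bias toward salient examples
    w = w * len(w) / w.sum()            # renormalise to mean weight 1
    return float(np.mean(w * losses))
```

With uniform saliency this reduces to the ordinary mean loss; when saliency correlates with difficulty, the hard examples dominate the gradient, which is the stated intent of ASBL.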


Author(s):  
Fereshteh S. Bashiri ◽  
Eric LaRose ◽  
Jonathan C. Badger ◽  
Roshan M. D’Souza ◽  
Zeyun Yu ◽  
...  

2020 ◽  
Vol 12 (5) ◽  
pp. 784 ◽  
Author(s):  
Wei Guo ◽  
Weihong Li ◽  
Weiguo Gong ◽  
Jinkai Cui

Multi-scale object detection is a basic challenge in computer vision. Although many advanced methods based on convolutional neural networks have succeeded on natural images, progress on aerial images has been relatively slow, mainly due to the considerable scale variations of objects and the many densely distributed small objects. In this paper, considering that the semantic information of small objects may be weakened or even disappear in the deeper layers of a neural network, we propose a new detection framework called Extended Feature Pyramid Network (EFPN) to strengthen the information extraction ability of the network. In the EFPN, we first design a multi-branched dilated bottleneck (MBDB) module in the lateral connections to capture much more semantic information. Then, we devise an attention pathway for better locating the objects. Finally, an augmented bottom-up pathway is added to make shallow-layer information easier to propagate and to further improve performance. Moreover, we present an adaptive scale training strategy to enable the network to better recognize multi-scale objects, along with a novel clustering method that produces adaptive anchors and helps the network better learn the data's features. Experiments on public aerial datasets indicate that the presented method obtains state-of-the-art performance.
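The abstract does not detail the novel clustering method for adaptive anchors; as a reference point, the standard IoU-distance k-means commonly used for anchor design in one-stage detectors (which such methods typically refine) can be sketched as:

```python
import numpy as np

def iou_wh(boxes, anchors):
    """IoU between boxes and anchors compared by width/height only
    (both centred at the origin). boxes: (N, 2), anchors: (K, 2)."""
    inter = (np.minimum(boxes[:, None, 0], anchors[None, :, 0]) *
             np.minimum(boxes[:, None, 1], anchors[None, :, 1]))
    union = (boxes[:, 0] * boxes[:, 1])[:, None] + \
            (anchors[:, 0] * anchors[:, 1])[None, :] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs with the 1 - IoU distance used
    for anchor design in YOLO-style detectors."""
    rng = np.random.default_rng(seed)
    anchors = boxes[rng.choice(len(boxes), k, replace=False)]
    for _ in range(iters):
        # Assign each box to the anchor it overlaps best.
        assign = np.argmax(iou_wh(boxes, anchors), axis=1)
        new = np.array([boxes[assign == i].mean(axis=0)
                        if np.any(assign == i) else anchors[i]
                        for i in range(k)])
        if np.allclose(new, anchors):
            break
        anchors = new
    return anchors
```

Clustering the training boxes' sizes rather than hand-picking anchor scales is what lets the anchors adapt to the extreme size spread of aerial-image objects.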


Author(s):  
G. Mandlburger

In recent years, the tremendous progress in image processing and camera technology has reactivated interest in photogrammetry-based surface mapping. With the advent of Dense Image Matching (DIM), the derivation of height values on a per-pixel basis became feasible, allowing the derivation of Digital Elevation Models (DEM) with a spatial resolution in the range of the ground sampling distance of the aerial images, which is often below 10 cm today. While mapping topography and vegetation constitutes the primary field of application for image-based surface reconstruction, multi-spectral images also make it possible to see through the water surface to the bottom underneath, provided sufficient water clarity. In this contribution, the feasibility of through-water dense image matching for mapping shallow-water bathymetry using off-the-shelf software is evaluated. In a case study, the SURE software is applied to three different coastal and inland water bodies. After refraction correction, the DIM point clouds and the DEMs derived from them are compared to concurrently acquired laser bathymetry data. The results confirm the general suitability of through-water dense image matching, but sufficient bottom texture and favorable environmental conditions (clear water, calm water surface) are preconditions for achieving accurate results. Water depths of up to 5 m could be mapped with a mean deviation between laser and through-water DIM in the dm range. Image-based water depth estimates, however, become unreliable in the case of turbid or wavy water and poor bottom texture.
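The refraction correction mentioned above can be illustrated with a simplified per-ray model; this sketch assumes a flat, calm water surface and a fixed ray entry point at the surface (n = 1.34 is the usual refractive index for clear water), and is not the correction implemented in SURE:

```python
import math

def refraction_corrected_depth(apparent_depth, incidence_deg, n_water=1.34):
    """First-order refraction correction for through-water matching.

    apparent_depth: depth below the water surface reconstructed by
                    tracing the image ray as a straight line (metres).
    incidence_deg:  angle of the ray to the vertical at the water
                    surface (degrees); 0 = nadir.
    """
    if incidence_deg == 0:
        # Nadir limit: true depth = n * apparent depth.
        return apparent_depth * n_water
    theta_air = math.radians(incidence_deg)
    theta_water = math.asin(math.sin(theta_air) / n_water)  # Snell's law
    # The refracted ray steepens under water, so the true bottom lies
    # deeper than the straight-line (apparent) intersection.
    return apparent_depth * math.tan(theta_air) / math.tan(theta_water)
```

The uncorrected depths are systematically too shallow by roughly a factor of the refractive index, which is why the correction must be applied before comparing DIM bathymetry with laser data.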


Author(s):  
Y. Q. Dong ◽  
L. Zhang ◽  
X. M. Cui ◽  
H. B. Ai

Although many filter algorithms have been presented over the past decades, they are usually designed for LiDAR point clouds and cannot completely separate ground points from DIM (dense image matching) point clouds derived from oblique aerial images, owing to the high density and variation of DIM point clouds. To solve this problem, a new automatic filter algorithm based on adaptive TIN models is developed. First, the differences between LiDAR and DIM point clouds that influence the filtering results are analysed in this paper. To avoid, during the search for seed points, the influence of plants, which DIM point clouds cannot penetrate, the algorithm uses building facades to obtain ground points located on roads as seed points and constructs the initial TIN. Then a new densification strategy is applied to address the problem that, in other methods, the densification thresholds do not change between iterations. Finally, DIM point clouds of Potsdam produced by PhotoScan are used to evaluate the proposed method. The experimental results show that the method can not only completely separate ground points from DIM point clouds but also obtains better filtering results than TerraSolid.
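A single densification step of TIN-based filtering can be sketched as follows; this is a simplified illustration with a fixed distance threshold, whereas the method described uses adaptive thresholds (all names are illustrative):

```python
import numpy as np

def point_to_plane_distance(tri, p):
    """Perpendicular distance from point p to the plane of triangle
    tri (3x3 array of xyz vertices)."""
    n = np.cross(tri[1] - tri[0], tri[2] - tri[0])
    n = n / np.linalg.norm(n)
    return abs(np.dot(p - tri[0], n))

def densify_tin(tri, candidates, dist_thresh):
    """One densification step of TIN-based ground filtering: candidate
    points whose distance to the current ground facet is below the
    threshold are accepted as ground and would be inserted into the
    TIN before the next iteration."""
    return [p for p in candidates
            if point_to_plane_distance(tri, p) <= dist_thresh]
```

Iterating this acceptance test while rebuilding the TIN lets the ground surface grow outward from the seed points, and tightening or relaxing the threshold per iteration is where the adaptive strategy comes in.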


Author(s):  
Z. Kurczynski ◽  
K. Bakuła ◽  
M. Karabin ◽  
M. Kowalczyk ◽  
J. S. Markiewicz ◽  
...  

Updating the cadastre requires much work by surveying companies in countries that have still not solved the problem of updating cadastral data. In terms of the required precision, these works are among the most accurate. This raises the question: to what extent may modern digital photogrammetric methods be useful in this process? The capabilities of photogrammetry have increased significantly since the introduction of digital aerial cameras and digital technologies. For the registration of cadastral objects, i.e., land parcels' boundaries and the outlines of buildings, very high-resolution aerial photographs can be used. The paper describes an attempt to use an alternative source of data for this task: images acquired from UAS platforms. Multivariate mapping of cadastral parcels was implemented to determine the extent to which low-altitude photos are suitable for the cadastre. In this study, images with a GSD of 3 cm were collected from a UAS over an area of a few square kilometres. Bundle adjustment of these data was processed with sub-pixel accuracy. This allowed photogrammetric measurements to be carried out and an orthophotomap to be provided (orthorectified with a digital surface model from dense image matching of the UAS images). Geometric data related to buildings were collected with two methods: stereoscopic and multi-photo measurements. Data related to parcels' boundaries were measured by monoplotting on an orthophotomap from the low-altitude images. Field surveying data were used as reference. The paper shows the potential and limits of the use of UAS in the process of updating cadastral data. It also gives recommendations for performing photogrammetric missions and presents the achievable accuracy of this type of work.

