scholarly journals EFFICIENT SURFACE-AWARE SEMI-GLOBAL MATCHING WITH MULTI-VIEW PLANE-SWEEP SAMPLING

Author(s):  
B. Ruf ◽  
T. Pollok ◽  
M. Weinmann

<p><strong>Abstract.</strong> Online augmentation of an oblique aerial image sequence with structural information is an essential aspect in the process of 3D scene interpretation and analysis. One key aspect in this is the efficient dense image matching and depth estimation. Here, the Semi-Global Matching (SGM) approach has proven to be one of the most widely used algorithms for efficient depth estimation, providing a good trade-off between accuracy and computational complexity. However, SGM only models a first-order smoothness assumption, thus favoring fronto-parallel surfaces. In this work, we present a hierarchical algorithm that allows for efficient depth and normal map estimation together with confidence measures for each estimate. Our algorithm relies on a plane-sweep multi-image matching followed by an extended SGM optimization that allows to incorporate local surface orientations, thus achieving more consistent and accurate estimates in areasmade up of slanted surfaces, inherent to oblique aerial imagery. We evaluate numerous configurations of our algorithm on two different datasets using an absolute and relative accuracy measure. In our evaluation, we show that the results of our approach are comparable to the ones achieved by refined Structure-from-Motion (SfM) pipelines, such as COLMAP, which are designed for offline processing. In contrast, however, our approach only considers a confined image bundle of an input sequence, thus allowing to perform an online and incremental computation at 1Hz&amp;ndash;2Hz.</p>

Author(s):  
F. Bethmann ◽  
T. Luhmann

Semi-Global Matching (SGM) is a widespread algorithm for image matching which is used for very different applications, ranging from real-time applications (e.g. for generating 3D data for driver assistance systems) to aerial image matching. Originally developed for stereo-image matching, several extensions have been proposed to use more than two images within the matching process (multi-baseline matching, multi-view stereo). These extensions still perform the image matching in (rectified) stereo images and combine the pairwise results afterwards to create the final solution. This paper proposes an alternative approach which is suitable for the introduction of an arbitrary number of images into the matching process and utilizes image matching by using non-rectified images. The new method differs from the original SGM method mainly in two aspects: Firstly, the cost calculation is formulated in object space within a dense voxel raster by using the grey (or colour) values of all images instead of pairwise cost calculation in image space. Secondly, the semi-global (path-wise) minimization process is transferred into object space as well, so that the result of semi-global optimization leads to index maps (instead of disparity maps) which directly indicate the 3D positions of the best matches. Altogether, this yields to an essential simplification of the matching process compared to multi-view stereo (MVS) approaches. After a description of the new method, results achieved from two different datasets (close-range and aerial) are presented and discussed.


Author(s):  
F. Bethmann ◽  
T. Luhmann

Semi-Global Matching (SGM) is a widespread algorithm for image matching which is used for very different applications, reaching from real-time applications (e.g. for generating 3D-data for driver assistance systems) to aerial image matching. Originally developed for stereo-image matching, several extensions have been proposed to use more than two images within the matching process (multibaseline matching, multi-view stereo). Most of these extensions still perform the image matching in (rectified) stereo images and combine the pairwise results afterwards to create the final solution. This paper proposes an alternative approach which is suitable for the introduction of an arbitrary number of images into the matching process and utilizes image matching by using non-rectified images within a closed solution. The proposed approach differs from the original SGM method in two major aspects: Firstly, the cost calculation is formulated in object space within a dense voxel raster by using the grey- (or colour-) values of all images instead of pairwise cost calculation in image space. Secondly, the semi-global (path-wise) minimization process is transferred into object space as well, so that the result of semi-global optimization leads to index-maps (instead of disparity maps) which directly indicate the 3D positions of the best matches. The paper provides a detailed description of the approach and it discusses its advantages and disadvantages. Further on, first results and accuracy analysis are presented.


2021 ◽  
Vol 8 ◽  
Author(s):  
Qi Zhao ◽  
Ziqiang Zheng ◽  
Huimin Zeng ◽  
Zhibin Yu ◽  
Haiyong Zheng ◽  
...  

Underwater depth prediction plays an important role in underwater vision research. Because of the complex underwater environment, it is extremely difficult and expensive to obtain underwater datasets with reliable depth annotation. Thus, underwater depth map estimation with a data-driven manner is still a challenging task. To tackle this problem, we propose an end-to-end system including two different modules for underwater image synthesis and underwater depth map estimation, respectively. The former module aims to translate the hazy in-air RGB-D images to multi-style realistic synthetic underwater images while retaining the objects and the structural information of the input images. Then we construct a semi-real RGB-D underwater dataset using the synthesized underwater images and the original corresponding depth maps. We conduct supervised learning to perform depth estimation through the pseudo paired underwater RGB-D images. Comprehensive experiments have demonstrated that the proposed method can generate multiple realistic underwater images with high fidelity, which can be applied to enhance the performance of monocular underwater image depth estimation. Furthermore, the trained depth estimation model can be applied to real underwater image depth map estimation. We will release our codes and experimental setting in https://github.com/ZHAOQIII/UW_depth.


Author(s):  
L. Madhuanand ◽  
F. Nex ◽  
M. Y. Yang

Abstract. Depth is an essential component for various scene understanding tasks and for reconstructing the 3D geometry of the scene. Estimating depth from stereo images requires multiple views of the same scene to be captured which is often not possible when exploring new environments with a UAV. To overcome this monocular depth estimation has been a topic of interest with the recent advancements in computer vision and deep learning techniques. This research has been widely focused on indoor scenes or outdoor scenes captured at ground level. Single image depth estimation from aerial images has been limited due to additional complexities arising from increased camera distance, wider area coverage with lots of occlusions. A new aerial image dataset is prepared specifically for this purpose combining Unmanned Aerial Vehicles (UAV) images covering different regions, features and point of views. The single image depth estimation is based on image reconstruction techniques which uses stereo images for learning to estimate depth from single images. Among the various available models for ground-level single image depth estimation, two models, 1) a Convolutional Neural Network (CNN) and 2) a Generative Adversarial model (GAN) are used to learn depth from aerial images from UAVs. These models generate pixel-wise disparity images which could be converted into depth information. The generated disparity maps from these models are evaluated for its internal quality using various error metrics. The results show higher disparity ranges with smoother images generated by CNN model and sharper images with lesser disparity range generated by GAN model. The produced disparity images are converted to depth information and compared with point clouds obtained using Pix4D. It is found that the CNN model performs better than GAN and produces depth similar to that of Pix4D. This comparison helps in streamlining the efforts to produce depth from a single aerial image.


Author(s):  
Han Hu ◽  
Chongtai Chen ◽  
Bo Wu ◽  
Xiaoxia Yang ◽  
Qing Zhu ◽  
...  

Textureless and geometric discontinuities are major problems in state-of-the-art dense image matching methods, as they can cause visually significant noise and the loss of sharp features. Binary census transform is one of the best matching cost methods but in textureless areas, where the intensity values are similar, it suffers from small random noises. Global optimization for disparity computation is inherently sensitive to parameter tuning in complex urban scenes, and must compromise between smoothness and discontinuities. The aim of this study is to provide a method to overcome these issues in dense image matching, by extending the industry proven Semi-Global Matching through 1) developing a ternary census transform, which takes three outputs in a single order comparison and encodes the results in two bits rather than one, and also 2) by using texture-information to self-tune the parameters, which both preserves sharp edges and enforces smoothness when necessary. Experimental results using various datasets from different platforms have shown that the visual qualities of the triangulated point clouds in urban areas can be largely improved by these proposed methods.


Author(s):  
E. Karkalou ◽  
C. Stentoumis ◽  
G. Karras

The demand for 3D models of various scales and precisions is strong for a wide range of applications, among which cultural heritage recording is particularly important and challenging. In this context, dense image matching is a fundamental task for processes which involve image-based reconstruction of 3D models. Despite the existence of commercial software, the need for complete and accurate results under different conditions, as well as for computational efficiency under a variety of hardware, has kept image-matching algorithms as one of the most active research topics. Semi-global matching (SGM) is among the most popular optimization algorithms due to its accuracy, computational efficiency, and simplicity. A challenging aspect in SGM implementation is the determination of smoothness constraints, i.e. penalties P1, P2 for disparity changes and discontinuities. In fact, penalty adjustment is needed for every particular stereo-pair and cost computation. In this work, a novel formulation of <i>self-adjusting penalties</i> is proposed: <i>SGM penalties can be estimated solely from the statistical properties of the initial disparity space image</i>. The proposed method of self-adjusting penalties (SGM-SAP) is evaluated using typical cost functions on stereo-pairs from the recent Middlebury dataset of interior scenes, as well as from the EPFL Herz-Jesu architectural scenes. Results are competitive against the original SGM estimates. The significant aspects of self-adjusting penalties are: (i) the time-consuming tuning process is avoided; (ii) SGM can be used in image collections with limited number of stereo-pairs; and (iii) no heuristic user intervention is needed.


Sign in / Sign up

Export Citation Format

Share Document