A comparison of semiglobal and local dense matching algorithms for surface reconstruction

Author(s):  
E. Dall'Asta ◽  
R. Roncella

Encouraged by the growing interest in automatic 3D image-based reconstruction, the development and improvement of robust stereo matching techniques has been one of the most investigated research topics of recent years in photogrammetry and computer vision. The paper focuses on the comparison of some stereo matching algorithms (local and global) which are very popular in both photogrammetry and computer vision. In particular, Semi-Global Matching (SGM), which performs pixel-wise matching and relies on the application of consistency constraints during the matching cost aggregation, will be discussed. The results of some tests performed on real and simulated stereo image datasets, evaluating in particular the accuracy of the obtained digital surface models, will be presented. Several algorithms and different implementations are considered in the comparison, using freeware software codes like MICMAC and OpenCV, commercial software (e.g. Agisoft PhotoScan) and proprietary codes implementing Least Squares and Semi-Global Matching algorithms. The comparisons will also consider the completeness and the level of detail within fine structures, and the reliability and repeatability of the obtainable data.

Author(s):  
C. de Franchis ◽  
E. Meinhardt-Llopis ◽  
J. Michel ◽  
J.-M. Morel ◽  
G. Facciolo

The increasing availability of high resolution stereo images from Earth observation satellites has boosted the development of tools for producing 3D elevation models. The objective of these tools is to produce digital elevation models of very large areas with minimal human intervention. The development of these tools has been shaped by the constraints of the remote sensing acquisition, for example, using ad hoc stereo matching tools to deal with the pushbroom image geometry. However, this specialization has also created a gap with respect to the fields of computer vision and image processing, where these constraints are usually factored out. In this work we propose a fully automatic and modular stereo pipeline to produce digital elevation models from satellite images. The aim of this new pipeline, called Satellite Stereo Pipeline and abbreviated as s2p, is to use (and test) off-the-shelf computer vision tools while abstracting from the complexity associated with satellite imaging. To this aim, images are cut into small tiles for which we proved that the pushbroom geometry is very accurately approximated by the pinhole model. These tiles are then processed with standard stereo image rectification and stereo matching tools. The specifics of satellite imaging such as pointing accuracy refinement, estimation of the initial elevation from SRTM data, and geodetic coordinate systems are handled transparently by s2p. We demonstrate the robustness of our approach on a large database of satellite images and by providing an online demo of s2p.


Author(s):  
Y. Zhou ◽  
Y. Song ◽  
J. Lu

Semi-global matching (SGM) performs dynamic programming by treating the different path directions equally. It does not consider the impact of different path directions on cost aggregation, and as the disparity search range expands, the accuracy and efficiency of the algorithm decrease drastically. This paper presents a dense matching algorithm that integrates SIFT and SGM. It takes the pairs successfully matched by SIFT as control points to direct the path in dynamic programming and to truncate error propagation. In addition, matching accuracy can be improved by using the gradient direction of the detected feature points to modify the weights of the paths in different directions. The experimental results based on Middlebury stereo data sets and CE-3 lunar data sets demonstrate that the proposed algorithm can effectively cut off the error propagation, reduce the disparity search range and improve matching accuracy.


2020 ◽  
Vol 12 (7) ◽  
pp. 1069
Author(s):  
Yuanxin Xia ◽  
Pablo d’Angelo ◽  
Jiaojiao Tian ◽  
Friedrich Fraundorfer ◽  
Peter Reinartz

Semi-Global Matching (SGM) approximates a 2D Markov Random Field (MRF) via multiple 1D scanline optimizations, which serves as a good trade-off between accuracy and efficiency in dense matching. Nevertheless, the performance is limited due to the simple summation of the aggregated costs from all 1D scanline optimizations for the final disparity estimation. SGM-Forest improves the performance of SGM by training a random forest to predict the best scanline according to each scanline’s disparity proposal. The disparity estimated by the best scanline acts as a reference to adaptively adopt close proposals for further post-processing. However, in many cases more than one scanline is capable of providing a good prediction. Training the random forest with only one scanline labeled may limit or even confuse the learning procedure when other scanlines can offer similar contributions. In this paper, we propose a multi-label classification strategy to further improve SGM-Forest. Each training sample is allowed to be described by multiple labels (or no label) if more than one scanline (or none) gives a proper prediction. We test the proposed method on stereo matching datasets from Middlebury, ETH3D, the EuroSDR image matching benchmark, and the 2019 IEEE GRSS data fusion contest. The results indicate that, under the framework of SGM-Forest, the multi-label strategy consistently outperforms the single-label scheme.
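The 1D scanline optimization and the simple summation described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses the commonly cited SGM recurrence with a small penalty for one-step disparity changes and a larger one for jumps, it runs only two path directions on a single image row, and the function names and the penalty values `p1` and `p2` are assumptions chosen for the example.

```python
def aggregate_scanline(cost, p1=1.0, p2=4.0):
    """Aggregate matching costs along one left-to-right scanline.

    cost[x][d] is the pixel-wise matching cost of disparity d at pixel x.
    p1 penalises a disparity change of one level, p2 any larger jump
    (both values are illustrative assumptions).
    """
    n_disp = len(cost[0])
    aggregated = [list(cost[0])]  # the first pixel has no predecessor
    for x in range(1, len(cost)):
        prev = aggregated[-1]
        prev_min = min(prev)
        row = []
        for d in range(n_disp):
            candidates = [prev[d], prev_min + p2]
            if d > 0:
                candidates.append(prev[d - 1] + p1)
            if d < n_disp - 1:
                candidates.append(prev[d + 1] + p1)
            # Subtracting prev_min keeps the values bounded along the path.
            row.append(cost[x][d] + min(candidates) - prev_min)
        aggregated.append(row)
    return aggregated


def sum_paths(cost, p1=1.0, p2=4.0):
    """Sum the aggregated costs of the forward and backward scanlines."""
    forward = aggregate_scanline(cost, p1, p2)
    backward = aggregate_scanline(cost[::-1], p1, p2)[::-1]
    return [[f + b for f, b in zip(f_row, b_row)]
            for f_row, b_row in zip(forward, backward)]


def winner_takes_all(summed):
    """Pick, for every pixel, the disparity with the lowest summed cost."""
    return [min(range(len(row)), key=row.__getitem__) for row in summed]


# Three pixels, three disparity levels; the middle pixel is ambiguous.
cost = [[5, 0, 5], [3, 3, 3], [5, 0, 5]]
print(winner_takes_all(sum_paths(cost)))
```

In this toy row the middle pixel has identical raw costs for every disparity, and the smoothness penalties let its neighbours settle it. A full SGM implementation would run such paths in 8 or 16 directions over the whole image before summing.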


Author(s):  
Y. Xia ◽  
J. Tian ◽  
P. d’Angelo ◽  
P. Reinartz

3D reconstruction of plants is hard to implement, as the complex leaf distribution greatly increases the difficulty of dense matching. Semi-Global Matching has been successfully applied to recover the depth information of a scene, but may perform variably when different matching cost algorithms are used. In this paper two matching cost computation algorithms, Census transform and an algorithm using a convolutional neural network, are tested for plant reconstruction based on Semi-Global Matching. High resolution close-range photogrammetric images from a handheld camera are used for the experiment. The disparity maps generated with the two selected matching cost methods are comparable and of acceptable quality, which shows the good performance of Census and the potential of neural networks to improve dense matching.
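As a rough illustration of the Census-based matching cost mentioned in this abstract, the sketch below encodes each pixel by comparing it with its neighbours and scores two pixels by the Hamming distance between their codes. The window radius, the border clamping and the function names are assumptions made for the example, not details taken from the paper.

```python
def census_transform(image, radius=1):
    """Encode each pixel as an integer bit string.

    A bit is 1 where the neighbour is darker than the centre pixel.
    Border pixels reuse the nearest valid neighbour (an assumed choice).
    """
    height, width = len(image), len(image[0])
    codes = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            code = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny = min(max(y + dy, 0), height - 1)
                    nx = min(max(x + dx, 0), width - 1)
                    code = (code << 1) | int(image[ny][nx] < image[y][x])
            codes[y][x] = code
    return codes


def hamming_cost(code_a, code_b):
    """Matching cost: the number of bits in which two census codes differ."""
    return bin(code_a ^ code_b).count("1")


image = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
brighter = [[2 * value + 5 for value in row] for row in image]
# The codes depend only on intensity ordering, so a gain and offset change
# between the two images leaves them untouched.
print(census_transform(image) == census_transform(brighter))
```

Because only the ordering of intensities enters the code, the cost tolerates radiometric differences between the two images, which is one common reason for choosing Census as the input cost of SGM.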


Author(s):  
M. Shahbazi ◽  
G. Sohn ◽  
J. Théau ◽  
P. Ménard

Dense stereo matching is one of the fundamental and active areas of photogrammetry. The increasing image resolution of digital cameras as well as the growing interest in unconventional imaging, e.g. unmanned aerial imagery, has exposed stereo image pairs to serious occlusion, noise and matching ambiguity. This has also resulted in an increase in the range of disparity values that should be considered for matching. Therefore, conventional methods of dense matching need to be revised to achieve higher levels of efficiency and accuracy. In this paper, we present an algorithm that uses the concepts of intrinsic curves to propose sparse disparity hypotheses for each pixel. Then, the hypotheses are propagated to adjoining pixels by label-set enlargement based on the proximity in the space of intrinsic curves. The same concepts are applied to model occlusions explicitly via a regularization term in the energy function. Finally, a global optimization stage is performed using belief-propagation to assign one of the disparity hypotheses to each pixel. By searching only through a small fraction of the whole disparity search space and handling occlusions and ambiguities, the proposed framework could achieve high levels of accuracy and efficiency.


2020 ◽  
Vol 12 (5) ◽  
pp. 870 ◽  
Author(s):  
Wenhuan Yang ◽  
Xin Li ◽  
Bo Yang ◽  
Yu Fu

Dense image matching has become one of the most widely used means of DSM generation due to its good performance in both accuracy and efficiency. However, for water areas, one of the most common ground objects, accurate disparity estimation remains a challenge even for excellent dense image matching methods, as represented by semi-global matching (SGM), because of the poor texture. For this reason, a great deal of manual editing is usually inevitable before practical applications. The main cause is the lack of uniqueness of the matching primitives, with fixed size and shape, used by those methods. In this paper, we propose a novel DSM generation method, namely semi-global and block matching (SGBM), to achieve accurate disparity and height estimation in water areas by adaptive block matching instead of pixel matching. First, the water blocks are extracted by seed point growth, and an adaptive block matching strategy considering geometrical deformations, called end-block matching (EBM), is adopted to achieve accurate disparity estimation. Then, the disparity of all other pixels beyond these water blocks is obtained by SGM. Last, the median height of all pixels within the same block is selected as the final height for this block after forward intersection. Experiments are conducted on ZiYuan-3 (ZY-3) stereo images, and the results show that the DSM generated by our method in water areas has high accuracy and visual quality.


Author(s):  
K. Heinrich ◽  
M. Mehltretter

In recent years, the ability to assess the uncertainty of depth estimates in the context of dense stereo matching has received increased attention due to its potential to detect erroneous estimates. In particular, the introduction of deep learning approaches greatly improved general performance, with feature extraction from multiple modalities proving to be highly advantageous due to the unique and different characteristics of each modality. However, most work in the literature focuses on using only mono-, bi- or, rarely, tri-modal input, not considering the potential effectiveness of going beyond tri-modality. To further advance the idea of combining different types of features for confidence estimation, in this work, a CNN-based approach is proposed, exploiting uncertainty cues from up to four modalities. For this purpose, a state-of-the-art local-global approach is used as baseline and extended accordingly. Additionally, a novel disparity-based modality named warped difference is presented to support uncertainty estimation at common failure cases of dense stereo matching. The general validity and improved performance of the proposed approach are demonstrated and compared against the bi-modal baseline in an evaluation on three datasets using two common dense stereo matching techniques.


Author(s):  
J. Liu ◽  
S. Ji ◽  
C. Zhang ◽  
Z. Qin

Dense stereo matching has been extensively studied in photogrammetry and computer vision. In this paper we evaluate the application of deep learning based stereo methods, which emerged in 2016 and spread rapidly, to aerial stereo images rather than the ground images that are commonly used in the computer vision community. Two popular methods are evaluated. One learns the matching cost with a convolutional neural network (known as MC-CNN); the other produces a disparity map in an end-to-end manner by utilizing both geometry and context (known as GC-net). First, we evaluate the performance of the deep learning based methods for aerial stereo images by direct model reuse. The models, pre-trained separately on the KITTI 2012, KITTI 2015 and Driving datasets, are directly applied to three aerial datasets. We also give the results of direct training on the target aerial datasets. Second, the deep learning based methods are compared to the classic stereo matching method, Semi-Global Matching (SGM), and a photogrammetric software, SURE, on the same aerial datasets. Third, a transfer learning strategy is introduced to aerial image matching based on the assumption that a few target samples are available for model fine-tuning. The experiments showed that the conventional methods and the deep learning based methods performed similarly, and that the latter had greater potential to be explored.


2021 ◽  
Vol 13 (2) ◽  
pp. 274
Author(s):  
Guobiao Yao ◽  
Alper Yilmaz ◽  
Li Zhang ◽  
Fei Meng ◽  
Haibin Ai ◽  
...  

The available stereo matching algorithms produce a large number of false positive matches or only a few true positives across oblique stereo images with a large baseline. This undesired result is due to the complex perspective deformation and radiometric distortion across the images. To address this problem, we propose a novel affine invariant feature matching algorithm with subpixel accuracy based on an end-to-end convolutional neural network (CNN). In our method, we adopt and modify a Hessian affine network, which we refer to as IHesAffNet, to obtain affine invariant Hessian regions using a deep learning framework. To improve the correlation between corresponding features, we introduce an empirical weighted loss function (EWLF) based on the negative samples using K nearest neighbors, and then generate deep learning-based descriptors with high discrimination, realized with our multiple hard network structure (MTHardNets). Following this step, the conjugate features are produced by using the Euclidean distance ratio as the matching metric, and the accuracy of the matches is optimized through the deep learning transform based least square matching (DLT-LSM). Finally, experiments on large-baseline oblique stereo images acquired by ground close-range and unmanned aerial vehicle (UAV) platforms verify the effectiveness of the proposed approach, and comprehensive comparisons demonstrate that our matching algorithm outperforms the state-of-the-art methods in terms of accuracy, distribution and correct ratio. The main contributions of this article are: (i) our proposed MTHardNets can generate high quality descriptors; and (ii) the IHesAffNet can produce substantial affine invariant corresponding features with reliable transform parameters.
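The Euclidean distance ratio used here as the matching metric can be illustrated with a small nearest-neighbour sketch. It assumes descriptors are plain numeric tuples and uses a ratio threshold of 0.8, a value commonly seen in feature matching but not stated in the abstract. The function name is hypothetical.

```python
import math


def ratio_test_matches(descriptors_a, descriptors_b, max_ratio=0.8):
    """Match descriptors by nearest neighbour, keeping only distinctive matches.

    A match (i, j) is kept when the closest descriptor in descriptors_b is
    clearly closer than the second closest one (max_ratio is an assumed value).
    """
    matches = []
    for i, desc_a in enumerate(descriptors_a):
        distances = sorted(
            (math.dist(desc_a, desc_b), j)
            for j, desc_b in enumerate(descriptors_b)
        )
        if len(distances) < 2:
            continue  # the ratio needs at least two candidates
        (best, j), (second, _) = distances[0], distances[1]
        if best < max_ratio * second:
            matches.append((i, j))
    return matches


left = [(0, 0), (10, 10)]
right = [(1, 0), (10, 12), (12, 10), (50, 50)]
print(ratio_test_matches(left, right))
```

The first descriptor has one clearly closest candidate and is kept; the second has two equally close candidates, so the ratio check rejects it as ambiguous.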

