An Efficient Stereo Matching Network Using Sequential Feature Fusion

Electronics ◽  
2021 ◽  
Vol 10 (9) ◽  
pp. 1045
Author(s):  
Jaecheol Jeong ◽  
Suyeon Jeon ◽  
Yong Seok Heo

Recent stereo matching networks adopt 4D cost volumes and 3D convolutions for processing those volumes. Although these methods show good performance in terms of accuracy, they have an inherent disadvantage in that they require a great deal of computing resources and memory. These requirements limit their applications in mobile environments, which are subject to inherent computing hardware constraints. Both accuracy and consumption of computing resources are important, and improving both at the same time is a non-trivial task. To deal with this problem, we propose a simple yet efficient network, called Sequential Feature Fusion Network (SFFNet), which sequentially generates and processes the cost volume using only 2D convolutions. The main building block of our network is a Sequential Feature Fusion (SFF) module, which generates 3D cost volumes to cover a part of the disparity range by shifting and concatenating the target features, and processes the cost volume using 2D convolutions. A series of SFF modules in our SFFNet is designed to gradually cover the full disparity range. Our method avoids heavy computations and allows for efficient generation of an accurate final disparity map. Various experiments show that our method has an advantage in terms of accuracy versus efficiency compared to other networks.
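The "shifting and concatenating" step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the channels x height x width list layout, the zero padding, and the choice of the left image as reference (so target features are shifted to the right) are all assumptions made for illustration.

```python
from typing import Iterable, List

Row = List[float]
FeatureMap = List[List[Row]]  # channels x height x width (assumed layout)


def shift_width(feature: FeatureMap, d: int) -> FeatureMap:
    """Shift every row d pixels to the right, zero-filling vacated columns."""
    return [[([0.0] * d + row)[:len(row)] for row in channel]
            for channel in feature]


def shift_and_concat(reference: FeatureMap, target: FeatureMap,
                     disparities: Iterable[int]) -> FeatureMap:
    """Stack the reference features with one shifted copy of the target
    features per candidate disparity, along the channel axis.

    The result is still a 3D (channels x height x width) tensor, so it can
    be processed with 2D convolutions rather than 3D ones.
    """
    volume = [[row[:] for row in channel] for channel in reference]
    for d in disparities:
        volume.extend(shift_width(target, d))
    return volume
```

Calling `shift_and_concat(ref, tgt, range(3))` on single-channel inputs yields four channels: the reference plus the target shifted by 0, 1, and 2 pixels. Covering only a few disparities per call mirrors the idea of each module handling part of the disparity range.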

PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0251657
Author(s):  
Zedong Huang ◽  
Jinan Gu ◽  
Jing Li ◽  
Xuefei Yu

Deep learning based on a convolutional neural network (CNN) has been successfully applied to stereo matching. Compared with traditional methods, the speed and accuracy of this approach have been greatly improved. However, existing CNN-based stereo matching frameworks often encounter two problems. First, existing stereo matching networks have many parameters, which makes the matching run time too long. Second, the disparity estimation is inadequate in some regions where reflections, repeated textures, and fine structures may lead to ill-posed problems. Through a lightweight improvement of the PSMNet (Pyramid Stereo Matching Network) model, the poor matching common in ill-conditioned areas, such as repeated-texture and weak-texture areas, is addressed. In the feature extraction part, ResNeXt is introduced to learn unitary feature extraction, and the ASPP (Atrous Spatial Pyramid Pooling) module is trained to extract multiscale spatial feature information. The feature fusion module is designed to effectively fuse the feature information of different scales to construct the matching cost volume. The improved 3D CNN uses a stacked encoding and decoding structure to further regularize the matching cost volume and obtain the correspondence between feature points under different parallax conditions. Finally, the disparity map is obtained by regression. We evaluate our method on the Scene Flow, KITTI 2012, and KITTI 2015 stereo datasets. The experiments show that the proposed stereo matching network achieves comparable prediction accuracy and a much faster running speed compared with PSMNet.
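The abstract does not say which regression is used. The sketch below assumes the soft-argmin disparity regression commonly used with PSMNet-style cost volumes, applied to the costs of a single pixel. The function name and the convention that candidate disparities are 0, 1, 2, ... are assumptions.

```python
import math
from typing import Sequence


def soft_argmin(costs: Sequence[float]) -> float:
    """Expected disparity under a softmax over negated matching costs.

    costs[d] is the (regularized) matching cost of candidate disparity d;
    lower cost means a better match, so it receives a larger weight.
    """
    m = min(costs)
    # Subtracting the minimum keeps the exponentials numerically stable.
    weights = [math.exp(-(c - m)) for c in costs]
    total = sum(weights)
    return sum(d * w for d, w in enumerate(weights)) / total
```

Unlike a hard arg-min, this yields sub-pixel disparities and is differentiable, which is why it suits end-to-end training.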


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1430
Author(s):  
Xiaogang Jia ◽  
Wei Chen ◽  
Zhengfa Liang ◽  
Xin Luo ◽  
Mingfei Wu ◽  
...  

Stereo matching is an important research field of computer vision. Because of the dimensionality of cost aggregation, current neural network-based stereo methods struggle to balance speed and accuracy. To this end, we integrate fast 2D stereo methods with accurate 3D networks to improve performance and reduce running time. We leverage a 2D encoder-decoder network to generate a rough disparity map and construct a disparity range to guide the 3D aggregation network, which can significantly improve the accuracy and reduce the computational cost. We use a stacked hourglass structure to refine the disparity from coarse to fine. We evaluated our method on three public datasets. According to the KITTI official website results, our network can generate an accurate result in 80 ms on a modern GPU. Compared to other 2D stereo networks (AANet, DeepPruner, FADNet, etc.), our network has a big improvement in accuracy. Meanwhile, it is significantly faster than other 3D stereo networks (5× faster than PSMNet, 7.5× faster than CSN, and 22.5× faster than GANet, etc.), demonstrating the effectiveness of our method.


2020 ◽  
Vol 12 (24) ◽  
pp. 4025
Author(s):  
Rongshu Tao ◽  
Yuming Xiang ◽  
Hongjian You

As an essential step in 3D reconstruction, stereo matching still faces non-negligible problems due to the high resolution and complex structures of remote sensing images. Especially in occluded areas of tall buildings and textureless areas of waters and woods, precise disparity estimation has become a difficult but important task. In this paper, we develop a novel edge-sense bidirectional pyramid stereo matching network to solve the aforementioned problems. The cost volume is constructed from negative to positive disparities, since the disparity range in remote sensing images varies greatly and traditional deep learning networks only work well for positive disparities. Then, occlusion-aware maps based on the forward-backward consistency assumption are applied to reduce the influence of occluded areas. Moreover, we design an edge-sense smoothness loss to improve the performance in textureless areas while maintaining the main structure. The proposed network is compared with two baselines. The experimental results show that our proposed method outperforms both methods, DenseMapNet and PSMNet, in terms of averaged endpoint error (EPE) and the fraction of erroneous pixels (D1), and the improvements in occluded and textureless areas are significant.
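A forward-backward consistency check can be sketched on a single scanline. This is a generic illustration, not the network's occlusion-aware map: the function name, the threshold, and the sign convention (left pixel x matches right pixel x - d) are assumptions. Negative disparities are handled by the same arithmetic.

```python
from typing import List, Sequence


def occlusion_mask(disp_left: Sequence[float], disp_right: Sequence[float],
                   threshold: float = 1.0) -> List[bool]:
    """Flag pixels whose left and right disparities disagree.

    A left pixel x with disparity d is assumed to match right pixel x - d.
    It counts as consistent when the right disparity stored there is within
    `threshold` of d. Pixels that map outside the image are flagged as well.
    """
    width = len(disp_left)
    mask = []
    for x, d in enumerate(disp_left):
        xr = x - int(round(d))
        if xr < 0 or xr >= width:
            mask.append(True)
        else:
            mask.append(abs(d - disp_right[xr]) > threshold)
    return mask
```

Pixels flagged `True` are those a method of this kind would down-weight or exclude when the disparity there is likely unreliable.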


2020 ◽  
Vol 64 (2) ◽  
pp. 20505-1-20505-12
Author(s):  
Hui-Yu Huang ◽  
Zhe-Hao Liu

A stereo matching algorithm is used to find the best match between a pair of images. To compute the cost of the matching points from the sequence of images, the disparity maps from video streams are estimated. However, the estimated disparity sequences may cause undesirable flickering errors. These errors result in low visibility of the synthesized video and reduce video coding. To solve this problem, the authors propose a spatiotemporal disparity refinement for local stereo matching based on a segmentation strategy. Based on segmentation information, matching point searching, and color similarity, adaptive disparity values can be obtained to correct the disparity errors in disparity sequences. The flickering errors are also effectively removed, and the boundaries of objects are well preserved. The proposed approach consists of a segmentation process, matching point searching, and refinement in the temporal and spatial domains. Experimental results verify that the proposed approach can yield a high quantitative evaluation and a high-quality disparity map compared with other methods.


Author(s):  
A. F. Kadmin ◽  
R. A. Hamzah ◽  
M. N. Abd Manap ◽  
M. S. Hamid ◽  
...  

Stereo matching is a significant subject in stereo vision algorithms. The traditional taxonomy suffers from several issues in the stereo correspondence process, such as radiometric distortion, discontinuity, and low accuracy in low-texture regions. This new taxonomy improves the local method of the stereo matching algorithm based on dynamic cost computation for disparity map measurement. The method uses a modified dynamic cost computation in the matching cost stage. A modified Census Transform with a dynamic histogram is used to provide the cost volume. Adaptive bilateral filtering is applied to retain the image depth and edge information in the cost aggregation stage. A Winner Takes All (WTA) optimisation is applied in the disparity selection, and a left-right check with adaptive bilateral median filtering is employed for final refinement. On the standard Middlebury dataset, the taxonomy has better accuracy and outperforms several other state-of-the-art algorithms.
Keywords: Stereo matching, disparity map, dynamic cost, census transform, local method
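The matching cost and disparity selection stages can be sketched as follows. This uses the plain census transform with a Hamming-distance cost and per-pixel WTA selection. It does not reproduce the modified, histogram-based transform of the abstract, nor the bilateral aggregation or refinement stages. Function names, the 3x3 window, and border clamping are assumptions.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of intensities


def census(img: Image, r: int = 1) -> Image:
    """Encode each pixel as a bit string: 1 where a neighbour is darker."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            bits = 0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    if dy == 0 and dx == 0:
                        continue
                    yy = min(max(y + dy, 0), h - 1)  # clamp at borders
                    xx = min(max(x + dx, 0), w - 1)
                    bits = (bits << 1) | int(img[yy][xx] < img[y][x])
            out[y][x] = bits
    return out


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two census codes."""
    return bin(a ^ b).count("1")


def wta_disparity(left: Image, right: Image, max_disp: int) -> Image:
    """Pick, per left pixel, the disparity with the lowest census cost."""
    cl, cr = census(left), census(right)
    h, w = len(left), len(left[0])
    disp = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            costs = [hamming(cl[y][x], cr[y][x - d])
                     for d in range(min(max_disp, x) + 1)]
            disp[y][x] = costs.index(min(costs))  # ties go to smallest d
    return disp
```

Because the census code depends only on intensity ordering within the window, this cost is robust to monotonic brightness changes between the two views, which relates to the radiometric distortion issue mentioned above.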


Author(s):  
E. Sarrazin ◽  
M. Cournet ◽  
L. Dumas ◽  
V. Defonte ◽  
Q. Fardet ◽  
...  

In a 3D reconstruction pipeline, the stereo matching step aims at computing a disparity map representing the depth between an image pair. The disparity map can be evaluated through the estimation of a confidence metric. In this article, we propose a new confidence metric, named the ambiguity integral metric, to assess the quality of the produced disparity map. This metric is derived from the concept of ambiguity, which characterizes the property of the cost curve profile. It aims to quantify the difficulty in identifying the correct disparity to select. The quality of the ambiguity integral metric is evaluated through the ROC curve methodology and compared with other confidence measures. Compared with other measures, the ambiguity integral measure shows good potential. We also integrate this measure into various steps of the stereo matching pipeline in order to improve the estimation performance of the disparity map. First, we include the ambiguity integral measure in the Semi Global Matching (SGM) optimization step. The objective is to weight, by the ambiguity integral measure, the influence of points in the SGM regularization to reduce the impact of ambiguous points. Second, we use ambiguity as an input to a disparity refinement deep learning architecture in order to easily locate noisy areas and preserve details.


2017 ◽  
Author(s):  
Santi J. Vives

Hash-based signatures use a one-time signature (OTS) as their main building block, and transform it into a many-times scheme to produce a larger number of signatures. In known constructions, the cost and the size of each signature increase as the number of needed signatures grows. In real-world applications requiring a significant number of signatures, the signatures can get quite large. As a result, it is usually believed that post-quantum signatures based on hashes need more computation and much larger sizes than classical signatures. We introduce a construction to challenge that idea: we show that it is possible to construct a many-times signature scheme that is more efficient than the OTS it is built from, rather than less. We study the generation of signatures in conjunction with a blockchain, like bitcoin. The proposed scheme permits an unlimited number of signatures. The size of each signature is constant and the same as in the OTS. The verification cost starts the same as in the OTS and decreases with each new signature, becoming more efficient on average as the number of signatures grows.
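For readers unfamiliar with the OTS building block, the sketch below shows a Lamport one-time signature, a classic hash-based OTS. The abstract does not state which OTS the construction uses, so this is an assumption for illustration only and not the proposed many-times scheme. SHA-256 and all function names are likewise illustrative choices.

```python
import hashlib
import secrets

N_BITS = 256  # one key pair per bit of the SHA-256 message digest


def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keygen():
    """Secret key: two random values per digest bit. Public key: their hashes."""
    sk = [[secrets.token_bytes(32) for _ in range(2)] for _ in range(N_BITS)]
    pk = [[H(s) for s in pair] for pair in sk]
    return sk, pk


def message_bits(message: bytes):
    digest = int.from_bytes(H(message), "big")
    return [(digest >> i) & 1 for i in range(N_BITS)]


def sign(sk, message: bytes):
    """Reveal one secret per digest bit. A key must sign only ONE message."""
    return [sk[i][b] for i, b in enumerate(message_bits(message))]


def verify(pk, message: bytes, signature) -> bool:
    bits = message_bits(message)
    return len(signature) == N_BITS and all(
        H(s) == pk[i][b] for i, (s, b) in enumerate(zip(signature, bits)))
```

Each signature here is 256 values of 32 bytes, which illustrates why hash-based signatures are usually considered large and why a many-times scheme that does not grow beyond the OTS size is notable.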


2012 ◽  
Vol 629 ◽  
pp. 682-687 ◽  
Author(s):  
Bo Xiong Wang ◽  
Wen Feng Liu ◽  
Jian Nan Liu ◽  
Yuan Yuan Cui ◽  
Xiu Zhia Luo

The performance of ultrasonic testing systems is greatly affected by the impedance characteristics of ultrasonic transducers. Conventional methods for designing matching networks consider only the characteristics of the matching elements and the transducer, while ignoring the effects of other elements of the emission circuit. As a consequence, such methods cannot give satisfactory results. In this paper, a modeling method for ultrasonic driving circuits is proposed, which takes into account the power supply, the transformer, the matching networks, and the ultrasonic transducer. This method focuses on performance in both the time domain and the frequency domain. A computer simulation and experiments show that this method can provide better attenuation characteristics and energy transmission, and can be widely used for analyzing and designing matching networks for ultrasonic testing systems.


Author(s):  
James Okae ◽  
Bohan Li ◽  
Juan Du ◽  
Yueming Hu

2021 ◽  
Vol 297 ◽  
pp. 01055
Author(s):  
Mohamed El Ansari ◽  
Ilyas El Jaafari ◽  
Lahcen Koutti

This paper proposes a new edge-based stereo matching approach for road applications. The new approach consists in matching the edge points extracted from the input stereo images using temporal constraints. At the current frame, we propose to estimate a disparity range for each image line based on the disparity map of the preceding frame. The stereo images are divided into multiple parts according to the estimated disparity ranges. The optimal solution of each part is independently approximated via the state-of-the-art energy minimization approach, graph cuts. The disparity search space of each image part is very small compared to the global one, which improves the results and reduces the execution time. Furthermore, as a similarity criterion between corresponding edge points, we propose a new cost function based on the intensity, the gradient magnitude, and the gradient orientation. The proposed method has been tested on virtual stereo images and compared to a recently proposed method, and the results are satisfactory.

