Angular Disparity Map: A Scalable Perceptual-Based Representation of Binocular Disparity

Author(s):  
Yu-Hsun Lin ◽  
Ja-Ling Wu


Author(s):  
Patrick Knöbelreiter ◽  
Thomas Pock

In this work, we propose a learning-based method to denoise and refine disparity maps. The proposed variational network arises naturally from unrolling the iterates of a proximal gradient method applied to a variational energy defined in a joint disparity, color, and confidence image space. Our method allows us to learn a robust collaborative regularizer that leverages the joint statistics of the color image, the confidence map, and the disparity map. Due to the variational structure of our method, the individual steps can be easily visualized, enabling interpretability. We can therefore provide interesting insights into how our method refines and denoises disparity maps. To this end, we visualize and interpret the learned filters and activation functions and demonstrate the increased reliability of the predicted pixel-wise confidence maps. Furthermore, the optimization-based structure of our refinement module allows us to compute eigen disparity maps, which reveal structural properties of the module. The efficiency of our method is demonstrated on the publicly available stereo benchmarks Middlebury 2014 and KITTI 2015.
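A minimal sketch of the idea of unrolling a proximal-gradient iteration over a joint disparity/color/confidence space, written in PyTorch. This is not the authors' code; the module name `RefinementStep`, the quadratic data term, and the filter/activation choices are illustrative assumptions.

```python
import torch
import torch.nn as nn

class RefinementStep(nn.Module):
    """One unrolled proximal-gradient step of a learned collaborative regularizer (sketch)."""
    def __init__(self, channels=5, filters=32, step_size=0.1):
        super().__init__()
        # learned analysis/synthesis filters acting on the joint (disparity, color, confidence) input;
        # Tanh stands in for the learned activation functions described in the abstract
        self.analysis = nn.Conv2d(channels, filters, 3, padding=1, bias=False)
        self.synthesis = nn.Conv2d(filters, 1, 3, padding=1, bias=False)
        self.act = nn.Tanh()
        self.step_size = step_size

    def forward(self, disparity, color, confidence, noisy_disparity):
        # gradient of a confidence-weighted quadratic data term ||d - d_noisy||^2 (assumed form)
        data_grad = confidence * (disparity - noisy_disparity)
        # gradient of the regularizer evaluated on the joint image space
        joint = torch.cat([disparity, color, confidence], dim=1)
        reg_grad = self.synthesis(self.act(self.analysis(joint)))
        # gradient-style update applied to the disparity only
        return disparity - self.step_size * (data_grad + reg_grad)

# usage: unroll a fixed number of steps; weights may be shared or untied per iteration
steps = nn.ModuleList([RefinementStep() for _ in range(7)])
```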


2021 ◽  
Vol 11 (12) ◽  
pp. 5383
Author(s):  
Huachen Gao ◽  
Xiaoyu Liu ◽  
Meixia Qu ◽  
Shijie Huang

In recent studies, self-supervised learning methods have been explored for monocular depth estimation. They minimize an image reconstruction loss rather than using depth as the supervised signal. However, existing methods usually assume that corresponding points in different views have the same color, which leads to unreliable unsupervised signals and ultimately degrades the reconstruction loss during training. Meanwhile, in low-texture regions the disparity of pixels cannot be predicted correctly because few features can be extracted. To address these issues, we propose a network, PDANet, that integrates perceptual consistency and data augmentation consistency, which are more reliable unsupervised signals, into a regular unsupervised depth estimation model. Specifically, we apply a reliable data augmentation mechanism that minimizes the difference between the disparity maps generated from the original image and the augmented image, which makes the prediction more robust to color fluctuations. At the same time, we aggregate the features of different layers extracted by a pre-trained VGG16 network to capture higher-level perceptual differences between the input image and the generated one. Ablation studies demonstrate the effectiveness of each component, and PDANet produces high-quality depth estimates on the KITTI benchmark, improving the absolute relative error of the state-of-the-art method from 0.114 to 0.084.
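A hedged sketch, not the published PDANet code, of the two extra self-supervised signals described above: a VGG16 perceptual loss aggregated over several feature layers and a data-augmentation consistency loss between disparities predicted from the original and the augmented image. The layer indices and loss form are illustrative assumptions.

```python
import torch
import torch.nn.functional as F
from torchvision.models import vgg16

_vgg = vgg16(weights="DEFAULT").features.eval()
for p in _vgg.parameters():
    p.requires_grad = False
_LAYER_IDS = {3, 8, 15, 22}  # relu1_2, relu2_2, relu3_3, relu4_3 (assumed choice of layers)

def perceptual_loss(img, recon):
    """Aggregate L1 distance between VGG16 features of the target and the reconstructed image."""
    loss, x, y = 0.0, img, recon
    for i, layer in enumerate(_vgg):
        x, y = layer(x), layer(y)
        if i in _LAYER_IDS:
            loss = loss + F.l1_loss(x, y)
    return loss

def augmentation_consistency_loss(disp_original, disp_augmented):
    """Penalize differences between disparities predicted from the original and the augmented view."""
    return F.l1_loss(disp_original, disp_augmented)
```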


Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1430
Author(s):  
Xiaogang Jia ◽  
Wei Chen ◽  
Zhengfa Liang ◽  
Xin Luo ◽  
Mingfei Wu ◽  
...  

Stereo matching is an important research field of computer vision. Because of the dimensionality of cost aggregation, current neural-network-based stereo methods struggle to trade off speed against accuracy. To this end, we integrate fast 2D stereo methods with accurate 3D networks to improve performance and reduce running time. We leverage a 2D encoder-decoder network to generate a rough disparity map and construct a disparity range to guide the 3D aggregation network, which significantly improves accuracy and reduces computational cost. We use a stacked hourglass structure to refine the disparity from coarse to fine. We evaluated our method on three public datasets. According to the official KITTI leaderboard, our network generates an accurate result in 80 ms on a modern GPU. Compared to other 2D stereo networks (AANet, DeepPruner, FADNet, etc.), our network achieves a substantial improvement in accuracy. Meanwhile, it is significantly faster than other 3D stereo networks (5× faster than PSMNet, 7.5× faster than CSN, and 22.5× faster than GANet), demonstrating the effectiveness of our method.
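A minimal sketch, assuming PyTorch, of how a coarse disparity map from a fast 2D network can restrict the search range of a 3D cost-aggregation network. The function name and window radius are illustrative; the authors' released code may construct the guided volume differently.

```python
import torch

def guided_disparity_hypotheses(coarse_disp, radius=4):
    """Build per-pixel disparity hypotheses in [coarse - radius, coarse + radius].

    coarse_disp: (B, 1, H, W) disparity from the fast 2D encoder-decoder
    returns:     (B, 2*radius + 1, H, W) candidate disparities for the 3D aggregation network
    """
    offsets = torch.arange(-radius, radius + 1, device=coarse_disp.device,
                           dtype=coarse_disp.dtype).view(1, -1, 1, 1)
    return (coarse_disp + offsets).clamp(min=0)

# usage: the stacked-hourglass 3D network then aggregates matching costs over only
# 2*radius + 1 hypotheses per pixel instead of the full disparity range (e.g., 192 levels)
hyp = guided_disparity_hypotheses(torch.rand(1, 1, 64, 128) * 64, radius=4)
print(hyp.shape)  # torch.Size([1, 9, 64, 128])
```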


2015 ◽  
Author(s):  
Bo Yu ◽  
Hideki Kakeya

2021 ◽  
Author(s):  
Ryan Edward O'Donnell ◽  
Kyrie Murawski ◽  
Ella Herrmann ◽  
Jesse Wisch ◽  
Garrett D. Sullivan ◽  
...  

There have been conflicting findings on the degree to which exogenous/reflexive visual attention is selective for depth, and this issue has important implications for models of attention. Previous work has looked for depth-based cueing effects on such attention using reaction time measures for stimuli presented through stereo goggles on a display screen. Results from such approaches have been mixed, depending on whether target/distractor discrimination was required. To help clarify whether such depth effects exist, we developed a paradigm that measures accuracy rather than reaction time in an immersive virtual-reality environment, providing a more appropriate context for depth. Four modified Posner cueing paradigms were run to test for depth-specific attentional selectivity. Participants fixated a cross while attempting to identify a rapidly masked letter that was preceded by a cue that could be valid in depth and side, depth only, or side only. In Experiment 1, a potent cueing effect was found for side validity and a weak effect for depth. Experiment 2 controlled for differences in cue and target sizes when presented at different depths, which caused the depth validity effect to disappear entirely, even though participants were explicitly asked to report depth and the difference in virtual depth was extreme (20 vs. 300 meters). Experiments 3a and 3b brought the front depth plane even closer (1 m) to maximize the effects of binocular disparity, but no reliable depth cueing validity effect was observed. Thus, it seems that rapid/exogenous attention pancakes 3-dimensional space into a 2-dimensional reference frame.
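An illustrative sketch, not the authors' analysis scripts, of the accuracy-based measure: trials are grouped by whether the cue was valid on the side dimension, the depth dimension, or both, and identification accuracy is compared across conditions. The column names and data are assumptions for illustration only.

```python
import pandas as pd

# toy trial records: 1 = cue valid on that dimension, correct = masked letter identified
trials = pd.DataFrame({
    "cue_side_valid":  [1, 1, 0, 0, 1, 0],
    "cue_depth_valid": [1, 0, 1, 0, 1, 1],
    "correct":         [1, 1, 0, 0, 1, 1],
})

# accuracy per cue-validity condition; a depth-based cueing effect would appear as higher
# accuracy when cue_depth_valid == 1, holding side validity constant
accuracy = trials.groupby(["cue_side_valid", "cue_depth_valid"])["correct"].mean()
print(accuracy)
```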

