Learning Inverse Depth Regression for Multi-View Stereo with Correlation Cost Volume

2020 ◽  
Vol 34 (07) ◽  
pp. 12508-12515
Author(s):  
Qingshan Xu ◽  
Wenbing Tao

Deep learning has been shown to be effective for depth inference in multi-view stereo (MVS). However, scalability and accuracy remain open problems in this domain. This can be attributed to the memory-consuming cost volume representation and inappropriate depth inference. Inspired by the group-wise correlation in stereo matching, we propose an average group-wise correlation similarity measure to construct a lightweight cost volume. This not only reduces memory consumption but also reduces the computational burden of cost volume filtering. Based on this effective cost volume representation, we propose a cascade 3D U-Net module to regularize the cost volume and further boost performance. Unlike previous methods that treat multi-view depth inference as a depth regression problem or an inverse depth classification problem, we recast multi-view depth inference as an inverse depth regression task. This allows our network to achieve sub-pixel estimation and to scale to large scenes. Through extensive experiments on the DTU dataset and the Tanks and Temples dataset, we show that our proposed network with Correlation cost volume and Inverse DEpth Regression (CIDER) achieves state-of-the-art results, demonstrating its superior scalability and accuracy.
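The average group-wise correlation measure described above can be sketched in a few lines; this is a minimal NumPy illustration, not the paper's implementation, and it assumes the source-view features have already been warped onto the reference view for one depth hypothesis (function names and shapes are illustrative):

```python
import numpy as np

def groupwise_correlation(ref, src, num_groups):
    # ref, src: (C, H, W) feature maps; split the C channels into G groups
    C, H, W = ref.shape
    ref_g = ref.reshape(num_groups, C // num_groups, H, W)
    src_g = src.reshape(num_groups, C // num_groups, H, W)
    # per-group normalized inner product -> one similarity map per group: (G, H, W)
    return (ref_g * src_g).mean(axis=1)

def average_correlation_volume(ref, warped_srcs, num_groups):
    # average the group-wise correlation over all source views, giving a
    # lightweight (G, H, W) cost slice per depth hypothesis instead of a
    # full (C, H, W) concatenated-feature slice
    return np.mean([groupwise_correlation(ref, s, num_groups)
                    for s in warped_srcs], axis=0)
```

The memory saving comes from the group dimension G being much smaller than the feature dimension C, so the 3D cost volume shrinks by a factor of roughly C/G.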

Symmetry ◽  
2020 ◽  
Vol 12 (5) ◽  
pp. 705
Author(s):  
Po-Chou Shih ◽  
Chun-Chin Hsu ◽  
Fang-Chih Tien

Silicon wafers are the most crucial material in the semiconductor manufacturing industry. Owing to limited resources, reclaiming monitor and dummy wafers for reuse can dramatically lower costs and become a competitive edge in this industry. However, defects such as voids, scratches, particles, and contamination are found on the surfaces of reclaimed wafers. Most reclaimed wafers with an asymmetric distribution of defects, known as "good (G)" reclaimed wafers, can be re-polished if their defects are not irreversible and their thicknesses are sufficient for re-polishing. Currently, the "no good (NG)" reclaimed wafers must first be screened by experienced human inspectors, who determine re-usability through defect mapping. This screening task is tedious, time-consuming, and unreliable. This study presents a deep-learning-based approach to reclaimed wafer defect classification. Three neural networks, a multilayer perceptron (MLP), a convolutional neural network (CNN), and a Residual Network (ResNet), are adopted and compared. These networks analyze the defect mapping pattern and determine not only whether a reclaimed wafer is suitable for re-polishing but also which defect category it belongs to. The open-source TensorFlow library was used to train the MLP, CNN, and ResNet networks on collected wafer images. Based on the experimental results, we found that the CNN, with a proper design of kernels and structure, gave fast and superior performance in identifying defective wafers owing to its deep learning capability; ResNet exhibited excellent accuracy on average, and large-scale MLP networks also achieved good results with proper network structures.
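The simplest of the three classifiers, the MLP, treats the defect map as a flat vector. A toy one-hidden-layer forward pass is sketched below; all weights, shapes, and the number of defect categories are hypothetical, and a trained model would learn these parameters from the labeled wafer images:

```python
import numpy as np

def mlp_classify(defect_map, W1, b1, W2, b2):
    # flatten the 2D defect map and run a one-hidden-layer MLP;
    # returns softmax probabilities over the defect categories
    h = np.maximum(0.0, defect_map.ravel() @ W1 + b1)  # ReLU hidden layer
    logits = h @ W2 + b2
    e = np.exp(logits - logits.max())                  # numerically stable softmax
    return e / e.sum()
```

A CNN or ResNet replaces the flattening step with convolutions, which is why they cope better with the spatial (and asymmetric) structure of the defect patterns.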


2020 ◽  
Vol 34 (07) ◽  
pp. 12926-12934
Author(s):  
Youmin Zhang ◽  
Yimin Chen ◽  
Xiao Bai ◽  
Suihanjin Yu ◽  
Kun Yu ◽  
...  

State-of-the-art deep-learning-based stereo matching approaches treat disparity estimation as a regression problem, where the loss function is defined directly on the true disparities and their estimates. However, disparity is just a byproduct of a matching process modeled by the cost volume, and indirectly learning the cost volume through disparity regression is prone to overfitting since the cost volume is under-constrained. In this paper, we propose to constrain the cost volume directly by filtering it with a unimodal distribution peaked at the true disparity. In addition, the variance of the unimodal distribution is estimated for each pixel to explicitly model matching uncertainty under different contexts. The proposed architecture achieves state-of-the-art performance on Scene Flow and the two KITTI stereo benchmarks. In particular, our method ranked 1st on the KITTI 2012 evaluation and 4th on the KITTI 2015 evaluation (as of 2019-08-20). The code for AcfNet is available at: https://github.com/youmi-zym/AcfNet.
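The unimodal supervision target can be illustrated concisely: for each pixel, build a distribution over disparity candidates that peaks at the true disparity, with a sharpness parameter standing in for the per-pixel variance the network estimates. This is a hedged NumPy sketch of the idea, not AcfNet's exact formulation:

```python
import numpy as np

def unimodal_distribution(d_true, num_disp, sigma):
    # softmax of a peak centred at the true disparity; sigma controls
    # sharpness (larger sigma = flatter distribution = more uncertain match)
    d = np.arange(num_disp)
    logits = -np.abs(d - d_true) / sigma
    e = np.exp(logits - logits.max())   # numerically stable softmax
    return e / e.sum()
```

Training then penalizes the divergence between the network's per-pixel cost distribution and this target, constraining the whole cost volume rather than only the regressed disparity.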


Author(s):  
S. Lobry ◽  
D. Marcos ◽  
B. Kellenberger ◽  
D. Tuia

Abstract. Visual Question Answering for Remote Sensing (RSVQA) aims at extracting information from remote sensing images through queries formulated in natural language. Since the answer to the query is also provided in natural language, the system is accessible to non-experts, and therefore dramatically increases the value of remote sensing images as a source of information, for example for journalism purposes or interactive land planning. Ideally, an RSVQA system should be able to provide an answer to questions that vary both in terms of topic (presence, localization, counting) and image content. However, aiming at such flexibility generates problems related to the variability of the possible answers. A striking example is counting, where the number of objects present in a remote sensing image can vary by multiple orders of magnitude, depending on both the scene and type of objects. This represents a challenge for traditional Visual Question Answering (VQA) methods, which either become intractable or result in an accuracy loss, as the number of possible answers has to be limited. To this end, we introduce a new model that jointly solves a classification problem (which is the most common approach in VQA) and a regression problem (to answer numerical questions more precisely). An evaluation of this method on the RSVQA dataset shows that this finer numerical output comes at the cost of a small loss of performance on non-numerical questions.


2021 ◽  
Author(s):  
Joel Rabelo ◽  
Yuri Saporito ◽  
Antonio Leitao

Abstract. In this article we investigate a family of stochastic gradient-type methods for solving systems of linear ill-posed equations. The method under consideration is a stochastic version of the projective Landweber-Kaczmarz (PLWK) method in [Leitão/Svaiter, Inv. Probl. 2016] (see also [Leitão/Svaiter, NFAO 2018]). In the case of exact data, mean-square convergence to zero of the iteration error is proven. In the noisy data case, we couple our method with an a priori stopping rule and characterize it as a regularization method for solving systems of linear ill-posed operator equations. Numerical tests are presented for two linear ill-posed problems: (i) a Hilbert matrix type system with over 10^8 equations; (ii) a Big Data linear regression problem with real data. The results indicate superior performance of the proposed method compared with other well-established iterations. Our preliminary investigation indicates that the proposed iteration is a promising alternative for computing stable approximate solutions of large-scale systems of linear ill-posed equations.
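The classical (non-projective) stochastic Kaczmarz iteration underlying such methods is short enough to sketch: at each step, pick one equation at random and project the current iterate onto its solution hyperplane. This is the textbook randomized Kaczmarz method, shown as a minimal illustration rather than the PLWK variant studied in the article:

```python
import numpy as np

def stochastic_kaczmarz(A, b, iters, rng):
    # randomized Kaczmarz: at each step choose one row a_i of A at random
    # and project the iterate x onto the hyperplane a_i^T x = b_i
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        i = rng.integers(A.shape[0])
        a = A[i]
        x += (b[i] - a @ x) / (a @ a) * a   # orthogonal projection step
    return x
```

Because each step touches a single row, the cost per iteration is independent of the number of equations, which is what makes such iterations attractive for systems with 10^8 equations.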


1997 ◽  
Vol 05 (01) ◽  
pp. 33-51 ◽  
Author(s):  
Isaac Harari ◽  
Danny Avraham

The goal of this work is to design and analyze quadratic finite elements for problems of time-harmonic acoustics, and to compare the computational efficiency of quadratic elements to that of lower-order elements. Non-reflecting boundary conditions yield an equivalent problem in a bounded region which is suitable for domain-based computation of solutions to exterior problems. Galerkin/least-squares technology is utilized to develop robust methods in which stability properties are enhanced while maintaining higher-order accuracy. The design of Galerkin/least-squares methods depends on the order of interpolation employed, and in this case quadratic elements are designed to yield dispersion-free solutions to model problems. The accuracy of Galerkin/least-squares and traditional Galerkin elements is compared, as is the accuracy of quadratic versus standard linear interpolation, incorporating the effects of representing the radiation condition in exterior problems. The efficiency of the various methods is measured in terms of the cost of computation, rather than resolution requirements. In this manner, clear guidelines for selecting the order of interpolation are derived. Numerical testing validates the superior performance of the proposed methods. This work is a first step toward gaining a thorough analytical understanding of the performance of p refinement as a basis for the development of h-p finite element methods for large-scale computation of solutions to acoustic problems.


2000 ◽  
Vol 151 (1) ◽  
pp. 1-10 ◽  
Author(s):  
Stephan Wild-Eck ◽  
Willi Zimmermann

Two large-scale surveys of attitudes towards forests, forestry and forest policy were carried out in the second half of the nineties. This work was done on behalf of the Swiss Confederation by the Chair of Forest Policy and Forest Economics of the Federal Institute of Technology (ETH) in Zurich. Not only did the two studies use very different methods, but they also varied greatly in infrastructure and basic conditions. One of the main differences between the two studies was that the first dealt only with mountainous areas, whereas the second covered the whole Swiss population. The results of the studies reflect these differences: each produced its own specific findings. Where the same (or similar) questions were asked, the answers highlight not only how the attitudes of those questioned differ, but also the views they hold in common. Both surveys showed positive attitudes towards forests in general, a deep-seated appreciation of the forest as a recreational area, and a positive approach to tending. Detailed results of the two surveys will be available in the near future.


1999 ◽  
Vol 39 (10-11) ◽  
pp. 289-295
Author(s):  
Saleh Al-Muzaini

The Shuaiba Industrial Area (SIA) is located about 50 km south of Kuwait City. It accommodates most of the large-scale industries in Kuwait. The total area of the SIA (both eastern and western sectors) is about 22.98 million m2. Fifteen plants are located in the eastern sector and 23 in the western sector, including two petrochemical companies, three refineries, two power plants, a melamine company, an industrial gas corporation, a paper products company, and two steam-electricity generating stations, in addition to several other industries. Consequently, only 30 percent of the land in the SIA's eastern sector and 70 percent of the land in its western sector is available for future expansion. Presently, industries in the SIA generate approximately 204,000 t of solid waste. With future industrial development in the SIA, this quantity is estimated to reach 240,000 t. The Shuaiba Area Authority (SAA), the governmental regulatory body responsible for planning and development in the SIA, has recognized the solid waste problem and has developed an industrial waste minimization program. This program would help reduce the quantity of waste generated within the SIA and thereby reduce the cost of waste management. This paper describes the waste minimization program and how it is to be implemented by the major petroleum companies. The protocols employed in the waste minimization program are detailed.


Author(s):  
Zheng Zhou ◽  
Erik Saule ◽  
Hasan Metin Aktulga ◽  
Chao Yang ◽  
Esmond G. Ng ◽  
...  

Technologies ◽  
2020 ◽  
Vol 9 (1) ◽  
pp. 2
Author(s):  
Ashish Jaiswal ◽  
Ashwin Ramesh Babu ◽  
Mohammad Zaki Zadeh ◽  
Debapriya Banerjee ◽  
Fillia Makedon

Self-supervised learning has gained popularity because of its ability to avoid the cost of annotating large-scale datasets. It adopts self-defined pseudo-labels as supervision and uses the learned representations for several downstream tasks. Specifically, contrastive learning has recently become a dominant component in self-supervised learning for computer vision, natural language processing (NLP), and other domains. It aims to embed augmented versions of the same sample close to each other while pushing away embeddings from different samples. This paper provides an extensive review of self-supervised methods that follow the contrastive approach. The work explains commonly used pretext tasks in a contrastive learning setup, followed by the different architectures that have been proposed so far. Next, we present a performance comparison of different methods on multiple downstream tasks such as image classification, object detection, and action recognition. Finally, we conclude with the limitations of the current methods and the need for further techniques and future directions to make meaningful progress.
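The pull-together/push-apart objective at the heart of contrastive learning is commonly instantiated as an InfoNCE-style loss; a minimal NumPy sketch (the temperature value and vector shapes are illustrative):

```python
import numpy as np

def info_nce(anchor, positive, negatives, temperature=0.1):
    # contrastive loss for one anchor: low when the positive (an augmented
    # view of the same sample) is more similar to the anchor than every
    # negative (embeddings of different samples)
    def sim(a, b):  # cosine similarity
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    logits = np.array([sim(anchor, positive)] +
                      [sim(anchor, n) for n in negatives]) / temperature
    e = np.exp(logits - logits.max())       # numerically stable softmax
    return -np.log(e[0] / e.sum())          # cross-entropy with positive at index 0
```

Minimizing this loss over many anchors is what embeds augmented versions of the same sample close together while pushing other samples' embeddings away.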

