A deep-shallow and global–local multi-feature fusion network for photometric stereo

2021 ◽  
pp. 104368
Author(s):  
Yanru Liu ◽  
Yakun Ju ◽  
Muwei Jian ◽  
Feng Gao ◽  
Yuan Rao ◽  
...  
2021 ◽  
Vol 8 (1) ◽  
pp. 105-118
Author(s):  
Yakun Ju ◽  
Yuxin Peng ◽  
Muwei Jian ◽  
Feng Gao ◽  
Junyu Dong

Abstract Photometric stereo aims to reconstruct 3D geometry by recovering the dense surface orientation of a 3D object from multiple images under differing illumination. Traditional methods normally adopt simplified reflectance models to make the surface orientation computable. However, the real reflectances of surfaces greatly limit the applicability of such methods to real-world objects. While deep neural networks have been employed to handle non-Lambertian surfaces, these methods are subject to blurring and errors, especially in high-frequency regions (such as crinkles and edges), caused by spectral bias: neural networks favor low-frequency representations and so exhibit a bias towards smooth functions. In this paper, therefore, we propose a self-learning conditional network with multi-scale features for photometric stereo, avoiding blurred reconstruction in such regions. Our explorations include: (i) a multi-scale feature fusion architecture, which simultaneously keeps high-resolution representations and deep feature extraction, and (ii) an improved gradient-motivated conditionally parameterized convolution (GM-CondConv) in our photometric stereo network, with different combinations of convolution kernels for varying surfaces. Extensive experiments on public benchmark datasets show that our calibrated photometric stereo method outperforms the state-of-the-art.
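For context on the "simplified reflectance models" mentioned in the abstract, the classical Lambertian baseline can be sketched as a per-pixel least-squares problem. This is not the network proposed in the paper; it is a minimal illustration assuming known, distant light directions and no shadows. The function name `lambertian_photometric_stereo` and the synthetic data are hypothetical.

```python
import numpy as np

def lambertian_photometric_stereo(intensities, light_dirs):
    """Classical Lambertian photometric stereo (baseline, not the paper's method).

    intensities: (m, p) array, one row per image, one column per pixel.
    light_dirs:  (m, 3) array of unit light directions (assumed known).
    Returns unit normals of shape (p, 3) and albedo of shape (p,).
    """
    # Lambertian model: I = L @ (albedo * n). Solve for G = albedo * n.
    G, *_ = np.linalg.lstsq(light_dirs, intensities, rcond=None)
    albedo = np.linalg.norm(G, axis=0)
    # Guard against division by zero for pixels with no signal.
    normals = (G / np.maximum(albedo, 1e-12)).T
    return normals, albedo

# Synthetic single-pixel check with a known normal and albedo (assumed values).
true_normal = np.array([0.0, 0.6, 0.8])
true_albedo = 0.5
lights = np.array([
    [0.0, 0.0, 1.0],
    [0.6, 0.0, 0.8],
    [0.0, 0.6, 0.8],
    [-0.6, 0.0, 0.8],
])
observed = (true_albedo * lights @ true_normal)[:, None]  # shape (4, 1)

normals, albedo = lambertian_photometric_stereo(observed, lights)
```

Real surfaces with specular highlights or cast shadows violate this linear model, which is the limitation the abstract attributes to traditional methods.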


2019 ◽  
Vol 63 (5) ◽  
pp. 50402-1-50402-9 ◽  
Author(s):  
Ing-Jr Ding ◽  
Chong-Min Ruan

Abstract Acoustic-based automatic speech recognition (ASR) is a mature technique that is widely used in numerous applications. However, acoustic-based ASR does not maintain its standard performance for disabled speakers with an abnormal face, that is, atypical geometrical characteristics of the eyes or mouth. To address this problem, this article develops a three-dimensional (3D) sensor lip image based pronunciation recognition system, in which the 3D sensor is used to acquire the variations in a speaker's lip shape during pronunciation. In this work, two different types of 3D lip features for pronunciation recognition are presented: the 3D-(x, y, z) coordinate lip feature and 3D geometry lip feature parameters. For the 3D-(x, y, z) coordinate lip feature design, 18 location points around the outer and inner lips are defined, each of which has 3D coordinates. In the design of 3D geometry lip features, eight types of features considering the geometrical space characteristics of the inner lip are developed. In addition, feature fusion to combine both the 3D-(x, y, z) coordinate and 3D geometry lip features is further considered. The performance and effectiveness of the presented 3D sensor lip image based features are evaluated using a principal component analysis based classification approach. Experimental results on pronunciation recognition of two different datasets, Mandarin syllables and Mandarin phrases, demonstrate the competitive performance of the presented 3D sensor lip image based pronunciation recognition system.
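The feature fusion described in the abstract can be illustrated with a small sketch. It assumes that fusion means concatenating the 18 × 3 coordinate values with the 8 geometry values into a 62-dimensional vector, and that classification is a nearest-centroid rule in PCA space; the abstract specifies neither detail, so both are assumptions, and all names and data below are hypothetical.

```python
import numpy as np

def fuse_features(coords, geometry):
    """Concatenate coordinate and geometry features (assumed fusion rule).

    coords:   (n, 18, 3) array of lip landmark coordinates.
    geometry: (n, 8) array of geometry features.
    Returns an (n, 62) array.
    """
    return np.concatenate([coords.reshape(coords.shape[0], -1), geometry], axis=1)

def fit_pca(X, k):
    """Return the data mean and the top-k principal directions via SVD."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:k]

def project(X, mean, components):
    """Project samples onto the principal directions."""
    return (X - mean) @ components.T

# Synthetic data: two well-separated classes standing in for two syllables.
rng = np.random.default_rng(0)
n = 20
coords = rng.normal(size=(2 * n, 18, 3))
geometry = rng.normal(size=(2 * n, 8))
labels = np.repeat([0, 1], n)
coords[labels == 1] += 5.0  # shift class 1 so the classes are separable

X = fuse_features(coords, geometry)
mean, comps = fit_pca(X, k=5)
Z = project(X, mean, comps)

# Nearest-centroid classification in PCA space (assumed decision rule).
centroids = np.stack([Z[labels == c].mean(axis=0) for c in (0, 1)])
dists = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
pred = np.argmin(dists, axis=1)
```

Concatenation keeps both feature types on equal footing before PCA; in practice the two groups may need scaling to comparable ranges first.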


2010 ◽  
Vol 30 (3) ◽  
pp. 643-645 ◽  
Author(s):  
Wei ZENG ◽  
Gui-bin ZHU ◽  
Jie CHEN ◽  
Ding-ding TANG
