Partial plane sweep volume for deep learning based view synthesis

Author(s):  
Kouta Takeuchi ◽  
Kazuki Okami ◽  
Daisuke Ochi ◽  
Hideaki Kimata
Nutrients ◽  
2018 ◽  
Vol 10 (12) ◽  
pp. 2005
Author(s):  
Frank Lo ◽  
Yingnan Sun ◽  
Jianing Qiu ◽  
Benny Lo

An objective dietary assessment system can help users understand their dietary behavior and enable targeted interventions to address underlying health problems. Accurately quantifying dietary intake requires measuring the portion size or food volume. For volume estimation, previous studies have mostly used model-based or stereo-based approaches, which either rely on manual intervention or require users to capture multiple frames from different viewing angles, which can be tedious. In this paper, a view synthesis approach based on deep learning is proposed to reconstruct 3D point clouds of food items and estimate their volume from a single depth image. A distinct neural network is designed to take a depth image from one viewing angle and predict the depth image that would be captured from the opposite viewing angle. The whole 3D point cloud map is then reconstructed by fusing the initial data points with the synthesized points of the object items through the proposed point cloud completion and Iterative Closest Point (ICP) algorithms. Furthermore, a database of depth images of food items captured from different viewing angles is constructed with image rendering and used to validate the proposed neural network. The methodology is then evaluated by comparing the volume estimated from the synthesized 3D point cloud with the ground truth volume of the object items.
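To illustrate why a depth image from the opposite side makes volume estimation possible, here is a minimal sketch in Python. It is not the authors' pipeline (which uses point cloud completion and ICP). It assumes an orthographic camera, that the synthesized back-view depth has already been transformed into the front camera's frame, and that each pixel covers a known area; the function name and parameters are hypothetical.

```python
def volume_from_opposite_depths(front, back, pixel_area):
    """Integrate object thickness over the image to approximate volume.

    front[i][j]: distance from the camera plane to the visible (near) surface.
    back[i][j]:  distance from the same plane to the hidden (far) surface,
                 i.e. the synthesized opposite view re-expressed in the
                 front camera's frame (an assumption of this sketch).
    None marks background pixels that do not belong to the object.
    """
    total = 0.0
    for front_row, back_row in zip(front, back):
        for z_near, z_far in zip(front_row, back_row):
            if z_near is None or z_far is None:
                continue  # background pixel, contributes nothing
            thickness = z_far - z_near
            if thickness > 0:
                total += thickness * pixel_area
    return total


# A 2 x 2 pixel object that is 2 units thick, with 0.25 area units per pixel.
front = [[10.0, 10.0, None], [10.0, 10.0, None]]
back = [[12.0, 12.0, None], [12.0, 12.0, None]]
estimated_volume = volume_from_opposite_depths(front, back, pixel_area=0.25)
```

With a perspective camera the pixel footprint grows with depth, so a real implementation would work on the fused 3D points rather than on a per-pixel thickness.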


Sensors ◽  
2021 ◽  
Vol 21 (19) ◽  
pp. 6680
Author(s):  
Min-Jae Lee ◽  
Gi-Mun Um ◽  
Joungil Yun ◽  
Won-Sik Cheong ◽  
Soon-Yong Park

In this paper, we propose a multi-view stereo matching method, EnSoft3D (Enhanced Soft 3D Reconstruction), to obtain dense and high-quality depth images. Multi-view stereo is an active research area with wide applications. Motivated by the Soft3D reconstruction method, we introduce a new multi-view stereo matching scheme. The original Soft3D method was introduced for novel view synthesis; it also reconstructs occlusion-aware depth by integrating the matching costs of Plane Sweep Stereo (PSS) with soft visibility volumes. However, the Soft3D method has an inherent limitation: erroneous PSS matching costs are never updated. To overcome this limitation, the proposed scheme introduces an update process for the PSS matching costs. From the object surface consensus volume, an inverse consensus kernel is derived, and the PSS matching costs are iteratively updated using the kernel. The proposed EnSoft3D method reconstructs a highly accurate 3D depth image because both the multi-view matching cost and the soft visibility are updated simultaneously. The performance of the proposed method is evaluated on structured and unstructured benchmark datasets. Disparity error is measured to verify 3D reconstruction accuracy, and both PSNR and SSIM are measured to verify the simultaneous enhancement of view synthesis.
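As a reminder of what the PSNR figure measures, the following is a small Python sketch using the standard definition, PSNR = 10 log10(MAX^2 / MSE). It is a generic illustration and not the authors' evaluation code; the 8-bit peak value of 255 and the function name are assumptions.

```python
import math


def psnr(reference, rendered, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images."""
    squared_error = 0.0
    count = 0
    for ref_row, ren_row in zip(reference, rendered):
        for ref_px, ren_px in zip(ref_row, ren_row):
            squared_error += (ref_px - ren_px) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Compare a ground-truth view with a synthesized view that differs in one pixel.
ground_truth = [[0, 255]]
synthesized = [[0, 250]]
score_db = psnr(ground_truth, synthesized)
```

A higher value means the synthesized view is closer to the reference. SSIM additionally compares local structure and needs a windowed computation, which is omitted here.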


Electronics ◽  
2020 ◽  
Vol 9 (6) ◽  
pp. 924
Author(s):  
Zhao Pei ◽  
Deqiang Wen ◽  
Yanning Zhang ◽  
Miao Ma ◽  
Min Guo ◽  
...  

In recent years, disparity estimation of a scene based on deep learning methods has been extensively studied, and significant progress has been made. In contrast, traditional image disparity estimation methods require considerable resources and much time for processes such as stereo matching and 3D reconstruction. At present, most deep learning based disparity estimation methods focus on estimating disparity from monocular images. Traditional methods have shown that multi-view approaches are more accurate than monocular ones, especially for scenes that are textureless or contain thin structures. Motivated by this, in this paper we present MDEAN, a new deep convolutional neural network that estimates disparity from multi-view images using an asymmetric encoder–decoder network structure. First, our method takes an arbitrary number of multi-view images as input. Next, we use these images to produce a set of plane-sweep cost volumes, which are combined to compute a high-quality disparity map using an end-to-end asymmetric network. The results show that our method performs better than state-of-the-art methods, in particular for outdoor scenes with the sky, flat surfaces and buildings.
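The idea of a plane-sweep cost volume can be sketched on 1D scanlines in Python. This is a simplified illustration under stated assumptions, not MDEAN: the views are assumed rectified along one axis, each source view has an integer baseline multiple, the cost is a plain absolute difference, and the volumes are combined by averaging followed by winner-takes-all, which stands in for the learned encoder–decoder. All names are hypothetical.

```python
INF = float("inf")


def plane_sweep_cost_volume(ref, src, max_disp, baseline=1):
    """cost[d][x] = |ref[x] - src[x - d * baseline]| for each disparity plane d.

    Samples that fall outside the source scanline get an infinite cost.
    """
    width = len(ref)
    volume = []
    for d in range(max_disp + 1):
        plane = []
        for x in range(width):
            xs = x - d * baseline  # where pixel x would land in the source view
            plane.append(abs(ref[x] - src[xs]) if 0 <= xs < width else INF)
        volume.append(plane)
    return volume


def combine_and_argmin(volumes):
    """Average the per-view cost volumes and pick the cheapest disparity per pixel."""
    n_disp = len(volumes[0])
    width = len(volumes[0][0])
    disparity = []
    for x in range(width):
        best_d, best_cost = 0, INF
        for d in range(n_disp):
            cost = sum(v[d][x] for v in volumes) / len(volumes)
            if cost < best_cost:
                best_d, best_cost = d, cost
        disparity.append(best_d)
    return disparity


# Reference scanline and two source views of a fronto-parallel scene at disparity 2.
ref = [5, 9, 2, 7, 4, 8, 1, 6]
src_near = [2, 7, 4, 8, 1, 6, 0, 0]  # baseline 1: ref shifted by 2 pixels
src_far = [4, 8, 1, 6, 0, 0, 0, 0]   # baseline 2: ref shifted by 4 pixels

vol_near = plane_sweep_cost_volume(ref, src_near, max_disp=3, baseline=1)
vol_far = plane_sweep_cost_volume(ref, src_far, max_disp=3, baseline=2)
single_view = combine_and_argmin([vol_near])
multi_view = combine_and_argmin([vol_near, vol_far])
```

Pixels near the left border have no valid sample at the true disparity, so their estimates are unreliable in this sketch; a practical system would mask or pad them.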


Sensors ◽  
2021 ◽  
Vol 21 (14) ◽  
pp. 4892
Author(s):  
Anh Minh Truong ◽  
Wilfried Philips ◽  
Peter Veelaert

Depth sensing has improved rapidly in recent years, which allows structural information to be utilized in various applications, such as virtual reality, scene and object recognition, view synthesis, and 3D reconstruction. Due to the limitations of the current generation of depth sensors, the resolution of depth maps is often still much lower than the resolution of color images. This hinders applications such as view synthesis or 3D reconstruction from providing high-quality results. Therefore, super-resolution, which upscales depth maps while retaining sharpness, has recently drawn much attention in the deep learning community. However, state-of-the-art deep learning methods are typically designed and trained to handle a fixed set of integer scale factors. Moreover, the raw depth map collected by the depth sensor usually has many missing or misestimated depth values along the edges and corners of observed objects. In this work, we propose a novel deep learning network for both depth completion and depth super-resolution with arbitrary scale factors. The experimental results on the Middlebury stereo, NYUv2, and Matterport3D datasets demonstrate that the proposed method can outperform state-of-the-art methods.
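To make the two tasks concrete, here is a non-learned baseline sketched in Python: bilinear resampling at an arbitrary, possibly non-integer scale factor that skips missing depth samples. It is only an illustration of the problem setting, not the proposed network; encoding missing depth as 0.0 and the function name are assumptions.

```python
def resize_depth(depth, scale, missing=0.0):
    """Bilinearly resample a depth map by any positive scale, ignoring holes.

    Each output value is a weighted average of the valid neighbours only;
    if all neighbours with nonzero weight are missing, the output is missing.
    """
    height, width = len(depth), len(depth[0])
    out_h = max(1, round(height * scale))
    out_w = max(1, round(width * scale))
    out = []
    for i in range(out_h):
        # Map the output pixel centre back into input coordinates, then clamp.
        y = min(max((i + 0.5) / scale - 0.5, 0.0), height - 1)
        y0 = int(y)
        y1 = min(y0 + 1, height - 1)
        wy = y - y0
        row = []
        for j in range(out_w):
            x = min(max((j + 0.5) / scale - 0.5, 0.0), width - 1)
            x0 = int(x)
            x1 = min(x0 + 1, width - 1)
            wx = x - x0
            num = den = 0.0
            for yy, w_row in ((y0, 1.0 - wy), (y1, wy)):
                for xx, w_col in ((x0, 1.0 - wx), (x1, wx)):
                    value = depth[yy][xx]
                    weight = w_row * w_col
                    if value != missing and weight > 0.0:
                        num += weight * value
                        den += weight
            row.append(num / den if den > 0.0 else missing)
        out.append(row)
    return out


# Upscale a 2 x 2 depth map with a missing right column by a factor of 2.5.
low_res = [[2.0, 0.0], [2.0, 0.0]]
high_res = resize_depth(low_res, scale=2.5)
```

Interpolation of this kind blurs depth discontinuities and cannot recover large holes, which is part of the motivation for learned completion and super-resolution.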

