Predicting the Quality of View Synthesis With Color-Depth Image Fusion

Author(s):  
Leida Li ◽  
Yipo Huang ◽  
Jinjian Wu ◽  
Ke Gu ◽  
Yuming Fang
Author(s):  
Ting Cao ◽  
Pengjia Tu ◽  
Weixing Wang

Depth images generated by the Kinect sensor always contain vibration and shadow noise, which limits their use. In this research, a method based on image fusion and fractional differentials is proposed for vibration filtering and shadow detection. First, a pixel-level image fusion method is put forward to filter the vibration noise. This method selects the best-quality value for every pixel from the depth image sequence. Second, an improved operator based on the fractional differential is studied to extract the shadow noise; it significantly enhances the boundaries of shadow regions, which makes shadow detection effective. Finally, a comparison is made with other traditional and state-of-the-art methods, and the experimental results indicate that the proposed method filters out the vibration and shadow noise effectively, as evaluated with the [Formula: see text]-measure system.
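
As a rough illustration of pixel-level fusion over a depth image sequence, the sketch below takes the median of the valid readings at each pixel. The median rule, the function name `fuse_depth_frames`, and the use of 0 as the missing-depth marker are all assumptions made for this example; the abstract does not state the actual fusion rule.

```python
from statistics import median


def fuse_depth_frames(frames, invalid=0):
    """Fuse equally sized depth frames pixel by pixel.

    Assumption: each output pixel is the median of its valid readings
    across the sequence. This is a stand-in rule, not the paper's method.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    fused = [[invalid] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Keep only readings that are not flagged as missing
            valid = [f[r][c] for f in frames if f[r][c] != invalid]
            if valid:
                fused[r][c] = median(valid)
    return fused


# Three hypothetical 2x2 depth frames (values in millimetres, 0 = missing)
frames = [
    [[100, 0], [205, 300]],
    [[102, 150], [0, 300]],
    [[101, 152], [200, 0]],
]
fused = fuse_depth_frames(frames)
```

A per-pixel median suppresses frame-to-frame jitter and fills pixels that are missing in only some frames; a pixel that is missing in every frame stays marked as invalid.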


Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 863
Author(s):  
Vidas Raudonis ◽  
Agne Paulauskaite-Taraseviciene ◽  
Kristina Sutiene

Background: Cell detection and counting is of essential importance in evaluating the quality of early-stage embryos. Full automation of this process remains a challenging task due to differences in cell size and shape, incomplete cell boundaries, and partially or fully overlapping cells. Moreover, the algorithm to be developed should process a large amount of image data of varying quality in a reasonable amount of time. Methods: A multi-focus image fusion approach based on the deep learning U-Net architecture is proposed, which reduces the amount of data by up to 7 times without losing the spectral information required for embryo enhancement in the microscopic image. Results: The experiment includes visual and quantitative analysis based on image similarity metrics and processing times, which are compared with the results achieved by two well-known techniques: Inverse Laplacian Pyramid Transform and Enhanced Correlation Coefficient Maximization. Conclusion: Comparatively, the image fusion time is substantially improved for different image resolutions, while the high quality of the fused image is maintained.


2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Seyed Muhammad Hossein Mousavi ◽  
S. Younes Mirinezhad

This study presents a new color-depth face database gathered from Iranian subjects of different genders and age ranges. Suitable databases make it possible to validate and assess available methods in different research fields. This database has applications in fields such as face recognition, age estimation, facial expression recognition, and facial micro-expression recognition. Image databases are mostly large, depending on their size and resolution. Color images usually consist of three channels, namely red, green, and blue. In the last decade, however, another image type has emerged, the depth image. Depth images are used to calculate the range and distance between objects and the sensor. Range data can be acquired in different ways depending on the depth sensor technology. The Kinect sensor version 2 is capable of acquiring color and depth data simultaneously. Facial expression recognition is an important field in image processing, with uses ranging from animation to psychology. Currently, only a few color-depth (RGB-D) facial micro-expression recognition databases exist. Adding depth data to color data increases the accuracy of the final recognition. Due to the shortage of color-depth facial expression databases and some weaknesses in the available ones, a new and almost perfect RGB-D face database covering the Middle-Eastern face type is presented in this paper. In the validation section, the database is compared with some well-known benchmark face databases. For evaluation, Histogram of Oriented Gradients features are extracted, and classification algorithms such as Support Vector Machine, Multi-Layer Neural Network, and a deep learning method, the Convolutional Neural Network, are employed. The results are promising.


2021 ◽  
Vol 11 (6) ◽  
pp. 2666
Author(s):  
Hafiz Muhammad Usama Hassan Alvi ◽  
Muhammad Shahid Farid ◽  
Muhammad Hassan Khan ◽  
Marcin Grzegorzek

Emerging 3D-related technologies such as augmented reality, virtual reality, mixed reality, and stereoscopy have seen remarkable growth due to their numerous applications in the entertainment, gaming, and electromedical industries. In particular, 3D television (3DTV) and free-viewpoint television (FTV) enhance viewers' television experience by providing immersion. They need an infinite number of views to provide full parallax to the viewer, which is not practical due to various financial and technological constraints. Therefore, novel 3D views are generated from a set of available views and their depth maps using depth-image-based rendering (DIBR) techniques. The quality of a DIBR-synthesized image may be compromised for several reasons, e.g., inaccurate depth estimation. Since depth is important in this application, inaccuracies in depth maps lead to different textural and structural distortions that degrade the quality of the generated image and result in a poor quality of experience (QoE). Therefore, quality assessment of DIBR-generated images is essential to guarantee a satisfactory QoE. This paper aims at estimating the quality of DIBR-synthesized images and proposes a novel 3D objective image quality metric. The proposed algorithm measures textural and structural distortions in the DIBR image by exploiting the contrast sensitivity and the Hausdorff distance, respectively. The two measures are combined to estimate an overall quality score. The experimental evaluations performed on the benchmark MCL-3D dataset show that the proposed metric is reliable and accurate, and performs better than existing 2D and 3D quality assessment metrics.
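
To make the structural measure more concrete, the sketch below computes the standard symmetric Hausdorff distance between two small sets of 2D points. Treating the points as edge pixels of a reference and a synthesized image is an assumption for illustration only; the abstract does not say which features are compared or how the result is combined with the contrast sensitivity term. The names `reference_edges` and `synthesized_edges` are hypothetical.

```python
from math import dist


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Hypothetical edge-pixel coordinates (row, column)
reference_edges = [(0, 0), (0, 1), (0, 2)]
synthesized_edges = [(0, 0), (0, 1), (3, 2)]  # last point is displaced

score = hausdorff(reference_edges, synthesized_edges)
```

The distance is driven by the single worst-matched point, so one badly displaced contour pixel is enough to raise the score. This brute-force version costs O(|a|·|b|) distance evaluations and is only suitable for small point sets.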


2009 ◽  
Vol 45 (1) ◽  
pp. 30
Author(s):  
S.-X. Liu ◽  
P. An ◽  
Z.-Y. Zhang ◽  
Q. Zhang ◽  
L.-Q. Shen ◽  
...  

2011 ◽  
Vol 1 (3) ◽  
Author(s):  
T. Sumathi ◽  
M. Hemalatha

Image fusion is the method of combining relevant information from two or more images into a single image that is more informative than the initial inputs. Methods for fusion include the discrete wavelet transform, the Laplacian pyramid-based transform, and the curvelet-based transform. These methods demonstrate the best performance in spatial and spectral quality of the fused image compared to other spatial methods of fusion. In particular, the wavelet transform has good time-frequency characteristics. However, this characteristic cannot be extended easily to two or more dimensions, because a separable wavelet built by spanning a one-dimensional wavelet has limited directivity. This paper introduces the second-generation curvelet transform and uses it to fuse images. This method is compared against the others previously described to show that useful information can be extracted from the source images, resulting in fused images that offer clear, detailed information.


Nutrients ◽  
2018 ◽  
Vol 10 (12) ◽  
pp. 2005
Author(s):  
Frank Lo ◽  
Yingnan Sun ◽  
Jianing Qiu ◽  
Benny Lo

An objective dietary assessment system can help users to understand their dietary behavior and enable targeted interventions to address underlying health problems. To accurately quantify dietary intake, measurement of the portion size or food volume is required. For volume estimation, previous research studies mostly focused on using model-based or stereo-based approaches which rely on manual intervention or require users to capture multiple frames from different viewing angles which can be tedious. In this paper, a view synthesis approach based on deep learning is proposed to reconstruct 3D point clouds of food items and estimate the volume from a single depth image. A distinct neural network is designed to use a depth image from one viewing angle to predict another depth image captured from the corresponding opposite viewing angle. The whole 3D point cloud map is then reconstructed by fusing the initial data points with the synthesized points of the object items through the proposed point cloud completion and Iterative Closest Point (ICP) algorithms. Furthermore, a database with depth images of food object items captured from different viewing angles is constructed with image rendering and used to validate the proposed neural network. The methodology is then evaluated by comparing the volume estimated by the synthesized 3D point cloud with the ground truth volume of the object items.


Author(s):  
MICHAEL SCHMEING ◽  
XIAOYI JIANG

In this paper, we address the disocclusion problem that occurs during view synthesis in depth image-based rendering (DIBR). We propose a method that can recover faithful texture information for disoccluded areas. In contrast to common disocclusion filling methods, which usually work frame-by-frame, our algorithm can take information from temporally neighboring frames into account. This way, we are able to reconstruct a faithful filling for the disocclusion regions and not just an approximate or plausible one. Our method avoids artifacts that occur with common approaches and can additionally reduce compression artifacts at object boundaries.
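
The idea of borrowing texture from temporally neighboring frames can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it assumes a static background and frames that are already pixel-aligned, marks disoccluded pixels with `None`, and copies each missing pixel from the temporally closest frame in which it is visible. The function name `fill_disocclusions` is hypothetical.

```python
def fill_disocclusions(frames, t, hole=None):
    """Fill hole pixels in frames[t] from the temporally nearest frame.

    Assumptions: static background and pixel-aligned frames, so the same
    (row, column) position shows the same scene point in every frame.
    """
    current = frames[t]
    filled = [row[:] for row in current]
    # Visit the other frames in order of increasing temporal distance
    order = sorted((i for i in range(len(frames)) if i != t),
                   key=lambda i: abs(i - t))
    for r, row in enumerate(current):
        for c, value in enumerate(row):
            if value != hole:
                continue
            for i in order:
                candidate = frames[i][r][c]
                if candidate != hole:
                    filled[r][c] = candidate
                    break
    return filled


# Three hypothetical 1x3 synthesized frames; None marks disoccluded pixels
frames = [
    [[10, 20, None]],
    [[10, None, None]],
    [[None, 21, 30]],
]
restored = fill_disocclusions(frames, 1)
```

Pixels that are disoccluded in every frame remain holes and would still need a spatial filling method.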

