Posteriori Reconstruction and Denoising for Path Tracing

2021 ◽  
Author(s):  
◽  
Ping Liu

Path tracing is a well-established technique for photo-realistic rendering that simulates light transport along paths. The method has been widely adopted in the visual effects industry to generate high-quality synthetic images, which requires a large number of samples and a long computation time. Because of the high cost of producing the final output, intermediate previsualization of path tracing is in high demand from production artists who need to detect errors at an early stage of rendering. However, visualizing intermediate results of path tracing is also challenging, since an image synthesized with limited samples or improper sampling usually suffers from distracting noise. The ideal solution would be to provide a highly plausible intermediate result early in rendering, using a small fraction of the samples, and to refine it in an a posteriori manner to approximate the ground truth. In this thesis, this issue is addressed by providing several efficient a posteriori reconstruction and denoising techniques for previsualization of path tracing. First, we address the recovery of missing values by constructing low-rank matrices for incomplete images, covering missing-pixel, missing-sub-pixel, and multi-frame scenarios. A novel approach uses a convolutional neural network for fast precompletion to initialize missing values, followed by weighted nuclear norm minimization with a parameter adjustment strategy that efficiently recovers missing values even in high-frequency details. The results show better visual quality than recent methods, including compressed-sensing-based reconstruction. Furthermore, to reduce the computational budget of this approach, we extend the method with a block Toeplitz structure that forms a low-rank matrix for pixel recovery, and a tensor structure for multi-frame recovery. In this manner, the reconstruction time can be reduced significantly. Moreover, by exploiting the temporal coherence of multiple frames through the tensor structure, we demonstrate improved overall recovery quality compared to our previous approach. Our recovery methods provide satisfying solutions but, compared with denoising, still require considerable rendering time in the prior stage. Finally, we introduce a novel convolutional-neural-network-based denoising filter that treats the problem as conventional denoising of rendered images. Unlike a plain CNN that applies a fixed kernel size in each layer, we propose a multi-scale residual network with various auxiliary scene features to build an efficient denoising filter for path tracing. Our experimental results show denoising quality on par with or better than state-of-the-art path tracing denoisers.
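The core reconstruction step named in the abstract, weighted nuclear norm minimization, can be sketched in a few lines of NumPy. This is a minimal illustration only: it assumes a per-patch low-rank matrix, uses the common WNNM reweighting heuristic w_i ∝ 1/(σ_i + ε), and does not reproduce the thesis's patch construction, CNN precompletion, or parameter adjustment strategy.

```python
import numpy as np

def wnnm_step(X, c=1.0, eps=1e-8):
    """One weighted nuclear norm minimization step: shrink each singular value
    by a weight inversely proportional to its magnitude, so large (structure-
    carrying) singular values are penalized less than small (noisy) ones."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    w = c * np.sqrt(X.size) / (s + eps)        # common WNNM reweighting heuristic
    return (U * np.maximum(s - w, 0.0)) @ Vt

def recover(observed, mask, n_iters=50):
    """Fill the missing entries of a low-rank patch matrix. `observed` holds
    precompleted values (e.g. from a CNN), `mask` is True where the renderer
    actually produced a sample."""
    X = observed.copy()
    for _ in range(n_iters):
        X = wnnm_step(X)
        X[mask] = observed[mask]               # keep the known samples fixed
    return X
```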



Author(s):  
Carla Sendra-Balcells ◽  
Ricardo Salvador ◽  
Juan B. Pedro ◽  
M C Biagi ◽  
Charlène Aubinet ◽  
...  

Abstract: The segmentation of structural MRI data is an essential step for deriving geometrical information about brain tissues. One important application is transcranial electrical stimulation (e.g., tDCS), a non-invasive neuromodulatory technique where head modeling is required to determine the electric field (E-field) generated in the cortex in order to predict and optimize its effects. Here we propose a deep learning-based model (StarNEt) to automate white matter (WM) and gray matter (GM) segmentation and compare its performance with FreeSurfer, an established tool. Since a good definition of sulci and gyri in the cortical surface is an important requirement for E-field calculation, StarNEt is specifically designed to output masks at a higher resolution than that of the original input T1w-MRI. StarNEt uses a residual network (ResNet) as the encoder and a fully convolutional neural network with U-net skip connections as the decoder to segment an MRI slice by slice. The slice's vertical location is provided as an extra input. The model was trained on scans from 425 patients in the open-access ADNI+IXI datasets, using FreeSurfer segmentations as ground truth. Model performance was evaluated using the Dice Coefficient (DC) on a separate subset (N=105) of ADNI+IXI and on two extra testing sets not involved in training. In addition, FreeSurfer and StarNEt were compared to manual segmentations of the MRBrainS18 dataset, also unseen by the model. To study performance in real use cases, we first created electrical head models derived from the FreeSurfer and StarNEt segmentations and used them for montage optimization with a common target region using a standard algorithm (Stimweaver), and second, we used StarNEt to successfully segment the brains of minimally conscious state (MCS) patients who had suffered brain trauma, a scenario where FreeSurfer typically fails. Our results indicate that StarNEt matches FreeSurfer performance on the trained tasks while reducing computation time from several hours to a few seconds, with the potential to evolve into an effective technique even when patients present large brain abnormalities.
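A minimal PyTorch sketch of the architecture described above: a ResNet-style encoder, a decoder with U-net skip connections, the slice's vertical location fed in as an extra input, and an output mask upsampled beyond the input resolution. Layer counts, channel widths, and the way the slice position is injected are illustrative assumptions, not the published StarNEt configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyStarNet(nn.Module):
    """Toy encoder-decoder for one MRI slice (input H and W divisible by 4)."""
    def __init__(self, n_classes=3):          # e.g. background / WM / GM
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.enc3 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        # The slice's vertical location enters as an extra channel at the bottleneck.
        self.bottleneck = nn.Sequential(nn.Conv2d(64 + 1, 64, 3, padding=1), nn.ReLU())
        self.dec2 = nn.Sequential(nn.Conv2d(64 + 32, 32, 3, padding=1), nn.ReLU())
        self.dec1 = nn.Sequential(nn.Conv2d(32 + 16, 16, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(16, n_classes, 1)

    def forward(self, x, slice_pos):
        e1 = self.enc1(x)                       # full resolution
        e2 = self.enc2(e1)                      # 1/2 resolution
        e3 = self.enc3(e2)                      # 1/4 resolution
        pos = slice_pos.view(-1, 1, 1, 1).expand(-1, 1, *e3.shape[2:])
        b = self.bottleneck(torch.cat([e3, pos], dim=1))
        # U-net skip connections back up to full resolution.
        d2 = self.dec2(torch.cat([F.interpolate(b, scale_factor=2), e2], dim=1))
        d1 = self.dec1(torch.cat([F.interpolate(d2, scale_factor=2), e1], dim=1))
        # Output mask at 2x the input resolution, mirroring the higher-resolution output.
        return self.head(F.interpolate(d1, scale_factor=2))
```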


Author(s):  
Liang Kim Meng ◽  
Azira Khalil ◽  
Muhamad Hanif Ahmad Nizar ◽  
Maryam Kamarun Nisham ◽  
Belinda Pingguan-Murphy ◽  
...  

Background: Bone Age Assessment (BAA) refers to a clinical procedure that aims to identify a discrepancy between the biological and chronological age of an individual by assessing bone growth. Currently, there are two main methods of performing BAA, known as the Greulich-Pyle and Tanner-Whitehouse techniques. Both involve a manual and qualitative assessment of hand and wrist radiographs, resulting in intra- and inter-operator variability in accuracy and a time-consuming workflow. Automatic segmentation can be applied to the radiographs, providing the physician with a more accurate delineation of the carpal bones and accurate quantitative analysis. Methods: In this study, we propose an image feature extraction technique based on image segmentation with a fully convolutional neural network with an eight-pixel stride (FCN-8). A total of 290 radiographic images of female and male subjects aged 0 to 18 were manually segmented and used to train the FCN-8. Results and Conclusion: The results exhibit a high training accuracy of 99.68% and a loss of 0.008619 over 50 epochs of training. The experiments compared 58 images against gold-standard ground truth images. The accuracy of our fully automated segmentation technique is 0.78 ± 0.06, 1.56 ± 0.30 mm, and 98.02% in terms of Dice Coefficient, Hausdorff Distance, and overall qualitative carpal recognition accuracy, respectively.
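For reference, the two quantitative metrics reported above are typically computed as follows. This is a generic sketch assuming binary masks and isotropic pixels; the exact boundary extraction and pixel spacing used in the study are not specified here.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def dice_coefficient(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two boolean segmentation masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    inter = np.logical_and(pred, truth).sum()
    return 2.0 * inter / (pred.sum() + truth.sum())

def hausdorff_distance(pred, truth, spacing_mm=1.0):
    """Symmetric Hausdorff distance between the foreground pixels of two
    masks, scaled by the pixel spacing to report millimetres."""
    p = np.argwhere(pred)          # (row, col) coordinates of mask pixels
    t = np.argwhere(truth)
    d = max(directed_hausdorff(p, t)[0], directed_hausdorff(t, p)[0])
    return d * spacing_mm
```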


2021 ◽  
Vol 18 (1) ◽  
pp. 172988142199332
Author(s):  
Xintao Ding ◽  
Boquan Li ◽  
Jinbao Wang

Indoor object detection is a demanding and important task for robot applications. Object knowledge, such as two-dimensional (2D) shape and depth information, may be helpful for detection. In this article, we focus on region-based convolutional neural network (CNN) detectors and propose a geometric property-based Faster R-CNN method (GP-Faster) for indoor object detection. GP-Faster incorporates geometric properties into Faster R-CNN to improve detection performance. In detail, we first use mesh grids, formed by the intersections of direct and inverse proportion functions, to generate appropriate anchors for indoor objects. After the anchors are regressed to the regions of interest produced by a region proposal network (RPN-RoIs), we use 2D geometric constraints to refine the RPN-RoIs, where the 2D constraint for each class is a convex hull region enclosing the width and height coordinates of the ground-truth boxes in the training set. Comparison experiments are conducted on two indoor datasets, SUN2012 and NYUv2. Since depth information is available in NYUv2, we add depth constraints to GP-Faster and propose a 3D geometric property-based Faster R-CNN (DGP-Faster) on NYUv2. The experimental results show that both GP-Faster and DGP-Faster improve mean average precision.
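A rough sketch of the two geometric ingredients described above, assuming axis-aligned boxes: anchors taken at the intersections of direct proportion lines h = r*w and inverse proportion curves w*h = s, and a per-class convex hull test on RoI width and height. Function names and data layout are illustrative, not taken from the paper's code.

```python
import numpy as np
from scipy.spatial import Delaunay

def anchor_sizes(aspect_ratios, areas):
    """Anchor (w, h) pairs at the intersections of h = r * w (fixed aspect
    ratio) and w * h = s (fixed area): w = sqrt(s / r), h = sqrt(s * r)."""
    sizes = []
    for r in aspect_ratios:
        for s in areas:
            sizes.append((np.sqrt(s / r), np.sqrt(s * r)))
    return np.array(sizes)

def refine_rois(rois, class_ids, gt_wh_per_class):
    """Keep only RoIs whose (width, height) lies inside the convex hull of
    the ground-truth box sizes observed for that class during training."""
    hulls = {c: Delaunay(np.asarray(wh)) for c, wh in gt_wh_per_class.items()}
    keep = []
    for i, (x1, y1, x2, y2) in enumerate(rois):
        wh = np.array([x2 - x1, y2 - y1])
        if hulls[class_ids[i]].find_simplex(wh) >= 0:   # inside the hull
            keep.append(i)
    return rois[keep]
```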


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Peter M. Maloca ◽  
Philipp L. Müller ◽  
Aaron Y. Lee ◽  
Adnan Tufail ◽  
Konstantinos Balaskas ◽  
...  

Abstract: Machine learning has greatly facilitated the analysis of medical data, while its internal operations usually remain opaque. To better comprehend these opaque procedures, a convolutional neural network for optical coherence tomography image segmentation was enhanced with a Traceable Relevance Explainability (T-REX) technique. The proposed application was based on three components: ground truth generation by multiple graders, calculation of Hamming distances among the graders and the machine learning algorithm, and a smart data visualization (‘neural recording’). An overall average variability of 1.75% between the human graders and the algorithm was found, slightly lower than the 2.02% among the human graders. The ambiguity in the ground truth had a noteworthy impact on the machine learning results, which could be visualized. The convolutional neural network balanced between graders and allowed for modifiable predictions depending on the compartment. Using the proposed T-REX setup, machine learning processes could be rendered more transparent and understandable, possibly leading to optimized applications.
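The variability figures quoted above (1.75% vs. 2.02%) can be understood as normalized Hamming distances between label masks. The snippet below is a simplified illustration of that computation; the exact normalization and pairing used in T-REX may differ.

```python
import numpy as np
from itertools import combinations

def hamming_pct(mask_a, mask_b):
    """Percentage of pixels where two label masks disagree
    (normalized Hamming distance)."""
    return 100.0 * np.mean(mask_a != mask_b)

def pairwise_variability(masks_by_grader):
    """Average pairwise Hamming distance over all grader pairs, e.g. to
    compare inter-grader variability with grader-vs-algorithm variability."""
    pairs = list(combinations(masks_by_grader.keys(), 2))
    return np.mean([hamming_pct(masks_by_grader[a], masks_by_grader[b])
                    for a, b in pairs])
```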


2017 ◽  
Vol 2017 ◽  
pp. 1-11 ◽  
Author(s):  
Siqi Tang ◽  
Zhisong Pan ◽  
Xingyu Zhou

This paper proposes an accurate crowd counting method based on a convolutional neural network and low-rank and sparse structure. To this end, we first propose an effective deep-fusion convolutional neural network to improve density map regression accuracy. Furthermore, we observe that most existing CNN-based crowd counting methods obtain the overall count by directly integrating the estimated density map, which limits counting accuracy. Instead of direct integration, we adopt a regression method based on low-rank and sparse penalties to improve the accuracy of the projection from density map to global count. Experiments demonstrate the importance of this regression process for improving crowd counting performance. The proposed low-rank and sparse based deep-fusion convolutional neural network (LFCNN) outperforms existing crowd counting methods and achieves state-of-the-art performance.
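One plausible reading of the regression step described above, for illustration only: learn a weight map W so that the count is the weighted sum of the density map, with an L1 (sparse) and nuclear-norm (low-rank) penalty applied via proximal steps. The paper's exact formulation, variables, and optimizer may differ.

```python
import numpy as np

def direct_count(density):
    """Baseline used by most prior work: the count is the integral (sum)
    of the estimated density map."""
    return density.sum()

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def svt(W, t):
    """Singular-value thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U * np.maximum(s - t, 0.0)) @ Vt

def fit_projection(density_maps, counts, lam_l1=1e-3, lam_nuc=1e-3,
                   lr=1e-4, n_iters=200):
    """Learn W with count_i ≈ sum(W * D_i), penalizing W with an L1 term
    (sparsity) and a nuclear-norm term (low rank). Applying the two proximal
    operators one after another is a simplifying heuristic."""
    W = np.ones_like(density_maps[0])
    for _ in range(n_iters):
        grad = sum(((W * D).sum() - c) * D for D, c in zip(density_maps, counts))
        W = W - lr * grad
        W = soft_threshold(W, lr * lam_l1)   # sparse proximal step
        W = svt(W, lr * lam_nuc)             # low-rank proximal step
    return W

def predict_count(W, density):
    return (W * density).sum()
```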


Feed-forward neural networks can be trained with a gradient-descent-based backpropagation algorithm, but such algorithms require considerable computation time. Extreme Learning Machines (ELMs) are time-efficient and less complicated than conventional gradient-based algorithms. In previous work, an SRAM-based convolutional neural network using a receptive-field approach was proposed. This neural network was used as an encoder for the ELM algorithm and was implemented on an FPGA. However, it used an inaccurate 3-stage pipelined parallel adder and therefore generated imprecise stimuli for the hidden layer neurons. This paper presents a hardware-level implementation of a precise convolutional neural network for encoding in the ELM algorithm based on the receptive-field approach. In the third stage of the pipelined parallel adder, instead of approximating the output with one 2-input 15-bit adder, one 4-input 14-bit adder is used. In addition, an extra weighted pixel array block is used, which improves the accuracy of generating the 128 weighted pixels. The neural network was simulated using ModelSim-Altera 10.1d, synthesized using Quartus II 13.0 sp1, implemented on a Cyclone V FPGA, and used for pattern recognition applications. Although this design consumes slightly more hardware resources, it is more accurate than previously existing encoders.
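A software-level sketch of the receptive-field ELM encoding idea, for orientation only: the encoder's weights are fixed and random, and only the output weights are solved in closed form, which is what makes ELMs faster to train than backpropagation. The paper's actual contribution is the fixed-point hardware adder design, which is not reproduced here; all sizes and names below are illustrative.

```python
import numpy as np

def make_fields(n_hidden=128, field_size=9, img_shape=(28, 28), seed=0):
    """Fixed random receptive fields: a software stand-in for the hardware
    encoder (each hidden neuron reads one random local patch of the image)."""
    rng = np.random.default_rng(seed)
    H, W = img_shape
    return [(rng.integers(0, H - field_size),
             rng.integers(0, W - field_size),
             rng.standard_normal((field_size, field_size)))
            for _ in range(n_hidden)]

def encode(X, fields, img_shape=(28, 28)):
    """Hidden-layer stimuli: weighted sum over each neuron's receptive field,
    passed through a nonlinearity. These weights are never trained."""
    imgs = X.reshape(-1, *img_shape)
    return np.stack(
        [np.tanh((imgs[:, r:r + w.shape[0], c:c + w.shape[1]] * w).sum(axis=(1, 2)))
         for r, c, w in fields], axis=1)

def elm_fit(X, Y, fields):
    """ELM training: only the output weights are learned, in closed form via
    the Moore-Penrose pseudoinverse (no backpropagation)."""
    return np.linalg.pinv(encode(X, fields)) @ Y

def elm_predict(X, beta, fields):
    return encode(X, fields) @ beta
```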


2019 ◽  
Vol 2019 ◽  
pp. 1-16 ◽  
Author(s):  
Lian Zou ◽  
Shaode Yu ◽  
Tiebao Meng ◽  
Zhicheng Zhang ◽  
Xiaokun Liang ◽  
...  

This study reviews convolutional neural network (CNN) techniques applied to the specific field of mammographic breast cancer diagnosis (MBCD). It aims to provide several clues on how to use CNNs for related tasks. MBCD is a long-standing problem, and numerous computer-aided diagnosis models have been proposed. CNN-based MBCD models can be broadly categorized into three groups. One is to design shallow models, or to modify existing ones, to decrease the time cost and the number of instances required for training; another is to make the best use of a pretrained CNN through transfer learning and fine-tuning; the third is to use CNN models for feature extraction, with the differentiation of malignant lesions from benign ones carried out by machine learning classifiers. This study enrolls peer-reviewed journal publications and presents the technical details and the pros and cons of each model. Furthermore, the findings, challenges, and limitations are summarized, and some clues on future work are given. In conclusion, CNN-based MBCD is at an early stage, and there is still a long way to go in achieving the ultimate goal of using deep learning tools to facilitate clinical practice. This review benefits scientific researchers, industrial engineers, and those who are devoted to intelligent cancer diagnosis.
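As an illustration of the third category (a pretrained CNN as a frozen feature extractor feeding a classical classifier), a minimal sketch follows. The backbone, classifier, and preprocessing are arbitrary choices for the example, not drawn from any specific model in the review.

```python
import torch
import torchvision.models as models
from sklearn.svm import SVC

# Pretrained CNN as a frozen feature extractor (strip the classification head).
backbone = models.resnet18(weights="DEFAULT")       # torchvision >= 0.13
backbone.fc = torch.nn.Identity()
backbone.eval()

@torch.no_grad()
def extract_features(images):
    """images: tensor of shape (N, 3, 224, 224), already normalized."""
    return backbone(images).numpy()

def train_classifier(train_images, labels):
    """Classical classifier separates malignant from benign lesions on top
    of the frozen CNN features."""
    clf = SVC(kernel="rbf")
    clf.fit(extract_features(train_images), labels)
    return clf
```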


2020 ◽  
Vol 10 (2) ◽  
pp. 84 ◽  
Author(s):  
Atif Mehmood ◽  
Muazzam Maqsood ◽  
Muzaffar Bashir ◽  
Yang Shuyuan

Alzheimer’s disease (AD) can permanently damage memory cells, resulting in dementia. Diagnosing Alzheimer’s disease at an early stage remains a challenging task for researchers. For this, machine learning and deep convolutional neural network (CNN) based approaches are readily available to solve various problems related to brain image data analysis. In clinical research, magnetic resonance imaging (MRI) is used to diagnose AD. Accurate classification of dementia stages requires highly discriminative features extracted from MRI images. Recently, advanced deep CNN-based models have demonstrated high accuracy; however, because of the small number of image samples available in the datasets, over-fitting hinders the performance of deep learning approaches. In this research, we developed a Siamese convolutional neural network (SCNN) model inspired by VGG-16 (also called OxfordNet) to classify dementia stages. In our approach, we extend the insufficient and imbalanced data using augmentation approaches. Experiments are performed on the publicly available Open Access Series of Imaging Studies (OASIS) dataset; using the proposed approach, an excellent test accuracy of 99.05% is achieved for the classification of dementia stages. We compared our model with state-of-the-art models and found that the proposed model outperforms them in terms of performance, efficiency, and accuracy.
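A toy PyTorch sketch of the Siamese idea described above: two inputs pass through one shared VGG-style branch, and the joined embeddings are classified into dementia stages. The depth, channel widths, and head are illustrative, not the published SCNN.

```python
import torch
import torch.nn as nn

class SiameseVGGish(nn.Module):
    """Two inputs share one convolutional branch (shared weights); their
    embeddings are concatenated and classified into dementia stages."""
    def __init__(self, n_stages=4):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.classifier = nn.Linear(64 * 2, n_stages)

    def forward(self, x1, x2):
        f1, f2 = self.branch(x1), self.branch(x2)   # same weights for both inputs
        return self.classifier(torch.cat([f1, f2], dim=1))
```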

