scholarly journals cn.FARMS: a latent variable model to detect copy number variations in microarray data with a low false discovery rate

2011 ◽  
Vol 39 (12) ◽  
pp. e79-e79 ◽  
Author(s):  
D.-A. Clevert ◽  
A. Mitterecker ◽  
A. Mayr ◽  
G. Klambauer ◽  
M. Tuefferd ◽  
...  
2017 ◽  
Author(s):  
Louis J. Dijkstra ◽  
Johannes Köster ◽  
Tobias Marschall ◽  
Alexander Schönhuth

AbstractCancer is a genetic disorder in the first place. Therefore, next-generation sequencing (NGS) based discovery of somatically acquired genetic variants has gained widespread attention. Computational prediction of somatic variants, however, is affected by a variety of confounding factors. In addition to the uncertainties that one commonly encounters also in germline variation prediction, such as misplaced and/or inaccurate read alignments, cancer heterogeneity and impure samples significantly add to the issues. Overall, this hampers state-of-the-art indel discovery tools to discover somatic indels at operable performance rates, although they perform excellently when calling germline indels. While affecting all size ranges, both common and cancer-specific problems interfere in particularly unfavorable ways in the prediction of somatic midsize (30-150 bp) insertions and deletions.Here, we present a latent variable model that can take the major confounding factors and uncertainties into a unifying account. Using this modeling framework, we first demonstrate how to efficiently compute the probability for a (putative) indel to be somatic, thereby resolving a principled computational runtime bottleneck in Bayesian uncertainty quantification. Second, we show how to reliably estimate the allele frequencies for a given list of indels. Third, we also present an intuitive and effective way to control the false discovery rate, an issue in genetic variant discovery that has been found notoriously hard to deal with. As a tool that implements all methodology developed, we present PROSIC (PROcessing Somatic Indel Calls). PROSIC achieves significant improvements in particular in terms of recall when applied to deletion call sheets, as provided by prevalent state-of-the-art tools, in comparison to their integrated somatic indel calling routines.The software is publicly available at https://prosic.github.io and can be easily installed via https://bioconda.github.io.


2012 ◽  
Vol 40 (9) ◽  
pp. e69-e69 ◽  
Author(s):  
Günter Klambauer ◽  
Karin Schwarzbauer ◽  
Andreas Mayr ◽  
Djork-Arné Clevert ◽  
Andreas Mitterecker ◽  
...  

2021 ◽  
Vol 421 ◽  
pp. 244-259
Author(s):  
Hao Xiong ◽  
Yuan Yan Tang ◽  
Fionn Murtagh ◽  
Leszek Rutkowski ◽  
Shlomo Berkovsky

Energies ◽  
2021 ◽  
Vol 14 (11) ◽  
pp. 3137
Author(s):  
Amine Tadjer ◽  
Reider B. Bratvold ◽  
Remus G. Hanea

Production forecasting is the basis for decision making in the oil and gas industry, and can be quite challenging, especially in terms of complex geological modeling of the subsurface. To help solve this problem, assisted history matching built on ensemble-based analysis such as the ensemble smoother and ensemble Kalman filter is useful in estimating models that preserve geological realism and have predictive capabilities. These methods tend, however, to be computationally demanding, as they require a large ensemble size for stable convergence. In this paper, we propose a novel method of uncertainty quantification and reservoir model calibration with much-reduced computation time. This approach is based on a sequential combination of nonlinear dimensionality reduction techniques: t-distributed stochastic neighbor embedding or the Gaussian process latent variable model and clustering K-means, along with the data assimilation method ensemble smoother with multiple data assimilation. The cluster analysis with t-distributed stochastic neighbor embedding and Gaussian process latent variable model is used to reduce the number of initial geostatistical realizations and select a set of optimal reservoir models that have similar production performance to the reference model. We then apply ensemble smoother with multiple data assimilation for providing reliable assimilation results. Experimental results based on the Brugge field case data verify the efficiency of the proposed approach.


2021 ◽  
Vol 11 (2) ◽  
pp. 624
Author(s):  
In-su Jo ◽  
Dong-bin Choi ◽  
Young B. Park

Chinese characters in ancient books have many corrupted characters, and there are cases in which objects are mixed in the process of extracting the characters into images. To use this incomplete image as accurate data, we use image completion technology, which removes unnecessary objects and restores corrupted images. In this paper, we propose a variational autoencoder with classification (VAE-C) model. This model is characterized by using classification areas and a class activation map (CAM). Through the classification area, the data distribution is disentangled, and then the node to be adjusted is tracked using CAM. Through the latent variable, with which the determined node value is reduced, an image from which unnecessary objects have been removed is created. The VAE-C model can be utilized not only to eliminate unnecessary objects but also to restore corrupted images. By comparing the performance of removing unnecessary objects with mask regions with convolutional neural networks (Mask R-CNN), one of the prevalent object detection technologies, and also comparing the image restoration performance with the partial convolution model (PConv) and the gated convolution model (GConv), which are image inpainting technologies, our model is proven to perform excellently in terms of removing objects and restoring corrupted areas.


Sign in / Sign up

Export Citation Format

Share Document