cn.FARMS: a latent variable model to detect copy number variations in microarray data with a low false discovery rate

Cancer is first and foremost a genetic disorder, so next-generation sequencing (NGS) based discovery of somatically acquired genetic variants has gained widespread attention. Computational prediction of somatic variants, however, is affected by a variety of confounding factors. In addition to the uncertainties commonly encountered in germline variant prediction, such as misplaced or inaccurate read alignments, cancer heterogeneity and impure samples add significantly to the difficulty. Overall, this prevents state-of-the-art indel discovery tools from discovering somatic indels at operable performance rates, although they perform excellently when calling germline indels. While all size ranges are affected, common and cancer-specific problems interfere in particularly unfavorable ways in the prediction of somatic midsize (30-150 bp) insertions and deletions. Here, we present a latent variable model that takes the major confounding factors and uncertainties into account in a unified way. Using this modeling framework, we first demonstrate how to efficiently compute the probability that a (putative) indel is somatic, thereby resolving a fundamental runtime bottleneck in Bayesian uncertainty quantification. Second, we show how to reliably estimate the allele frequencies for a given list of indels. Third, we present an intuitive and effective way to control the false discovery rate, an issue in genetic variant discovery that has proven notoriously hard to deal with. As a tool that implements all of the methodology developed, we present PROSIC (PROcessing Somatic Indel Calls). When applied to deletion call sets provided by prevalent state-of-the-art tools, PROSIC achieves significant improvements, in particular in recall, over those tools' integrated somatic indel calling routines. The software is publicly available at https://prosic.github.io and can be easily installed via https://bioconda.github.io.
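
The abstract above mentions controlling the false discovery rate from per-call probabilities. A minimal sketch of one common way to do this follows, assuming each call already carries a posterior probability of being somatic. The function name `select_calls_at_fdr` and its inputs are hypothetical and are not taken from PROSIC; the tool's actual procedure may differ.

```python
from typing import List, Sequence


def select_calls_at_fdr(posteriors: Sequence[float], alpha: float) -> List[int]:
    """Return indices of the calls to keep so that the expected FDR is <= alpha.

    posteriors[i] is an assumed posterior probability that call i is somatic.
    The expected FDR of a set of calls is the mean of their posterior error
    probabilities (1 - posterior).
    """
    # Visit calls from most to least confident.
    order = sorted(range(len(posteriors)), key=lambda i: posteriors[i], reverse=True)
    kept: List[int] = []
    error_sum = 0.0
    for i in order:
        pep = 1.0 - posteriors[i]  # posterior error probability of this call
        # Stop once adding this call would push the expected FDR above alpha.
        # Error probabilities are visited in increasing order, so the running
        # mean never decreases and stopping here is safe.
        if (error_sum + pep) / (len(kept) + 1) > alpha:
            break
        error_sum += pep
        kept.append(i)
    return kept


# Hypothetical posteriors for four candidate indels.
selected = select_calls_at_fdr([0.99, 0.5, 0.95, 0.2], alpha=0.05)
```

The greedy scan keeps the largest prefix of the confidence-ranked list whose average error probability stays within the threshold, which is why the calls are sorted first.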

cn.MOPS: mixture of Poissons for discovering copy number variations in next-generation sequencing data with a low false discovery rate

Nucleic Acids Research ◽ 10.1093/nar/gks003 ◽ 2012 ◽ Vol 40 (9) ◽ pp. e69-e69 ◽ Cited By ~ 239

Author(s): Günter Klambauer ◽ Karin Schwarzbauer ◽ Andreas Mayr ◽ Djork-Arné Clevert ◽ Andreas Mitterecker ◽ ...

Keyword(s): Next Generation Sequencing ◽ False Discovery Rate ◽ Copy Number ◽ Copy Number Variations ◽ Next Generation Sequencing Data ◽ Next Generation ◽ Sequencing Data ◽ False Discovery ◽ Generation Sequencing

Proposal of Unified Validity Analysis Framework Applying Latent Variable Model

Korean Journal of Sports Science ◽ 10.35159/kjss.2017.08.26.4.1265 ◽ 2017 ◽ Vol 26 (4) ◽ pp. 1265-1280

Author(s): Eun-Chul Seo ◽ Hyun-Kyun Ahn

Keyword(s): Latent Variable ◽ Latent Variable Model ◽ Analysis Framework ◽ Variable Model ◽ Validity Analysis

Using an Integrated Choice and Latent Variable Model to Understand the Impact of 'Professional' Respondents in a Stated Preference Survey

SSRN Electronic Journal ◽ 10.2139/ssrn.3368948 ◽ 2019

Author(s): Erlend Dancke Sandorf ◽ Lars Persson ◽ Thomas Broberg

Keyword(s): Latent Variable ◽ Stated Preference ◽ Latent Variable Model ◽ Variable Model ◽ Stated Preference Survey ◽ The Impact

A diversified shared latent variable model for efficient image characteristics extraction and modelling

Neurocomputing ◽ 10.1016/j.neucom.2020.09.035 ◽ 2021 ◽ Vol 421 ◽ pp. 244-259

Author(s): Hao Xiong ◽ Yuan Yan Tang ◽ Fionn Murtagh ◽ Leszek Rutkowski ◽ Shlomo Berkovsky

Keyword(s): Latent Variable ◽ Latent Variable Model ◽ Variable Model ◽ Image Characteristics

A latent variable model for analyzing mixed longitudinal (k,l)-inflated count and ordinal responses

Journal of Applied Statistics ◽ 10.1080/02664763.2015.1134448 ◽ 2016 ◽ Vol 43 (12) ◽ pp. 2203-2224 ◽ Cited By ~ 5

Author(s): F. Razie ◽ E. Bahrami Samani ◽ M. Ganjali

Keyword(s): Latent Variable ◽ Latent Variable Model ◽ Variable Model

Shared Linear Encoder-based Gaussian Process Latent Variable Model for Visual Classification

2018 ACM Multimedia Conference on Multimedia Conference - MM '18 ◽ 10.1145/3240508.3240520 ◽ 2018 ◽ Cited By ~ 3

Author(s): Jinxing Li ◽ Bob Zhang ◽ Guangming Lu ◽ David Zhang

Keyword(s): Gaussian Process ◽ Latent Variable ◽ Latent Variable Model ◽ Variable Model ◽ Visual Classification ◽ Linear Encoder

Efficient Dimensionality Reduction Methods in Reservoir History Matching

Energies ◽ 10.3390/en14113137 ◽ 2021 ◽ Vol 14 (11) ◽ pp. 3137

Author(s): Amine Tadjer ◽ Reider B. Bratvold ◽ Remus G. Hanea

Keyword(s): Data Assimilation ◽ Dimensionality Reduction ◽ Gaussian Process ◽ Latent Variable ◽ History Matching ◽ Production Performance ◽ Latent Variable Model ◽ Variable Model ◽ Multiple Data ◽ Ensemble Smoother

Production forecasting is the basis for decision making in the oil and gas industry, and it can be quite challenging, especially when the subsurface requires complex geological modeling. To help solve this problem, assisted history matching built on ensemble-based methods such as the ensemble smoother and the ensemble Kalman filter is useful for estimating models that preserve geological realism and have predictive capabilities. These methods tend, however, to be computationally demanding, as they require a large ensemble size for stable convergence. In this paper, we propose a novel method of uncertainty quantification and reservoir model calibration with much-reduced computation time. The approach is based on a sequential combination of a nonlinear dimensionality reduction technique (t-distributed stochastic neighbor embedding or the Gaussian process latent variable model), K-means clustering, and the data assimilation method ensemble smoother with multiple data assimilation. Cluster analysis with t-distributed stochastic neighbor embedding and the Gaussian process latent variable model is used to reduce the number of initial geostatistical realizations and to select a set of optimal reservoir models whose production performance is similar to that of the reference model. We then apply the ensemble smoother with multiple data assimilation to obtain reliable assimilation results. Experimental results based on the Brugge field case data verify the efficiency of the proposed approach.
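
The reduction step described in this abstract can be illustrated roughly as follows. This is a minimal sketch, not the authors' implementation: it assumes a two-dimensional embedding of each realization has already been computed (for example by t-SNE or a GPLVM) and that a scalar production mismatch against the reference model is available per realization. The names `kmeans_labels`, `select_realizations`, `embedding`, and `mismatch` are hypothetical, and the ensemble smoother step is not shown.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def _sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_labels(points: Sequence[Point], k: int, iters: int = 50) -> List[int]:
    """Plain Lloyd's algorithm; centroids start at the first k points (deterministic)."""
    if k < 1 or k > len(points):
        raise ValueError("k must be between 1 and the number of points")
    centroids = [tuple(p) for p in points[:k]]
    labels: List[int] = []
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: _sq_dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels


def select_realizations(
    embedding: Sequence[Point], mismatch: Sequence[float], k: int
) -> List[int]:
    """Return indices of the cluster with the lowest mean mismatch to the reference."""
    labels = kmeans_labels(embedding, k)
    clusters = {c: [i for i, lab in enumerate(labels) if lab == c] for c in set(labels)}
    best = min(
        clusters,
        key=lambda c: sum(mismatch[i] for i in clusters[c]) / len(clusters[c]),
    )
    return clusters[best]


# Hypothetical 2-D embedding of six realizations and their production mismatch.
embedding = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (10.0, 11.0), (1.0, 0.0), (11.0, 10.0)]
mismatch = [5.0, 1.0, 6.0, 2.0, 4.0, 1.5]
selected = select_realizations(embedding, mismatch, k=2)
```

Keeping a whole cluster, rather than a single best realization, is one way to retain some ensemble spread for a later data assimilation step; that selection criterion is an assumption of this sketch.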

Chinese Character Image Completion Using a Generative Latent Variable Model

Applied Sciences ◽ 10.3390/app11020624 ◽ 2021 ◽ Vol 11 (2) ◽ pp. 624

Author(s): In-su Jo ◽ Dong-bin Choi ◽ Young B. Park

Keyword(s): Latent Variable ◽ Chinese Character ◽ Image Inpainting ◽ Latent Variable Model ◽ Chinese Characters ◽ Image Completion ◽ Variable Model ◽ Variational Autoencoder ◽ Convolution Model ◽ Detection Technologies

Many Chinese characters in ancient books are corrupted, and in some cases extraneous objects are mixed in when the characters are extracted into images. To use such incomplete images as accurate data, we apply image completion technology, which removes unnecessary objects and restores corrupted images. In this paper, we propose a variational autoencoder with classification (VAE-C) model. The model is characterized by its use of classification areas and a class activation map (CAM). The classification area disentangles the data distribution, and CAM is then used to track the node to be adjusted. By reducing the value of the identified node in the latent variable, the model generates an image from which unnecessary objects have been removed. The VAE-C model can be used not only to eliminate unnecessary objects but also to restore corrupted images. We compare its object-removal performance with that of mask regions with convolutional neural networks (Mask R-CNN), one of the prevalent object detection technologies, and its image restoration performance with that of the partial convolution model (PConv) and the gated convolution model (GConv), which are image inpainting technologies. In these comparisons, our model performs excellently at removing objects and restoring corrupted areas.
