Effective QTL Discovery Incorporating Genomic Annotations

2015 ◽  
Author(s):  
Xiaoquan Wen

Mapping molecular QTLs has emerged as an important tool for understanding the genetic basis of cell functions. With the increasing availability of functional genomic data, it is natural to incorporate genomic annotations into QTL discovery. In this paper, we describe a novel method, named TORUS, for integrative QTL discovery. Using hierarchical modeling, our approach embeds a rigorous enrichment analysis to quantify the enrichment level of each annotation in target QTLs. This enrichment information is then used to identify QTLs by up-weighting the genetic variants with relevant annotations using a Bayesian false discovery rate control procedure. Our proposed method only requires summary-level statistics and is highly efficient computationally: it runs a few hundred times faster than the current gold-standard QTL discovery approach that relies on permutations. Through simulation studies, we demonstrate that the proposed method performs accurate enrichment analysis and controls the desired type I error rate while greatly improving the power of QTL discovery when incorporating informative annotations. Finally, we analyze the recently released expression-genotype data from 44 human tissues generated by the GTEx project. By integrating the simple annotation of SNP distance to transcription start sites, we discover more genes that harbor expression-associated SNPs in all 44 tissues, with an average increase of 1,485 genes.
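
To make the mechanics concrete, here is a minimal sketch of the two-step idea — annotation-dependent priors feeding a Bayesian FDR procedure — in Python. It is not the TORUS implementation: the logistic prior, the coefficients `alpha0`/`alpha1`, and the toy Bayes factors are all illustrative assumptions.

```python
import numpy as np

def bayesian_fdr(posterior, alpha=0.05):
    """Bayesian FDR control: admit variants in order of posterior probability
    while the running mean of (1 - posterior) stays at or below alpha."""
    order = np.argsort(posterior)[::-1]
    running_fdr = np.cumsum(1.0 - posterior[order]) / np.arange(1, posterior.size + 1)
    n_hits = np.searchsorted(running_fdr, alpha, side="right")
    hits = np.zeros(posterior.size, dtype=bool)
    hits[order[:n_hits]] = True
    return hits

rng = np.random.default_rng(0)
n = 10_000
near_tss = rng.random(n) < 0.2                    # toy annotation: SNP near a TSS
alpha0, alpha1 = -4.0, 2.0                        # assumed intercept / enrichment log-odds
prior = 1.0 / (1.0 + np.exp(-(alpha0 + alpha1 * near_tss)))
causal = rng.random(n) < prior
bf = np.where(causal, rng.lognormal(4.0, 1.5, n), 1.0)   # toy Bayes factors
posterior = prior * bf / (prior * bf + 1.0 - prior)
print("QTL discoveries:", bayesian_fdr(posterior).sum())
```

In TORUS itself the enrichment coefficients are estimated from the data by hierarchical modeling rather than fixed in advance, which is exactly what lets informative annotations up-weight the right variants.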

Author(s):  
Zaheer Ahmed ◽  
Alberto Cassese ◽  
Gerard van Breukelen ◽  
Jan Schepers

Abstract We present a novel method, REMAXINT, that captures the gist of two-way interaction in row by column (i.e., two-mode) data, with one observation per cell. REMAXINT is a probabilistic two-mode clustering model that yields two-mode partitions with maximal interaction between row and column clusters. For estimation of the parameters of REMAXINT, we maximize a conditional classification likelihood in which the random row (or column) main effects are conditioned out. For testing the null hypothesis of no interaction between row and column clusters, we propose a max-F test statistic and discuss its properties. We develop a Monte Carlo approach to obtain its sampling distribution under the null hypothesis. We evaluate the performance of the method through simulation studies. Specifically, for selected values of data size and (true) numbers of clusters, we obtain critical values of the max-F statistic, determine the empirical Type I error rate of the proposed inferential procedure and study its power to reject the null hypothesis. Next, we show that the novel method is useful in a variety of applications by presenting two empirical case studies and end with some concluding remarks.
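
A rough Monte Carlo re-creation of the testing logic (not REMAXINT's conditional classification-likelihood estimation) might look as follows in Python; the random-restart search over partitions, the toy data, and the two-way ANOVA F formula for the cluster blocks are all simplifying assumptions.

```python
import numpy as np

def interaction_F(X, r, c, P, Q):
    """Two-way ANOVA F for the row-cluster x column-cluster interaction;
    the original cells act as replicates within each cluster block."""
    grand = X.mean()
    tot = np.zeros((P, Q))
    cnt = np.zeros((P, Q))
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            tot[r[i], c[j]] += X[i, j]
            cnt[r[i], c[j]] += 1
    block = tot / cnt
    row_m = tot.sum(axis=1) / cnt.sum(axis=1)   # row-cluster means
    col_m = tot.sum(axis=0) / cnt.sum(axis=0)   # column-cluster means
    ss_int = (cnt * (block - row_m[:, None] - col_m[None, :] + grand) ** 2).sum()
    ss_err = sum((X[i, j] - block[r[i], c[j]]) ** 2
                 for i in range(X.shape[0]) for j in range(X.shape[1]))
    df_int, df_err = (P - 1) * (Q - 1), X.size - P * Q
    return (ss_int / df_int) / (ss_err / df_err)

def max_F(X, P, Q, restarts=200, rng=None):
    """Crude stand-in for REMAXINT's search: maximize the interaction F
    over random two-mode partitions."""
    rng = rng or np.random.default_rng()
    best = -np.inf
    for _ in range(restarts):
        r = rng.integers(P, size=X.shape[0])
        c = rng.integers(Q, size=X.shape[1])
        if np.unique(r).size < P or np.unique(c).size < Q:
            continue
        best = max(best, interaction_F(X, r, c, P, Q))
    return best

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 15))                 # toy data, one observation per cell
null = [max_F(rng.normal(size=(20, 15)), 2, 2, rng=rng) for _ in range(99)]
obs = max_F(X, 2, 2, rng=rng)
print("Monte Carlo p-value:", (1 + sum(f >= obs for f in null)) / 100)
```

Because the statistic is a maximum over partitions, its null distribution is nonstandard, which is why the reference distribution must be simulated rather than read off an F table.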


Author(s):  
Anja C Gumpinger ◽  
Bastian Rieck ◽  
Dominik G Grimm ◽  
Karsten Borgwardt

Abstract Motivation Correlating genetic loci with a disease phenotype is a common approach to improve our understanding of the genetics underlying complex diseases. Standard analyses mostly ignore two aspects, namely genetic heterogeneity and interactions between loci. Genetic heterogeneity, the phenomenon that different genetic markers lead to the same phenotype, promises to increase statistical power by aggregating low-signal variants. Incorporating interactions between loci results in a computational and statistical bottleneck due to the vast amount of candidate interactions. Results We propose a novel method, SiNIMin, that addresses these two aspects by finding pairs of interacting genes that are, upon combination, associated with a phenotype of interest under a model of genetic heterogeneity. We guide the interaction search using biological prior knowledge in the form of protein-protein interaction networks. Our method controls type I error and outperforms state-of-the-art methods with respect to statistical power. Additionally, we find novel associations for multiple A. thaliana phenotypes, and for a study of rare variants in migraine patients. Availability Code is available at https://github.com/BorgwardtLab/SiNIMin. Supplementary information Supplementary data are available at Bioinformatics online.
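
The heterogeneity model can be illustrated with a deliberately simplified sketch: for each edge of a (hypothetical) protein-protein interaction network, variant indicators from the two genes are OR-combined, and the aggregate is tested against the phenotype. This conveys only the flavor of SiNIMin — the actual method uses significant-pattern-mining machinery with proper multiple-testing control — and every name and number below is made up.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Everything here is invented: genotype matrix, gene->SNP map, PPI edges.
rng = np.random.default_rng(2)
n_samples, n_snps = 500, 40
G = rng.random((n_samples, n_snps)) < 0.05       # rare-variant carrier indicators
y = rng.integers(0, 2, size=n_samples)           # case/control phenotype
gene_snps = {f"g{k}": list(range(4 * k, 4 * k + 4)) for k in range(10)}
ppi_edges = [("g0", "g1"), ("g2", "g7"), ("g3", "g4"), ("g5", "g9")]

results = {}
for ga, gb in ppi_edges:
    snps = gene_snps[ga] + gene_snps[gb]
    # Genetic heterogeneity: a sample counts as "exposed" if it carries ANY
    # variant in the gene pair (logical-OR aggregation of low-signal variants).
    exposed = G[:, snps].any(axis=1)
    table = np.array([[np.sum((exposed == e) & (y == c)) for c in (0, 1)]
                      for e in (False, True)])
    results[(ga, gb)] = chi2_contingency(table)[1]

alpha = 0.05 / len(ppi_edges)                    # Bonferroni over tested edges
for edge, p in sorted(results.items(), key=lambda kv: kv[1]):
    print(edge, f"p = {p:.3g}", "*" if p < alpha else "")
```

Restricting the search to network edges is what keeps the candidate set, and hence the multiple-testing burden, manageable compared to testing all gene pairs.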


2016 ◽  
Vol 27 (5) ◽  
pp. 1547-1558 ◽  
Author(s):  
Joseph S Koopmeiners ◽  
Brian P Hobbs

Randomized, placebo-controlled clinical trials are the gold standard for evaluating a novel therapeutic agent. In some instances, it may not be considered ethical or desirable to complete a placebo-controlled clinical trial and, instead, the placebo is replaced by an active comparator with the objective of showing either superiority or non-inferiority to the active comparator. In a non-inferiority trial, the experimental treatment is considered non-inferior if it retains a pre-specified proportion of the effect of the active comparator as represented by the non-inferiority margin. A key assumption required for valid inference in the non-inferiority setting is the constancy assumption, which requires that the effect of the active comparator in the non-inferiority trial is consistent with the effect that was observed in previous trials. It has been shown that violations of the constancy assumption can result in a dramatic increase in the rate of incorrectly concluding non-inferiority in the presence of ineffective or even harmful treatment. In this paper, we illustrate how Bayesian hierarchical modeling can be used to facilitate multi-source smoothing of the data from the current trial with the data from historical studies, enabling direct probabilistic evaluation of the constancy assumption. We then show how this result can be used to adapt the non-inferiority margin when the constancy assumption is violated and present simulation results illustrating that our method controls the type-I error rate when the constancy assumption is violated, while retaining the power of the standard approach when the constancy assumption holds. We illustrate our adaptive procedure using a non-inferiority trial of raltegravir, an antiretroviral drug for the treatment of HIV.
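
A stripped-down version of the idea — smooth the comparator effect across sources, check constancy probabilistically, then set the margin from the smoothed effect — can be sketched with a conjugate normal-normal model. All effect sizes, standard errors, the heterogeneity SD `tau`, and the retention fractions below are illustrative assumptions, not values from the paper or the raltegravir trial.

```python
import numpy as np
from scipy.stats import norm

# Historical comparator-vs-placebo effects (all numbers invented).
hist_eff = np.array([0.62, 0.55, 0.70])
hist_se = np.array([0.10, 0.12, 0.09])
tau = 0.08                                   # assumed between-trial SD

# Precision-weighted smoothing of the historical effect.
w = 1.0 / (hist_se**2 + tau**2)
mu_hist = np.sum(w * hist_eff) / np.sum(w)
se_hist = np.sqrt(1.0 / np.sum(w))

# Comparator effect suggested by the current non-inferiority trial's data.
cur_eff, cur_se = 0.35, 0.11

# Normal-normal shrinkage of the current-trial comparator effect toward the
# smoothed historical value (multi-source smoothing in miniature).
prior_var = se_hist**2 + tau**2
post_var = 1.0 / (1.0 / prior_var + 1.0 / cur_se**2)
post_mean = post_var * (mu_hist / prior_var + cur_eff / cur_se**2)

# Direct probabilistic statement about constancy: how likely is it that the
# comparator retains at least, say, 80% of its historical effect?
p_const = 1.0 - norm.cdf(0.8 * mu_hist, loc=post_mean, scale=np.sqrt(post_var))
print(f"P(comparator retains >= 80% of historical effect) = {p_const:.2f}")

# Adapt the margin to the (possibly discounted) effect, retaining 50% of it.
margin = 0.5 * post_mean
print(f"adapted non-inferiority margin: {margin:.3f}")
```

When the current-trial data contradict the historical record, the posterior mean is pulled down and the margin shrinks with it, which is the mechanism that protects the type-I error rate under constancy violations.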


2020 ◽  
Vol 36 (9) ◽  
pp. 2796-2804 ◽  
Author(s):  
Stephen S Tran ◽  
Qing Zhou ◽  
Xinshu Xiao

Abstract Motivation RNA-sequencing (RNA-seq) enables global identification of RNA-editing sites in biological systems and disease. A salient step in many studies is to identify editing sites that statistically associate with treatment (e.g. case versus control) or covary with biological factors, such as age. However, RNA-seq has technical features that incumbent tests (e.g. t-test and linear regression) do not consider, which can lead to false positives and false negatives. Results In this study, we demonstrate the limitations of currently used tests and introduce the method, RNA-editing tests (REDITs), a suite of tests that employ beta-binomial models to identify differential RNA editing. The tests in REDITs have higher sensitivity than other tests, while also maintaining the type I error (false positive) rate at the nominal level. Applied to the GTEx dataset, we unveil RNA-editing changes associated with age and gender, and differential recoding profiles between brain regions. Availability and implementation REDITs are implemented as functions in R and freely available for download at https://github.com/gxiaolab/REDITs. The repository also provides a code example for leveraging parallelization using multiple cores.
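
The core of the approach — a beta-binomial likelihood-ratio test comparing group-specific against shared editing distributions — can be re-created in a few lines. This is a minimal sketch, not the REDITs interface; the (a, b) parameterization, the Nelder-Mead fit, and the simulated read counts are assumptions.

```python
import numpy as np
from scipy.special import betaln, gammaln
from scipy.optimize import minimize
from scipy.stats import chi2

def bb_loglik(k, n, a, b):
    """Beta-binomial log-likelihood: k edited reads out of n total."""
    return np.sum(gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)
                  + betaln(k + a, n - k + b) - betaln(a, b))

def bb_mle(k, n):
    """Maximize over (a, b) on the log scale so both stay positive."""
    res = minimize(lambda t: -bb_loglik(k, n, *np.exp(t)),
                   x0=np.log([2.0, 2.0]), method="Nelder-Mead")
    return -res.fun

def bb_lrt(k1, n1, k2, n2):
    """LRT: separate beta-binomials per group vs one shared model (2 df)."""
    ll_alt = bb_mle(k1, n1) + bb_mle(k2, n2)
    ll_null = bb_mle(np.concatenate([k1, k2]), np.concatenate([n1, n2]))
    stat = 2.0 * (ll_alt - ll_null)
    return stat, chi2.sf(stat, df=2)

rng = np.random.default_rng(4)
n1, n2 = rng.integers(20, 100, 12), rng.integers(20, 100, 12)
k1 = rng.binomial(n1, rng.beta(8, 12, 12))    # ~40% editing, overdispersed
k2 = rng.binomial(n2, rng.beta(3, 17, 12))    # ~15% editing, overdispersed
stat, p = bb_lrt(k1, n1, k2, n2)
print(f"LRT stat = {stat:.2f}, p = {p:.3g}")
```

The beta-binomial matters because read-level editing ratios are overdispersed relative to a binomial; a plain proportion test treats coverage as exact replication and therefore inflates false positives, which is the failure mode of the incumbent tests described above.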


2000 ◽  
Vol 14 (1) ◽  
pp. 1-10 ◽  
Author(s):  
Joni Kettunen ◽  
Niklas Ravaja ◽  
Liisa Keltikangas-Järvinen

Abstract We examined the use of smoothing to enhance the detection of response coupling from the activity of different response systems. Three different types of moving average smoothers were applied to both simulated interbeat interval (IBI) and electrodermal activity (EDA) time series and to empirical IBI, EDA, and facial electromyography time series. The results indicated that progressive smoothing increased the efficiency of the detection of response coupling but did not increase the probability of Type I error. The power of the smoothing methods depended on the response characteristics. The benefits and use of the smoothing methods to extract information from psychophysiological time series are discussed.
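
The effect is easy to reproduce with a toy example: two noisy series that share a slow component correlate weakly in raw form, and progressively wider moving averages recover the coupling. The boxcar smoother and simulated series below are illustrative stand-ins for the paper's three smoother types and its IBI/EDA recordings.

```python
import numpy as np

def moving_average(x, width):
    """Flat (boxcar) moving-average smoother with same-length output."""
    return np.convolve(x, np.ones(width) / width, mode="same")

# Toy coupled series: a shared slow signal buried in independent noise.
rng = np.random.default_rng(5)
t = np.arange(600)
shared = np.sin(2 * np.pi * t / 120)
ibi = shared + rng.normal(0, 1.5, t.size)   # stands in for interbeat intervals
eda = shared + rng.normal(0, 1.5, t.size)   # stands in for electrodermal activity

for width in (1, 5, 15, 31):
    r = np.corrcoef(moving_average(ibi, width), moving_average(eda, width))[0, 1]
    print(f"window = {width:>2}: r = {r:.2f}")
```

The widening window suppresses the independent noise faster than the shared slow component, so the observed correlation climbs toward the true coupling as smoothing progresses.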


Methodology ◽  
2012 ◽  
Vol 8 (1) ◽  
pp. 23-38 ◽  
Author(s):  
Manuel C. Voelkle ◽  
Patrick E. McKnight

The use of latent curve models (LCMs) has increased almost exponentially during the last decade. Oftentimes, researchers regard LCM as a “new” method to analyze change, with little attention paid to the fact that the technique was originally introduced as an “alternative to standard repeated measures ANOVA and first-order auto-regressive methods” (Meredith & Tisak, 1990, p. 107). In the first part of the paper, this close relationship is reviewed, and it is demonstrated how “traditional” methods, such as repeated measures ANOVA and MANOVA, can be formulated as LCMs. Given that latent curve modeling is essentially a large-sample technique, compared to “traditional” finite-sample approaches, the second part of the paper uses a Monte Carlo simulation to examine to what degree the more flexible LCMs can actually replace some of the older tests. In addition, a structural equation modeling alternative to Mauchly’s (1940) test of sphericity is explored. Although “traditional” methods may be expressed as special cases of more general LCMs, we found that the equivalence holds only asymptotically. For practical purposes, no approach consistently outperformed the alternatives in terms of power and Type I error, so the best method depends on the situation. We provide detailed recommendations on when to use which method.
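
The relationship reviewed in the first part of the paper can be stated compactly. The display below is a generic textbook formulation in our own notation, not the authors': a latent curve model whose special cases include the “traditional” repeated measures models.

```latex
% Latent curve model for T repeated measures y_i = (y_{i1}, \dots, y_{iT})':
y_i = \Lambda \eta_i + \varepsilon_i, \qquad
\eta_i \sim \mathcal{N}(\mu_\eta,\, \Psi), \qquad
\varepsilon_i \sim \mathcal{N}(0,\, \Theta).
% Fixing \Lambda to known basis functions (e.g., a linear-growth basis with
% rows (1, t) for t = 0, \dots, T-1) and constraining \Theta = \sigma^2 I_T
% recovers the mean and covariance structure assumed by repeated measures
% ANOVA under compound symmetry; freeing these matrices gives the more
% flexible LCM.
```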


Methodology ◽  
2015 ◽  
Vol 11 (1) ◽  
pp. 3-12 ◽  
Author(s):  
Jochen Ranger ◽  
Jörg-Tobias Kuhn

In this manuscript, a new approach to the analysis of person fit is presented that is based on the information matrix test of White (1982). This test can be interpreted as a test of trait stability during the measurement situation. The test statistic approximately follows a χ2-distribution. In small samples, the approximation can be improved by a higher-order expansion. The performance of the test is explored in a simulation study. This simulation study suggests that the test adheres well to the nominal Type-I error rate, although it tends to be conservative in very short scales. The power of the test is compared to that of four alternative tests of person fit. This comparison corroborates that the power of the information matrix test is similar to that of the alternative tests. Advantages and areas of application of the information matrix test are discussed.
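
For a scalar trait, the ingredients of such a test are easy to sketch. The following Python fragment computes a naive information-matrix-style person-fit statistic for a 2PL model with known item parameters; the item values and responses are invented, and the simple variance estimate omits the correction for estimating θ (as well as the paper's higher-order small-sample expansion), so it is a didactic approximation only.

```python
import numpy as np
from scipy.stats import chi2

# Invented 2PL item parameters and one person's response pattern.
a = np.array([1.2, 0.8, 1.5, 1.0, 0.9, 1.3, 1.1, 0.7, 1.4, 1.0])  # discriminations
b = np.linspace(-2.0, 2.0, 10)                                     # difficulties
u = np.array([1, 1, 1, 1, 1, 0, 0, 1, 0, 0])                       # observed responses

def irf(theta):
    """2PL item response function."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# ML estimate of theta by Newton-Raphson.
theta = 0.0
for _ in range(50):
    p = irf(theta)
    theta += np.sum(a * (u - p)) / np.sum(a**2 * p * (1 - p))

# Per-item indicator: squared score plus second-derivative contribution.
# Under a correctly specified model with a stable trait, these average to ~0.
p = irf(theta)
d = a**2 * ((u - p) ** 2 - p * (1 - p))
T = d.size * d.mean() ** 2 / d.var(ddof=1)   # naive chi-square(1) statistic
print(f"theta_hat = {theta:.2f}, IM statistic = {T:.2f}, p ~ {chi2.sf(T, 1):.3f}")
```

The intuition is White's: when the model holds, the outer product of the score and the negative Hessian estimate the same information matrix, so their per-item difference hovers around zero; systematic departures signal that the trait was not stable over the items.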


2019 ◽  
Vol 227 (4) ◽  
pp. 261-279 ◽  
Author(s):  
Frank Renkewitz ◽  
Melanie Keiner

Abstract. Publication biases and questionable research practices are assumed to be two of the main causes of low replication rates. Both of these problems lead to severely inflated effect size estimates in meta-analyses. Methodologists have proposed a number of statistical tools to detect such bias in meta-analytic results. We present an evaluation of the performance of six of these tools. To assess the Type I error rate and the statistical power of these methods, we simulated a large variety of literatures that differed with regard to true effect size, heterogeneity, number of available primary studies, and sample sizes of these primary studies; furthermore, simulated studies were subjected to different degrees of publication bias. Our results show that across all simulated conditions, no method consistently outperformed the others. Additionally, all methods performed poorly when true effect sizes were heterogeneous or primary studies had a small chance of being published, irrespective of their results. This suggests that in many actual meta-analyses in psychology, bias will remain undiscovered no matter which detection method is used.
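
As a concrete example of what such a detection tool does, here is one of the simpler ones, Egger's regression test, applied to a simulated literature with selective publication. The simulation settings (true effect 0.2, 20% publication probability for nonsignificant results) are illustrative choices, not the conditions evaluated in the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
true_d, k_target = 0.2, 40
effects, ses = [], []
while len(effects) < k_target:
    n = rng.integers(20, 200)                          # per-group sample size
    se = float(np.sqrt(2 / n + true_d**2 / (4 * n)))   # approx. SE of Cohen's d
    d = rng.normal(true_d, se)
    # Publication bias: nonsignificant results get published with prob 0.2.
    if abs(d / se) >= 1.96 or rng.random() < 0.2:
        effects.append(d)
        ses.append(se)
effects, ses = np.array(effects), np.array(ses)

# Egger's test: regress the standardized effect (d/se) on precision (1/se);
# an intercept away from zero signals funnel-plot asymmetry.
x, y = 1.0 / ses, effects / ses
slope, intercept, _, _, _ = stats.linregress(x, y)
resid = y - (intercept + slope * x)
s2 = (resid**2).sum() / (x.size - 2)
se_int = np.sqrt(s2 * (1 / x.size + x.mean() ** 2 / ((x - x.mean()) ** 2).sum()))
p_int = 2 * stats.t.sf(abs(intercept / se_int), df=x.size - 2)
print(f"Egger intercept = {intercept:.2f}, p = {p_int:.3g}")
```

Under strong selection the funnel plot becomes asymmetric and the intercept drifts from zero; when true effects are heterogeneous, the same intercept test loses power, which is consistent with the pattern the simulation study reports.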

