Coal-Miner: a coalescent-based method for GWA studies of quantitative traits with complex evolutionary origins

2017
Author(s):  
Hussein A. Hejase ◽  
Natalie Vande Pol ◽  
Gregory M. Bonito ◽  
Patrick P. Edger ◽  
Kevin J. Liu

Association mapping (AM) methods are used in genome-wide association (GWA) studies to test for statistically significant associations between genotypic and phenotypic data. The genotypic and phenotypic data share common evolutionary origins – namely, the evolutionary history of the sampled organisms – introducing covariance which must be distinguished from the covariance due to biological function that is of primary interest in GWA studies. A variety of methods have been introduced to perform AM while accounting for sample relatedness. However, the state of the art predominantly relies on the simplifying assumption that sample relatedness is effectively fixed across the genome. In contrast, population genetic theory and empirical studies have shown that sample relatedness can vary greatly across different loci within a genome; this phenomenon – referred to as local genealogical variation – is commonly encountered in many genomic datasets. New AM methods are needed to better account for local variation in sample relatedness within genomes.

We address this gap by introducing Coal-Miner, a new statistical AM method. The Coal-Miner algorithm takes the form of a methodological pipeline. The initial stages of Coal-Miner seek to detect candidate loci, or loci that contain putatively causal markers. Subsequent stages of Coal-Miner perform tests for association using a linear mixed model with multiple effects that account for sample relatedness locally within candidate loci and globally across the entire genome.

Using synthetic and empirical datasets, we compare the statistical power and type I error control of Coal-Miner against state-of-the-art AM methods. The simulation conditions reflect a variety of genomic architectures for complex traits and incorporate a range of evolutionary scenarios, each with different evolutionary processes that can generate local genealogical variation. The empirical benchmarks include a large-scale dataset that appeared in a recent high-profile publication. Across the datasets in our study, we find that Coal-Miner consistently offers comparable or typically better statistical power and type I error control than state-of-the-art methods.

CCS Concepts: Applied computing → Computational genomics; Computational biology; Molecular sequence analysis; Molecular evolution; Systems biology; Bioinformatics; Population genetics

ACM Reference format: Hussein A. Hejase, Natalie Vande Pol, Gregory M. Bonito, Patrick P. Edger, and Kevin J. Liu. 2017. Coal-Miner: a coalescent-based method for GWA studies of quantitative traits with complex evolutionary origins. In Proceedings of ACM BCB, Boston, MA, 2017 (BCB), 10 pages. DOI: 10.475/123 4
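
The abstract does not include code, but a rough sketch of the kind of mixed-model association test described in the final stage may help make it concrete: a marker is tested with generalized least squares under a covariance that combines a genome-wide (global) kinship component and a candidate-locus (local) component. The simulated data, names, and fixed variance components below are assumptions made for illustration; they are not the Coal-Miner implementation, which estimates these quantities from data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, p_global, p_local = 200, 1000, 50

# Simulated genotypes (0/1/2 minor-allele counts) genome-wide and at a
# candidate locus; in a real analysis these come from the sampled organisms.
G_global = rng.integers(0, 3, size=(n, p_global)).astype(float)
G_local = rng.integers(0, 3, size=(n, p_local)).astype(float)

def kinship(G):
    """Standardized genotype covariance, a common kinship estimator."""
    Z = (G - G.mean(0)) / (G.std(0) + 1e-12)
    return Z @ Z.T / G.shape[1]

K_global, K_local = kinship(G_global), kinship(G_local)

# Candidate marker and a phenotype influenced by it plus structured noise.
x = G_local[:, 0]
V = 0.3 * K_global + 0.2 * K_local + 0.5 * np.eye(n)   # assumed variance components
y = 0.4 * x + rng.multivariate_normal(np.zeros(n), V)

# Generalized least squares for y = intercept + beta * x with covariance V.
# A real mixed-model analysis would first estimate the variance components
# (e.g., by REML) rather than fixing them as done here.
Vinv = np.linalg.inv(V)
X = np.column_stack([np.ones(n), x])
XtVinvX = X.T @ Vinv @ X
beta = np.linalg.solve(XtVinvX, X.T @ Vinv @ y)
se = np.sqrt(np.linalg.inv(XtVinvX)[1, 1])
z = beta[1] / se
print("beta =", round(beta[1], 3), " p =", float(2 * stats.norm.sf(abs(z))))
```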

2019
Vol 17 (2)
Author(s):  
Yan Wang ◽  
Thanh Pham ◽  
Diep Nguyen ◽  
Eun Sook Kim ◽  
Yi-Hsin Chen ◽  
...  

A simulation study was conducted to examine the efficacy of conditional analysis of variance (ANOVA) procedures, in which an initial homogeneity-of-variance screening determines whether the ANOVA F test or a robust ANOVA method is applied. Type I error control and statistical power were investigated under various conditions.
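
As an illustration of such a conditional procedure (not the authors' exact simulation design), the following sketch screens the groups with Levene's test and then applies either the classical F test or Welch's heteroscedasticity-robust ANOVA; the 0.05 screening threshold and the choice of Welch's test as the robust alternative are assumptions.

```python
import numpy as np
from scipy import stats

def welch_anova(*groups):
    """Welch's heteroscedasticity-robust one-way ANOVA (F statistic, p-value)."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    m = np.array([np.mean(g) for g in groups])
    v = np.array([np.var(g, ddof=1) for g in groups])
    w = n / v
    grand = np.sum(w * m) / np.sum(w)
    num = np.sum(w * (m - grand) ** 2) / (k - 1)
    tmp = np.sum((1 - w / np.sum(w)) ** 2 / (n - 1))
    den = 1 + 2 * (k - 2) / (k ** 2 - 1) * tmp
    f = num / den
    df2 = (k ** 2 - 1) / (3 * tmp)
    return f, stats.f.sf(f, k - 1, df2)

def conditional_anova(*groups, alpha_screen=0.05):
    """Screen with Levene's test, then pick the classical F test or Welch's ANOVA."""
    _, p_levene = stats.levene(*groups)
    if p_levene > alpha_screen:                    # homogeneity plausible
        stat, p = stats.f_oneway(*groups)
        return "ANOVA F", stat, p
    stat, p = welch_anova(*groups)                 # homogeneity doubtful
    return "Welch", stat, p

rng = np.random.default_rng(1)
g1, g2, g3 = rng.normal(0, 1, 30), rng.normal(0, 3, 30), rng.normal(0.5, 1, 30)
print(conditional_anova(g1, g2, g3))
```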


2019
Author(s):  
Pele Schramm ◽  
Jeffrey Rouder

We investigate whether the common practice of transforming response times (RTs) prior to conventional analyses of central tendency yields any notable benefit. We generate data from a realistic single-bound drift diffusion model with parameters informed by several different typical experiments in cognition. We then examine the effects of log and reciprocal transformation on expected effect size, statistical power, and Type I error rates for conventional two-sample t-tests. A key element of our setup is that RTs have a lower bound, called the shift, which is well above 0. We closely examine the effect that different shifts have on the analyses. We conclude that logarithmic and reciprocal transformations offer no gain in power or Type I error control. In some typical cases, reciprocal transformations are detrimental, as they lower power.
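
A stripped-down version of this kind of simulation is sketched below. It uses a shifted lognormal as a stand-in for the single-bound diffusion RT distribution (an assumption made only to keep the example short) and compares the empirical power of two-sample t-tests on raw, log-transformed, and reciprocal-transformed RTs.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, reps, shift = 40, 2000, 0.3       # 300 ms lower bound on RTs (assumed)

def simulate_rts(mu, sigma=0.4, size=n):
    """Shifted lognormal RTs (seconds); a stand-in for a diffusion-model RT."""
    return shift + np.exp(rng.normal(mu, sigma, size))

rejections = {"raw": 0, "log": 0, "reciprocal": 0}
transforms = {"raw": lambda x: x, "log": np.log, "reciprocal": lambda x: 1.0 / x}

for _ in range(reps):
    a = simulate_rts(mu=-0.9)        # faster condition
    b = simulate_rts(mu=-0.8)        # slower condition (a true effect exists)
    for name, f in transforms.items():
        _, p = stats.ttest_ind(f(a), f(b))
        rejections[name] += p < 0.05

print({name: count / reps for name, count in rejections.items()})   # empirical power
```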


2020
Vol 18 (2)
pp. 2-30
Author(s):  
Diep Nguyen ◽  
Eunsook Kim ◽  
Yan Wang ◽  
Thanh Vinh Pham ◽  
Yi-Hsin Chen ◽  
...  

Although the Analysis of Variance (ANOVA) F test is one of the most popular statistical tools to compare group means, it is sensitive to violations of the homogeneity of variance (HOV) assumption. This simulation study examines the performance of thirteen tests in one-factor ANOVA models in terms of their Type I error rate and statistical power under numerous (82,080) conditions. The results show that when HOV was satisfied, the ANOVA F or the Brown-Forsythe test outperformed the other methods in terms of both Type I error control and statistical power even under non-normality. When HOV was violated, the Structured Means Modeling (SMM) with Bartlett or SMM with Maximum Likelihood was strongly recommended for the omnibus test of group mean equality.
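
For concreteness, here is a textbook-style implementation of one of the procedures named above, the Brown-Forsythe test for equality of group means; it is written for illustration and is not the study's simulation code.

```python
import numpy as np
from scipy import stats

def brown_forsythe_means(*groups):
    """Brown-Forsythe F* test of equal group means under unequal variances."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    N = n.sum()
    m = np.array([np.mean(g) for g in groups])
    v = np.array([np.var(g, ddof=1) for g in groups])
    grand = np.sum(n * m) / N
    den_terms = (1 - n / N) * v
    f_star = np.sum(n * (m - grand) ** 2) / den_terms.sum()
    c = den_terms / den_terms.sum()
    df2 = 1.0 / np.sum(c ** 2 / (n - 1))        # Satterthwaite-type denominator df
    return f_star, stats.f.sf(f_star, k - 1, df2)

rng = np.random.default_rng(3)
groups = [rng.normal(0, s, 25) for s in (1.0, 2.0, 4.0)]   # equal means, unequal variances
print(brown_forsythe_means(*groups))
```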


2019
Vol 227 (4)
pp. 261-279
Author(s):  
Frank Renkewitz ◽  
Melanie Keiner

Publication biases and questionable research practices are assumed to be two of the main causes of low replication rates. Both of these problems lead to severely inflated effect size estimates in meta-analyses. Methodologists have proposed a number of statistical tools to detect such bias in meta-analytic results. We present an evaluation of the performance of six of these tools. To assess the Type I error rate and the statistical power of these methods, we simulated a large variety of literatures that differed with regard to true effect size, heterogeneity, number of available primary studies, and sample sizes of these primary studies; furthermore, simulated studies were subjected to different degrees of publication bias. Our results show that across all simulated conditions, no method consistently outperformed the others. Additionally, all methods performed poorly when true effect sizes were heterogeneous or primary studies had a small chance of being published, irrespective of their results. This suggests that in many actual meta-analyses in psychology, bias will remain undiscovered no matter which detection method is used.
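
One widely used detector of small-study effects is Egger's regression test; whether it is among the six methods evaluated here is not stated in this abstract, so the sketch below is purely illustrative of how such a tool is applied to a simulated, selectively published literature.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
true_d, k = 0.2, 60

# Simulate primary studies with varying sample sizes, then "publish" a study
# only if it is significant, or otherwise with 20% probability (assumed bias).
n = rng.integers(20, 200, size=k)
se = np.sqrt(2 / n + true_d ** 2 / (4 * n))      # SE of Cohen's d, equal group sizes
d = rng.normal(true_d, se)
published = (np.abs(d / se) > 1.96) | (rng.random(k) < 0.2)
d, se = d[published], se[published]

# Egger's test: regress the standardized effect (d/se) on precision (1/se);
# a non-zero intercept signals funnel-plot asymmetry.
res = stats.linregress(1 / se, d / se)
t_int = res.intercept / res.intercept_stderr
p = 2 * stats.t.sf(abs(t_int), len(d) - 2)
print(f"Egger intercept = {res.intercept:.2f}, p = {p:.3f}")
```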


1979
Vol 4 (1)
pp. 14-23
Author(s):  
Juliet Popper Shaffer

If used only when a preliminary F test yields significance, the usual multiple range procedures can be modified to increase the probability of detecting differences without changing the control of Type I error. The modification consists of a reduction in the critical value when comparing the largest and smallest means. Equivalence of modified and unmodified procedures in error control is demonstrated. The modified procedure is also compared with the alternative of using the unmodified range test without a preliminary F test, and it is shown that each has advantages over the other under some circumstances.
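
The following fragment illustrates the flavor of the modification: the critical range for the extreme (largest vs. smallest mean) comparison is reduced relative to the usual studentized-range critical value. The specific reduction shown, using the quantile for one fewer group, is an assumption chosen to make the idea concrete, not Shaffer's exact modified critical value.

```python
import numpy as np
from scipy import stats

k, n_per_group, df_error, alpha = 5, 10, 45, 0.05
ms_error = 1.0                                   # assumed error mean square

# Usual critical value for the range of k means vs. a reduced value obtained by
# using the studentized-range quantile for one fewer group (illustrative only).
q_usual = stats.studentized_range.ppf(1 - alpha, k, df_error)
q_reduced = stats.studentized_range.ppf(1 - alpha, k - 1, df_error)

crit_usual = q_usual * np.sqrt(ms_error / n_per_group)
crit_reduced = q_reduced * np.sqrt(ms_error / n_per_group)
print(f"usual critical range:   {crit_usual:.3f}")
print(f"reduced critical range: {crit_reduced:.3f}")
```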


2015
Vol 2015
pp. 1-7
Author(s):  
Guogen Shan ◽  
Amei Amei ◽  
Daniel Young

Sensitivity and specificity are often used to assess the performance of a diagnostic test with binary outcomes. Wald-type test statistics have been proposed for testing sensitivity and specificity individually. In the presence of a gold standard, simultaneous comparison between two diagnostic tests for noninferiority of sensitivity and specificity based on an asymptotic approach has been studied by Chen et al. (2003). However, the asymptotic approach may suffer from unsatisfactory type I error control, as observed in many studies, especially in small to medium sample settings. In this paper, we compare three unconditional approaches for simultaneously testing sensitivity and specificity: approaches based on estimation, on maximization, and on a combination of estimation and maximization. Although the estimation approach does not guarantee control of the type I error rate, its performance in this regard is satisfactory. The other two unconditional approaches are exact. The approach based on estimation and maximization is generally more powerful than the approach based on maximization.
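
A sketch of the asymptotic Wald-type building block is given below for paired data with a gold standard; the noninferiority margin, the counts, and the simple variance estimate are illustrative, and the paper's unconditional estimation and maximization procedures are precisely what replaces this asymptotic approach when its type I error control is unsatisfactory.

```python
import numpy as np
from scipy import stats

def paired_noninferiority(b, c, n, margin):
    """One-sided Z test of H0: p_new - p_ref <= -margin for paired binary data.

    b = subjects positive on the new test only, c = positive on the reference
    test only, n = subjects in the stratum (diseased for sensitivity,
    non-diseased for specificity)."""
    diff = (b - c) / n                              # estimated p_new - p_ref
    var = (b + c - (b - c) ** 2 / n) / n ** 2       # variance of the paired difference
    z = (diff + margin) / np.sqrt(var)
    return z, stats.norm.sf(z)                      # small p rejects inferiority

# Hypothetical counts: sensitivity stratum (100 diseased), specificity stratum
# (150 non-diseased), noninferiority margin of 5 percentage points.
z_se, p_se = paired_noninferiority(b=12, c=5, n=100, margin=0.05)
z_sp, p_sp = paired_noninferiority(b=4, c=9, n=150, margin=0.05)

# Simultaneous noninferiority requires both claims (intersection-union logic),
# so the overall p-value is the larger of the two.
print(f"sensitivity p = {p_se:.3f}, specificity p = {p_sp:.3f}, overall p = {max(p_se, p_sp):.3f}")
```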


2019
Author(s):  
Rob Cribbie ◽  
Nataly Beribisky ◽  
Udi Alter

Many bodies recommend that a sample planning procedure, such as a traditional NHST a priori power analysis, be conducted during the planning stages of a study. Power analysis allows the researcher to estimate how many participants are required to detect a minimally meaningful effect size at a specified level of power and Type I error rate. However, there are several drawbacks to the procedure that render it "a mess." Specifically, identifying the minimally meaningful effect size is often difficult yet unavoidable if the procedure is to be conducted properly, the procedure is not precision-oriented, and it does not guide the researcher to collect as many participants as feasibly possible. In this study, we explore how these three theoretical issues are reflected in applied psychological research in order to better understand whether they are concerns in practice. To investigate how power analysis is currently used, we reviewed the reporting of 443 power analyses in high-impact psychology journals in 2016 and 2017. We found that researchers rarely use the minimally meaningful effect size as the rationale for the effect size chosen in a power analysis. Further, precision-based approaches and collecting the maximum feasible sample size are almost never used in tandem with power analyses. In light of these findings, we suggest that researchers focus on tools beyond traditional power analysis when planning samples, such as collecting the maximum sample size that is feasible.
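
For reference, the kind of a priori power analysis discussed above amounts to a one-line calculation once a minimally meaningful effect size has been specified; the effect size, power, and alpha below are placeholders, not recommendations.

```python
from statsmodels.stats.power import TTestIndPower

# Participants per group needed to detect a (hypothetical) minimally meaningful
# effect of d = 0.3 with 80% power at a two-sided alpha of .05.
n_per_group = TTestIndPower().solve_power(effect_size=0.3, power=0.80, alpha=0.05)
print(round(n_per_group))        # roughly 175 per group
```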


2010
Vol 23 (2)
pp. 200-229
Author(s):  
Anna L. Macready ◽  
Laurie T. Butler ◽  
Orla B. Kennedy ◽  
Judi A. Ellis ◽  
Claire M. Williams ◽  
...  

In recent years there has been a rapid growth of interest in exploring the relationship between nutritional therapies and the maintenance of cognitive function in adulthood. Emerging evidence reveals an increasingly complex picture with respect to the benefits of various food constituents on learning, memory and psychomotor function in adults. However, to date, there has been little consensus in human studies on the range of cognitive domains to be tested or the particular tests to be employed. To illustrate the potential difficulties that this poses, we conducted a systematic review of existing human adult randomised controlled trial (RCT) studies that have investigated the effects of 24 d to 36 months of supplementation with flavonoids and micronutrients on cognitive performance. There were thirty-nine studies employing a total of 121 different cognitive tasks that met the criteria for inclusion. Results showed that less than half of these studies reported positive effects of treatment, with some important cognitive domains either under-represented or not explored at all. Although there was some evidence of sensitivity to nutritional supplementation in a number of domains (for example, executive function, spatial working memory), interpretation is currently difficult given the prevailing ‘scattergun approach’ for selecting cognitive tests. Specifically, the practice means that it is often difficult to distinguish between a boundary condition for a particular nutrient and a lack of task sensitivity. We argue that for significant future progress to be made, researchers need to pay much closer attention to existing human RCT and animal data, as well as to more basic issues surrounding task sensitivity, statistical power and type I error.


2020
Vol 6 (2)
pp. 106-113
Author(s):  
A. M. Grjibovski ◽  
M. A. Gorbatova ◽  
A. N. Narkevich ◽  
K. A. Vinogradov

Sample size calculation at the planning stage is still uncommon in Russian research practice. This situation threatens the validity of conclusions and may introduce Type II error, in which a false null hypothesis is accepted because the study lacks the statistical power to detect an existing difference between the means. Comparing two means using unpaired Student's t-tests is the most common statistical procedure in the Russian biomedical literature. However, calculations of the minimal required sample size, or retrospective calculations of statistical power, are found in only very few publications. In this paper we demonstrate how to calculate the required sample size for comparing means in unpaired samples using WinPepi and Stata software. In addition, we produce tables of the minimal required sample sizes for studies comparing two means in which body mass index or blood pressure is the variable of interest. The tables were constructed for unpaired samples across different levels of statistical power, using standard deviations obtained from the literature.
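
The calculation behind such tables is the standard normal-approximation formula n = 2 * sigma^2 * (z_{1-alpha/2} + z_{1-beta})^2 / delta^2 participants per group. A short sketch follows; the standard deviation and the detectable difference used here are placeholder values, not the ones tabulated in the paper.

```python
import math
from scipy import stats

def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Participants per group for an unpaired comparison of two means."""
    z_a = stats.norm.ppf(1 - alpha / 2)
    z_b = stats.norm.ppf(power)
    return math.ceil(2 * (sigma * (z_a + z_b) / delta) ** 2)

# Placeholder values: detect a 2 kg/m^2 difference in BMI assuming SD = 4 kg/m^2.
print(n_per_group(delta=2.0, sigma=4.0))     # about 63 per group
```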


Author(s):  
Shengjie Liu ◽  
Jun Gao ◽  
Yuling Zheng ◽  
Lei Huang ◽  
Fangrong Yan

Bioequivalence (BE) studies are an integral component of the new drug development process and play an important role in the approval and marketing of generic drug products. However, existing design and evaluation methods are developed largely within the frequentist framework, and few implement Bayesian ideas. Based on a bioequivalence predictive probability model and a sample re-estimation strategy, we propose a new Bayesian two-stage adaptive design and explore its application in bioequivalence testing. The new design differs from existing two-stage designs (such as Potvin's methods B and C) in the following respects. First, it not only incorporates historical information and expert information, but also flexibly combines them with the experimental data to aid decision-making. Second, its sample re-estimation strategy is based on the ratio of the information available at the interim analysis to the total information, which is simpler to calculate than Potvin's method. Simulation results showed that the two-stage design can be combined with various stopping boundary functions, which yield different operating characteristics. Moreover, the proposed method requires a smaller sample size than Potvin's method while keeping the type I error rate below 0.05 and achieving 80% statistical power.
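
A minimal sketch of the predictive-probability idea at the heart of such a design is given below: given stage-1 data, simulate plausible stage-2 outcomes from the posterior and estimate the probability that the final 90% confidence interval for the log geometric-mean ratio falls within the 0.80-1.25 bioequivalence limits. The prior, the pooling rule, and all numerical settings are assumptions for illustration and are not the authors' exact specification.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
be_lo, be_hi = np.log(0.80), np.log(1.25)     # bioequivalence limits on the log scale

# Stage-1 summary statistics (assumed): estimated log-ratio, within-subject SD,
# and the planned subjects per stage.
d1_hat, sd_w, n1, n2 = 0.05, 0.25, 24, 24
se1 = sd_w * np.sqrt(2 / n1)

def predictive_probability(n_sims=5000):
    hits = 0
    for _ in range(n_sims):
        delta = rng.normal(d1_hat, se1)                        # posterior draw (flat prior)
        d2_hat = rng.normal(delta, sd_w * np.sqrt(2 / n2))     # simulated stage-2 estimate
        d_pool = (n1 * d1_hat + n2 * d2_hat) / (n1 + n2)       # simple sample-size weighting
        se_pool = sd_w * np.sqrt(2 / (n1 + n2))
        t_crit = stats.t.ppf(0.95, df=n1 + n2 - 2)             # 90% CI for the log-ratio
        lo, hi = d_pool - t_crit * se_pool, d_pool + t_crit * se_pool
        hits += (lo > be_lo) and (hi < be_hi)
    return hits / n_sims

pp = predictive_probability()
print(f"predictive probability of demonstrating BE at the final analysis: {pp:.2f}")
# A two-stage rule might stop early for futility when this probability is low,
# or proceed to stage 2 (possibly re-estimating n2) when it is promising.
```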

