Type I error rate
Recently Published Documents


TOTAL DOCUMENTS

124
(FIVE YEARS 32)

H-INDEX

23
(FIVE YEARS 1)

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Shuai Wang ◽  
James B. Meigs ◽  
Josée Dupuis

Background: Advancements in statistical methods and sequencing technology have led to numerous novel discoveries in human genetics in the past two decades. Among phenotypes of interest, most attention has been given to studying genetic associations with continuous or binary traits. Efficient statistical methods have been proposed and are available for both types of traits under different study designs. However, for multinomial categorical traits in related samples, there is a lack of efficient statistical methods and software. Results: We propose an efficient score test to analyze a multinomial trait in family samples, in the context of genome-wide association/sequencing studies. An alternative Wald statistic is also proposed. We also extend the methodology to be applicable to ordinal traits. We performed extensive simulation studies to evaluate the type-I error of the score and Wald tests compared to multinomial logistic regression for unrelated samples, under different allele frequencies and study designs. We also evaluate the power of these methods. Results show that both the score and Wald tests have a well-controlled type-I error rate, but the multinomial logistic regression has an inflated type-I error rate when applied to family samples. We illustrate the use of the score test with an application to the Framingham Heart Study to uncover genetic variants associated with diabesity, a multi-category phenotype. Conclusion: Both proposed tests have correct type-I error rate and similar power. However, because the Wald statistic relies on computer-intensive estimation, it is less efficient than the score test for large-scale genetic association studies. We provide computer implementation for both multinomial and ordinal traits.


2021 ◽  
Author(s):  
Tristan Tibbe ◽  
Amanda Kay Montoya

The bias-corrected bootstrap confidence interval (BCBCI) was once the method of choice for conducting inference on the indirect effect in mediation analysis due to its high power in small samples, but now it is criticized by methodologists for its inflated type I error rates. In its place, the percentile bootstrap confidence interval (PBCI), which does not adjust for bias, is currently the recommended inferential method for indirect effects. This study proposes two alternative bias-corrected bootstrap methods for creating confidence intervals around the indirect effect. Using a Monte Carlo simulation, these methods were compared to the BCBCI, PBCI, and a bias-corrected method introduced by Chen and Fritz (2021). The results showed that the methods perform on a continuum, where the BCBCI has the best balance (i.e., having closest to an equal proportion of CIs falling above and below the true effect), highest power, and highest type I error rate; the PBCI has the worst balance, lowest power, and lowest type I error rate; and the alternative bias-corrected methods fall between these two methods on all three performance criteria. An extension of the original simulation that compared the bias-corrected methods to the PBCI after controlling for type I error rate inflation suggests that the increased power of these methods might only be due to their higher type I error rates. Thus, if control over the type I error rate is desired, the PBCI is still the recommended method for use with the indirect effect. Future research should examine the performance of these methods in the presence of missing data, confounding variables, and other real-world complications to enhance the generalizability of these results.
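As an illustration of the percentile bootstrap confidence interval (PBCI) discussed in the abstract above, the following is a minimal sketch for a simple X → M → Y mediation model. It is not the authors' code: the function names `indirect_effect` and `percentile_bootstrap_ci`, the default of 2000 resamples, and the simple index-based percentile rule are all assumptions made for this example.

```python
import random


def _cov(u, v):
    """Sample covariance of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / (n - 1)


def indirect_effect(x, m, y):
    """Indirect effect a*b, where a is the OLS slope of M on X and
    b is the OLS coefficient of M in the regression of Y on X and M."""
    sxx, smm = _cov(x, x), _cov(m, m)
    sxm, sxy, smy = _cov(x, m), _cov(x, y), _cov(m, y)
    a = sxm / sxx
    b = (smy * sxx - sxy * sxm) / (smm * sxx - sxm ** 2)
    return a * b


def percentile_bootstrap_ci(x, m, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample cases with replacement and take
    the alpha/2 and 1 - alpha/2 quantiles of the bootstrap estimates,
    with no bias correction (hypothetical helper for illustration)."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[int((alpha / 2) * n_boot)]
    upper = estimates[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

The null hypothesis of no indirect effect would be rejected when the interval excludes zero. A bias-corrected variant would shift the two quantile positions according to the proportion of bootstrap estimates falling below the point estimate; that adjustment is deliberately left out here.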


2021 ◽  
pp. 263-281
Author(s):  
Weichung Joe Shih ◽  
Joseph Aisner

2021 ◽  
Author(s):  
David Zelený ◽  
Kenny Helsen ◽  
Yi-Nuo Lee

Community weighted means (CWMs) are widely used to study the relationship between community-level functional traits and environmental variation. When relationships between CWM traits and environmental variables are directly assessed using linear regression or ANOVA and tested by standard parametric tests, results are prone to inflated Type I error rates, thus producing overly optimistic results. Previous research has found that this problem can be solved by permutation tests (i.e. the max test). A recent extension of this CWM approach, which allows the inclusion of intraspecific trait variation (ITV) by partitioning information into fixed, site-specific and intraspecific CWMs, has proven popular. However, this raises the question of whether the same kind of Type I error rate inflation also exists for site-specific CWM or intraspecific CWM-environment relationships. Using simulated community datasets and a real-world dataset from a subtropical montane cloud forest in Taiwan, we show that site-specific CWM-environment relationships also suffer from Type I error rate inflation, and that the severity of this inflation is negatively related to the relative ITV magnitude. In contrast, for intraspecific CWM-environment relationships, standard parametric tests have the correct Type I error rate, while being somewhat conservative, with reduced statistical power. We introduce an ITV-extended version of the max test for the ITV-extended CWM approach, which can solve the inflation problem for site-specific CWM-environment relationships, and which, without considering ITV, becomes equivalent to the “original” max test used for the CWM approach. On both simulated and real-world data, we show that this new ITV-extended max test works well across the full possible magnitude of ITV. We also provide guidelines and R code for max test solutions for each CWM type and situation. Finally, we offer recommendations on how to handle the results of previously published studies that used the CWM approach without controlling for Type I error rate inflation.


2021 ◽  
Author(s):  
Liang-Dar Hwang ◽  
Gunn-Helen Moen ◽  
David M Evans

Maternal genetic effects can be defined as the effect of a mother's genotype on the phenotype of her offspring, independent of the offspring's genotype. Maternal genetic effects can act via the intrauterine environment during pregnancy and/or via the postnatal environment. In this manuscript, we present a simple extension to the basic adoption design that uses structural equation modelling (SEM) to partition maternal genetic effects into prenatal and postnatal effects. We assume that in biological families, offspring phenotypes are influenced prenatally by their mother's genotype and postnatally by both parents' genotypes, whereas adopted individuals' phenotypes are influenced prenatally by their biological mother's genotype and postnatally by their adoptive parents' genotypes. Our SEM framework allows us to model the (potentially) unobserved genotypes of biological and adoptive parents as latent variables, permitting us in principle to leverage the thousands of adopted singleton individuals in the UK Biobank. We examine the power, utility and type I error rate of our model using simulations and asymptotic power calculations. We apply our model to polygenic scores of educational attainment and birth weight associated variants, in up to 5178 adopted singletons, 983 trios, 3650 mother-offspring pairs, 1665 father-offspring pairs and 350330 singletons from the UK Biobank. Our results show the expected pattern of maternal genetic effects on offspring birth weight, but unexpectedly large prenatal maternal genetic effects on offspring educational attainment. Sensitivity and simulation analyses suggest this result may be at least partially due to adopted individuals in the UK Biobank being raised by their biological relatives. We show that accurate modelling of these sorts of cryptic relationships is sufficient to bring type I error rate under control and produce unbiased estimates of prenatal and postnatal maternal genetic effects. 
We conclude that there would be considerable value in following up adopted individuals in the UK Biobank to determine whether they were raised by their biological relatives, and if so, to precisely ascertain the nature of these relationships. These adopted individuals could then be incorporated into informative statistical genetics models like the one described in our manuscript to further elucidate the genetic architecture of complex traits and diseases.


2021 ◽  
Author(s):  
Marc J Lanovaz ◽  
Rachel Primiani

Researchers and practitioners often use single-case designs (SCDs), or n-of-1 trials, to develop and validate novel treatments. Standards and guidelines have been published to provide guidance as to how to implement SCDs, but many of their recommendations are not derived from the research literature. For example, one of these recommendations suggests that researchers and practitioners should wait for baseline stability prior to introducing an independent variable. However, this recommendation is not strongly supported by empirical evidence. To address this issue, we used a Monte Carlo simulation to generate a total of 480,000 AB graphs with fixed, response-guided, and random baseline lengths. Then, our analyses compared the Type I error rate and power produced by two methods of analysis: the conservative dual-criteria method (a structured visual aid) and a support vector classifier (a model derived from machine learning). The conservative dual-criteria method produced more power when using response-guided decision-making (i.e., waiting for stability) with negligible effects on Type I error rate. In contrast, waiting for stability did not reduce decision-making errors with the support vector classifier. Our findings question the necessity of waiting for baseline stability when using SCDs with machine learning, but the study must be replicated with other designs to support our results.


2021 ◽  
Author(s):  
Haiyang Jin

Analysis of variance (ANOVA) is one of the most popular statistical methods employed for data analysis in psychology and other fields. Nevertheless, ANOVA is frequently used as an exploratory approach, even in confirmatory studies with explicit hypotheses. Such misapplication may invalidate ANOVA conventions, resulting in reduced statistical power, and even threatening the validity of conclusions. This paper evaluates the appropriateness of ANOVA conventions, discusses the potential motivations possibly misunderstood by researchers, and provides practical suggestions. Moreover, this paper proposes to control the Type I error rate with Hypothesis-based Type I Error Rate to consider both the number of tests and their logical relationships in rejecting the null hypothesis. Furthermore, this paper introduces the simple interaction analysis, which can employ the most straightforward interaction to test a hypothesis of interest. Finally, pre-registration is recommended to provide clarity for the selection of appropriate ANOVA tests in both confirmatory and exploratory studies.


2021 ◽  
pp. 174077452110101
Author(s):  
Jennifer Proper ◽  
John Connett ◽  
Thomas Murray

Background: Bayesian response-adaptive designs, which data-adaptively alter the allocation ratio in favor of the better-performing treatment, are often criticized for engendering a non-trivial probability of a subject imbalance in favor of the inferior treatment, inflating type I error rate, and increasing sample size requirements. The implementation of these designs using the Thompson sampling methods has generally assumed a simple beta-binomial probability model in the literature; however, the effect of these choices on the resulting design operating characteristics relative to other reasonable alternatives has not been fully examined. Motivated by the Advanced REperfusion STrategies for Refractory Cardiac Arrest trial, we posit that a logistic probability model coupled with an urn or permuted block randomization method will alleviate some of the practical limitations engendered by the conventional implementation of a two-arm Bayesian response-adaptive design with binary outcomes. In this article, we discuss to what extent this solution works and when it does not. Methods: A computer simulation study was performed to evaluate the relative merits of a Bayesian response-adaptive design for the Advanced REperfusion STrategies for Refractory Cardiac Arrest trial using the Thompson sampling methods based on a logistic regression probability model coupled with either an urn or permuted block randomization method that limits deviations from the evolving target allocation ratio. The different implementations of the response-adaptive design were evaluated for type I error rate control across various null response rates and power, among other performance metrics. 
Results: The logistic regression probability model engenders smaller average sample sizes with similar power, better control over type I error rate, and more favorable treatment arm sample size distributions than the conventional beta-binomial probability model, and designs using the alternative randomization methods have a negligible chance of a sample size imbalance in the wrong direction. Conclusion: Pairing the logistic regression probability model with either of the alternative randomization methods results in a much improved response-adaptive design in regard to important operating characteristics, including type I error rate control and the risk of a sample size imbalance in favor of the inferior treatment.
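For readers unfamiliar with the conventional beta-binomial implementation that this abstract uses as its comparator, the sketch below estimates the Thompson sampling allocation probability for a two-arm trial with binary outcomes. It is only an illustration under stated assumptions: the function name `prob_a_better`, the uniform Beta(1, 1) prior, and the Monte Carlo draw count are hypothetical choices, and the logistic model and urn or permuted block randomization studied by the authors are not shown.

```python
import random


def prob_a_better(succ_a, fail_a, succ_b, fail_b,
                  draws=10000, seed=0, prior=(1.0, 1.0)):
    """Monte Carlo estimate of P(p_A > p_B | data) under independent
    Beta posteriors for each arm's response rate.

    With a Beta(alpha, beta) prior and binomial data, the posterior of
    each response rate is Beta(alpha + successes, beta + failures).
    """
    rng = random.Random(seed)
    alpha, beta = prior
    wins = 0
    for _ in range(draws):
        p_a = rng.betavariate(alpha + succ_a, beta + fail_a)
        p_b = rng.betavariate(alpha + succ_b, beta + fail_b)
        wins += p_a > p_b
    return wins / draws
```

In a response-adaptive design of this kind, the next participant would be assigned to arm A with a probability derived from this quantity, which is what drives allocation toward the arm that currently looks better. Designs in practice often temper or bound this probability; that refinement is omitted here.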


2021 ◽  
Author(s):  
Shuai Wang ◽  
James Meigs ◽  
Josee Dupuis

Background: Advancements in statistical methods and sequencing technology have led to numerous novel discoveries in human genetics in the past two decades. Among phenotypes of interest, most attention has been given to studying genetic associations with continuous or binary traits. Efficient statistical methods have been proposed and are available for both types of traits under different study designs. However, for multinomial categorical traits in related samples, there is a lack of widely used efficient statistical methods and software. Results: We propose an efficient score test to analyze a multinomial trait in family samples, in the context of genome-wide association/sequencing studies. An alternative Wald statistic is also proposed. We also extend the methodology to be applicable to ordinal traits. We performed extensive simulation studies to evaluate the type-I error of the score and Wald tests compared to multinomial logistic regression for unrelated samples, under different allele frequencies and study designs. We also evaluate the power of these methods. Results show that both the score and Wald tests have a well-controlled type-I error rate, but the multinomial logistic regression has an inflated type-I error rate when applied to family samples. We illustrate the use of the score test with an application to the Framingham Heart Study to uncover genetic variants associated with diabesity, a multi-category phenotype. Conclusion: Both proposed tests have correct type-I error rate and similar power. However, because the Wald statistic relies on computer-intensive estimation, it is less efficient than the score test for large-scale genetic association studies. We provide computer implementation for both multinomial and ordinal traits.

