Type I Error Inflation of the Separate-Variances Welch t test with Very Small Sample Sizes when Assumptions Are Met

2011 ◽  
Vol 10 (1) ◽  
pp. 362-372 ◽  
Author(s):  
Albert K. Adusah ◽  
Gordon P. Brooks
1994 ◽  
Vol 19 (1) ◽  
pp. 57-71 ◽  
Author(s):  
Stephen M. Quintana ◽  
Scott E. Maxwell

The purpose of this study was to evaluate seven univariate procedures for testing omnibus null hypotheses for data gathered from repeated measures designs. Five alternate approaches are compared to the two more traditional adjustment procedures (Geisser and Greenhouse’s ε̂ and Huynh and Feldt’s ε̃), neither of which may be entirely adequate when sample sizes are small and the number of levels of the repeated factor is large. Empirical Type I error rates and power levels were obtained by simulation for conditions where small samples occur in combination with many levels of the repeated factor. Results suggested that the alternate univariate approaches were improvements over the traditional approaches. One alternate approach in particular was found to be most effective in controlling Type I error rates without unduly sacrificing power.
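The Greenhouse-Geisser ε̂ referenced above can be computed directly from the sample covariance matrix of the repeated measures. A minimal NumPy sketch (an illustration of the standard estimator, not the authors' code):

```python
import numpy as np

def greenhouse_geisser_epsilon(S):
    """Greenhouse-Geisser epsilon-hat from a k x k covariance matrix S of
    the repeated measures: (sum of eigenvalues)^2 / ((k-1) * sum of squared
    eigenvalues) of the double-centered matrix H S H. Equals 1 under
    sphericity; its lower bound is 1/(k-1)."""
    S = np.asarray(S, dtype=float)
    k = S.shape[0]
    H = np.eye(k) - np.ones((k, k)) / k      # centering matrix
    lam = np.linalg.eigvalsh(H @ S @ H)      # one eigenvalue is ~0 (H has rank k-1)
    lam = lam[lam > 1e-10]                   # keep the k-1 nonzero eigenvalues
    return lam.sum() ** 2 / ((k - 1) * (lam ** 2).sum())

# Compound symmetry satisfies sphericity, so epsilon-hat is exactly 1
S_cs = 0.5 * np.ones((4, 4)) + 0.5 * np.eye(4)
eps = greenhouse_geisser_epsilon(S_cs)       # 1.0
```

The adjustment procedures multiply the ANOVA degrees of freedom by ε̂ (or by the Huynh-Feldt ε̃), shrinking them toward 1/(k-1) as the departure from sphericity grows.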


2021 ◽  
pp. 016327872110243
Author(s):  
Donna Chen ◽  
Matthew S. Fritz

Although the bias-corrected (BC) bootstrap is an often-recommended method for testing mediation due to its higher statistical power relative to other tests, it has also been found to have elevated Type I error rates with small sample sizes. Under limitations for participant recruitment, obtaining a larger sample size is not always feasible. Thus, this study examines whether using alternative corrections for bias in the BC bootstrap test of mediation for small sample sizes can achieve equal levels of statistical power without the associated increase in Type I error. A simulation study was conducted to compare Efron and Tibshirani’s original correction for bias, z₀, to six alternative corrections for bias: (a) mean, (b–e) Winsorized mean with 10%, 20%, 30%, and 40% trimming in each tail, and (f) medcouple (robust skewness measure). Most variation in Type I error (given a medium effect size of one regression slope and zero for the other slope) and power (small effect size in both regression slopes) was found with small sample sizes. Recommendations for applied researchers are made based on the results. An empirical example using data from the ATLAS drug prevention intervention study is presented to illustrate these results. Limitations and future directions are discussed.
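As a rough illustration of how the bias correction z₀ enters the BC bootstrap interval for the indirect effect ab, here is a minimal NumPy/SciPy sketch of a single-mediator model with made-up data (the toy coefficients and sample size are assumptions for illustration; this is not the corrections studied in the article):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def indirect(X, M, Y):
    """ab estimate: a from M ~ X, b from Y ~ M + X (simple OLS fits)."""
    a = np.polyfit(X, M, 1)[0]
    D = np.column_stack([np.ones_like(X), M, X])
    b = np.linalg.lstsq(D, Y, rcond=None)[0][1]
    return a * b

def bc_bootstrap_ci(X, M, Y, n_boot=2000, alpha=0.05):
    """Bias-corrected bootstrap CI for ab using Efron & Tibshirani's z0."""
    obs = indirect(X, M, Y)
    n = len(X)
    boot = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, n)          # resample cases with replacement
        boot[i] = indirect(X[idx], M[idx], Y[idx])
    # z0: bias correction from the share of bootstrap estimates below obs
    z0 = norm.ppf(np.mean(boot < obs))
    z_lo, z_hi = norm.ppf(alpha / 2), norm.ppf(1 - alpha / 2)
    # BC percentiles: Phi(2*z0 + z_{alpha/2}) and Phi(2*z0 + z_{1-alpha/2})
    lo_q, hi_q = norm.cdf(2 * z0 + z_lo), norm.cdf(2 * z0 + z_hi)
    return np.quantile(boot, [lo_q, hi_q])

# Toy data with a true indirect path X -> M -> Y (true ab = 0.36)
n = 50
X = rng.normal(size=n)
M = 0.6 * X + rng.normal(size=n)
Y = 0.6 * M + rng.normal(size=n)
lo, hi = bc_bootstrap_ci(X, M, Y)
```

The alternative corrections the study compares would replace the z₀ line (for example, centering on a trimmed or Winsorized summary of the bootstrap distribution instead of the empirical proportion below the observed estimate).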


2016 ◽  
Vol 46 (7) ◽  
pp. 1158-1164
Author(s):  
Betania Brum ◽  
Sidinei José Lopes ◽  
Daniel Furtado Ferreira ◽  
Lindolfo Storck ◽  
Alberto Cargnelutti Filho

ABSTRACT: The likelihood ratio test (LRT) for independence between two sets of variables identifies whether there is a dependency relationship between them. The aim of this study was to calculate the Type I error and power of the LRT for determining independence between two sets of variables under multivariate normal distributions, in scenarios consisting of combinations of 16 sample sizes; 40 combinations of the numbers of variables in the two groups; and nine degrees of correlation between the variables (for the power). The Type I error rate and power were calculated in 640 and 5,760 scenarios, respectively. The performance of the LRT was evaluated by computer simulation using the Monte Carlo method, with 2,000 simulations in each scenario. When the number of variables was large (24), the LRT controlled the Type I error rate and showed high power for sample sizes greater than 100. For small sample sizes (25, 30, and 50), the test performed well provided the number of variables did not exceed 12.
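The LRT here is based on Wilks' Λ = |S| / (|S₁₁||S₂₂|) with Bartlett's chi-square approximation on p·q degrees of freedom. A minimal sketch of that test (an illustration of the standard statistic, not the authors' simulation code; the sample sizes below are assumptions):

```python
import numpy as np
from scipy.stats import chi2

def lrt_independence(X, Y):
    """Bartlett's chi-square approximation to the LRT of Sigma_12 = 0
    (independence of the two variable sets) under multivariate normality:
    Lambda = |S| / (|S11| |S22|),
    chi2_stat = -(n - 1 - (p + q + 1) / 2) * ln(Lambda), df = p * q."""
    n, p = X.shape
    q = Y.shape[1]
    S = np.cov(np.hstack([X, Y]), rowvar=False)
    lam = np.linalg.det(S) / (np.linalg.det(S[:p, :p]) * np.linalg.det(S[p:, p:]))
    stat = -(n - 1 - (p + q + 1) / 2) * np.log(lam)
    return stat, chi2.sf(stat, p * q)

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
Y = rng.normal(size=(200, 2))        # generated independently of X
stat, pval = lrt_independence(X, Y)  # pval should not be small here
```

Repeating such draws under the null and recording how often pval falls below the nominal level is exactly the kind of Monte Carlo Type I error estimate the study reports.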


PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e8246
Author(s):  
Miranda E. Kroehl ◽  
Sharon Lutz ◽  
Brandie D. Wagner

Background: Mediation analysis can be used to evaluate the effect of an exposure on an outcome acting through an intermediate variable, or mediator. For studies with small sample sizes, permutation testing may be useful for evaluating the indirect effect (i.e., the effect of the exposure on the outcome through the mediator) while maintaining the appropriate Type I error rate. For mediation analysis in studies with small sample sizes, existing permutation testing methods permute the residuals under the full or alternative model, but have not been evaluated in situations where covariates are included. In this article, we consider and evaluate two additional permutation approaches for testing the indirect effect in mediation analysis based on permuting the residuals under the reduced or null model, which allows for the inclusion of covariates.
Methods: Simulation studies were used to empirically evaluate the behavior of these two additional approaches: (1) the permutation test of the Indirect Effect under Reduced Models (IERM) and (2) the Permutation Supremum test under Reduced Models (PSRM). The performance of these methods was compared to the standard permutation approach for mediation analysis, the permutation test of the Indirect Effect under Full Models (IEFM). We evaluated the Type I error rates and power of these methods in the presence of covariates, since mediation analysis assumes no unmeasured confounders of the exposure–mediator–outcome relationships.
Results: The proposed PSRM approach maintained Type I error rates below nominal levels under all conditions, while the proposed IERM approach exhibited grossly inflated Type I error rates in many conditions and the standard IEFM exhibited inflated Type I error rates under a small number of conditions. Power did not differ substantially between the proposed PSRM approach and the standard IEFM approach.
Conclusions: The proposed PSRM approach is recommended over the existing IEFM approach for mediation analysis in studies with small sample sizes.
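One way to see the reduced-model idea: fit the mediator model under the null (here reduced to an intercept only, with no covariates), permute its residuals, and recompute the indirect effect each time. The sketch below is a deliberately simplified illustration of residual permutation, not the exact IERM or PSRM procedures from the article:

```python
import numpy as np

rng = np.random.default_rng(2)

def ab(X, M, Y):
    """Indirect-effect estimate: a from M ~ X, b from Y ~ M + X."""
    a = np.polyfit(X, M, 1)[0]
    D = np.column_stack([np.ones_like(X), M, X])
    b = np.linalg.lstsq(D, Y, rcond=None)[0][1]
    return a * b

def perm_test_indirect(X, M, Y, n_perm=1000):
    """Permute mediator residuals under a reduced (null) model for M
    (here just an intercept) and recompute ab for each permutation."""
    obs = abs(ab(X, M, Y))
    e = M - M.mean()                         # null-model residuals
    hits = sum(abs(ab(X, M.mean() + rng.permutation(e), Y)) >= obs
               for _ in range(n_perm))
    return (hits + 1) / (n_perm + 1)         # add-one permutation p-value

# Strong mediation: the permutation p-value should be small
n = 100
X = rng.normal(size=n)
M = 0.8 * X + rng.normal(size=n)
Y = 0.8 * M + rng.normal(size=n)
p = perm_test_indirect(X, M, Y)
```

With covariates, the null-model fit for M would regress on the covariates rather than only an intercept, which is what permuting under the reduced model makes possible.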


1991 ◽  
Vol 21 (1) ◽  
pp. 58-65 ◽  
Author(s):  
Dennis E. Jelinski

Chi-square (χ2) tests are analytic procedures that are often used to test the hypothesis that animals use a particular food item or habitat in proportion to its availability. Unfortunately, several sources of error are common to the use of χ2 analysis in studies of resource utilization. Both the goodness-of-fit and homogeneity tests have been incorrectly used interchangeably when resource availabilities are estimated or known a priori. An empirical comparison of the two methods demonstrates that the χ2 test of homogeneity may generate results contrary to the χ2 goodness-of-fit test. Failure to recognize the conservative nature of the χ2 homogeneity test, when "expected" values are known a priori, may lead to erroneous conclusions owing to the increased possibility of committing a Type II error. Conversely, proper use of the goodness-of-fit method is predicated on the availability of accurate maps of resource abundance, or on estimates of resource availability based on very large sample sizes. Where resource availabilities have been estimated from small sample sizes, the use of the χ2 goodness-of-fit test may lead to Type I errors beyond the nominal level of α. Both tests require adherence to specific critical assumptions that often have been violated, and accordingly, these assumptions are reviewed here. Alternatives to the Pearson χ2 statistic are also discussed.
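The distinction plays out directly in SciPy: `chisquare` with a priori expected proportions versus `chi2_contingency` treating availability as a second sample of counts. A minimal illustration with made-up habitat data (the counts are assumptions for illustration):

```python
import numpy as np
from scipy.stats import chisquare, chi2_contingency

# Observed counts of use across four habitat types
used = np.array([30, 25, 25, 20])

# Goodness-of-fit: availability proportions known a priori (e.g., a full map)
avail_prop = np.array([0.40, 0.30, 0.20, 0.10])
gof_stat, gof_p = chisquare(used, f_exp=used.sum() * avail_prop)

# Homogeneity: availability is itself a sampled set of counts, so the two
# samples go into one contingency table
avail_counts = np.array([400, 300, 200, 100])
hom_stat, hom_p, dof, expected = chi2_contingency(np.array([used, avail_counts]))

# With availability treated as estimated, the statistic shrinks:
# the homogeneity test is the more conservative of the two on these data
```

On these counts the goodness-of-fit statistic (about 14.58 on 3 df) exceeds the homogeneity statistic, matching the article's point that the homogeneity test is conservative when expected values are actually known a priori.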


1994 ◽  
Vol 19 (3) ◽  
pp. 275-291 ◽  
Author(s):  
James Algina ◽  
T. C. Oshima ◽  
Wen-Ying Lin

Type I error rates were estimated for three tests that compare means by using data from two independent samples: the independent samples t test, Welch’s approximate degrees of freedom test, and James’s second-order test. Type I error rates were estimated for skewed distributions, equal and unequal variances, equal and unequal sample sizes, and a range of total sample sizes. Welch’s test and James’s test have very similar Type I error rates and tend to control the Type I error rate as well or better than the independent samples t test does. The results provide guidance about the total sample sizes required for controlling Type I error rates.
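The liberal behavior of the pooled-variance t test under unequal variances with unequal sample sizes is easy to reproduce by simulation. A small sketch using `scipy.stats.ttest_ind`, where `equal_var=False` requests Welch's approximate degrees of freedom test (James's second-order test has no SciPy implementation; the sample sizes and variances below are illustrative assumptions, not the article's design):

```python
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(3)

def type1_rate(n1, n2, sd1, sd2, equal_var, reps=4000, alpha=0.05):
    """Empirical Type I error rate for a two-sample t test when H0 is true
    (both populations are normal with mean 0)."""
    hits = 0
    for _ in range(reps):
        x = rng.normal(0.0, sd1, n1)
        y = rng.normal(0.0, sd2, n2)
        # equal_var=False gives Welch's separate-variances test
        if ttest_ind(x, y, equal_var=equal_var).pvalue < alpha:
            hits += 1
    return hits / reps

# The smaller group paired with the larger variance: worst case for pooled t
pooled = type1_rate(10, 40, 4.0, 1.0, equal_var=True)
welch = type1_rate(10, 40, 4.0, 1.0, equal_var=False)
# pooled lands far above the nominal .05; welch stays close to it
```

This pairing (small n with large variance) is the configuration in which the pooled test's Type I error rate inflates most severely, while Welch's test remains near the nominal level.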


Methodology ◽  
2009 ◽  
Vol 5 (2) ◽  
pp. 60-70 ◽  
Author(s):  
W. Holmes Finch ◽  
Teresa Davenport

Permutation testing has been suggested as an alternative to the standard F approximate tests used in multivariate analysis of variance (MANOVA). These approximate tests, such as Wilks’ Lambda and Pillai’s Trace, have been shown to perform poorly when assumptions of normally distributed dependent variables and homogeneity of group covariance matrices were violated. Because Monte Carlo permutation tests do not rely on distributional assumptions, they may be expected to work better than their approximate cousins when the data do not conform to the assumptions described above. The current simulation study compared the performance of four standard MANOVA test statistics with their Monte Carlo permutation-based counterparts under a variety of conditions with small samples, including conditions when the assumptions were met and when they were not. Results suggest that for sample sizes of 50 subjects, power is very low for all the statistics. In addition, Type I error rates for both the approximate F and Monte Carlo tests were inflated under the condition of nonnormal data and unequal covariance matrices. In general, the performance of the Monte Carlo permutation tests was slightly better in terms of Type I error rates and power when both assumptions of normality and homogeneous covariance matrices were not met. It should be noted that these simulations were based upon the case with three groups only, and as such results presented in this study can only be generalized to similar situations.
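A Monte Carlo permutation counterpart to Wilks' Lambda can be sketched in a few lines: shuffle the group labels and recompute the statistic each time. This is an illustrative sketch of the general technique with assumed toy dimensions (three groups, three dependent variables), not the study's simulation code:

```python
import numpy as np

rng = np.random.default_rng(4)

def wilks_lambda(Y, groups):
    """Wilks' Lambda = |W| / |W + B| from within- and between-group SSCP."""
    grand = Y.mean(axis=0)
    p = Y.shape[1]
    W, B = np.zeros((p, p)), np.zeros((p, p))
    for g in np.unique(groups):
        Yg = Y[groups == g]
        d = Yg - Yg.mean(axis=0)
        W += d.T @ d                          # within-group SSCP
        m = (Yg.mean(axis=0) - grand)[:, None]
        B += len(Yg) * (m @ m.T)              # between-group SSCP
    return np.linalg.det(W) / np.linalg.det(W + B)

def perm_manova_p(Y, groups, n_perm=999):
    """Monte Carlo permutation p-value: relabel groups, recompute Lambda
    (smaller Lambda means stronger separation, so count values <= observed)."""
    obs = wilks_lambda(Y, groups)
    hits = sum(wilks_lambda(Y, rng.permutation(groups)) <= obs
               for _ in range(n_perm))
    return (hits + 1) / (n_perm + 1)

# Null data: three groups of 15, no true differences on three DVs
Y0 = rng.normal(size=(45, 3))
g = np.repeat([0, 1, 2], 15)
p_null = perm_manova_p(Y0, g)
```

Because the reference distribution is built from the data themselves rather than from an F approximation, no normality or covariance-homogeneity assumption enters the p-value calculation, which is the motivation examined in the study.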


2019 ◽  
Vol 3 ◽  
Author(s):  
Nicolas Haverkamp ◽  
André Beauducel

To derive recommendations on how to analyze longitudinal data, we examined Type I error rates of Multilevel Linear Models (MLM) and repeated measures Analysis of Variance (rANOVA) using SAS and SPSS. We performed a simulation with the following specifications: to explore the effects of high numbers of measurement occasions and small sample sizes on Type I error, measurement occasions of m = 9 and 12 were investigated, as well as sample sizes of n = 15, 20, 25, and 30. Effects of non-sphericity in the population on Type I error were also inspected: 5,000 random samples were drawn from two populations containing neither a within-subject nor a between-group effect. They were analyzed using the most common options for correcting rANOVA and MLM results: the Huynh-Feldt correction for rANOVA (rANOVA-HF) and the Kenward-Roger correction for MLM (MLM-KR), which could help to correct the progressive bias of MLM with an unstructured covariance matrix (MLM-UN). Uncorrected rANOVA and MLM assuming a compound symmetry covariance structure (MLM-CS) were also taken into account. The results showed a progressive bias for MLM-UN with small samples that was stronger in SPSS than in SAS. Moreover, an appropriate bias correction for Type I error via rANOVA-HF and an insufficient correction by MLM-UN-KR for n < 30 were found. These findings suggest using MLM-CS or rANOVA when sphericity holds, and correcting violations of sphericity via rANOVA-HF. If an analysis requires MLM, SPSS yields more accurate Type I error rates for MLM-CS, and SAS yields more accurate Type I error rates for MLM-UN.
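Compound symmetry, the covariance structure assumed by MLM-CS, corresponds to a random subject intercept plus occasion-level noise. A small sketch generating null longitudinal data with that structure and checking the implied covariance (an illustration of the data-generating model, not the authors' SAS/SPSS setup; the ICC and sizes are assumptions):

```python
import numpy as np

rng = np.random.default_rng(5)

def simulate_cs(n_subj, n_occ, icc, total_var=1.0):
    """Null repeated-measures data under compound symmetry: a random subject
    intercept (between-subject variance = icc * total_var) plus occasion
    noise (within-subject variance = (1 - icc) * total_var); there is no
    within-subject or between-group effect."""
    u = rng.normal(0.0, np.sqrt(icc * total_var), size=(n_subj, 1))
    e = rng.normal(0.0, np.sqrt((1 - icc) * total_var), size=(n_subj, n_occ))
    return u + e

Y = simulate_cs(n_subj=2000, n_occ=9, icc=0.5)
S = np.cov(Y, rowvar=False)
# Diagonal approx. total_var; every off-diagonal approx. icc * total_var,
# so sphericity holds and MLM-CS / uncorrected rANOVA are appropriate
```

Fitting such null samples repeatedly and recording rejection rates at the nominal alpha is the Type I error estimation scheme the abstract describes; non-sphericity conditions would replace the constant off-diagonal covariance with a decaying or otherwise unequal pattern.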

