Pairwise comparisons of means under realistic nonnormality, unequal variances, outliers and equal sample sizes
2011, Vol 81 (2), pp. 125-135
Author(s): Philip H. Ramsey, Kyrstle Barrera, Pri Hachimine-Semprebom, Chang-Chia Liu

1994, Vol 19 (3), pp. 275-291
Author(s): James Algina, T. C. Oshima, Wen-Ying Lin

Type I error rates were estimated for three tests that compare means by using data from two independent samples: the independent samples t test, Welch’s approximate degrees of freedom test, and James’s second-order test. Type I error rates were estimated for skewed distributions, equal and unequal variances, equal and unequal sample sizes, and a range of total sample sizes. Welch’s test and James’s test have very similar Type I error rates and tend to control the Type I error rate as well as or better than the independent samples t test does. The results provide guidance about the total sample sizes required for controlling Type I error rates.
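Welch’s approximate degrees of freedom test referenced in the abstract is standard; as a minimal pure-Python sketch (function and variable names are mine), the statistic and its Welch–Satterthwaite degrees of freedom can be computed from two samples as:

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's approximate degrees-of-freedom t test statistic.

    Returns (t, df) using the Welch-Satterthwaite approximation;
    no equal-variance assumption is made.
    """
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a), variance(b)   # sample variances (n - 1 denominator)
    se2 = v1 / n1 + v2 / n2             # squared standard error of the mean difference
    t = (mean(a) - mean(b)) / math.sqrt(se2)
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df

t, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
# t ≈ -1.897, df ≈ 5.88; df always falls between min(n1, n2) - 1 and n1 + n2 - 2
```

The p-value is then obtained from a t distribution with the (generally non-integer) df above, e.g. via `scipy.stats.t`.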


1977, Vol 14 (4), pp. 493-498
Author(s): Joanne C. Rogan, H. J. Keselman

Numerous investigations have examined the effects of variance heterogeneity on the empirical probability of a Type I error for the analysis of variance (ANOVA) F-test, and the prevailing conclusion has been that when sample sizes are equal, the ANOVA is robust to variance heterogeneity. However, Box (1954) reported a Type I error rate of .12, for a 5% nominal level, when unequal variances were paired with equal sample sizes. The present paper explored this finding, examining varying degrees and patterns of variance heterogeneity for varying sample sizes and numbers of treatment groups. The data indicate that the Type I error rate varies as a function of the degree of variance heterogeneity and, consequently, it should not be assumed that the ANOVA F-test is always robust to variance heterogeneity when sample sizes are equal.
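The F-test whose robustness the abstract questions is the ordinary one-way ANOVA statistic. A minimal sketch (names are mine) of how it is computed from a list of groups; in a robustness study like Box’s, this statistic would be recomputed over many simulated heteroscedastic null datasets and the rejection fraction compared to the nominal level:

```python
from statistics import mean

def anova_f(groups):
    """One-way ANOVA F statistic for a list of samples (one list per group)."""
    grand = mean(x for g in groups for x in g)
    k = len(groups)                                  # number of treatment groups
    n = sum(len(g) for g in groups)                  # total sample size
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

f = anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
# F = 3.0 on (2, 6) degrees of freedom for this toy data
```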


Methodology, 2019, Vol 15 (3), pp. 128-136
Author(s): Jiin-Huarng Guo, Hubert J. Chen, Wei-Ming Luh

Abstract. Equivalence tests (also known as similarity or parity tests) have become more and more popular in addition to equality tests. However, in testing the equivalence of two population means, approximate sample sizes derived using conventional techniques in the literature have usually been undervalued, delivering less statistical power than is required. In this paper, the authors first address the reason for this problem and then provide a solution, using an exhaustive local search algorithm to find the optimal sample size. The proposed method is not only accurate but also flexible, so that unequal variances or sampling unit costs for different groups can be accommodated using different sample size allocations. Figures and a numerical example are presented to demonstrate various configurations. An R Shiny app is also available for easy use (https://optimal-sample-size.shinyapps.io/equivalence-of-means/).
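The equivalence test underlying such sample-size work is typically the two one-sided tests (TOST) procedure. The sketch below is not the authors’ search algorithm; it is a standard large-sample TOST on summary statistics, using a normal approximation (all names and the example numbers are mine):

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def tost_equivalence(m1, v1, n1, m2, v2, n2, delta, alpha=0.05):
    """Two one-sided tests (TOST) for mean equivalence within +/- delta.

    Large-sample normal approximation on summary statistics; declares
    equivalence only when BOTH one-sided p-values fall below alpha.
    """
    se = math.sqrt(v1 / n1 + v2 / n2)
    diff = m1 - m2
    p_lower = 1.0 - norm_cdf((diff + delta) / se)   # H0: diff <= -delta
    p_upper = norm_cdf((diff - delta) / se)         # H0: diff >= +delta
    return max(p_lower, p_upper) < alpha

# Mean difference 0.1 with margin 0.5 and n = 100 per group: equivalent.
equivalent = tost_equivalence(10.0, 1.0, 100, 10.1, 1.0, 100, delta=0.5)
```

A sample-size search such as the one the abstract describes would wrap a power computation around a test like this and scan candidate (n1, n2) allocations for the cheapest one meeting the power target.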


2009, Vol 59 (6)
Author(s): Júlia Volaufová

Abstract. Seemingly, testing for fixed effects in linear models with variance-covariance components was solved decades ago. However, even in situations as simple as the fixed one-way model with heteroscedastic variances (a multiple-means case of the Behrens-Fisher problem), questions about the statistical properties of various approximate test statistics remain open. Here we present a brief overview of several approaches suggested in the literature, as well as those available in statistical software, accompanied by a simulation study in which the accuracy of p-values is studied. Our interest is limited here to Welch’s test, the Satterthwaite-Fai-Cornelius test, the Kenward-Roger test, the simple ANOVA F-test, and the parametric bootstrap test. We conclude that for small sample sizes, regardless of the number of compared means and the heterogeneity of variance, the ANOVA F-test p-value performs best. For larger sample sizes (at least 5 per group), the parametric bootstrap performs well, and the Kenward-Roger test also performs well.
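The parametric bootstrap test mentioned in the abstract can be sketched as follows. This is my own minimal version, not the exact procedure studied in the paper: a Welch-type weighted between-group statistic is computed on the data, then recomputed on samples drawn from zero-mean normals with the estimated group variances, and the Monte Carlo p-value is the fraction of bootstrap statistics at least as large as the observed one:

```python
import math
import random
from statistics import mean, variance

def pb_test(groups, reps=1000, seed=1):
    """Parametric bootstrap test of equal means under heteroscedasticity.

    Test statistic: precision-weighted between-group sum of squares.
    Null samples are drawn as N(0, v_i) with v_i the sample variances.
    Returns a Monte Carlo p-value.
    """
    rng = random.Random(seed)

    def stat(gs):
        w = [len(g) / variance(g) for g in gs]      # precision weights n_i / v_i
        grand = sum(wi * mean(g) for wi, g in zip(w, gs)) / sum(w)
        return sum(wi * (mean(g) - grand) ** 2 for wi, g in zip(w, gs))

    obs = stat(groups)
    sds = [math.sqrt(variance(g)) for g in groups]  # estimated group SDs
    count = 0
    for _ in range(reps):
        boot = [[rng.gauss(0.0, sd) for _ in g] for sd, g in zip(sds, groups)]
        count += stat(boot) >= obs
    return (count + 1) / (reps + 1)                 # add-one Monte Carlo p-value

# Clearly separated means should yield a small p-value.
g1 = [0.1, -0.2, 0.3, 0.0, -0.1, 0.2, -0.3, 0.1, 0.0, -0.1]
g2 = [x + 10.0 for x in g1]
p = pb_test([g1, g2])
```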


2012, Vol 58 (3), pp. 426-439
Author(s): Rebecca Safran, Samuel Flaxman, Michael Kopp, Darren E. Irwin, Derek Briggs, ...

Abstract. Whereas a rich literature exists for estimating population genetic divergence, metrics of phenotypic trait divergence are lacking, particularly for comparing multiple traits among three or more populations. Here, we review and analyze via simulation Hedges’ g, a widely used parametric estimate of effect size. Our analyses indicate that g is sensitive to a combination of unequal trait variances and unequal sample sizes among populations, and to changes in the scale of measurement. We then derive and explain a new, non-parametric distance measure, “Δp”, which is calculated from a joint cumulative distribution function (CDF) of all populations under study. More precisely, distances are measured in terms of the percentiles in this CDF at which each population’s median lies. Δp combines many desirable features of other distance metrics into a single metric; namely, compared to other metrics, Δp is relatively insensitive to unequal variances and sample sizes among the populations sampled. Furthermore, a key feature of Δp, and our main motivation for developing it, is that it easily accommodates simultaneous comparisons of any number of traits across any number of populations. To exemplify its utility, we employ Δp to address a question related to the role of sexual selection in speciation: are sexual signals more divergent than ecological traits in closely related taxa? Using traits of known function in closely related populations, we show that traits predictive of reproductive performance are, indeed, more divergent and more sexually dimorphic than traits related to ecological adaptation.
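The core idea behind Δp, as the abstract describes it, can be illustrated for the simplest case of one trait and two populations; the published measure handles multiple traits and populations and may differ in detail from this sketch (function and variable names are mine):

```python
from statistics import median

def delta_p(a, b):
    """Sketch of a joint-CDF percentile distance between two samples.

    Pools both samples into one empirical CDF, locates each sample's
    median as a percentile of that pooled CDF, and returns the absolute
    difference of the two percentiles (a value in [0, 1]).
    """
    pooled = sorted(a + b)
    n = len(pooled)

    def pct(x):                   # empirical CDF of the pooled data at x
        return sum(v <= x for v in pooled) / n

    return abs(pct(median(a)) - pct(median(b)))

dp = delta_p([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
# dp = 0.5: the medians sit at the 30th and 80th pooled percentiles
```

Because the measure is built on pooled ranks rather than on variances, a monotone rescaling of the trait or a change in group variances moves both percentiles together, which is the insensitivity property the abstract emphasizes.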

