Comparison of Test Statistics of Nonnormal and Unbalanced Samples for Multivariate Analysis of Variance in terms of Type-I Error Rates

2019 ◽  
Vol 2019 ◽  
pp. 1-8 ◽  
Author(s):  
Can Ateş ◽  
Özlem Kaymaz ◽  
H. Emre Kale ◽  
Mustafa Agah Tekindal

In this study, we investigate how the Wilks' lambda, Pillai's trace, Hotelling's trace, and Roy's largest root test statistics are affected when the normality and homogeneous-variance assumptions of MANOVA are violated; in other words, we examine the robustness of these tests. For this purpose, a simulation study is conducted under different scenarios. For different numbers of variables and different sample sizes, and with group variances either homogeneous (σ1² = σ2² = ⋯ = σg²) or heterogeneous and increasing (σ1² < σ2² < ⋯ < σg²), random numbers are generated from Gamma(4-4-4; 0.5), Gamma(4-9-36; 0.5), Student's t(2), and Normal(0; 1) distributions. Both balanced and unbalanced numbers of observations per group are also considered. After 10000 repetitions, type-I error rates are calculated for each test at α = 0.05. For the Gamma distributions, Pillai's trace gives the most robust results with 2 variables under both homogeneous and heterogeneous variances; with 3 variables, Roy's largest root is the most robust in balanced samples and Pillai's trace in unbalanced samples. For Student's t distribution, Pillai's trace is the most robust under homogeneous variances and Wilks' lambda under heterogeneous variances. For the normal distribution with homogeneous variances, Roy's largest root gives relatively more robust results with 2 variables and Wilks' lambda with 3 variables; with heterogeneous variances, Roy's largest root gives robust results for both 2 and 3 variables. Overall, the test statistics used with MANOVA are affected by violations of the homogeneity of covariance matrices and normality assumptions, particularly when the numbers of observations are unbalanced.
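A minimal Monte Carlo sketch of the kind of experiment described above is shown below, assuming a single unbalanced Gamma scenario, illustrative group sizes, and far fewer repetitions than the study's 10000; it relies on statsmodels' standard F approximations for the four statistics and is not the authors' full design.

```python
import numpy as np
import pandas as pd
from statsmodels.multivariate.manova import MANOVA

rng = np.random.default_rng(1)
alpha, n_reps = 0.05, 1000            # the study uses 10000 repetitions
group_sizes = [10, 20, 30]            # illustrative unbalanced design
rejections = {}

for _ in range(n_reps):
    rows = []
    for g, n in enumerate(group_sizes):
        # identical Gamma(4, 0.5) populations in every group, so H0 is true
        y = rng.gamma(shape=4.0, scale=0.5, size=(n, 2))
        rows.append(pd.DataFrame({"y1": y[:, 0], "y2": y[:, 1], "group": str(g)}))
    data = pd.concat(rows, ignore_index=True)

    res = MANOVA.from_formula("y1 + y2 ~ group", data=data).mv_test()
    table = res.results["group"]["stat"]   # Wilks, Pillai, Hotelling-Lawley, Roy
    for name, row in table.iterrows():
        rejections[name] = rejections.get(name, 0) + (row["Pr > F"] < alpha)

for name, count in rejections.items():
    print(f"{name:25s} empirical type-I error: {count / n_reps:.3f}")
```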

1995 ◽  
Vol 20 (1) ◽  
pp. 27-39 ◽  
Author(s):  
James Algina ◽  
R. Clifford Blair ◽  
William T. Coombs

A maximum test is developed in which the test statistic is the more extreme of the Brown-Forsythe and O'Brien test statistics. Estimated Type I error rates and power are presented for the Brown-Forsythe test, O'Brien's test, and the maximum test. For the conditions included in the study, Type I error rates for the maximum test are near the nominal level. In all conditions, the power of the maximum test tended to be equal to or greater than that of whichever test, O'Brien or Brown-Forsythe, had the larger power.
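As a rough illustration of the idea, the sketch below computes both component tests with SciPy (Brown-Forsythe as Levene's test with median centering, O'Brien as a one-way ANOVA on O'Brien-transformed scores), takes the more extreme (smaller) p-value as the combined statistic, and calibrates it by simulation under a normal, equal-variance null. The Monte Carlo calibration and the equal group sizes are assumptions for illustration; the article derives critical values differently.

```python
import numpy as np
from scipy import stats

def component_pvalues(groups):
    """Return (Brown-Forsythe p, O'Brien p) for a list of equal-length samples."""
    p_bf = stats.levene(*groups, center="median").pvalue            # Brown-Forsythe
    p_ob = stats.f_oneway(*stats.obrientransform(*groups)).pvalue   # O'Brien
    return p_bf, p_ob

def max_test(groups, alpha=0.05, n_sim=2000, rng=None):
    """Reject H0 (equal variances) if the smaller component p-value beats a
    Monte Carlo critical value computed under a normal, equal-variance null."""
    rng = rng if rng is not None else np.random.default_rng(0)
    observed = min(component_pvalues(groups))
    sizes = [len(g) for g in groups]
    null_min_p = np.array(
        [min(component_pvalues([rng.standard_normal(n) for n in sizes]))
         for _ in range(n_sim)])
    crit = np.quantile(null_min_p, alpha)    # alpha-quantile of the null min-p
    return observed <= crit, observed, crit

rng = np.random.default_rng(1)
groups = [rng.normal(0.0, s, 25) for s in (1.0, 1.0, 1.5)]
print(max_test(groups, rng=np.random.default_rng(2)))
```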


2016 ◽  
Vol 77 (1) ◽  
pp. 54-81 ◽  
Author(s):  
Sandip Sinharay ◽  
Matthew S. Johnson

In a pioneering research article, Wollack and colleagues suggested the “erasure detection index” (EDI) to detect test tampering. The EDI can be used with or without a continuity correction and is assumed to follow the standard normal distribution under the null hypothesis of no test tampering. When used without a continuity correction, the EDI often has inflated Type I error rates. When used with a continuity correction, the EDI has satisfactory Type I error rates, but smaller power compared with the EDI without a continuity correction. This article suggests three methods for detecting test tampering that do not rely on the assumption of a standard normal distribution under the null hypothesis. It is demonstrated in a detailed simulation study that the performance of each suggested method is slightly better than that of the EDI. The EDI and the suggested methods were applied to a real data set. The suggested methods, although more computation intensive than the EDI, seem to be promising in detecting test tampering.


2017 ◽  
Author(s):  
Rounak Dey ◽  
Ellen M. Schmidt ◽  
Goncalo R. Abecasis ◽  
Seunggeun Lee

The availability of electronic health record (EHR)-based phenotypes allows for genome-wide association analyses of thousands of traits and has great potential to identify novel genetic variants associated with clinical phenotypes. We can interpret a phenome-wide association study (PheWAS) result for a single genetic variant by observing its association across a landscape of phenotypes. Since PheWAS can test thousands of binary phenotypes, most of which have unbalanced (case:control = 1:10) or often extremely unbalanced (case:control = 1:600) case-control ratios, existing methods cannot provide an accurate and scalable way to test for associations. Here we propose a computationally fast score-test-based method that estimates the distribution of the test statistic using the saddlepoint approximation. Our method is much faster (∼100 times) than the state-of-the-art Firth's test. It can also adjust for covariates and control type I error rates even when the case-control ratio is extremely unbalanced. Through application to PheWAS data from the Michigan Genomics Initiative, we show that the proposed method can control type I error rates while replicating previously known association signals even for traits with a very small number of cases and a large number of controls.
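The core idea, a single-variant score test whose null tail probability comes from a saddlepoint approximation of the score's cumulant generating function, can be sketched as below. This is a hedged illustration only: the simulated data, the bracket used for the saddlepoint root, and the two-sided combination are assumptions, and the published method adds refinements (a fast variance estimate and a partially normal approximation away from the tails) that are omitted here.

```python
import numpy as np
import statsmodels.api as sm
from scipy.optimize import brentq
from scipy.special import expit, logit
from scipy.stats import norm

def spa_score_pvalue(g, y, covariates):
    """Two-sided SPA p-value for H0: no genotype effect in a logistic null model."""
    X = sm.add_constant(covariates)
    mu = sm.GLM(y, X, family=sm.families.Binomial()).fit().fittedvalues
    w = mu * (1 - mu)
    # project the covariates out of the genotype (standard score-test adjustment)
    g = g - X @ np.linalg.solve(X.T @ (X * w[:, None]), X.T @ (w * g))
    s = g @ (y - mu)                                   # observed score

    # cumulant generating function of the score and its first two derivatives
    K  = lambda t: np.sum(np.logaddexp(np.log1p(-mu), np.log(mu) + g * t)) - t * (g @ mu)
    K1 = lambda t: g @ expit(logit(mu) + g * t) - g @ mu
    K2 = lambda t: np.sum(g ** 2 * expit(logit(mu) + g * t) * (1 - expit(logit(mu) + g * t)))

    def tail(q):       # P(S >= q) for q > 0, P(S <= q) for q < 0 (Barndorff-Nielsen form)
        zeta = brentq(lambda t: K1(t) - q, -500, 500)  # assumes the root lies in this bracket
        w_ = np.sign(zeta) * np.sqrt(2 * (zeta * q - K(zeta)))
        v_ = zeta * np.sqrt(K2(zeta))
        z = w_ + np.log(v_ / w_) / w_
        return norm.sf(z) if q > 0 else norm.cdf(z)

    # NB: in practice the SPA is applied only in the tails, with a normal
    # approximation near the mean (where w_ above can vanish).
    return tail(abs(s)) + tail(-abs(s))

# toy data: ~1% cases, a low-frequency variant, one covariate
rng = np.random.default_rng(7)
n = 20000
covar = rng.standard_normal((n, 1))
y = rng.binomial(1, 0.01, size=n)
g = rng.binomial(2, 0.03, size=n).astype(float)
print("SPA p-value:", spa_score_pvalue(g, y, covar))
```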


2004 ◽  
Vol 3 (1) ◽  
pp. 1-69 ◽  
Author(s):  
Sandrine Dudoit ◽  
Mark J. van der Laan ◽  
Katherine S. Pollard

The present article proposes general single-step multiple testing procedures for controlling Type I error rates defined as arbitrary parameters of the distribution of the number of Type I errors, such as the generalized family-wise error rate. A key feature of our approach is the test statistics null distribution (rather than data generating null distribution) used to derive cut-offs (i.e., rejection regions) for these test statistics and the resulting adjusted p-values. For general null hypotheses, corresponding to submodels for the data generating distribution, we identify an asymptotic domination condition for a null distribution under which single-step common-quantile and common-cut-off procedures asymptotically control the Type I error rate, for arbitrary data generating distributions, without the need for conditions such as subset pivotality. Inspired by this general characterization of a null distribution, we then propose as an explicit null distribution the asymptotic distribution of the vector of null value shifted and scaled test statistics. In the special case of family-wise error rate (FWER) control, our method yields the single-step minP and maxT procedures, based on minima of unadjusted p-values and maxima of test statistics, respectively, with the important distinction in the choice of null distribution. Single-step procedures based on consistent estimators of the null distribution are shown to also provide asymptotic control of the Type I error rate. A general bootstrap algorithm is supplied to conveniently obtain consistent estimators of the null distribution. The special cases of t- and F-statistics are discussed in detail. The companion articles focus on step-down multiple testing procedures for control of the FWER (van der Laan et al., 2004b) and on augmentations of FWER-controlling methods to control error rates such as tail probabilities for the number of false positives and for the proportion of false positives among the rejected hypotheses (van der Laan et al., 2004a). The proposed bootstrap multiple testing procedures are evaluated by a simulation study and applied to genomic data in the fourth article of the series (Pollard et al., 2004).
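A compact sketch of a single-step maxT procedure with a bootstrap-estimated null distribution is given below. The one-sample t-statistics, the simple mean-centering used to mimic the null, and the nonparametric bootstrap are illustrative assumptions rather than the article's general construction.

```python
import numpy as np

def maxT_adjusted_pvalues(X, n_boot=2000, rng=None):
    """One-sample t-statistics for the m columns of X (H0_j: mean_j = 0) and
    single-step maxT adjusted p-values from a bootstrap null distribution."""
    rng = rng if rng is not None else np.random.default_rng(0)
    n, m = X.shape
    t_obs = X.mean(0) / (X.std(0, ddof=1) / np.sqrt(n))

    Xc = X - X.mean(0)                      # center so every hypothesis behaves as null
    max_null = np.empty(n_boot)
    for b in range(n_boot):
        Xb = Xc[rng.integers(0, n, size=n)]
        tb = Xb.mean(0) / (Xb.std(0, ddof=1) / np.sqrt(n))
        max_null[b] = np.abs(tb).max()      # maximum over the m test statistics

    # single-step adjusted p-value: P(max_j |T_j| >= |t_obs_j|) under the null
    adj_p = np.array([(max_null >= abs(t)).mean() for t in t_obs])
    return t_obs, adj_p

# toy example: 50 observations on 200 variables; only the first 5 have a real shift
rng = np.random.default_rng(3)
X = rng.standard_normal((50, 200))
X[:, :5] += 0.8
t_obs, adj_p = maxT_adjusted_pvalues(X, rng=rng)
print("rejected at FWER 0.05:", np.where(adj_p <= 0.05)[0])
```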


1989 ◽  
Vol 14 (4) ◽  
pp. 351-371 ◽  
Author(s):  
Michael R. Harwell ◽  
Ronald C. Serlin

Puri and Sen (1969, 1985) presented a nonparametric test statistic based on a general linear model approach that is appropriate for testing a wide class of hypotheses. The two forms of this statistic, pure- and mixed-rank, differ according to whether the original predictor values or their ranks are used. Both forms permit the use of standard statistical packages to perform the analyses. The applicability of these statistics in testing a number of hypotheses is highlighted, and an example of their use is given. A simulation study for the multivariate-multiple-regression case is used to examine the distributional behavior of the pure- and mixed-rank statistics and an important competitor, the rank transformation of Conover and Iman (1981). The results suggest that the pure- and mixed-rank statistics are superior with respect to minimizing liberal Type I error rates, whereas the Conover and Iman statistic produces larger power values.
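One commonly cited form of the Puri-Sen L statistic for a multivariate linear model is (N - 1) times the Pillai-Bartlett trace computed from rank-transformed data, referred to a chi-square distribution with p × q degrees of freedom. The sketch below implements that pure-rank form under these assumptions; consult the article for the exact pure- and mixed-rank definitions and their use with standard packages.

```python
import numpy as np
from scipy.stats import rankdata, chi2

def puri_sen_pure_rank(Y, X):
    """Y: (N, p) responses, X: (N, q) predictors; tests H0 of no association."""
    N, p = Y.shape
    q = X.shape[1]
    Ry = np.column_stack([rankdata(col) for col in Y.T])   # rank the responses
    Rx = np.column_stack([rankdata(col) for col in X.T])   # and the predictors (pure rank)

    Xd = np.column_stack([np.ones(N), Rx])                 # design with intercept
    B, *_ = np.linalg.lstsq(Xd, Ry, rcond=None)
    resid = Ry - Xd @ B
    E = resid.T @ resid                                    # error SSCP
    T = (Ry - Ry.mean(0)).T @ (Ry - Ry.mean(0))            # total SSCP about the means
    H = T - E                                              # hypothesis SSCP

    V = np.trace(H @ np.linalg.inv(H + E))                 # Pillai-Bartlett trace
    L = (N - 1) * V
    return L, chi2.sf(L, df=p * q)

rng = np.random.default_rng(5)
X = rng.standard_normal((60, 2))
Y = 0.5 * X[:, [0]] + rng.standard_t(df=3, size=(60, 3))   # heavy-tailed errors
print(puri_sen_pure_rank(Y, X))
```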


2020 ◽  
Vol 17 (3) ◽  
pp. 273-284 ◽  
Author(s):  
Babak Choodari-Oskooei ◽  
Daniel J Bratton ◽  
Melissa R Gannon ◽  
Angela M Meade ◽  
Matthew R Sydes ◽  
...  

Background: Experimental treatments pass through various stages of development. If a treatment passes through early-phase experiments, the investigators may want to assess it in a late-phase randomised controlled trial. An efficient way to do this is to add it as a new research arm to an ongoing trial while the existing research arms continue, a so-called multi-arm platform trial. The familywise type I error rate is often a key quantity of interest in any multi-arm platform trial. We set out to clarify how it should be calculated when new arms are added to a trial some time after it has started. Methods: We show how the familywise type I error rate, any-pair power and all-pairs power can be calculated when a new arm is added to a platform trial. We extend the Dunnett probability and derive analytical formulae for the correlation between the test statistics of the existing pairwise comparison and that of the newly added arm. We also verify our analytical derivation via simulations. Results: Our results indicate that the familywise type I error rate depends on the amount of information shared through the common control arm (i.e. the number of control-arm individuals for continuous and binary outcomes, and the number of primary outcome events for time-to-event outcomes) and on the allocation ratio. The familywise type I error rate is driven more by the number of pairwise comparisons and the corresponding (pairwise) type I error rates than by the timing of the addition of the new arms. The familywise type I error rate can be estimated using Šidák's correction if the correlation between the test statistics of pairwise comparisons is less than 0.30. Conclusions: The findings we present in this article can be used to design trials with pre-planned deferred arms or to add new pairwise comparisons within an ongoing platform trial where control of the pairwise error rate or familywise type I error rate (for a subset of pairwise comparisons) is required.
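The relationship between the familywise type I error rate and Šidák's correction can be illustrated numerically. The sketch below assumes K two-sided z-tests with a common pairwise correlation (for comparisons sharing a full control arm under 1:1 allocation, the Dunnett correlation is 0.5) and estimates the familywise error rate by simulation; the article's analytical formulae for arms added at different times are not reproduced here.

```python
import numpy as np
from scipy.stats import norm

def fwer_two_sided(K, rho, alpha=0.05, n_sim=200_000, rng=None):
    """Familywise error rate for K two-sided z-tests at level alpha, assuming a
    common correlation rho between the test statistics (estimated by simulation)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    z_crit = norm.ppf(1 - alpha / 2)
    cov = np.full((K, K), rho)
    np.fill_diagonal(cov, 1.0)
    Z = rng.multivariate_normal(np.zeros(K), cov, size=n_sim)
    return np.mean((np.abs(Z) >= z_crit).any(axis=1))   # at least one false rejection

K, alpha = 3, 0.05
sidak = 1 - (1 - alpha) ** K
for rho in (0.0, 0.3, 0.5):     # rho = 0.5 is the Dunnett value for 1:1 allocation
    print(f"rho = {rho:.1f}: FWER ~ {fwer_two_sided(K, rho):.4f}  (Sidak: {sidak:.4f})")
```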


Author(s):  
Abdullah A. Ameen ◽  
Osama H. Abbas

The classical Wilks' statistic is most often used to test hypotheses in one-way multivariate analysis of variance (MANOVA), and it is highly sensitive to the effects of outliers. The non-robustness of test statistics based on normal theory has led many authors to examine various alternatives. In this paper, we present a robust version of the Wilks' statistic and construct its approximate distribution. A comparison is made between the proposed statistic and some existing Wilks' statistics. Monte Carlo studies are used to assess the performance of the test statistics on different data sets. Moreover, the type I error rate and the power of the test are used as statistical tools to compare the test statistics. The study reveals that, under normally distributed data, the type I error rates of the classical and proposed Wilks' statistics are close to the true significance levels, and the powers of the test statistics are very close. In addition, in the case of contaminated distributions, the proposed statistic performs best.
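The general robustification idea, replacing the classical group locations and scatters in the Wilks-type ratio det(E)/det(E + H) with high-breakdown estimates such as the reweighted minimum covariance determinant (MCD), can be sketched as follows. This illustrates the approach only; it is not the authors' exact statistic or its approximate distribution.

```python
import numpy as np
from sklearn.covariance import MinCovDet

def classical_wilks(groups):
    """Classical Wilks' lambda = det(E) / det(E + H) for a one-way layout."""
    means = [g.mean(0) for g in groups]
    sizes = [len(g) for g in groups]
    E = sum((g - m).T @ (g - m) for g, m in zip(groups, means))
    grand = np.average(means, axis=0, weights=sizes)
    H = sum(n * np.outer(m - grand, m - grand) for n, m in zip(sizes, means))
    return np.linalg.det(E) / np.linalg.det(E + H)

def robust_wilks(groups):
    """Same ratio, but with MCD locations/scatters in place of the classical ones."""
    locs, scatters, sizes = [], [], []
    for Xg in groups:
        mcd = MinCovDet(random_state=0).fit(Xg)
        locs.append(mcd.location_)
        scatters.append(mcd.covariance_ * (len(Xg) - 1))   # put on an SSCP-like scale
        sizes.append(len(Xg))
    E = sum(scatters)
    grand = np.average(locs, axis=0, weights=sizes)
    H = sum(n * np.outer(m - grand, m - grand) for n, m in zip(sizes, locs))
    return np.linalg.det(E) / np.linalg.det(E + H)

rng = np.random.default_rng(2)
clean = [rng.multivariate_normal(np.zeros(2), np.eye(2), 30) for _ in range(3)]
contaminated = [np.vstack([g[:-3], g[-3:] + 8.0]) for g in clean]  # a few gross outliers
print("classical:", classical_wilks(contaminated), " robust:", robust_wilks(contaminated))
```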


2017 ◽  
Vol 41 (8) ◽  
pp. 632-644
Author(s):  
Jie Xu ◽  
Insu Paek ◽  
Yan Xia

It is widely known that the Type I error rates of goodness-of-fit tests using full-information test statistics, such as Pearson's χ² and the likelihood ratio statistic G², are problematic when data are sparse. Under such conditions, the limited-information goodness-of-fit statistic M2 is recommended for assessing model fit for models with binary response data. A simulation study was conducted to investigate the power and Type I error rate of M2 in fitting unidimensional models to many different types of multidimensional data. As an additional interest, the behavior of RMSEA2, the root mean square error of approximation (RMSEA) based on M2, was also examined. Findings from the current study showed that M2 and RMSEA2 are sensitive in detecting misfit due to varying slope parameters, a bifactor structure, and a partially (or completely) simple structure for multidimensional data, but not misfit due to within-item multidimensional structures.
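For reference, an RMSEA-type index can be computed from any chi-square-distributed fit statistic; a small helper under the usual convention is shown below. Whether the sample-size term is N or N - 1 varies across sources, so treat this as illustrative rather than the exact definition of RMSEA2 used in the study.

```python
import math

def rmsea_from_m2(m2, df, n):
    """RMSEA-type index: sqrt(max(M2 - df, 0) / (df * (N - 1)))."""
    return math.sqrt(max(m2 - df, 0.0) / (df * (n - 1)))

print(rmsea_from_m2(m2=130.4, df=90, n=1000))   # illustrative numbers only
```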


PLoS ONE ◽  
2020 ◽  
Vol 15 (11) ◽  
pp. e0242722
Author(s):  
Zhiming Li ◽  
Changxing Ma ◽  
Mingyao Ai

This paper proposes asymptotic and exact methods for testing the equality of correlations for multiple bilateral data under Dallal's model. Three asymptotic test statistics are derived for large samples. Since these are not reliable for small samples, several conditional and unconditional exact methods are proposed based on the three statistics. Numerical studies are conducted to compare all of these methods with regard to type I error rates (TIEs) and powers. The results show that the asymptotic score test is the most robust, and two of the exact tests have satisfactory TIEs and powers. Some real examples are provided to illustrate the effectiveness of these tests.

