More on the Supremum Statistic to Test Multivariate Skew-Normality

Computation ◽  
2021 ◽  
Vol 9 (12) ◽  
pp. 126
Author(s):  
Timothy Opheim ◽  
Anuradha Roy

This review verifies and generalizes the supremum test statistic developed by Balakrishnan et al. for testing multivariate skew-normality. Exhaustive simulation studies are conducted across various dimensions to determine the empirical size of the test. The Monte Carlo results indicate that the Type-I error of the supremum test can be controlled reasonably well across dimensions at the nominal significance levels 0.05 and 0.01, and cut-off values are provided for the sample sizes required to attain these levels. Some new and relevant information about the supremum test statistic is reported here.
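The empirical-size computation described above follows a standard Monte Carlo recipe: simulate many datasets under the null hypothesis, apply the test at the nominal level, and record the rejection rate. The sketch below illustrates that recipe in Python; because the supremum statistic itself requires skew-normal estimation machinery, Mardia's multivariate skewness test of normality stands in as the example statistic, so this is the general recipe rather than the authors' procedure.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def mardia_skewness_pvalue(X):
    """P-value of Mardia's multivariate skewness test of normality."""
    n, d = X.shape
    Z = X - X.mean(axis=0)
    S = Z.T @ Z / n                      # ML covariance estimate
    G = Z @ np.linalg.solve(S, Z.T)      # Mahalanobis cross-products
    b1 = np.mean(G ** 3)                 # Mardia's sample skewness
    stat = n * b1 / 6.0                  # asymptotically chi-squared under H0
    df = d * (d + 1) * (d + 2) / 6.0
    return stats.chi2.sf(stat, df)

def empirical_size(n, d, alpha=0.05, reps=2000):
    """Rejection rate when H0 (multivariate normality) is true."""
    rejections = sum(
        mardia_skewness_pvalue(rng.standard_normal((n, d))) < alpha
        for _ in range(reps)
    )
    return rejections / reps

for n in (25, 50, 100, 200):
    print(n, empirical_size(n, d=3))
```

Running the same loop with the supremum statistic in place of the stand-in, and increasing n until the rejection rate stabilizes at the nominal level, is how cut-off sample sizes of the kind reported above can be obtained.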

Author(s):  
Zaheer Ahmed ◽  
Alberto Cassese ◽  
Gerard van Breukelen ◽  
Jan Schepers

We present a novel method, REMAXINT, that captures the gist of two-way interaction in row by column (i.e., two-mode) data, with one observation per cell. REMAXINT is a probabilistic two-mode clustering model that yields two-mode partitions with maximal interaction between row and column clusters. For estimation of the parameters of REMAXINT, we maximize a conditional classification likelihood in which the random row (or column) main effects are conditioned out. For testing the null hypothesis of no interaction between row and column clusters, we propose a $max\text{-}F$ test statistic and discuss its properties. We develop a Monte Carlo approach to obtain its sampling distribution under the null hypothesis. We evaluate the performance of the method through simulation studies. Specifically, for selected values of data size and (true) numbers of clusters, we obtain critical values of the $max\text{-}F$ statistic, determine the empirical Type I error rate of the proposed inferential procedure, and study its power to reject the null hypothesis. Next, we show that the novel method is useful in a variety of applications by presenting two empirical case studies, and we end with some concluding remarks.
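Because the null distribution of a maximum over many candidate partitions has no closed form, Monte Carlo critical values are the natural route. The sketch below illustrates the idea on a stylized analogue of the max-F statistic, namely the maximum absolute interaction contrast over all two-cluster row and column partitions of a double-centered data matrix; it demonstrates the Monte Carlo approach only and is not the REMAXINT implementation.

```python
import itertools
import numpy as np

rng = np.random.default_rng(2)
I, J = 6, 6  # small grid so all two-cluster partitions can be enumerated

def two_cluster_partitions(n):
    """All splits of {0,...,n-1} into two nonempty clusters (element 0 fixed in cluster 0)."""
    parts = []
    for r in range(1, n):
        for subset in itertools.combinations(range(1, n), r):
            mask = np.zeros(n, dtype=bool)
            mask[list(subset)] = True
            parts.append(mask)
    return parts

ROWS, COLS = two_cluster_partitions(I), two_cluster_partitions(J)

def max_interaction_stat(Y):
    """Maximum absolute 2x2 interaction contrast over all partition pairs,
    computed on the double-centered data (main effects removed)."""
    D = Y - Y.mean(axis=1, keepdims=True) - Y.mean(axis=0, keepdims=True) + Y.mean()
    best = 0.0
    for rm in ROWS:
        for cm in COLS:
            t = (D[rm][:, cm].mean() - D[rm][:, ~cm].mean()
                 - D[~rm][:, cm].mean() + D[~rm][:, ~cm].mean())
            best = max(best, abs(t))
    return best

# Null sampling distribution: pure noise, hence no row-by-column interaction.
# Modest number of replicates to keep the runtime of the sketch small.
null_draws = np.array([max_interaction_stat(rng.standard_normal((I, J)))
                       for _ in range(500)])
print(round(np.quantile(null_draws, 0.95), 3))  # Monte Carlo critical value, alpha = .05
```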


1982 ◽  
Vol 7 (3) ◽  
pp. 207-214 ◽  
Author(s):  
Jennifer J. Clinch ◽  
H. J. Keselman

The ANOVA, Welch, and Brown and Forsythe tests for mean equality were compared using Monte Carlo methods. The tests’ rates of Type I error and power were examined when populations were non-normal, variances were heterogeneous, and group sizes were unequal. The ANOVA F test was most affected by the assumption violations. The test proposed by Brown and Forsythe appeared, on the average, to be the “best” test statistic for testing an omnibus hypothesis of mean equality.
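A comparison of this kind is straightforward to reproduce. The sketch below pits the classical ANOVA F test against Welch's heteroscedasticity-robust test in a null scenario with unequal variances and unequal group sizes; the scenario parameters are illustrative rather than those of the original study, and the Brown and Forsythe variant is omitted for brevity.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

def welch_anova_pvalue(groups):
    """Welch's (1951) heteroscedasticity-robust one-way ANOVA."""
    k = len(groups)
    n = np.array([len(g) for g in groups], dtype=float)
    m = np.array([np.mean(g) for g in groups])
    v = np.array([np.var(g, ddof=1) for g in groups])
    w = n / v                                  # precision weights
    mw = np.sum(w * m) / np.sum(w)             # weighted grand mean
    tmp = np.sum((1 - w / w.sum()) ** 2 / (n - 1))
    f = (np.sum(w * (m - mw) ** 2) / (k - 1)) / (1 + 2 * (k - 2) * tmp / (k ** 2 - 1))
    df2 = (k ** 2 - 1) / (3 * tmp)
    return stats.f.sf(f, k - 1, df2)

# Null scenario: equal means, unequal variances, unequal group sizes,
# with the largest variance paired with the smallest group.
sizes, sds = (10, 20, 40), (4.0, 2.0, 1.0)
alpha, reps = 0.05, 5000
rej_f = rej_w = 0
for _ in range(reps):
    groups = [rng.normal(0.0, s, n) for n, s in zip(sizes, sds)]
    rej_f += stats.f_oneway(*groups).pvalue < alpha
    rej_w += welch_anova_pvalue(groups) < alpha
print("ANOVA F:", rej_f / reps, "  Welch:", rej_w / reps)
```

Pairing the largest variance with the smallest group makes the classical F test liberal, which is exactly the kind of assumption-violation effect the study documents.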


2005 ◽  
Vol 32 (3) ◽  
pp. 193-195 ◽  
Author(s):  
Holly Raffle ◽  
Gordon P. Brooks

Violations of assumptions, inflated Type I error rates, and robustness are important concepts for students to learn in an introductory statistics course. However, these abstract ideas can be difficult for students to understand. Monte Carlo simulation methods can provide a concrete way for students to learn abstract statistical concepts. This article describes the MC4G computer software (Brooks, 2004) and the accompanying instructor's manual (Raffle, 2004). It also provides a case study that includes both assessment and course evaluation data supporting the effectiveness of Monte Carlo simulation exercises in a graduate-level statistics course.


2020 ◽  
Vol 12 (5) ◽  
pp. 860 ◽  
Author(s):  
Vinicius Francisco Rofatto ◽  
Marcelo Tomio Matsuoka ◽  
Ivandro Klein ◽  
Maurício Roberto Veronez ◽  
Luiz Gonzaga da Silveira

An iterative outlier elimination procedure based on hypothesis testing, commonly known as Iterative Data Snooping (IDS) among geodesists, is often used for the quality control of modern measurement systems in geodesy and surveying. The test statistic associated with IDS is the extreme normalised least-squares residual. It is well-known in the literature that critical values (quantile values) of such a test statistic cannot be derived from well-known test distributions but must be computed numerically by means of Monte Carlo. This paper provides the first results on Monte Carlo-based critical values computed under different scenarios of correlation between outlier statistics. From the Monte Carlo evaluation, we compute the probabilities of correct identification, missed detection, wrong exclusion, over-identifications and statistical overlap associated with IDS in the presence of a single outlier. On the basis of such probability levels, we obtain the Minimal Detectable Bias (MDB) and Minimal Identifiable Bias (MIB) for cases in which IDS is in play. The MDB and MIB are sensitivity indicators for outlier detection and identification, respectively. The results show that there are circumstances in which the larger the Type I decision error (smaller critical value), the higher the rates of outlier detection but the lower the rates of outlier identification. In such a case, the larger the Type I error, the larger the ratio between the MIB and MDB. We also highlight that an outlier becomes identifiable when the contributions of the measures to the wrong exclusion rate decline simultaneously. In this case, we verify that the effect of the correlation between outlier statistics on the wrong exclusion rate becomes insignificant beyond a certain outlier magnitude, which increases the probability of identification.
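As a concrete illustration of how such critical values are computed, the sketch below simulates the null distribution of the extreme normalised least-squares residual for a toy linear model with uncorrelated observations and known variance factor; the design matrix, sample size and Type I error level are arbitrary placeholders, and the correlated scenarios analysed in the paper are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy design: n observations, u unknown parameters, unit weights, known sigma.
n, u, sigma = 12, 3, 1.0
A = rng.standard_normal((n, u))
# Redundancy (residual projector) matrix R = I - A (A'A)^{-1} A'.
R = np.eye(n) - A @ np.linalg.solve(A.T @ A, A.T)

def max_w(e):
    """Extreme absolute normalised least-squares residual for error vector e."""
    ehat = R @ e                               # least-squares residuals
    return np.max(np.abs(ehat / (sigma * np.sqrt(np.diag(R)))))

# Null Monte Carlo: Gaussian noise only, no outlier in play.
draws = np.array([max_w(sigma * rng.standard_normal(n)) for _ in range(20000)])
alpha = 0.001                                  # overall Type I error of the snooping step
print(round(np.quantile(draws, 1 - alpha), 3)) # Monte Carlo critical value
```

Lowering the critical value (a larger Type I error) raises the detection rate but can lower the identification rate, consistent with the trade-off described above.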


PeerJ ◽  
2016 ◽  
Vol 4 ◽  
pp. e2575
Author(s):  
Jeffrey A. Walker

Background: Self-contained tests estimate and test the association between a phenotype and mean expression level in a gene set defined a priori. Many self-contained gene set analysis methods have been developed, but the performance of these methods for phenotypes that are continuous rather than discrete, and with multiple nuisance covariates, has not been well studied. Here, I use Monte Carlo simulation to evaluate the performance of both novel and previously published (and readily available via R) methods for inferring effects of a continuous predictor on mean expression in the presence of nuisance covariates. The motivating data are a high-profile dataset which was used to show opposing effects of hedonic and eudaimonic well-being (or happiness) on the mean expression level of a set of genes that has been correlated with social adversity (the CTRA gene set). The original analysis of these data used a linear model (GLS) of fixed effects with correlated error to infer effects of Hedonia and Eudaimonia on mean CTRA expression.

Methods: The standardized effects of Hedonia and Eudaimonia on CTRA gene set expression estimated by GLS were compared to estimates using multivariate (OLS) linear models and generalized estimating equation (GEE) models. The OLS estimates were tested using O’Brien’s OLS test, Anderson’s permutation $r_F^2$-test, two permutation F-tests (including GlobalAncova), and a rotation z-test (Roast). The GEE estimates were tested using a Wald test with robust standard errors. The performance (Type I, II, S, and M errors) of all tests was investigated using a Monte Carlo simulation of data explicitly modeled on the re-analyzed dataset.

Results: GLS estimates are inconsistent between datasets, and, in each dataset, at least one coefficient is large and highly statistically significant. By contrast, effects estimated by OLS or GEE are very small, especially relative to the standard errors. Bootstrap and permutation GLS distributions suggest that the GLS results in downward-biased standard errors and inflated coefficients. The Monte Carlo simulation of error rates shows highly inflated Type I error from the GLS test and slightly inflated Type I error from the GEE test. By contrast, Type I error for all OLS tests is at the nominal level. The permutation F-tests have ∼1.9× the power of the other OLS tests. This increased power comes at the cost of a high sign error rate (∼10%) if tested on small effects.

Discussion: The apparently replicated pattern of well-being effects on gene expression is most parsimoniously explained as “correlated noise” due to the geometry of multiple regression. The GLS for fixed effects with correlated error, or any linear mixed model for estimating fixed effects in designs with many repeated measures or outcomes, should be used cautiously because of the inflated Type I and M error. By contrast, all OLS tests perform well, and the permutation F-tests have superior performance, including moderate power for very small effects.
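To make the OLS-based approach concrete, the sketch below applies an O’Brien-style composite test of a continuous predictor on a gene set's mean standardized expression, with nuisance covariates included in the model. The sample sizes, covariates and correlation structure are hypothetical stand-ins simulated under the null; the CTRA data themselves are not used.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(5)

# Hypothetical stand-in data: n subjects, m genes in the set.
n, m = 80, 50
x = rng.standard_normal(n)                 # continuous predictor (e.g., a well-being score)
Z = rng.standard_normal((n, 2))            # two nuisance covariates
shared = rng.standard_normal(n)[:, None]   # common factor -> correlated genes
Y = 0.7 * shared + rng.standard_normal((n, m))  # expression under H0: no effect of x

# O'Brien-style composite: standardize each gene, average into a single
# score, then test the predictor in an OLS model with the covariates.
Ys = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
score = Ys.mean(axis=1)
X = sm.add_constant(np.column_stack([x, Z]))
fit = sm.OLS(score, X).fit()
print(fit.params[1], fit.pvalues[1])       # coefficient and p-value for x
```

Wrapping this in a Monte Carlo loop, with and without a true effect of x, yields Type I, II, S and M error estimates of the kind reported above.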


2021 ◽  
pp. 096228022110082
Author(s):  
Yang Li ◽  
Wei Ma ◽  
Yichen Qin ◽  
Feifang Hu

Concerns have been expressed over the validity of statistical inference under covariate-adaptive randomization despite its extensive use in clinical trials. In the literature, the inferential properties under covariate-adaptive randomization have been studied mainly for continuous responses; in particular, it is well known that the usual two-sample t-test for treatment effect is typically conservative. This phenomenon of invalid tests has also been found for generalized linear models without adjustment for the covariates, and it is sometimes more worrisome due to inflated Type I error. The purpose of this study is to examine the unadjusted test for treatment effect under generalized linear models and covariate-adaptive randomization. For a large class of covariate-adaptive randomization methods, we obtain the asymptotic distribution of the test statistic under the null hypothesis and derive the conditions under which the test is conservative, valid, or anti-conservative. Several commonly used generalized linear models, such as logistic regression and Poisson regression, are discussed in detail. An adjustment method is also proposed to achieve a valid size based on the asymptotic results. Numerical studies confirm the theoretical findings and demonstrate the effectiveness of the proposed adjustment method.
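The conservativeness of the unadjusted test is easy to reproduce by simulation. The sketch below uses stratified permuted-block randomization on a binary prognostic covariate with a continuous response, a simpler setting than the generalized linear models analysed in the paper, and shows the unadjusted two-sample t-test rejecting below the nominal level under the null.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)

def stratified_blocks(z, block=4):
    """Stratified permuted-block randomization within levels of covariate z."""
    t = np.empty(len(z), dtype=int)
    for level in np.unique(z):
        idx = np.flatnonzero(z == level)
        blocks = [rng.permutation([0, 0, 1, 1])
                  for _ in range(int(np.ceil(len(idx) / block)))]
        t[idx] = np.concatenate(blocks)[:len(idx)]
    return t

alpha, reps, n = 0.05, 5000, 200
rej = 0
for _ in range(reps):
    z = rng.integers(0, 2, n)               # binary prognostic covariate
    t = stratified_blocks(z)                # covariate-adaptive assignment
    y = 1.5 * z + rng.standard_normal(n)    # H0 true: no treatment effect
    rej += stats.ttest_ind(y[t == 1], y[t == 0]).pvalue < alpha
print("unadjusted t-test rejection rate:", rej / reps)  # typically well below 0.05
```

Because the randomization balances the prognostic covariate across arms, its variability inflates the pooled standard error without producing mean differences between arms, which deflates the test statistic and makes the unadjusted test conservative.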

