ANOVA for Some Non-Normal Data by Inverting Reliability Estimators

2021 ◽  
Author(s):  
Robert Foster

Standard analysis of variance assumes observations are normally distributed within groups. This paper develops some analysis of variance tests for data that are Bernoulli, Poisson, exponentially, or geometrically distributed within groups. The tests are shown in Table 1. For natural exponential family data with conjugate priors for the distribution of means, reliability estimators directly estimate the posterior shrinkage. Using the linear posterior expectation induced by the conjugate prior, a method is developed to construct an analysis of variance test by determining an appropriate transformation of a reliability estimator. The sampling distribution of the transformed reliability estimator under the assumption of group mean equality is derived to construct an appropriate test statistic. This method is used to invert the generalized KR21 estimators of Foster (2021) for some non-normal data, and it is also shown that the standard analysis of variance F-test statistic can be transformed into a consistent reliability estimator under the same assumptions. A limited simulation study shows that the inverted KR21 test has, in some scenarios, higher power than a standard analysis of variance or a generalized linear model analysis of variance.
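As a minimal worked illustration of the "linear posterior expectation induced by the conjugate prior", consider the Bernoulli/binomial case with a Beta prior. This is a standard textbook identity, not a reproduction of the paper's derivation; the symbols $a$, $b$, $\mu$, and $B$ are notation introduced here for illustration.

```latex
X \mid \theta \sim \mathrm{Bin}(n,\theta), \qquad \theta \sim \mathrm{Beta}(a,b)
\\[4pt]
E[\theta \mid X = x] \;=\; \frac{a + x}{a + b + n}
\;=\; B\,\mu + (1 - B)\,\frac{x}{n},
\qquad \mu = \frac{a}{a+b}, \quad B = \frac{a+b}{a+b+n}.
```

The posterior mean is linear in the observed proportion $x/n$, and $B$ is the shrinkage toward the prior mean $\mu$. On this reading, a reliability estimator would play the role of an estimate of $1 - B$, which is an interpretation of the abstract rather than a formula quoted from it.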

1982 ◽  
Vol 7 (3) ◽  
pp. 207-214 ◽  
Author(s):  
Jennifer J. Clinch ◽  
H. J. Keselman

The ANOVA, Welch, and Brown and Forsythe tests for mean equality were compared using Monte Carlo methods. The tests’ rates of Type I error and power were examined when populations were non-normal, variances were heterogeneous, and group sizes were unequal. The ANOVA F test was most affected by the assumption violations. The test proposed by Brown and Forsythe appeared, on average, to be the “best” test statistic for testing an omnibus hypothesis of mean equality.
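To make the three statistics being compared concrete, here is a minimal Python sketch using only the standard library. It uses the usual textbook formulas for the one-way ANOVA F, Welch's statistic, and the Brown–Forsythe F*; the function names are hypothetical, and the sketch computes only the statistics, not their reference distributions or p-values.

```python
from statistics import mean, variance


def anova_f(groups):
    """Classical one-way ANOVA F statistic (assumes equal variances)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


def welch_f(groups):
    """Welch's statistic: group means weighted by n_j / s_j^2."""
    k = len(groups)
    w = [len(g) / variance(g) for g in groups]
    w_sum = sum(w)
    weighted_mean = sum(wj * mean(g) for wj, g in zip(w, groups)) / w_sum
    lam = sum((1 - wj / w_sum) ** 2 / (len(g) - 1) for wj, g in zip(w, groups))
    numerator = sum(
        wj * (mean(g) - weighted_mean) ** 2 for wj, g in zip(w, groups)
    ) / (k - 1)
    denominator = 1 + 2 * (k - 2) * lam / (k ** 2 - 1)
    return numerator / denominator


def brown_forsythe_f(groups):
    """Brown-Forsythe F*: between-group SS over a weighted sum of variances."""
    n_total = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n_total
    numerator = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    denominator = sum((1 - len(g) / n_total) * variance(g) for g in groups)
    return numerator / denominator


if __name__ == "__main__":
    data = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [4.0, 5.0, 6.0]]
    print(anova_f(data), welch_f(data), brown_forsythe_f(data))
```

With equal group sizes the Brown–Forsythe F* coincides with the ANOVA F; the statistics differ when group sizes are unequal and variances are heterogeneous, which is the setting the study examines.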


Mathematics ◽  
2021 ◽  
Vol 9 (13) ◽  
pp. 1568
Author(s):  
Shaul K. Bar-Lev

Let $\mathcal{F}=\{F_\theta:\theta\in\Theta\subset\mathbb{R}\}$ be a family of probability distributions indexed by a parameter $\theta$ and let $X_1,\dots,X_n$ be i.i.d. r.v.’s with $\mathcal{L}(X_1)=F_\theta\in\mathcal{F}$. Then, $\mathcal{F}$ is said to be reproducible if for all $\theta\in\Theta$ and $n\in\mathbb{N}$, there exist a sequence $(\alpha_n)_{n\ge 1}$ and a mapping $g_n:\Theta\to\Theta$, $\theta\mapsto g_n(\theta)$, such that $\mathcal{L}\left(\alpha_n\sum_{i=1}^{n}X_i\right)=F_{g_n(\theta)}\in\mathcal{F}$. In this paper, we prove that a natural exponential family $\mathcal{F}$ is reproducible iff it possesses a variance function which is a power function of its mean. Such a result generalizes that of Bar-Lev and Enis (1986, The Annals of Statistics), who proved a similar but partial statement under the assumption that $\mathcal{F}$ is steep as well as under rather restricted constraints on the forms of $\alpha_n$ and $g_n(\theta)$. We show that such restrictions are not required. In addition, we examine various aspects of reproducibility, both theoretically and practically, and discuss the relationship between reproducibility, convolution and infinite divisibility. We suggest new avenues for characterizing other classes of families of distributions with respect to their reproducibility and convolution properties.
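Two familiar families illustrate the definition of reproducibility. These are standard facts chosen here as examples, not examples taken from the paper; $V(\mu)$ denotes the variance function, and the parameterizations are assumptions made for the illustration.

```latex
\text{Poisson: } X_i \sim \mathrm{Poisson}(\theta),\ \theta > 0
\;\Longrightarrow\;
\sum_{i=1}^{n} X_i \sim \mathrm{Poisson}(n\theta),
\qquad \alpha_n = 1,\quad g_n(\theta) = n\theta,\quad V(\mu) = \mu .
\\[6pt]
\text{Normal: } X_i \sim N(\theta, 1),\ \theta \in \mathbb{R}
\;\Longrightarrow\;
n^{-1/2}\sum_{i=1}^{n} X_i \sim N(\sqrt{n}\,\theta,\, 1),
\qquad \alpha_n = n^{-1/2},\quad g_n(\theta) = \sqrt{n}\,\theta,\quad V(\mu) = 1 = \mu^{0} .
```

In both cases the rescaled sum stays inside the family and the variance function is a power of the mean, consistent with the characterization stated in the abstract.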


2021 ◽  
pp. 001316442199253
Author(s):  
Robert C. Foster

This article presents some equivalent forms of the common Kuder–Richardson Formula 21 and 20 estimators for nondichotomous data belonging to certain other exponential families, such as Poisson count data, exponential data, or geometric counts of trials until failure. Using the generalized framework of Foster (2020), an equation for the reliability for a subset of the natural exponential family having a quadratic variance function is derived for known population parameters, and both formulas are shown to be different plug-in estimators of this quantity. The equivalent Kuder–Richardson Formulas 20 and 21 are given for six different natural exponential families, and these match earlier derivations in the case of binomial and Poisson data. Simulations show performance exceeding that of Cronbach’s alpha in terms of root mean square error when the formula matching the correct exponential family is used, and a discussion of Jensen’s inequality suggests explanations for peculiarities of the bias and standard error of the simulations across the different exponential families.
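For orientation, the classical Kuder–Richardson Formula 21 for dichotomous (binomial) test scores can be sketched as below. This is the textbook formula only, not the generalized estimators derived in the article; the function name `kr21` is hypothetical, and the use of the population variance of total scores is an assumption (some presentations use the sample variance).

```python
from statistics import mean, pvariance


def kr21(total_scores, n_items):
    """Classical KR-21 for dichotomous items.

    total_scores: each examinee's number-correct score (0..n_items).
    n_items: number of items k on the test.
    Assumption: the variance is the population variance of the total scores.
    """
    k = n_items
    m = mean(total_scores)
    s2 = pvariance(total_scores)
    return (k / (k - 1)) * (1 - m * (k - m) / (k * s2))


if __name__ == "__main__":
    # Four examinees on a 10-item test: mean 5, population variance 5.
    print(kr21([2, 4, 6, 8], n_items=10))
```

KR-21 needs only the mean and variance of the total scores, which is what makes plug-in analogues for other exponential families, as described in the abstract, a natural extension.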


1992 ◽  
Vol 75 (3_suppl) ◽  
pp. 1124-1126
Author(s):  
John F. Walsh

A statistical test is developed based on the comparison of sums of squared errors associated with two competing models. A model based on cell means is compared to a representation that specifies the means for the treatment conditions. Comparing models is more general than the traditional H0 in analysis of variance wherein all the cell means are assumed equal. The test statistic, Proportional Increase in Error, is computed using the SAS statistical system.
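The comparison of sums of squared errors between a cell-means model and a model with specified treatment means can be sketched as follows. The abstract does not give the formula for Proportional Increase in Error, so the ratio used here, (SSE restricted − SSE full) / SSE full, is an assumed form; the function names are hypothetical, and the sketch is in Python rather than the SAS used in the original work.

```python
from statistics import mean


def sse(groups, fitted_means):
    """Sum of squared errors when each group is fitted by the given mean."""
    return sum(
        (y - m) ** 2 for g, m in zip(groups, fitted_means) for y in g
    )


def proportional_increase_in_error(groups, hypothesized_means):
    """Relative increase in SSE when moving from the cell-means (full) model
    to a model that fixes the group means at hypothesized values.

    Assumption: PIE = (SSE_restricted - SSE_full) / SSE_full.
    """
    cell_means = [mean(g) for g in groups]
    sse_full = sse(groups, cell_means)
    sse_restricted = sse(groups, hypothesized_means)
    return (sse_restricted - sse_full) / sse_full


if __name__ == "__main__":
    data = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
    print(proportional_increase_in_error(data, hypothesized_means=[2.0, 4.0]))
```

Setting every hypothesized mean to the grand mean recovers the traditional null hypothesis of equal cell means as a special case of this model comparison.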


Author(s):  
Lingtao Kong

The exponential distribution has been widely used in engineering, social and biological sciences. In this paper, we propose a new goodness-of-fit test for fuzzy exponentiality using the α-pessimistic value. The test statistic is established based on Kullback-Leibler information. Using the Monte Carlo method, we obtain the empirical critical points of the test statistic at four different significance levels. To evaluate the performance of the proposed test, we compare it with four commonly used tests through some simulations. Experimental studies show that the proposed test has higher power than the other tests in most cases. In particular, for the uniform and linear failure rate alternatives, our method has the best performance. A real data example is investigated to show the application of our test.


2021 ◽  
Vol 6 (5) ◽  
pp. 43-49
Author(s):  
J. M. Aniesedo ◽  
C. N. Okoli

This study used multivariate analysis of variance (MANOVA) test statistics to examine the impact of three categories of feed used in pig production in Delta State. The multivariate test statistics considered are the Pillai–Bartlett trace, Wilks’ test statistic, Roy’s largest root test statistic, and the Lawley–Hotelling (LH) statistic. The objectives are to: evaluate the robustness of the four MANOVA test statistics, so that the best one is employed in multivariate analysis to give the most useful results in pig production; determine the relatively efficient test statistic for pig production; and determine the test statistic that is consistent across sample sizes. The data required for the analysis were obtained from secondary sources. The study showed that the data were multivariate normally distributed, based on the results of the asymmetry-based and kurtosis-based multivariate normality tests, which makes the data suitable for a parametric multivariate method such as MANOVA. The results show that the Wilks and Roy tests found a significant difference for the intercept, while the Pillai and LH tests did not. The Roy test was also significant for feed one, feed two, and feed three. All test measures showed significance for feed one, and the Wilks and Roy tests also showed a significant difference for feed two. None of the tests found significance for the interaction between feed one and feed two, while the Roy test found significance for the interactions between feed one and three, feed two and three, and feed one, two, and three.
The performance of the tests in evaluating feeds for pig production, with or without considering interactions, was ranked in the following order: Roy, Wilks, and Pillai = LH. This result implies that the Roy method, with or without consideration of the interaction, performs better than the other methods considered in the study.
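The four MANOVA statistics named above are all functions of the eigenvalues of E⁻¹H, where H and E are the hypothesis and error sums-of-squares-and-cross-products matrices. A minimal Python sketch under that standard definition follows; the function name and the example eigenvalues are hypothetical, and the eigenvalues are taken as given rather than computed from the study's data.

```python
from math import prod


def manova_statistics(eigenvalues):
    """Compute the four classical MANOVA statistics from the eigenvalues
    of E^{-1} H (hypothesis SSCP relative to error SSCP)."""
    return {
        # Wilks' lambda: product of 1 / (1 + eigenvalue); small values reject.
        "wilks": prod(1 / (1 + lam) for lam in eigenvalues),
        # Pillai-Bartlett trace: sum of eigenvalue / (1 + eigenvalue).
        "pillai": sum(lam / (1 + lam) for lam in eigenvalues),
        # Lawley-Hotelling trace: sum of the eigenvalues.
        "lawley_hotelling": sum(eigenvalues),
        # Roy's largest root, reported here as the largest eigenvalue; some
        # software reports max / (1 + max) instead.
        "roy": max(eigenvalues),
    }


if __name__ == "__main__":
    print(manova_statistics([1.0, 0.25]))
```

Because Roy's statistic depends only on the largest eigenvalue while the others pool all of them, the four tests can disagree on significance, as they do in the results reported above.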


2019 ◽  
Vol 2019 (3) ◽  
pp. 310-330 ◽  
Author(s):  
Marika Swanberg ◽  
Ira Globus-Harris ◽  
Iris Griffith ◽  
Anna Ritz ◽  
Adam Groce ◽  
...  

Hypothesis testing is one of the most common types of data analysis and forms the backbone of scientific research in many disciplines. Analysis of variance (ANOVA) in particular is used to detect dependence between a categorical and a numerical variable. Here we show how one can carry out this hypothesis test under the restrictions of differential privacy. We show that the F-statistic, the optimal test statistic in the public setting, is no longer optimal in the private setting, and we develop a new test statistic F1 with much higher statistical power. We show how to rigorously compute a reference distribution for the F1 statistic and give an algorithm that outputs accurate p-values. We implement our test and experimentally optimize several parameters. We then compare our test to the only previous work on private ANOVA testing, using the same effect size as that work. We see an order of magnitude improvement, with our test requiring only 7% as much data to detect the effect.

