Problems of nonparametric goodness-of-fit test application in tasks of measurement results processing

Author(s):  
Boris Lemeshko ◽  
Stanislav Lemeshko

It is argued that, in most cases, two reasons underlie the incorrect application of nonparametric goodness-of-fit tests. The first is that, when a composite hypothesis is tested and the parameters of the law are estimated from the analyzed sample, classical results derived for testing simple hypotheses are used. When testing composite hypotheses, the distributions of goodness-of-fit statistics are influenced by the form of the observed law F(x, θ) corresponding to the hypothesis being tested, by the type and number of estimated parameters, by the estimation method, and in some cases by the value of the shape parameter. The paper shows the influence of all the mentioned factors on the distributions of the test statistics. It is emphasized that, when testing composite hypotheses, neglecting the fact that the test has lost its distribution-free property leads to an increase in the probability of errors of the 2nd kind. It is shown that the distribution of the test statistic needed to reach a conclusion about a composite hypothesis can be found by simulation, interactively, directly in the course of testing. The second reason is the presence of round-off errors, which can significantly change the distributions of the test statistics. The paper shows that asymptotic results for testing simple and composite hypotheses can be used when the round-off error Δ is much smaller than the standard deviation σ of the distribution law of the measurement errors and the sample size n does not exceed some maximum value. For sample sizes larger than this maximum, the real distributions of the test statistics deviate from the asymptotic ones towards larger values of the statistics. In such situations, using asymptotic distributions to reach a conclusion about the test results leads to an increase in the probability of errors of the 1st kind (rejection of a valid hypothesis being tested).
It is shown that when the round-off error and σ are commensurable, the distributions of the test statistics deviate from the asymptotic distributions even for small n, and the deviation only grows as n increases. Changes in the distributions of the statistics under the influence of rounding are demonstrated for testing both simple and composite hypotheses. It is shown that the only way to ensure correct conclusions from the applied tests in such non-standard conditions is to use the real distributions of the statistics. This task can be solved interactively (in the process of testing), relying on computer research technologies and the apparatus of mathematical statistics.
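
The simulation approach described in this abstract can be illustrated with a minimal sketch. It is not the authors' software: it assumes a normal law whose two parameters are estimated by maximum likelihood, uses the Kolmogorov distance as the statistic, and models rounding as snapping each observation to a grid of step `delta`. All function names and default values are hypothetical.

```python
import math
import random
import statistics


def normal_cdf(x, mu, sigma):
    """CDF of the normal law with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(sample):
    """Kolmogorov distance between the sample and a normal law fitted to it.

    Both parameters are estimated from the same sample (composite hypothesis),
    so the classical distribution-free limit does not apply.
    """
    n = len(sample)
    mu = statistics.fmean(sample)
    sigma = statistics.pstdev(sample)  # maximum likelihood estimate
    if sigma == 0.0:
        raise ValueError("all observations are equal; the law cannot be fitted")
    d = 0.0
    for i, x in enumerate(sorted(sample)):
        f = normal_cdf(x, mu, sigma)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def simulate_null(n, delta=0.0, reps=2000, seed=0):
    """Simulated null distribution of the statistic for sample size n.

    delta > 0 rounds each observation to the nearest multiple of delta,
    mimicking a measurement recorded with a finite resolution.
    """
    rng = random.Random(seed)
    stats = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if delta > 0.0:
            sample = [round(x / delta) * delta for x in sample]
        stats.append(ks_statistic(sample))
    return sorted(stats)


def empirical_p_value(observed, null_stats):
    """Share of simulated statistics at least as large as the observed one."""
    exceed = sum(1 for s in null_stats if s >= observed)
    return (exceed + 1) / (len(null_stats) + 1)
```

Comparing `simulate_null(n)` with `simulate_null(n, delta=1.0)` for a moderately large `n` shows the shift towards larger statistic values that the abstract attributes to rounding when the round-off step is commensurable with the standard deviation.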

Author(s):  
Lingtao Kong

The exponential distribution has been widely used in engineering and in the social and biological sciences. In this paper, we propose a new goodness-of-fit test for fuzzy exponentiality using the α-pessimistic value. The test statistic is based on Kullback-Leibler information. Using the Monte Carlo method, we obtain the empirical critical points of the test statistic at four different significance levels. To evaluate the performance of the proposed test, we compare it with four commonly used tests through simulations. Experimental studies show that the proposed test has higher power than the other tests in most cases. In particular, for the uniform and linear failure rate alternatives, our method has the best performance. A real data example is investigated to show the application of our test.
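
As a rough illustration of how empirical critical points of a Kullback-Leibler-based exponentiality statistic can be obtained by Monte Carlo, the sketch below works with ordinary (crisp) data only. It does not reproduce the fuzzy test with α-pessimistic values proposed in the paper. It assumes the statistic is built from Vasicek's spacing-based entropy estimator, and the window width `m`, the sample size, and the four levels are hypothetical choices.

```python
import math
import random


def kl_statistic(sample, m):
    """Kullback-Leibler-type distance from exponentiality for crisp data.

    Uses Vasicek's spacing estimator of entropy with window m; order
    statistics beyond the ends of the sample are replaced by the extremes.
    Large values indicate departure from the exponential law.
    """
    n = len(sample)
    xs = sorted(sample)
    mean = sum(xs) / n
    entropy = 0.0
    for i in range(n):
        low = xs[max(i - m, 0)]
        high = xs[min(i + m, n - 1)]
        entropy += math.log(n / (2 * m) * (high - low))
    entropy /= n
    return -entropy + math.log(mean) + 1.0


def critical_points(n, m, levels, reps=5000, seed=1):
    """Empirical upper critical points under the exponential null."""
    rng = random.Random(seed)
    stats = sorted(
        kl_statistic([rng.expovariate(1.0) for _ in range(n)], m)
        for _ in range(reps)
    )
    points = {}
    for alpha in levels:
        index = min(reps - 1, max(0, math.ceil((1.0 - alpha) * reps) - 1))
        points[alpha] = stats[index]
    return points


# Hypothetical settings: sample size 20, window 3, four significance levels.
table = critical_points(n=20, m=3, levels=(0.01, 0.025, 0.05, 0.10), reps=2000)
```

Because the statistic is scale invariant, simulating from the unit-rate exponential law is enough: the same critical points apply whatever the unknown rate is.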


1991 ◽  
Vol 21 (1) ◽  
pp. 58-65 ◽  
Author(s):  
Dennis E. Jelinski

Chi-square (χ²) tests are analytic procedures that are often used to test the hypothesis that animals use a particular food item or habitat in proportion to its availability. Unfortunately, several sources of error are common to the use of χ² analysis in studies of resource utilization. The goodness-of-fit and homogeneity tests have been incorrectly used interchangeably when resource availabilities are estimated or known a priori. An empirical comparison of the two methods demonstrates that the χ² test of homogeneity may generate results contrary to the χ² goodness-of-fit test. Failure to recognize the conservative nature of the χ² homogeneity test, when "expected" values are known a priori, may lead to erroneous conclusions owing to the increased possibility of committing a type II error. Conversely, proper use of the goodness-of-fit method is predicated on the availability of accurate maps of resource abundance, or on estimates of resource availability based on very large sample sizes. Where resource availabilities have been estimated from small sample sizes, the use of the χ² goodness-of-fit test may lead to type I errors beyond the nominal level of α. Both tests require adherence to specific critical assumptions that often have been violated, and accordingly, these assumptions are reviewed here. Alternatives to the Pearson χ² statistic are also discussed.
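
The difference between the two tests can be seen on a small example. The sketch below uses invented counts (100 observed uses across three habitat types, and availability proportions of 0.2, 0.5, and 0.3 taken as known a priori), so the numbers are purely illustrative. With three categories both tests have two degrees of freedom, for which the χ² upper tail has the closed form exp(-x/2).

```python
import math


def gof_statistic(observed, proportions):
    """Pearson statistic against known expected proportions."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, proportions))


def homogeneity_statistic(row_a, row_b):
    """Pearson statistic for a 2 x k table, treating both rows as samples."""
    total_a, total_b = sum(row_a), sum(row_b)
    grand = total_a + total_b
    stat = 0.0
    for a, b in zip(row_a, row_b):
        column = a + b
        for obs, row_total in ((a, total_a), (b, total_b)):
            expected = row_total * column / grand
            stat += (obs - expected) ** 2 / expected
    return stat


def p_value_df2(stat):
    """Upper-tail probability of the chi-square law with 2 degrees of freedom."""
    return math.exp(-stat / 2.0)


# Hypothetical data: habitat use counts and availability expressed as counts.
use = [30, 50, 20]
availability_counts = [200, 500, 300]
availability_props = [c / sum(availability_counts) for c in availability_counts]

gof = gof_statistic(use, availability_props)
hom = homogeneity_statistic(use, availability_counts)
print(f"goodness-of-fit: X2={gof:.3f}, p={p_value_df2(gof):.4f}")
print(f"homogeneity:     X2={hom:.3f}, p={p_value_df2(hom):.4f}")
```

On these counts the homogeneity statistic is smaller than the goodness-of-fit statistic for the same degrees of freedom, which is the conservative behaviour the abstract warns about when availabilities are actually known.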


Genome ◽  
1989 ◽  
Vol 32 (1) ◽  
pp. 57-63 ◽  
Author(s):  
S. J. Knapp ◽  
L. A. Tagliani

Genetic markers are needed for mating systems and breeding experiments in Cuphea lanceolata Ait.; however, none have been described in this species. Allozyme variation was analyzed among 14 F2 populations assayed for aconitase (ACO), diaphorase (DIA), esterase (EST), fluorescent esterase transaminase (FEST), glutamine oxaloacetate transaminase (GOT), menadione reductase (MNR), phosphoglucomutase (PGM), phosphoglucose isomerase (PGI), and shikimate dehydrogenase (SKDH) enzyme activity. At least 23 loci were resolved in these enzyme systems: 6 monomorphic loci, 5 poorly resolved loci, and 12 clearly resolved polymorphic loci. Observed segregation ratios were generally not significantly different (P > 0.05) from expected segregation ratios; however, segregation distortion was observed at Skdh-1 and Mnr-1 (Dia-1) in some F2 populations. Skdh-1 and Pgm-2 and Est-1, Est-2, Fest-1, and Mnr-1 comprise putative linkage groups. Allozyme variation was observed between and within accessions. The expected average heterozygosity was 16.3%. There were one to eight polymorphic loci among the F2 populations analyzed. There was an average of 2.05 alleles per locus. Several useful codominant markers were identified and a partial allozyme linkage map was constructed. Additional work is needed to revise and complete the map.
Key words: Cuphea, isozymes, goodness of fit test statistics, lauric acid, capric acid.


Author(s):  
Khaoula Aidi ◽  
Nadeem Shafique Butt ◽  
Mir Masoom Ali ◽  
Mohamed Ibrahim ◽  
Haitham M. Yousof ◽  
...  

A new modified version of the Bagdonavičius-Nikulin goodness-of-fit test statistic is presented for validation in the right-censored case under the double Burr type X distribution. The maximum likelihood estimation method for censored data is applied. Simulations via the Barzilai-Borwein algorithm are performed to assess the right-censored estimation method. Another simulation study is presented for testing the null hypothesis under the modified Bagdonavičius-Nikulin goodness-of-fit test. Four right-censored data sets are analyzed under the new modified test statistic to check distributional validity.


Methodology ◽  
2015 ◽  
Vol 11 (2) ◽  
pp. 65-79 ◽  
Author(s):  
Geert H. van Kollenburg ◽  
Joris Mulder ◽  
Jeroen K. Vermunt

The application of latent class (LC) analysis involves evaluating the LC model using goodness-of-fit statistics. To assess the misfit of a specified model, say with the Pearson chi-squared statistic, a p-value can be obtained using an asymptotic reference distribution. However, asymptotic p-values are not valid when the sample size is not large and/or the analyzed contingency table is sparse. Another problem is that for various other conceivable global and local fit measures, asymptotic distributions are not readily available. An alternative way to obtain the p-value for the statistic of interest is by constructing its empirical reference distribution using resampling techniques such as the parametric bootstrap or the posterior predictive check (PPC). In the current paper, we show how to apply the parametric bootstrap and two versions of the PPC to obtain empirical p-values for a number of commonly used global and local fit statistics within the context of LC analysis. The main difference between the PPC using test statistics and the parametric bootstrap is that the former takes into account parameter uncertainty. The PPC using discrepancies has the advantage that it is computationally much less intensive than the other two resampling methods. In a Monte Carlo study we evaluated Type I error rates and power of these resampling methods when used for global and local goodness-of-fit testing in LC analysis. Results show that both the bootstrap and the PPC using test statistics are generally good alternatives to asymptotic p-values and can also be used when (asymptotic) distributions are not known. Nominal Type I error rates were not met when the sample size was small and the contingency table had many cells. Overall, the PPC using test statistics was somewhat more conservative than the parametric bootstrap. We have also replicated previous research suggesting that the Pearson χ² statistic should in many cases be preferred over the likelihood-ratio G² statistic.
Power to reject a model for which the number of LCs was one less than in the population was very high, unless the sample size was small. When the contingency tables were very sparse, the total bivariate residual (TBVR) statistic, which is based on bivariate relationships, still had very high power, signifying its usefulness in assessing model fit.
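
The parametric bootstrap mentioned in this abstract can be sketched for the simplest possible case. The example below assumes a one-class LC model for binary items, which is the mutual independence model, so that fitting needs no EM algorithm. It is an illustration under that assumption rather than the authors' procedure, it does not cover the PPC variants, and all names are hypothetical.

```python
import itertools
import random


def fit_independence(data):
    """Maximum likelihood item probabilities under a one-class (independence) model."""
    n = len(data)
    return [sum(row[k] for row in data) / n for k in range(len(data[0]))]


def pearson_x2(data, probs):
    """Pearson statistic over all response patterns, including empty cells."""
    n = len(data)
    counts = {}
    for row in data:
        counts[tuple(row)] = counts.get(tuple(row), 0) + 1
    x2 = 0.0
    for pattern in itertools.product((0, 1), repeat=len(probs)):
        p = 1.0
        for value, pk in zip(pattern, probs):
            p *= pk if value == 1 else 1.0 - pk
        expected = n * p
        if expected > 0.0:  # a zero expectation implies a zero observed count
            x2 += (counts.get(pattern, 0) - expected) ** 2 / expected
    return x2


def simulate(probs, n, rng):
    """Draw n response patterns from the fitted independence model."""
    return [tuple(1 if rng.random() < pk else 0 for pk in probs) for _ in range(n)]


def bootstrap_p_value(data, reps=500, seed=2):
    """Parametric bootstrap p-value: refit the model on every replicate."""
    rng = random.Random(seed)
    probs = fit_independence(data)
    observed = pearson_x2(data, probs)
    exceed = 0
    for _ in range(reps):
        boot = simulate(probs, len(data), rng)
        if pearson_x2(boot, fit_independence(boot)) >= observed:
            exceed += 1
    return exceed / reps
```

Refitting the model on each replicate is what makes the reference distribution account for parameter estimation, which matters most when the table is sparse and the asymptotic χ² approximation is unreliable.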

