Statistical Justification of Pearson's Criterion for Testing a Complex Hypothesis on the Uniform Distribution

Author(s):  
T. V. Oblakova

The paper studies the justification of Pearson's criterion for testing the hypothesis that the general population follows a uniform distribution. If the distribution parameters are unknown, estimates of the theoretical frequencies are used [1, 2, 3]. In this case the quantile of the chi-square distribution with the number of degrees of freedom reduced by the number of estimated parameters is used as the upper threshold for accepting the main hypothesis [7]. However, for a uniform law this application of Pearson's criterion does not extend to complex hypotheses, since the likelihood function cannot be differentiated with respect to the parameters, a step used in the proof of the theorem mentioned [7, 10, 11]. A statistical experiment is proposed to study the distribution of Pearson statistics for samples from a uniform law. The essence of the experiment is as follows: first, a statistically significant number of samples of the same type from a given uniform distribution is simulated; then the Pearson statistic is calculated for each sample; finally, the distribution law of the resulting set of statistics is studied. Simulation and processing of the samples were performed in the Mathcad 15 package using its built-in random number generator and array processing facilities. In all the experiments carried out, the hypothesis that the Pearson statistics follow the chi-square law was unambiguously accepted (confidence level 0.95). It was also shown statistically that the number of degrees of freedom need not be corrected in the case of a complex hypothesis. That is, the maximum likelihood estimates of the uniform law parameters implicitly used in calculating the Pearson statistics do not affect the number of degrees of freedom, which is thus determined by the number of grouping intervals only.
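The experiment described in this abstract can be sketched roughly as follows. This is a minimal Python illustration, not the author's Mathcad 15 worksheet: the function names, the sample size, the number of samples, the number of grouping intervals `k`, and the seed are all assumptions chosen for the example. It uses the sample minimum and maximum as the maximum likelihood estimates of the uniform bounds, and it only compares the average Pearson statistic with the candidate degrees of freedom (a chi-square variable has mean equal to its degrees of freedom), which is a much cruder check than the goodness-of-fit test reported in the abstract.

```python
import random


def pearson_statistic(sample, k):
    """Pearson chi-square statistic for a uniform law with estimated bounds."""
    n = len(sample)
    lo, hi = min(sample), max(sample)  # maximum likelihood estimates of the bounds
    width = (hi - lo) / k
    counts = [0] * k
    for x in sample:
        # The sample maximum lies on the right edge, so clamp it into the last bin.
        idx = min(int((x - lo) / width), k - 1)
        counts[idx] += 1
    expected = n / k  # equal-width bins give equal expected frequencies
    return sum((c - expected) ** 2 / expected for c in counts)


def simulate(num_samples=2000, n=200, k=8, a=0.0, b=1.0, seed=1):
    """Draw many samples from U(a, b) and return their Pearson statistics."""
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility
    return [
        pearson_statistic([rng.uniform(a, b) for _ in range(n)], k)
        for _ in range(num_samples)
    ]


k = 8
stats = simulate(k=k)
mean_stat = sum(stats) / len(stats)
print(f"mean Pearson statistic:        {mean_stat:.2f}")
print(f"uncorrected degrees of freedom: {k - 1}")
print(f"corrected for two parameters:   {k - 3}")
```

If the abstract's conclusion holds, the printed mean should sit close to k − 1 rather than k − 3. A fuller replication would test the whole empirical distribution of `stats` against the chi-square law instead of comparing means.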

2019 ◽  
Vol 486 (1) ◽  
pp. 52-69 ◽  
Author(s):  
Masato Shirasaki ◽  
Takashi Hamana ◽  
Masahiro Takada ◽  
Ryuichi Takahashi ◽  
Hironao Miyatake

We use full-sky ray-tracing weak lensing simulations to generate 2268 mock catalogues for the Subaru Hyper Suprime-Cam (HSC) survey first-year shear catalogue. Our mock catalogues take into account various effects present in the real data: the survey footprints, the inhomogeneous angular distribution of source galaxies, statistical uncertainties in the photometric redshift (photo-z) estimates, variations in the lensing weight, and the statistical noise in galaxy shape measurements, including both intrinsic shapes and measurement errors. We then use our mock catalogues to evaluate the statistical uncertainties expected in measurements of the cosmic shear two-point correlations ξ± with tomographic redshift information for the HSC survey. We develop a quasi-analytical formula for the Gaussian sample variance that properly takes into account the number of source pairs in the survey footprints. The standard Gaussian formula significantly overestimates or underestimates the mock results, at the 50 per cent level. We also show that different photo-z catalogues, or the six disconnected fields as opposed to a consecutive geometry, cause variations in the covariance of ∼5 per cent. The mock catalogues enable us to study the chi-square distribution for ξ±. We find a wider distribution than that naively expected for the distribution with the degrees of freedom of the data vector used. Finally, we propose a method to include non-zero multiplicative bias in the mock shape catalogue and show that the non-zero multiplicative bias can change the effective shape noise in cosmic shear analyses. Our results suggest the importance of estimating an accurate form of the likelihood function (and therefore the covariance) for robust cosmological parameter inference from precise measurements.


Genetics ◽  
2002 ◽  
Vol 160 (4) ◽  
pp. 1631-1639 ◽  
Author(s):  
G P Copenhaver ◽  
E A Housworth ◽  
F W Stahl

The crossover distribution in meiotic tetrads of Arabidopsis thaliana differs from those previously described for Drosophila and Neurospora. Whereas a chi-square distribution with an even number of degrees of freedom provides a good fit for the latter organisms, the fit for Arabidopsis was substantially improved by assuming an additional set of crossovers sprinkled, at random, among those distributed as per chi square. This result is compatible with the view that Arabidopsis has two pathways for meiotic crossing over, only one of which is subject to interference. The results further suggest that Arabidopsis meiosis has >10 times as many double-strand breaks as crossovers.


2015 ◽  
Vol 22 (74) ◽  
pp. 385-404
Author(s):  
Sérgio Fernando Loureiro Rezende ◽  
Ricardo Salera ◽  
José Márcio de Castro

This article aims to confront four theories of firm growth – Optimum Firm Size, Stage Theory of Growth, The Theory of the Growth of the Firm and Dynamic Capabilities – with empirical data derived from a backward-looking longitudinal qualitative case study of the growth trajectory of a Brazilian capital goods firm. To do so, we employed Degree of Freedom-Analysis for data analysis. This technique aims to test the empirical strength of competing theories using statistical tests, in particular the chi-square test. Our results suggest that none of the four theories fully explained the growth of the firm we chose as the empirical case. Nevertheless, Dynamic Capabilities was regarded as providing more satisfactory explanatory power.


1971 ◽  
Vol 97 (2-3) ◽  
pp. 325-330 ◽  
Author(s):  
J. H. Pollard

In his paper of 1941, Seal included details of some experiments he performed in an attempt to estimate the appropriate number of degrees of freedom for the chi-square goodness-of-fit test of a summation formula graduation. These results are referred to by Tetley and by Benjamin and Haycocks in their textbooks when they mention the difficulty of determining the number of degrees of freedom or mean chi-square value.


2002 ◽  
Vol 34 (4) ◽  
pp. 733-754 ◽  
Author(s):  
Antonio Páez ◽  
Takashi Uchida ◽  
Kazuaki Miyamoto

Geographically weighted regression (GWR) has been proposed as a technique to explore spatial parametric nonstationarity. The method has been developed mainly along the lines of local regression and smoothing techniques, a strategy that has led to a number of difficult questions about the regularity conditions of the likelihood function, the effective number of degrees of freedom, and in general the relevance of extending the method to derive inference and model specification tests. In this paper we argue that placing GWR within a different statistical context, as a spatial model of error variance heterogeneity, or what might be termed locational heterogeneity, solves these difficulties. A maximum-likelihood-based framework for estimation and inference of a general geographically weighted regression model is presented that leads to a method to estimate location-specific kernel bandwidths. Moreover, a test for locational heterogeneity is derived and its use exemplified with a case study.


2018 ◽  
Vol 2018 ◽  
pp. 1-9
Author(s):  
Xinting Zhai ◽  
Jixin Wang ◽  
Jinshi Chen

Due to the harsh working environment of construction machinery, a simple distribution cannot be used to approximate the shape of the rainflow matrix. In this paper, the Weibull-normal (W-n) mixture distribution is used. The lowest Akaike information criterion (AIC) value is used to determine the number of components in the mixture. A parameter estimation method based on the idea of optimization is proposed. The method estimates the parameters of the mixture by maximizing the log-likelihood function (LLF) using an intelligent optimization algorithm (IOA), the genetic algorithm (GA). To verify the performance of the proposed method, one of the existing methods is also applied in the simulation study and the practical case study. The goodness of fit of the fitted distributions is compared by calculating the AIC and chi-square (χ²) values. It can be concluded that the proposed method is feasible and effective for parameter estimation of the mixture distribution.
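The model-selection step mentioned in this abstract, picking the number of mixture components with the lowest AIC, can be illustrated with a short sketch. It uses the standard definition AIC = 2p − 2 ln L, where p is the number of free parameters and ln L is the maximized log-likelihood. The log-likelihood values and parameter counts below are hypothetical placeholders, not results from the paper, and the helper name `aic` is an assumption for this example.

```python
def aic(log_likelihood, num_params):
    """Akaike information criterion: 2p - 2 ln L (lower is better)."""
    return 2 * num_params - 2 * log_likelihood


# Hypothetical fits: components -> (maximized log-likelihood, free parameters).
# A mixture of m two-parameter components has 2m parameters plus m - 1 weights.
candidates = {
    1: (-1250.0, 2),
    2: (-1190.0, 5),
    3: (-1188.5, 8),
}

scores = {m: aic(ll, p) for m, (ll, p) in candidates.items()}
best = min(scores, key=scores.get)  # component count with the lowest AIC

for m, score in sorted(scores.items()):
    print(f"{m} component(s): AIC = {score:.1f}")
print(f"selected number of components: {best}")
```

In this made-up example the third component raises the log-likelihood only slightly, so the penalty for its extra parameters outweighs the gain and the two-component model is selected.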


Stats ◽  
2020 ◽  
Vol 3 (3) ◽  
pp. 330-342
Author(s):  
Wolf-Dieter Richter

We prove that the Behrens–Fisher statistic follows a Student bridge distribution, the mixing coefficient of which depends on the two sample variances only through their ratio. To this end, it is first shown that a weighted sum of two independent normalized chi-square distributed random variables is chi-square bridge distributed, and secondly that the Behrens–Fisher statistic is based on such a variable and a standard normally distributed one that is independent of the former. In case of a known variance ratio, exact standard statistical testing and confidence estimation methods apply without the need for any additional approximations. In addition, a three pillar bridges explanation is given for the choice of degrees of freedom in Welch’s approximation to the exact distribution of the Behrens–Fisher statistic.

