Alternatives to the Chi-Square Test for Evaluating Rank Histograms from Ensemble Forecasts

2005 ◽  
Vol 20 (5) ◽  
pp. 789-795 ◽  
Author(s):  
Kimberly L. Elmore

Abstract Rank histograms are a commonly used tool for evaluating an ensemble forecasting system’s performance. Because the sample size is finite, the rank histogram is subject to statistical fluctuations, so a goodness-of-fit (GOF) test is employed to determine if the rank histogram is uniform to within some statistical certainty. Most often, the χ2 test is used to test whether the rank histogram is indistinguishable from a discrete uniform distribution. However, the χ2 test is insensitive to order and so suffers from troubling deficiencies that may render it unsuitable for rank histogram evaluation. As shown by examples in this paper, more powerful tests, suitable for small sample sizes, and very sensitive to the particular deficiencies that appear in rank histograms are available from the order-dependent Cramér–von Mises family of statistics, in particular, the Watson and Anderson–Darling statistics.
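
The order-insensitivity described in the abstract above can be illustrated with a minimal sketch. It is not code from the paper: the function names, the synthetic ensemble, and the `sloped` bin counts are all hypothetical. It builds a rank histogram and shows that the Pearson χ2 statistic is unchanged when the bins are shuffled, so a systematic slope and a random scatter of the same counts look identical to the test.

```python
import random

def rank_histogram(ensembles, observations):
    """Count how often the observation falls at each rank within its ensemble."""
    n_members = len(ensembles[0])
    counts = [0] * (n_members + 1)
    for members, obs in zip(ensembles, observations):
        # Rank = number of members below the observation (ties ignored here).
        rank = sum(1 for m in members if m < obs)
        counts[rank] += 1
    return counts

def pearson_chi2(counts):
    """Pearson chi-square statistic against a discrete uniform distribution."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

# Synthetic, statistically consistent ensemble (assumption: 9 members, 200 cases).
rng = random.Random(0)
n_cases, n_members = 200, 9
ensembles = [[rng.gauss(0.0, 1.0) for _ in range(n_members)] for _ in range(n_cases)]
observations = [rng.gauss(0.0, 1.0) for _ in range(n_cases)]
counts = rank_histogram(ensembles, observations)

# Hypothetical sloped (biased) histogram and a shuffled copy of the same counts.
sloped = [38, 32, 28, 24, 20, 18, 14, 12, 8, 6]
shuffled = sloped[:]
rng.shuffle(shuffled)

print(counts, pearson_chi2(counts))
print(pearson_chi2(sloped), pearson_chi2(shuffled))  # same statistic for both
```

Because the statistic sums per-bin terms, any permutation of the bins gives the same value, which is the deficiency that order-dependent statistics are meant to address.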

1991 ◽  
Vol 21 (1) ◽  
pp. 58-65 ◽  
Author(s):  
Dennis E. Jelinski

Chi-square (χ2) tests are analytic procedures that are often used to test the hypothesis that animals use a particular food item or habitat in proportion to its availability. Unfortunately, several sources of error are common to the use of χ2 analysis in studies of resource utilization. Both the goodness-of-fit and homogeneity tests have been incorrectly used interchangeably when resource availabilities are estimated or known a priori. An empirical comparison of the two methods demonstrates that the χ2 test of homogeneity may generate results contrary to the χ2 goodness-of-fit test. Failure to recognize the conservative nature of the χ2 homogeneity test, when "expected" values are known a priori, may lead to erroneous conclusions owing to the increased possibility of committing a type II error. Conversely, proper use of the goodness-of-fit method is predicated on the availability of accurate maps of resource abundance, or on estimates of resource availability based on very large sample sizes. Where resource availabilities have been estimated from small sample sizes, the use of the χ2 goodness-of-fit test may lead to type I errors beyond the nominal level of α. Both tests require adherence to specific critical assumptions that often have been violated, and accordingly, these assumptions are reviewed here. Alternatives to the Pearson χ2 statistic are also discussed.
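
The contrast between the two tests can be made concrete with a small sketch. The habitat counts and proportions below are hypothetical, and the helper names are illustrative only. With three habitat classes both tests have 2 degrees of freedom, so the p-value has the closed form exp(-X²/2).

```python
import math

def chi2_goodness_of_fit(observed, proportions):
    """Pearson statistic when availability proportions are treated as known."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, proportions))

def chi2_homogeneity(row_a, row_b):
    """Pearson statistic for a 2 x k table where both rows are samples."""
    total_a, total_b = sum(row_a), sum(row_b)
    grand = total_a + total_b
    stat = 0.0
    for a, b in zip(row_a, row_b):
        column_total = a + b
        for obs, row_total in ((a, total_a), (b, total_b)):
            expected = row_total * column_total / grand
            stat += (obs - expected) ** 2 / expected
    return stat

def p_value_2df(stat):
    """Upper-tail probability of a chi-square variable with 2 degrees of freedom."""
    return math.exp(-stat / 2.0)

# Hypothetical data: use counts in three habitats, and availability given either
# as known proportions or as a sample of 100 random points with the same split.
use = [30, 40, 30]
availability_proportions = [0.5, 0.3, 0.2]
availability_sample = [50, 30, 20]

gof = chi2_goodness_of_fit(use, availability_proportions)
hom = chi2_homogeneity(use, availability_sample)
print(gof, p_value_2df(gof))
print(hom, p_value_2df(hom))
```

In this example the homogeneity statistic is roughly half the goodness-of-fit statistic, because it treats the availability row as a noisy sample rather than as fixed, which is the conservative behavior the abstract describes.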


2008 ◽  
Vol 136 (6) ◽  
pp. 2133-2139 ◽  
Author(s):  
Ian T. Jolliffe ◽  
Cristina Primo

Abstract Rank histograms are often plotted to evaluate the forecasts produced by an ensemble forecasting system—an ideal rank histogram is “flat” or uniform. It has been noted previously that the obvious test of “flatness,” the well-known χ2 goodness-of-fit test, spreads its power thinly and hence is not good at detecting specific alternatives to flatness, such as bias or over- or underdispersion. Members of the Cramér–von Mises family of tests do much better in this respect. An alternative to using the Cramér–von Mises family is to decompose the χ2 test statistic into components that correspond to specific alternatives. This approach is described in the present paper. It is arguably easier to use and more flexible than the Cramér–von Mises family of tests, and does at least as well as it in detecting alternatives corresponding to bias and over- or underdispersion.
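
One way to sketch the idea of decomposing χ2 into components is to project the standardized bin deviations onto orthonormal contrast vectors. The linear and centered-quadratic contrasts below are an assumption chosen for illustration and are not necessarily the construction used in the paper. Under flatness each component is approximately χ2 with 1 degree of freedom, whose upper-tail probability is erfc(sqrt(x/2)).

```python
import math

def chi2_components(counts):
    """Split Pearson's chi-square into slope, U-shape, and residual parts."""
    k = len(counts)
    expected = sum(counts) / k
    x = [(c - expected) / math.sqrt(expected) for c in counts]

    mid = (k - 1) / 2.0
    linear = [i - mid for i in range(k)]                 # sensitive to bias
    quad_raw = [(i - mid) ** 2 for i in range(k)]
    quad_mean = sum(quad_raw) / k
    quadratic = [q - quad_mean for q in quad_raw]        # sensitive to dispersion

    def component(contrast):
        norm = math.sqrt(sum(v * v for v in contrast))
        projection = sum(v * xi for v, xi in zip(contrast, x)) / norm
        return projection ** 2

    total = sum(xi * xi for xi in x)
    slope = component(linear)
    ushape = component(quadratic)
    return {"total": total, "slope": slope, "ushape": ushape,
            "residual": total - slope - ushape}

def p_value_1df(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical U-shaped (underdispersive) rank histogram.
u_shaped = [34, 22, 16, 12, 10, 10, 12, 16, 22, 34]
parts = chi2_components(u_shaped)
print(parts)
print(p_value_1df(parts["slope"]), p_value_1df(parts["ushape"]))
```

For this symmetric histogram nearly all of the total statistic lands in the U-shape component and essentially none in the slope component, which shows how a component can concentrate power on one specific alternative.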


2021 ◽  
Vol 39 (6_suppl) ◽  
pp. 465-465
Author(s):  
Arpit Rao ◽  
Julie Elaine McGrath ◽  
Joanne Xiu ◽  
Andre Luiz De Souza ◽  
Shuchi Gulati ◽  
...  

465 Background: UTUC is a rare genitourinary malignancy and a number of studies, limited by small sample sizes, have attempted to characterize its mutational landscape. Because immunotherapy is commonly used for this disease type, we evaluated the prevalence of microsatellite instability and characterized the mutational landscapes of UTUC in a large contemporary patient cohort. Methods: UTUC tumor samples were analyzed using next generation sequencing (NGS) (NextSeq, 592 gene panel) or whole exome sequencing (WES) (NovaSeq) (Caris Life Sciences, Phoenix, AZ). Mismatch repair status (deficient [dMMR] or proficient [pMMR]) and microsatellite instability status (MSI-high or stable [MSS]) were detected by immunohistochemistry (IHC), fragment analysis, and NGS. Tumor mutational burden (TMB) was measured by counting all somatic mutations found per tumor (high cutoff ≥ 10 mutations per MB). PD-L1 expression was tested by IHC using PD-L1 antibody clones 22c3 (Agilent; positive cutoff CPS ≥ 10) and SP142 (Ventana; positive cutoff ≥ 5% IC). Pathogenic fusion events were detected using whole transcriptome sequencing (NovaSeq). Statistical significance was determined using the Chi-square test and adjusted for multiple comparisons. Results: 538 patients were included – median (range) age 71.5 (30-89) years and 37.5% female/62.5% male. Prevalence of dMMR/MSI-H was 3.9% (21/538) and TMB-high was 22.7% (96/423). Significant molecular differences were not detected in primary vs metastatic disease or in male vs female cases. dMMR/MSI-H tumors had higher frequency of TMB-high compared to MSS tumors (100% vs. 19%, p = 0.00003). 
dMMR/MSI-H tumors also had a higher frequency than MSS tumors for mutations in genes involved in chromatin remodeling (ASXL 82.4%, CREBBP 60%, SMARCA4 40%, KMT2D 95%, ARID1A 100%, KMT2A 20%, KMT2C 35.3%, NSD1 20%), DNA-damage repair (FANCG 10%, ATM 45%, ATRX 40%) and other biological pathways (RNF43 10%, PTCH1 21.4%, ERBB3 30%, CDKN2A 25%, TSC2 15%, FLNC 15%, HNF1A 20%, CIC 15%, DNMT3A 17.6%); all adjusted p < 0.05. Pathogenic fusions were detected in 3.8% (17/443) of cases, with FGFR3 fusion being the most common, occurring in 2.7% (12/443) of cases. PD-L1 positivity was identified in 33.2% (133/400) of cases tested by 22c3 antibody and 28.4% (89/313) of cases tested by SP142 antibody. No difference was seen in PD-L1 positivity between MSI-H/dMMR vs. MSS tumors. Conclusions: In the largest analysis to date, we found a 3.9% prevalence of dMMR/MSI-H in UTUC. All dMMR/MSI-H tumors displayed TMB-high. PD-L1 positivity was comparable between dMMR/MSI-H and MSS tumors. dMMR/MSI-H tumors had a significantly higher rate of mutations in genes involved in chromatin remodeling and DDR biological pathways. These results could inform design of targeted therapy trials in UTUC.


2021 ◽  
Vol 2 (2) ◽  
pp. 60-67
Author(s):  
Rashidul Hasan

The estimation of a suitable probability model depends mainly on the features of available temperature data at a particular place. As a result, existing probability distributions must be evaluated to establish an appropriate probability model that can deliver precise temperature estimation. The study intended to estimate the best-fitted probability model for the monthly maximum temperature at the Sylhet station in Bangladesh from January 2002 to December 2012 using several statistical analyses. Ten continuous probability distributions such as Exponential, Gamma, Log-Gamma, Beta, Normal, Log-Normal, Erlang, Power Function, Rayleigh, and Weibull distributions were fitted for these tasks using the maximum likelihood technique. To determine the model’s fit to the temperature data, several goodness-of-fit tests were applied, including the Kolmogorov-Smirnov test, Anderson-Darling test, and Chi-square test. The Beta distribution is found to be the best-fitted probability distribution based on the largest overall score derived from three specified goodness-of-fit tests for the monthly maximum temperature data at the Sylhet station.
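
The fit-then-test workflow in the abstract above can be sketched as follows. This is not the study's code: the temperatures are synthetic (132 hypothetical monthly values), only two of the ten candidate distributions are fitted because their maximum likelihood estimates have closed forms, and only the Kolmogorov-Smirnov statistic is used to rank them.

```python
import math
import random

def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov distance between the empirical CDF and a fitted CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Synthetic stand-in for 11 years of monthly maximum temperatures (assumption).
rng = random.Random(1)
temps = [rng.gauss(30.0, 2.0) for _ in range(132)]

# Closed-form maximum likelihood estimates.
n = len(temps)
mu = sum(temps) / n
sigma = math.sqrt(sum((t - mu) ** 2 for t in temps) / n)  # Normal MLE uses 1/n
rate = 1.0 / mu                                            # Exponential MLE

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def exponential_cdf(x):
    return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

fits = {
    "normal": ks_statistic(temps, normal_cdf),
    "exponential": ks_statistic(temps, exponential_cdf),
}
best = min(fits, key=fits.get)  # smaller distance means a closer fit
print(fits, best)
```

Ranking by the raw statistic avoids p-values, which is deliberate here: the standard Kolmogorov-Smirnov critical values assume a fully specified distribution and are not valid when the parameters are estimated from the same data.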


2003 ◽  
Vol 33 (2) ◽  
pp. 365-381 ◽  
Author(s):  
Vytaras Brazauskas ◽  
Robert Serfling

Several recent papers treated robust and efficient estimation of tail index parameters for (equivalent) Pareto and truncated exponential models, for large and small samples. New robust estimators of “generalized median” (GM) and “trimmed mean” (T) type were introduced and shown to provide more favorable trade-offs between efficiency and robustness than several well-established estimators, including those corresponding to methods of maximum likelihood, quantiles, and percentile matching. Here we investigate performance of the above mentioned estimators on real data and establish — via the use of goodness-of-fit measures — that favorable theoretical properties of the GM and T type estimators translate into an excellent practical performance. Further, we arrive at guidelines for Pareto model diagnostics, testing, and selection of particular robust estimators in practice. Model fits provided by the estimators are ranked and compared on the basis of Kolmogorov-Smirnov, Cramér-von Mises, and Anderson-Darling statistics.


2018 ◽  
Vol 10 (12) ◽  
pp. 534
Author(s):  
Janilson Pinheiro de Assis ◽  
Roberto Pequeno de Sousa ◽  
Ben Deivide de Oliveira Batista ◽  
Paulo César Ferreira Linhares ◽  
Eudes de Almeida Cardoso ◽  
...  

We fitted the following seven probability distributions to monthly average temperature data from Mossoró, northeastern Brazil: Normal, Log-Normal, Beta, Gamma, Log-Pearson (Type III), Gumbel, and Weibull. To assess the goodness of fit of the empirical distributions to the theoretical distributions, we applied the Kolmogorov-Smirnov, Chi-square, Cramér-von Mises, Anderson-Darling, and Kuiper tests and the Logarithm of Maximum Likelihood, at the 10% probability level. The temperature series covered 1970 to 2007. The Normal distribution provided the best fit to the historical series of average monthly temperature. Although the Kolmogorov-Smirnov test showed a very high level of approval, which generated some uncertainty regarding the test criteria, it is the most recommended test for studies with approximately symmetric data and short series.


Author(s):  
Naz Saud ◽  
Sohail Chand

A class of goodness-of-fit tests for the Marshall-Olkin extended Rayleigh distribution with estimated parameters is proposed. The tests are based on the empirical distribution function. The Kolmogorov-Smirnov, Cramér-von Mises, Anderson-Darling, Watson, and Liao-Shimokawa test statistics are used, and Monte Carlo simulations are used to obtain their asymptotic percentage points for the Marshall-Olkin extended Rayleigh distribution. Moreover, the power of the goodness-of-fit test statistics is investigated for this lifetime model against several alternatives.
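
The Monte Carlo approach to percentage points can be sketched under a simplifying assumption: the plain one-parameter Rayleigh distribution stands in for the Marshall-Olkin extended Rayleigh model, and only the Cramér-von Mises statistic is simulated. The function names, sample size, and replication count are illustrative and not taken from the article.

```python
import math
import random

def rayleigh_sample(n, sigma, rng):
    """Draw n Rayleigh variates by inverting the CDF."""
    return [sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random())) for _ in range(n)]

def cvm_with_fitted_scale(sample):
    """Cramer-von Mises statistic with the Rayleigh scale estimated by MLE."""
    n = len(sample)
    sigma2 = sum(x * x for x in sample) / (2.0 * n)  # MLE of sigma squared
    w2 = 1.0 / (12.0 * n)
    for i, x in enumerate(sorted(sample), start=1):
        f = 1.0 - math.exp(-x * x / (2.0 * sigma2))
        w2 += (f - (2 * i - 1) / (2.0 * n)) ** 2
    return w2

def monte_carlo_percentage_point(n, level=0.95, reps=2000, seed=0):
    """Empirical upper percentage point of the statistic under the null model."""
    rng = random.Random(seed)
    # The statistic is scale invariant, so simulating with sigma = 1 is enough.
    stats = sorted(cvm_with_fitted_scale(rayleigh_sample(n, 1.0, rng))
                   for _ in range(reps))
    return stats[min(reps - 1, int(level * reps))]

critical_value = monte_carlo_percentage_point(n=20)
print(critical_value)
```

A sample would then be judged against `critical_value` rather than against tables for a fully specified distribution, since estimating the parameter from the data shifts the null distribution of the statistic toward smaller values.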


2018 ◽  
Vol 24 (3) ◽  
Author(s):  
OLUSEYI OGUNSOLA ◽  
OGUNSOLA OSAGIEDE

The wind energy potential at Ikeja (Lat. 6.35 °N; Long. 3.20 °E), Nigeria, was statistically analyzed using three of the most widely used conventional probability distribution functions (PDFs) in order to determine which of these distributions would give the best means of analysis for wind at this particular location. The best fit among these PDFs was determined from the Akaike Information Criterion, Bayesian Information Criterion, Kolmogorov-Smirnov test, Cramér-von Mises statistic, Anderson-Darling statistic, Mean Square Error, and Chi-square test, using Maximum Likelihood Estimation and the Method of Moments for parameter estimation. The Weibull distribution gave the best fit at this location.

