Testing for goodness rather than lack of fit of continuous probability distributions

The vast majority of testing procedures presented in the literature as goodness-of-fit tests fail to deliver what the term promises. A significant result of such a test indicates that the true distribution underlying the data differs substantially from the assumed model, whereas the true objective is usually to establish that the model fits the data sufficiently well. Meeting that objective requires carrying out a testing procedure for a problem in which the statement that the deviations between the model and the true distribution are small plays the role of the alternative hypothesis. Testing procedures of this kind, for which the term tests for equivalence has been coined in statistical usage, are available for establishing goodness-of-fit of discrete distributions. We show how this methodology can be extended to settings where interest is in establishing goodness-of-fit of distributions of the continuous type.

Review of Probability Distributions

Managerial Approaches Toward Queuing Systems and Simulations - Advances in Mechatronics and Mechanical Engineering ◽

10.4018/978-1-5225-5264-2.ch002 ◽

2018 ◽

pp. 25-73

Keyword(s):

Confidence Intervals ◽

Goodness Of Fit ◽

Probability Distributions ◽

Discrete Distributions ◽

Continuous Distributions ◽

R Language ◽

Goodness Of Fit Tests ◽

Anderson Darling ◽

Analysis And Study ◽

Waiting Lines

Chapter 2 presents the probability distributions most relevant to the analysis and study of waiting lines: the discrete distributions (binomial, geometric, Poisson) and the continuous distributions (uniform, exponential, Erlang, and normal). Confidence intervals are calculated for some of the parameters of these distributions. A brief example of generating pseudorandom exponential times with a spreadsheet is presented. The chapter closes with goodness-of-fit tests for probability distributions, especially the Anderson-Darling test. The statistical programming language R is used in the exercises, and several R scripts are provided to perform the calculations automatically.
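
Pseudorandom exponential times of the kind mentioned above are commonly generated by inverse-transform sampling. The sketch below assumes that method (the abstract does not say which formula the chapter uses) and is written in Python rather than the chapter's R. The function name `exponential_times` and the rate value are illustrative only. In a spreadsheet the same idea is usually written as =-LN(1-RAND())/rate.

```python
import math
import random


def exponential_times(rate, n, seed=None):
    """Draw n pseudorandom exponential times by inverse-transform sampling."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    rng = random.Random(seed)
    # If U ~ Uniform(0, 1), then -ln(1 - U) / rate ~ Exponential(rate).
    # random() returns values in [0, 1), so 1 - U is never zero.
    return [-math.log(1.0 - rng.random()) / rate for _ in range(n)]


if __name__ == "__main__":
    # Illustrative rate: 0.5 arrivals per minute, so the mean gap is 2 minutes.
    sample = exponential_times(rate=0.5, n=10_000, seed=42)
    mean = sum(sample) / len(sample)
    print(f"sample mean = {mean:.3f}, theoretical mean = {1 / 0.5:.3f}")
```

Comparing the sample mean with 1/rate is a quick sanity check before the generated times are fed into a queueing simulation.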

A best-fit probability distribution for the estimation of rainfall in northern regions of Pakistan

Open Life Sciences ◽

10.1515/biol-2016-0057 ◽

2016 ◽

Vol 11 (1) ◽

pp. 432-440 ◽

Cited By ~ 10

Author(s):

M. T. Amin ◽

M. Rizwan ◽

A. A. Alazba

Keyword(s):

Probability Distribution ◽

Goodness Of Fit ◽

Probability Distributions ◽

Future Research ◽

Maximum Rainfall ◽

Type III ◽

Goodness Of Fit Tests ◽

Pearson Type ◽

Northern Regions ◽

Best Fit

This study was designed to find the best-fit probability distribution of annual maximum rainfall based on a twenty-four-hour sample in the northern regions of Pakistan using four probability distributions: normal, log-normal, log-Pearson type-III and Gumbel max. Based on the scores of goodness-of-fit tests, the normal distribution was found to be the best-fit probability distribution at the Mardan rainfall gauging station. The log-Pearson type-III distribution was found to be the best-fit probability distribution at the rest of the rainfall gauging stations. The maximum values of expected rainfall were calculated using the best-fit probability distributions and can be used by design engineers in future research.

Estimating the Best-Fitted Probability Distribution for Monthly Maximum Temperature at the Sylhet Station in Bangladesh

Journal of Mathematics and Statistics Studies ◽

10.32996/jmss.2021.2.2.7 ◽

2021 ◽

Vol 2 (2) ◽

pp. 60-67

Author(s):

Rashidul Hasan

Keyword(s):

Probability Distribution ◽

Goodness Of Fit ◽

Probability Distributions ◽

Probability Model ◽

Temperature Data ◽

Maximum Temperature ◽

Chi Square ◽

Goodness Of Fit Tests ◽

Anderson Darling ◽

Log Normal

The choice of a suitable probability model depends mainly on the features of the temperature data available at a particular place. Existing probability distributions must therefore be evaluated to establish a model that can deliver precise temperature estimates. The study aimed to identify the best-fitted probability model for the monthly maximum temperature at the Sylhet station in Bangladesh from January 2002 to December 2012. Ten continuous probability distributions (Exponential, Gamma, Log-Gamma, Beta, Normal, Log-Normal, Erlang, Power Function, Rayleigh, and Weibull) were fitted using the maximum likelihood technique. To determine each model's fit to the temperature data, three goodness-of-fit tests were applied: the Kolmogorov-Smirnov test, the Anderson-Darling test, and the Chi-square test. The Beta distribution was found to be the best-fitted probability distribution for the monthly maximum temperature data at the Sylhet station, based on the largest overall score derived from the three goodness-of-fit tests.
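
The fit-then-compare workflow described above can be sketched in a few lines. This is a simplified illustration, not the study's procedure: it fits only two candidates (exponential and normal) by maximum likelihood and ranks them by the Kolmogorov-Smirnov distance alone, whereas the study scores ten distributions across three tests. All function names and the sample values are hypothetical, and the values are not the Sylhet data.

```python
import math
from statistics import NormalDist, fmean, pstdev


def ks_statistic(data, cdf):
    """Kolmogorov-Smirnov distance between the empirical CDF and a model CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def fit_exponential(data):
    """MLE for the exponential distribution: rate = 1 / sample mean."""
    rate = 1.0 / fmean(data)
    return lambda x: 1.0 - math.exp(-rate * x) if x > 0 else 0.0


def fit_normal(data):
    """MLE for the normal distribution: sample mean and population std dev."""
    return NormalDist(fmean(data), pstdev(data)).cdf


def rank_models(data):
    """Return (KS distance, model name) pairs, best fit first."""
    fits = {"exponential": fit_exponential(data), "normal": fit_normal(data)}
    return sorted((ks_statistic(data, cdf), name) for name, cdf in fits.items())


if __name__ == "__main__":
    # Made-up monthly maximum temperatures in degrees Celsius.
    temps = [25.1, 27.3, 30.2, 31.5, 32.0, 31.8, 31.2, 31.9, 31.4, 30.6, 28.9, 26.2]
    for distance, name in rank_models(temps):
        print(f"{name:12s} KS distance = {distance:.3f}")
```

A smaller distance means a closer fit. Note that when parameters are estimated from the same data, the standard Kolmogorov-Smirnov critical values no longer apply directly, so the distances here are used for ranking only.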

Best Fitted Distribution For Meteorological Data In Kuala Krai

Journal of Statistical Modelling and Analytics ◽

10.22452/josma.vol3no1.2 ◽

2021 ◽

Vol 3 (1) ◽

pp. 16-25

Author(s):

Siti Mariam Norrulashikin ◽

Fadhilah Yusof ◽

Siti Rohani Mohd Nor ◽

Nur Arina Bazilah Kamisan

Keyword(s):

Climate Change ◽

Probability Distribution ◽

Goodness Of Fit ◽

Probability Distributions ◽

Meteorological Data ◽

Cumulative Distribution ◽

Distribution Model ◽

Data Series ◽

Distribution Models ◽

Goodness Of Fit Tests

Modeling meteorological variables is a vital aspect of climate change studies. Awareness of the frequency and magnitude of climate change is critical for mitigating the associated risks. Probability distribution models are valuable tools for frequency studies of climate variables, since they measure how well a probability distribution fits the data series. Monthly meteorological data, including average temperature, wind speed, and rainfall, were analyzed to determine the most suitable probability distribution model for the Kuala Krai district. The probability distributions used in the analysis were the Beta, Burr, Gamma, Lognormal, and Weibull distributions. Maximum likelihood estimation (MLE) was employed to estimate the parameters of each distribution. The Kolmogorov-Smirnov and Anderson-Darling goodness-of-fit tests were conducted to identify the best-suited model and to assess the tests' reliability. The results indicate that the Burr distribution better characterizes the meteorological data in this study. Graphs of the probability density function and cumulative distribution function, as well as Q-Q plots, are presented.

Stochastic Generation of Low Stream Flow Data of Iokastis Stream, Kavala City, NE Greece

Proceedings ◽

10.3390/proceedings2110579 ◽

2018 ◽

Vol 2 (11) ◽

pp. 579

Author(s):

Thomas Papalaskaris ◽

Theologos Panagiotidis

Keyword(s):

Stream Flow ◽

Goodness Of Fit ◽

Probability Distributions ◽

Specific Area ◽

Low Flow ◽

Urban Stream ◽

Flow Data ◽

Goodness Of Fit Tests ◽

Chi Squared ◽

Anderson Darling

Only a few research studies dealing with extremely low flow conditions have been carried out so far in Greece. The present study, which aims to contribute to this specific area of hydrologic investigation, generates synthetic low stream flow time series for an entire calendar year from the stream flow data recorded during a central interval of the year 2015. We examined the goodness of fit of eleven theoretical probability distributions to daily low stream flow data and calculated the corresponding distribution parameters. The data were acquired with a conventional portable 3-inch Parshall flume at a location on the fully channelized urban stream that crosses the junction of Iokastis road and Chrisostomou Smirnis road, Agios Loukas residential area, Kavala city, NE Greece. The Kolmogorov-Smirnov, Anderson-Darling, and Chi-Squared GOF tests were employed to show how well the probability distributions fitted the recorded data, and the results are presented in interactive tables that make it possible to decide which model best fits the observed data. Finally, the observed low flow data are plotted against the calculated data on a log-log chart, and statistics comparing the recorded and forecasted low flow data are calculated.

Model Efficiency and Uncertainty in Quantile Estimation of Loss Severity Distributions

Risks ◽

10.3390/risks7020055 ◽

2019 ◽

Vol 7 (2) ◽

pp. 55

Author(s):

Vytaras Brazauskas ◽

Sahadeb Upretee

Keyword(s):

Value At Risk ◽

Goodness Of Fit ◽

Risk Measures ◽

Probability Distributions ◽

Simulated Data ◽

Information Criteria ◽

Asymptotic Distributions ◽

Parametric Estimation ◽

Goodness Of Fit Tests ◽

Transfer Strategies

Quantiles of probability distributions play a central role in the definition of risk measures (e.g., value-at-risk, conditional tail expectation), which in turn are used to capture the riskiness of the distribution tail. Estimates of risk measures are needed in many practical situations, such as pricing extreme events, developing reserve estimates, designing risk transfer strategies, and allocating capital. In this paper, we present the empirical nonparametric estimator and two types of parametric estimators of quantiles at various levels. For parametric estimation, we employ the maximum likelihood and percentile-matching approaches. Asymptotic distributions of all the estimators under consideration are derived when data are left-truncated and right-censored, which is a typical loss variable modification in insurance. Then, we construct relative efficiency curves (REC) for all the parametric estimators. Specific examples of such curves are provided for exponential and single-parameter Pareto distributions for a few data truncation and censoring cases. Additionally, using simulated data, we examine how wrong quantile estimates can be when one makes incorrect modeling assumptions. The numerical analysis is also supplemented with standard model diagnostics and validation (e.g., quantile-quantile plots, goodness-of-fit tests, information criteria) and includes an example of when those methods can mislead the decision maker. These findings pave the way for further work on RECs, which have the potential to be developed into an effective diagnostic tool in this context.
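
The three kinds of quantile estimators named above can be compared on a toy case. The sketch below is a deliberate simplification: it assumes complete data (no truncation or censoring, unlike the paper), an exponential loss model with mean theta, and a plain order-statistic definition of the empirical quantile. The function names, the simulated losses, and the matching level p0 = 0.5 are all illustrative assumptions.

```python
import math
import random
from statistics import fmean


def empirical_quantile(data, p):
    """Order-statistic estimate: smallest x whose empirical CDF is >= p."""
    xs = sorted(data)
    k = max(1, math.ceil(p * len(xs)))
    return xs[k - 1]


def exp_quantile(theta, p):
    """p-quantile of an exponential distribution with mean theta."""
    return -theta * math.log(1.0 - p)


def theta_mle(data):
    """Maximum likelihood estimate of the exponential mean (complete data)."""
    return fmean(data)


def theta_percentile_matching(data, p0=0.5):
    """Pick theta so the model p0-quantile equals the empirical p0-quantile."""
    return -empirical_quantile(data, p0) / math.log(1.0 - p0)


if __name__ == "__main__":
    rng = random.Random(7)
    # Simulated losses with true mean 1000 (expovariate takes the rate 1/mean).
    losses = [rng.expovariate(1 / 1000.0) for _ in range(5000)]
    p = 0.95
    print("true quantile       :", round(exp_quantile(1000.0, p), 1))
    print("empirical           :", round(empirical_quantile(losses, p), 1))
    print("parametric, MLE     :", round(exp_quantile(theta_mle(losses), p), 1))
    print("parametric, matching:",
          round(exp_quantile(theta_percentile_matching(losses), p), 1))
```

Repeating the simulation with losses drawn from a different family while still fitting the exponential model gives a rough feel for how far parametric quantile estimates can drift under a wrong modeling assumption.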

A Probabilistic Approach to the Simulation of Non-Linear Stress-Strain Relationships for Oriented Strandboard Subject to In-Plane Tension

Key Engineering Materials ◽

10.4028/www.scientific.net/kem.478.54 ◽

2011 ◽

Vol 478 ◽

pp. 54-63 ◽

Cited By ~ 2

Author(s):

Antony T. McTigue ◽

Annette M. Harte

Keyword(s):

Goodness Of Fit ◽

Probability Distributions ◽

Probabilistic Approach ◽

Regression Equations ◽

Goodness Of Fit Tests ◽

Oriented Strandboard ◽

Anderson Darling ◽

Standardised Testing ◽

Tension Loading ◽

Tension Strength

This paper presents the results from an experimental test program conducted on commercially available oriented strandboard (OSB) panels and statistical analyses of the results. Standardised testing was used to determine the short-term behaviour of OSB/3 panels subjected to tension loading. A variety of thicknesses sourced from three different producers were used. Analysis of the results indicates that a quadratic expression of the form $\sigma = a\varepsilon^2 + b\varepsilon$ provides the best description of the relationship between stress ($\sigma$) and strain ($\varepsilon$) up to the point of failure. It has also been shown that the coefficients a and b of the quadratic regression equations are negatively correlated with each other. Anderson-Darling goodness-of-fit tests were conducted on the results for tension strength and modulus of elasticity (MOE). The results indicate that the tension strength and MOE come from populations that follow either normal or lognormal probability distributions.
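
A quadratic stress-strain regression of this kind can be fitted by ordinary least squares. The sketch below reads the expression as stress = a * strain^2 + b * strain with no intercept, which is an assumption about the paper's model, and solves the two normal equations directly. The function name and the data are illustrative and are not taken from the test program.

```python
def fit_quadratic_through_origin(strains, stresses):
    """Least-squares fit of stress = a * strain**2 + b * strain (no intercept)."""
    if len(strains) != len(stresses) or len(strains) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    s4 = sum(e ** 4 for e in strains)
    s3 = sum(e ** 3 for e in strains)
    s2 = sum(e ** 2 for e in strains)
    t2 = sum(e ** 2 * s for e, s in zip(strains, stresses))
    t1 = sum(e * s for e, s in zip(strains, stresses))
    # Normal equations: [s4 s3; s3 s2] [a; b] = [t2; t1], solved by Cramer's rule.
    det = s4 * s2 - s3 * s3
    if det == 0:
        raise ValueError("strain values do not determine a unique fit")
    a = (t2 * s2 - t1 * s3) / det
    b = (s4 * t1 - s3 * t2) / det
    return a, b


if __name__ == "__main__":
    # Hypothetical readings: strain in percent, stress in MPa.
    strains = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
    stresses = [1.15, 2.30, 3.30, 4.35, 5.20, 6.15]
    a, b = fit_quadratic_through_origin(strains, stresses)
    print(f"a = {a:.3f}, b = {b:.3f}")
```

Fitting each specimen separately yields one (a, b) pair per test, and the correlation between those pairs across specimens is the quantity the abstract reports as negative.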

Goodness of fit tests for discrete distributions

Communications in Statistics - Theory and Methods ◽

10.1080/03610928608829153 ◽

1986 ◽

Vol 15 (3) ◽

pp. 815-829 ◽

Cited By ~ 30

Author(s):

S Kocherlakota ◽

K Kocherlakota

Keyword(s):

Goodness Of Fit ◽

Discrete Distributions ◽

Goodness Of Fit Tests

Comparison of the Goodness-of-Fit Tests for Truncated Distributions

Przegląd Statystyczny ◽

10.5604/01.3001.0014.0541 ◽

2019 ◽

Vol 65 (3) ◽

pp. 296-313

Author(s):

Agnieszka Lach ◽

Łukasz Smaga

Keyword(s):

Goodness Of Fit ◽

Asset Returns ◽

Finite Sample ◽

Simulation Experiments ◽

Testing Procedures ◽

The Past ◽

Truncated Distributions ◽

Finite Samples ◽

Goodness Of Fit Tests ◽

Power Simulation

The aim of this paper is to investigate the finite sample behavior, in terms of size and power, of seven goodness-of-fit tests for left-truncated distributions of Chernobai et al. (2015). Simulation experiments are based on artificial data generated from distributions that were used in the past or are used nowadays to describe the tails of asset returns. The study was conducted for different tail thicknesses and for varying truncation points. Simulation results indicate that the testing procedures do not work equally well under finite samples, and some of them require quite a large number of observations to perform satisfactorily.

Visual Assessment vs. Statistical Goodness of Fit Tests for Identifying Parent Population

Proceedings of the Human Factors Society Annual Meeting ◽

10.1177/154193128803200701 ◽

1988 ◽

Vol 32 (7) ◽

pp. 460-464

Author(s):

Mari Berry ◽

Brian Peacock ◽

Bobbie Foote ◽

Lawrence Leemis

Keyword(s):

Goodness Of Fit ◽

Statistical Tests ◽

Visual Assessment ◽

Discrete Distributions ◽

Human Observer ◽

Chi Square ◽

Data Set ◽

Goodness Of Fit Tests ◽

Parent Distribution ◽

Chi Square Test

Statistical tests are used to identify the parent distribution corresponding to a data set. A human observer looking at a histogram can also identify a probability distribution that models the parent distribution. The accuracy of a human observer was compared to the chi-square test for discrete data and to the Kolmogorov-Smirnov and chi-square tests for continuous data. The human observer proved more accurate in identifying continuous distributions, and the chi-square test proved superior in identifying discrete distributions. The effects of sample size and of the number of intervals in the histogram were included in the experimental design.
