Goodness of fit checks for binomial N-mixture models

Abstract Binomial N-mixture models are commonly applied to analyze population survey data. By estimating detection probabilities, N-mixture models aim to extract information about abundances in terms of actual and not just relative numbers. This separation of detection probability and abundance relies on parametric assumptions about the distribution of individuals among sites and of detections of individuals among repeat visits to sites. Current methods for checking assumptions are limited, and their computational complexity has hindered evaluations of their performance. We develop computationally efficient graphical goodness of fit checks and measures of overdispersion for binomial N-mixture models. These checks are illustrated in a case study, and evaluated in simulations under two scenarios. The two scenarios assume overdispersion in the abundance distribution via a negative binomial distribution or in the detection probability via a beta-binomial distribution. We evaluate the ability of the checks to detect lack of fit, and how lack of fit affects estimates of abundances. The simulations show that if the parametric assumptions are incorrect there can be severe biases in estimated abundances: negative if there is overdispersion in abundance relative to the fitted model and positive if there is overdispersion in detection. Our goodness of fit checks performed well in detecting lack of fit when the abundance distribution was overdispersed, but struggled to detect lack of fit when detections were overdispersed. We show that the inability to detect lack of fit due to overdispersed detection is caused by a fundamental similarity between N-mixture models with beta-binomial detections and N-mixture models with negative binomial abundances. The strong biases in estimated abundances that can occur in the binomial N-mixture model when the distribution of individuals among sites, or the detection model, is mis-specified imply that checking goodness of fit is essential for sound inference in ecological studies that use these methods. To check the assumptions we provide computationally efficient goodness of fit checks that are available in the R package nmixgof. However, even when a binomial N-mixture model appears to fit the data well, estimates are not robust in the presence of overdispersion unless additional information about detection is collected.
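
As a rough illustration of the model structure this abstract describes, the sketch below computes the marginal likelihood of one site's repeated counts under Poisson abundance and binomial detection, summing over the unobserved abundance N. It is not taken from the nmixgof package: the function names, the example values and the truncation bound `n_max` are assumptions made for this example.

```python
import math


def poisson_pmf(n, lam):
    """Poisson probability of abundance n with mean lam."""
    return math.exp(-lam + n * math.log(lam) - math.lgamma(n + 1))


def binom_pmf(y, n, p):
    """Probability of detecting y of n individuals, each with probability p."""
    if y > n:
        return 0.0
    return math.comb(n, y) * p**y * (1 - p) ** (n - y)


def site_likelihood(counts, lam, p, n_max=200):
    """Marginal likelihood of repeated counts at one site.

    Sums over the latent abundance N from max(counts) up to n_max
    (n_max is an assumed truncation bound; it must be large relative to lam).
    """
    total = 0.0
    for n in range(max(counts), n_max + 1):
        term = poisson_pmf(n, lam)
        for y in counts:
            term *= binom_pmf(y, n, p)
        total += term
    return total


# Three hypothetical repeat visits to one site.
print(site_likelihood([3, 2, 4], lam=5.0, p=0.4))
```

With a single visit the marginal count is Poisson with mean `lam * p`, so only the product of abundance and detection is identified; the repeat visits are what allow the two to be separated, under the parametric assumptions.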

On hypergeometric generalized negative binomial distribution

International Journal of Mathematics and Mathematical Sciences ◽

10.1155/s0161171202106193 ◽

2002 ◽

Vol 29 (12) ◽

pp. 727-736 ◽

Cited By ~ 11

Author(s):

M. E. Ghitany ◽

S. A. Al-Awadhi ◽

S. L. Kalla

Keyword(s):

Recurrence Relation ◽

Negative Binomial Distribution ◽

Binomial Distribution ◽

Field Data ◽

Goodness Of Fit ◽

Negative Binomial ◽

Generalized Negative Binomial Distribution ◽

The Right ◽

Three Term Recurrence Relation

It is shown that the hypergeometric generalized negative binomial distribution has moments of all positive orders, is overdispersed, skewed to the right, and leptokurtic. Also, a three-term recurrence relation for computing probabilities from the considered distribution is given. Application of the distribution to entomological field data is given and its goodness-of-fit is demonstrated.

Patterns of macroparasite aggregation in wildlife host populations

Parasitology ◽

10.1017/s0031182098003448 ◽

1998 ◽

Vol 117 (6) ◽

pp. 597-610 ◽

Cited By ~ 277

Author(s):

D. J. SHAW ◽

B. T. GRENFELL ◽

A. P. DOBSON

Keyword(s):

Poisson Distribution ◽

Negative Binomial Distribution ◽

Binomial Distribution ◽

Goodness Of Fit ◽

Negative Binomial ◽

Data Sets ◽

Frequency Distributions ◽

System A ◽

Host Sex ◽

Host Parasite

Frequency distributions from 49 published wildlife host–macroparasite systems were analysed by maximum likelihood for goodness of fit to the negative binomial distribution. In 45 of the 49 (90%) data-sets, the negative binomial distribution provided a statistically satisfactory fit. In the other 4 data-sets the negative binomial distribution still provided a better fit than the Poisson distribution, and only 1 of the data-sets fitted the Poisson distribution. The degree of aggregation was large, with 43 of the 49 data-sets having an estimated k of less than 1. From these 49 data-sets, 22 subsets of host data were available (i.e. host data could be divided by host sex, age, or where or when hosts were sampled). In 11 of these 22 subsets there was significant variation in the degree of aggregation between host subsets of the same host–parasite system. A common k estimate was always larger than that obtained with all the host data considered together. These results indicate that lumping host data can hide important variations in aggregation between hosts and can exaggerate the true degree of aggregation. Wherever possible, common k estimates should be used to estimate the degree of aggregation. In addition, significant differences in the degree of aggregation between subgroups of host data were generally associated with significant differences in both mean parasite burdens and the prevalence of infection.
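
To make the aggregation parameter k concrete, the following sketch estimates it by the method of moments, k = m² / (s² − m), where m is the sample mean and s² the sample variance. The study itself used maximum likelihood, so this is a simpler stand-in, and the parasite counts and the function name `moment_k` are invented for illustration.

```python
from statistics import mean, variance


def moment_k(counts):
    """Method-of-moments estimate of the negative binomial k."""
    m = mean(counts)
    v = variance(counts)  # sample variance (n - 1 denominator)
    if v <= m:
        # Without overdispersion the moment estimate is undefined.
        raise ValueError("variance does not exceed mean; k is undefined")
    return m * m / (v - m)


# Hypothetical parasite burdens for ten hosts (not data from the study).
burdens = [0, 0, 0, 0, 1, 1, 2, 3, 5, 8]
k_hat = moment_k(burdens)
print(k_hat)
```

Smaller values of k indicate stronger aggregation; a value below 1, as in this made-up sample, corresponds to the highly aggregated pattern reported for most of the data-sets.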

A k-Inflated Negative Binomial Mixture Regression Model: Application to Rate–Making Systems

Asia-Pacific Journal of Risk and Insurance ◽

10.1515/apjri-2017-0014 ◽

2018 ◽

Vol 12 (2) ◽

Cited By ~ 1

Author(s):

Amir T. Payandeh Najafabadi ◽

Saeed MohammadPour

Keyword(s):

Regression Model ◽

Mixture Models ◽

Mixture Model ◽

Negative Binomial ◽

Mixture Distribution ◽

Third Party ◽

Numerical Illustration ◽

Mixture Regression ◽

Pure Premium ◽

Binomial Mixture

Abstract This article introduces a k-Inflated Negative Binomial mixture distribution/regression model as a more flexible alternative to the zero-inflated Poisson distribution/regression model. An EM algorithm is employed to estimate the model’s parameters. The new model, along with a Pareto mixture model, is then employed to design an optimal rate–making system. Specifically, the article uses the number and size of reported claims in an Iranian third party insurance dataset, and applies the k-Inflated Negative Binomial mixture distribution/regression model, as well as other well developed counting models, together with a Pareto mixture model to model the frequency and severity of reported claims in that dataset. This numerical illustration shows that: (1) the k-Inflated Negative Binomial mixture models provide fairer rate/pure premiums for policyholders under a rate–making system; and (2) when the number of reported claims is uniformly distributed over a policyholder's past experience (for instance $k_1=1$ and $k_2=1$ instead of $k_1=0$ and $k_2=2$), the rate/pure premiums under the k-Inflated Negative Binomial mixture models are more appealing and acceptable.

Negative Binomial Distribution to Explain the Domestic Fire Incidence in Nepal

Nepalese Journal of Statistics ◽

10.3126/njs.v5i1.41229 ◽

2021 ◽

pp. 51-66

Author(s):

Arun Kumar Yadav ◽

Santosh Kumar Shah

Keyword(s):

Negative Binomial Distribution ◽

Binomial Distribution ◽

Goodness Of Fit ◽

Negative Binomial ◽

Information Criteria ◽

Descriptive Statistics ◽

Probability Models ◽

Ecological Regions ◽

The Hill ◽

Fire Incidence

Background: Fire disaster is one of the most destructive disasters. According to the global dataset of the Sendai Framework, domestic fire incidence was 9.9% up to 2019. In Nepal, 62% fire incidence was reported during 2017 and 2018. Although many studies have been conducted on fire incidence, few of them address domestic fire incidence. Objective: To find the descriptive statistics of fire occurrences and fire fatalities, and to identify the probability distributions that best fit the data of fire occurrences observed in three ecological regions as well as overall in Nepal. Material and Methods: The data of fire incidences from May 2011 to April 2021 were retrieved from the Nepal Disaster Risk Reduction Portal, Government of Nepal. First, the 30-day trial version of the statistical software "Mathwave EasyFit" was used to identify the candidate probability models. The best probability model was then determined by testing the goodness of fit of the candidate models with graphical tools (histogram and theoretical densities, empirical and theoretical CDFs, Q-Q plot and P-P plot) and mathematical tools (maximum likelihood, Akaike Information Criteria and Bayesian Information Criteria), using the package “fitdistrplus” of the software R version 4.1.1. Results: On average, 135 fire incidences per month occurred in Nepal. Although the Terai faced the highest monthly fire incidence compared to the Hill and the Mountain, it had the fewest fatalities per 100 fire incidences, followed by the Hill and the Mountain. Descriptive statistics reveal that fire occurrences are moderate from November to February and high in March and April. Fire incidence was reported to be high during spring and winter and low during summer and autumn, which suggests that fire incidence might be related to precipitation and temperature. The sample data were run in the "Mathwave EasyFit" software, which suggested the Poisson, geometric and negative binomial distributions as candidate probability models. The goodness of fit of these models was further tested by graphical as well as mathematical tools, and the negative binomial distribution was found to be the best among the candidate models for the data set. Conclusion: Incidence of fire disasters varies by ecological region as well as by season. It is low in the Mountain region and during the Monsoon/rainy season. The negative binomial distribution provides the best fit to monthly data of fire incidence in Nepal.
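
The model-comparison step can be sketched in a few lines. The study used the R package fitdistrplus; the Python below is only an assumed, simplified analogue that compares two of the candidate models, Poisson and geometric, by the Akaike Information Criterion using their closed-form maximum likelihood estimates. The negative binomial is left out because its fit needs numerical optimisation, and the counts are made up rather than taken from the Nepal data.

```python
import math


def poisson_loglik(counts):
    """Log-likelihood at the Poisson MLE (lam = sample mean)."""
    lam = sum(counts) / len(counts)
    return sum(-lam + y * math.log(lam) - math.lgamma(y + 1) for y in counts)


def geometric_loglik(counts):
    """Log-likelihood at the MLE of a geometric on 0, 1, 2, ..."""
    m = sum(counts) / len(counts)
    p = 1.0 / (1.0 + m)
    return sum(math.log(p) + y * math.log(1.0 - p) for y in counts)


def aic(loglik, n_params):
    """Akaike Information Criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * loglik


# Hypothetical overdispersed counts (not the fire-incidence data).
counts = [0, 0, 0, 0, 1, 1, 2, 3, 5, 8]
aic_poisson = aic(poisson_loglik(counts), n_params=1)
aic_geometric = aic(geometric_loglik(counts), n_params=1)
print(aic_poisson, aic_geometric)
```

For these invented counts the variance is well above the mean, so the geometric model attains the lower AIC; both functions assume a positive sample mean.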

On Testing for Goodness-of-Fit of the Negative Binomial Distribution when Expectations are Small

Biometrics ◽

10.2307/2528685 ◽

1969 ◽

Vol 25 (1) ◽

pp. 143 ◽

Cited By ~ 9

Author(s):

P. J. Pahl

Keyword(s):

Negative Binomial Distribution ◽

Binomial Distribution ◽

Goodness Of Fit ◽

Negative Binomial

A Goodness-of-Fit Test for the Negative Binomial Distribution Applicable to Large Sets of Small Samples

Statistical Aspects of Water Quality Monitoring, Proceedings of the Workshop held at the Canada Centre for Inland Waters - Developments in Water Science ◽

10.1016/s0167-5648(08)70794-9 ◽

1986 ◽

pp. 215-220

Author(s):

Barbara Heller

Keyword(s):

Negative Binomial Distribution ◽

Binomial Distribution ◽

Goodness Of Fit ◽

Negative Binomial ◽

Small Samples ◽

Goodness Of Fit Test ◽

Large Sets

Small Sample Properties of the Pareto/Negative Binomial Distribution Model

Marketing ZFP ◽

10.15358/0344-1369-2010-jrm-1-39 ◽

2010 ◽

Vol 32 (JRM 1) ◽

pp. 39-50

Author(s):

Daniel Hoppe ◽

Udo Wagner

Keyword(s):

Negative Binomial Distribution ◽

Binomial Distribution ◽

Negative Binomial ◽

Small Sample ◽

Distribution Model ◽

Small Sample Properties

Niche Sharing in Intertidal Mollusks and Decapods in Rocky Shore of Easter Island

Vestnik Zoologii ◽

10.2478/vzoo-2019-0037 ◽

2019 ◽

Vol 53 (5) ◽

pp. 417-422

Author(s):

P. De los Ríos ◽

E. Ibáñez Arancibia

Keyword(s):

Spatial Distribution ◽

Negative Binomial Distribution ◽

Binomial Distribution ◽

Rocky Shore ◽

Negative Binomial ◽

Null Model ◽

Easter Island ◽

Published Data ◽

Distribution Analysis ◽

Field Works

Abstract The coastal marine ecosystems of Easter Island have been poorly studied, and the main studies have been isolated species records based on scientific expeditions. The aim of the present study is to apply a spatial distribution analysis and a niche sharing null model to published data on intertidal marine gastropods and decapods of the rocky shore of Easter Island, based on field work in 2010 and published information from the CIMAR cruise in 2004. The field data revealed the presence of the decapods Planes minutus (Linnaeus, 1758) and Leptograpsus variegatus (Fabricius, 1793), as well as the gastropods Nodilittorina pyramidalis pascua Rosewater, 1970 and Nerita morio (G. B. Sowerby I., 1833). The available information revealed the presence of more species in the data collected in 2004 than in the data collected in 2010, with one species markedly dominant over the others. The spatial distribution of the species reported in field work revealed that P. minutus and N. morio had an aggregated pattern and a negative binomial distribution, L. variegatus had a uniform pattern with a binomial distribution, and N. pyramidalis pascua, in spite of an aggregated distribution pattern, did not follow a negative binomial distribution. Finally, the results of the null model revealed that the species reported did not share an ecological niche, owing to the absence of competition. The results agree with other similar information about the littoral and sub-littoral fauna of Easter Island.

Operational Risk Analysis Using the Loss Distribution Approach with the Aggregate Method (ANALISIS RISIKO OPERASIONAL MENGGUNAKAN PENDEKATAN DISTRIBUSI KERUGIAN DENGAN METODE AGREGAT)

Journal of Mathematics and Its Applications ◽

10.29244/jmap.10.2.1-10 ◽

2011 ◽

Vol 10 (2) ◽

pp. 1

Author(s):

Y. ARBI ◽

R. BUDIARTI ◽

I G. P. PURNABA

Keyword(s):

Negative Binomial Distribution ◽

Binomial Distribution ◽

Negative Binomial ◽

Financial Institution ◽

Expected Value ◽

Insurance Companies ◽

Limited Data ◽

Loss Distribution Approach ◽

Potential Loss ◽

The Impact

Operational risk is defined as the risk of loss resulting from inadequate or failed internal processes or external problems. Insurance companies, as financial institutions, also face this risk. Operating losses in insurance companies have not been properly recorded, so data on operational losses are limited. In this work, operational loss data are taken from claim payments. In general, the number of insurance claims can be modelled using the Poisson distribution, for which the expected value of the claims equals the variance, or the negative binomial distribution, for which the expected value is less than the variance. The analysis tool used to measure the potential loss is the loss distribution approach with the aggregate method. In the aggregate method, loss data are grouped into a frequency distribution and a severity distribution. After 10,000 simulations, the total claim loss is obtained for each simulation as the sum of its individual claims. From these results, the value of the potential loss (OpVar) is then set at a given confidence level.
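
A minimal sketch of the aggregate method described above follows. The abstract does not state the fitted distributions or parameters, so the Poisson frequency, the lognormal severity and every numeric value here are assumptions chosen purely for illustration, as are the function names.

```python
import math
import random


def sample_poisson(rng, lam):
    """Draw a Poisson variate by Knuth's multiplication method (small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_total_losses(n_sims, lam, mu, sigma, seed=0):
    """Simulate total loss per period: a random number of random claim sizes."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        n_claims = sample_poisson(rng, lam)  # frequency draw
        # Severity draws (assumed lognormal), summed into one total loss.
        totals.append(sum(rng.lognormvariate(mu, sigma) for _ in range(n_claims)))
    return totals


def op_var(totals, level):
    """Empirical quantile of the simulated totals at the given confidence level."""
    ordered = sorted(totals)
    index = min(len(ordered) - 1, math.ceil(level * len(ordered)) - 1)
    return ordered[index]


totals = simulate_total_losses(10_000, lam=3.0, mu=0.0, sigma=1.0)
var_99 = op_var(totals, 0.99)
print(var_99)
```

Each simulated total is the sum of the individual claims in that run, and the OpVar is read off as a high quantile of the 10,000 totals; swapping in a negative binomial frequency would only change the sampler.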
