Objective Bayesian analysis of the 2 x 2 contingency table and the negative binomial distribution

2018
Author(s):
John Christian Snyder

In Bayesian analysis, the “objective” Bayesian approach seeks to select a prior distribution not by using (often subjective) scientific belief or by mathematical convenience, but rather by deriving it under pre-specified criteria. This approach takes the decision of prior selection out of the hands of the researcher. Ideally, for a given data model, we would like to have a prior which represents a "neutral" prior belief in the phenomenon we are studying. In categorical data analysis, the odds ratio is one of several measures used to quantify how strongly the presence or absence of one property is associated with the presence or absence of another property. In this project, we present a reference prior for the odds ratio of an unrestricted 2 x 2 table. Posterior simulation can be conducted without MCMC and is implemented on a GPU via the CUDA extensions for C. Simulation results indicate that the proposed approach to this problem is far superior to the widely used frequentist approaches that dominate this area. Real data examples also typically yield much more sensible results, especially for small sample sizes or for tables that contain zeros. An R package is also presented to allow for easy implementation of this methodology. Next, we develop an approximate reference prior for the negative binomial distribution, applying this methodology to a continuous parameterization often used for modeling over-dispersed count data as well as the typical discrete case. Results indicate that the developed prior matches the performance of the MLE in estimating the mean of the distribution but is far superior when estimating the dispersion parameter.
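
To make the idea of MCMC-free posterior simulation for a 2 x 2 odds ratio concrete, here is a minimal Python sketch. It does not implement the reference prior described in the abstract. As a stand-in it assumes two independent binomial rows, each with a Beta(1/2, 1/2) (Jeffreys) prior, so the posterior can be sampled directly. The function names `odds_ratio_posterior` and `summarize` are hypothetical.

```python
import random
import statistics

def odds_ratio_posterior(a, b, c, d, draws=20000, prior=0.5, seed=0):
    """Sample the posterior odds ratio for the table [[a, b], [c, d]].

    Assumption: independent binomial rows with Beta(prior, prior) priors
    (Jeffreys when prior=0.5). This is a stand-in, not the reference prior.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(draws):
        # If X ~ Gamma(alpha) and Y ~ Gamma(beta), then X / (X + Y) is
        # Beta(alpha, beta), so X / Y is a draw of the odds p / (1 - p).
        odds_row1 = rng.gammavariate(a + prior, 1.0) / rng.gammavariate(b + prior, 1.0)
        odds_row2 = rng.gammavariate(c + prior, 1.0) / rng.gammavariate(d + prior, 1.0)
        samples.append(odds_row1 / odds_row2)
    return samples

def summarize(samples, level=0.95):
    """Return the posterior median and an equal-tailed credible interval."""
    s = sorted(samples)
    lower = s[int((1 - level) / 2 * len(s))]
    upper = s[int((1 + level) / 2 * len(s)) - 1]
    return statistics.median(s), lower, upper

if __name__ == "__main__":
    # Hypothetical table with a zero cell, where the sample odds ratio is undefined.
    print(summarize(odds_ratio_posterior(5, 0, 1, 4)))
```

Because every draw is independent, the loop parallelizes trivially, which is consistent with the abstract's remark that posterior simulation can run on a GPU.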

Parasitology
2005
Vol 131 (3)
pp. 393-401
Author(s):
S. GABA
V. GINOT
J. CABARET

Macroparasites are almost always aggregated across their host populations, hence the Negative Binomial Distribution (NBD) with its exponent parameter k is widely used for modelling, quantifying or analysing parasite distributions. However, many studies have pointed out some drawbacks in the use of the NBD, with respect to the sensitivity of k to the mean number of parasites per host or the under-representation of the heavily infected hosts in the estimate of k. In this study, we compare the fit of the NBD with 4 other widely used distributions on observed parasitic gastrointestinal nematode distributions in their sheep host populations (11 datasets). Distributions were fitted to observed data by maximum likelihood and the best fits were selected using Akaike's Information Criterion (AIC). A simulation study was also conducted in order to assess the possible bias in parameter estimates, especially in the case of small sample sizes. We found that the NBD is seldom the best fit for gastrointestinal nematode distributions. The Weibull distribution was clearly more appropriate over a very wide range of degrees of aggregation, mainly because it was more flexible in fitting the heavily infected hosts. Moreover, the Weibull distribution estimates are less sensitive to sample size. Thus, when possible, we suggest carefully checking on observed data whether the NBD is appropriate before conducting any further analysis on parasite distributions.
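
The fit-then-compare workflow described above (maximum likelihood fits ranked by AIC) can be sketched in a few lines of Python. This sketch compares only the Poisson distribution and the NBD, not the full set of distributions used in the study. The counts are made-up illustrative data, the helper names are hypothetical, and the golden-section search over log(k) is just one simple way to maximize the profile likelihood.

```python
import math

def poisson_loglik(counts):
    """Poisson log-likelihood at its MLE (the sample mean)."""
    m = sum(counts) / len(counts)
    return sum(x * math.log(m) - m - math.lgamma(x + 1) for x in counts)

def nbd_loglik(counts, k):
    """NBD log-likelihood with the mean fixed at the sample mean (its MLE)."""
    m = sum(counts) / len(counts)
    return sum(
        math.lgamma(x + k) - math.lgamma(k) - math.lgamma(x + 1)
        + k * math.log(k / (k + m)) + x * math.log(m / (k + m))
        for x in counts
    )

def fit_nbd_k(counts, lo=-7.0, hi=7.0, iters=80):
    """Maximize the profile log-likelihood over t = log(k) by golden-section search."""
    g = (math.sqrt(5) - 1) / 2
    f = lambda t: nbd_loglik(counts, math.exp(t))
    a, b = lo, hi
    for _ in range(iters):
        c, d = b - g * (b - a), a + g * (b - a)
        if f(c) > f(d):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    return math.exp((a + b) / 2)

def aic(loglik, n_params):
    """Akaike's Information Criterion; lower is better."""
    return 2 * n_params - 2 * loglik

if __name__ == "__main__":
    # Hypothetical parasite counts per host (strongly aggregated).
    counts = [0, 0, 0, 0, 1, 1, 2, 3, 5, 8, 13, 40]
    k_hat = fit_nbd_k(counts)
    print("k =", k_hat)
    print("AIC Poisson =", aic(poisson_loglik(counts), 1))
    print("AIC NBD     =", aic(nbd_loglik(counts, k_hat), 2))
```

Other candidate distributions would be handled the same way: compute each maximized log-likelihood, then compare AIC values with the appropriate parameter counts.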


2020
Vol 70 (4)
pp. 917-934
Author(s):
Muhammad Mansoor
Muhammad Hussain Tahir
Gauss M. Cordeiro
Sajid Ali
Ayman Alzaatreh

Abstract A generalization of the Lindley distribution, namely the Lindley negative-binomial distribution, is introduced. The Lindley and the exponentiated Lindley distributions are considered as sub-models of the proposed distribution. The proposed model has flexible density and hazard rate functions. The density function can be decreasing, right-skewed, left-skewed or approximately symmetric. The hazard rate function can take various shapes, including increasing, decreasing and bathtub. Furthermore, the survival and hazard rate functions have closed-form representations, which make this model tractable for censored data analysis. Some general properties of the proposed model are studied, such as ordinary and incomplete moments, the moment generating function, mean deviations, and Lorenz and Bonferroni curves. The maximum likelihood and Bayesian estimation methods are used to estimate the model parameters. In addition, a small simulation study is conducted in order to evaluate the performance of the estimation methods. Two real data sets are used to illustrate the applicability of the proposed model.


Author(s):  
R. Ashly
C. S. Rajitha

The objective of this paper is to introduce a new two-parameter mixed negative binomial distribution, namely the negative binomial-improved second degree Lindley (NB-ISL) distribution. This distribution is obtained by mixing the negative binomial distribution with the improved second degree Lindley distribution. Many mixed distributions have been used in the literature for modeling overdispersed count data, and they provide a better fit than the Poisson and negative binomial distributions. In addition, we present the basic statistical properties of the new distribution, such as factorial moments, mean and variance, and discuss the behavior of the mean, variance and coefficient of variation. Parameters are estimated by maximum likelihood. The performance of the NB-ISL distribution is shown in practice by applying it to a real data set and comparing it with some well-known count distributions. The results show that the negative binomial-improved second degree Lindley distribution provides a better fit than the Poisson, negative binomial and negative binomial-Lindley distributions.


2017
Author(s):  
Qingyang Zhang

Abstract RNA-sequencing (RNA-Seq) has become a preferred option to quantify gene expression, because it is more accurate and reliable than microarrays. In RNA-Seq experiments, the expression level of a gene is measured by the count of short reads that are mapped to the gene region. Although some normal-based statistical methods may also be applied to log-transformed read counts, they are not ideal for directly modeling RNA-Seq data. Two discrete distributions, the Poisson distribution and the negative binomial distribution, have been commonly used in the literature to model RNA-Seq data, where the latter is a natural extension of the former that allows for overdispersion. Due to the technical difficulty in modeling correlated counts, most existing classifiers based on discrete distributions assume that genes are independent of each other. However, as we show in this paper, the independence assumption may cause non-ignorable bias in estimating the discriminant score, making the classification inaccurate. We therefore drop the independence assumption and explicitly model the dependence between genes using a Gaussian copula. We apply a Bayesian approach to estimate the covariance matrix and the overdispersion parameter of the negative binomial distribution. Both synthetic data and real data are used to demonstrate the advantages of our model.


2020
Vol 4 (3)
pp. 484-497
Author(s):
Puput Cahya Ambarwati
Indahwati Indahwati
Muhammad Nur Aidi

Geographically weighted regression (GWR) is one of the regression methods for spatial data. When the response variable follows the Poisson distribution, geographically weighted Poisson regression (GWPR) can be used. GWPR often fails to satisfy the equidispersion assumption. The classic approach to overcoming overdispersion in Poisson models is a mixture of the Poisson and gamma distributions, which yields the negative binomial distribution. When the response variable follows the negative binomial distribution, geographically weighted negative binomial regression (GWNBR) can be used. The data used in this study are simulated data and real data. The simulation results indicate that GWPR still models the data adequately only when the overdispersion is close to 1, judged by the number of significant results and the average p-value. For the real data, GWNBR is the best model for the overdispersed case of malnourished children in East Java Province in 2017 compared to GWPR, based on a comparison of AIC values.
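
The link between the Poisson-gamma mixture and the negative binomial distribution can be written out explicitly. The notation below (mean \mu, dispersion k) is assumed for illustration and is not taken from the paper.

```latex
% Assumed notation: Y is the count, \lambda its latent Poisson rate,
% \mu > 0 the mean and k > 0 the dispersion parameter.
\begin{align*}
Y \mid \lambda &\sim \mathrm{Poisson}(\lambda), \qquad
\lambda \sim \mathrm{Gamma}\!\left(\text{shape } k,\ \text{scale } \mu/k\right), \\
P(Y = y) &= \int_0^\infty \frac{e^{-\lambda}\lambda^{y}}{y!}\,
            \frac{(k/\mu)^{k}}{\Gamma(k)}\,\lambda^{k-1} e^{-k\lambda/\mu}\, d\lambda
          = \frac{\Gamma(y + k)}{\Gamma(k)\, y!}
            \left(\frac{k}{k + \mu}\right)^{k}
            \left(\frac{\mu}{k + \mu}\right)^{y}, \\
\mathrm{E}[Y] &= \mu, \qquad
\mathrm{Var}[Y] = \mu + \frac{\mu^{2}}{k} > \mu .
\end{align*}
```

As k grows, the variance approaches \mu and the model reduces to the Poisson case. A finite k gives a variance larger than the mean, which is the overdispersion the Poisson model cannot capture.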


2016
Vol 5 (1)
pp. 53-65
Author(s):
Abdullahi Yusuf
Badamasi Bashir Mikail
Aliyu Isah Aliyu
Abdurrahaman L. Sulaiman

2019
Vol 53 (5)
pp. 417-422
Author(s):
P. De los Ríos
E. Ibáñez Arancibia

Abstract The coastal marine ecosystems of Easter Island have been poorly studied, and the main studies were isolated species records based on scientific expeditions. The aim of the present study is to apply a spatial distribution analysis and a niche-sharing null model to published data on intertidal marine gastropods and decapods from the rocky shore of Easter Island, based on field work in 2010 and on published information from the CIMAR cruise in 2004. The field data revealed the presence of the decapods Planes minutus (Linnaeus, 1758) and Leptograpsus variegatus (Fabricius, 1793), as well as the gastropods Nodilittorina pyramidalis pascua Rosewater, 1970 and Nerita morio (G. B. Sowerby I., 1833). The available information revealed more species in the data collected in 2004 than in the data collected in 2010, with one species markedly dominant over the others. The spatial distribution of the species reported in the field work revealed that P. minutus and N. morio had an aggregated pattern and a negative binomial distribution, L. variegatus had a uniform pattern with a binomial distribution, and N. pyramidalis pascua, in spite of its aggregated distribution pattern, did not follow a negative binomial distribution. Finally, the results of the null model revealed that the reported species did not share an ecological niche, owing to the absence of competition. The results agree with other similar information about the littoral and sub-littoral fauna of Easter Island.


2011
Vol 10 (2)
pp. 1
Author(s):
Y. ARBI
R. BUDIARTI
I G. P. PURNABA

Operational risk is defined as the risk of loss resulting from inadequate or failed internal processes or external problems. Insurance companies, as financial institutions, also face this risk. Operating losses in insurance companies have not been properly recorded, so data on operational losses are limited. In this work, operational loss data are taken from claim payments. In general, the number of insurance claims can be modelled using the Poisson distribution, in which the expected number of claims equals the variance, or the negative binomial distribution, in which the expected value is less than the variance. The tool used to measure the potential loss is the loss distribution approach with the aggregate method. In the aggregate method, loss data are grouped into a frequency distribution and a severity distribution. After 10,000 simulations, the total claim loss is obtained as the sum of the individual claims in each simulation. The value of potential loss (OpVar) is then set from these results at a given confidence level.
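
The aggregate method outlined above can be illustrated with a short Monte Carlo sketch in Python. The abstract does not name a severity distribution or parameter values, so the lognormal severity, the Poisson frequency, the parameters `lam`, `mu`, `sigma` and the function names are all assumptions made for illustration.

```python
import math
import random

def poisson_draw(rng, lam):
    """Draw a Poisson(lam) count with Knuth's method (fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def simulate_total_losses(lam, mu, sigma, n_sims=10000, seed=1):
    """Simulate total loss per period: frequency draw, then sum of severities.

    Assumptions: Poisson(lam) claim counts and lognormal(mu, sigma) claim sizes.
    """
    rng = random.Random(seed)
    totals = []
    for _ in range(n_sims):
        n_claims = poisson_draw(rng, lam)
        totals.append(sum(rng.lognormvariate(mu, sigma) for _ in range(n_claims)))
    return totals

def op_var(totals, confidence=0.99):
    """Empirical quantile of simulated total losses at the given confidence level."""
    s = sorted(totals)
    idx = min(len(s) - 1, math.ceil(confidence * len(s)) - 1)
    return s[idx]

if __name__ == "__main__":
    totals = simulate_total_losses(lam=5.0, mu=0.0, sigma=0.5)
    print("OpVar 95%:", op_var(totals, 0.95))
    print("OpVar 99%:", op_var(totals, 0.99))
```

Swapping the frequency draw for a negative binomial one would accommodate claim counts whose variance exceeds their mean.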

