count data
Recently Published Documents





Y. Gevrekçi ◽  
Ö.İ. Güneri ◽  
Ç. Takma ◽  
A. Yeşilova

Background: The objective of this study is to compare different count data models for stillbirth data. For this type of data, Poisson regression or alternative models may be preferred. Methods: Poisson, negative binomial, zero-inflated Poisson, zero-inflated negative binomial, Poisson-logit hurdle and negative binomial-logit hurdle regressions were compared and used to examine the effects of the independent variables gender, parity and herd-year-season on stillbirth. The log-likelihood statistic, Akaike Information Criterion, Bayesian Information Criterion and rootograms were used as criteria for comparing model performance. According to these criteria, the negative binomial-logit hurdle regression model was chosen as the best model. Result: The parameter estimates obtained with the negative binomial-logit hurdle regression model for the effects of gender, parity and herd-year-season on stillbirth were significant (p < 0.01). Stillbirth incidence was higher in males than in females and decreased as parity increased. In conclusion, the negative binomial-logit hurdle model was the best model for overdispersed stillbirth count data.
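The comparison described in this abstract can be illustrated with a minimal sketch, assuming an intercept-only setting with no covariates and a Poisson (rather than negative binomial) count part. The data, function names and parameter names below are hypothetical and are not taken from the study. In a hurdle model, zeros are handled by a binary part and positive counts by a zero-truncated count distribution; AIC and BIC then penalize the log-likelihood by the number of parameters.

```python
import math

def poisson_pmf(y, mu):
    """Poisson probability P(Y = y) for mean mu > 0."""
    return math.exp(y * math.log(mu) - mu - math.lgamma(y + 1))

def hurdle_poisson_pmf(y, p_zero, mu):
    """Poisson hurdle pmf: p_zero is P(Y = 0) (the logit part in a fitted
    model); mu is the mean of the untruncated Poisson for positive counts."""
    if y == 0:
        return p_zero
    # Positive counts follow a zero-truncated Poisson scaled by 1 - p_zero.
    return (1.0 - p_zero) * poisson_pmf(y, mu) / (1.0 - poisson_pmf(0, mu))

def truncated_poisson_mle(mean_positive, iters=200):
    """Solve mean_positive = mu / (1 - exp(-mu)) by fixed-point iteration
    (requires mean_positive > 1)."""
    mu = mean_positive
    for _ in range(iters):
        mu = mean_positive * (1.0 - math.exp(-mu))
    return mu

def aic(loglik, k):
    """Akaike Information Criterion for k estimated parameters."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian Information Criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * loglik

# Hypothetical counts with many zeros (illustration only).
counts = [0, 0, 0, 0, 0, 0, 1, 2, 1, 3]
n = len(counts)

# Plain Poisson: the maximum likelihood estimate of mu is the sample mean.
mu_pois = sum(counts) / n
ll_pois = sum(math.log(poisson_pmf(y, mu_pois)) for y in counts)

# Hurdle Poisson: the zero part and the positive part are estimated separately.
positives = [y for y in counts if y > 0]
p_zero_hat = 1.0 - len(positives) / n
mu_hurdle = truncated_poisson_mle(sum(positives) / len(positives))
ll_hurdle = sum(math.log(hurdle_poisson_pmf(y, p_zero_hat, mu_hurdle)) for y in counts)

print("Poisson AIC/BIC:", aic(ll_pois, 1), bic(ll_pois, 1, n))
print("Hurdle  AIC/BIC:", aic(ll_hurdle, 2), bic(ll_hurdle, 2, n))
```

Lower AIC or BIC indicates the preferred model. The hurdle model always fits at least as well in log-likelihood because it nests the Poisson, so the criteria decide whether the extra parameter is worth it.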

2022 ◽  
Vol 10 (4) ◽  
pp. 488-498
Yashmine Noor Islami ◽  
Dwi Ispriyanti ◽  
Puspita Kartikasari

Infant mortality (0-11 months) and maternal mortality (during pregnancy, childbirth, and postpartum) are significant indicators of the level of public health. Central Java Province, which has 35 regencies/cities, is among the five regions with the highest numbers of infant and maternal deaths in Indonesia. The numbers of infant and maternal deaths are count data. Therefore, Poisson regression can be used to analyze the factors that influence them. Poisson regression assumes equidispersion, that is, the variance equals the mean. Frequently, the variance of count data is greater than the mean, which is known as overdispersion. In this research, bivariate negative binomial regression is used to overcome the problem of overdispersion in Poisson regression. This method produces a global model. In reality, the geographical, socio-cultural, and economic conditions of each region differ. This reflects spatial heterogeneity, so the model needs to be developed into Geographically Weighted Negative Binomial Bivariate Regression (GWNBBR). The GWNBBR model weights observations based on the position of, or distance between, one observation area and another. Significant variables for modeling infant mortality cases included the percentage of obstetric complications treated (X1), the percentage of infants who were exclusively breastfed (X3), and the percentage of poor people (X5). The significant variable for modeling maternal mortality cases is the percentage of poor people (X5). Based on the AIC value, the GWNBBR model is better than the bivariate negative binomial regression model because it has a smaller AIC value.
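A quick, informal check for the overdispersion mentioned in this abstract is the ratio of the sample variance to the sample mean, which is close to 1 under a Poisson model. The sketch below uses made-up counts, not the Central Java data, and the function name is an assumption for illustration.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Sample variance divided by sample mean (about 1 under equidispersion)."""
    return variance(counts) / mean(counts)

# Hypothetical regional death counts, for illustration only.
deaths = [3, 0, 7, 1, 12, 2, 0, 9]
index = dispersion_index(deaths)

# An index well above 1 suggests overdispersion relative to the Poisson model.
print(f"dispersion index = {index:.2f}")
```

This ratio ignores covariates, so it is only a first look; a regression-based check on the fitted model is the usual next step.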

Razik Ridzuan Mohd Tajuddin ◽  
Noriszura Ismail ◽  
Kamarulzaman Ibrahim ◽  
Shaiful Anuar Abu Bakar

Stats ◽  
2022 ◽  
Vol 5 (1) ◽  
pp. 70-88
Johannes Ferreira ◽  
Ané van der Merwe

This paper proposes a previously unconsidered generalization of the Lindley distribution by allowing for a measure of noncentrality. Essential structural characteristics are investigated and derived in explicit and tractable forms, and the estimability of the model is illustrated via the fit of this developed model to real data. Subsequently, this model is used as a candidate for the parameter of a Poisson model, which allows for departure from the usual equidispersion restriction that the Poisson offers when modelling count data. This Poisson-noncentral Lindley is also systematically investigated and characteristics are derived. The value of this count model is illustrated and implemented as the count error distribution in an integer autoregressive environment, and juxtaposed against other popular models. The effect of the systematically-induced noncentrality parameter is illustrated and paves the way for future flexible modelling not only as a standalone contender in continuous Lindley-type scenarios but also in discrete and discrete time series scenarios when the often-encountered equidispersed assumption is not adhered to in practical data environments.
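Why placing a distribution on the Poisson parameter relaxes equidispersion can be seen from the law of total variance. The derivation below is a general sketch for any mixing distribution, written here as $G$ (a placeholder symbol, not notation from the paper); in the paper's setting $G$ would be the noncentral Lindley distribution.

```latex
\begin{aligned}
Y \mid \lambda &\sim \mathrm{Poisson}(\lambda), \qquad \lambda \sim G, \\
\mathrm{E}[Y] &= \mathrm{E}\bigl[\mathrm{E}[Y \mid \lambda]\bigr] = \mathrm{E}[\lambda], \\
\mathrm{Var}(Y) &= \mathrm{E}\bigl[\mathrm{Var}(Y \mid \lambda)\bigr]
                 + \mathrm{Var}\bigl(\mathrm{E}[Y \mid \lambda]\bigr)
                 = \mathrm{E}[\lambda] + \mathrm{Var}(\lambda) \;\ge\; \mathrm{E}[Y].
\end{aligned}
```

Equality holds only when $\lambda$ is constant, so any non-degenerate mixing distribution yields an overdispersed count model.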

Stats ◽  
2022 ◽  
Vol 5 (1) ◽  
pp. 52-69
Darcy Steeg Morris ◽  
Kimberly F. Sellers

Clustered count data are commonly modeled using Poisson regression with random effects to account for the correlation induced by clustering. The Poisson mixed model allows for overdispersion via the nature of the within-cluster correlation; however, departures from equidispersion may also exist due to the underlying count process mechanism. We study the cross-sectional COM-Poisson regression model (a generalized regression model for count data in light of data dispersion) together with random effects for analysis of clustered count data. We demonstrate model flexibility of the COM-Poisson random intercept model, including choice of the random effect distribution, via simulated and real data examples. We find that COM-Poisson mixed models provide comparable model fit to well-known mixed models for associated special cases of clustered discrete data, and result in improved model fit for data with intermediate levels of over- or underdispersion in the count mechanism. Accordingly, the proposed models are useful for capturing dispersion not consistent with commonly used statistical models, and also serve as a practical diagnostic tool.
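For readers unfamiliar with the COM-Poisson distribution, the following is a minimal sketch of its probability mass function, P(Y = y) proportional to lam^y / (y!)^nu, with the normalizing constant approximated by a truncated sum. It covers only the marginal distribution, without the regression structure or random effects studied in the paper; the function names and the truncation point `max_terms` are assumptions made for illustration.

```python
import math

def _log_weights(lam, nu, max_terms):
    """Unnormalized log-probabilities j*log(lam) - nu*log(j!) for j < max_terms."""
    return [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(max_terms)]

def com_poisson_pmf(y, lam, nu, max_terms=200):
    """COM-Poisson P(Y = y); the infinite normalizing sum is truncated."""
    logw = _log_weights(lam, nu, max_terms)
    peak = max(logw)  # subtract the peak for numerical stability
    log_z = peak + math.log(sum(math.exp(w - peak) for w in logw))
    return math.exp(y * math.log(lam) - nu * math.lgamma(y + 1) - log_z)

def com_poisson_moments(lam, nu, max_terms=200):
    """Mean and variance computed from the truncated pmf."""
    probs = [com_poisson_pmf(y, lam, nu, max_terms) for y in range(max_terms)]
    mean = sum(y * p for y, p in enumerate(probs))
    var = sum(y * y * p for y, p in enumerate(probs)) - mean ** 2
    return mean, var

# nu = 1 recovers the Poisson; nu < 1 gives overdispersion, nu > 1 underdispersion.
for nu in (0.5, 1.0, 2.0):
    m, v = com_poisson_moments(2.0, nu)
    print(f"nu={nu}: mean={m:.3f}, variance={v:.3f}")
```

The truncation at 200 terms is adequate for the small `lam` values used here; larger `lam` or very small `nu` would need more terms.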

2022 ◽  
Mahsa Nadifar ◽  
Hossein Baghishani ◽  
Afshin Fallah

2022 ◽  
Vol 7 (2) ◽  
pp. 1726-1741
Ahmed Sedky Eldeeb ◽  
Muhammad Ahsan-ul-Haq ◽  
Mohamed. S. Eliwa ◽  

In this paper, a flexible probability mass function is proposed for modeling count data, especially asymmetric and over-dispersed observations. Some of its distributional properties are investigated. It is found that all its statistical and reliability properties can be expressed in explicit forms, which makes the proposed model useful in time series and regression analysis. Different estimation approaches, including maximum likelihood, moments, least squares, Anderson-Darling, Cramér-von Mises, and maximum product of spacing estimators, are derived to get the best estimator for the real data. The performance of these estimation techniques is assessed via a comprehensive simulation study. The flexibility of the new discrete distribution is assessed using four distinctive real data sets: coronavirus, flood peaks, forest fire and leukemia. Finally, the new probabilistic model can serve as an alternative distribution to other competitive distributions available in the literature for modeling count data.

Osval Antonio Montesinos López ◽  
Abelardo Montesinos López ◽  
Jose Crossa

In this chapter, we explain, under a Bayesian framework, the fundamentals and practical issues for implementing genomic prediction models for categorical and count traits. First, we derive the Bayesian ordinal model and exemplify it with plant breeding data. These examples were implemented in the library BGLR. We also derive the ordinal logistic regression. The fundamentals and practical issues of penalized multinomial logistic regression and penalized Poisson regression are given including several examples illustrating the use of the glmnet library. All the examples include main effects of environments and genotypes as well as the genotype × environment interaction term.
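To make "penalized Poisson regression" concrete, here is a minimal sketch of the objective such a fit minimizes: the average Poisson negative log-likelihood (dropping the constant log(y!) term) plus an elastic-net penalty. The function and argument names are hypothetical, this is plain Python rather than the glmnet library, and it is an assumption that this form matches the chapter's exact settings.

```python
import math

def penalized_poisson_loss(beta0, beta, X, y, lam, alpha):
    """Average Poisson negative log-likelihood (log link) plus an
    elastic-net penalty on the slopes; the intercept is not penalized.

    alpha = 1 gives the lasso penalty, alpha = 0 the ridge penalty.
    """
    n = len(y)
    nll = 0.0
    for xi, yi in zip(X, y):
        eta = beta0 + sum(b * x for b, x in zip(beta, xi))  # linear predictor
        nll -= yi * eta - math.exp(eta)  # log(y!) is constant and omitted
    ridge = sum(b * b for b in beta)
    lasso = sum(abs(b) for b in beta)
    return nll / n + lam * ((1.0 - alpha) / 2.0 * ridge + alpha * lasso)

# Hypothetical toy data: two observations, two predictors.
X = [[1.0, 0.0], [0.0, 1.0]]
y = [1, 2]
print(penalized_poisson_loss(0.0, [0.0, 0.0], X, y, lam=0.5, alpha=1.0))
```

Increasing `lam` shrinks the slopes toward zero, and with `alpha` near 1 some slopes become exactly zero, which is what makes the approach usable when predictors outnumber observations.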
