Comparison of Different Count Models for Investigation of Some Environmental Factors Affecting Stillbirth in Holsteins

Background: The objective of this study is to compare different count data models for stillbirth data. Poisson regression or alternative models can be used to model this type of data. Methods: Poisson, negative binomial, zero-inflated Poisson, zero-inflated negative binomial, Poisson-logit hurdle and negative binomial-logit hurdle regressions were compared and used to examine the effects of the independent variables gender, parity and herd-year-season on stillbirth. The log-likelihood statistic, Akaike Information Criterion, Bayesian Information Criterion and rootograms were used as criteria for comparing model performance. According to these criteria, the negative binomial-logit hurdle regression model was chosen as the best model. Result: The parameter estimates obtained from the negative binomial-logit hurdle regression model for the effects of gender, parity and herd-year-season on stillbirth were significant (p < 0.01). Stillbirth incidence was higher in males than in females and decreased as parity increased. In conclusion, the negative binomial-logit hurdle model was found to be the best model for overdispersed stillbirth count data.
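The comparison by log-likelihood, AIC and BIC described in this abstract can be illustrated with a minimal sketch. It is not the authors' analysis: it fits intercept-only Poisson and negative binomial models (no covariates), uses a coarse grid search for the dispersion parameter, and the `counts` values are hypothetical.

```python
import math

def poisson_loglik(counts, lam):
    """Log-likelihood of i.i.d. Poisson(lam) counts."""
    return sum(y * math.log(lam) - lam - math.lgamma(y + 1) for y in counts)

def negbin_loglik(counts, mu, r):
    """Log-likelihood of i.i.d. negative binomial counts
    with mean mu and variance mu + mu**2 / r."""
    p = r / (r + mu)
    return sum(
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(p) + y * math.log(1 - p)
        for y in counts
    )

def aic(loglik, k):
    """Akaike Information Criterion for k estimated parameters."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian Information Criterion for k parameters and n observations."""
    return k * math.log(n) - 2 * loglik

# Hypothetical overdispersed counts with many zeros (not data from the study).
counts = [0, 0, 0, 0, 0, 0, 1, 0, 0, 2, 0, 0, 0, 5, 0, 1, 0, 0, 0, 7]
n = len(counts)
mu_hat = sum(counts) / n  # the sample mean is the MLE of the mean in both models

ll_pois = poisson_loglik(counts, mu_hat)

# Assumption: a coarse grid search stands in for a proper optimiser.
grid = [0.05 * i for i in range(1, 401)]
r_hat = max(grid, key=lambda r: negbin_loglik(counts, mu_hat, r))
ll_nb = negbin_loglik(counts, mu_hat, r_hat)

print("Poisson: AIC=%.2f BIC=%.2f" % (aic(ll_pois, 1), bic(ll_pois, 1, n)))
print("NegBin:  AIC=%.2f BIC=%.2f" % (aic(ll_nb, 2), bic(ll_nb, 2, n)))
```

Lower AIC and BIC indicate the preferred model; for overdispersed counts such as these, the extra dispersion parameter of the negative binomial typically pays for itself.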


Modeling the Frequency of Auto Insurance Claims by Means of Poisson and Negative Binomial Models

Annals of the Alexandru Ioan Cuza University - Economics ◽

10.1515/aicue-2015-0011 ◽

2015 ◽

Vol 62 (2) ◽

pp. 151-168 ◽

Cited By ~ 3

Author(s):

Mihaela David ◽

Dănuţ-Vasile Jemna

Keyword(s):

Risk Factors ◽

Count Data ◽

Negative Binomial ◽

Information Criteria ◽

Count Data Models ◽

Auto Insurance ◽

Insurance Portfolio ◽

Negative Binomial Models ◽

Degree Of Risk ◽

Binomial Models

Abstract: Within non-life insurance pricing, an accurate evaluation of claim frequency, also known in theory as count data, is an essential part of determining an insurance premium according to the policyholder's degree of risk. Count regression analysis allows the identification of the risk factors and the prediction of the expected frequency of claims given the characteristics of policyholders. The aim of this paper is to verify several hypotheses related to the methodology of count data models and to the risk factors used to explain the frequency of claims. In addition to the standard Poisson regression, Negative Binomial models are applied to a French auto insurance portfolio. The best model was chosen by means of the log-likelihood ratio and the information criteria. Based on this model, the profile of the policyholders with the highest degree of risk is determined.


Spatio-temporal modelling of tick life-stage count data with spatially varying coefficients

Geospatial health ◽

10.4081/gh.2021.1004 ◽

2021 ◽

Vol 16 (2) ◽

Author(s):

Thabo Lephoto ◽

Henry Mwambi ◽

Oliver Bodhlyera ◽

Holly Gaff

Keyword(s):

Count Data ◽

Negative Binomial ◽

Life Stage ◽

Poisson Model ◽

Information Criteria ◽

Count Data Models ◽

Varying Coefficients ◽

Time Interaction ◽

York County ◽

Spatio Temporal

There is a vast amount of geo-referenced data in many fields of study, including ecological studies. Geo-referencing is usually done by point referencing (that is, latitudes and longitudes) or by areal referencing, which includes districts, counties, states, provinces and other administrative units. The availability of large geo-referenced datasets for modelling has necessitated the development and application of spatial statistical methods. However, spatially varying coefficient models exploring the abundance of tick counts remain limited. In this study we used data that were collected and prepared by researchers in the Department of Biological Sciences at Old Dominion University, Virginia, USA. We modelled tick life-stage counts and abundance variability from 12 sampling locations, with 5 different habitats (numbered 1-5) and three habitat types (woods, edges and grass), collected monthly from May 2009 through December 2018. Spatio-temporal Poisson and spatio-temporal negative binomial (NB) count data models were fitted to the data and compared using the deviance information criterion (DIC). The NB model outperformed the Poisson model, with all its DIC values being smaller than those of the Poisson model. Results showed that the covariate effects varied spatially across counties. There was a decreasing time (in years) effect over the study period. However, even though the time effect was decreasing over the study period, space-time interaction effects were seen to be increasing over time in York County.


The Effect of Sample Size on the Efficiency of Count Data Models: Application to Marriage Data

Journal of Economics and Behavioral Studies ◽

10.22610/jebs.v9i3.1742 ◽

2017 ◽

Vol 9 (3) ◽

pp. 6

Author(s):

Volition Tlhalitshi Montshiwa ◽

Ntebogang Dinah Moroke

Keyword(s):

Regression Model ◽

Sample Size ◽

Count Data ◽

Negative Binomial ◽

Information Criterion ◽

Data Models ◽

Hurdle Model ◽

Negative Binomial Regression Model ◽

Count Data Models ◽

Over Dispersion

Abstract: Sample size requirements are common in many multivariate analysis techniques as one of the measures taken to ensure the robustness of such techniques, but such requirements have not been of interest in the area of count data models. As such, this study investigated the effect of sample size on the efficiency of six commonly used count data models, namely: Poisson regression model (PRM), Negative binomial regression model (NBRM), Zero-inflated Poisson (ZIP), Zero-inflated negative binomial (ZINB), Poisson Hurdle model (PHM) and Negative binomial hurdle model (NBHM). The data used in this study were sourced from Data First and were collected by Statistics South Africa through the Marriage and Divorce database. PRM, NBRM, ZIP, ZINB, PHM and NBHM were applied to ten randomly selected samples ranging from 4392 to 43916 and differing by 10% in size. The six models were compared using the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Vuong’s test for over-dispersion, McFadden RSQ, Mean Square Error (MSE) and Mean Absolute Deviation (MAD). The results revealed that, generally, the Negative Binomial-based models outperformed Poisson-based models. However, the results did not reveal the effect of sample size variations on the efficiency of the models, since there was no consistency in the change in AIC, BIC, Vuong’s test for over-dispersion, McFadden RSQ, MSE and MAD as the sample size increased.
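Three of the fit measures named in this abstract have simple closed forms, sketched below. This is an illustration under assumptions, not the study's code: MAD is taken here to mean the mean absolute difference between observed and fitted counts, McFadden's pseudo-R-squared is computed against an intercept-only (null) log-likelihood, and all function names and numbers are hypothetical.

```python
def mcfadden_r2(loglik_model, loglik_null):
    """McFadden's pseudo-R-squared: 1 - lnL(model) / lnL(null)."""
    return 1.0 - loglik_model / loglik_null

def mean_squared_error(observed, fitted):
    """Average squared difference between observed and fitted counts."""
    return sum((o - f) ** 2 for o, f in zip(observed, fitted)) / len(observed)

def mean_absolute_deviation(observed, fitted):
    """Average absolute difference between observed and fitted counts
    (assumed meaning of MAD in this context)."""
    return sum(abs(o - f) for o, f in zip(observed, fitted)) / len(observed)

# Hypothetical observed counts and fitted means from some count model.
observed = [1, 2, 3]
fitted = [1.0, 2.0, 5.0]

print(mcfadden_r2(-50.0, -100.0))
print(mean_squared_error(observed, fitted))
print(mean_absolute_deviation(observed, fitted))
```

Computing these on each subsample and tabulating them against sample size is one way to look for the kind of trend the study searched for.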


The Effect of Sample Size on the Efficiency of Count Data Models: Application to Marriage Data

Journal of Economics and Behavioral Studies ◽

10.22610/jebs.v9i3(j).1742 ◽

2017 ◽

Vol 9 (3(J)) ◽

pp. 6-18

Author(s):

Volition Tlhalitshi Montshiwa ◽

Ntebogang Dinah Moroke

Keyword(s):

Regression Model ◽

Sample Size ◽

Count Data ◽

Negative Binomial ◽

Information Criterion ◽

Data Models ◽

Hurdle Model ◽

Negative Binomial Regression Model ◽

Count Data Models ◽

Over Dispersion

Abstract: Sample size requirements are common in many multivariate analysis techniques as one of the measures taken to ensure the robustness of such techniques, but such requirements have not been of interest in the area of count data models. As such, this study investigated the effect of sample size on the efficiency of six commonly used count data models, namely: Poisson regression model (PRM), Negative binomial regression model (NBRM), Zero-inflated Poisson (ZIP), Zero-inflated negative binomial (ZINB), Poisson Hurdle model (PHM) and Negative binomial hurdle model (NBHM). The data used in this study were sourced from Data First and were collected by Statistics South Africa through the Marriage and Divorce database. PRM, NBRM, ZIP, ZINB, PHM and NBHM were applied to ten randomly selected samples ranging from 4392 to 43916 and differing by 10% in size. The six models were compared using the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Vuong’s test for over-dispersion, McFadden RSQ, Mean Square Error (MSE) and Mean Absolute Deviation (MAD). The results revealed that, generally, the Negative Binomial-based models outperformed Poisson-based models. However, the results did not reveal the effect of sample size variations on the efficiency of the models, since there was no consistency in the change in AIC, BIC, Vuong’s test for over-dispersion, McFadden RSQ, MSE and MAD as the sample size increased.


Statistical models for analyzing count data: predictors of length of stay among HIV patients in Portugal using a multilevel model

BMC Health Services Research ◽

10.1186/s12913-021-06389-1 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Ahmed Nabil Shaaban ◽

Bárbara Peleteiro ◽

Maria Rosario O. Martins

Keyword(s):

Length Of Stay ◽

Regression Model ◽

Random Effects ◽

Count Data ◽

Negative Binomial ◽

Negative Binomial Regression ◽

Comprehensive Approach ◽

Negative Binomial Regression Model ◽

Hiv Patients ◽

Binomial Regression

Background: This study offers a comprehensive approach to precisely analyze the complexly distributed length of stay among HIV admissions in Portugal. Objective: To provide an illustration of statistical techniques for analysing count data using longitudinal predictors of length of stay among HIV hospitalizations in Portugal. Method: A total of 26,505 discharges registered in Portuguese National Health Service (NHS) facilities between January 2009 and December 2017, classified under the Major Diagnostic Category (MDC) created for patients with HIV infection, with HIV/AIDS as a main or secondary cause of admission, were used to predict length of stay among HIV hospitalizations in Portugal. Several strategies were applied to select the best-fitting count model from among the Poisson regression model, the zero-inflated Poisson model, the negative binomial regression model, and the zero-inflated negative binomial regression model. A random hospital effects term was incorporated into the negative binomial model to examine the dependence between observations within the same hospital. A multivariable analysis was performed to assess the effect of covariates on length of stay. Results: The median length of stay in our study was 11 days (interquartile range: 6–22). Statistical comparisons among the count models revealed that the random-effects negative binomial models provided the best fit to the observed data. Admissions among males or admissions associated with TB infection, pneumocystis, cytomegalovirus, candidiasis, toxoplasmosis, or mycobacterium disease exhibited a highly significant increase in length of stay. Perfect trends were observed in which a higher number of diagnoses or procedures led to a significantly higher length of stay. The random-effects term included in our model, which refers to unexplained factors specific to each hospital, revealed obvious differences in quality among the hospitals included in our study.
Conclusions: This study provides a comprehensive approach to address unique problems associated with the prediction of length of stay among HIV patients in Portugal.


Beta-binomial models for meta-analysis with binary outcomes: Variations, extensions, and additional insights from econometrics

Research Methods in Medicine & Health Sciences ◽

10.1177/2632084321996225 ◽

2021 ◽

pp. 263208432199622

Author(s):

Tim Mathes ◽

Oliver Kuss

Keyword(s):

Simulation Study ◽

Count Data ◽

Negative Binomial ◽

Meta Analysis ◽

Negative Binomial Regression ◽

Binary Outcomes ◽

Small Scale ◽

Panel Count Data ◽

Count Data Models ◽

Meta Analyses

Background: Meta-analysis of systematically reviewed studies on interventions is the cornerstone of evidence-based medicine. In the following, we introduce the common-beta beta-binomial (BB) model for meta-analysis with binary outcomes and elucidate its equivalence to panel count data models. Methods: We present a variation of the standard “common-rho” BB (BBST model) for meta-analysis, namely a “common-beta” BB model. This model has an interesting connection to fixed-effect negative binomial regression models (FE-NegBin) for panel count data. Using this equivalence, it is possible to estimate an extension of the FE-NegBin with an additional multiplicative overdispersion term (RE-NegBin), while preserving a closed-form likelihood. An advantage of the connection to econometric models is that the models can be easily implemented, because “standard” statistical software for panel count data can be used. We illustrate the methods with two real-world example datasets. Furthermore, we show the results of a small-scale simulation study that compares the new models to the BBST. The input parameters of the simulation were informed by meta-analyses that had actually been performed. Results: In both example datasets, the NegBin models, in particular the RE-NegBin, showed a smaller effect and had narrower 95% confidence intervals. In our simulation study, median bias was negligible for all methods, but the upper quartile for median bias suggested that BBST is most affected by positive bias. Regarding coverage probability, BBST and the RE-NegBin model outperformed the FE-NegBin model. Conclusion: For meta-analyses with binary outcomes, the considered common-beta BB models may be valuable extensions to the family of BB models.


A Combined PLS and Negative Binomial Regression Model for Inferring Association Networks from Next-Generation Sequencing Count Data

IEEE/ACM Transactions on Computational Biology and Bioinformatics ◽

10.1109/tcbb.2017.2665495 ◽

2018 ◽

Vol 15 (3) ◽

pp. 760-773 ◽

Cited By ~ 2

Author(s):

Maiju Pesonen ◽

Jaakko Nevalainen ◽

Steven Potter ◽

Somnath Datta ◽

Susmita Datta

Keyword(s):

Next Generation Sequencing ◽

Regression Model ◽

Count Data ◽

Negative Binomial ◽

Negative Binomial Regression ◽

Negative Binomial Regression Model ◽

Next Generation ◽

Binomial Regression ◽

Generation Sequencing


Modeling with Geographically Weighted Negative Binomial Regression (Case Study: The Number of Leprosy Patients in West Java)

Xplore Journal of Statistics ◽

10.29244/xplore.v10i3.833 ◽

2021 ◽

Vol 10 (3) ◽

pp. 226-236

Author(s):

Khusnul Khotimah ◽

Itasia Dina Sulvianti ◽

Pika Silvianti

Keyword(s):

Regression Model ◽

Count Data ◽

Poisson Regression ◽

Negative Binomial ◽

Negative Binomial Regression ◽

Kernel Weight ◽

Negative Binomial Regression Model ◽

West Java ◽

Binomial Regression ◽

Spatial Heterogeneity

The number of leprosy patients in West Java is an example of count data. The analysis commonly used for count data is Poisson regression. This research determines the variables that influence the number of leprosy patients in West Java, using data from 2019. These data exhibit overdispersion and spatial heterogeneity. To handle overdispersion, the negative binomial regression model can be employed, while spatial heterogeneity is addressed by adding adaptive bisquare kernel weights. The resulting Geographically Weighted Negative Binomial Regression (GWNBR) model with adaptive bisquare kernel weighting classifies the regencies/cities in West Java into ten groups based on the variables that significantly influence the number of leprosy patients. In general, the percentage of households with Clean and Healthy Behavior (PHBS) has a significant effect in all regencies/cities in West Java. For Bogor Regency, Depok City, Bogor City, and Pangandaran Regency specifically, the percentage of people living in poverty does not have a significant effect on the number of leprosy patients.
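The adaptive bisquare kernel mentioned in this abstract can be sketched as follows. This is a generic illustration of the weighting scheme, not the authors' implementation: the bandwidth for each location is assumed to be the distance to its k-th nearest point (counting the location itself), distances are plain Euclidean, and the coordinates are hypothetical.

```python
import math

def bisquare_weight(distance, bandwidth):
    """Bisquare kernel: (1 - (d/h)^2)^2 inside the bandwidth, 0 outside."""
    if distance >= bandwidth:
        return 0.0
    return (1.0 - (distance / bandwidth) ** 2) ** 2

def adaptive_bandwidth(distances, k):
    """Adaptive bandwidth: distance to the k-th nearest point
    (assumption: the regression point itself is included in the list)."""
    return sorted(distances)[k - 1]

def euclidean(a, b):
    """Straight-line distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Hypothetical centroids of five regions; the first is the regression point.
centroids = [(0, 0), (1, 0), (0, 2), (3, 4), (6, 8)]
target = centroids[0]

dists = [euclidean(target, c) for c in centroids]
h = adaptive_bandwidth(dists, k=4)
weights = [bisquare_weight(d, h) for d in dists]

for c, d, w in zip(centroids, dists, weights):
    print(c, round(d, 3), round(w, 4))
```

Because the bandwidth adapts to local point density, sparse areas borrow information from farther away than dense areas do; the resulting weights would enter a locally weighted likelihood for each region.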


Using Count Data Models to Predict Epiphytic Bryophyte Recruitment in Schima superba Gardn. et Champ. Plantations in Urban Forests

Forests ◽

10.3390/f11020174 ◽

2020 ◽

Vol 11 (2) ◽

pp. 174

Author(s):

Dexian Zhao ◽

Zhenkai Sun ◽

Cheng Wang ◽

Zezhou Hao ◽

Baoqiang Sun ◽

...

Keyword(s):

Count Data ◽

Human Disturbance ◽

Negative Binomial ◽

Urban Environments ◽

Tree Planting ◽

Count Data Models ◽

Urban Tree ◽

Schima Superba ◽

Hurdle Models ◽

Positive Effects

Epiphytic bryophytes are known to perform essential ecosystem functions, but their sensitivity to environmental quality and change makes their survival and development vulnerable to global changes, especially habitat loss in urban environments. Fortunately, extensive urban tree planting programs worldwide have had a positive effect on the colonization and development of epiphytic bryophytes. However, how epiphytic bryophytes occur and grow on planted trees remains poorly understood, especially in urban environments. In the present study, we surveyed the distribution of epiphytic bryophytes on tree trunks in a Schima superba Gardn. et Champ. urban plantation and then developed count data models, including tree characteristics, stand characteristics, human disturbance, terrain factors, and microclimate, to identify the drivers of epiphytic bryophyte recruitment. Different count models (Poisson, negative binomial, zero-inflated Poisson, zero-inflated negative binomial, hurdle-Poisson, hurdle-negative binomial) were compared to account for the zero-inflated data structure. Our results show that (i) the shaded side and base of tree trunks were the preferred locations for bryophytes to colonize in urban plantations, (ii) both hurdle models performed well in modeling epiphytic bryophyte recruitment, and (iii) both hurdle models showed that tree height, diameter at breast height (DBH), leaf area index (LAI), and altitude (ALT) promoted the occurrence of epiphytic bryophytes, whereas the height under branch and the interference intensity of human activities hindered it. Specifically, DBH and LAI had positive effects on the species richness recruitment count; similarly, DBH and ALT had positive effects on the abundance recruitment count, but slope had a negative effect.
To promote the occurrence and growth of epiphytic bryophytes in urban tree planting programs, we suggest that managers regulate suitable habitats by cultivating and protecting large trees, promoting canopy closure, and controlling human disturbance.
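The difference between the zero-inflated and hurdle formulations compared in this abstract shows up directly in their probability mass functions. The sketch below covers only the Poisson variants with fixed parameters (no covariates); the parameter values are hypothetical and are not estimates from the study.

```python
import math

def poisson_pmf(y, lam):
    """P(Y = y) for Y ~ Poisson(lam), computed on the log scale."""
    return math.exp(y * math.log(lam) - lam - math.lgamma(y + 1))

def zip_pmf(y, lam, pi):
    """Zero-inflated Poisson: a structural zero with probability pi,
    otherwise an ordinary Poisson(lam) draw (which may also be zero)."""
    base = (1 - pi) * poisson_pmf(y, lam)
    return pi + base if y == 0 else base

def hurdle_poisson_pmf(y, lam, p0):
    """Hurdle Poisson: zero with probability p0, otherwise a draw from
    a zero-truncated Poisson(lam), so all zeros come from one process."""
    if y == 0:
        return p0
    return (1 - p0) * poisson_pmf(y, lam) / (1 - poisson_pmf(0, lam))

# Hypothetical parameter values for illustration.
lam, pi, p0 = 2.0, 0.3, 0.6

for y in range(5):
    print(y, round(zip_pmf(y, lam, pi), 4), round(hurdle_poisson_pmf(y, lam, p0), 4))
```

In the zero-inflated model the probability of a zero is always at least that of the underlying Poisson, whereas the hurdle model's zero probability is a free parameter, so it can also represent fewer zeros than a Poisson would predict.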


Random Parameter Negative Binomial Model of Signalized Intersections

Mathematical Problems in Engineering ◽

10.1155/2016/1436364 ◽

2016 ◽

Vol 2016 ◽

pp. 1-8 ◽

Cited By ~ 3

Author(s):

Minho Park ◽

Dongmin Lee ◽

Jinwoo Jeon

Keyword(s):

Negative Binomial ◽

Random Parameter ◽

Signalized Intersections ◽

Negative Binomial Model ◽

Marginal Effect ◽

Binomial Model ◽

Random Parameters ◽

Factors Affecting ◽

Independent Variables ◽

Insight Into

Factors affecting accident frequencies at 72 signalized intersections in Gyeonggi-Do (province) over a four-year period (2007-2010) were explored using the random parameters negative binomial model. The empirical results from the comparison with the fixed parameters binomial model show that the random parameters model outperforms its fixed parameters counterpart and provides a fuller understanding of the factors that determine accident frequencies at signalized intersections. In addition, elasticities and marginal effects were estimated to gain more insight into how one-percent and one-unit changes in the independent variables, respectively, change the dependent variable.
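For a count model with a log link, elasticity and marginal effect have simple closed forms, sketched below. This is a simplified illustration with a single covariate and fixed coefficients, not the random parameters model of the paper; the coefficient values and the covariate are hypothetical.

```python
import math

def expected_count(x, b0, b1):
    """Expected count under a log link: E[y | x] = exp(b0 + b1 * x)."""
    return math.exp(b0 + b1 * x)

def marginal_effect(x, b0, b1):
    """Change in the expected count per one-unit change in x:
    dE[y|x]/dx = b1 * E[y|x]."""
    return b1 * expected_count(x, b0, b1)

def elasticity(x, b1):
    """Percent change in the expected count per one-percent change in x:
    (dE/dx) * (x / E) = b1 * x."""
    return b1 * x

# Hypothetical coefficients and covariate value (e.g. traffic volume in thousands).
b0, b1, x = 0.5, 0.03, 20.0

print("expected count:", expected_count(x, b0, b1))
print("marginal effect:", marginal_effect(x, b0, b1))
print("elasticity:", elasticity(x, b1))
```

With random parameters, the same formulas would be evaluated using each intersection's own coefficient draw and then averaged, which is why such models report distributions of effects.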
