EXTENSIONS OF HURDLE MODELS FOR OVERDISPERSED COUNT DATA

2012 ◽  
Vol 22 (11) ◽  
pp. 1398-1404 ◽  
Author(s):  
Helmut Farbmacher
Author(s):  
Cindy Xin Feng

Count data with excessive zeros are frequently encountered in practice. For example, the number of health service visits often includes many zeros, representing patients with no utilization during a follow-up period. A common feature of this type of data is that the count measure tends to have more zeros than a common count distribution, such as the Poisson or negative binomial, can accommodate. Zero-inflated (ZI) or hurdle models are often used to fit such data. Despite the increasing popularity of ZI and hurdle models, the fundamental differences between these two types of models have received little investigation. In this article, we reviewed the zero-inflated and hurdle models and highlighted their differences in terms of their data-generating processes. We also conducted simulation studies to evaluate the performance of both types of models. The final choice of regression model should be made after a careful assessment of goodness of fit and should be tailored to the particular data in question.
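
The difference in data-generating processes mentioned above can be illustrated with a minimal simulation sketch. It assumes Poisson count components and constant parameters with no covariates; the function names and the values of `p` and `lam` are hypothetical and not taken from the article. In the zero-inflated process, zeros arise both from a structural-zero state and from the count distribution itself, whereas in the hurdle process every zero comes from the binary part and the positive counts follow a zero-truncated distribution.

```python
import numpy as np

def simulate_zip(n, pi_zero, lam, rng):
    """Zero-inflated Poisson: structural zero with probability pi_zero,
    otherwise an ordinary Poisson draw (which can itself be zero)."""
    structural = rng.random(n) < pi_zero
    counts = rng.poisson(lam, size=n)
    return np.where(structural, 0, counts)

def simulate_hurdle_poisson(n, p_zero, lam, rng):
    """Hurdle Poisson: zero with probability p_zero, otherwise a draw
    from a zero-truncated Poisson, so all zeros come from the first part."""
    y = np.zeros(n, dtype=np.int64)
    positive = rng.random(n) >= p_zero
    draws = rng.poisson(lam, size=int(positive.sum()))
    # Rejection sampling: redraw any zeros until all draws are positive.
    while (draws == 0).any():
        zero_idx = draws == 0
        draws[zero_idx] = rng.poisson(lam, size=int(zero_idx.sum()))
    y[positive] = draws
    return y

rng = np.random.default_rng(0)
n, p, lam = 100_000, 0.3, 1.5  # illustrative values only

y_zip = simulate_zip(n, p, lam, rng)
y_hurdle = simulate_hurdle_poisson(n, p, lam, rng)

# Theoretical zero probabilities under each process.
zip_zero_expected = p + (1 - p) * np.exp(-lam)
hurdle_zero_expected = p

print("ZIP zero share:   ", (y_zip == 0).mean(), "expected", zip_zero_expected)
print("Hurdle zero share:", (y_hurdle == 0).mean(), "expected", hurdle_zero_expected)
```

With the same nominal zero probability, the zero-inflated sample contains more zeros than the hurdle sample, because the Poisson component contributes additional sampling zeros.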


Forests ◽  
2020 ◽  
Vol 11 (2) ◽  
pp. 174
Author(s):  
Dexian Zhao ◽  
Zhenkai Sun ◽  
Cheng Wang ◽  
Zezhou Hao ◽  
Baoqiang Sun ◽  
...  

Epiphytic bryophytes are known to perform essential ecosystem functions, but their sensitivity to environmental quality and change makes their survival and development vulnerable to global changes, especially habitat loss in urban environments. Fortunately, extensive urban tree planting programs worldwide have had a positive effect on the colonization and development of epiphytic bryophytes. However, how epiphytic bryophytes occur and grow on planted trees remains poorly known, especially in urban environments. In the present study, we surveyed the distribution of epiphytic bryophytes on tree trunks in a Schima superba Gardn. et Champ. urban plantation and then developed count data models, including tree characteristics, stand characteristics, human disturbance, terrain factors, and microclimate, to identify the drivers of epiphytic bryophyte recruitment. Different count models (Poisson, negative binomial, zero-inflated Poisson, zero-inflated negative binomial, hurdle Poisson, hurdle negative binomial) were compared to account for the zero-inflated data structure. Our results show that (i) the shaded side and base of tree trunks were the preferred locations for bryophytes to colonize in urban plantations, (ii) both hurdle models performed well in modeling epiphytic bryophyte recruitment, and (iii) both hurdle models showed that tree height, diameter at breast height (DBH), leaf area index (LAI), and altitude (ALT) promoted the occurrence of epiphytic bryophytes, whereas the height under branch and the intensity of interference from human activities inhibited it. Specifically, DBH and LAI had positive effects on the species richness recruitment count; similarly, DBH and ALT had positive effects on the abundance recruitment count, but slope had a negative effect. To promote the occurrence and growth of epiphytic bryophytes in urban tree planting programs, we suggest that managers regulate suitable habitats by cultivating and protecting large trees, promoting canopy closure, and controlling human disturbance.


2018 ◽  
Vol 28 (5) ◽  
pp. 1540-1551
Author(s):  
Maengseok Noh ◽  
Youngjo Lee

Poisson models are widely used for statistical inference on count data. However, zero-inflation or zero-deflation can occur together with either overdispersion or underdispersion. Currently, no available model for count data allows excessive zeros along with underdispersion in the non-zero counts, even though the need for such models has been reported. Furthermore, given an excessive zero rate, we need a model that allows a larger degree of overdispersion than existing models. In this paper, we use a random-effect model to produce a general statistical model that accommodates such phenomena in real data analyses.


2006 ◽  
Vol 16 (4) ◽  
pp. 463-481 ◽  
Author(s):  
C. E. Rose ◽  
S. W. Martin ◽  
K. A. Wannemuehler ◽  
B. D. Plikaytis

2016 ◽  
Vol 51 (1) ◽  
pp. 68-78 ◽  
Author(s):  
Jean-Noel Vergnes ◽  
Jean-Philippe Boucher ◽  
Nathalie Lelong ◽  
Michel Sixou ◽  
Cathy Nabet

Methods for analysing dental caries and associated risk indicators have evolved considerably in recent decades. The use of zero-inflated or hurdle models is increasing so as to take account of the decayed, missing, and filled teeth (DMFT) distribution, which is positively skewed and has a high proportion of zero scores. However, there is a need to develop new statistical models that involve pragmatic biological considerations on dental caries in epidemiological surveys. In this paper, we show that the zero-inflated and the hurdle models can both be expressed as a compound sum. Using the same compound sum, we then present the generalized negative binomial (GNB) distribution for dental caries count data, and provide a numerical application using the data of the EPIPAP study. The GNB model generates the best score functions while handling the lifetime dental caries disease process better. In conclusion, the GNB model suits the nature of some count data, in particular when structural zeros are unlikely to occur and when several latent spells can lead to new countable events. For these reasons, the use of the GNB distribution appears to be relevant for the modelling of dental caries count data.


2016 ◽  
Vol 25 (6) ◽  
pp. 2558-2576 ◽  
Author(s):  
Brian Neelon ◽  
Howard H Chang ◽  
Qiang Ling ◽  
Nicole S Hastings

Motivated by a study exploring spatiotemporal trends in emergency department use, we develop a class of two-part hurdle models for the analysis of zero-inflated areal count data. The models consist of two components—one for the probability of any emergency department use and one for the number of emergency department visits given use. Through a hierarchical structure, the models incorporate both patient- and region-level predictors, as well as spatially and temporally correlated random effects for each model component. The random effects are assigned multivariate conditionally autoregressive priors, which induce dependence between the components and provide spatial and temporal smoothing across adjacent spatial units and time periods, resulting in improved inferences. To accommodate potential overdispersion, we consider a range of parametric specifications for the positive counts, including truncated negative binomial and generalized Poisson distributions. We adopt a Bayesian inferential approach, and posterior computation is handled conveniently within standard Bayesian software. Our results indicate that the negative binomial and generalized Poisson hurdle models vastly outperform the Poisson hurdle model, demonstrating that overdispersed hurdle models provide a useful approach to analyzing zero-inflated spatiotemporal data.
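
As a rough sketch of the two-part structure described above, a generic hurdle model can be written as follows. The notation is assumed here for illustration and is not taken from the paper: \pi_i denotes the probability of any emergency department use for unit i, and f(\cdot; \theta_i) is an untruncated count distribution such as the negative binomial or generalized Poisson. The covariates and the spatially and temporally correlated random effects would enter through \pi_i and \theta_i and are omitted.

```latex
\Pr(Y_i = 0) = 1 - \pi_i,
\qquad
\Pr(Y_i = y) = \pi_i \, \frac{f(y; \theta_i)}{1 - f(0; \theta_i)},
\quad y = 1, 2, \ldots
```

The denominator 1 - f(0; \theta_i) renormalizes the count distribution over the positive integers, so the two parts together sum to one.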

