Managing Inflation

Poisson and negative binomial regression procedures have proliferated, and now are available in virtually all statistical packages. Along with the regression procedures themselves are procedures for addressing issues related to the over-dispersion and excessive zeros commonly observed in count data. These approaches, zero-inflated Poisson and zero-inflated negative binomial models, use logit or probit models for the “excess” zeros and count regression models for the counted data. Although these models are often appropriate on statistical grounds, their interpretation may prove substantively difficult. This article explores this dilemma, using data from a study of individuals released from facilities maintained by the Massachusetts Department of Correction.
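
The two-part structure described in this abstract can be sketched in a few lines. The following is a minimal illustration, not the article's model: the covariate `x` and the coefficient values in `GAMMA` and `BETA` are hypothetical, and a logit link is assumed for the excess-zero part. It shows why interpretation can be awkward: the same covariate can move the excess-zero probability and the count mean in different directions.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of count k under a Poisson distribution with mean lam."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def zip_pmf(k: int, lam: float, pi: float) -> float:
    """Zero-inflated Poisson: with probability pi the observation is an
    'excess' zero; otherwise it is drawn from Poisson(lam)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


def logistic(eta: float) -> float:
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients (intercept, slope) for a single covariate x.
GAMMA = (-1.0, 0.8)  # logit part: probability of an excess zero
BETA = (0.5, 0.3)    # count part: log of the Poisson mean


def zip_params(x: float) -> tuple[float, float]:
    """Return (pi, lam) implied by the two linear predictors at x."""
    pi = logistic(GAMMA[0] + GAMMA[1] * x)
    lam = math.exp(BETA[0] + BETA[1] * x)
    return pi, lam


for x in (0.0, 1.0, 2.0):
    pi, lam = zip_params(x)
    print(f"x={x:.0f}  P(excess zero)={pi:.3f}  Poisson mean={lam:.3f}  "
          f"P(Y=0)={zip_pmf(0, lam, pi):.3f}")
```

With these made-up coefficients, increasing `x` raises both the chance of an excess zero and the expected count among the remaining observations, so no single number summarizes the "effect" of `x`.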

Modeling Count Data: The Poisson and Negative Binomial Regression Models

Econometrics ◽ 10.1007/978-1-137-37502-5_12 ◽ 2015 ◽ pp. 236-248

Author(s): Damodar Gujarati

Keyword(s): Count Data ◽ Regression Models ◽ Negative Binomial ◽ Negative Binomial Regression ◽ Binomial Regression

A Study of Count Regression Models for Mortality Rate

CAUCHY ◽ 10.18860/ca.v7i1.13642 ◽ 2021 ◽ Vol 7 (1) ◽ pp. 142-151

Author(s): Anwar Fitrianto

Keyword(s): Mortality Rate ◽ Regression Model ◽ Count Data ◽ Bayesian Information Criterion ◽ Regression Models ◽ Negative Binomial ◽ Negative Binomial Regression ◽ Information Criterion ◽ Poisson Regression Model ◽ Binomial Regression

This paper discusses how overdispersed count data can be fitted. The Poisson regression model, the Negative Binomial 1 regression model (NEGBIN 1), and the Negative Binomial 2 regression model (NEGBIN 2) were proposed to fit mortality rate data. The values of the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) are compared to find out which model suits the data best. The results show that the data do display greater variability than a Poisson model assumes. Among the three models, NEGBIN 1 is preferred.
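
The AIC/BIC comparison described in this abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: the `counts` are made up (not the paper's mortality data), the models are intercept-only, only the Poisson and NEGBIN 2 forms are compared (NEGBIN 1 is omitted), and the NEGBIN 2 dispersion `alpha` is found by a crude grid search rather than a proper optimizer. The criteria themselves are the standard ones: AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L.

```python
import math


def poisson_loglik(ys, mu):
    """Log-likelihood of counts ys under Poisson with mean mu."""
    return sum(-mu + y * math.log(mu) - math.lgamma(y + 1) for y in ys)


def negbin2_loglik(ys, mu, alpha):
    """Log-likelihood under NEGBIN 2: variance = mu + alpha * mu**2."""
    r = 1.0 / alpha
    return sum(
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu)) + y * math.log(mu / (r + mu))
        for y in ys
    )


def aic(loglik, k):
    """Akaike Information Criterion for k estimated parameters."""
    return 2 * k - 2 * loglik


def bic(loglik, k, n):
    """Bayesian Information Criterion for k parameters, n observations."""
    return k * math.log(n) - 2 * loglik


# Made-up, visibly overdispersed counts (illustration only).
counts = [0, 0, 0, 1, 1, 2, 3, 5, 8, 13, 0, 0, 1, 2, 21]
n = len(counts)

# For intercept-only models the sample mean maximizes the likelihood
# in mu for both the Poisson and the NEGBIN 2 model.
mu_hat = sum(counts) / n
ll_pois = poisson_loglik(counts, mu_hat)

# Crude grid search for the NEGBIN 2 dispersion parameter.
alpha_grid = [i / 100 for i in range(1, 1001)]
alpha_hat = max(alpha_grid, key=lambda a: negbin2_loglik(counts, mu_hat, a))
ll_nb = negbin2_loglik(counts, mu_hat, alpha_hat)

results = {
    "Poisson": (aic(ll_pois, 1), bic(ll_pois, 1, n)),
    "NEGBIN 2": (aic(ll_nb, 2), bic(ll_nb, 2, n)),
}
for name, (a, b) in results.items():
    print(f"{name:9s} AIC={a:8.2f}  BIC={b:8.2f}")
```

The model with the smaller AIC and BIC is preferred. Both criteria penalize the extra dispersion parameter, so the negative binomial model wins only when the improvement in fit outweighs that penalty.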

A bivariate zero-inflated negative binomial regression model for count data with excess zeros

Economics Letters ◽ 10.1016/s0165-1765(02)00262-8 ◽ 2003 ◽ Vol 78 (3) ◽ pp. 373-378 ◽ Cited By ~ 37

Author(s): Peiming Wang

Keyword(s): Regression Model ◽ Count Data ◽ Negative Binomial ◽ Negative Binomial Regression ◽ Negative Binomial Regression Model ◽ Excess Zeros ◽ Binomial Regression

Longitudinal Analysis of Light Rail and Streetcar Safety in the United States

Transportation Research Record: Journal of the Transportation Research Board ◽ 10.1177/0361198120927004 ◽ 2020 ◽ Vol 2674 (9) ◽ pp. 83-95

Author(s): Abubakr Ziedan ◽ Candace Brakewood

Keyword(s): Regression Models ◽ Negative Binomial ◽ Negative Binomial Regression ◽ The United States ◽ Descriptive Statistics ◽ Light Rail ◽ Right Of Way ◽ Binomial Regression ◽ Using Data ◽ American Cities

Many American cities have launched or expanded light rail or streetcar services recently, which has resulted in a 61% increase in light rail and streetcar revenue miles nationwide during the period 2006–2016. Moreover, light rail and streetcars exhibit higher fatality rates per passenger mile traveled compared with other transit modes. In light of these trends, this study explores light rail and streetcar collisions, injuries, and fatalities using data obtained from the National Transit Database. This study applies a two-part methodology. In the first part, descriptive statistics are calculated for light rail and streetcar collisions, injuries, and fatalities, and a comparative analysis of light rail and streetcars is performed. In the second part, multilevel negative binomial regression models are used to analyze light rail and streetcar collisions and injuries. Three key findings have emerged from this study. First, the results generally align with findings from prior studies that show the majority of light rail and streetcar collisions occur in mixed right-of-way or near at-grade crossings. Second, this analysis revealed an issue predominantly at stations: 42% of light rail injuries were people waiting or leaving. Third, suicide was the leading cause of light rail fatalities, which represents 28% of all light rail fatalities. The implications of this study are important for cities that currently operate these modes or are planning to introduce new light rail or streetcar service to improve safety.

Overdisp: A Stata (and Mata) Package for Direct Detection of Overdispersion in Poisson and Negative Binomial Regression Models

Statistics Optimization & Information Computing ◽ 10.19139/soic-2310-5070-557 ◽ 2020 ◽ Vol 8 (3) ◽ pp. 773-789

Author(s): Luiz Paulo Lopes Fávero ◽ Patrícia Belfiore ◽ Marco Aurélio dos Santos ◽ R. Freitas Souza

Keyword(s): Count Data ◽ Data Model ◽ Regression Models ◽ Negative Binomial ◽ Direct Detection ◽ Negative Binomial Regression ◽ The Other ◽ Explanatory Variables ◽ Data Regression ◽ Binomial Regression

Stata has several procedures that can be used in analyzing count-data regression models and, more specifically, in studying the behavior of the dependent variable, conditional on explanatory variables. Identifying overdispersion in count-data models is one of the most important procedures that allow researchers to correctly choose estimations such as Poisson or negative binomial, given the distribution of the dependent variable. The main purpose of this paper is to present a new command for the identification of overdispersion in the data as an alternative to the procedure presented by Cameron and Trivedi [5], since it directly identifies overdispersion in the data, without the need to previously estimate a specific type of count-data model. When estimating Poisson or negative binomial regression models in which the dependent variable is quantitative, with discrete and non-negative values, the new Stata package overdisp helps researchers to directly propose more consistent and adequate models. As a second contribution, we also present a simulation to show the consistency of the overdispersion test using the overdisp command. Findings show that, if the test indicates equidispersion in the data, there is consistent evidence that the distribution of the dependent variable is, in fact, Poisson. If, on the other hand, the test indicates overdispersion in the data, researchers should investigate more deeply whether the dependent variable actually exhibits better adherence to the Poisson-Gamma distribution or not.
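
For orientation, two simple overdispersion checks can be sketched in Python. This is not the algorithm of the overdisp command, which the abstract does not spell out. It assumes an intercept-only setting, made-up `counts`, and the usual NB2 form of the Cameron and Trivedi auxiliary regression, in which ((y − μ)² − y)/μ is regressed on μ without a constant; the function names are hypothetical.

```python
import math


def dispersion_index(ys):
    """Sample variance divided by sample mean; close to 1 under Poisson."""
    n = len(ys)
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / (n - 1)
    return var / mean


def cameron_trivedi_stat(ys):
    """Intercept-only version of the auxiliary-regression test.

    With a constant fitted mean mu, regressing z = ((y - mu)^2 - y) / mu
    on mu without an intercept gives alpha_hat = mean(z) / mu, and the
    t statistic reduces to mean(z) / (sd(z) / sqrt(n)).
    Returns (alpha_hat, t_stat).
    """
    n = len(ys)
    mu = sum(ys) / n
    z = [((y - mu) ** 2 - y) / mu for y in ys]
    z_bar = sum(z) / n
    s2 = sum((zi - z_bar) ** 2 for zi in z) / (n - 1)
    alpha_hat = z_bar / mu
    t_stat = z_bar / math.sqrt(s2 / n)
    return alpha_hat, t_stat


# Made-up counts for illustration only.
counts = [0, 0, 0, 1, 1, 2, 3, 5, 8, 13, 0, 0, 1, 2, 21]
alpha_hat, t_stat = cameron_trivedi_stat(counts)
print(f"variance/mean = {dispersion_index(counts):.2f}")
print(f"alpha_hat = {alpha_hat:.2f}, t = {t_stat:.2f}")
```

A variance-to-mean ratio well above 1, or a large positive t statistic, points toward overdispersion. The second check depends on a fitted Poisson mean, which is the dependence the paper's command is said to avoid.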

A Bayesian inference tool for identifying artifactual calls from differential transcript abundance analyses

10.1101/2020.02.27.967240 ◽ 2020

Author(s): Stefano Mangiola ◽ Evan A Thomas ◽ Martin Modrák ◽ Anthony T Papenfuss

Keyword(s): Negative Binomial ◽ Genetic Material ◽ Negative Binomial Regression ◽ Transcript Abundance ◽ Negative Binomial Regression Model ◽ Data Set ◽ Statistical Framework ◽ Binomial Regression ◽ Over Dispersion ◽ Binomial Models

Relative transcript abundance has proven to be a valuable tool for inferring the phenotype of biological systems from genetic material. Several methods for the analysis of differential transcript abundance have been developed, and some of the most popular are based on negative binomial models. Although most genes are fitted reasonably well by the negative binomial distribution, the presence of outlier observations that do not fit such models can lead to artifactual identification of significant changes in transcription. Identifying those transcripts is extremely important for the correct interpretation of results. A robust and automated tool for detecting sample/transcript pairs that do not fit a negative binomial regression model is currently lacking. Here we propose ppcseq, a robust statistical framework that hierarchically models sample- and gene-wise features such as sequencing depth bias and the association between mean transcript abundance and its over-dispersion, and provides a theoretical transcript abundance distribution against which the observed transcript abundance can be tested for outliers. Using a publicly available data set, we show that nearly 10% of differentially abundant transcripts had fold change inflated by the presence of outliers. This method has broad utility in filtering artifactual results of differential transcript abundance analyses based on a negative binomial framework.

Analysis of the correlation of socioeconomic, sanitary, and demographic factors with homicide deaths - Bahia, Brazil, 2013-2015

Revista Brasileira de Enfermagem ◽ 10.1590/0034-7167-2019-0346 ◽ 2020 ◽ Vol 73 (6)

Author(s): Tiago Oliveira de Souza ◽ Edinilsa Ramos de Souza ◽ Liana Wernersbach Pinto

Keyword(s): Regression Models ◽ Ecological Study ◽ Negative Binomial ◽ Demographic Factors ◽ Negative Binomial Regression ◽ Legal Intervention ◽ Explanatory Variables ◽ Binomial Regression ◽ Education Levels ◽ Using Data

ABSTRACT Objective: To analyze the correlation of socioeconomic, sanitary, and demographic factors with homicides in Bahia, from 2013 to 2015. Methods: Ecological study, using data from the Information System on Mortality and from the Superintendence of Economic and Social Studies. The dependent variable is the corrected homicide rate. Explanatory variables were categorized in four axes. Simple and multiple negative binomial regression models were used. Results: Positive associations were found between homicides and the Index of Economy and Finances (IEF), the Human Development Index, the Gini Index, population density, and legal intervention death rates (LIDR). The variables Index of Education Levels (IEL), rates of death with undetermined intentions (RDUI), and the proportion of ill-defined causes (IDC) presented a negative association with the homicide rates. Conclusion: The specific features of the context of each community, in addition to broader socioeconomic municipal factors, directly affect living conditions and increase the risk of dying by homicide.

Statistical models for analyzing count data: predictors of length of stay among HIV patients in Portugal using a multilevel model

BMC Health Services Research ◽ 10.1186/s12913-021-06389-1 ◽ 2021 ◽ Vol 21 (1)

Author(s): Ahmed Nabil Shaaban ◽ Bárbara Peleteiro ◽ Maria Rosario O. Martins

Keyword(s): Length Of Stay ◽ Regression Model ◽ Random Effects ◽ Count Data ◽ Negative Binomial ◽ Negative Binomial Regression ◽ Comprehensive Approach ◽ Negative Binomial Regression Model ◽ Hiv Patients ◽ Binomial Regression

Abstract Background: This study offers a comprehensive approach to precisely analyze the complexly distributed length of stay among HIV admissions in Portugal. Objective: To provide an illustration of statistical techniques for analysing count data using longitudinal predictors of length of stay among HIV hospitalizations in Portugal. Method: A total of 26,505 registered discharges from Portuguese National Health Service (NHS) facilities between January 2009 and December 2017, classified under the Major Diagnostic Category (MDC) created for patients with HIV infection, with HIV/AIDS as a main or secondary cause of admission, were used to predict length of stay among HIV hospitalizations in Portugal. Several strategies were applied to select the best-fitting count model among the Poisson regression model, the zero-inflated Poisson model, the negative binomial regression model, and the zero-inflated negative binomial regression model. A random hospital effects term was incorporated into the negative binomial model to examine the dependence between observations within the same hospital. A multivariable analysis was performed to assess the effect of covariates on length of stay. Results: The median length of stay in our study was 11 days (interquartile range: 6–22). Statistical comparisons among the count models revealed that the random-effects negative binomial models provided the best fit with observed data. Admissions among males or admissions associated with TB infection, pneumocystis, cytomegalovirus, candidiasis, toxoplasmosis, or mycobacterium disease exhibit a highly significant increase in length of stay. Perfect trends were observed in which a higher number of diagnoses or procedures lead to significantly higher length of stay. The random-effects term included in our model, which refers to unexplained factors specific to each hospital, revealed obvious differences in quality among the hospitals included in our study.
Conclusions: This study provides a comprehensive approach to address unique problems associated with the prediction of length of stay among HIV patients in Portugal.

Transition models for count data: a flexible alternative to fixed distribution models

Statistical Methods & Applications ◽ 10.1007/s10260-021-00558-6 ◽ 2021

Author(s): Moritz Berger ◽ Gerhard Tutz

Keyword(s): Count Data ◽ Regression Models ◽ Negative Binomial ◽ Real Data ◽ Distribution Models ◽ Explanatory Variables ◽ Excess Zeros ◽ Proposed Model ◽ Transition Models ◽ Fixed Distribution

A flexible semiparametric class of models is introduced that offers an alternative to classical regression models for count data, such as the Poisson and negative binomial models, as well as to more general models accounting for excess zeros that are also based on fixed distributional assumptions. The model lets the data themselves determine the distribution of the response variable but, in its basic form, uses a parametric term that specifies the effect of explanatory variables. In addition, an extended version is considered, in which the effects of covariates are specified nonparametrically. The proposed model and traditional models are compared in simulations and in several real data applications from the areas of health and social science.
