Overdispersion study of Poisson and zero-inflated Poisson regression for some characteristics of the data on lambda, n, p

The Poisson distribution is a discrete distribution often used to model rare events, where the data are non-negative integer counts. Poisson regression is a common method for modeling such count data. A frequent violation of its assumptions is overdispersion, in which the variance of the response exceeds its mean. One cause of overdispersion is an excess probability of zeros in the response variable, and zero-inflated Poisson (ZIP) regression is one model used to address it. This research studies overdispersion in Poisson and ZIP regression under several data characteristics. The data are simulated by combining values of the Poisson parameter (λ), the zero probability (p), and the sample size (n) for the response variable, and the Poisson and ZIP regression models are then compared. The simulation shows that the larger λ, n, and p are, the better the ZIP model performs relative to Poisson regression. An exploration of the Pearson residuals from the Poisson and ZIP regressions supports these results.
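The role of λ and p in the simulation described above can be illustrated with a minimal sketch. It is not the authors' simulation design: it only draws zero-inflated Poisson counts and compares the sample variance-to-mean ratio with the standard ZIP value 1 + pλ, without fitting any regression model. The function names, the seed, and the parameter grid are assumptions made for illustration, and the Poisson sampler is Knuth's multiplication method, chosen only to keep the example free of third-party libraries.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) count with Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def sample_zip(lam, p, n, rng):
    """Draw n ZIP counts: a structural zero with probability p, else Poisson(lam)."""
    return [0 if rng.random() < p else sample_poisson(lam, rng) for _ in range(n)]


def dispersion_ratio(ys):
    """Sample variance divided by sample mean; values above 1 suggest overdispersion."""
    mean = sum(ys) / len(ys)
    var = sum((y - mean) ** 2 for y in ys) / (len(ys) - 1)
    return var / mean


def zip_theoretical_ratio(lam, p):
    """Variance-to-mean ratio of a ZIP variable: mean (1-p)*lam, variance (1-p)*lam*(1+p*lam)."""
    return 1.0 + p * lam


if __name__ == "__main__":
    rng = random.Random(2024)  # hypothetical seed
    for lam in (1.0, 5.0):     # hypothetical grid of lambda and p values
        for p in (0.1, 0.5):
            ys = sample_zip(lam, p, 5000, rng)
            print(lam, p, round(dispersion_ratio(ys), 2), zip_theoretical_ratio(lam, p))
```

Because the ratio 1 + pλ grows with both p and λ, the departure from the Poisson assumption of equal mean and variance is larger for larger values of either parameter, which is consistent with the direction of the reported result.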


Marginal regression models for clustered count data based on zero-inflated Conway-Maxwell-Poisson distribution with applications

Biometrics ◽

10.1111/biom.12436 ◽

2015 ◽

Vol 72 (2) ◽

pp. 606-618 ◽

Cited By ~ 13

Author(s):

Hyoyoung Choo-Wosoba ◽

Steven M. Levy ◽

Somnath Datta

Keyword(s):

Poisson Distribution ◽

Count Data ◽

Regression Models ◽

Marginal Regression


PARAMETER ESTIMATION ON HURDLE POISSON REGRESSION MODEL WITH CENSORED DATA

Jurnal Teknologi ◽

10.11113/jt.v57.1533 ◽

2012 ◽

Vol 57 (1) ◽

Author(s):

SEYED EHSAN SAFFAR ◽

ROBIAH ADNAN ◽

WILLIAM GREENE

Keyword(s):

Regression Model ◽

Count Data ◽

Poisson Regression ◽

Goodness Of Fit ◽

Poisson Model ◽

Likelihood Method ◽

Poisson Regression Model ◽

Response Variable ◽

The Mean ◽

Over Dispersion

A Poisson model is typically assumed for count data. In many cases the dependent variable contains many zeros, and as a result its mean and variance are no longer equal, as the Poisson model requires. In fact, the variance will be much larger than the mean, which is called over-dispersion. The Poisson model is therefore unsuitable for this kind of data, and a hurdle Poisson regression model is suggested to overcome the over-dispersion problem. Furthermore, the response variable in such cases is censored for some values. In this paper, a censored hurdle Poisson regression model is introduced for count data with many zeros. The model has a response variable and one or more explanatory variables. Estimation of the regression parameters by the maximum likelihood method is discussed, and the goodness-of-fit of the regression model is examined. We study the effects of right censoring on the estimated parameters and their standard errors via an example.
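As a rough illustration of how right censoring enters a hurdle Poisson likelihood, the sketch below uses one common textbook formulation: a zero occurs with probability pi0, positive counts follow a zero-truncated Poisson, and a right-censored observation at value c contributes P(Y >= c). This is an assumption about the general model class, not the parameterization used in the paper, and it omits covariates entirely. All function and argument names are hypothetical.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability mass at y, computed on the log scale for stability."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))


def hurdle_pmf(y, pi0, lam):
    """Hurdle Poisson: P(Y=0) = pi0; positives follow a zero-truncated Poisson(lam)."""
    if y == 0:
        return pi0
    return (1.0 - pi0) * poisson_pmf(y, lam) / (1.0 - math.exp(-lam))


def hurdle_sf(c, pi0, lam):
    """P(Y >= c), the likelihood contribution of an observation right-censored at c."""
    return 1.0 - sum(hurdle_pmf(y, pi0, lam) for y in range(c))


def censored_loglik(ys, censored, pi0, lam):
    """Log-likelihood where censored[i] is True if ys[i] is only a lower bound."""
    total = 0.0
    for y, is_censored in zip(ys, censored):
        prob = hurdle_sf(y, pi0, lam) if is_censored else hurdle_pmf(y, pi0, lam)
        total += math.log(prob)
    return total


if __name__ == "__main__":
    counts = [0, 0, 2, 5, 3]                   # hypothetical data
    flags = [False, False, False, True, False]  # the value 5 is right-censored
    print(censored_loglik(counts, flags, pi0=0.4, lam=2.0))
```

In a regression setting, pi0 and lam would each depend on explanatory variables through link functions, and the log-likelihood would be maximized numerically over the regression coefficients.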


Poisson Regression Models for Count Data: Use in the Number of Deaths in the Santo Angelo (Brazil)

Journal of Basic & Applied Sciences ◽

10.6000/1927-5129.2012.08.02.01 ◽

2012 ◽

Cited By ~ 1

Author(s):

Russo

Keyword(s):

Count Data ◽

Poisson Regression ◽

Regression Models ◽

Data Use


Mixed INAR(1) Poisson regression models: Analyzing heterogeneity and serial dependencies in longitudinal count data

Journal of Econometrics ◽

10.1016/s0304-4076(98)00069-4 ◽

1998 ◽

Vol 89 (1-2) ◽

pp. 317-338 ◽

Cited By ~ 37

Author(s):

Ulf Böckenholt

Keyword(s):

Count Data ◽

Poisson Regression ◽

Regression Models ◽

Longitudinal Count Data


A Multivariate Poisson Deep Learning Model for Genomic Prediction of Count Data

G3 Genes|Genome|Genetics ◽

10.1534/g3.120.401631 ◽

2020 ◽

Vol 10 (11) ◽

pp. 4177-4190

Author(s):

Osval Antonio Montesinos-López ◽

José Cricelio Montesinos-López ◽

Pawan Singh ◽

Nerida Lozano-Ramirez ◽

Alberto Barrón-López ◽

...

Keyword(s):

Deep Learning ◽

Count Data ◽

Poisson Regression ◽

Regression Models ◽

Deep Neural Network ◽

Activation Function ◽

Data Sets ◽

Learning Models ◽

Generalized Poisson ◽

Generalized Poisson Regression

The paradigm called genomic selection (GS) is a revolutionary way of developing new plants and animals. It is a predictive methodology, since it uses learning methods to perform its task. Unfortunately, there is no universal model that can be used for all types of predictions; for this reason, specific methodologies are required for each type of output (response variable). Since there is a lack of efficient methodologies for multivariate count outcomes, this paper proposes a multivariate Poisson deep neural network (MPDN) model for the genomic prediction of several count outcomes simultaneously. The MPDN model uses the negative log-likelihood of a Poisson distribution as its loss function, the rectified linear unit (ReLU) activation function in the hidden layers to capture nonlinear patterns, and the exponential activation function in the output layer to produce outputs on the same scale as the counts. The proposed MPDN model was compared to conventional generalized Poisson regression models and univariate Poisson deep learning models on two experimental count data sets. We found that the proposed MPDN model outperformed univariate Poisson deep neural network models but did not outperform, in terms of prediction, the univariate generalized Poisson regression models. All deep learning models were implemented with TensorFlow as the back end and Keras as the front end, which allows these models to be fitted to moderate and large data sets, a significant advantage over previous GS models for multivariate count data.
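The loss described in this abstract can be sketched in a few lines. The example below is not the MPDN implementation and does not use TensorFlow or Keras; it only shows, in plain Python, what a Poisson negative log-likelihood looks like when the output layer applies an exponential activation to a raw value eta for each count outcome. The function names and the example values are assumptions for illustration.

```python
import math


def poisson_nll(y_true, eta):
    """Mean Poisson negative log-likelihood over several count outcomes.

    y_true: observed counts, one per outcome.
    eta:    raw output-layer values; exp(eta) is the predicted mean.
    """
    total = 0.0
    for y, e in zip(y_true, eta):
        mu = math.exp(e)  # exponential activation keeps the predicted mean positive
        total += mu - y * e + math.lgamma(y + 1)  # equals -log P(Y = y | mu)
    return total / len(y_true)


def poisson_nll_grad(y_true, eta):
    """Gradient of poisson_nll with respect to each eta: (exp(eta) - y) / m."""
    m = len(y_true)
    return [(math.exp(e) - y) / m for y, e in zip(y_true, eta)]


if __name__ == "__main__":
    counts = [3, 7]      # hypothetical counts for two outcomes
    raw = [1.0, 2.0]     # hypothetical raw network outputs
    print(poisson_nll(counts, raw), poisson_nll_grad(counts, raw))
```

The gradient vanishes when exp(eta) equals the observed count, so minimizing this loss pushes each predicted mean toward the corresponding count, on the count scale rather than the log scale.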


Modified Poisson Regression Models of Count Data and Parameter Estimation

المجلة العلمیة لقطاع کلیات التجارة ◽

10.21608/jsfc.2015.25927 ◽

2015 ◽

Vol 14 (2) ◽

pp. 1-38

Keyword(s):

Parameter Estimation ◽

Count Data ◽

Poisson Regression ◽

Regression Models


Influence analysis for count data based on generalized Poisson regression models

Statistics ◽

10.1080/02331880903138494 ◽

2010 ◽

Vol 44 (4) ◽

pp. 341-360 ◽

Cited By ~ 5

Author(s):

Feng-Chang Xie ◽

Bo-Cheng Wei

Keyword(s):

Count Data ◽

Poisson Regression ◽

Regression Models ◽

Influence Analysis ◽

Generalized Poisson ◽

Generalized Poisson Regression


A Weighted Poisson Distribution for Underdispersed Count Data

International Journal of Statistics and Probability ◽

10.5539/ijsp.v10n4p157 ◽

2021 ◽

Vol 10 (4) ◽

pp. 157

Author(s):

Chedly Gelin Louzayadio ◽

Rodnellin Onesime Malouata ◽

Michel Diafouka Koukouatikissa

Keyword(s):

Poisson Distribution ◽

Count Data ◽

Exponential Family ◽

Discrete Distribution ◽

Random Variable ◽

Poisson Random Variable ◽

Weighted Version ◽

Poisson Variable ◽

Two Parameter ◽

New Distribution

In this paper, we present a new weighted Poisson distribution for modeling underdispersed count data. Weighted Poisson distributions occur naturally in contexts where the probability that a particular observation of a Poisson variable enters the sample is multiplied by some non-negative weight function. Suppose a realization y of a Poisson random variable Y enters the investigator’s record with probability proportional to w(y). Clearly, the recorded y is not an observation on Y but on the random variable Yw, which is said to be the weighted version of Y. The new distribution is a two-parameter member of the exponential family; it includes and generalizes the Poisson distribution by weighting. It is a discrete distribution that is more flexible than other weighted Poisson distributions that have been proposed for modeling underdispersed count data, for example, the extended Poisson distribution (Dimitrov and Kolev, 2000). We present some moment properties and estimate its parameters. One classical example is considered to compare the fit of this new distribution with that of the extended Poisson distribution.
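The weighting mechanism can be made concrete with a small numerical sketch. Under the general definition, the weighted version has P(Yw = y) = w(y) P(Y = y) / E[w(Y)]. The abstract does not give the weight function of the new distribution, so the example below uses w(y) = y (the size-biased case) purely as a stand-in; that choice, the function names, and the truncation point max_y are all assumptions. For this particular weight, Yw is a Poisson variable shifted up by one, so its variance (λ) is smaller than its mean (λ + 1).

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability mass at y, computed on the log scale for stability."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))


def weighted_poisson_pmf(weight, lam, max_y=200):
    """Return [P(Yw = 0), ..., P(Yw = max_y)] for a non-negative weight function.

    The normalizing constant E[w(Y)] is approximated by truncating the sum at max_y.
    """
    raw = [weight(y) * poisson_pmf(y, lam) for y in range(max_y + 1)]
    norm = sum(raw)
    return [r / norm for r in raw]


def mean_and_variance(pmf):
    """Mean and variance of a distribution given as a list of probabilities on 0, 1, 2, ..."""
    mean = sum(y * p for y, p in enumerate(pmf))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    return mean, var


if __name__ == "__main__":
    size_biased = weighted_poisson_pmf(lambda y: y, lam=2.0)  # stand-in weight w(y) = y
    print(mean_and_variance(size_biased))
```

A constant weight function recovers the ordinary Poisson distribution, which is one way to check such an implementation.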


Poisson regression for linguists: A tutorial introduction to modeling count data with brms

10.31219/osf.io/93kaf ◽

2021 ◽

Author(s):

Bodo Winter ◽

Paul-Christian Bürkner

Keyword(s):

Logistic Regression ◽

Poisson Distribution ◽

Upper Bound ◽

Count Data ◽

Poisson Regression ◽

R Package ◽

Canonical Distribution ◽

Discourse Particles ◽

Hands On ◽

Case Markers

Count data is prevalent in many different areas of linguistics, such as when counting words, syntactic constructions, discourse particles, case markers, or speech errors. The Poisson distribution is the canonical distribution for characterizing count data with no or unknown upper bound. Whereas logistic regression is very common in linguistics, Poisson regression is little known. This tutorial introduces readers to foundational concepts needed for Poisson regression, followed by a hands-on tutorial using the R package brms. We discuss a dataset where Catalan and Korean speakers change the frequency of their co-speech gestures as a function of politeness contexts. This dataset also involves exposure variables (the incorporation of time to deal with unequal intervals) and overdispersion (excess variance). Altogether, we hope that more linguists will consider Poisson regression for the analysis of count data.
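The idea of an exposure variable can be shown without any regression software. The sketch below is not taken from the tutorial and does not use brms; it assumes an intercept-only Poisson model in which each observation has mean rate × exposure, for which the maximum likelihood estimate of the rate is the total count divided by the total exposure. The gesture counts and durations are invented for illustration, and the function names are hypothetical.

```python
import math


def pooled_rate(counts, exposures):
    """MLE of the rate in an intercept-only Poisson model with a log-exposure offset."""
    return sum(counts) / sum(exposures)


def poisson_loglik(rate, counts, exposures):
    """Log-likelihood when observation i has mean rate * exposures[i]."""
    total = 0.0
    for y, t in zip(counts, exposures):
        mu = rate * t
        total += -mu + y * math.log(mu) - math.lgamma(y + 1)
    return total


if __name__ == "__main__":
    gestures = [12, 30, 7]        # hypothetical gesture counts
    seconds = [60.0, 150.0, 30.0]  # hypothetical observation durations
    rate = pooled_rate(gestures, seconds)
    print(rate, poisson_loglik(rate, gestures, seconds))
```

Dividing by exposure inside the model, rather than analyzing raw counts, keeps observations of unequal length comparable while still treating the response as a count.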


Sample Size Required to Observe at Least k Rare Events

Psychological Reports ◽

10.2466/pr0.1967.21.1.70 ◽

1967 ◽

Vol 21 (1) ◽

pp. 70-72 ◽

Cited By ~ 3

Author(s):

John E. Overall

Keyword(s):

Sample Size ◽

Poisson Distribution ◽

Special Interest ◽

Rare Events ◽

Total Sample ◽

Research Interest ◽

Occurrence Rate ◽

Total Sample Size

Tables based on the Poisson distribution are presented which specify the total sample size adequate to ensure (p > .95) that a specified number of rare events will be observed. The tables are useful in planning sampling surveys where the aim is to obtain at least a specified number of cases of special research interest. The tables can also be used as a basis for rejecting the hypothesis that the occurrence rate for a rare event exceeds some specified value in the population.
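The kind of calculation behind such tables can be sketched as follows. The example assumes that the number of rare events in a sample of size n is approximately Poisson with mean n × rate, and searches for the smallest n for which the probability of observing at least k events exceeds the chosen level. It is a reconstruction of the general idea, not the procedure used to build the published tables; the function names, the linear search, and the example rate are assumptions.

```python
import math


def poisson_sf(k, mu):
    """P(X >= k) for X ~ Poisson(mu), with mu > 0."""
    below = sum(math.exp(-mu + j * math.log(mu) - math.lgamma(j + 1)) for j in range(k))
    return 1.0 - below


def min_sample_size(k, rate, confidence=0.95):
    """Smallest n with P(at least k events) > confidence, assuming a Poisson(n * rate) count."""
    n = 1
    while poisson_sf(k, n * rate) <= confidence:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical occurrence rate of 1 in 100.
    for k in (1, 3, 5):
        print(k, min_sample_size(k, 0.01))
```

For k = 1 the condition reduces to 1 - exp(-n × rate) > .95, that is, n × rate > ln 20 ≈ 3.0, so a rate of 1 in 100 calls for a sample of about 300.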
