Overdispersion study of poisson and zero-inflated poisson regression for some characteristics of the data on lamda, n, p

Author(s):  
Lili Puspita Rahayu ◽  
Kusman Sadik ◽  
Indahwati Indahwati

The Poisson distribution is a discrete distribution that is often used to model rare events. The data take the form of counts, that is, non-negative integers. One analysis used for modeling count data is Poisson regression. A violation of assumptions that often occurs in Poisson regression is overdispersion. One cause of overdispersion is an excess probability of zeros in the response variable. A model that can be used to overcome this overdispersion is zero-inflated Poisson (ZIP) regression. This research aimed to study overdispersion in Poisson and ZIP regression for several characteristics of the data. These characteristics were simulated by combining the Poisson distribution parameter (λ), the zero probability (p), and the sample size (n) of the response variable, and then comparing the Poisson and ZIP regression models. The simulation study showed that the larger λ, n, and p are, the better the ZIP model performs relative to Poisson regression. These simulation results are further supported by an exploration of the Pearson residuals in Poisson and ZIP regression.
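The link between excess zeros and overdispersion described in this abstract can be illustrated with a minimal sketch. It assumes an intercept-only setting (no covariates, unlike the regression models in the study), and the function names and the (λ, p) grid are hypothetical choices, not taken from the paper. For a ZIP variable the mean is (1 − p)λ and the variance is (1 − p)λ(1 + pλ), so the variance-to-mean ratio is 1 + pλ.

```python
import numpy as np

def simulate_zip(lam, p, n, rng):
    """Draw n zero-inflated Poisson counts: 0 with probability p, else Poisson(lam)."""
    structural_zero = rng.random(n) < p
    counts = rng.poisson(lam, size=n)
    return np.where(structural_zero, 0, counts)

def zip_dispersion_index(lam, p):
    """Theoretical variance-to-mean ratio of a ZIP variable: 1 + p * lam."""
    return 1.0 + p * lam

rng = np.random.default_rng(0)
# Hypothetical grid of (lam, p) values, chosen only for illustration.
for lam, p in [(1.0, 0.1), (5.0, 0.3), (10.0, 0.5)]:
    y = simulate_zip(lam, p, n=100_000, rng=rng)
    empirical = y.var() / y.mean()
    print(f"lam={lam}, p={p}: empirical={empirical:.2f}, "
          f"theory={zip_dispersion_index(lam, p):.2f}")
```

A ratio of 1 corresponds to a plain Poisson variable; the ratio grows with both λ and p, which is consistent with the direction of the simulation result reported above.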

2012 ◽  
Vol 57 (1) ◽  
Author(s):  
SEYED EHSAN SAFFAR ◽  
ROBIAH ADNAN ◽  
WILLIAM GREENE

A Poisson model is typically assumed for count data. In many cases the dependent variable contains many zeros, and because of these zeros its mean and variance are no longer equal. In fact, the variance of the dependent variable will be much larger than its mean, which is called over-dispersion. The Poisson model is therefore no longer suitable for this kind of data. Thus, a hurdle Poisson regression model is suggested to overcome the over-dispersion problem. Furthermore, the response variable in such cases is censored for some values. In this paper, a censored hurdle Poisson regression model is introduced for count data with many zeros. The model considers a response variable and one or more explanatory variables. The estimation of the regression parameters by the maximum likelihood method is discussed, and the goodness-of-fit of the regression model is examined. We study the effects of right censoring on the estimated parameters and their standard errors via an example.
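To make the likelihood behind a right-censored hurdle Poisson model concrete, here is a minimal sketch. It assumes a constant zero probability `pi0` and a constant rate `lam` (in the paper both would depend on explanatory variables), and the function names are hypothetical. An uncensored observation contributes P(Y = y); a right-censored one contributes P(Y ≥ y).

```python
import math

def hurdle_poisson_pmf(y, pi0, lam):
    """P(Y = y): mass pi0 at zero, zero-truncated Poisson(lam) scaled by (1 - pi0) above it."""
    if y == 0:
        return pi0
    log_pois = -lam + y * math.log(lam) - math.lgamma(y + 1)
    return (1.0 - pi0) * math.exp(log_pois) / (1.0 - math.exp(-lam))

def censored_loglik(ys, censored, pi0, lam):
    """Log-likelihood; censored[i] is True when ys[i] is only a lower bound (Y >= ys[i])."""
    total = 0.0
    for y, is_cens in zip(ys, censored):
        if is_cens:
            # Right-censored: contribute P(Y >= y) = 1 - sum_{k < y} P(Y = k).
            prob = 1.0 - sum(hurdle_poisson_pmf(k, pi0, lam) for k in range(y))
        else:
            prob = hurdle_poisson_pmf(y, pi0, lam)
        total += math.log(prob)
    return total

# Hypothetical data: the last two counts are only known to be at least 4.
ys = [0, 0, 1, 2, 0, 3, 4, 4]
censored = [False, False, False, False, False, False, True, True]
print(censored_loglik(ys, censored, pi0=0.4, lam=2.0))
```

Maximizing this function over the parameters (for example with a numerical optimizer) would give maximum likelihood estimates under these simplifying assumptions.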


2020 ◽  
Vol 10 (11) ◽  
pp. 4177-4190
Author(s):  
Osval Antonio Montesinos-López ◽  
José Cricelio Montesinos-López ◽  
Pawan Singh ◽  
Nerida Lozano-Ramirez ◽  
Alberto Barrón-López ◽  
...  

The paradigm called genomic selection (GS) is a revolutionary way of developing new plants and animals. It is a predictive methodology, since it uses learning methods to perform its task. Unfortunately, there is no universal model that can be used for all types of predictions; for this reason, specific methodologies are required for each type of output (response variable). Since there is a lack of efficient methodologies for multivariate count data outcomes, this paper proposes a multivariate Poisson deep neural network (MPDN) model for the genomic prediction of several count outcomes simultaneously. The MPDN model uses the negative log-likelihood of a Poisson distribution as its loss function, the rectified linear unit (ReLU) activation function in the hidden layers to capture nonlinear patterns, and the exponential activation function in the output layer to produce outputs on the same scale as the counts. The proposed MPDN model was compared to conventional generalized Poisson regression models and univariate Poisson deep learning models on two experimental count data sets. We found that the proposed MPDN outperformed the univariate Poisson deep neural network models but did not outperform, in terms of prediction, the univariate generalized Poisson regression models. All deep learning models were implemented with TensorFlow as the back-end and Keras as the front-end, which allows these models to be implemented on moderate and large data sets, a significant advantage over previous GS models for multivariate count data.
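The loss described in this abstract can be sketched in a few lines of NumPy. This is not the authors' TensorFlow/Keras implementation; it only illustrates, under the assumption that the network's last linear layer produces a value `eta` per outcome, how an exponential output activation combines with the Poisson negative log-likelihood. The function name and the toy data are hypothetical.

```python
import numpy as np

def poisson_nll(y, eta):
    """Mean Poisson negative log-likelihood for linear predictor eta.

    The log(y!) term is dropped because it does not depend on the model.
    """
    mu = np.exp(eta)  # exponential output activation keeps predictions positive
    return np.mean(mu - y * eta)

# Toy example: 3 individuals, 2 count outcomes predicted simultaneously.
y = np.array([[0, 3], [2, 1], [5, 0]], dtype=float)
eta_close = np.log(y + 0.5)    # predictions near the observed counts
eta_flat = np.zeros_like(y)    # predicts mu = 1 everywhere
print(poisson_nll(y, eta_close), poisson_nll(y, eta_flat))
```

Averaging over both columns is what makes the loss multivariate in this sketch: a single network output layer with one unit per outcome is trained against all count traits at once.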


2021 ◽  
Vol 10 (4) ◽  
pp. 157
Author(s):  
Chedly Gelin Louzayadio ◽  
Rodnellin Onesime Malouata ◽  
Michel Diafouka Koukouatikissa

In this paper, we present a new weighted Poisson distribution for modeling underdispersed count data. Weighted Poisson distributions occur naturally in contexts where the probability that a particular observation of a Poisson variable enters the sample is multiplied by some non-negative weight function. Suppose a realization y of a Poisson random variable Y enters the investigator’s record with probability proportional to w(y). Clearly, the recorded y is then not an observation on Y but on the random variable Yw, which is said to be the weighted version of Y. The new distribution is a two-parameter member of the exponential family; it includes and generalizes the Poisson distribution by weighting. It is a discrete distribution that is more flexible than other weighted Poisson distributions that have been proposed for modeling underdispersed count data, for example the extended Poisson distribution (Dimitrov and Kolev, 2000). We present some moment properties and estimate its parameters. One classical example is considered to compare the fit of this new distribution with that of the extended Poisson distribution.
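The weighting mechanism can be sketched numerically: the weighted version Yw has P(Yw = y) = w(y) P(Y = y) / E[w(Y)]. The example below is only an illustration and does not use the weight function proposed in the paper, which the abstract does not specify. It assumes the simple size-biased weight w(y) = y, for which Yw is distributed as 1 + Poisson(λ), so the mean is λ + 1 and the variance is λ (underdispersed). The function names and the truncation point `support_max` are hypothetical.

```python
import math

def weighted_poisson_pmf(lam, weight, support_max=200):
    """Return [P(Yw = y) for y in 0..support_max], proportional to weight(y) * Poisson(y; lam)."""
    unnorm = []
    for y in range(support_max + 1):
        log_pois = -lam + y * math.log(lam) - math.lgamma(y + 1)
        unnorm.append(weight(y) * math.exp(log_pois))
    norm = sum(unnorm)  # approximates E[w(Y)] on the truncated support
    return [u / norm for u in unnorm]

def mean_and_variance(pmf):
    """Mean and variance of a distribution given as a list of probabilities on 0, 1, 2, ..."""
    mean = sum(y * p for y, p in enumerate(pmf))
    var = sum((y - mean) ** 2 * p for y, p in enumerate(pmf))
    return mean, var

# Illustrative size-biased weight w(y) = y (an assumption, not the paper's weight).
pmf = weighted_poisson_pmf(lam=3.0, weight=lambda y: y)
mean, var = mean_and_variance(pmf)
print(f"mean={mean:.3f}, variance={var:.3f}, variance/mean={var / mean:.3f}")
```

With the constant weight w(y) = 1 the same function returns the ordinary Poisson distribution, which is the sense in which a weighted family includes the Poisson as a special case.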


2021 ◽  
Author(s):  
Bodo Winter ◽  
Paul-Christian Bürkner

Count data is prevalent in many different areas of linguistics, such as when counting words, syntactic constructions, discourse particles, case markers, or speech errors. The Poisson distribution is the canonical distribution for characterizing count data with no or unknown upper bound. Whereas logistic regression is very common in linguistics, Poisson regression is little known. This tutorial introduces readers to foundational concepts needed for Poisson regression, followed by a hands-on tutorial using the R package brms. We discuss a dataset where Catalan and Korean speakers change the frequency of their co-speech gestures as a function of politeness contexts. This dataset also involves exposure variables (the incorporation of time to deal with unequal intervals) and overdispersion (excess variance). Altogether, we hope that more linguists will consider Poisson regression for the analysis of count data.
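The role of an exposure variable can be sketched outside of brms as well. The example below is a plain maximum likelihood fit in NumPy, not the Bayesian workflow of the tutorial, and it uses simulated data rather than the gesture dataset. It assumes the model log(E[y]) = log(exposure) + Xβ, where the log exposure enters as an offset with a fixed coefficient of 1; the function name, the simulated predictor, and the coefficient values are hypothetical.

```python
import numpy as np

def fit_poisson_offset(X, y, exposure, n_iter=25):
    """Fit log(E[y]) = log(exposure) + X @ beta by iteratively reweighted least squares."""
    offset = np.log(exposure)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        eta = X @ beta + offset
        mu = np.exp(eta)
        # Working response with the offset removed; weights equal the Poisson variance mu.
        z = (eta - offset) + (y - mu) / mu
        XtW = X.T * mu
        beta = np.linalg.solve(XtW @ X, XtW @ z)
    return beta

# Simulated data (all values below are illustrative assumptions).
rng = np.random.default_rng(1)
n = 2000
x = rng.uniform(-1, 1, size=n)
X = np.column_stack([np.ones(n), x])
exposure = rng.uniform(10, 60, size=n)   # e.g. observation time per recording
true_beta = np.array([-1.0, 0.5])
y = rng.poisson(exposure * np.exp(X @ true_beta))
print(fit_poisson_offset(X, y, exposure))
```

Because the offset absorbs the unequal observation intervals, the coefficients describe rates per unit of exposure rather than raw counts.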


1967 ◽  
Vol 21 (1) ◽  
pp. 70-72 ◽  
Author(s):  
John E. Overall

Tables based upon the Poisson distribution are presented which specify the total sample size adequate to ensure (p > .95) that a certain specified number of rare events will be observed. The tables are useful in planning sampling surveys where special interest is in obtaining at least a specified number of cases of special research interest. The tables can also be used as a basis for rejecting the hypothesis that the occurrence rate for a rare event exceeds some specified value in the population.
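The kind of calculation behind such tables can be sketched as follows. The sketch assumes that the number of rare events in a sample of size n is Poisson with mean n × rate and searches for the smallest n for which the probability of observing at least k events exceeds 0.95. The function names and the example rate are hypothetical and do not reproduce the published tables.

```python
import math

def prob_at_least(k, mean):
    """P(X >= k) for X ~ Poisson(mean)."""
    cdf = sum(math.exp(-mean + j * math.log(mean) - math.lgamma(j + 1))
              for j in range(k))
    return 1.0 - cdf

def min_sample_size(k, rate, confidence=0.95):
    """Smallest n with P(Poisson(n * rate) >= k) > confidence."""
    n = 1
    while prob_at_least(k, n * rate) <= confidence:
        n += 1
    return n

# Example: events occurring at an assumed rate of 1 in 100.
for k in (1, 5, 10):
    print(k, min_sample_size(k=k, rate=0.01))
```

Read in the other direction, the same function supports the hypothesis test mentioned in the abstract: if fewer than k events are seen in a sample of the returned size, an occurrence rate at least as large as `rate` is implausible at the 5% level under the Poisson assumption.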

