count distribution
Recently Published Documents



Author(s):  
Silan Li ◽  
Xiaoya Hu ◽  
Tao Jiang ◽  
Rongqing Zhang ◽  
Liuqing Yang ◽  
...  

2021 ◽  
Vol 104 (12) ◽  
Author(s):  
Florian List ◽  
Nicholas L. Rodd ◽  
Geraint F. Lewis

2021 ◽  
Vol 13 (3) ◽  
pp. 37-48
Author(s):  
Surajit Pal ◽  
Susanta Kumar Gauri

High-quality processes usually produce more zeros than expected under the chance variation of the underlying Poisson or other count distribution. Such processes are therefore usually referred to as zero-inflated processes, and they are commonly modelled by the zero-inflated Poisson (ZIP) or zero-inflated negative binomial (ZINB) distribution. In a manufacturing setup, evaluating the process capability index of a zero-inflated process can be useful in many ways, e.g. i) predicting how well the process will hold the specifications, ii) selecting between competing vendors, and iii) assisting product developers/designers in modifying the process. However, researchers have paid very little attention to this aspect of zero-inflated processes. Only one such attempt is reported in the literature, and it does not always represent the true capabilities of zero-inflated processes; it may sometimes give a very misleading impression of the capability of the process concerned. In this article, the concept of Borges and Ho (2001) is applied to zero-inflated processes, and a new approach for computing the process capability index of zero-inflated processes is developed. The proposed method reveals the true capabilities of zero-inflated processes consistently. The application of the proposed approach and its effectiveness are illustrated using two datasets published by past researchers.
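As a rough illustration of what "more zeros than expected" means, the sketch below compares the observed share of zeros with the share a plain Poisson model with the same mean would predict, and writes out the ZIP probability mass function. It is not the capability index from the article; the defect counts and the parameter names `pi` (extra zero mass) and `lam` (Poisson rate) are hypothetical and chosen only for illustration.

```python
import math


def poisson_zero_prob(lam):
    """Probability of a zero under a plain Poisson(lam) model."""
    return math.exp(-lam)


def zip_pmf(k, pi, lam):
    """ZIP pmf: with probability pi emit a structural zero, else draw Poisson(lam)."""
    base = (1.0 - pi) * math.exp(-lam) * lam ** k / math.factorial(k)
    return pi + base if k == 0 else base


# Hypothetical defect counts per inspected unit (illustration only).
counts = [0] * 80 + [1] * 10 + [2] * 6 + [3] * 4

sample_mean = sum(counts) / len(counts)
observed_zero = counts.count(0) / len(counts)
expected_zero = poisson_zero_prob(sample_mean)

print(f"observed zero share:           {observed_zero:.3f}")
print(f"Poisson-expected zero share:   {expected_zero:.3f}")
```

When the observed share is clearly above the Poisson-expected share, as in this toy sample, a zero-inflated model is a natural candidate.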


2021 ◽  
Author(s):  
Yuge Wang ◽  
Hongyu Zhao

Advances in single-cell RNA sequencing (scRNA-seq) have led to successes in discovering novel cell types and understanding cellular heterogeneity among complex cell populations through cluster analysis. However, cluster analysis cannot reveal the continuous spectrum of states and the underlying gene expression programs (GEPs) shared across cell types. We introduce scAAnet, an autoencoder for single-cell non-linear archetypal analysis, to identify GEPs and infer the relative activity of each GEP across cells. We use a count distribution-based loss term to account for the sparsity and overdispersion of the raw count data and add an archetypal constraint to the loss function of scAAnet. We first show through simulations that scAAnet outperforms existing methods for archetypal analysis across different metrics. We then demonstrate the ability of scAAnet to extract biologically meaningful GEPs using publicly available scRNA-seq datasets, including a pancreatic islet dataset, a lung idiopathic pulmonary fibrosis dataset and a prefrontal cortex dataset.
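The abstract does not say which count distribution the loss uses. As one plausible example of a count distribution-based loss, the sketch below computes a negative binomial negative log-likelihood in the mean/dispersion parametrization (variance = mu + mu^2 / r). The function names and the choice of the negative binomial are assumptions for illustration, not the scAAnet implementation.

```python
import math


def nb_log_pmf(y, mu, r):
    """Log pmf of a negative binomial with mean mu > 0 and dispersion r > 0."""
    return (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(r / (r + mu))
        + y * math.log(mu / (r + mu))
    )


def nb_nll(counts, means, r):
    """Average negative log-likelihood of observed counts given predicted means."""
    total = sum(nb_log_pmf(y, mu, r) for y, mu in zip(counts, means))
    return -total / len(counts)


# Hypothetical raw counts and decoder-predicted means (illustration only).
loss = nb_nll([0, 1, 5], [1.0, 1.0, 4.0], r=2.0)
print(f"NB loss: {loss:.4f}")
```

Smaller values of `r` allow more overdispersion; as `r` grows the distribution approaches the Poisson.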


Author(s):  
Chénangnon Frédéric Tovissodé ◽  
Romain Lucas Glèlè Kakaï

The normal and Poisson distribution assumptions in the normal-Poisson mixed effects regression model are often too restrictive for real count data. Several works have independently relaxed either the Poisson conditional distribution assumption for counts or the normal distribution assumption for random effects. This work couples some recent advances in these two directions to develop a skew t–discrete gamma regression model in which the count outcomes have full dispersion flexibility and random effects can be skewed and heavy-tailed. Inference in the model is achieved by maximum likelihood using pseudo-adaptive Gaussian quadrature. The use of the proposal is demonstrated on a popular owl sibling negotiation dataset. For this example, the proposed approach appears to outperform models based on normal random effects and the Poisson or negative binomial count distribution.


Author(s):  
Cindy Xin Feng

Count data with excessive zeros are frequently encountered in practice. For example, the number of health service visits often includes many zeros, representing patients with no utilization during the follow-up period. A common feature of this type of data is that the count measure tends to have more zeros than a common count distribution, such as the Poisson or negative binomial, can accommodate. Zero-inflated (ZI) or hurdle models are often used to fit such data. Despite the increasing popularity of ZI and hurdle models, the fundamental differences between these two types of models remain under-investigated. In this article, we reviewed the zero-inflated and hurdle models and highlighted their differences in terms of their data-generating processes. We also conducted simulation studies to evaluate the performance of both types of models. The final choice of regression model should be made after a careful assessment of goodness of fit and should be tailored to the particular data in question.
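The difference between the two data-generating processes can be sketched in a few lines. In the zero-inflated process a zero can come from either the structural-zero branch or the Poisson branch, whereas in the hurdle process every zero comes from the binary part and positive counts come from a zero-truncated Poisson. The parameter values, function names, and the use of a Poisson count part are assumptions made for this illustration.

```python
import math
import random


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small lam."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def sample_zip(pi, lam, rng):
    """Zero-inflated Poisson: structural zero with prob. pi, else Poisson(lam)."""
    if rng.random() < pi:
        return 0
    return sample_poisson(lam, rng)  # may still be zero (sampling zero)


def sample_hurdle(p_zero, lam, rng):
    """Poisson hurdle: zero with prob. p_zero, else zero-truncated Poisson(lam)."""
    if rng.random() < p_zero:
        return 0
    while True:  # rejection sampling for the zero-truncated part
        k = sample_poisson(lam, rng)
        if k > 0:
            return k


rng = random.Random(0)
n = 20000
zi_draws = [sample_zip(0.3, 1.5, rng) for _ in range(n)]
hurdle_draws = [sample_hurdle(0.3, 1.5, rng) for _ in range(n)]
zi_zero_share = zi_draws.count(0) / n
hurdle_zero_share = hurdle_draws.count(0) / n
print(f"ZIP zero share:    {zi_zero_share:.3f}")
print(f"hurdle zero share: {hurdle_zero_share:.3f}")
```

With the same nominal zero parameter, the ZIP sample contains more zeros than the hurdle sample, because the Poisson branch contributes additional zeros on top of the structural ones.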


2021 ◽  
Vol 14 ◽  
pp. 1-8
Author(s):  
Yook-Ngor Phang ◽  
Seng-Huat Ong ◽  
Yeh-Ching Low

The Poisson inverse Gaussian and generalized Poisson distributions are widely used in modelling overdispersed count data, which are commonly found in healthcare, insurance, engineering, econometrics and ecology. The inverse trinomial distribution is a relatively new count distribution arising from a one-dimensional random walk model (Shimizu & Yanagimoto, 1991). The Poisson inverse Gaussian distribution is a popular count model that has been proposed as an alternative to the negative binomial distribution. The inverse trinomial and generalized Poisson models possess a common characteristic of having a cubic variance function, while the Poisson inverse Gaussian has a quadratic variance function. The nature of the variance function seems to be an important property in modelling overdispersed count data. Hence it is of interest to be able to select among the three models in practical applications. This paper considers discrimination among the three models based on the likelihood ratio statistic and computes the probability of correct selection via Monte Carlo simulation.


Author(s):  
Chenangnon Frédéric Tovissodé ◽  
Romain Glele Kakai

It is quite easy to stochastically distort an original count variable to obtain a new count variable with relatively more variability than the original variable. Many popular overdispersion models (variance greater than mean) can indeed be obtained by mixtures, compounding or randomly stopped sums. There is no analogous stochastic mechanism for the construction of underdispersed count variables (variance less than mean), starting from an original count distribution of interest. This work proposes a generic method to stochastically distort an original count variable to obtain a new count variable with relatively less variability than the original variable. The proposed mechanism, termed condensation, attracts probability masses from the quantiles in the tails of the original distribution and redirects them toward quantiles around the expected value. If the original distribution can be simulated, then the simulation of variates from a condensed distribution is straightforward. Moreover, condensed distributions have a simple mean-parametrization, a characteristic useful in a count regression context. An application to the negative binomial distribution resulted in a distribution allowing under-, equi- and overdispersion. In addition to graphical insights, fields of application of special cases of condensed Poisson and condensed negative binomial distributions are pointed out as an indication of the potential of condensation for a flexible analysis of count data.


Author(s):  
Yixuan Zou ◽  
Jan Hannig ◽  
Derek S. Young

Zero-inflated and hurdle models are widely applied to count data possessing excess zeros, where they can simultaneously model the process by which the zeros were generated and potentially help mitigate the effects of overdispersion relative to the assumed count distribution. Which model to use depends on how the zeros are generated: zero-inflated models add an additional probability mass at zero, while hurdle models are two-part models composed of a degenerate distribution for the zeros and a zero-truncated distribution. Developing confidence intervals for such models is challenging since no closed-form function is available to calculate the mean. In this study, generalized fiducial inference is used to construct confidence intervals for the means of zero-inflated Poisson and Poisson hurdle models. The proposed methods are assessed by an intensive simulation study. An illustrative example demonstrates the inference methods.
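For reference, the two model families described above can be written out for the Poisson case. The symbols are assumptions introduced here, not notation from the article: \pi is the extra mass at zero in the zero-inflated model, p_0 is the probability of a zero in the hurdle model, and \lambda is the Poisson rate.

```latex
% Zero-inflated Poisson: extra mass \pi added at zero
P_{\mathrm{ZIP}}(Y = y) =
\begin{cases}
  \pi + (1 - \pi)\, e^{-\lambda}, & y = 0, \\[4pt]
  (1 - \pi)\, \dfrac{e^{-\lambda} \lambda^{y}}{y!}, & y = 1, 2, \dots
\end{cases}

% Poisson hurdle: degenerate part at zero plus zero-truncated Poisson
P_{\mathrm{PH}}(Y = y) =
\begin{cases}
  p_0, & y = 0, \\[4pt]
  (1 - p_0)\, \dfrac{e^{-\lambda} \lambda^{y}}{y!\,\bigl(1 - e^{-\lambda}\bigr)}, & y = 1, 2, \dots
\end{cases}
```

In the zero-inflated form the zero probability can never fall below the Poisson value e^{-\lambda}, while the hurdle form allows p_0 to be either larger or smaller than it.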


Author(s):  
Afida Nurul Hilma ◽  
Dian Lestari ◽  
Sindy Devila

To obtain a counting distribution that can handle data with no zero counts, the zero-truncated Poisson-Lindley distribution was developed. It can handle such data under both over-dispersion and under-dispersion. In this paper, characteristics of the zero-truncated Poisson-Lindley distribution are obtained, and the distribution parameters are estimated using the maximum likelihood method. An application of the model to real data is then given.
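A minimal sketch of maximum likelihood estimation for this distribution follows. It assumes the standard one-parameter form of the zero-truncated Poisson-Lindley pmf, P(X = x) = theta^2 (x + theta + 2) / ((theta^2 + 3 theta + 1)(theta + 1)^x) for x = 1, 2, ..., obtained by truncating the Poisson-Lindley pmf at zero. The golden-section search, the search bounds, and the sample are choices made here for illustration, not the paper's procedure or data.

```python
import math


def ztpl_log_pmf(x, theta):
    """Log pmf of the zero-truncated Poisson-Lindley distribution, x >= 1."""
    return (
        2.0 * math.log(theta)
        + math.log(x + theta + 2.0)
        - math.log(theta ** 2 + 3.0 * theta + 1.0)
        - x * math.log(theta + 1.0)
    )


def ztpl_loglik(data, theta):
    """Log-likelihood of a sample of positive counts."""
    return sum(ztpl_log_pmf(x, theta) for x in data)


def ztpl_mle(data, lo=1e-3, hi=50.0, tol=1e-8):
    """Maximize the log-likelihood over theta by golden-section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if ztpl_loglik(data, c) > ztpl_loglik(data, d):
            b = d  # maximum lies in [a, d]
        else:
            a = c  # maximum lies in [c, b]
    return (a + b) / 2.0


# Hypothetical sample with no zero counts (illustration only).
data = [1] * 50 + [2] * 25 + [3] * 15 + [4] * 7 + [5] * 3
theta_hat = ztpl_mle(data)
print(f"estimated theta: {theta_hat:.4f}")
```

Working with the log pmf directly avoids overflow in the (theta + 1)^x term for large counts.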

