Analysis of correlated count data using generalised linear mixed models exemplified by field data on aggressive behaviour of boars

Population-averaged and subject-specific models are available to evaluate count data when repeated observations per subject are present. The latter are also known in the literature as generalised linear mixed models (GLMMs). In a GLMM, repeated measures are taken into account explicitly through random animal effects in the linear predictor. In this paper the relevant GLMMs are presented, based on a conditional Poisson or negative binomial distribution of the response variable given the random animal effects. Equations for the repeatability of count data are derived assuming a normal distribution and a log-gamma distribution for the random animal effects. Using count data on aggressive behaviour events of pigs (barrows, sows and boars) in mixed-sex housing, we demonstrate the use of the Poisson »log-gamma intercept« model, the Poisson »normal intercept« model and the negative binomial »normal intercept« model. Since not all count data can safely be regarded as Poisson or negative binomially distributed, questions of model selection and model checking are examined. Based on the example, we also interpret the least squares means estimated on the link scale as well as on the response scale. Options provided by the SAS procedure NLMIXED for estimating model parameters and marginal expected values are presented.
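The distinction between the link scale and the response scale can be illustrated with a small sketch. It is not the authors' NLMIXED code; it only uses the standard lognormal result for a Poisson model with a normally distributed random intercept, and the function names and the parameters `eta` (linear predictor) and `sigma2` (random-effect variance) are assumptions made for illustration.

```python
import math

def marginal_moments_poisson_normal(eta, sigma2):
    """Marginal mean and variance of Y when
    Y | u ~ Poisson(exp(eta + u)) and u ~ Normal(0, sigma2)."""
    mean = math.exp(eta + sigma2 / 2.0)            # E[exp(eta + u)]
    variance = mean + mean ** 2 * (math.exp(sigma2) - 1.0)
    return mean, variance

def marginal_mean_numeric(eta, sigma2, n=20001, width=10.0):
    """Check the closed-form mean by trapezoidal integration over u."""
    sd = math.sqrt(sigma2)
    lo, hi = -width * sd, width * sd
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        u = lo + i * h
        density = math.exp(-u * u / (2.0 * sigma2)) / (sd * math.sqrt(2.0 * math.pi))
        weight = 0.5 if i in (0, n - 1) else 1.0   # trapezoid end points
        total += weight * math.exp(eta + u) * density
    return total * h

# Hypothetical values: eta = log(2) on the link scale, sigma2 = 0.5.
closed_mean, closed_var = marginal_moments_poisson_normal(math.log(2.0), 0.5)
numeric_mean = marginal_mean_numeric(math.log(2.0), 0.5)
```

Back-transforming the link-scale estimate alone gives exp(eta), the expected count for an animal with random effect zero, whereas the marginal (population-averaged) expected value is larger by the factor exp(sigma2 / 2).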

Archives Animal Breeding, 2015, Vol 57 (1), pp. 1-19. DOI: 10.7482/0003-9438-57-026

Author(s): Norbert Mielenz, Joachim Spilke, Eberhard von Borell

Keyword(s): Negative Binomial Distribution, Count Data, Binomial Distribution, Mixed Models, Aggressive Behaviour, Negative Binomial, Linear Mixed Models, Model Parameters, Generalised Linear Mixed Models, Animal Effects

COMPARATIVE STUDY OF CATTLE TICK RESISTANCE USING GENERALIZED LINEAR MIXED MODELS

REVISTA BRASILEIRA DE BIOMETRIA, 2019, Vol 37 (1), pp. 41. DOI: 10.28951/rbb.v37i1.341

Author(s): Amanda Marchi MAIORANO, Thiago Santos MOTA, Ana Carolina VERDUGO, Ricardo Antonio da Silva FARIA, Beatriz Pressi Molina da SILVA, ...

Keyword(s): Negative Binomial Distribution, Count Data, Binomial Distribution, Mixed Models, Negative Binomial, Bos Taurus, Generalized Linear Mixed Models, Linear Mixed Models, Tick Count, Tick Resistance

Tick resistance in Bos taurus indicus (Nelore) and Bos taurus taurus (Simmental and Caracu) subspecies was compared using generalized linear mixed models (GLMMs) with Poisson and negative binomial distributions. Nelore animals (NE) are known to be more resistant than B. t. taurus animals. A difference in tick resistance between the Simmental (SI) and Caracu (CA) breeds has not been reported previously. Three artificial tick infestations were conducted to evaluate tick resistance in these breeds. The statistical aim of the present study was to present GLMMs as alternative models for the evaluation of tick count data. Analysis of tick resistance by a GLMM with negative binomial distribution has not been assessed previously. The analyses were performed with the GLIMMIX procedure of SAS. The results showed that a GLMM with negative binomial distribution is appropriate for evaluating tick count data with an excess of zero observations, avoiding overdispersion problems. Finally, considering multiple comparisons with the Bonferroni test, different patterns of tick infestation were observed for the studied breeds, suggesting that NE is the most resistant breed, followed by CA.
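The Bonferroni adjustment mentioned above can be sketched as follows. With three breeds there are three pairwise comparisons, and each raw p-value is multiplied by that number and capped at 1. The p-values below are invented purely for illustration and are not results from the study; the function name is likewise an assumption.

```python
from itertools import combinations

def bonferroni_adjust(p_values):
    """Multiply each raw p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return {pair: min(1.0, p * m) for pair, p in p_values.items()}

breeds = ["NE", "CA", "SI"]
pairs = list(combinations(breeds, 2))          # the three pairwise contrasts
raw = dict(zip(pairs, [0.004, 0.010, 0.300]))  # hypothetical raw p-values
adjusted = bonferroni_adjust(raw)

for pair, p in adjusted.items():
    print(pair, round(p, 3))
```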

Suitability of Several Statistical Models to Simulate Observed Distribution of Sample Test Results in Inspections of Aflatoxin-Contaminated Peanut Lots

Journal of AOAC International, 1996, Vol 79 (4), pp. 981-988. DOI: 10.1093/jaoac/79.4.981. Cited By ~ 7

Author(s): Thomas Whitaker, Francis Giesbrecht, Jeremy Wu

Keyword(s): Parameter Estimation, Negative Binomial Distribution, Binomial Distribution, Statistical Models, Negative Binomial, Estimation Method, Likelihood Method, Model Parameters, Test Results, Parameter Estimation Method

Abstract. The acceptability of 10 theoretical distributions to simulate observed distribution of sample aflatoxin test results was evaluated by using 2 parameter estimation methods and 3 goodness of fit (GOF) tests. All theoretical distributions were compared with 120 observed distributions of aflatoxin test results of farmers' stock peanuts. For a given parameter estimation method and GOF test, the negative binomial distribution had the highest percentage of statistically acceptable fits. The lognormal and Poisson-gamma (gamma shape parameter = 0.5) distributions had slightly fewer but an almost equal percentage of acceptable fits. For the 3 most acceptable statistical models, the negative binomial had the greatest percentage of best or closest fits. Both the parameter estimation method and the GOF test had an influence on which theoretical distribution had the largest number of acceptable fits. All theoretical distributions, except the negative binomial distribution, had more acceptable fits when model parameters were determined by the maximum likelihood method. The negative binomial had slightly more acceptable fits when model parameters were estimated by the method of moments. The results also demonstrated the importance of using the same GOF test for comparing the acceptability of several theoretical distributions.
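For readers unfamiliar with the method of moments, a minimal sketch for the negative binomial is given here. It assumes the common parameterisation with mean m and variance m + m²/k and solves for k from the sample mean and variance. This is not the authors' procedure, and the sample values and function name are hypothetical.

```python
from statistics import mean, variance

def nb_method_of_moments(values):
    """Return (m, k) for a negative binomial with variance m + m**2 / k,
    matched to the sample mean and the sample variance."""
    m = mean(values)
    s2 = variance(values)
    if s2 <= m:
        # No finite positive k exists unless the data are overdispersed.
        raise ValueError("sample variance must exceed the sample mean")
    k = m * m / (s2 - m)
    return m, k

sample = [0, 0, 1, 1, 2, 8]   # hypothetical values, not aflatoxin results
m_hat, k_hat = nb_method_of_moments(sample)
```

A maximum likelihood fit would instead maximise the negative binomial log-likelihood in k numerically, which is why the two estimation methods can rank candidate distributions differently.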

Fitting the truncated negative binomial distribution to count data

Environmental and Ecological Statistics, 2016, Vol 23 (3), pp. 359-385. DOI: 10.1007/s10651-016-0343-1. Cited By ~ 3

Author(s): Claude Manté, Saikou Oumar Kidé, Anne-Francoise Yao-Lafourcade, Bastien Mérigot

Keyword(s): Negative Binomial Distribution, Count Data, Binomial Distribution, Negative Binomial

The Lindley negative-binomial distribution: Properties, estimation and applications to lifetime data

Mathematica Slovaca, 2020, Vol 70 (4), pp. 917-934. DOI: 10.1515/ms-2017-0404

Author(s): Muhammad Mansoor, Muhammad Hussain Tahir, Gauss M. Cordeiro, Sajid Ali, Ayman Alzaatreh

Keyword(s): Negative Binomial Distribution, Binomial Distribution, Hazard Rate, Negative Binomial, Real Data, Moment Generating Function, Estimation Methods, Model Parameters, Proposed Model, Rate Functions

Abstract. A generalization of the Lindley distribution, namely the Lindley negative-binomial distribution, is introduced. The Lindley and the exponentiated Lindley distributions are considered as sub-models of the proposed distribution. The proposed model has flexible density and hazard rate functions. The density function can be decreasing, right-skewed, left-skewed or approximately symmetric. The hazard rate function can take various shapes, including increasing, decreasing and bathtub. Furthermore, the survival and hazard rate functions have closed-form representations, which make this model tractable for censored data analysis. Some general properties of the proposed model are studied, such as ordinary and incomplete moments, the moment generating function, mean deviations, and Lorenz and Bonferroni curves. The maximum likelihood and Bayesian estimation methods are used to estimate the model parameters. In addition, a small simulation study is conducted to evaluate the performance of the estimation methods. Two real data sets are used to illustrate the applicability of the proposed model.

A Novel Bayesian Outlier Score Based on the Negative Binomial Distribution for Detecting Aberrantly Expressed Genes in RNA-Seq Gene Expression Count Data

IEEE Access, 2021, pp. 1-1. DOI: 10.1109/access.2021.3082311

Author(s): Edin Salkovic, Halima Bensmail

Keyword(s): Gene Expression, Negative Binomial Distribution, Count Data, Binomial Distribution, Negative Binomial, RNA Seq

Fast zero-inflated negative binomial mixed modeling approach for analyzing longitudinal metagenomics data

Bioinformatics, 2020, Vol 36 (8), pp. 2345-2351. DOI: 10.1093/bioinformatics/btz973. Cited By ~ 2

Author(s): Xinyan Zhang, Nengjun Yi

Keyword(s): Count Data, Mixed Models, Negative Binomial, Linear Mixed Models, Human Microbiome, Real Data, R Package, Supplementary Information, Sequencing Data, Metagenomics Data

Abstract. Motivation: Longitudinal metagenomics data, including both 16S rRNA and whole-metagenome shotgun sequencing data, have enhanced our ability to understand the dynamic associations between the human microbiome and various diseases. However, analytic tools have not been fully developed to simultaneously address the main challenges of longitudinal metagenomics data, i.e. high dimensionality, dependence among samples and zero-inflation of observed counts. Results: We propose a fast zero-inflated negative binomial mixed modeling (FZINBMM) approach to analyze high-dimensional longitudinal metagenomic count data. The FZINBMM approach is based on zero-inflated negative binomial mixed models (ZINBMMs) for modeling longitudinal metagenomic count data and a fast EM-IWLS algorithm for fitting ZINBMMs. FZINBMM takes advantage of a commonly used procedure for fitting linear mixed models, which allows us to include various types of fixed and random effects and within-subject correlation structures and to analyze many taxa quickly. We found that FZINBMM was markedly more computationally efficient than, and statistically comparable with, two R packages, GLMMadaptive and glmmTMB, that use numerical integration to fit ZINBMMs. Extensive simulations and real data applications showed that FZINBMM outperformed previous methods, including linear mixed models, negative binomial mixed models and zero-inflated Gaussian mixed models. Availability and implementation: FZINBMM has been implemented in the R package NBZIMM, available in the public GitHub repository http://github.com//nyiuab//NBZIMM. Supplementary information: Supplementary data are available at Bioinformatics online.
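To make the term "zero-inflated negative binomial" concrete, here is a minimal sketch of its probability mass function, without random effects. It is not the EM-IWLS algorithm of FZINBMM and does not use the NBZIMM package; the function names and the parameters `mu` (mean), `theta` (size) and `pi_zero` (probability of a structural zero) are assumptions for illustration.

```python
import math

def nb_pmf(y, mu, theta):
    """Negative binomial probability of count y with mean mu and size theta."""
    log_p = (math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
             + theta * math.log(theta / (theta + mu))
             + y * math.log(mu / (theta + mu)))
    return math.exp(log_p)

def zinb_pmf(y, mu, theta, pi_zero):
    """Mixture of a point mass at zero (weight pi_zero) and a negative binomial."""
    base = (1.0 - pi_zero) * nb_pmf(y, mu, theta)
    return pi_zero + base if y == 0 else base

# Hypothetical parameter values.
mu, theta, pi_zero = 3.0, 1.5, 0.3
probs = [zinb_pmf(y, mu, theta, pi_zero) for y in range(500)]
zinb_mean = sum(y * p for y, p in enumerate(probs))
```

The zero-inflation weight raises the probability of a zero above what the negative binomial alone allows and scales the mean to (1 - pi_zero) * mu.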

Graphic Model-based Gene Regulatory Network Reconstruction using RNA Sequencing Count Data

JOURNAL OF ADVANCES IN BIOTECHNOLOGY, 2019, Vol 8, pp. 1078-1085. DOI: 10.24297/jbt.v8i0.8298

Author(s): Liliana Lopez-Kleine, Cristian Andres Gonzalez-Prieto

Keyword(s): RNA Sequencing, Negative Binomial Distribution, Count Data, Binomial Distribution, Regulatory Networks, Graphical Model, Negative Binomial, Sequencing Data, Model Based, Gene Regulatory

Interactions between genes, such as regulation, are best represented by gene regulatory networks (GRNs). These are often constructed from gene expression data. Few methods for constructing GRNs exist for RNA sequencing count data. One of the most widely used methods for microarray data is based on graphical Gaussian networks. Because count data follow different distributions, a method assuming that RNA sequencing counts follow a Poisson distribution has been proposed recently. Nevertheless, it has been argued that RNA sequencing counts are unlikely to be Poisson distributed because of overdispersion, so the negative binomial distribution is much more plausible. For this distribution, no model-based method for constructing GRNs has been proposed until now. Here, we present a graphical, model-based method for constructing GRNs that assumes a negative binomial distribution of the RNA sequencing count data. The R code is available upon request. We applied the proposed method to both simulated and real RNA sequencing count data. The resulting graph is shown and its descriptive measures were assessed, leading to some interesting biological conclusions. We confirm that using the negative binomial distribution to fit the model is suitable because RNA sequencing data exhibit overdispersion.
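Overdispersion, the reason given above for preferring the negative binomial, can be checked informally with the variance-to-mean ratio, which is close to 1 for Poisson data. The sketch below is a generic illustration rather than the authors' R code; the counts are invented and the function names are assumptions.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Sample variance divided by sample mean (about 1 under a Poisson model)."""
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m

def nb_variance(mu, theta):
    """Variance of a negative binomial with mean mu and size theta."""
    return mu + mu ** 2 / theta

toy_counts = [0, 3, 1, 48, 7, 0, 112, 5, 19, 2]   # invented read counts
index = dispersion_index(toy_counts)
print(f"variance-to-mean ratio: {index:.1f}")
```

A ratio far above 1, as for these toy counts, is what the extra quadratic term mu ** 2 / theta in the negative binomial variance is meant to absorb.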
