Modeling Longitudinal Microbiome Compositional Data: A Two-Part Linear Mixed Model with Shared Random Effects

Author(s):  
Yongli Han ◽  
Courtney Baker ◽  
Emily Vogtmann ◽  
Xing Hua ◽  
Jianxin Shi ◽  
...  


Biostatistics ◽  
2014 ◽  
Vol 15 (4) ◽  
pp. 706-718 ◽  
Author(s):  
Danping Liu ◽  
Paul S. Albert

Abstract In disease screening, the combination of multiple biomarkers often substantially improves the diagnostic accuracy over a single marker. This is particularly true for longitudinal biomarkers, where the individual trajectory may improve the diagnosis. We propose a pattern mixture model (PMM) framework to predict a binary disease status from a longitudinal sequence of biomarkers. The marker distribution given the disease status is estimated from a linear mixed effects model. A likelihood ratio statistic is computed as the combination rule, which is optimal in the sense of maximizing the receiver operating characteristic (ROC) curve under the correctly specified mixed effects model. The individual disease risk score is then estimated by Bayes' theorem, and we derive the analytical form of its 95% confidence interval. We show that this PMM is an approximation to the shared random effects (SRE) model proposed by Albert (2012. A linear mixed model for predicting a binary event from longitudinal data under random effects mis-specification. Statistics in Medicine 31(2), 143–154). Further, in extensive simulation studies, we found that the PMM is more robust than the SRE model across a wide class of models. This new PMM approach for combining biomarkers is motivated by and applied to a fetal growth study, where the interest is in predicting macrosomia using longitudinal ultrasound measurements.
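A schematic of the combination rule described in the abstract, in generic notation (the symbols below are illustrative and are not taken from the paper): the likelihood ratio combines the longitudinal markers, and Bayes' theorem converts it into an individual risk score.

```latex
% Illustrative notation only.
% Y_i = (Y_{i1}, ..., Y_{i n_i}) : longitudinal marker sequence for subject i
% D_i in {0,1}                   : binary disease status, prevalence \pi = P(D_i = 1)
\[
\mathrm{LR}(Y_i) \;=\; \frac{f(Y_i \mid D_i = 1)}{f(Y_i \mid D_i = 0)},
\]
% Each conditional density is implied by a linear mixed effects model fitted
% within the corresponding disease group (the pattern mixture formulation).
% Thresholding LR(Y_i) yields the optimal ROC curve when the mixed model is
% correctly specified, and Bayes' theorem gives the individual risk score
\[
P(D_i = 1 \mid Y_i) \;=\; \frac{\pi \,\mathrm{LR}(Y_i)}{\pi \,\mathrm{LR}(Y_i) + (1 - \pi)}.
\]
```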


2020 ◽  
pp. 1-37
Author(s):  
Tal Yarkoni

Abstract Most theories and hypotheses in psychology are verbal in nature, yet their evaluation overwhelmingly relies on inferential statistical procedures. The validity of the move from qualitative to quantitative analysis depends on the verbal and statistical expressions of a hypothesis being closely aligned—that is, that the two must refer to roughly the same set of hypothetical observations. Here I argue that many applications of statistical inference in psychology fail to meet this basic condition. Focusing on the most widely used class of model in psychology—the linear mixed model—I explore the consequences of failing to statistically operationalize verbal hypotheses in a way that respects researchers' actual generalization intentions. I demonstrate that whereas the "random effect" formalism is used pervasively in psychology to model inter-subject variability, few researchers accord the same treatment to other variables they clearly intend to generalize over (e.g., stimuli, tasks, or research sites). The under-specification of random effects imposes far stronger constraints on the generalizability of results than most researchers appreciate. Ignoring these constraints can dramatically inflate false positive rates, and often leads researchers to draw sweeping verbal generalizations that lack a meaningful connection to the statistical quantities they are putatively based on. I argue that failure to take the alignment between verbal and statistical expressions seriously lies at the heart of many of psychology's ongoing problems (e.g., the replication crisis), and conclude with a discussion of several potential avenues for improvement.
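As a concrete illustration of the false-positive inflation described above, the sketch below (not from the paper; the design, parameter values and function names are our own choices) simulates a null condition effect in a design where each condition uses its own random set of stimuli, then analyses the trial-level data while ignoring the stimulus-level variability.

```python
# Minimal simulation: no true condition effect, but stimuli vary randomly.
# Treating trials as independent (ignoring the stimulus random effect)
# inflates the Type I error well beyond the nominal 5%.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims = 2000                        # simulated "experiments"
n_stim, n_trials = 10, 30            # stimuli per condition, trials per stimulus
sigma_stim, sigma_noise = 0.5, 1.0   # between-stimulus and trial-level SDs (arbitrary)

def simulate_condition():
    """Trial-level responses for one condition with its own random stimuli."""
    stim_effects = rng.normal(0.0, sigma_stim, size=n_stim)
    noise = rng.normal(0.0, sigma_noise, size=(n_stim, n_trials))
    return (stim_effects[:, None] + noise).ravel()

false_positives = 0
for _ in range(n_sims):
    y_a, y_b = simulate_condition(), simulate_condition()
    _, p = stats.ttest_ind(y_a, y_b)  # naive analysis: stimulus clustering ignored
    false_positives += p < 0.05

print(f"empirical Type I error: {false_positives / n_sims:.3f} (nominal 0.05)")
```

Modelling stimulus as an additional random effect (alongside subjects) in a mixed model is the remedy the abstract points to; the naive analysis above is exactly the under-specification it warns against.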


2020 ◽  
pp. 1471082X2096691
Author(s):  
Amani Almohaimeed ◽  
Jochen Einbeck

Random effect models have been a mainstream statistical technique for several decades, and the same can be said of response transformation models such as the Box–Cox transformation. The latter aims to ensure that the assumptions of normality and homoscedasticity of the response distribution are fulfilled, which are essential conditions for inference based on a linear model or a linear mixed model. However, methodology for response transformation with simultaneous inclusion of random effects has been developed and implemented only sparsely, and is so far restricted to Gaussian random effects. We develop such methodology without requiring parametric assumptions on the distribution of the random effects. This is achieved by extending the ‘Nonparametric Maximum Likelihood’ approach towards a ‘Nonparametric Profile Maximum Likelihood’ technique, which can handle overdispersion as well as two-level data scenarios.
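A sketch of the model class being discussed, in generic notation (ours, not the authors'): the Box–Cox transform is applied to the response, and the random effect distribution is left unspecified and estimated nonparametrically as a discrete mixture.

```latex
% Illustrative notation only.
\[
y^{(\lambda)} =
\begin{cases}
\dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0,\\[1ex]
\log y, & \lambda = 0,
\end{cases}
\qquad
y^{(\lambda)}_{ij} = x_{ij}^{\top}\beta + z_i + \varepsilon_{ij},
\quad \varepsilon_{ij} \sim N(0, \sigma^2),
\]
% The upper-level effect z_i follows an unspecified distribution g; nonparametric
% maximum likelihood estimates g as a discrete distribution on a finite set of
% mass points, so no Gaussian assumption on the random effects is required.
```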


2018 ◽  
Vol 147 ◽  
Author(s):  
A. Aswi ◽  
S. M. Cramb ◽  
P. Moraga ◽  
K. Mengersen

Abstract Dengue fever (DF) is one of the world's most disabling mosquito-borne diseases, with a variety of approaches available to model its spatial and temporal dynamics. This paper aims to identify and compare the different spatial and spatio-temporal Bayesian modelling methods that have been applied to DF, and to examine influential covariates that have been reported to be associated with the risk of DF. A systematic search was performed in December 2017, using the Web of Science, Scopus, ScienceDirect, PubMed, ProQuest and Medline (via Ebscohost) electronic databases. The search was restricted to refereed journal articles published in English from January 2000 to November 2017. Thirty-one articles met the inclusion criteria. Using a modified quality assessment tool, the median quality score across studies was 14/16. The most popular Bayesian statistical approach to dengue modelling was a generalised linear mixed model with spatial random effects described by a conditional autoregressive prior. A limited number of studies included spatio-temporal random effects. Temperature and precipitation were often shown to influence the risk of dengue. Developing spatio-temporal random-effect models, considering other priors, using datasets that cover an extended time period, and investigating other covariates would help to better understand and control DF transmission.
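For readers unfamiliar with the most common model in the review, a generic Poisson GLMM with spatially structured random effects under an intrinsic conditional autoregressive (CAR) prior looks roughly as follows (notation is ours, not from any specific study in the review):

```latex
% Generic disease-mapping formulation; not a particular model from the review.
\[
y_i \sim \mathrm{Poisson}(E_i \rho_i), \qquad
\log \rho_i = x_i^{\top}\beta + u_i,
\]
\[
u_i \mid u_{-i} \sim N\!\left(\frac{1}{n_i}\sum_{j \sim i} u_j,\; \frac{\sigma_u^2}{n_i}\right),
\]
% y_i and E_i are observed and expected dengue counts in area i, rho_i is the
% relative risk, j ~ i indexes neighbouring areas, and n_i is the number of
% neighbours. Spatio-temporal extensions add temporal and space-time interaction
% random effects to the linear predictor.
```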


Parasitology ◽  
2001 ◽  
Vol 122 (5) ◽  
pp. 563-569 ◽  
Author(s):  
D. A. ELSTON ◽  
R. MOSS ◽  
T. BOULINIER ◽  
C. ARROWSMITH ◽  
X. LAMBIN

The statistical aggregation of parasites among hosts is often described empirically by the negative binomial (Poisson-gamma) distribution. Alternatively, the Poisson-lognormal model can be used. This has the advantage that it can be fitted as a generalized linear mixed model, thereby quantifying the sources of aggregation in terms of both fixed and random effects. We give a worked example, assigning aggregation in the distribution of sheep ticks (Ixodes ricinus) on red grouse (Lagopus lagopus scoticus) chicks to temporal (year), spatial (altitude and location), brood and individual effects. Apparent aggregation among random individuals in random broods fell 8-fold when spatial and temporal effects had been accounted for.
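A generic form of the Poisson-lognormal GLMM described here, in our own notation: normally distributed random effects on the log scale generate the apparent overdispersion (aggregation), and partitioning them into year, location, brood and individual components attributes that aggregation to its sources.

```latex
% Illustrative notation only.
\[
y_{i} \sim \mathrm{Poisson}(\lambda_{i}), \qquad
\log \lambda_{i} = x_{i}^{\top}\beta
  + a_{\mathrm{year}(i)} + b_{\mathrm{location}(i)} + c_{\mathrm{brood}(i)} + d_{i},
\]
% with each random term normally distributed, e.g. d_i ~ N(0, sigma_d^2).
% Marginally over the normal effects the counts are Poisson-lognormal, and the
% variance components quantify how much aggregation each level (temporal,
% spatial, brood, individual) contributes.
```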


2000 ◽  
Vol 40 (7) ◽  
pp. 969 ◽  
Author(s):  
R. A. Lawes ◽  
M. K. Wegener ◽  
K. E. Basford ◽  
R. J. Lawn

Commercial cane sugar (CCS), as measured by sugar mills, is in decline in the wet tropics of Australia. One of these mills, Tully Sugar Ltd, has measured CCS in the factory as required by legislation and has also measured whole clean stalk CCS through a small mill, which is free of contaminants. ‘Factory CCS’ measures the CCS of cane entering the mill after it has been harvested. The harvesting and transport process delivers to the mill cane that is contaminated by extraneous matter such as leaf material and soil. Between 1988 and 1998, 1516 blocks were sampled for ‘small mill CCS’. These data were combined with block productivity information to determine the trends in small mill CCS and factory CCS using a linear mixed model analysis, as the data were unbalanced. Other data, including the date of harvest for factory CCS, the date of sampling for small mill CCS, the farm of origin, and the cane variety, were available and were fitted as random effects in the mixed model. Year was fitted as a fixed effect to determine time-related trends in the two measures of CCS. Small mill CCS was higher than factory CCS and remained constant from 1988 to 1998. Predicted factory CCS declined from 12.76 units in 1988 to 10.91 units in 1998. We conclude that CCS levels in whole clean stalks were actually stable, since small mill CCS remained constant over the 10-year period. Possible reasons for the differences in the trends for the two CCS measures are discussed.
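A schematic of the kind of linear mixed model described (our notation; the published parameterization may well differ): year enters as a fixed effect so that time trends in each CCS measure can be estimated, while farm, variety and harvest or sampling date enter as random effects to absorb the imbalance in the data.

```latex
% Schematic only; not the authors' exact model.
\[
\mathrm{CCS}_{b} \;=\; \mu \;+\; \alpha_{\mathrm{year}(b)}
  \;+\; f_{\mathrm{farm}(b)} \;+\; v_{\mathrm{variety}(b)} \;+\; h_{\mathrm{date}(b)}
  \;+\; \varepsilon_{b},
\]
% for block-level observation b, where the year effects alpha are fixed (giving the
% estimated time trend for each CCS measure) and the farm, variety, date and
% residual terms are independent normal random effects. The mixed model framework
% accommodates the unbalanced sampling of blocks across years.
```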


Biometrics ◽  
2004 ◽  
Vol 60 (4) ◽  
pp. 945-953 ◽  
Author(s):  
Wendimagegn Ghidey ◽  
Emmanuel Lesaffre ◽  
Paul Eilers

2010 ◽  
Vol 49 (01) ◽  
pp. 54-64 ◽  
Author(s):  
J. Menke

Summary Objectives: Meta-analysis allows one to summarize pooled sensitivities and specificities from several primary diagnostic test accuracy studies. Often these pooled estimates are obtained indirectly from a hierarchical summary receiver operating characteristic (HSROC) analysis. This article presents a generalized linear random-effects model, implemented with the new SAS PROC GLIMMIX, that obtains the pooled estimates for sensitivity and specificity directly. Methods: First, the formula of the bivariate random-effects model is presented in the context of the literature. Then its implementation with the new SAS PROC GLIMMIX is empirically evaluated against the indirect HSROC approach, utilizing the published 2 × 2 count data of 50 meta-analyses. Results: According to the empirical evaluation, the meta-analytic results from the bivariate GLIMMIX approach are nearly identical to the results from the indirect HSROC approach. Conclusions: A generalized linear mixed model with PROC GLIMMIX offers a straightforward method for bivariate random-effects meta-analysis of sensitivity and specificity.
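The bivariate random-effects model referred to here can be summarized generically as follows (our notation; the article's own formula should be consulted for the exact parameterization):

```latex
% Standard bivariate binomial-normal model for sensitivity and specificity.
\[
\mathrm{TP}_i \sim \mathrm{Bin}(n_{i1}, \mathrm{se}_i), \qquad
\mathrm{TN}_i \sim \mathrm{Bin}(n_{i0}, \mathrm{sp}_i),
\]
\[
\begin{pmatrix} \mathrm{logit}\,\mathrm{se}_i \\ \mathrm{logit}\,\mathrm{sp}_i \end{pmatrix}
\sim N\!\left(
\begin{pmatrix} \mu_{\mathrm{se}} \\ \mu_{\mathrm{sp}} \end{pmatrix},
\begin{pmatrix} \sigma^2_{\mathrm{se}} & \sigma_{\mathrm{se,sp}} \\
                \sigma_{\mathrm{se,sp}} & \sigma^2_{\mathrm{sp}} \end{pmatrix}
\right),
\]
% Study i contributes TP_i true positives out of n_{i1} diseased and TN_i true
% negatives out of n_{i0} non-diseased subjects; the pooled sensitivity and
% specificity are logit^{-1}(mu_se) and logit^{-1}(mu_sp). This is the kind of
% binomial-normal GLMM that can be fitted directly as a generalized linear mixed model.
```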

