Zero-inflated generalized Dirichlet multinomial regression model for microbiome compositional data analysis

Biostatistics ◽  
2018 ◽  
Vol 20 (4) ◽  
pp. 698-713 ◽  
Author(s):  
Zheng-Zheng Tang ◽  
Guanhua Chen

Summary There is heightened interest in using high-throughput sequencing technologies to quantify abundances of microbial taxa and linking the abundance to human diseases and traits. Proper modeling of multivariate taxon counts is essential to the power of detecting this association. Existing models are limited in handling excessive zero observations in taxon counts and in flexibly accommodating complex correlation structures and dispersion patterns among taxa. In this article, we develop a new probability distribution, zero-inflated generalized Dirichlet multinomial (ZIGDM), that overcomes these limitations in modeling multivariate taxon counts. Based on this distribution, we propose a ZIGDM regression model to link microbial abundances to covariates (e.g. disease status) and develop a fast expectation–maximization algorithm to efficiently estimate parameters in the model. The derived tests enable us to reveal rich patterns of variation in microbial compositions including differential mean and dispersion. The advantages of the proposed methods are demonstrated through simulation studies and an analysis of a gut microbiome dataset.
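As a rough illustration of the kind of distribution the summary describes, the sketch below draws one sample from a zero-inflated generalized Dirichlet multinomial. It assumes a stick-breaking construction in which each stick fraction is zero with some probability and Beta-distributed otherwise; the function name `sample_zigdm` and the parameter names `pi`, `a`, and `b` are hypothetical and are not taken from the article, and the regression link to covariates and the EM estimation are not shown.

```python
import random


def sample_zigdm(n_reads, pi, a, b, rng):
    """Draw (proportions, counts) for K + 1 taxa from K stick fractions.

    pi[j] is the assumed probability that stick fraction j is a structural
    zero; a[j] and b[j] are the assumed Beta shape parameters otherwise.
    """
    # Zero-inflated Beta draws: exactly 0 with probability pi[j].
    z = [0.0 if rng.random() < p else rng.betavariate(aj, bj)
         for p, aj, bj in zip(pi, a, b)]

    # Stick breaking: taxon j takes fraction z[j] of what is left.
    props = []
    remaining = 1.0
    for zj in z:
        props.append(zj * remaining)
        remaining *= 1.0 - zj
    props.append(remaining)  # the last taxon takes the remainder

    # Multinomial counts given the proportions.
    counts = [0] * len(props)
    for k in rng.choices(range(len(props)), weights=props, k=n_reads):
        counts[k] += 1
    return props, counts


rng = random.Random(0)
props, counts = sample_zigdm(1000, [0.5, 0.2, 0.0], [2.0, 2.0, 2.0],
                             [3.0, 3.0, 3.0], rng)
print(counts)
```

Under these assumptions, a taxon whose stick fraction is a structural zero receives no reads at all, which is one way to produce more zeros than a plain Dirichlet multinomial would.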

PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e5643 ◽  
Author(s):  
Fiona Chong ◽  
Matthew Spencer

Ecologists often analyze relative abundances, which are an example of compositional data. However, they have made surprisingly little use of recent advances in the field of compositional data analysis. Compositions form a vector space in which addition and scalar multiplication are replaced by operations known as perturbation and powering. This algebraic structure makes it easy to understand how relative abundances change along environmental gradients. We illustrate this with an analysis of changes in hard-substrate marine communities along a depth gradient. We fit a quadratic multivariate regression model with multinomial observations to point count data obtained from video transects. As well as being an appropriate observation model in this case, the multinomial deals with the problem of zeros, which often makes compositional data analysis difficult. We show how the algebra of compositions can be used to understand patterns in dissimilarity. We use the calculus of simplex-valued functions to estimate rates of change, and to summarize the structure of the community over a vertical slice. We discuss the benefits of the compositional approach in the interpretation and visualization of relative abundance data.
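The perturbation and powering operations mentioned above can be sketched in a few lines. This is a minimal illustration using the standard definitions from compositional data analysis (component-wise product or power, followed by rescaling to sum to one); the function names are chosen here for illustration and the example compositions are made up, not data from the study.

```python
def closure(x):
    """Rescale positive parts so that they sum to 1."""
    total = sum(x)
    return [v / total for v in x]


def perturb(x, y):
    """Perturbation: the simplex analogue of vector addition."""
    return closure([a * b for a, b in zip(x, y)])


def power(x, alpha):
    """Powering: the simplex analogue of scalar multiplication."""
    return closure([a ** alpha for a in x])


# Made-up relative abundances of three taxa at a shallow site.
shallow = [0.2, 0.3, 0.5]
# Made-up change per unit depth, itself expressed as a composition.
change = [0.5, 0.3, 0.2]

# Composition two depth units deeper: shallow "plus" 2 "times" change.
deeper = perturb(shallow, power(change, 2.0))
print(deeper)
```

Perturbing by the uniform composition leaves a composition unchanged, and powering by -1 gives the inverse under perturbation, which is what makes the simplex behave like a vector space.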


mSystems ◽  
2018 ◽  
Vol 3 (4) ◽  
Author(s):  
J. Rivera-Pinto ◽  
J. J. Egozcue ◽  
V. Pawlowsky-Glahn ◽  
R. Paredes ◽  
M. Noguera-Julian ◽  
...  

ABSTRACT High-throughput sequencing technologies have revolutionized microbiome research by allowing the relative quantification of microbiome composition and function in different environments. In this work we focus on the identification of microbial signatures, groups of microbial taxa that are predictive of a phenotype of interest. We do this by acknowledging the compositional nature of the microbiome and the fact that it carries relative information. Thus, instead of defining a microbial signature as a linear combination in real space corresponding to the abundances of a group of taxa, we consider microbial signatures given by the geometric means of data from two groups of taxa whose relative abundances, or balance, are associated with the response variable of interest. In this work we present selbal, a greedy stepwise algorithm for selection of balances or microbial signatures that preserves the principles of compositional data analysis. We illustrate the algorithm with 16S rRNA abundance data from a Crohn’s microbiome study and an HIV microbiome study. IMPORTANCE We propose a new algorithm for the identification of microbial signatures. These microbial signatures can be used for diagnosis, prognosis, or prediction of therapeutic response based on an individual’s specific microbiota.
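A balance between two groups of taxa can be sketched as a scaled log ratio of their geometric means. The normalizing constant below is the usual isometric log-ratio one and is an assumption of this illustration, as are the function name `balance` and the example counts; the greedy stepwise selection itself is not shown, and zero counts would need to be replaced before taking logarithms.

```python
import math


def balance(x, plus, minus):
    """Scaled log ratio of the geometric means of two groups of parts.

    x holds strictly positive abundances; plus and minus are lists of
    indices into x for the numerator and denominator groups.
    """
    k_plus, k_minus = len(plus), len(minus)
    # The mean of logs equals the log of the geometric mean.
    mean_log_plus = sum(math.log(x[i]) for i in plus) / k_plus
    mean_log_minus = sum(math.log(x[j]) for j in minus) / k_minus
    scale = math.sqrt(k_plus * k_minus / (k_plus + k_minus))
    return scale * (mean_log_plus - mean_log_minus)


counts = [10, 20, 40, 5]  # made-up counts for four taxa
print(balance(counts, plus=[0, 1], minus=[2, 3]))
```

Because only ratios enter the formula, the balance is the same whether it is computed from raw counts or from relative abundances, which is the sense in which it respects the compositional nature of the data.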


2020 ◽  
Author(s):  
Jacob Bien ◽  
Xiaohan Yan ◽  
Léo Simpson ◽  
Christian L. Müller

Abstract Modern high-throughput sequencing technologies provide low-cost microbiome survey data across all habitats of life at unprecedented scale. At the most granular level, the primary data consist of sparse counts of amplicon sequence variants or operational taxonomic units that are associated with taxonomic and phylogenetic group information. In this contribution, we leverage the hierarchical structure of amplicon data and propose a data-driven, parameter-free, and scalable tree-guided aggregation framework to associate microbial subcompositions with response variables of interest. The excess number of zero or low count measurements at the read level forces traditional microbiome data analysis workflows to remove rare sequencing variants or group them by a fixed taxonomic rank, such as genus or phylum, or by phylogenetic similarity. By contrast, our framework, which we call trac (tree-aggregation of compositional data), learns data-adaptive taxon aggregation levels for predictive modeling making user-defined aggregation obsolete while simultaneously integrating seamlessly into the compositional data analysis framework. We illustrate the versatility of our framework in the context of large-scale regression problems in human-gut, soil, and marine microbial ecosystems. We posit that the inferred aggregation levels provide highly interpretable taxon groupings that can help microbial ecologists gain insights into the structure and functioning of the underlying ecosystem of interest.


2019 ◽  
Vol 13 (1) ◽  
pp. 57-64
Author(s):  
Mahdi Rezapour ◽  
Amirarsalan Mehrara Molan ◽  
Khaled Ksaibati

Background: Run Off The Road (ROTR) crashes are some of the most severe crashes that can occur on roadways. The main countermeasure for this type of crash is traffic barrier installation. Although traffic barriers can mitigate ROTR crashes significantly, traffic barrier crashes still account for a considerable number of severe crashes. Besides the types of traffic barriers, driver actions and performance play an important role in the severity of these crashes. Methods: This study was conducted by incorporating only traffic barrier crashes in Wyoming. Based on the literature review, there are unique contributory factors in different crash types. Therefore, in addition to focusing on traffic barrier crashes, crashes were divided into two different highway classes: interstate and non-interstate highways. Results: The result for the proportional odds assumption indicated that a multinomial logistic regression model is appropriate for both non-interstate and interstate crashes involving traffic barriers. The results indicated that road surface conditions, age, driver restraint and negotiating a curve were some of the factors that impact the severity of traffic barrier crashes on non-interstate highways. On the other hand, the results for interstate barrier crashes indicated that, besides types of barriers, driver condition, citation record, and speed limit compliance were some of the factors that impacted the interstate traffic barrier crash severity. Conclusion: The results of this study should give policymakers direction for taking appropriate countermeasures to alleviate the severity of traffic barrier crashes.


2020 ◽  
Vol Publish Ahead of Print ◽  
Author(s):  
Emma Purón-González ◽  
Arnulfo González-Cantú ◽  
Edgar Ulises Coronado-Alejandro ◽  
Oswaldo Enrique Sánchez-Dávila ◽  
Héctor Cobos-Aguilar ◽  
...  

Critical Care ◽  
2021 ◽  
Vol 25 (1) ◽  
Author(s):  
Daniel O. Thomas-Rüddel ◽  
Peter Hoffmann ◽  
Daniel Schwarzkopf ◽  
Christian Scheer ◽  
Friedhelm Bach ◽  
...  

Abstract Background Fever and hypothermia have been observed in septic patients. Their influence on prognosis is subject to ongoing debate. Methods We did a secondary analysis of a large clinical dataset from a quality improvement trial. A binary logistic regression model was calculated to assess the association of the thermal response with outcome and a multinomial regression model to assess factors associated with fever or hypothermia. Results With 6542 analyzable cases we observed a bimodal temperature response characterized by fever or hypothermia; normothermia was rare. Hypothermia and high fever were both associated with higher lactate values. Hypothermia was associated with higher mortality, but this association was reduced after adjustment for other risk factors. Age, community-acquired sepsis, lower BMI and lower outside temperatures were associated with hypothermia while bacteremia and higher procalcitonin values were associated with high fever. Conclusions Septic patients show either a hypothermic or a fever response. Whether hypothermia is a maladaptive response, as indicated by the higher mortality in hypothermic patients, or an adaptive response in patients with limited metabolic reserves under colder environmental conditions, remains an open question. Trial registration The original trial whose dataset was analyzed was registered at ClinicalTrials.gov (NCT01187134) on August 23, 2010; the first patient was included on July 1, 2011.


2019 ◽  
Vol 65 (4) ◽  
pp. 474-498 ◽  
Author(s):  
Tara N. Richards ◽  
Marie Skubak Tillyer ◽  
Emily M. Wright

This study examines the predictors of sexual assault case clearance, with a focus on arrest and two types of exceptional clearance: victim refusal to cooperate and prosecutorial declination to prosecute. Using National Incident Based Reporting System (NIBRS) data on crime incidents that contain a sexual offense (N = 21,977), we estimated a multinomial regression model to examine the predictors of different clearance types for cases of sexual assault. Results indicated that the likelihood of victim refusal decreases in cases perpetrated by strangers, involving victim injury, occurring in public, and involving multiple offenses. A similar pattern of findings was observed for the decision to decline to prosecute a case. In addition, prosecutors are more likely to decline to prosecute cases with male victims and older victims. We discuss the implications of our findings and directions for future research.

