Correction to: Opportunities and limits of combining microbiome and genome data for complex trait prediction

2021 ◽  
Vol 53 (1) ◽  
Author(s):  
Miguel Pérez-Enciso ◽  
Laura M. Zingaretti ◽  
Yuliaxis Ramayo-Caldas ◽  
Gustavo de los Campos
2020 ◽  
Author(s):  
Miguel Pérez-Enciso ◽  
Laura M. Zingaretti ◽  
Yuliaxis Ramayo-Caldas ◽  
Gustavo de los Campos

Abstract
The analysis and prediction of complex traits using microbiome data combined with host genomic information is a topic of utmost interest. However, numerous questions remain to be answered: How useful can the microbiome be for complex trait prediction? Are microbiability estimates reliable? Can the underlying biological links between the host’s genome, microbiome, and the phenome be recovered? Here, we address these issues by (i) developing a novel simulation strategy that uses real microbiome and genotype data as input, and (ii) proposing a variance-component approach which, in the spirit of mediation analyses, quantifies the proportion of phenotypic variance explained by genome and microbiome, and dissects it into direct and indirect effects. The proposed simulation approach can mimic a genetic link between the microbiome and SNP data via a permutation procedure that retains the distributional properties of the data. Results suggest that microbiome data could significantly improve phenotype prediction accuracy, irrespective of whether some abundances are under direct genetic control by the host or not. Overall, random-effects linear methods appear robust for variance components estimation, despite the highly leptokurtic distribution of microbiota abundances. Nevertheless, we observed that accuracy depends in part on the number of microbial taxa influencing the trait of interest. While we conclude that overall genome-microbiome links can be characterized via variance components, we are less optimistic about the possibility of identifying the causative effects, i.e., individual SNPs affecting abundances; power at this level would require much larger sample sizes than the ones typically available for genome-microbiome-phenome data.

Author summary
The microbiome consists of the microorganisms that live in a particular environment, including those in our own bodies. There is consistent evidence that these communities play an important role in numerous traits of relevance, including disease susceptibility and feed efficiency. Moreover, it has been shown that the microbiome can be relatively stable throughout an individual’s life and that it is affected by the host genome. These reasons have prompted numerous studies to determine whether and how the microbiome can be used for prediction of complex phenotypes, either using microbiome data alone or in combination with the host’s genome data. However, numerous questions remain to be answered, such as how reliable the parameter estimates are and what the underlying relationship between microbiome, genome, and phenotype is. The few available empirical studies do not provide a clear answer to these problems. Here we address these issues by developing a novel simulation strategy, and we show that, although the microbiome can significantly help in prediction, it will be difficult to retrieve the actual biological basis of interactions between the microbiome and the trait.


Genetics ◽  
2019 ◽  
Vol 211 (4) ◽  
pp. 1131-1141 ◽  
Author(s):  
Naomi R. Wray ◽  
Kathryn E. Kemper ◽  
Benjamin J. Hayes ◽  
Michael E. Goddard ◽  
Peter M. Visscher

PLoS ONE ◽  
2015 ◽  
Vol 10 (10) ◽  
pp. e0138903 ◽  
Author(s):  
David C. Haws ◽  
Irina Rish ◽  
Simon Teyssedre ◽  
Dan He ◽  
Aurelie C. Lozano ◽  
...  

2018 ◽  
Vol 34 (10) ◽  
pp. 746-754 ◽  
Author(s):  
Gustavo de los Campos ◽  
Ana Ines Vazquez ◽  
Stephen Hsu ◽  
Louis Lello

2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Louis Lello ◽  
Timothy G. Raben ◽  
Stephen D. H. Hsu

2021 ◽  
Vol 53 (1) ◽  
Author(s):  
Miguel Pérez-Enciso ◽  
Laura M. Zingaretti ◽  
Yuliaxis Ramayo-Caldas ◽  
Gustavo de los Campos

Abstract
Background: Analysis and prediction of complex traits using microbiome data combined with host genomic information is a topic of utmost interest. However, numerous questions remain to be answered: how useful can the microbiome be for complex trait prediction? Are estimates of microbiability reliable? Can the underlying biological links between the host’s genome, microbiome, and phenome be recovered?
Methods: Here, we address these issues by (i) developing a novel simulation strategy that uses real microbiome and genotype data as inputs, and (ii) using variance-component approaches (Bayesian Reproducing Kernel Hilbert Space (RKHS) and Bayesian variable selection methods (Bayes C)) to quantify the proportion of phenotypic variance explained by the genome and the microbiome. The proposed simulation approach can mimic genetic links between the microbiome and genotype data by a permutation procedure that retains the distributional properties of the data.
Results: Using real genotype and rumen microbiota abundances from dairy cattle, simulation results suggest that microbiome data can significantly improve the accuracy of phenotype predictions, regardless of whether some microbiota abundances are under direct genetic control by the host or not. This improvement depends logically on the microbiome being stable over time. Overall, random-effects linear methods appear robust for variance components estimation, in spite of the typically highly leptokurtic distribution of microbiota abundances. The predictive performance of Bayes C was higher but more sensitive to the number of causative effects than RKHS. Accuracy with Bayes C depended, in part, on the number of microbial taxa that influence the phenotype.
Conclusions: While we conclude that, overall, genome-microbiome links can be characterized using variance component estimates, we are less optimistic about the possibility of identifying the causative host genetic effects that affect microbiota abundances, which would require much larger sample sizes than are typically available for genome-microbiome-phenome studies. The R code to replicate the analyses is available at https://github.com/miguelperezenciso/simubiome.
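
The permutation idea summarized in this abstract can be illustrated with a short Python sketch. This is not the authors' code (their R implementation is in the simubiome repository linked above). The function names, the toy matrix sizes, the rank-matching rule, and the values of `h2` and `b2` are all assumptions chosen for illustration. The sketch reorders the abundances of one taxon so that their ranks follow a noisy genetic score, which leaves the marginal distribution of that taxon untouched, and then builds a phenotype whose genomic and microbiome components have chosen variance shares.

```python
import numpy as np

rng = np.random.default_rng(1)

def link_taxon_to_genome(abundance, genetic_score, h2, rng):
    """Reorder one taxon's abundances so their ranks follow a noisy genetic score.

    The output is a permutation of the input, so the marginal distribution
    (skewness, heavy tails) is unchanged. h2 is the assumed share of the
    ranking variable explained by the genetic score.
    """
    z = (genetic_score - genetic_score.mean()) / genetic_score.std()
    noise = rng.normal(size=z.shape[0])
    liability = np.sqrt(h2) * z + np.sqrt(1.0 - h2) * noise
    permuted = np.empty_like(abundance)
    # The individual with the k-th smallest liability gets the k-th smallest abundance.
    permuted[np.argsort(liability)] = np.sort(abundance)
    return permuted

def scale_to_variance(x, target_var):
    """Centre x and rescale it so that its sample variance equals target_var."""
    x = x - x.mean()
    return x * np.sqrt(target_var / x.var())

# Toy stand-ins for real data (hypothetical sizes, not the cattle data set).
n, n_snp, n_taxa = 500, 200, 50
G = rng.binomial(2, 0.3, size=(n, n_snp)).astype(float)  # SNP genotypes coded 0/1/2
B = rng.lognormal(sigma=2.0, size=(n, n_taxa))           # skewed abundances

# Put taxon 0 under partial host genetic control via 10 hypothetical causal SNPs.
score = G[:, :10] @ rng.normal(size=10)
B_linked = B.copy()
B_linked[:, 0] = link_taxon_to_genome(B[:, 0], score, h2=0.8, rng=rng)

# Phenotype y = g + b + e with assumed variance shares h2 = 0.25 and b2 = 0.25.
h2, b2 = 0.25, 0.25
g = scale_to_variance(G[:, 10:30] @ rng.normal(size=20), h2)
b = scale_to_variance(np.log1p(B_linked[:, :5]) @ rng.normal(size=5), b2)
e = rng.normal(scale=np.sqrt(1.0 - h2 - b2), size=n)
y = g + b + e
```

Because taxon 0 both responds to the host genotype and contributes to `b`, part of the genomic signal in `y` is mediated by the microbiome in this toy setup. Fitting the variance components themselves (for example with RKHS or Bayes C) is outside the scope of the sketch.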


Author(s):  
Louis Lello ◽  
Timothy G. Raben ◽  
Stephen D.H. Hsu

Abstract
We test a variety of polygenic predictors using tens of thousands of genetic siblings for whom we have SNP genotypes, health status, and phenotype information in late adulthood. Siblings have typically experienced similar environments during childhood, and exhibit negligible population stratification relative to each other. Therefore, the ability to predict differences in disease risk or complex trait values between siblings is a strong test of genomic prediction in humans. We compare validation results obtained using non-sibling subjects to those obtained among siblings and find that typically most of the predictive power persists in within-family designs. In the case of disease risk we test the extent to which a higher polygenic risk score (PRS) identifies the affected sibling, and also compute Relative Risk Reduction as a function of risk score threshold. For quantitative traits we examine between-sibling differences in trait values as a function of predicted differences, and compare to performance in non-sibling pairs. Example results: Given 1 sibling with a normal-range PRS (<84th percentile) and 1 sibling with a high PRS (top few percentiles), the predictors identify the affected sibling about 70-90% of the time across a variety of disease conditions, including Breast Cancer, Heart Attack, Diabetes, etc. For height, the predictor correctly identifies the taller sibling roughly 80 percent of the time when the (male) height difference is 2 inches or more.
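
The within-family test for disease risk can be sketched in a few lines of Python. This is a simplified illustration, not the authors' pipeline: the function name `sibling_call_rate` and the toy numbers are hypothetical, and the sketch scores every discordant pair rather than only pairs with one normal-range and one high-PRS sibling as in the example results above.

```python
import numpy as np

def sibling_call_rate(prs_a, prs_b, affected_a, affected_b):
    """Fraction of discordant sibling pairs in which the higher-PRS sibling is the affected one.

    Pairs where both or neither sibling is affected, and pairs with tied
    scores, carry no information for this test and are skipped.
    """
    prs_a = np.asarray(prs_a, dtype=float)
    prs_b = np.asarray(prs_b, dtype=float)
    affected_a = np.asarray(affected_a, dtype=bool)
    affected_b = np.asarray(affected_b, dtype=bool)

    discordant = affected_a ^ affected_b          # exactly one sibling affected
    decided = discordant & (prs_a != prs_b)       # and the scores are not tied
    if not decided.any():
        return float("nan")

    higher_is_a = prs_a[decided] > prs_b[decided]
    # Within a discordant pair, sibling A is affected exactly when B is not,
    # so the call is correct when "A has the higher score" matches "A is affected".
    return float(np.mean(higher_is_a == affected_a[decided]))

# Toy data: four sibling pairs (hypothetical scores and disease status).
prs_a = [1.2, -0.3, 0.5, 2.0]
prs_b = [0.1, 0.8, 0.4, 1.0]
affected_a = [True, False, False, True]
affected_b = [False, True, True, True]

rate = sibling_call_rate(prs_a, prs_b, affected_a, affected_b)
print(rate)
```

In the toy data the fourth pair is concordant and is skipped. Of the three remaining pairs, the higher-scoring sibling is the affected one in two, so the call rate is 2/3.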

