Scalable estimation of microbial co-occurrence networks with Variational Autoencoders

2021 ◽  
Author(s):  
James Morton ◽  
Justin Silverman ◽  
Gleb Tikhonov ◽  
Harri Lahdesmaki ◽  
Richard Bonneau

Estimating microbe-microbe interactions is critical for understanding the ecological laws governing microbial communities. Rapidly decreasing sequencing costs have promised new opportunities to estimate microbe-microbe interactions across thousands of uncultured, unknown microbes. However, typical microbiome datasets are very high-dimensional, and accurate estimation of microbial correlations requires tens of thousands of samples, exceeding the computational capabilities of existing methodologies. Furthermore, the vast majority of microbiome studies collect compositional metagenomics data, which enforces a negative bias when computing microbe-microbe correlations. The Multinomial Logistic Normal (MLN) distribution has been shown to be effective at inferring microbe-microbe correlations; however, scalable Bayesian inference of these distributions has remained elusive. Here, we show that carefully constructed Variational Autoencoders (VAEs) augmented with the Isometric Log-ratio (ILR) transform can estimate low-rank MLN distributions thousands of times faster than existing methods. These VAEs can be trained on tens of thousands of samples, enabling co-occurrence inference across tens of thousands of microbes without regularization. The latent embedding distances computed from these VAEs are competitive with existing beta-diversity methods across a variety of mouse and human microbiome classification and regression tasks, with notable improvements on longitudinal studies.
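For readers unfamiliar with the ILR transform named in the abstract above, here is a minimal NumPy sketch; it is not the authors' implementation. It assumes a Helmert-type orthonormal basis (the abstract does not say which basis is used) and a pseudocount of 1 to avoid taking the log of zero. All function names and the toy count table are hypothetical.

```python
import numpy as np


def helmert_basis(d):
    """Orthonormal basis (d x (d-1)) of the zero-sum subspace (an assumed choice)."""
    basis = np.zeros((d, d - 1))
    for j in range(1, d):
        basis[:j, j - 1] = 1.0 / j                 # first j parts share equal weight
        basis[j, j - 1] = -1.0                     # contrasted against part j
        basis[:, j - 1] *= np.sqrt(j / (j + 1.0))  # scale the column to unit norm
    return basis


def ilr(proportions, basis):
    """Map compositions (rows summing to 1) to d-1 unconstrained coordinates."""
    log_p = np.log(proportions)
    clr = log_p - log_p.mean(axis=-1, keepdims=True)  # centered log-ratio
    return clr @ basis


def ilr_inverse(coords, basis):
    """Map ILR coordinates back to compositions via a softmax."""
    logits = coords @ basis.T
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    expd = np.exp(logits)
    return expd / expd.sum(axis=-1, keepdims=True)


# Toy count table: 2 samples x 4 taxa (made-up numbers).
counts = np.array([[10, 0, 70, 20], [5, 5, 0, 90]], dtype=float)
pseudo = counts + 1.0  # assumed pseudocount so that log() is defined
proportions = pseudo / pseudo.sum(axis=1, keepdims=True)

basis = helmert_basis(proportions.shape[1])
coords = ilr(proportions, basis)
recovered = ilr_inverse(coords, basis)
```

Because the basis columns are orthonormal and sum to zero, `ilr_inverse(ilr(p))` recovers `p`, so a model can work in the unconstrained coordinates and still produce valid compositions.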

2018 ◽  
Vol 18 (11) ◽  
pp. 2933-2949 ◽  
Author(s):  
Laura C. Dawkins ◽  
David B. Stephenson

Abstract. Natural hazards, such as European windstorms, have widespread effects that result in insured losses at multiple locations throughout a continent. Multivariate extreme-value statistical models for such environmental phenomena must therefore accommodate very high dimensional spatial data, as well as correctly representing dependence in the extremes to ensure accurate estimation of these losses. Ideally one would employ a flexible model, able to characterise all forms of extremal dependence. However, such models are restricted to a few dozen dimensions, hence an a priori diagnostic approach must be used to identify the dominant form of extremal dependence. Here, we present various approaches for exploring the dominant extremal dependence class in very high dimensional spatial hazard fields: tail dependency measures, copula fits, and conceptual loss distributions. These approaches are illustrated by application to a data set of high-dimensional historical European windstorm footprints (6103 spatial maps of 3-day maximum gust speeds at 14 872 locations). We find there is little evidence of asymptotic extremal dependency in windstorm footprints. Furthermore, empirical extremal properties and conceptual losses are shown to be well reproduced using Gaussian copulas but not by extremally dependent models such as Gumbel copulas. It is conjectured that the lack of asymptotic dependence is a generic property of turbulent flows. These results open up the possibility of using geostatistical Gaussian process models for fast simulation of windstorm hazard fields.
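As a rough illustration of a tail dependency measure of the kind mentioned above, the sketch below estimates the conditional exceedance probability chi(u) = P(V > u | U > u) on rank-transformed data. This is one common form of the measure and is assumed here; the abstract does not specify the estimator the authors use. The simulated Gaussian pair, the correlation of 0.7, and the function names are illustrative assumptions, not data from the paper.

```python
import numpy as np


def to_uniform(values):
    """Rank-transform a sample to pseudo-uniform margins in (0, 1)."""
    ranks = np.argsort(np.argsort(values)) + 1
    return ranks / (len(values) + 1.0)


def empirical_chi(x, y, u):
    """Estimate P(F_Y(Y) > u | F_X(X) > u) from paired samples."""
    ux, uy = to_uniform(x), to_uniform(y)
    exceed = ux > u
    if not exceed.any():
        return float("nan")
    return float(np.mean(uy[exceed] > u))


# Simulated pair with a Gaussian copula (assumed correlation 0.7).
rng = np.random.default_rng(0)
n, rho = 20000, 0.7
z1 = rng.standard_normal(n)
z2 = rho * z1 + np.sqrt(1 - rho**2) * rng.standard_normal(n)

chi_by_threshold = {u: empirical_chi(z1, z2, u) for u in (0.90, 0.95, 0.99)}
```

For a Gaussian copula with correlation below 1, the theoretical chi(u) tends to zero as u approaches 1, which is the asymptotic independence the abstract refers to; an extremally dependent model would instead level off at a positive value.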


2019 ◽  
Vol 19 (1) ◽  
pp. 39-53 ◽  
Author(s):  
Martin Eigel ◽  
Johannes Neumann ◽  
Reinhold Schneider ◽  
Sebastian Wolf

Abstract. This paper examines a completely non-intrusive, sample-based method for the computation of functional low-rank solutions of high-dimensional parametric random PDEs, which have become an area of intensive research in Uncertainty Quantification (UQ). In order to obtain a generalized polynomial chaos representation of the approximate stochastic solution, a novel black-box rank-adapted tensor reconstruction procedure is proposed. The performance of the described approach is illustrated with several numerical examples and compared to (Quasi-)Monte Carlo sampling.
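To make the idea of a non-intrusive, sample-based polynomial chaos surrogate concrete, the sketch below fits a one-dimensional Legendre expansion by least squares and compares its mean estimate with plain Monte Carlo. It is a deliberately simplified stand-in: the paper's rank-adapted tensor reconstruction for high-dimensional problems is not reproduced here. The model function, sample size, and polynomial degree are all assumptions chosen for illustration.

```python
import numpy as np
from numpy.polynomial import legendre


def model(y):
    """Stand-in for an expensive black-box solver (hypothetical)."""
    return np.exp(y)


# Draw samples of the uniform random parameter and evaluate the black box.
rng = np.random.default_rng(0)
samples = rng.uniform(-1.0, 1.0, size=200)
outputs = model(samples)

# Least-squares fit of Legendre coefficients (regression-based polynomial chaos).
degree = 6
design = legendre.legvander(samples, degree)  # shape (200, degree + 1)
coeffs, *_ = np.linalg.lstsq(design, outputs, rcond=None)

# For a uniform parameter on [-1, 1], every Legendre polynomial of degree >= 1
# has zero mean, so the surrogate's mean is the degree-0 coefficient.
pce_mean = coeffs[0]
mc_mean = outputs.mean()       # plain Monte Carlo estimate from the same samples
exact_mean = np.sinh(1.0)      # E[exp(y)] for y ~ Uniform(-1, 1)
```

The solver is only ever called on sampled inputs, which is what makes the approach non-intrusive. For a smooth model like this one, the surrogate's mean is typically far closer to the exact value than the Monte Carlo average computed from the same 200 samples.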


2021 ◽  
Author(s):  
Leyuan Li ◽  
Zhibin Ning ◽  
Xu Zhang ◽  
James Butcher ◽  
Caitlin Simopoulos ◽  
...  

Functional redundancy is a key property of ecosystems and represents the fact that phylogenetically unrelated taxa can play similar functional roles within an ecosystem. The redundancy of potential functions of the human microbiome has been recently quantified using metagenomics data. Yet, the redundancy of functions which are actually expressed within the human microbiome remains largely unexplored. Here, we quantify the protein-level functional redundancy in the human gut microbiome using metaproteomics and network approaches. In particular, our ultra-deep metaproteomics approach revealed high protein-level functional redundancy and high nestedness in proteomic content networks - bipartite graphs that connect taxa with their expressed functions. We further examined multiple metaproteomics datasets and showed that various environmental factors, including individuality, biogeography, xenobiotics, and disease, significantly altered the protein-level functional redundancy. Finally, by projecting the bipartite proteomic content networks into unipartite weighted genus networks, functional hub genera across individual microbiomes were discovered, suggesting that there may be a universal principle of functional organization in microbiome assembly.
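The projection of a bipartite taxon-function network onto a weighted genus network, as described above, can be sketched as follows. This assumes the simplest weighting, the number of shared expressed functions, and uses weighted degree as the hub criterion; the abstract does not state which weighting or hub definition the authors use. The genus labels and incidence matrix are invented for illustration.

```python
import numpy as np

# Hypothetical incidence matrix: rows are genera, columns are expressed functions.
genera = ["GenusA", "GenusB", "GenusC", "GenusD"]
incidence = np.array(
    [
        [1, 1, 0],
        [1, 0, 1],
        [0, 1, 1],
        [1, 1, 1],
    ]
)

# One-mode projection: the edge weight between two genera is the number of
# functions they both express (assumed weighting).
weights = incidence @ incidence.T
np.fill_diagonal(weights, 0)  # drop self-loops

# Weighted degree (strength) as a simple hub score (assumed criterion).
strength = weights.sum(axis=1)
hub = genera[int(np.argmax(strength))]
```

In this toy matrix the genus that expresses every function shares the most functions with the others, so it has the largest weighted degree and is flagged as the hub.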


Author(s):  
Nguyen Thanh Tung ◽  
Joshua Zhexue Huang ◽  
Imran Khan ◽  
Mark Junjie Li ◽  
Graham Williams
