MiMeNet: Exploring microbiome-metabolome relationships using neural networks

2021 ◽  
Vol 17 (5) ◽  
pp. e1009021
Author(s):  
Derek Reiman ◽  
Brian T. Layden ◽  
Yang Dai

Advances in microbiome and metabolome studies have generated rich omics data revealing the involvement of the microbial community in host disease pathogenesis through interactions with the host at a metabolic level. However, the computational tools to uncover these relationships are just emerging. Here, we present MiMeNet, a neural network framework for modeling microbe-metabolite relationships. Using ten iterations of 10-fold cross-validation on three paired microbiome-metabolome datasets, we show that MiMeNet more accurately predicts metabolite abundances (mean Spearman correlation coefficients increase from 0.108 to 0.309, 0.276 to 0.457, and -0.272 to 0.264) and identifies more well-predicted metabolites (increase in the number of well-predicted metabolites from 198 to 366, 104 to 143, and 4 to 29) compared to state-of-the-art linear models for individual metabolite predictions. Additionally, we demonstrate that MiMeNet can group microbes and metabolites with similar interaction patterns and functions to illuminate the underlying structure of the microbe-metabolite interaction network, which could potentially shed light on uncharacterized metabolites through “Guilt by Association”. Our results demonstrate that MiMeNet is a powerful tool to provide insights into the causes of metabolic dysregulation in disease, facilitating future hypothesis generation at the interface of the microbiome and metabolomics.
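The evaluation protocol described above (repeated 10-fold cross-validation scored by Spearman correlation) can be sketched roughly as follows. This is only an illustration of the scoring and splitting logic, not the MiMeNet implementation: the function names `average_ranks`, `spearman`, and `repeated_kfold` are hypothetical, and no model is trained here.

```python
import random


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def repeated_kfold(n_samples, n_folds=10, n_repeats=10, seed=0):
    """Yield (repeat, train_indices, test_indices) for repeated k-fold CV."""
    rng = random.Random(seed)
    for rep in range(n_repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # a fresh random partition for every repeat
        for fold in range(n_folds):
            test = idx[fold::n_folds]
            held = set(test)
            train = [i for i in idx if i not in held]
            yield rep, train, test
```

In such a setup, a per-metabolite score would be the Spearman correlation between observed and predicted abundances on each test fold, averaged over all folds and repeats.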

2020 ◽  
Author(s):  
Derek Reiman ◽  
Brian T Layden ◽  
Yang Dai

Advances in microbiome and metabolome studies have generated rich omics data revealing the involvement of the microbial community in host disease pathogenesis through interactions with the host at a metabolic level. However, the computational tools to uncover these relationships are just emerging. Here, we present MiMeNet, a neural network framework for modeling microbe-metabolite relationships. Using ten iterations of 10-fold cross-validation on three paired microbiome-metabolome datasets, we show that MiMeNet more accurately predicts metabolite abundances (mean Spearman correlation coefficients increase from 0.108 to 0.309, 0.276 to 0.457, and -0.272 to 0.264) and identifies more well-predicted metabolites (increase in the number of well-predicted metabolites from 198 to 366, 104 to 143, and 4 to 29) compared to state-of-the-art linear models for individual metabolite predictions. Additionally, we demonstrate that MiMeNet can group microbes and metabolites with similar interaction patterns and functions to illuminate the underlying structure of the microbe-metabolite interaction network, which could potentially shed light on uncharacterized metabolites through "Guilt by Association". Our results demonstrate that MiMeNet is a powerful tool to provide insights into the causes of metabolic dysregulation in disease, facilitating future hypothesis generation at the interface of the microbiome and metabolomics.


2013 ◽  
Vol 33 (1) ◽  
pp. 116-121
Author(s):  
Luisa Pereira Figueiredo ◽  
Marali Vilela Dias ◽  
Wanderson Alexandre Valente ◽  
Soraia Vilela Borges ◽  
Anirene Galvão Tavares Pereira ◽  
...  

The industrialization of passion fruit in the form of juice produces considerable amounts of residue that could be used as food. The objective of the present study was to determine the effects of the volume of passion fruit juice added to the syrup and the cooking time on the color and texture of passion fruit albedo preserved in syrup. Multi-linear models fit well to describe the a* value (for the albedo) and the b* values (for the albedo and syrup), with high correlation coefficients of 98%, 84%, and 88%, respectively. The volume of passion fruit juice added and the cooking time of the albedos in the syrup significantly affected the color measurements of the preserves. The texture was not affected by the parameters studied. Therefore, the use of larger volumes of passion fruit juice and longer cooking times is recommended for the production of passion fruit albedo preserves in syrup to achieve the characteristic yellow color of the fruit.


Autoregressive (AR) random fields are widely used to describe changes in the state of real physical objects and to analyze linear and non-linear models. AR models are Markov processes with higher-order dependence for one-dimensional time series. Various estimation methods have been used to evaluate the autoregression parameters. In many applications background knowledge can shed light on the search for a suitable model, but other applications lack this knowledge and often require trial and error to choose a model. This article presents a brief survey of the literature on linear and non-linear autoregression models, including several extensions of the main models and the models developed. Using autoregression to describe such systems requires models of sufficiently high order, which increases the computational cost.
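As a concrete illustration of an autoregressive model and of parameter estimation, the sketch below simulates an AR(p) series and estimates the coefficient of a zero-mean AR(1) model by least squares. It is a minimal example under stated assumptions (Gaussian noise, zero mean, hypothetical function names `simulate_ar` and `fit_ar1`), not one of the estimation methods surveyed in the article.

```python
import random


def simulate_ar(coeffs, n, noise_sd=1.0, seed=0):
    """Simulate x_t = sum_k coeffs[k] * x_{t-1-k} + Gaussian noise."""
    rng = random.Random(seed)
    p = len(coeffs)
    x = [0.0] * p  # zero start-up values, dropped from the returned series
    for _ in range(n):
        past = x[-1:-p - 1:-1]  # the p most recent values, newest first
        x.append(sum(c * v for c, v in zip(coeffs, past)) + rng.gauss(0.0, noise_sd))
    return x[p:]


def fit_ar1(x):
    """Least-squares estimate of phi in the zero-mean model x_t = phi * x_{t-1} + e_t."""
    num = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    den = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    return num / den


series = simulate_ar([0.6], n=5000)
phi_hat = fit_ar1(series)  # should land close to the true coefficient 0.6
```

Higher-order models need one coefficient per lag, so the least-squares problem grows with the order, which is one way to see the computational cost noted above.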


2021 ◽  
Author(s):  
Ville N Pimenoff ◽  
Ramon Cleries

Viruses infecting humans are manifold, and several of them provoke significant morbidity and mortality. Simulations that create large synthetic datasets from observed multiple viral strain infections in a limited population sample can be a powerful tool to infer significant pathogen occurrence and interaction patterns, particularly if only a limited number of observed data units is available. Here, to demonstrate diverse human papillomavirus (HPV) strain occurrence patterns, we used log-linear models combined with a Bayesian framework for graphical independence network (GIN) analysis. That is, we simulated datasets by modeling the probabilistic associations between observed viral data points, i.e., different viral strain infections in a set of population samples. Our GIN analysis outperformed, in precision, all oversampling methods tested for simulating a large synthetic viral strain-level prevalence dataset from an observed set of HPV data. Altogether, we demonstrate that network modeling is a potent tool for creating synthetic viral datasets for comprehensive pathogen occurrence and interaction pattern estimations.
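To make the log-linear idea concrete, the sketch below fits the simplest such model, independence between two strains, to a 2x2 co-infection table and compares it with the observed association. This is a deliberately reduced illustration and not the Bayesian GIN procedure of the study; the counts and the function names `independence_fit` and `odds_ratio` are hypothetical.

```python
def independence_fit(table):
    """Fitted counts of the independence log-linear model for a two-way table.

    Under independence, the fitted count of a cell is
    (row total * column total) / grand total.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return [[r * c / n for c in col_totals] for r in row_totals]


def odds_ratio(table):
    """Odds ratio of a 2x2 table; values far from 1 suggest association."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)


# Hypothetical counts: rows = strain A present/absent,
# columns = strain B present/absent.
observed = [[40, 10], [10, 40]]
expected = independence_fit(observed)
association = odds_ratio(observed)
```

A large gap between observed and fitted counts indicates that an interaction term between the two strains would be needed, which is the kind of dependence a graphical independence network encodes as an edge.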


Author(s):  
Jozef Bujko ◽  
Juraj Candrák ◽  
Peter Strapák ◽  
Július Žitný ◽  
Cyril Hrnčár ◽  
...  

The aim of the study was to analyse reproduction and the factors affecting reproduction traits of dairy cows in the population of Slovak Spotted cattle from 2007 to 2016. The results for 37,274 dairy cows covered days to first service (DFS), days open (DO), number of inseminations per conception (NIC), age at first calving (AFC) and calving interval (CI). The basic statistical analyses were performed using SAS version 9.3. For the actual computation, a linear model with fixed effects was used: $y_{ijklm} = \mu + HYS_i + BT_j + F_k + B_l + e_{ijklm}$. The linear model with all fixed effects yielded coefficients of determination of R² = 0.452117% (P < 0.001) for DFS, R² = 0.377715% (P < 0.001) for DO, R² = 0.348442% (P < 0.001) for NIC and R² = 0.317128% (P < 0.001) for CI. Correlation coefficients of DFS with DO, NIC, AFC and CI were r = 0.37275, r = -0.06881, r = 0.06493 and r = 0.08348, respectively. These coefficients were highly statistically significant (P < 0.001).
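The coefficient of determination of a fixed-effects linear model can be illustrated with a much smaller case. The sketch below assumes a single categorical fixed effect, for which the least-squares fitted value of each record is simply its group mean; the study's model has four fixed effects and was fitted in SAS, so this is only a simplified analogue. The function name `one_factor_r2` and the example data are hypothetical.

```python
from collections import defaultdict


def one_factor_r2(levels, y):
    """R^2 of the model y_ij = mu + effect_i + e_ij with one fixed effect.

    With a single categorical factor, the least-squares fit of each
    observation is the mean of its group, so R^2 = 1 - SSE / SST.
    """
    groups = defaultdict(list)
    for level, value in zip(levels, y):
        groups[level].append(value)
    group_mean = {level: sum(v) / len(v) for level, v in groups.items()}
    grand_mean = sum(y) / len(y)
    sse = sum((value - group_mean[level]) ** 2 for level, value in zip(levels, y))
    sst = sum((value - grand_mean) ** 2 for value in y)
    return 1.0 - sse / sst


# Hypothetical records: a herd-year-season label and days to first service.
hys = ["h1", "h1", "h2", "h2"]
dfs = [60.0, 80.0, 100.0, 120.0]
r2 = one_factor_r2(hys, dfs)
```

With several fixed effects the fitted values are no longer plain group means and a least-squares solver is needed, but the definition of R² as one minus the ratio of residual to total sum of squares is unchanged.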


2020 ◽  
Author(s):  
M. Natália Dias Soeiro Cordeiro ◽  
Amit Kumar Halder

Quantitative structure-activity relationship (QSAR) modelling is a well-known computational tool, often used in a wide variety of applications. Yet one of the major drawbacks of conventional QSAR modelling tools is that models are set up based on a limited number of experimental and/or theoretical conditions. To overcome this, the so-called multitasking or multi-target QSAR (mt-QSAR) approaches have emerged as new computational tools able to integrate diverse chemical and biological data into a single model equation, thus extending and improving the reliability of this type of modelling. We have developed QSAR-Co-X, an open source Python-based toolkit (available to download at https://github.com/ncordeirfcup/QSAR-Co-X) for supporting mt-QSAR modelling following the Box-Jenkins moving average approach. The new toolkit embodies several functionalities for dataset selection and curation plus computation of descriptors, for setting up linear and non-linear models, as well as for a comprehensive analysis of results. The workflow within this toolkit is guided by a cohort of multiple statistical parameters along with graphical outputs for assessing both the predictivity and the robustness of the derived mt-QSAR models. To monitor and demonstrate the functionalities of the designed toolkit, three case studies pertaining to previously reported datasets are examined here. We believe that this new toolkit, along with our previously launched QSAR-Co code, will significantly contribute to making mt-QSAR modelling widely and routinely applicable.
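One common formulation of the Box-Jenkins moving average idea in the mt-QSAR literature replaces each descriptor with its deviation from the average descriptor value of compounds measured under the same experimental condition. The sketch below illustrates that formulation under stated assumptions: it averages over all compounds sharing a condition (published variants may average over a subset, such as active compounds only), and the function name `deviation_descriptors` is hypothetical rather than part of the QSAR-Co-X API.

```python
from collections import defaultdict


def deviation_descriptors(descriptors, conditions):
    """Return d_i - mean(d over compounds sharing the condition of i).

    descriptors: one descriptor value per compound.
    conditions:  the experimental-condition label of each compound.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for value, cond in zip(descriptors, conditions):
        totals[cond] += value
        counts[cond] += 1
    return [value - totals[cond] / counts[cond]
            for value, cond in zip(descriptors, conditions)]


# Hypothetical descriptor values measured under two assay conditions.
values = [1.0, 3.0, 10.0, 20.0]
assays = ["assayA", "assayA", "assayB", "assayB"]
deviations = deviation_descriptors(values, assays)
```

Because the deviations carry information about both the structure and the condition, a single model equation can then be trained across all conditions at once.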


Author(s):  
Ricardo M. Llamas ◽  
Mario Guevara ◽  
Danny Rorabaugh ◽  
Michela Taufer ◽  
Rodrigo Vargas

Soil moisture plays a key role in the Earth's water and carbon cycles, but acquisition of continuous (i.e., gap-free) soil moisture measurements across large regions is a challenging task due to limitations of currently available point measurements. Satellites offer critical information for soil moisture over large areas on a regular basis (e.g., ESA CCI, NASA SMAP); however, there are regions where satellite-derived soil moisture cannot be estimated because of conditions such as high canopy density, frozen soil, or extremely dry conditions. We compared and tested two approaches--Ordinary Kriging (OK) interpolation and General Linear Models (GLM)--to model soil moisture and fill spatial data gaps from the European Space Agency Climate Change Initiative (ESA CCI) version 3.2 (and compared them with version 4.4) from January 2000 to September 2012, over a region of 465,777 km² across the Midwest of the USA. We tested our proposed methods to fill gaps in the original ESA CCI product and in two data subsets, removing 25% and 50% of the initially available valid pixels. We found a significant correlation coefficient (r = 0.523, RMSE = 0.092 m³ m⁻³) between the original satellite-derived soil moisture product and ground-truth data from the North American Soil Moisture Database (NASMD). Predicted soil moisture using OK also had significant correlation coefficients with NASMD data when using 100% (r = 0.522, RMSE = 0.092 m³ m⁻³), 75% (r = 0.526, RMSE = 0.092 m³ m⁻³) and 50% (r = 0.53, RMSE = 0.092 m³ m⁻³) of available valid pixels for each month of the study period. GLM had lower but significant correlation coefficients with NASMD data (average r = 0.478, RMSE = 0.092 m³ m⁻³) when using the same subsets of available data (i.e., 100%, 75%, 50%). Our results provide support for OK as a technique to gap-fill spatial missing values of satellite-derived soil moisture products across the Midwest of the USA.
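The hold-out design described above (withhold a fraction of valid pixels, gap-fill them, and score the result) can be sketched as follows. Note that the interpolator here is inverse distance weighting, swapped in as a simpler stand-in for Ordinary Kriging, which would additionally require fitting a variogram. The grid, the values, and the function names `hold_out`, `idw_predict`, and `rmse` are all hypothetical.

```python
import math
import random


def hold_out(pixels, fraction, seed=0):
    """Split {(row, col): value} into kept and withheld pixel dictionaries."""
    rng = random.Random(seed)
    keys = sorted(pixels)
    withheld_keys = set(rng.sample(keys, int(len(keys) * fraction)))
    kept = {k: v for k, v in pixels.items() if k not in withheld_keys}
    withheld = {k: pixels[k] for k in withheld_keys}
    return kept, withheld


def idw_predict(kept, target, power=2.0):
    """Inverse-distance-weighted estimate at target from the kept pixels.

    A stand-in for kriging: weights depend only on distance, not on a
    fitted spatial covariance model.
    """
    num = den = 0.0
    for (row, col), value in kept.items():
        dist = math.hypot(row - target[0], col - target[1])
        if dist == 0.0:
            return value
        weight = 1.0 / dist ** power
        num += weight * value
        den += weight
    return num / den


def rmse(predicted, truth):
    """Root mean squared error over the keys of truth."""
    return math.sqrt(sum((predicted[k] - truth[k]) ** 2 for k in truth) / len(truth))


# Hypothetical 6x6 soil moisture grid with a smooth west-to-east gradient.
grid = {(r, c): 0.20 + 0.01 * c for r in range(6) for c in range(6)}
kept, withheld = hold_out(grid, fraction=0.25)
filled = {k: idw_predict(kept, k) for k in withheld}
error = rmse(filled, withheld)
```

Repeating the split with `fraction=0.5` mirrors the second data subset in the study design and shows how the error responds to sparser input.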


Author(s):  
Renique Murray

The husk of fresh cocoa pods has traditionally been considered a waste by-product in the production of chocolate and other related confectionaries. However, in recent times new research has shed light on an increasing number of uses for this material. Of particular interest are applications that utilize the cocoa pod husk (CPH) for its mechanical properties. In most instances, the CPH raw material is allowed to age for several days before pre-processing or utilization in the intended application. Despite this, the impact of aging on its mechanical properties is an area that has not been well investigated. Consequently, this work seeks to determine the impact of aging upon the mechanical properties of CPH. To investigate this, several CPH properties were identified and selected for evaluation. These included CPH tensile strength, CPH compressive strength, cocoa pod transverse compressive strength, cocoa pod longitudinal compressive strength, CPH cutting force, cocoa pod cutting force, CPH hardness, and CPH colour. These properties were subsequently assessed over an aging period of seven days. The results obtained indicated that most CPH mechanical properties vary significantly with aging time. Moreover, CPH colour was found to be strongly related to the mechanical properties of pod longitudinal compressive strength and CPH hardness, with correlation coefficients of -0.71 and 0.86 respectively. Further, these relationships were found to be strongly linear in nature, and regression analyses indicated that up to 83% of the variation in longitudinal compressive strength can be accounted for by changes in colour, hardness and aging time. These results provide the basis for the potential development of image analysis and computer vision approaches to CPH sorting and grading.


2019 ◽  
Author(s):  
Arnaud Cougoul ◽  
Xavier Bailly ◽  
Ernst C. Wit

Microorganisms often live in symbiotic relationship with their environment and they play a central role in many biological processes. They form a complex system of interacting species. Within the gut microbiota these interaction patterns have been shown to be involved in obesity, diabetes and mental disease. Understanding the mechanisms that govern this ecosystem is therefore an important scientific challenge. Recently, the acquisition of large samples of microbiota data through metabarcoding or metagenomics has become easier. Until now correlation-based network analysis and graphical modelling have been used to identify the putative interaction networks formed by the species of microorganisms, but these methods do not take into account all features of microbiota data. Indeed, correlation-based networks cannot distinguish between direct and indirect correlations, and simple graphical models cannot include covariates such as environmental factors that shape the microbiota abundance. Furthermore, the compositional nature of the microbiota data is often ignored, or existing normalizations are based on log-transformations, which is somewhat arbitrary and therefore affects the results in unknown ways. We have developed a novel method, called MAGMA, for detecting interactions between microbiota that takes into account the noisy structure of the microbiota data, involving an excess of zero counts, overdispersion, compositionality and possible covariate inclusion. The method is based on copula Gaussian graphical models whereby we model the marginals with zero-inflated negative binomial generalized linear models. The inference is based on an efficient median imputation procedure combined with the graphical lasso. We show that our method beats all existing methods in recovering microbial association networks in an extensive simulation study. Moreover, the analysis of two 16S microbial data studies with our method reveals interesting new biology. MAGMA is implemented as an R package and is freely available at https://gitlab.com/arcgl/rmagma, which also includes the scripts used to prepare the material in this paper.
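The zero-inflated negative binomial marginal mentioned above can be written down directly. The sketch below uses the mean/dispersion parameterisation of the negative binomial and mixes it with a point mass at zero; it illustrates the distribution only, not the MAGMA R package, and the names `nb_log_pmf` and `zinb_pmf` and the parameter values are hypothetical.

```python
import math


def nb_log_pmf(k, mu, theta):
    """Log-pmf of a negative binomial with mean mu > 0 and dispersion theta > 0.

    The variance is mu + mu**2 / theta, so small theta means strong
    overdispersion relative to a Poisson with the same mean.
    """
    return (math.lgamma(k + theta) - math.lgamma(theta) - math.lgamma(k + 1)
            + theta * math.log(theta / (theta + mu))
            + k * math.log(mu / (theta + mu)))


def zinb_pmf(k, mu, theta, pi):
    """Zero-inflated NB: extra zero with probability pi, NB count otherwise."""
    nb = math.exp(nb_log_pmf(k, mu, theta))
    return (pi if k == 0 else 0.0) + (1.0 - pi) * nb


# Hypothetical taxon: mean count 3, dispersion 1.5, 30% structural zeros.
p_zero = zinb_pmf(0, mu=3.0, theta=1.5, pi=0.3)
```

In a generalized linear model, `mu` would be tied to covariates through a log link, which is how environmental factors can enter the marginal distributions.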

