Computational Strategies for Biological Interpretation of Metabolomics Data

Author(s):  
Jianguo Xia
2011 ◽  
Vol 2011 ◽  
pp. 1-7 ◽  
Author(s):  
Gabi Kastenmüller ◽  
Werner Römisch-Margl ◽  
Brigitte Wägele ◽  
Elisabeth Altmaier ◽  
Karsten Suhre

Metabolomics is an emerging field that is based on the quantitative measurement of as many small organic molecules occurring in a biological sample as possible. Due to recent technical advances, metabolomics can now be used widely as an analytical high-throughput technology in drug testing and in epidemiological metabolome- and genome-wide association studies. Analogous to chip-based gene expression analyses, the enormous amount of data produced by modern kit-based metabolomics experiments poses new challenges regarding their biological interpretation in the context of various sample phenotypes. We developed metaP-server to facilitate data interpretation. metaP-server provides automated and standardized data analysis for quantitative metabolomics data, covering the following steps from data acquisition to biological interpretation: (i) data quality checks, (ii) estimation of reproducibility and batch effects, (iii) hypothesis tests for multiple categorical phenotypes, (iv) correlation tests for metric phenotypes, (v) optionally including all possible pairs of metabolite concentration ratios, (vi) principal component analysis (PCA), and (vii) mapping of metabolites onto colored KEGG pathway maps. Graphical output is clickable and cross-linked to sample and metabolite identifiers. Interactive coloring of PCA and bar plots by phenotype facilitates on-line data exploration. For users of commercial metabolomics kits, cross-references to the HMDB, LipidMaps, KEGG, PubChem, and CAS databases are provided. metaP-server is freely accessible at http://metabolomics.helmholtz-muenchen.de/metap2/.
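
Steps (iv) and (v) of the abstract above, correlation tests against a metric phenotype over all pairs of metabolite concentration ratios, can be illustrated with a minimal sketch. This is not metaP-server's code. The function names, the log transform of the ratios, the use of Pearson correlation, and the toy data are all assumptions made for illustration.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def log_ratios(concentrations):
    """Map each unordered metabolite pair (a, b) to log(a / b) per sample.

    concentrations: dict of metabolite name -> list of positive values,
    one value per sample (hypothetical input layout).
    """
    return {
        (a, b): [math.log(x / y) for x, y in zip(concentrations[a], concentrations[b])]
        for a, b in combinations(sorted(concentrations), 2)
    }


def rank_ratios_by_correlation(concentrations, phenotype):
    """Rank metabolite pairs by absolute correlation of their log ratio with the phenotype."""
    scored = [
        (pair, pearson(values, phenotype))
        for pair, values in log_ratios(concentrations).items()
    ]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Toy data, invented for illustration only: four samples, three metabolites.
toy = {
    "M1": [1.0, 2.0, 4.0, 8.0],
    "M2": [1.0, 1.0, 1.0, 1.0],
    "M3": [2.0, 1.0, 3.0, 1.5],
}
phenotype = [0.0, 1.0, 2.0, 3.0]
ranking = rank_ratios_by_correlation(toy, phenotype)
```

With n metabolites there are n(n-1)/2 ratios, so a real analysis would also need a multiple-testing correction, which this sketch omits.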


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Mihir Mongia ◽  
Hosein Mohimani

Abstract Various studies have shown associations between molecular features and phenotypes of biological samples. These studies, however, focus on a single phenotype per study and are not applicable to repository-scale metabolomics data. Here we report MetSummarizer, a method for predicting (i) the biological phenotypes of environmental and host-oriented samples, and (ii) the raw ingredient composition of complex mixtures. We show that the aggregation of various metabolomic datasets can improve the accuracy of predictions. Since these datasets have been collected using different standards at various laboratories, in order to get unbiased results it is crucial to detect and discard standard-specific features during the classification step. We further report high accuracy in prediction of the raw ingredient composition of complex foods from the Global Foodomics Project.


2021 ◽  
Vol 16 (1) ◽  
Author(s):  
Biting Wang ◽  
Zengrui Wu ◽  
Weihua Li ◽  
Guixia Liu ◽  
Yun Tang

Abstract
Background The traditional Chinese medicine Huangqi decoction (HQD) consists of Radix Astragali and Radix Glycyrrhizae in a ratio of 6:1 and has been used for the treatment of liver fibrosis. In this study, we tried to elucidate its mechanism of action (MoA) via a combination of metabolomics data, network pharmacology and molecular docking methods.
Methods First, we collected prototype components and metabolic products after administration of HQD from a publication. With known and predicted targets, compound-target interactions were obtained. Then, the global compound-liver fibrosis target bipartite network and the HQD-liver fibrosis protein–protein interaction network were constructed separately. KEGG pathway analysis was applied to further understand the mechanisms related to the target proteins of HQD. Additionally, molecular docking simulation was performed to determine the binding efficiency of compounds with targets. Finally, considering the concentrations of prototype compounds and metabolites of HQD, the critical compound-liver fibrosis target bipartite network was constructed.
Results In total, 68 compounds, comprising 17 prototype components and 51 metabolic products, were collected, and 540 compound-target interactions were obtained between the 68 compounds and 95 targets. Combining network analysis, molecular docking and concentration of compounds, our final results demonstrated that eight compounds (three prototype compounds and five metabolites) and eight targets (CDK1, MMP9, PPARD, PPARG, PTGS2, SERPINE1, TP53, and HIF1A) might contribute to the effects of HQD on liver fibrosis. These interactions would maintain the balance of ECM, reduce liver damage, inhibit hepatocyte apoptosis, and alleviate liver inflammation through five signaling pathways: the p53, PPAR, HIF-1, IL-17, and TNF signaling pathways.
Conclusions This study provides a new way to understand the MoA of HQD on liver fibrosis by considering the concentrations of components and metabolites, which might be a model for investigation of MoA of other Chinese herbs.
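
The compound-target bipartite network described in the Methods can be pictured with a small sketch. It is only an illustration of the data structure, not the authors' pipeline. The function names are hypothetical, the compound labels are placeholders, and the edges below are invented for the example and are not findings of the study.

```python
from collections import defaultdict


def build_bipartite(interactions):
    """Build both adjacency views of a compound-target bipartite network.

    interactions: iterable of (compound, target) pairs.
    Returns (compound -> set of targets, target -> set of compounds).
    """
    compound_to_targets = defaultdict(set)
    target_to_compounds = defaultdict(set)
    for compound, target in interactions:
        compound_to_targets[compound].add(target)
        target_to_compounds[target].add(compound)
    return compound_to_targets, target_to_compounds


def rank_by_degree(adjacency):
    """Sort nodes by number of neighbours (descending), ties broken by name."""
    return sorted(
        ((node, len(neighbours)) for node, neighbours in adjacency.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Invented edges for illustration only; target symbols are borrowed from the
# abstract, but these compound-target links are NOT results of the study.
toy_interactions = [
    ("compound_A", "TP53"),
    ("compound_A", "PTGS2"),
    ("compound_B", "PTGS2"),
    ("compound_C", "PTGS2"),
    ("compound_C", "MMP9"),
]
compounds, targets = build_bipartite(toy_interactions)
target_ranking = rank_by_degree(targets)
```

Node degree is only one possible way to prioritize targets. The abstract states that docking results and compound concentrations were also taken into account, which this sketch does not model.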


Metabolites ◽  
2020 ◽  
Vol 11 (1) ◽  
pp. 8
Author(s):  
Michiel Bongaerts ◽  
Ramon Bonte ◽  
Serwet Demirdas ◽  
Edwin H. Jacobs ◽  
Esmee Oussoren ◽  
...  

Untargeted metabolomics is an emerging technology in the laboratory diagnosis of inborn errors of metabolism (IEM). Analysis of a large number of reference samples is crucial for correcting variations in metabolite concentrations that result from factors such as diet, age, and gender, in order to judge whether metabolite levels are abnormal. However, a large number of reference samples requires the use of out-of-batch samples, which is hampered by the semi-quantitative nature of untargeted metabolomics data, i.e., technical variations between batches. Methods to merge and accurately normalize data from multiple batches are urgently needed. Based on six metrics, we compared the existing normalization methods on their ability to reduce the batch effects from nine independently processed batches. Many of those showed marginal performance, which motivated us to develop Metchalizer, a normalization method that uses 10 stable isotope-labeled internal standards and a mixed effect model. In addition, we propose a regression model with age and sex as covariates fitted on reference samples that were obtained from all nine batches. Metchalizer applied on log-transformed data showed the most promising performance on batch effect removal, as well as in the detection of 195 known biomarkers across 49 IEM patient samples, and performed at least comparably to an approach utilizing 15 within-batch reference samples. Furthermore, our regression model indicates that 6.5–37% of the considered features showed significant age-dependent variations. Our comprehensive comparison of normalization methods showed that our Log-Metchalizer approach enables the use of out-of-batch reference samples to establish clinically relevant reference values for metabolite concentrations. These findings open up the possibility of using large-scale out-of-batch reference samples in a clinical setting, increasing the throughput and detection accuracy.
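
The idea of correcting batch-level signal drift with internal standards on log-transformed data can be shown with a deliberately simplified sketch. It is not Metchalizer: the mixed effect model is replaced here by a plain per-sample offset (the mean log intensity of the internal standards). The function name, feature names, and intensities are hypothetical.

```python
import math
from statistics import mean


def normalize_by_internal_standards(sample, standard_names):
    """Subtract the mean log intensity of the internal standards from each feature.

    sample: dict of feature name -> raw positive intensity for one sample.
    standard_names: set of keys in `sample` that are internal standards.
    Returns log-scale values for the non-standard features only.
    """
    offset = mean(math.log(sample[name]) for name in standard_names)
    return {
        name: math.log(value) - offset
        for name, value in sample.items()
        if name not in standard_names
    }


# Invented example: the same material measured in two batches, where the
# second batch has a uniform 2x signal drift (an assumption of this sketch).
standards = {"IS_1", "IS_2"}
batch_1 = {"IS_1": 100.0, "IS_2": 400.0, "feature_x": 200.0}
batch_2 = {"IS_1": 200.0, "IS_2": 800.0, "feature_x": 400.0}

norm_1 = normalize_by_internal_standards(batch_1, standards)
norm_2 = normalize_by_internal_standards(batch_2, standards)
```

A uniform multiplicative drift cancels exactly under this scheme, so `norm_1` and `norm_2` agree. Real batch effects are usually feature-dependent, which is presumably why a per-feature model is used in practice.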


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Kalifa Manjang ◽  
Shailesh Tripathi ◽  
Olli Yli-Harja ◽  
Matthias Dehmer ◽  
Galina Glazko ◽  
...  

Abstract The identification of prognostic biomarkers for predicting cancer progression is an important problem for two reasons. First, such biomarkers find practical application in a clinical context for the treatment of patients. Second, interrogation of the biomarkers themselves is assumed to lead to novel insights into disease mechanisms and the underlying molecular processes that cause the pathological behavior. For breast cancer, many signatures based on gene expression values have been reported to be associated with overall survival. Consequently, such signatures have been used for suggesting biological explanations of breast cancer and drug mechanisms. In this paper, we demonstrate for a large number of breast cancer signatures that such an implication is not justified. Our approach systematically eliminates all traces of biological meaning of signature genes and shows that among the remaining genes, surrogate gene sets can be formed with indistinguishable prognostic prediction capabilities and opposite biological meaning. Hence, our results demonstrate that none of the studied signatures has a sensible biological interpretation or meaning with respect to disease etiology. Overall, this shows that prognostic signatures are black-box models with sensible predictions of breast cancer outcome but no value for revealing causal connections. Furthermore, we show that the number of such surrogate gene sets is not small but very large.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Joshua E. Lewis ◽  
Melissa L. Kemp

Abstract Resistance to ionizing radiation, a first-line therapy for many cancers, is a major clinical challenge. Personalized prediction of tumor radiosensitivity is not currently implemented clinically due to insufficient accuracy of existing machine learning classifiers. Despite the acknowledged role of tumor metabolism in radiation response, metabolomics data is rarely collected in large multi-omics initiatives such as The Cancer Genome Atlas (TCGA) and is consequently omitted from algorithm development. In this study, we circumvent the paucity of personalized metabolomics information by characterizing 915 TCGA patient tumors with genome-scale metabolic Flux Balance Analysis models generated from transcriptomic and genomic datasets. Metabolic biomarkers differentiating radiation-sensitive and -resistant tumors are predicted and experimentally validated, enabling integration of metabolic features with other multi-omics datasets into ensemble-based machine learning classifiers for radiation response. These multi-omics classifiers show improved classification accuracy, identify clinical patient subgroups, and demonstrate the utility of personalized blood-based metabolic biomarkers for radiation sensitivity. The integration of machine learning with genome-scale metabolic modeling represents a significant methodological advancement for identifying prognostic metabolite biomarkers and predicting radiosensitivity for individual patients.
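
For readers unfamiliar with Flux Balance Analysis, its textbook form is a linear program over reaction fluxes at steady state. The notation below is the standard one and is an assumption here, since the abstract does not give it: S is the stoichiometric matrix, v the flux vector, c the objective weights, and the bounds are per-reaction limits. How the authors derive those bounds from transcriptomic and genomic data is not specified in the abstract.

```latex
\begin{aligned}
\max_{v} \quad & c^{\top} v \\
\text{subject to} \quad & S\,v = 0, \\
& v_j^{\min} \le v_j \le v_j^{\max} \quad \text{for every reaction } j .
\end{aligned}
```

The constraint S v = 0 expresses that every internal metabolite is produced and consumed at equal rates. Patient-specific models typically differ in the bounds placed on individual reactions.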

