Pathway Tools version 23.0 update: software for pathway/genome informatics and systems biology

Briefings in Bioinformatics ◽ 10.1093/bib/bbz104 ◽ 2019 ◽ Cited By ~ 6

Author(s): Peter D Karp ◽ Peter E Midford ◽ Richard Billington ◽ Anamika Kothari ◽ Markus Krummenacker ◽ ...

Keyword(s): Data Analysis ◽ Metabolic Networks ◽ Metabolic Flux ◽ Metabolic Model ◽ Supplementary Information ◽ Biological Knowledge ◽ Omics Data ◽ Dynamic Interactions ◽ Metabolomics Data ◽ Knowledge Resources

Motivation: Biological systems function through dynamic interactions among genes and their products, regulatory circuits and metabolic networks. Our development of the Pathway Tools software was motivated by the need to construct biological knowledge resources that combine these many types of data, and that enable users to find and comprehend data of interest as quickly as possible through query and visualization tools. Further, we sought to support the development of metabolic flux models from pathway databases, and to use pathway information to leverage the interpretation of high-throughput data sets. Results: In the past 4 years we have enhanced the already extensive Pathway Tools software in several respects. It can now support metabolic-model execution through the Web; it provides a more accurate gap filler for metabolic models; it supports development of models for organism communities distributed across a spatial grid; and model results may be visualized graphically. Pathway Tools supports several new omics-data analysis tools including the Omics Dashboard, multi-pathway diagrams called pathway collages, a pathway-covering algorithm for metabolomics data analysis and an algorithm for generating mechanistic explanations of multi-omics data. We have also improved the core pathway/genome database management capabilities of the software, providing new multi-organism search tools for organism communities, improved graphics rendering, faster performance and re-designed gene and metabolite pages. Availability: The software is free for academic use; a fee is required for commercial use. See http://pathwaytools.com. Contact: [email protected] Supplementary information: Supplementary data are available at Briefings in Bioinformatics online.


Atom Identifiers Generated by a Neighborhood-Specific Graph Coloring Method Enable Compound Harmonization across Metabolic Databases

Metabolites ◽ 10.3390/metabo10090368 ◽ 2020 ◽ Vol 10 (9) ◽ pp. 368

Author(s): Huan Jin ◽ Joshua M. Mitchell ◽ Hunter N. B. Moseley

Keyword(s): Metabolic Network ◽ Graph Coloring ◽ Metabolic Networks ◽ Metabolic Flux ◽ Enzyme Commission ◽ Metabolic Model ◽ Metabolic Reprogramming ◽ Automatic Identification ◽ Flux Analysis ◽ Database Curation

Metabolic flux analysis requires both a reliable metabolic model and reliable metabolic profiles in characterizing metabolic reprogramming. Advances in analytic methodologies enable production of high-quality metabolomics datasets capturing isotopic flux. However, useful metabolic models can be difficult to derive due to the lack of relatively complete atom-resolved metabolic networks for a variety of organisms, including human. Here, we developed a neighborhood-specific graph coloring method that creates unique identifiers for each atom in a compound facilitating construction of an atom-resolved metabolic network. What is more, this method is guaranteed to generate the same identifier for symmetric atoms, enabling automatic identification of possible additional mappings caused by molecular symmetry. Furthermore, a compound coloring identifier derived from the corresponding atom coloring identifiers can be used for compound harmonization across various metabolic network databases, which is an essential first step in network integration. With the compound coloring identifiers, 8865 correspondences between KEGG (Kyoto Encyclopedia of Genes and Genomes) and MetaCyc compounds are detected, with 5451 of them confirmed by other identifiers provided by the two databases. In addition, we found that the Enzyme Commission numbers (EC) of reactions can be used to validate possible correspondence pairs, with 1848 unconfirmed pairs validated by commonality in reaction ECs. Moreover, we were able to detect various issues and errors with compound representation in KEGG and MetaCyc databases by compound coloring identifiers, demonstrating the usefulness of this methodology for database curation.
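The idea of deriving atom identifiers from each atom's neighborhood can be illustrated with a minimal sketch. The code below uses generic iterative color refinement (in the style of the Weisfeiler-Lehman procedure), which is an assumption made for illustration and not the neighborhood-specific method the abstract describes. The function names `atom_colors` and `compound_color` are hypothetical, and the sketch ignores bond orders, hydrogens and stereochemistry.

```python
import hashlib
from collections import Counter


def atom_colors(elements, bonds):
    """Assign each atom a color derived from its own color and its neighbors' colors.

    elements: list of element symbols, one per atom.
    bonds: list of (i, j) atom-index pairs, treated as undirected.
    Generic color refinement, used here only as an illustrative stand-in.
    """
    neighbors = {i: [] for i in range(len(elements))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    colors = list(elements)  # initial color is the element symbol
    for _ in range(len(elements)):
        new_colors = []
        for i in range(len(elements)):
            # Sorting the neighbor colors makes the result independent of atom order.
            around = ",".join(sorted(colors[n] for n in neighbors[i]))
            signature = f"{colors[i]}({around})"
            new_colors.append(hashlib.sha1(signature.encode()).hexdigest()[:12])
        stable = len(set(new_colors)) == len(set(colors))
        colors = new_colors
        if stable:  # the partition of atoms into color classes stopped refining
            break
    return colors


def compound_color(elements, bonds):
    """Hash the multiset of atom colors into a single compound-level identifier."""
    counts = Counter(atom_colors(elements, bonds))
    text = ";".join(f"{color}x{n}" for color, n in sorted(counts.items()))
    return hashlib.sha1(text.encode()).hexdigest()[:16]


# Heavy-atom skeletons only (illustrative input, not database records).
propane = atom_colors(["C", "C", "C"], [(0, 1), (1, 2)])
ethanol_a = compound_color(["C", "C", "O"], [(0, 1), (1, 2)])
ethanol_b = compound_color(["O", "C", "C"], [(0, 1), (1, 2)])
```

In this sketch the two terminal carbons of propane receive the same color because they are symmetric, and the two differently ordered ethanol skeletons hash to the same compound identifier. Plain color refinement can also assign the same color to atoms that are not truly symmetric, so a real harmonization pipeline would need additional checks.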


Thermodynamic Genome-Scale Metabolic Modeling of Metallodrug Resistance in Colorectal Cancer

Cancers ◽ 10.3390/cancers13164130 ◽ 2021 ◽ Vol 13 (16) ◽ pp. 4130

Author(s): Helena A. Herrmann ◽ Mate Rusz ◽ Dina Baier ◽ Michael A. Jakupec ◽ Bernhard K. Keppler ◽ ...

Keyword(s): Metabolic Flux ◽ Carcinoma Cell ◽ Metabolic Model ◽ Cellular Reprogramming ◽ Metabolic Reprogramming ◽ Metabolomics Data ◽ Extracellular Metabolite ◽ Metabolite Concentrations ◽ Genome Scale ◽ Resistant Cells

Background: Mass spectrometry-based metabolomics approaches provide an immense opportunity to enhance our understanding of the mechanisms that underpin the cellular reprogramming of cancers. Accurate comparative metabolic profiling of heterogeneous conditions, however, is still a challenge. Methods: Measuring both intracellular and extracellular metabolite concentrations, we constrain four instances of a thermodynamic genome-scale metabolic model of the HCT116 colorectal carcinoma cell line to compare the metabolic flux profiles of cells that are either sensitive or resistant to ruthenium- or platinum-based treatments with BOLD-100/KP1339 and oxaliplatin, respectively. Results: Normalizing according to growth rate and normalizing resistant cells according to their respective sensitive controls, we are able to dissect metabolic responses specific to the drug and to the resistance states. We find the normalization steps to be crucial in the interpretation of the metabolomics data and show that the metabolic reprogramming in resistant cells is limited to a select number of pathways. Conclusions: Here, we elucidate the key importance of normalization steps in the interpretation of metabolomics data, allowing us to uncover drug-specific metabolic reprogramming during acquired metal-drug resistance.


Aristotle: stratified causal discovery for omics data

BMC Bioinformatics ◽ 10.1186/s12859-021-04521-w ◽ 2022 ◽ Vol 23 (1)

Author(s): Mehrdad Mansouri ◽ Sahand Khakabimamaghani ◽ Leonid Chindelevitch ◽ Martin Ester

Keyword(s): Synthetic Data ◽ Causal Analysis ◽ Causal Discovery ◽ Simultaneous Increase ◽ Biological Knowledge ◽ Omics Data ◽ Metabolomics Data ◽ Widespread Application ◽ Anthracycline Cardiotoxicity ◽ Stratification Method

Background: There has been a simultaneous increase in demand and accessibility across genomics, transcriptomics, proteomics and metabolomics data, known as omics data. This has encouraged widespread application of omics data in life sciences, from personalized medicine to the discovery of underlying pathophysiology of diseases. Causal analysis of omics data may provide important insight into the underlying biological mechanisms. Existing causal analysis methods yield promising results when identifying potential general causes of an observed outcome based on omics data. However, they may fail to discover the causes specific to a particular stratum of individuals and missing from others. Methods: To fill this gap, we introduce the problem of stratified causal discovery and propose a method, Aristotle, for solving it. Aristotle addresses the two challenges intrinsic to omics data: high dimensionality and hidden stratification. It employs existing biological knowledge and a state-of-the-art patient stratification method to tackle the above challenges and applies a quasi-experimental design method to each stratum to find stratum-specific potential causes. Results: Evaluation based on synthetic data shows better performance for Aristotle in discovering true causes under different conditions compared to existing causal discovery methods. Experiments on a real dataset on Anthracycline Cardiotoxicity indicate that Aristotle’s predictions are consistent with the existing literature. Moreover, Aristotle makes additional predictions that suggest further investigations.


GAIT-GM: Galaxy tools for modeling metabolite changes as a function of gene expression

10.1101/2020.12.25.424407 ◽ 2020

Author(s): Lauren M. McIntyre ◽ Francisco Huertas ◽ Olexander Moskalenko ◽ Marta Llansola ◽ Vicente Felipo ◽ ...

Keyword(s): Gene Expression ◽ Data Analysis ◽ Text Mining ◽ Kegg Pathway ◽ Annotation Tool ◽ Omics Data ◽ Metabolomics Data ◽ Pathway Data ◽ User Friendly ◽ Omics Data Analysis

Galaxy is a user-friendly platform with a strong development community and a rich set of tools for omics data analysis. While multi-omics experiments are becoming popular, tools for multi-omics data analysis are poorly represented in this platform. Here we present GAIT-GM, a set of new Galaxy tools for integrative analysis of gene expression and metabolomics data. In the Annotation Tool, features are mapped to KEGG pathways using a text mining approach to increase the number of mapped metabolites. Several interconnected databases are used to maximally map gene IDs across species. In the Integration Tool, changes in metabolite levels are modelled as a function of gene expression in a flexible manner. Both unbiased exploration of relationships between genes and metabolites and biologically informed models based on pathway data are enabled. The GAIT-GM tools are freely available at https://github.com/SECIMTools/gait-gm.
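Modelling a metabolite level as a function of gene expression can be pictured, in its simplest form, as a one-predictor least-squares fit. The sketch below is only that simplest case: the abstract does not specify the models GAIT-GM uses, so the function `fit_line` and the sample values are hypothetical and chosen purely for illustration.

```python
def fit_line(expression, metabolite):
    """Ordinary least squares for metabolite = slope * expression + intercept.

    expression, metabolite: equal-length lists of numbers, one value per sample.
    A stand-in illustration, not the model implemented by GAIT-GM.
    """
    n = len(expression)
    mean_x = sum(expression) / n
    mean_y = sum(metabolite) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in expression)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(expression, metabolite))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up values for one gene and one metabolite across four samples.
gene = [1.0, 2.0, 3.0, 4.0]
level = [3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(gene, level)
```

A pathway-informed analysis would restrict such fits to gene and metabolite pairs that share a pathway, while an unbiased exploration would fit every pair and then correct for multiple testing.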


Using MetaboAnalyst 4.0 for Metabolomics Data Analysis, Interpretation, and Integration with Other Omics Data

Computational Methods and Data Analysis for Metabolomics - Methods in Molecular Biology ◽ 10.1007/978-1-0716-0239-3_17 ◽ 2020 ◽ pp. 337-360 ◽ Cited By ~ 7

Author(s): Jasmine Chong ◽ Jianguo Xia

Keyword(s): Data Analysis ◽ Omics Data ◽ Metabolomics Data


Taxonomic weighting improves the accuracy of a gap-filling algorithm for metabolic models

Bioinformatics ◽ 10.1093/bioinformatics/btz813 ◽ 2019 ◽ Vol 36 (6) ◽ pp. 1823-1830

Author(s): Wai Kit Ong ◽ Peter E Midford ◽ Peter D Karp

Keyword(s): Metabolic Networks ◽ Metabolic Model ◽ Error Rates ◽ Supplementary Information ◽ Gap Filling ◽ Frequency Method ◽ Metabolic Models ◽ Genome Annotations ◽ Taxonomic Groups ◽ The Cost

Motivation: The increasing availability of annotated genome sequences enables construction of genome-scale metabolic networks, which are useful tools for studying organisms of interest. However, due to incomplete genome annotations, draft metabolic models contain gaps that must be filled in a time-consuming process before they are usable. Optimization-based algorithms that fill these gaps have been developed; however, gap-filling algorithms show significant error rates and often introduce incorrect reactions. Results: Here, we present a new gap-filling method that computes the costs of candidate gap-filling reactions from a universal reaction database (MetaCyc) based on taxonomic information. When gap-filling a metabolic model for an organism M (such as Escherichia coli), the cost for reaction R is based on the frequency with which R occurs in other organisms within the phylum of M (in this case, Proteobacteria). The assumption behind this method is that different taxonomic groups are biased toward using different metabolic reactions. Evaluation of the new gap-filler on randomly degraded variants of the EcoCyc metabolic model for E. coli showed an increase in the average F1-score to 99.0 (when using the variable weights by frequency method at the phylum level), compared to 91.0 using the previous MetaFlux gap-filler and 80.3 using a basic gap-filler. Evaluation on two other microbial metabolic models showed similar improvements. Availability and implementation: The Pathway Tools software (including MetaFlux) is free for academic use and is available at http://pathwaytools.com. Additional code for reproducing the results presented here is available at www.ai.sri.com/pkarp/pubs/taxgap/supplementary.zip. Supplementary information: Supplementary data are available at Bioinformatics online.
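The frequency-based costing and the F1 evaluation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the linear mapping from frequency to cost, the `min_cost` and `max_cost` bounds, the function names and the toy reaction sets are all hypothetical, since the abstract does not give the actual cost formula used by MetaFlux.

```python
def reaction_frequencies(organism_reactions):
    """Fraction of organisms (all assumed to share one phylum) that contain each reaction.

    organism_reactions: dict mapping organism name to a set of reaction IDs.
    """
    total = len(organism_reactions)
    counts = {}
    for reactions in organism_reactions.values():
        for reaction in reactions:
            counts[reaction] = counts.get(reaction, 0) + 1
    return {reaction: count / total for reaction, count in counts.items()}


def reaction_cost(reaction, frequencies, min_cost=1.0, max_cost=10.0):
    """Hypothetical linear rule: reactions common in the phylum are cheap to add."""
    frequency = frequencies.get(reaction, 0.0)  # unseen reactions get max_cost
    return max_cost - (max_cost - min_cost) * frequency


def f1_score(predicted, truth):
    """Harmonic mean of precision and recall for two sets of reaction IDs (0 to 1)."""
    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


# Toy phylum with four organisms and placeholder reaction IDs.
phylum = {
    "org1": {"R1", "R2"},
    "org2": {"R1"},
    "org3": {"R1", "R3"},
    "org4": {"R1", "R2"},
}
freqs = reaction_frequencies(phylum)
costs = {r: reaction_cost(r, freqs) for r in ["R1", "R2", "R3", "R4"]}
score = f1_score({"R1", "R2"}, {"R1", "R3"})
```

An optimization-based gap-filler would then choose the set of candidate reactions that restores model function at the lowest total cost. Note that `f1_score` here returns a value between 0 and 1, whereas the abstract reports F1-scores on a 0 to 100 scale.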


Miso: an R package for multiple isotope labeling assisted metabolomics data analysis

Bioinformatics ◽ 10.1093/bioinformatics/btz092 ◽ 2019 ◽ Vol 35 (18) ◽ pp. 3524-3526 ◽ Cited By ~ 3

Author(s): Yonghui Dong ◽ Liron Feldberg ◽ Asaph Aharoni

Keyword(s): Data Analysis ◽ Isotope Labeling ◽ R Package ◽ Mass Spectrometry Data ◽ Data Matrix ◽ Supplementary Information ◽ Metabolomics Data ◽ Biological Studies ◽ Analysis Workflow ◽ Efficient Data

Motivation: The use of stable isotope labeling is highly advantageous for structure elucidation in metabolomics studies. However, computational tools dealing with multiple-precursor-based labeling studies are still missing. Hence, we developed Miso, an R package providing an automated and efficient data analysis workflow to detect the complete repertoire of labeled molecules from multiple-precursor-based labeling experiments. Results: The capability of Miso is demonstrated by the analysis of liquid chromatography-mass spectrometry data obtained from duckweed plants fed with one unlabeled and two differently labeled tyrosine precursors (unlabeled tyrosine, tyrosine-²H₄ and tyrosine-¹³C₉¹⁵N₁). The resulting data matrix generated by Miso contains sets of unlabeled and labeled ions with their retention time, m/z values and number of labeled atoms that can be directly utilized for database query and biological studies. Availability and implementation: Miso is publicly available on the CRAN repository (https://cran.r-project.org/web/packages/Miso). A reproducible case study and a detailed tutorial are available from GitHub (https://github.com/YonghuiDong/Miso_example). Supplementary information: Supplementary data are available at Bioinformatics online.
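The principle behind pairing unlabeled and labeled ions is that a labeled ion appears at nearly the same retention time with an m/z shifted by the mass of the incorporated isotopes. The sketch below illustrates that principle only. It is written in Python rather than R, it does not reproduce Miso's algorithm or API, it assumes singly charged ions that retain the full label, and the function names, tolerances and ion list are hypothetical.

```python
# Mass gained per substituted atom, in Da (heavy isotope minus light isotope).
MASS_SHIFT = {"2H": 1.0062767, "13C": 1.0033548, "15N": 0.9970349}


def label_shift(label):
    """Total m/z shift for a singly charged ion, e.g. {"13C": 9, "15N": 1}."""
    return sum(MASS_SHIFT[isotope] * count for isotope, count in label.items())


def find_labeled_pairs(ions, label, mz_tol=0.005, rt_tol=0.1):
    """Pair ions whose m/z differs by the label shift at a matching retention time.

    ions: list of (mz, retention_time_in_minutes) tuples.
    Returns a list of (unlabeled_ion, labeled_ion) tuples.
    """
    shift = label_shift(label)
    pairs = []
    for light in ions:
        for heavy in ions:
            mz_match = abs((heavy[0] - light[0]) - shift) <= mz_tol
            rt_match = abs(heavy[1] - light[1]) <= rt_tol
            if mz_match and rt_match:
                pairs.append((light, heavy))
    return pairs


# Illustrative peak list: protonated tyrosine, its two labeled forms, one unrelated ion.
ions = [(182.0812, 5.20), (186.1063, 5.19), (192.1084, 5.20), (250.0000, 7.00)]
d4_pairs = find_labeled_pairs(ions, {"2H": 4})
c9n1_pairs = find_labeled_pairs(ions, {"13C": 9, "15N": 1})
```

Downstream metabolites often retain only part of a precursor's label, so a fuller treatment would scan over every possible number of labeled atoms instead of testing the full label alone.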


Thermodynamic genome-scale metabolic modeling of metallodrug resistance in colorectal cancer

10.1101/2021.06.09.447534 ◽ 2021

Author(s): Helena A Herrmann ◽ Mate Rusz ◽ Dina Baier ◽ Michael A. Jakupec ◽ Bernhard K. Keppler ◽ ...

Keyword(s): Metabolic Flux ◽ Carcinoma Cell ◽ Metabolic Model ◽ Cellular Reprogramming ◽ Metabolic Reprogramming ◽ Metabolomics Data ◽ Extracellular Metabolite ◽ Metabolite Concentrations ◽ Genome Scale ◽ Resistant Cells

Background: Mass spectrometry-based metabolomics approaches provide an immense opportunity to enhance our understanding of the mechanisms that underpin the cellular reprogramming of cancers. Accurate comparative metabolic profiling of heterogeneous conditions, however, is still a challenge. Methods: Measuring both intracellular and extracellular metabolite concentrations, we constrain four instances of a thermodynamic genome-scale metabolic model of the HCT116 colorectal carcinoma cell line to compare the metabolic flux profiles of cells that are either sensitive or resistant to ruthenium- or platinum-based treatments with BOLD-100/KP1339 and oxaliplatin, respectively. Results: Normalizing according to growth rate and normalizing resistant cells according to their respective sensitive controls, we are able to dissect metabolic responses specific to the drug and to the resistance states. We find the normalization steps to be crucial in the interpretation of the metabolomics data and show that the metabolic reprogramming in resistant cells is limited to a select number of pathways. Conclusions: Here we elucidate the key importance of normalization steps in the interpretation of metabolomics data, allowing us to uncover drug-specific metabolic reprogramming during acquired metal-drug resistance.


struct: an R/Bioconductor-based framework for standardized metabolomics data analysis and beyond

Bioinformatics ◽ 10.1093/bioinformatics/btaa1031 ◽ 2020

Author(s): Gavin Rhys Lloyd ◽ Andris Jankevics ◽ Ralf J M Weber

Keyword(s): Data Analysis ◽ Source Code ◽ Omics Data ◽ Bioconductor Package ◽ Metabolomics Data ◽ Diverse Range ◽ Omics Technologies ◽ Significant Challenge ◽ Combining Methods ◽ Data Analysis Methods

Summary: Implementing and combining methods from a diverse range of R/Bioconductor packages into ‘omics’ data analysis workflows represents a significant challenge in terms of standardization, readability and reproducibility. Here, we present an R/Bioconductor package, named struct (Statistics in R using Class-based Templates), which defines a suite of class-based templates that allows users to develop and implement highly standardized and readable statistical analysis workflows. Struct integrates with the STATistics Ontology to ensure consistent reporting and maximizes semantic interoperability. We also present a toolbox, named structToolbox, which includes an extensive set of commonly used data analysis methods that have been implemented using struct. This toolbox can be used to build data-analysis workflows for metabolomics and other omics technologies. Availability and implementation: struct and structToolbox are implemented in R, and are freely available from Bioconductor (http://bioconductor.org/packages/struct and http://bioconductor.org/packages/structToolbox), including documentation and vignettes. Source code is available and maintained at https://github.com/computational-metabolomics.


Atom Identifiers Generated by a Graph Coloring Method Enable Compound Harmonization Across Metabolic Databases

10.1101/2020.06.19.161877 ◽ 2020

Author(s): Huan Jin ◽ Joshua M. Mitchell ◽ Hunter N.B. Moseley

Keyword(s): Metabolic Network ◽ Graph Coloring ◽ Metabolic Networks ◽ Metabolic Flux ◽ Enzyme Commission ◽ Metabolic Model ◽ Metabolic Reprogramming ◽ Automatic Identification ◽ Flux Analysis ◽ Database Curation

Metabolic flux analysis requires both a reliable metabolic model and metabolic profiles in characterizing metabolic reprogramming. Advances in analytic methodologies enable production of high-quality metabolomics datasets capturing isotopic flux. However, useful metabolic models can be difficult to derive due to the lack of relatively complete atom-resolved metabolic networks for a variety of organisms, including human. Here, we developed a graph coloring method that creates unique identifiers for each atom in a compound facilitating construction of an atom-resolved metabolic network. What is more, this method is guaranteed to generate the same identifier for symmetric atoms, enabling automatic identification of possible additional mappings caused by molecular symmetry. Furthermore, a compound coloring identifier derived from the corresponding atom coloring identifiers can be used for compound harmonization across various metabolic network databases, which is an essential first step in network integration. With the compound coloring identifiers, 8865 correspondences between KEGG and MetaCyc compounds are detected, with 5451 of them confirmed by other identifiers provided by the two databases. In addition, we found that the Enzyme Commission numbers (EC) of reactions can be used to validate possible correspondence pairs, with 1848 unconfirmed pairs validated by commonality in reaction ECs. Moreover, we were able to detect various issues and errors with compound representation in KEGG and MetaCyc databases by compound coloring identifiers, demonstrating the usefulness of this methodology for database curation.
