ErrorTracer: an algorithm for identifying the origins of inconsistencies in genome-scale metabolic models

Scale Models ◽

Model Size ◽

Metabolic Models ◽

Model Publication ◽

Genome Scale ◽

Model Visualization ◽

Community Standard

Abstract. Motivation: The number and complexity of genome-scale metabolic models are steadily increasing, empowered by automated model-generation algorithms. Quality control of the models, however, has always remained a significant challenge, the most fundamental problem being reactions incapable of carrying flux. Numerous automated gap-filling algorithms try to address this problem, but can rarely resolve all of a model’s inconsistencies. The need for fast inconsistency-checking algorithms has also been emphasized by the recent community push for automated model validation before model publication. Previously, we developed graphical software that allows the modeller to solve the remaining errors manually. Nevertheless, model size and complexity remained a hindrance to efficiently tracking the origins of inconsistency. Results: We developed the ErrorTracer algorithm to address the shortcomings of existing approaches: ErrorTracer searches for inconsistencies, classifies them and identifies their origins. The algorithm is ∼2 orders of magnitude faster than current community standard methods, using only seconds even for large-scale models. This allows for interactive exploration in direct combination with model visualization, markedly simplifying the whole error-identification and correction workflow. Availability and implementation: Windows and Linux executables and source code are available under the EPL 2.0 Licence at https://github.com/TheAngryFox/ModelExplorer and https://www.ntnu.edu/almaaslab/downloads. Supplementary information: Supplementary data are available at Bioinformatics online.
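The "reactions incapable of carrying flux" mentioned in this abstract can be illustrated with a minimal sketch. This is not the ErrorTracer algorithm, whose details the abstract does not give; it is a simple dead-end propagation heuristic that finds only a subset of blocked reactions. The toy network, the reaction names and the function `find_blocked` are hypothetical and chosen purely for illustration.

```python
def find_blocked(reactions, reversible=frozenset()):
    """Return reactions that must carry zero flux at steady state.

    reactions:  dict mapping reaction id -> {metabolite: coefficient},
                negative coefficients for substrates, positive for products.
    reversible: set of reaction ids that may run in either direction.

    Heuristic only: a metabolite that cannot be both produced and consumed,
    or that is touched by a single reaction, forces every reaction touching
    it to zero flux. Removing those reactions can expose new dead ends, so
    the check is repeated until nothing changes.
    """
    active = dict(reactions)
    blocked = set()
    changed = True
    while changed:
        changed = False
        # Index the still-active reactions by the metabolites they touch.
        touching = {}
        for rxn, stoich in active.items():
            for met, coef in stoich.items():
                if coef != 0:
                    touching.setdefault(met, []).append((rxn, coef))
        for met, entries in touching.items():
            can_produce = any(c > 0 or r in reversible for r, c in entries)
            can_consume = any(c < 0 or r in reversible for r, c in entries)
            if len(entries) == 1 or not (can_produce and can_consume):
                for rxn, _ in entries:
                    if rxn in active:
                        del active[rxn]
                        blocked.add(rxn)
                        changed = True
    return blocked


# Hypothetical toy network: A is imported, converted to C and exported.
# The side branch B -> D -> E ends in a metabolite that nothing consumes.
toy_network = {
    "EX_A": {"A": 1},
    "R1": {"A": -1, "B": 1},
    "R2": {"B": -1, "C": 1},
    "EX_C": {"C": -1},
    "R3": {"B": -1, "D": 1},
    "R4": {"D": -1, "E": 1},
}

print(sorted(find_blocked(toy_network)))  # prints ['R3', 'R4']
```

R4 is flagged first because E has no consumer; R3 is flagged only in the next pass, once D has lost its sole consumer. A complete consistency check would normally use linear programming, which this sketch deliberately avoids.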

Deciphering the metabolic capabilities of Bifidobacteria using genome-scale metabolic models

Scientific Reports ◽

10.1038/s41598-019-54696-9 ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 4

Author(s):

N. T. Devika ◽

Karthik Raman

Keyword(s):

Short Chain Fatty Acids ◽

Acetate Production ◽

Scale Models ◽

Powerful Approach ◽

Metabolic Models ◽

Commercial Applications ◽

Different Strains ◽

Genome Scale ◽

Modelling Approach

Abstract. Bifidobacteria, the initial colonisers of breastfed infant guts, are considered key commensals that promote a healthy gastrointestinal tract. However, little is known about the key metabolic differences between different strains of these bifidobacteria, and consequently, their suitability for their varied commercial applications. In this context, the present study applies a constraint-based modelling approach to differentiate between 36 important bifidobacterial strains, enhancing their genome-scale metabolic models obtained from the AGORA (Assembly of Gut Organisms through Reconstruction and Analysis) resource. By studying various growth and metabolic capabilities in these enhanced genome-scale models across 30 different nutrient environments, we classified the bifidobacteria into three specific groups. We also studied the ability of the different strains to produce short-chain fatty acids, finding that acetate production is niche- and strain-specific, unlike lactate. Further, we captured the role of critical enzymes from the bifid shunt pathway, which was found to be essential for a subset of bifidobacterial strains. Our findings underline the significance of analysing metabolic capabilities as a powerful approach to explore distinct properties of the gut microbiome. Overall, our study presents several insights into the nutritional lifestyles of bifidobacteria and could potentially be leveraged to design species/strain-specific probiotics or prebiotics.

BiGG Models 2020: multi-strain genome-scale models and expansion across the phylogenetic tree

Nucleic Acids Research ◽

10.1093/nar/gkz1054 ◽

2019 ◽

Cited By ~ 7

Author(s):

Charles J Norsigian ◽

Neha Pusarla ◽

John Luke McConn ◽

James T Yurkovich ◽

Andreas Dräger ◽

...

Keyword(s):

Phylogenetic Tree ◽

Knowledge Base ◽

The Past ◽

Scale Models ◽

Metabolic Models ◽

New Community ◽

Genome Annotations ◽

Genome Scale ◽

High Quality Genome ◽

Strain Genome

Abstract. The BiGG Models knowledge base (http://bigg.ucsd.edu) is a centralized repository for high-quality genome-scale metabolic models. For the past 12 years, the website has allowed users to browse and search metabolic models. Within this update, we detail new content and features in the repository, continuing the original effort to connect each model to genome annotations and external databases and to standardize reactions and metabolites. We describe the addition of 31 new models that expand the portion of the phylogenetic tree covered by BiGG Models. We also describe new functionality for hosting multi-strain models, which have proven to be insightful in a variety of studies centered on comparisons of related strains. Finally, the models in the knowledge base have been benchmarked using Memote, a new community-developed validator for genome-scale models, to demonstrate the improving quality and transparency of model content in BiGG Models.

FALCONET: an R package to accelerate automatic visualisation of genome scale metabolic models

10.1101/662056 ◽

2019 ◽

Author(s):

Hongzhong Lu ◽

Zhengming Zhu ◽

Eduard J Kerkhoven ◽

Jens Nielsen

Keyword(s):

Metabolic Networks ◽

R Package ◽

Network Size ◽

Research Community ◽

Large Network ◽

Strain Design ◽

Scale Models ◽

Genome Scale ◽

Integrative Omics

Abstract. Summary: FALCONET (FAst visuaLisation of COmputational NETworks) enables the automatic formation and visualisation of metabolic maps from genome-scale models with R and CellDesigner, readily facilitating the visualisation of multi-layer omics datasets in the context of metabolic networks. Motivation: Until now, numerous GEMs have been reconstructed and used as scaffolds to conduct integrative omics analysis and in silico strain design. Due to the large network size of GEMs, it is challenging to produce and visualize these networks as metabolic maps for further in-depth analyses. Results: Here, we present the R package FALCONET, which facilitates drawing and visualizing metabolic maps in an automatic manner. This package will benefit the research community by allowing a wider use of GEMs in systems biology. Availability and implementation: FALCONET is available on https://github.com/SysBioChalmers/FALCONET and released under the MIT licence. Supplementary information: Supplementary data are available online.

Fast automated reconstruction of genome-scale metabolic models for microbial species and communities

10.1101/223198 ◽

2018 ◽

Cited By ~ 2

Author(s):

Daniel Machado ◽

Sergej Andrejev ◽

Melanie Tramontano ◽

Kiran Raosaheb Patil

Keyword(s):

Microbial Communities ◽

Single Species ◽

Model Organisms ◽

Universal Model ◽

Microbial Species ◽

Scale Models ◽

Metabolic Models ◽

User Friendly ◽

Genome Scale ◽

Automated Tool

Abstract. Genome-scale metabolic models are instrumental in uncovering operating principles of cellular metabolism and model-guided re-engineering. Recent applications of metabolic models have also demonstrated their usefulness in unraveling cross-feeding within microbial communities. Yet, the application of genome-scale models, especially to microbial communities, is lagging far behind the availability of sequenced genomes. This is largely due to the time-consuming steps of manual curation required to obtain good-quality models and thus physiologically meaningful simulation results. Here, we present an automated tool, CarveMe, for reconstruction of species and community level metabolic models. We introduce the concept of a universal model, which is manually curated and simulation-ready. Starting with this universal model and annotated genome sequences, CarveMe uses a top-down approach to build single-species and community models in a fast and scalable manner. We build reconstructions for two model organisms, Escherichia coli and Bacillus subtilis, as well as a collection of human gut bacteria, and show that CarveMe models perform similarly to manually curated models in reproducing experimental phenotypes. Finally, we demonstrate the scalability of CarveMe through reconstructing 5587 bacterial models. Overall, CarveMe provides an open-source and user-friendly tool towards broadening the use of metabolic modeling in studying microbial species and communities.

Efficient enzyme coupling algorithms identify functional pathways in genome-scale metabolic models

10.1101/608430 ◽

2019 ◽

Author(s):

Dikshant Pradhan ◽

Jason A. Papin ◽

Paul A. Jensen

Keyword(s):

Metabolic Engineering ◽

Mathematical Framework ◽

Continuous Variables ◽

Convex Constraints ◽

Coupling Analysis ◽

Flux Coupling ◽

Scale Models ◽

Metabolic Models ◽

Genome Scale ◽

Flux Coupling Analysis

Abstract. Flux coupling identifies sets of reactions whose fluxes are “coupled” or correlated in genome-scale models. By identifying sets of coupled reactions, modelers can (1) reduce the dimensionality of genome-scale models, (2) identify reactions that must be modulated together during metabolic engineering, and (3) identify sets of important enzymes using high-throughput data. We present three computational tools to improve the efficiency, applicability, and biological interpretability of flux coupling analysis. The first algorithm (cachedFCF) uses information from intermediate solutions to decrease the runtime of standard flux coupling methods by 10-100 fold. Importantly, cachedFCF makes no assumptions regarding the structure of the underlying model, allowing efficient flux coupling analysis of models with non-convex constraints. We next developed a mathematical framework (FALCON) that incorporates enzyme activity as continuous variables in genome-scale models. Using data from gene expression and fitness assays, we verified that enzyme sets calculated directly from FALCON models are more functionally coherent than sets of enzymes collected from coupled reaction sets. Finally, we present a method (delete-and-couple) for expanding enzyme sets to allow redundancies and branches in the associated metabolic pathways. The expanded enzyme sets align with known biological pathways and retain functional coherence. The expanded enzyme sets allow pathway-level analyses of genome-scale metabolic models. Together, our algorithms extend flux coupling techniques to enzymatic networks and models with transcriptional regulation and other non-convex constraints. By expanding the efficiency and flexibility of flux coupling, we believe this popular technique will find new applications in metabolic engineering, microbial pathogenesis, and other fields that leverage network modeling.
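To make the idea of coupled reactions concrete, here is a minimal sketch of one structural special case. It is not cachedFCF, FALCON or delete-and-couple, none of which the abstract specifies in detail. It relies only on the observation that if a metabolite is touched by exactly two reactions, the steady-state balance on that metabolite fixes the ratio of their fluxes, so the two reactions are fully coupled. The function name `fully_coupled_groups` and the toy network are hypothetical.

```python
def fully_coupled_groups(reactions):
    """Group reactions whose flux ratios are fixed by two-reaction metabolites.

    reactions: dict mapping reaction id -> {metabolite: coefficient}.
    Returns a list of sets, each holding two or more mutually coupled
    reactions. This finds only a subset of all coupled pairs; general flux
    coupling analysis needs optimisation over the whole flux space.
    """
    # Index reactions by the metabolites they touch.
    touching = {}
    for rxn, stoich in reactions.items():
        for met, coef in stoich.items():
            if coef != 0:
                touching.setdefault(met, []).append(rxn)

    # Union-find over reactions, merging pairs that share such a metabolite.
    parent = {rxn: rxn for rxn in reactions}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for rxns in touching.values():
        if len(rxns) == 2:
            a, b = find(rxns[0]), find(rxns[1])
            if a != b:
                parent[b] = a

    groups = {}
    for rxn in reactions:
        groups.setdefault(find(rxn), set()).add(rxn)
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical toy network: A is imported and converted to B, which
# branches into two exported products C and D.
branch_network = {
    "EX_A": {"A": 1},
    "R1": {"A": -1, "B": 1},
    "R2": {"B": -1, "C": 1},
    "R3": {"B": -1, "D": 1},
    "EX_C": {"C": -1},
    "EX_D": {"D": -1},
}

for group in sorted(sorted(g) for g in fully_coupled_groups(branch_network)):
    print(group)
```

In this toy network the uptake is tied to R1 and each branch is tied to its own export, while the two branches stay independent of each other because B is shared by three reactions.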

Integrating –omics data into genome-scale metabolic network models: principles and challenges

Essays in Biochemistry ◽

10.1042/ebc20180011 ◽

2018 ◽

Vol 62 (4) ◽

pp. 563-574 ◽

Cited By ~ 10

Author(s):

Charlotte Ramon ◽

Mattia G. Gollub ◽

Jörg Stelling

Keyword(s):

Data Integration ◽

Large Scale ◽

Network Models ◽

Omics Data ◽

Scale Models ◽

Common Framework ◽

Genome Scale ◽

Constraint Based Models ◽

Omics Data Integration

At genome scale, it is not yet possible to devise detailed kinetic models for metabolism because data on the in vivo biochemistry are too sparse. Predictive large-scale models for metabolism most commonly use the constraint-based framework, in which network structures constrain possible metabolic phenotypes at steady state. However, these models commonly leave many possibilities open, making them less predictive than desired. With increasingly available –omics data, it is appealing to increase the predictive power of constraint-based models (CBMs) through data integration. Many corresponding methods have been developed, but data integration is still a challenge and existing methods perform less well than expected. Here, we review the main approaches for the integration of different types of –omics data into CBMs, focussing on the methods’ assumptions and limitations. We argue that key assumptions, often derived from single-enzyme kinetics, do not generally apply in the context of networks, thereby explaining current limitations. Emerging methods bridging CBMs and biochemical kinetics may allow for –omics data integration in a common framework to provide more accurate predictions.
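For orientation, the constraint-based framework referred to in this abstract is commonly written as a linear program over steady-state fluxes. The formulation below is the standard flux balance analysis form rather than anything stated in the abstract itself, and the symbols $S$, $v$, $l$, $u$ and $c$ are notation assumed here.

```latex
\begin{aligned}
\max_{v}\quad & c^{\top} v \\
\text{subject to}\quad & S v = 0, \\
& l \le v \le u
\end{aligned}
```

Here $S$ is the stoichiometric matrix (metabolites by reactions), $v$ the vector of reaction fluxes, $l$ and $u$ the lower and upper flux bounds, and $c$ the weights of the objective, for example a biomass reaction. The constraints usually leave a large set of feasible $v$, which is the sense in which such models "leave many possibilities open"; data integration methods typically act by tightening $l$ and $u$ or by changing the objective.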

ComMet: A method for comparing metabolic states in genome-scale metabolic models

10.1101/2020.09.14.296145 ◽

2020 ◽

Author(s):

Chaitra Sarathy ◽

Marian Breuer ◽

Martina Kutmon ◽

Michiel E. Adriaens ◽

Chris T. Evelo ◽

...

Keyword(s):

Large Scale ◽

Metabolic Flux ◽

Knowledge Bases ◽

Human Models ◽

Comprehensive Knowledge ◽

Metabolic States ◽

Network Modules ◽

Metabolic Models ◽

Branched Chain

As comprehensive knowledge bases of cellular metabolism, genome-scale metabolic models (GEMs) serve as mathematical tools for studying cellular flux states in various organisms. However, analysis of large-scale GEMs, such as human models, still presents considerable challenges with respect to objective selection and reaction flux constraints. In this study, we introduce a model-based method, ComMet (Comparison of Metabolic states), for comprehensive analysis of large metabolic flux spaces and comparison of various metabolic states. ComMet allows (a) an in-depth characterisation of flux states achievable by GEMs, (b) comparison of flux spaces from several conditions of interest, (c) identification of metabolically distinct network modules and (d) visualisation of network modules as reaction and metabolic maps. As a proof of principle, we employed ComMet to extract the biochemical differences in the human adipocyte network (iAdipocytes1809) arising due to unlimited/blocked uptake of branched-chain amino acids. Our study opens avenues for exploring several metabolic conditions of interest in both microbe and human models. ComMet is open-source and is available at https://github.com/macsbio/commet.

ssbio: a Python framework for structural systems biology

Bioinformatics ◽

10.1093/bioinformatics/bty077 ◽

2018 ◽

Vol 34 (12) ◽

pp. 2155-2157 ◽

Cited By ~ 15

Author(s):

Nathan Mih ◽

Elizabeth Brunk ◽

Ke Chen ◽

Edward Catoiu ◽

Anand Sastry ◽

...

Keyword(s):

Structural Information ◽

Protein Structures ◽

Structural Data ◽

Third Party ◽

Scale Models ◽

Protein Properties ◽

Scale Network ◽

Structural Systems Biology

Abstract. Summary: Working with protein structures at the genome scale has been challenging in a variety of ways. Here, we present ssbio, a Python package that provides a framework to easily work with structural information in the context of genome-scale network reconstructions, which can contain thousands of individual proteins. The ssbio package provides an automated pipeline to construct high-quality genome-scale models with protein structures (GEM-PROs), wrappers to popular third-party programs to compute associated protein properties, and methods to visualize and annotate structures directly in Jupyter notebooks, thus lowering the barrier of linking 3D structural data with established systems workflows. Availability and implementation: ssbio is implemented in Python and available to download under the MIT license at http://github.com/SBRG/ssbio. Documentation and Jupyter notebook tutorials are available at http://ssbio.readthedocs.io/en/latest/. Interactive notebooks can be launched using Binder at https://mybinder.org/v2/gh/SBRG/ssbio/master?filepath=Binder.ipynb. Supplementary information: Supplementary data are available at Bioinformatics online.

GEMtractor: Extracting Views into Genome-scale Metabolic Models

10.1101/790725 ◽

2019 ◽

Author(s):

Martin Scharm ◽

Olaf Wolkenhauer ◽

Mahdi Jalili ◽

Ali Salehzadeh-Yazdi

Keyword(s):

Topological Analysis ◽

Web Based ◽

Link Type ◽

Scale Models ◽

Multipartite Graphs ◽

Metabolic Models ◽

Abstract. Summary: Computational metabolic models typically encode graphs of species, reactions, and enzymes. Comparing genome-scale models through topological analysis of multipartite graphs is challenging. However, in many practical cases it is not necessary to compare the full networks. The GEMtractor is a web-based tool to trim models encoded in SBML. It can be used to extract subnetworks, for example focusing on reaction- and enzyme-centric views into the model. Availability and implementation: The GEMtractor is licensed under the terms of GPLv3 and developed at github.com/binfalse/GEMtractor; a public version is available at sbi.uni-rostock.de/…

Gsmodutils: a python based framework for test-driven genome scale metabolic model development

Bioinformatics ◽

10.1093/bioinformatics/btz088 ◽

2019 ◽

Vol 35 (18) ◽

pp. 3397-3403 ◽

Cited By ~ 2

Author(s):

James Gilbert ◽

Nicole Pearcy ◽

Rupert Norman ◽

Thomas Millat ◽

Klaus Winzer ◽

...

Keyword(s):

Open Source ◽

Large Scale ◽

Model Development ◽

Metabolic Model ◽

Validation Data ◽

Engineering Research ◽

Modelling Framework ◽

Wide Range

Abstract. Motivation: Genome-scale metabolic models (GSMMs) are increasingly important for systems biology and metabolic engineering research as they are capable of simulating complex steady-state behaviour. Constraint-based models of this form can include thousands of reactions and metabolites, with many crucial pathways that only become activated in specific simulation settings. However, despite their widespread use, their power and the availability of tools to aid with the construction and analysis of large-scale models, little methodology is suggested for their continued management. For example, when genome annotations are updated or new understanding regarding behaviour is discovered, models often need to be altered to reflect this. This is quickly becoming an issue for industrial systems and synthetic biotechnology applications, which require good-quality reusable models integral to the design, build, test and learn cycle. Results: As part of an ongoing effort to improve genome-scale metabolic analysis, we have developed a test-driven development methodology for the continuous integration of validation data from different sources. Contributing to the open-source technology based around COBRApy, we have developed the gsmodutils modelling framework, placing an emphasis on test-driven design of models through defined test cases. Crucially, different conditions are configurable, allowing users to examine how different designs or curation impact a wide range of system behaviours, minimizing error between model versions. Availability and implementation: The software framework described within this paper is open source and freely available from http://github.com/SBRCNottingham/gsmodutils. Supplementary information: Supplementary data are available at Bioinformatics online.