BOFdat: generating biomass objective function stoichiometric coefficients from experimental data

Abstract:

Genome-scale models (GEMs) rely on a biomass objective function (BOF) to predict phenotype from genotype. Here we present BOFdat, a Python package that offers functions to generate biomass objective function stoichiometric coefficients (BOFsc) from macromolecular cell composition and relative abundances of macromolecules obtained from omic datasets. Growth-associated and non-growth-associated maintenance (GAM and NGAM) costs can also be calculated by BOFdat. BOFdat is freely available on the Python Package Index (pip install BOFdat). The source code and an example usage (Jupyter Notebook and example files) are available on GitHub (https://github.com/jclachance/BOFdat). The documentation and API are available through ReadTheDocs (https://bofdat.readthedocs.io).
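The conversion the abstract describes, from a macromolecular weight fraction and relative abundances to stoichiometric coefficients, can be illustrated with a little arithmetic. The sketch below is not BOFdat's API: the function name, its arguments, and the numbers are hypothetical. It also ignores the water released during polymerization, a simplification made for brevity.

```python
def biomass_coefficients(weight_fraction, molar_fraction, molar_mass):
    """Return coefficient magnitudes in mmol per gram dry weight (mmol/gDW).

    weight_fraction: grams of the macromolecule per gram dry weight
    molar_fraction:  molar fraction of each building block (should sum to 1)
    molar_mass:      molar mass of each building block in g/mol
    """
    # Average mass of one mole of the building-block mixture (g/mol).
    mean_mass = sum(molar_fraction[m] * molar_mass[m] for m in molar_fraction)
    # Total building blocks per gram dry weight, converted from mol to mmol.
    total_mmol = weight_fraction / mean_mass * 1000.0
    # Split the total among the building blocks by molar fraction.
    return {m: molar_fraction[m] * total_mmol for m in molar_fraction}


# Hypothetical macromolecule: 50% of dry weight, made of two building blocks.
fractions = {"A": 0.25, "B": 0.75}
masses = {"A": 100.0, "B": 200.0}
coefficients = biomass_coefficients(0.5, fractions, masses)
print(coefficients)
```

In a biomass reaction these building blocks are consumed, so the magnitudes computed here would enter with a negative sign.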

Medusa: software to build and analyze ensembles of genome-scale metabolic network reconstructions

10.1101/547174 ◽

2019 ◽

Cited By ~ 1

Author(s):

Gregory L. Medlock ◽

Jason A. Papin

Keyword(s):

Machine Learning ◽

Experimental Data ◽

Computational Biology ◽

Metabolic Network ◽

Metabolic Networks ◽

Ensemble Simulations ◽

Link Type ◽

Genome Scale ◽

Python Package

Abstract:

Uncertainty in the structure and parameters of networks is ubiquitous across computational biology. In constraint-based reconstruction and analysis of metabolic networks, this uncertainty is present both during the reconstruction of networks and in simulations performed with them. Here, we present Medusa, a Python package for the generation and analysis of ensembles of genome-scale metabolic network reconstructions. Medusa builds on the COBRApy package for constraint-based reconstruction and analysis by compressing a set of models into a compact ensemble object, providing functions for the generation of ensembles using experimental data, and extending constraint-based analyses to ensemble scale. We demonstrate how Medusa can be used to generate ensembles and perform ensemble simulations, and how machine learning can be used in conjunction with Medusa to guide the curation of genome-scale metabolic network reconstructions. Medusa is available under the permissive MIT license from the Python Package Index (https://pypi.org/) and from GitHub (https://github.com/gregmedlock/Medusa/), and comprehensive documentation is available at https://medusa.readthedocs.io/en/latest/.
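The idea of compressing a set of models into a compact ensemble can be pictured as storing what all members share once and recording only the differences per member. The sketch below is a toy illustration under that assumption. It does not use Medusa or COBRApy, and the names `compress` and `expand` and the reaction identifiers are hypothetical.

```python
def compress(models):
    """Split models into shared reactions and per-member variable features.

    models: dict mapping a member name to its set of reaction identifiers.
    """
    reaction_sets = list(models.values())
    # Reactions present in every member are stored only once.
    shared = set.intersection(*reaction_sets)
    # Reactions that differ between members become ensemble features.
    variable = set.union(*reaction_sets) - shared
    features = {
        rxn: {name: rxn in rxns for name, rxns in models.items()}
        for rxn in sorted(variable)
    }
    return shared, features


def expand(shared, features, name):
    """Rebuild the reaction set of one member from the compact form."""
    return shared | {rxn for rxn, states in features.items() if states[name]}


# Hypothetical three-member ensemble.
models = {
    "m1": {"R1", "R2", "R3"},
    "m2": {"R1", "R2", "R4"},
    "m3": {"R1", "R2", "R3", "R4"},
}
shared, features = compress(models)
print(sorted(shared), features)
```

Analyses that depend only on the shared part need to run once, while the feature table gives a per-member matrix that a machine learning method could take as input.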

An unconventional uptake rate objective function approach enhances applicability of genome-scale models for mammalian cells

npj Systems Biology and Applications ◽

10.1038/s41540-019-0103-6 ◽

2019 ◽

Vol 5 (1) ◽

Cited By ~ 6

Author(s):

Yiqun Chen ◽

Brian O. McConnell ◽

Venkata Gayatri Dhara ◽

Harnish Mukesh Naik ◽

Chien-Ting Li ◽

...

Keyword(s):

Objective Function ◽

Uptake Rate ◽

Mammalian Cells ◽

Function Approach ◽

Scale Models ◽

Genome Scale

Biofuel production improvement with genome-scale models: The role of cell composition

Biotechnology Journal ◽

10.1002/biot.201000007 ◽

2010 ◽

Vol 5 (7) ◽

pp. 671-685 ◽

Cited By ~ 26

Author(s):

Ryan S. Senger

Keyword(s):

Biofuel Production ◽

Cell Composition ◽

Production Improvement ◽

Scale Models ◽

Genome Scale

GEMtractor: Extracting Views into Genome-scale Metabolic Models

10.1101/790725 ◽

2019 ◽

Author(s):

Martin Scharm ◽

Olaf Wolkenhauer ◽

Mahdi Jalili ◽

Ali Salehzadeh-Yazdi

Keyword(s):

Topological Analysis ◽

Web Based ◽

Link Type ◽

Scale Models ◽

Multipartite Graphs ◽

Metabolic Models ◽

Genome Scale

Abstract:

Summary: Computational metabolic models typically encode graphs of species, reactions, and enzymes. Comparing genome-scale models through topological analysis of multipartite graphs is challenging. However, in many practical cases it is not necessary to compare the full networks. The GEMtractor is a web-based tool to trim models encoded in SBML. It can be used to extract subnetworks, for example focusing on reaction- and enzyme-centric views into the model.

Availability and Implementation: The GEMtractor is licensed under the terms of GPLv3 and developed at github.com/binfalse/GEMtractor; a public version is hosted on sbi.uni-rostock.de.

clinker & clustermap.js: Automatic generation of gene cluster comparison figures

10.1101/2020.11.08.370650 ◽

2020 ◽

Author(s):

Cameron L.M. Gilchrist ◽

Yit-Heng Chooi

Keyword(s):

Gene Cluster ◽

Evolutionary History ◽

Source Code ◽

Gene Clusters ◽

Automatic Generation ◽

Biological Pathways ◽

Link Type ◽

E Mail ◽

Python Package ◽

Publication Quality

Abstract:

Summary: Genes involved in biological pathways are often co-localised in gene clusters, the comparison of which can give valuable insights into their function and evolutionary history. However, comparison and visualisation of gene cluster homology is a tedious process, particularly when many clusters are being compared. Here, we present clinker, a Python-based tool, and clustermap.js, a companion JavaScript visualisation library, which used together can automatically generate accurate, interactive, publication-quality gene cluster comparison figures directly from sequence files.

Availability and Implementation: Source code and documentation for clinker and clustermap.js are available on GitHub (github.com/gamcil/clinker and github.com/gamcil/clustermap.js, respectively) under the MIT license. clinker can be installed directly from the Python Package Index via pip.

ProbAnnoWeb and ProbAnnoPy: probabilistic annotation and gap-filling of metabolic reconstructions

10.1101/151258 ◽

2017 ◽

Author(s):

Brendan King ◽

Terry Farrah ◽

Matthew Richards ◽

Michael Mundy ◽

Evangelos Simeonidis ◽

...

Keyword(s):

Web Service ◽

Gap Filling ◽

Flux Balance ◽

Link Type ◽

Likelihood Score ◽

Genome Scale ◽

Python Package

Abstract:

Summary: Gap-filling is a necessary step to produce quality genome-scale metabolic reconstructions capable of flux-balance simulation. Most available gap-filling tools use an organism-agnostic approach, where reactions are selected from a database to fill gaps without consideration of the target organism. Conversely, our likelihood-based gap-filling with probabilistic annotations selects candidate reactions based on a likelihood score derived specifically from the target organism’s genome. Here, we present two new implementations of probabilistic annotation and likelihood-based gap-filling: a web service called ProbAnnoWeb, and a standalone Python package called ProbAnnoPy.

Availability and Implementation: Our tools are available as a web service with no installation needed (ProbAnnoWeb), available at http://probannoweb.systemsbiology.net, and as a local Python package implementation (ProbAnnoPy), available for download at http://github.com/PriceLab/probannopy.

TEX-FBA: A constraint-based method for integrating gene expression, thermodynamics, and metabolomics data into genome-scale metabolic models

10.1101/536235 ◽

2019 ◽

Cited By ~ 3

Author(s):

Vikash Pandey ◽

Daniel Hernandez Gardiol ◽

Anush Chiappino-Pepe ◽

Vassily Hatzimanikatis

Keyword(s):

Gene Expression ◽

Experimental Data ◽

Solution Space ◽

Metabolic Model ◽

Scale Models ◽

Different Types ◽

Transcriptomics Data ◽

Metabolic Reactions ◽

Metabolic Models ◽

Genome Scale

Abstract:

A large number of genome-scale models of cellular metabolism are available for various organisms. These models include all known metabolic reactions based on the genome annotation. However, the reactions that are active are dependent on the cellular metabolic function or environmental condition. Constraint-based methods that integrate condition-specific transcriptomics data into models have been used extensively to investigate condition-specific metabolism. Here, we present a method (TEX-FBA) for modeling condition-specific metabolism that combines transcriptomics and reaction thermodynamics data to generate a thermodynamically feasible condition-specific metabolic model. TEX-FBA is an extension of thermodynamic-based flux balance analysis (TFA), which allows the simultaneous integration of different stages of experimental data (e.g., absolute gene expression, metabolite concentrations, thermodynamic data, and fluxomics) and the identification of alternative metabolic states that maximize consistency between gene expression levels and condition-specific reaction fluxes. We applied TEX-FBA to a genome-scale metabolic model of Escherichia coli by integrating available condition-specific experimental data and found a marked reduction in the flux solution space. Our analysis revealed a marked correlation between actual gene expression profile and experimental flux measurements compared to the one obtained from a randomly generated gene expression profile. We identified additional essential reactions from the membrane lipid and folate metabolism when we integrated transcriptomics data of the given condition on top of metabolomics and thermodynamics data. These results show TEX-FBA is a promising new approach to study condition-specific metabolism when different types of experimental data are available.

Author summary: Cells utilize nutrients via biochemical reactions that are controlled by enzymes and synthesize required compounds for their survival and growth. Genome-scale models of metabolism representing these complex reaction networks have been reconstructed for a wide variety of organisms ranging from bacteria to human cells. These models comprise all possible biochemical reactions in a cell, but cells choose only a subset of reactions for their immediate needs and functions. Usually, these models allow for a large flux solution space and one can integrate experimental data in order to reduce it and potentially predict the physiology for a specific condition. We developed a method for integrating different types of omics data, such as fluxomics, transcriptomics, and metabolomics, into genome-scale metabolic models that reduces the flux solution space. Using gene expression data, the algorithm maximizes the consistency between the predicted and experimental flux for the reactions and predicts biologically relevant flux ranges for the remaining reactions in the network. This method is useful for determining fluxes of metabolic reactions with reduced uncertainty and suitable for performing context- and condition-specific analysis in metabolic models using different types of experimental data.

HyDe: a Python Package for Genome-Scale Hybridization Detection

10.1101/188037 ◽

2017 ◽

Cited By ~ 2

Author(s):

Paul D. Blischak ◽

Julia Chifman ◽

Andrea D. Wolfe ◽

Laura S. Kubatko

Keyword(s):

Gene Flow ◽

Simulated Data ◽

Data Sets ◽

Gene Trees ◽

Link Type ◽

Phylogenetic Invariants ◽

Genome Scale ◽

The Relationship ◽

Python Package ◽

Hybridization Detection

Abstract:

The analysis of hybridization and gene flow among closely related taxa is a common goal for researchers studying speciation and phylogeography. Many methods for hybridization detection use simple site pattern frequencies from observed genomic data and compare them to null models that predict an absence of gene flow. The theory underlying the detection of hybridization using these site pattern probabilities exploits the relationship between the coalescent process for gene trees within population trees and the process of mutation along the branches of the gene trees. For certain models, site patterns are predicted to occur in equal frequency (i.e., their difference is 0), producing a set of functions called phylogenetic invariants. In this paper we introduce HyDe, a software package for detecting hybridization using phylogenetic invariants arising under the coalescent model with hybridization. HyDe is written in Python, and can be used interactively or through the command line using pre-packaged scripts. We demonstrate the use of HyDe on simulated data, as well as on two empirical data sets from the literature. We focus in particular on identifying individual hybrids within population samples and on distinguishing between hybrid speciation and gene flow. HyDe is freely available as an open source Python package under the GNU GPL v3 on both GitHub (https://github.com/pblischak/HyDe) and the Python Package Index (PyPI: https://pypi.python.org/pypi/phyde).
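To make the notion of site patterns with equal expected frequency concrete, the sketch below counts patterns in a four-taxon alignment and takes the difference between two of them. It uses the well-known ABBA/BABA comparison as a stand-in and does not reproduce HyDe's own invariants or API. The function names and the toy sequences are hypothetical, and the taxon order (P1, P2, P3, outgroup) is an assumption of this example.

```python
from collections import Counter


def site_pattern_counts(seqs):
    """Count biallelic site patterns for sequences ordered (P1, P2, P3, outgroup)."""
    counts = Counter()
    for column in zip(*seqs):
        # Skip gaps, ambiguity codes, and sites that are not biallelic.
        if any(base not in "ACGT" for base in column):
            continue
        if len(set(column)) != 2:
            continue
        outgroup = column[-1]
        # Label the outgroup allele "A" and the other allele "B".
        pattern = "".join("A" if base == outgroup else "B" for base in column)
        counts[pattern] += 1
    return counts


def abba_baba_difference(counts):
    """Difference expected to be near 0 when there is no gene flow."""
    return counts["ABBA"] - counts["BABA"]


# Hypothetical alignment of four uppercase sequences with four sites.
alignment = ["AGAA", "GAAC", "GGAC", "AAAA"]
counts = site_pattern_counts(alignment)
print(dict(counts), abba_baba_difference(counts))
```

A real analysis would also need a test of whether the observed difference departs from zero by more than sampling noise, which this sketch leaves out.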

pyrpipe: a python package for RNA-Seq workflows

10.1101/2020.03.04.925818 ◽

2020 ◽

Author(s):

Urminder Singh ◽

Jing Li ◽

Arun Seetharam ◽

Eve Syrkin Wurtele

Keyword(s):

Detailed Analysis ◽

Source Code ◽

Workflow Management ◽

Object Oriented ◽

Third Party ◽

Rna Seq ◽

Link Type ◽

Computing Environments ◽

High Level ◽

Python Package

Implementing RNA-Seq analysis pipelines is challenging as data gets bigger and more complex. With the availability of terabytes of RNA-Seq data and continuous development of analysis tools, there is a pressing requirement for frameworks that allow for fast and efficient development, modification, sharing and reuse of workflows. Scripting is often used, but it has many challenges and drawbacks. We have developed a Python package, python RNA-Seq Pipeliner (pyrpipe), that enables straightforward development of flexible, reproducible and easy-to-debug computational pipelines purely in Python, in an object-oriented manner. pyrpipe provides high-level APIs to popular RNA-Seq tools. Pipelines can be customized by integrating new Python code, third-party programs, or Python libraries. Researchers can create checkpoints in the pipeline or integrate pyrpipe into a workflow management system, thus allowing execution on multiple computing environments. pyrpipe produces detailed analysis and benchmark reports, which can be shared or included in publications. pyrpipe is implemented in Python and is compatible with Python versions 3.6 and higher. All source code is available at https://github.com/urmi-21/pyrpipe; the package can be installed from the source or from PyPI (https://pypi.org/project/pyrpipe). Documentation is available on Read the Docs (http://pyrpipe.rtfd.io).

pyABC: distributed, likelihood-free inference

10.1101/162552 ◽

2017 ◽

Cited By ~ 1

Author(s):

Emmanuel Klinger ◽

Dennis Rickert ◽

Jan Hasenauer

Keyword(s):

Sequential Monte Carlo ◽

Source Code ◽

Distance Functions ◽

Web Interface ◽

Practical Application ◽

Acceptance Threshold ◽

Link Type ◽

Data Querying ◽

Approximate Bayesian ◽

Python Package

Summary: Likelihood-free methods are often required for inference in systems biology. While Approximate Bayesian Computation (ABC) provides a theoretical solution, its practical application has often been challenging due to its high computational demands. To scale likelihood-free inference to computationally demanding stochastic models we developed pyABC: a distributed and scalable ABC-Sequential Monte Carlo (ABC-SMC) framework. It implements computation-minimizing and scalable, runtime-minimizing parallelization strategies for multi-core and distributed environments scaling to thousands of cores. The framework is accessible to non-expert users and also enables advanced users to experiment with and to custom implement many options of ABC-SMC schemes, such as acceptance threshold schedules, transition kernels and distance functions without alteration of pyABC’s source code. pyABC includes a web interface to visualize ongoing and finished ABC-SMC runs and exposes an API for data querying and post-processing.

Availability and Implementation: pyABC is written in Python 3 and is released under the GPLv3 license. The source code is hosted on https://github.com/neuralyzer/pyabc and the documentation on http://pyabc.readthedocs.io. It can be installed from the Python Package Index (PyPI).
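The roles of the distance function and the acceptance threshold are easiest to see in plain ABC rejection sampling, the simplest relative of ABC-SMC. The sketch below is that simpler scheme, written with the standard library only. It does not use pyABC, and every function name and the toy Gaussian problem are hypothetical.

```python
import random
import statistics


def abc_rejection(observed, simulate, sample_prior, distance, epsilon, n_accept, rng):
    """Keep prior draws whose simulated summary lies within epsilon of the data."""
    accepted = []
    while len(accepted) < n_accept:
        theta = sample_prior(rng)          # propose a parameter from the prior
        simulated = simulate(theta, rng)   # run the stochastic model
        if distance(simulated, observed) <= epsilon:
            accepted.append(theta)         # accept without evaluating a likelihood
    return accepted


# Toy problem: infer the mean of a unit-variance Gaussian from its sample mean.
def sample_prior(rng):
    return rng.uniform(-5.0, 5.0)


def simulate(theta, rng):
    return statistics.fmean([rng.gauss(theta, 1.0) for _ in range(20)])


def distance(a, b):
    return abs(a - b)


rng = random.Random(0)
posterior = abc_rejection(2.0, simulate, sample_prior, distance, 0.3, 200, rng)
print(round(statistics.fmean(posterior), 2))
```

ABC-SMC improves on this by running several generations with a shrinking threshold and proposing new parameters by perturbing previously accepted ones, which is where the threshold schedules and transition kernels named in the summary come in.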
