Metabolic network-guided binning of metagenomic sequence fragments

2015, Vol 32 (6), pp. 867-874
Author(s): Matthew B. Biggs, Jason A. Papin

Abstract Motivation: Most microbes on Earth have never been grown in a laboratory, and can only be studied through DNA sequences. Environmental DNA sequence samples are complex mixtures of fragments from many different species, often unknown. There is a pressing need for methods that can reliably reconstruct genomes from complex metagenomic samples in order to address questions in ecology, bioremediation, and human health. Results: We present the SOrting by NEtwork Completion (SONEC) approach for assigning reactions to incomplete metabolic networks based on a metabolite connectivity score. We successfully demonstrate proof of concept in a set of 100 genome-scale metabolic network reconstructions, and delineate the variables that impact reaction assignment accuracy. We further demonstrate the integration of SONEC with existing approaches (such as cross-sample scaffold abundance profile clustering) on a set of 94 metagenomic samples from the Human Microbiome Project. We show that not only does SONEC aid in reconstructing species-level genomes, but it also improves functional predictions made with the resulting metabolic networks. Availability and implementation: The datasets and code presented in this work are available at: https://bitbucket.org/mattbiggs/sorting_by_network_completion/. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
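
The abstract does not define the metabolite connectivity score, so the following is only a minimal sketch of the general idea, assuming the score is the fraction of a reaction's metabolites that already occur in a candidate network. The function names, bin names, and metabolite identifiers are hypothetical and are not taken from the SONEC code.

```python
from typing import Dict, Optional, Set


def connectivity_score(reaction_metabolites: Set[str],
                       network_metabolites: Set[str]) -> float:
    """Assumed score: fraction of the reaction's metabolites already in the network."""
    if not reaction_metabolites:
        return 0.0
    shared = reaction_metabolites & network_metabolites
    return len(shared) / len(reaction_metabolites)


def assign_reaction(reaction_metabolites: Set[str],
                    networks: Dict[str, Set[str]]) -> Optional[str]:
    """Return the name of the best-connected network, or None if nothing is shared."""
    best_name, best_score = None, 0.0
    for name in sorted(networks):  # sorted for deterministic tie-breaking
        score = connectivity_score(reaction_metabolites, networks[name])
        if score > best_score:
            best_name, best_score = name, score
    return best_name


# Toy data (hypothetical): two incomplete networks, each given as its metabolite set.
networks = {
    "bin_A": {"glc", "g6p", "f6p", "atp", "adp"},
    "bin_B": {"ac", "accoa", "coa", "atp", "amp"},
}
unassigned = {"f6p", "atp", "fdp", "adp"}  # metabolites of one unassigned reaction

print(assign_reaction(unassigned, networks))  # bin_A (3 of 4 metabolites shared)
```

A reaction that shares no metabolite with any network is left unassigned (`None`) in this sketch; how the published method treats such cases is not stated in the abstract.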

2021
Author(s): Ecehan Abdik, Tunahan Cakir

Genome-scale metabolic networks enable systemic investigation of metabolic alterations caused by diseases by providing interpretation of omics data. Although Mus musculus (mouse) is one of the most commonly used model...


2020, Vol 36 (14), pp. 4130-4136
Author(s): David J Burks, Rajeev K Azad

Abstract Motivation: Alignment-free, stochastic models derived from k-mer distributions representing reference genome sequences have a rich history in the classification of DNA sequences. In particular, the variants of Markov models have previously been used extensively. Higher-order Markov models have been used with caution, perhaps sparingly, primarily because of the lack of enough training data and computational power. Advances in sequencing technology and computation have enabled exploitation of the predictive power of higher-order models. We, therefore, revisited higher-order Markov models and assessed their performance in classifying metagenomic sequences. Results: Comparative assessment of higher-order models (HOMs, 9th order or higher) with interpolated Markov model, interpolated context model and lower-order models (8th order or lower) was performed on metagenomic datasets constructed using sequenced prokaryotic genomes. Our results show that HOMs outperform other models in classifying metagenomic fragments as short as 100 nt at all taxonomic ranks, and at lower ranks when the fragment size was increased to 250 nt. HOMs were also found to be significantly more accurate than local alignment, which is widely relied upon for taxonomic classification of metagenomic sequences. A novel software implementation written in C++ performs classification faster than the existing Markovian metagenomic classifiers and can therefore be used as a standalone classifier or in conjunction with existing taxonomic classifiers for more robust classification of metagenomic sequences. Availability and implementation: The software has been made available at https://github.com/djburks/SMM. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.
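
To make the idea of classifying a fragment with a fixed-order Markov model concrete, here is a minimal Python sketch. It is not the authors' C++ implementation: the function names, the add-one pseudocount, and the toy "genomes" are assumptions made for illustration, and the order is set to 2 so the example stays tiny (the abstract evaluates 9th order and higher).

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"


def train_markov(sequence, order):
    """Count, for every context of length `order`, how often each next base follows it."""
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(len(sequence) - order):
        context = sequence[i:i + order]
        counts[context][sequence[i + order]] += 1
    return counts


def log_likelihood(fragment, counts, order, pseudocount=1.0):
    """Log-probability of the fragment's transitions under one model.

    The pseudocount (an assumption here) keeps unseen transitions from
    producing log(0).
    """
    total = 0.0
    for i in range(len(fragment) - order):
        context = fragment[i:i + order]
        ctx_counts = counts.get(context, {})
        numerator = ctx_counts.get(fragment[i + order], 0) + pseudocount
        denominator = sum(ctx_counts.values()) + pseudocount * len(ALPHABET)
        total += math.log(numerator / denominator)
    return total


def classify(fragment, models, order):
    """Return the name of the model with the highest log-likelihood."""
    return max(sorted(models),
               key=lambda name: log_likelihood(fragment, models[name], order))


# Toy reference "genomes" (hypothetical) with very different composition.
ORDER = 2
references = {
    "at_rich": "ATATATTATAATATTA" * 20,
    "gc_rich": "GCGCGGCGCCGCGGCC" * 20,
}
models = {name: train_markov(seq, ORDER) for name, seq in references.items()}

print(classify("GCGGCGCC", models, ORDER))
print(classify("ATTATAAT", models, ORDER))
```

The number of possible contexts grows as 4^order, which is one way to see why the abstract ties the use of higher-order models to the availability of training data and computational power.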


Parasitology, 2010, Vol 137 (9), pp. 1393-1407
Author(s): LUDOVIC COTTRET, FABIEN JOURDAN

SUMMARY: Recently, the development of many mathematical methods has opened the way to modeling and analyzing genome-scale metabolic networks. Among them, methods based on graph models enable us to quickly perform large-scale analyses on large metabolic networks. However, it can be difficult for parasitologists to select the graph model and methods suited to their biological questions. In this review, after briefly addressing the problem of metabolic network reconstruction, we propose an overview of the graph-based approaches used in whole metabolic network analyses. Applications highlight the usefulness of this kind of approach in the field of parasitology, especially by suggesting metabolic targets for new drugs. Their development still represents a major challenge in the fight against the numerous diseases caused by parasites.


2019, Vol 36 (4), pp. 1289-1290
Author(s): Patrick H Bradley, Katherine S Pollard

Abstract Summary: Phylogenetic comparative methods are powerful but presently under-utilized ways to identify microbial genes underlying differences in community composition. These methods help to identify functionally important genes because they test for associations beyond those expected when related microbes occupy similar environments. We present phylogenize, a pipeline with web, QIIME 2 and R interfaces that allows researchers to perform phylogenetic regression on 16S amplicon and shotgun sequencing data and to visualize results. phylogenize applies broadly to both host-associated and environmental microbiomes. Using Human Microbiome Project and Earth Microbiome Project data, we show that phylogenize draws similar conclusions from 16S versus shotgun sequencing and reveals both known and candidate pathways associated with host colonization. Availability and implementation: phylogenize is available at https://phylogenize.org and https://bitbucket.org/pbradz/phylogenize. Supplementary information: Supplementary data are available at Bioinformatics online.


2018, Vol 26 (03), pp. 373-397
Author(s): ZIXIANG XU, JING GUO, YUNXIA YUE, JING MENG, XIAO SUN

Microbial Fuel Cells (MFCs) are devices that generate electricity directly from organic compounds with microbes (electricigens) serving as anodic catalysts. As a novel environment-friendly energy source, MFCs have extensive practical value. Since the biological features and metabolic mechanisms of electricigens have a great effect on the electricity production of MFCs, it is important to screen for strains with high electricity productivity in order to improve the power output of MFCs. Reconstructions and simulations of metabolic networks are of significant help in studying the metabolism of microorganisms so as to guide genetic engineering and metabolic engineering to improve their power-generating efficiency. Herein, we reconstructed a genome-scale constraint-based metabolic network model of Shewanella loihica PV-4, an important electricigen, based on its genomic functional annotations, reaction databases and published metabolic network models of seven microorganisms. The resulting network model iGX790 consists of 902 reactions (including 71 exchange reactions), 798 metabolites and 790 genes, covering the main pathways such as carbon metabolism, energy metabolism, amino acid metabolism, nucleic acid metabolism and lipid metabolism. Using the model, we simulated the growth rate and the maximal rate of ATP synthesis, and performed flux variability analysis and gene deletion analysis, among others, to examine the metabolism of S. loihica PV-4.


2021
Author(s): Thomas James Moutinho, Benjamin C Neubert, Matthew L Jenior, Jason A. Papin

Genome-scale metabolic network reconstructions (GENREs) are valuable tools for understanding microbial community metabolism. The process of automatically generating GENREs includes identifying metabolic reactions supported by sufficient genomic evidence to generate a draft metabolic network. The draft GENRE is then gapfilled with additional reactions in order to recapitulate specific growth phenotypes as indicated with associated experimental data. Previous methods have implemented absolute mapping thresholds for the reactions automatically included in draft GENREs; however, there is growing evidence that integrating annotation evidence in a continuous form can improve model accuracy. There is a need for flexibility in the structure of GENREs to better account for uncertainty in biological data, unknown regulatory mechanisms, and context specificity associated with data inputs. To address this issue, we present a novel method that provides a framework for quantifying combined genomic, biochemical, and phenotypic evidence for each biochemical reaction during automated GENRE construction. Our method, Constraint-based Analysis Yielding reaction Usage across metabolic Networks (CANYUNs), generates accurate GENREs with a quantitative metric for the cumulative evidence for each reaction included in the network. The structure of a CANYUN GENRE allows for the simultaneous integration of three data inputs while maintaining all supporting evidence for biochemical reactions that may be active in an organism. CANYUNs is designed to maximize the utility of experimental and annotation datasets and to ultimately assist in the curation of the reference datasets used for the automatic reconstruction of metabolic networks. We validated CANYUNs by generating an E. coli K-12 model and compared it to the manually curated reconstruction iML1515. Finally, we demonstrated the use of CANYUNs to build a model by generating an E. coli Nissle CANYUN GENRE using novel phenotypic data that we collected. This method may address key challenges for the procedural construction of metabolic networks by leveraging uncertainty and redundancy in biological data.


2019
Author(s): Hongzhong Lu, Zhengming Zhu, Eduard J Kerkhoven, Jens Nielsen

Abstract Summary: FALCONET (FAst visuaLisation of COmputational NETworks) enables the automatic formation and visualisation of metabolic maps from genome-scale models with R and CellDesigner, readily facilitating the visualisation of multi-layer omics datasets in the context of metabolic networks. Motivation: Until now, numerous GEMs have been reconstructed and used as scaffolds to conduct integrative omics analysis and in silico strain design. Due to the large network size of GEMs, it is challenging to produce and visualize these networks as metabolic maps for further in-depth analyses. Results: Here, we present the R package FALCONET, which facilitates drawing and visualizing metabolic maps in an automatic manner. This package will benefit the research community by allowing a wider use of GEMs in systems biology. Availability and implementation: FALCONET is available on https://github.com/SysBioChalmers/FALCONET and released under the MIT license. Contact: [email protected] Supplementary information: Supplementary data are available online.


2016
Author(s): Shea N Gardner, Sasha K Ames, Maya B Gokhale, Tom R Slezak, Jonathan Allen

Software for rapid, accurate, and comprehensive microbial profiling of metagenomic sequence data on a desktop will play an important role in large-scale clinical use of metagenomic data. Here we describe LMAT-ML (Livermore Metagenomics Analysis Toolkit-Marker Library), which can be run with 24 GB of DRAM, an amount available on many clusters, or with 16 GB of DRAM plus a 24 GB low-cost commodity flash drive (NVRAM), a cost-effective alternative for desktop or laptop users. We compared results from LMAT with five other rapid, low-memory tools for metagenome analysis for 131 Human Microbiome Project samples, and assessed discordant calls with BLAST. All the tools except LMAT-ML reported overly specific or incorrect species and strain resolution of reads that were in fact much more widely conserved across species, genera, and even families. Several of the tools misclassified reads from synthetic or vector sequence as microbial, or human reads as viral. We attribute the high numbers of false positive and false negative calls to a limited reference database with inadequate representation of known diversity. Our comparisons with real-world samples show that LMAT-ML is the only tool tested that classifies the majority of reads, and does so with high accuracy.


2020, Vol 36 (14), pp. 4163-4170
Author(s): Francisco Guil, José F Hidalgo, José M García

Abstract Motivation: Elementary flux modes (EFMs) are a key tool for analyzing genome-scale metabolic networks, and several methods have been proposed to compute them. Among them, those based on solving linear programming (LP) problems are known to be very efficient if the main interest lies in computing large enough sets of EFMs. Results: Here, we propose a new method called EFM-Ta that boosts the efficiency rate by analyzing the information provided by the LP solver. We base our method on a further study of the final tableau of the simplex method. By performing additional elementary steps and avoiding trivial solutions consisting of two cycles, we obtain many more EFMs for each LP problem posed, improving the efficiency rate of previously proposed methods by more than one order of magnitude. Availability and implementation: Software is freely available at https://github.com/biogacop/Boost_LP_EFM. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


Author(s): Mattia G Gollub, Hans-Michael Kaltenbach, Jörg Stelling

Abstract Motivation: Random sampling of metabolic fluxes can provide a comprehensive description of the capabilities of a metabolic network. However, current sampling approaches do not model thermodynamics explicitly, leading to inaccurate predictions of an organism’s potential or actual metabolic operations. Results: We present a probabilistic framework combining thermodynamic quantities with steady-state flux constraints to analyze the properties of a metabolic network. It includes methods for probabilistic metabolic optimization and for joint sampling of thermodynamic and flux spaces. Applied to a model of E. coli, we use the methods to reveal known and novel mechanisms of substrate channeling, and to accurately predict reaction directions and metabolite concentrations. Interestingly, predicted flux distributions are multimodal, leading to discrete hypotheses on E. coli’s metabolic capabilities. Availability: Python and MATLAB packages available at https://gitlab.com/csb.ethz/pta. Supplementary information: Supplementary data are available at Bioinformatics online.

