TaxiBGC: a Taxonomy-guided Approach for the Identification of Experimentally Verified Microbial Biosynthetic Gene Clusters in Shotgun Metagenomic Data

Biosynthetic gene clusters (BGCs) in microbial genomes encode the production of bioactive secondary metabolites (SMs). Given the well-recognized importance of SMs in microbe-microbe and microbe-host interactions, the large-scale identification of BGCs from microbial metagenomes could offer novel functional insights into complex chemical ecology. Despite recent progress, currently available tools for predicting BGCs from shotgun metagenomes have several limitations, including the need for computationally demanding read assembly and the prediction of only a narrow breadth of BGC classes. To overcome these limitations, we developed TaxiBGC (Taxonomy-guided Identification of Biosynthetic Gene Clusters), a computational pipeline for identifying experimentally verified BGCs in shotgun metagenomes by first pinpointing the microbial species likely to produce them. We show that our species-centric approach identified BGCs in simulated metagenomes more accurately than detecting BGC genes alone. By applying TaxiBGC to 5,423 metagenomes from the Human Microbiome Project and various case-control studies, we identified distinct BGC signatures of major human body sites and candidate stool-borne biomarkers for multiple diseases, including inflammatory bowel disease, colorectal cancer, and psychiatric disorders. In all, TaxiBGC demonstrates a significant advantage over existing techniques for systematically characterizing BGCs and inferring their SMs from microbiome data.

Mining metagenomes for natural product biosynthetic gene clusters: unlocking new potential with ultrafast techniques

10.1101/2021.01.20.427441 ◽

2021 ◽

Author(s):

Emiliano Pereira-Flores ◽

Marnix Medema ◽

Pier Luigi Buttigieg ◽

Peter Meinicke ◽

Frank Oliver Glöckner ◽

...

Keyword(s):

Natural Products ◽

Natural Product ◽

Human Microbiome ◽

Gene Clusters ◽

Human Microbiome Project ◽

Biosynthetic Gene Cluster ◽

Metagenomic Data ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Wide Range

Microorganisms produce an immense variety of natural products through the expression of Biosynthetic Gene Clusters (BGCs): physically clustered genes that encode the enzymes of a specialized metabolic pathway. These natural products cover a wide range of chemical classes (e.g., aminoglycosides, lantibiotics, nonribosomal peptides, oligosaccharides, polyketides, terpenes) that are highly valuable for industrial and medical applications [1]. Metagenomics, as a culture-independent approach, has greatly enhanced our ability to survey the functional potential of microorganisms and is growing in popularity for the mining of BGCs. However, to effectively exploit metagenomic data to this end, it will be crucial to more efficiently identify these genomic elements in highly complex and ever-increasing volumes of data [2]. Here, we address this challenge by developing the ultrafast Biosynthetic Gene cluster MEtagenomic eXploration toolbox (BiG-MEx). BiG-MEx rapidly identifies a broad range of BGC protein domains, assesses their diversity and novelty, and predicts the abundance profile of natural product BGC classes in metagenomic data. We show the advantages of BiG-MEx compared to standard BGC-mining approaches, and use it to explore the BGC domain and class composition of samples in the TARA Oceans [3] and Human Microbiome Project [4] datasets. In these analyses, we demonstrate BiG-MEx’s applicability to study the distribution, diversity, and ecological roles of BGCs in metagenomic data, and guide the exploration of natural products with clinical applications.

Faculty Opinions recommendation of A systematic analysis of biosynthetic gene clusters in the human microbiome reveals a common family of antibiotics.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.718872945.793500860 ◽

2014 ◽

Author(s):

Howard Young ◽

Heekyong Bae

Keyword(s):

Human Microbiome ◽

Gene Clusters ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Systematic Analysis

Identification of a New Antimicrobial, Desertomycin H, Utilizing a Modified Crowded Plate Technique

Marine Drugs ◽

10.3390/md19080424 ◽

2021 ◽

Vol 19 (8) ◽

pp. 424

Author(s):

Osama G. Mohamed ◽

Sadaf Dorandish ◽

Rebecca Lindow ◽

Megan Steltz ◽

Ifrah Shoukat ◽

...

Keyword(s):

Antibiotic Production ◽

Gene Clusters ◽

Multidrug Resistant ◽

Microbial Interactions ◽

Mass Spectrometry Data ◽

Metagenomic Data ◽

Resistant Bacteria ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Plate Technique

Infections associated with antibiotic-resistant bacteria are a major global healthcare threat. New classes of antimicrobial compounds are urgently needed as the frequency of infections caused by multidrug-resistant microbes continues to rise. Recent metagenomic data have demonstrated that cultivatable bacterial genomes still encode biosynthetic potential that is transcriptionally silent. However, the culture conditions required to identify and express silent biosynthetic gene clusters that yield natural products with antimicrobial activity are largely unknown. Here, we describe a new antibiotic discovery scheme, dubbed the modified crowded plate technique (mCPT), that utilizes complex microbial interactions to elicit antimicrobial production from otherwise silent biosynthetic gene clusters. Using the mCPT as part of the antibiotic crowdsourcing educational program Tiny Earth®, we isolated over 1400 antibiotic-producing microbes, including 62 showing activity against multidrug-resistant pathogens. The natural product extracts generated from six microbial isolates showed potent activity against vancomycin-intermediate resistant Staphylococcus aureus. We utilized a targeted approach that coupled mass spectrometry data with bioactivity, yielding a new macrolactone class of metabolite, desertomycin H. In this study, we successfully demonstrate a concept that significantly increased our ability to quickly and efficiently identify microbes capable of silent antibiotic production.

IMG-ABC: A Knowledge Base To Fuel Discovery of Biosynthetic Gene Clusters and Novel Secondary Metabolites

mBio ◽

10.1128/mbio.00932-15 ◽

2015 ◽

Vol 6 (4) ◽

Cited By ~ 66

Author(s):

Michalis Hadjithomas ◽

I-Min Amy Chen ◽

Ken Chu ◽

Anna Ratner ◽

Krishna Palaniappan ◽

...

Keyword(s):

Secondary Metabolites ◽

Secondary Metabolism ◽

Genomic Data ◽

Gene Clusters ◽

Metagenomic Data ◽

Integrated Analysis ◽

Analysis Tool ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Analysis Tools

ABSTRACT In the discovery of secondary metabolites, analysis of sequence data is a promising exploration path that remains largely underutilized due to the lack of computational platforms that enable such a systematic approach on a large scale. In this work, we present IMG-ABC (https://img.jgi.doe.gov/abc), an atlas of biosynthetic gene clusters within the Integrated Microbial Genomes (IMG) system, which is aimed at harnessing the power of “big” genomic data for discovering small molecules. IMG-ABC relies on IMG's comprehensive integrated structural and functional genomic data for the analysis of biosynthetic gene clusters (BCs) and associated secondary metabolites (SMs). SMs and BCs serve as the two main classes of objects in IMG-ABC, each with a rich collection of attributes. A unique feature of IMG-ABC is the incorporation of both experimentally validated and computationally predicted BCs in genomes as well as metagenomes, thus identifying BCs in uncultured populations and rare taxa. We demonstrate the strength of IMG-ABC's focused integrated analysis tools in enabling the exploration of microbial secondary metabolism on a global scale, through the discovery of phenazine-producing clusters for the first time in Alphaproteobacteria. IMG-ABC strives to fill the long-existent void of resources for computational exploration of the secondary metabolism universe; its underlying scalable framework enables traversal of uncovered phylogenetic and chemical structure space, serving as a doorway to a new era in the discovery of novel molecules. IMPORTANCE IMG-ABC is the largest publicly available database of predicted and experimental biosynthetic gene clusters and the secondary metabolites they produce. The system also includes powerful search and analysis tools that are integrated with IMG's extensive genomic/metagenomic data and analysis tool kits.
As new research on biosynthetic gene clusters and secondary metabolites is published and more genomes are sequenced, IMG-ABC will continue to expand, with the goal of becoming an essential component of any bioinformatic exploration of the secondary metabolism world.

A Comparative Analysis of Biosynthetic Gene Clusters in Lean and Obese Humans

BioMed Research International ◽

10.1155/2019/6361320 ◽

2019 ◽

Vol 2019 ◽

pp. 1-7 ◽

Cited By ~ 1

Author(s):

Shengqin Wang ◽

Na Li ◽

Nan Li ◽

Huixi Zou ◽

Mingjiang Wu

Keyword(s):

Gut Microbiome ◽

Gene Clusters ◽

Human Adipose Tissue ◽

Taxonomic Diversity ◽

Metagenomic Data ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Intestinal Microbes ◽

Microbe Interactions ◽

Treatment Of Obesity

Obesity is intrinsically linked with the gut microbiome, and studies have identified several obesity-associated microbes. Microbe-microbe interactions can alter the composition of the microbial community and influence host health by producing secondary metabolites (SMs). However, the contribution of these SMs to the prevention and treatment of obesity has been largely ignored. We identified several SM-encoding biosynthetic gene clusters (BGCs) from the metagenomic data of lean and obese individuals and found a significant association between obesity and some BGCs, including those that produce hitherto unknown SMs. In addition, the mean abundance of BGCs was positively correlated with obesity, consistent with the lower taxonomic diversity in the gut microbiota of obese individuals. By comparing the BGCs of known SMs between obese and nonobese samples, we found that menaquinone produced by Enterobacter cloacae showed the highest correlation with BMI, in agreement with a recent study on human adipose tissue composition. Furthermore, an obesity-related nonribosomal peptide synthetase (NRPS) was negatively associated with Bacteroidetes, indicating that the SMs produced by intestinal microbes in obese individuals can change the microbiome structure. This is the first systemic study of the association between gut microbiome BGCs and obesity and provides new insights into the causes of obesity.

A Systematic Analysis of Biosynthetic Gene Clusters in the Human Microbiome Reveals a Common Family of Antibiotics

Cell ◽

10.1016/j.cell.2014.08.032 ◽

2014 ◽

Vol 158 (6) ◽

pp. 1402-1414 ◽

Cited By ~ 353

Author(s):

Mohamed S. Donia ◽

Peter Cimermancic ◽

Christopher J. Schulze ◽

Laura C. Wieland Brown ◽

John Martin ◽

...

Keyword(s):

Human Microbiome ◽

Gene Clusters ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Systematic Analysis

Searching more genomic sequence with less memory for fast and accurate metagenomic profiling

10.1101/036681 ◽

2016 ◽

Author(s):

Shea N Gardner ◽

Sasha K Ames ◽

Maya B Gokhale ◽

Tom R Slezak ◽

Jonathan Allen

Keyword(s):

Large Scale ◽

Genomic Sequence ◽

Sequence Data ◽

Low Cost ◽

False Negative ◽

Human Microbiome ◽

Human Microbiome Project ◽

Metagenomic Data ◽

Reference Database ◽

Metagenomic Sequence

Software for rapid, accurate, and comprehensive microbial profiling of metagenomic sequence data on a desktop will play an important role in large-scale clinical use of metagenomic data. Here we describe LMAT-ML (Livermore Metagenomics Analysis Toolkit-Marker Library), which can be run with 24 GB of DRAM memory, an amount available on many clusters, or with 16 GB DRAM plus a 24 GB low-cost commodity flash drive (NVRAM), a cost-effective alternative for desktop or laptop users. We compared results from LMAT with five other rapid, low-memory tools for metagenome analysis for 131 Human Microbiome Project samples, and assessed discordant calls with BLAST. All the tools except LMAT-ML reported overly specific or incorrect species and strain resolution of reads that were in fact much more widely conserved across species, genera, and even families. Several of the tools misclassified reads from synthetic or vector sequence as microbial, or human reads as viral. We attribute the high numbers of false positive and false negative calls to a limited reference database with inadequate representation of known diversity. Our comparisons with real-world samples show that LMAT-ML is the only tool tested that classifies the majority of reads, and does so with high accuracy.

Faculty Opinions recommendation of A systematic analysis of biosynthetic gene clusters in the human microbiome reveals a common family of antibiotics.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.718872945.793501437 ◽

2014 ◽

Author(s):

Chris Whitfield

Keyword(s):

Human Microbiome ◽

Gene Clusters ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Systematic Analysis

Large-Scale Metagenome Assembly Reveals Novel Animal-Associated Microbial Genomes, Biosynthetic Gene Clusters, and Other Genetic Diversity

mSystems ◽

10.1128/msystems.01045-20 ◽

2020 ◽

Vol 5 (6) ◽

Author(s):

Nicholas D. Youngblut ◽

Jacobo de la Cuesta-Zuluaga ◽

Georg H. Reischer ◽

Silke Dauser ◽

Nathalie Schuster ◽

...

Keyword(s):

Large Scale ◽

Animal Species ◽

Gene Clusters ◽

Genomic Diversity ◽

Data Sets ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Microbial Genomes ◽

Metagenome Assembly ◽

Gut Metagenome

ABSTRACT Large-scale metagenome assemblies of human microbiomes have produced a vast catalogue of previously unseen microbial genomes; however, comparatively few microbial genomes derive from other vertebrates. Here, we generated 5,596 metagenome-assembled genomes (MAGs) from the gut metagenomes of 180 predominantly wild animal species representing 5 classes, in addition to 14 existing animal gut metagenome data sets. The MAGs comprised 1,522 species-level genome bins (SGBs), most of which were novel at the species, genus, or family level, and the majority were enriched in host versus environment metagenomes. Many traits distinguished SGBs enriched in host or environmental biomes, including the number of antimicrobial resistance genes. We identified 1,986 diverse biosynthetic gene clusters; only 23 clustered with any MIBiG database references. Gene-based assembly revealed tremendous gene diversity, much of it host or environment specific. Our MAG and gene data sets greatly expand the microbial genome repertoire and provide a broad view of microbial adaptations to the vertebrate gut. IMPORTANCE Microbiome studies on a select few mammalian species (e.g., humans, mice, and cattle) have revealed a great deal of novel genomic diversity in the gut microbiome. However, little is known of the microbial diversity in the gut of other vertebrates. We studied the gut microbiomes of a large set of mostly wild animal species consisting of mammals, birds, reptiles, amphibians, and fish. Unfortunately, we found that existing reference databases commonly used for metagenomic analyses failed to capture the microbiome diversity among vertebrates. To increase database representation, we applied advanced metagenome assembly methods to our animal gut data and to many public gut metagenome data sets that had not been used to obtain microbial genomes. 
Our resulting genome and gene cluster collections comprised a great deal of novel taxonomic and genomic diversity, which we extensively characterized. Our findings substantially expand what is known of microbial genomic diversity in the vertebrate gut.

Faculty Opinions recommendation of A systematic analysis of biosynthetic gene clusters in the human microbiome reveals a common family of antibiotics.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.718872945.793500087 ◽

2014 ◽

Author(s):

David Triggle

Keyword(s):

Human Microbiome ◽

Gene Clusters ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Systematic Analysis
