Recapitulation of the evolution of biosynthetic gene clusters reveals hidden chemical diversity on bacterial genomes

Mapping Intimacies ◽

10.1101/020503 ◽

2015 ◽

Cited By ~ 6

Author(s):

Pablo Cruz-Morales ◽

Christian E. Martínez-Guerrero ◽

Marco A. Morales-Escalante ◽

Luis Yáñez-Guerra ◽

Johannes Florian Kopp ◽

...

Keyword(s):

Natural Products ◽

Chemical Space ◽

Streptomyces Coelicolor ◽

Genome Mining ◽

Gene Clusters ◽

Biosynthetic Gene Cluster ◽

Chemical Diversity ◽

Biosynthetic Gene ◽

Bacterial Genomes ◽

Biosynthetic Gene Clusters

Natural products have provided humans with antibiotics for millennia. However, a decline in the pace of chemical discovery exerts pressure on human health as antibiotic resistance spreads. The empirical nature of current genome mining approaches used for natural products research limits the chemical space that is explored. By integration of evolutionary concepts related to emergence of metabolism, we have gained fundamental insights that are translated into an alternative genome mining approach, termed EvoMining. As the founding assumption of EvoMining is the evolution of enzymes, we solved two milestone problems revealing unprecedented conversions. First, we report the biosynthetic gene cluster of the ‘orphan’ metabolite leupeptin in Streptomyces roseus. Second, we discover an enzyme involved in formation of an arsenic-carbon bond in Streptomyces coelicolor and Streptomyces lividans. This work provides evidence that bacterial chemical repertoire is underexploited, as well as an approach to accelerate the discovery of novel antibiotics from bacterial genomes.

Targeted Genome Mining—From Compound Discovery to Biosynthetic Pathway Elucidation

Microorganisms ◽

10.3390/microorganisms8122034 ◽

2020 ◽

Vol 8 (12) ◽

pp. 2034

Author(s):

Nils Gummerlich ◽

Yuriy Rebets ◽

Constanze Paulus ◽

Josef Zapp ◽

Andriy Luzhetskyy

Keyword(s):

Natural Products ◽

Polyketide Synthase ◽

Genomic Library ◽

Mutational Analysis ◽

Genome Mining ◽

Gene Clusters ◽

Biosynthetic Gene Cluster ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Feeding Experiments

Natural products are an important source of novel investigational compounds in drug discovery. Especially in the field of antibiotics, Actinobacteria have been proven to be a reliable source for lead structures. The discovery of these natural products with activity- and structure-guided screenings has been impeded by the constant rediscovery of previously identified compounds. Additionally, a large discrepancy between produced natural products and biosynthetic potential in Actinobacteria, including representatives of the order Pseudonocardiales, has been revealed using genome sequencing. To turn this genomic potential into novel natural products, we used an approach including the in silico pre-selection of unique biosynthetic gene clusters followed by their systematic heterologous expression. As a proof of concept, fifteen Saccharothrix espanaensis genomic library clones covering predicted biosynthetic gene clusters were chosen for expression in two heterologous hosts, Streptomyces lividans and Streptomyces albus. As a result, two novel natural products, an unusual angucyclinone pentangumycin and a new type II polyketide synthase shunt product SEK90, were identified. After purification and structure elucidation, the biosynthetic pathways leading to the formation of pentangumycin and SEK90 were deduced using mutational analysis of the biosynthetic gene cluster and feeding experiments with 13C-labelled precursors.

An interpreted atlas of biosynthetic gene clusters from 1,000 fungal genomes

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.2020230118 ◽

2021 ◽

Vol 118 (19) ◽

pp. e2020230118

Author(s):

Matthew T. Robey ◽

Lindsay K. Caesar ◽

Milton T. Drott ◽

Nancy P. Keller ◽

Neil L. Kelleher

Keyword(s):

Natural Products ◽

Chemical Space ◽

Genome Mining ◽

Fold Increase ◽

Gene Clusters ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Automated Annotation ◽

Fungal Genomes ◽

Species Specific

Fungi are prolific producers of natural products, compounds which have had a large societal impact as pharmaceuticals, mycotoxins, and agrochemicals. Despite the availability of over 1,000 fungal genomes and several decades of compound discovery efforts from fungi, the biosynthetic gene clusters (BGCs) encoded by these genomes and the associated chemical space have yet to be analyzed systematically. Here, we provide detailed annotation and analyses of fungal biosynthetic and chemical space to enable genome mining and discovery of fungal natural products. Using 1,037 genomes from species across the fungal kingdom (e.g., Ascomycota, Basidiomycota, and non-Dikarya taxa), 36,399 predicted BGCs were organized into a network of 12,067 gene cluster families (GCFs). Anchoring these GCFs with reference BGCs enabled automated annotation of 2,026 BGCs with predicted metabolite scaffolds. We performed parallel analyses of the chemical repertoire of fungi, organizing 15,213 fungal compounds into 2,945 molecular families (MFs). The taxonomic landscape of fungal GCFs is largely species specific, though select families such as the equisetin GCF are present across vast phylogenetic distances with parallel diversifications in the GCF and MF. We compare these fungal datasets with a set of 5,453 bacterial genomes and their BGCs and 9,382 bacterial compounds, revealing dramatic differences between bacterial and fungal biosynthetic logic and chemical space. These genomics and cheminformatics analyses reveal the large extent to which fungal and bacterial sources represent distinct compound reservoirs. With a >10-fold increase in the number of interpreted strains and annotated BGCs, this work better regularizes the biosynthetic potential of fungi for rational compound discovery.

An Interpreted Atlas of Biosynthetic Gene Clusters from 1000 Fungal Genomes

10.1101/2020.09.21.307157 ◽

2020 ◽

Author(s):

Matthew T. Robey ◽

Lindsay K. Caesar ◽

Milton T. Drott ◽

Nancy P. Keller ◽

Neil L. Kelleher

Keyword(s):

Natural Products ◽

Large Scale ◽

Ad Hoc ◽

Chemical Space ◽

Genome Mining ◽

Fold Increase ◽

Gene Clusters ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Fungal Genomes

Fungi are prolific producers of natural products, compounds which have had a large societal impact as pharmaceuticals, mycotoxins, and agrochemicals. Despite the availability of over 1000 fungal genomes and several decades of compound discovery efforts from fungi, the biosynthetic gene clusters (BGCs) encoded by these genomes and the associated chemical space have yet to be analyzed systematically. Here we provide detailed annotation and analyses of fungal biosynthetic and chemical space to enable genome mining and discovery of fungal natural products. Using 1037 genomes from species across the fungal kingdom (e.g., Ascomycota, Basidiomycota, and non-Dikarya taxa), 36,399 predicted BGCs were organized into a network of 12,067 gene cluster families (GCFs). Anchoring these GCFs with reference BGCs enabled automated annotation of 2,026 BGCs with predicted metabolite scaffolds. We performed parallel analyses of the chemical repertoire of Fungi, organizing 15,213 fungal compounds into 2,945 molecular families (MFs). The taxonomic landscape of fungal GCFs is largely species-specific, though select families such as the equisetin GCF are present across vast phylogenetic distances with parallel diversifications in the GCF and MF. We compare these fungal datasets with a set of 5,453 bacterial genomes and their BGCs and 9,382 bacterial compounds, revealing dramatic differences between bacterial and fungal biosynthetic logic and chemical space. These genomics and cheminformatics analyses reveal the large extent to which fungal and bacterial sources represent distinct compound reservoirs. With a >10-fold increase in the number of interpreted strains and annotated BGCs, this work better regularizes the biosynthetic potential of fungi for rational compound discovery.

Significance Statement: Fungi represent an underexploited resource for new compounds with applications in the pharmaceutical and agriscience industries. Despite the availability of >1000 fungal genomes, our knowledge of the biosynthetic space encoded by these genomes is limited and ad hoc. We present results from systematically organizing the biosynthetic content of 1037 fungal genomes, providing a resource for data-driven genome mining and large-scale comparison of the genetic and molecular repertoires produced in fungi with those present in bacteria.

Expanding the Natural Products Heterologous Expression Repertoire in the Model Cyanobacterium Anabaena sp. Strain PCC 7120: Production of Pendolmycin and Teleocidin B-4

10.26434/chemrxiv.11316098.v1 ◽

2019 ◽

Cited By ~ 1

Author(s):

Patrick Videau ◽

Kaitlyn Wells ◽

Arun Singh ◽

Jessie Eiting ◽

Philip Proteau ◽

...

Keyword(s):

Natural Products ◽

Genome Mining ◽

Gene Clusters ◽

Combinatorial Biosynthesis ◽

Test Case ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Cyanobacterium Anabaena ◽

Anabaena Sp ◽

Pcc 7120

Cyanobacteria are prolific producers of natural products and genome mining has shown that many orphan biosynthetic gene clusters can be found in sequenced cyanobacterial genomes. New tools and methodologies are required to investigate these biosynthetic gene clusters and here we present the use of Anabaena sp. strain PCC 7120 as a host for combinatorial biosynthesis of natural products using the indolactam natural products (lyngbyatoxin A, pendolmycin, and teleocidin B-4) as a test case. We were able to successfully produce all three compounds using codon-optimized genes from Actinobacteria. We also introduce a new plasmid backbone based on the native Anabaena 7120 plasmid pCC7120ζ and show that production of teleocidin B-4 can be accomplished using a two-plasmid system, which can be introduced by co-conjugation.

Discovery of Unusual Cyanobacterial Tryptophan-containing Anabaenopeptins by MS/MS Based Molecular Networking

10.20944/preprints202007.0562.v1 ◽

2020 ◽

Author(s):

Subhasish Saha ◽

Germana Esposito ◽

Petra Urajova ◽

Jan Mareš ◽

Daniela Ewe ◽

...

Keyword(s):

Multidisciplinary Approach ◽

Spectroscopic Analysis ◽

Chemical Space ◽

Genome Mining ◽

Gc Content ◽

Gene Clusters ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Hela Cell Lines ◽

Bioactive Secondary Metabolites

Heterocytous cyanobacteria are among the most prolific sources of bioactive secondary metabolites, including anabaenopeptins (APTs). A terrestrial filamentous Brasilonema sp. CT11, collected as a black mat in a Costa Rican bamboo forest, was studied using a multidisciplinary approach: genome mining and HPLC-HRMS/MS coupled with bioinformatic analyses. Herein, we report the nearly complete genome, consisting of 8.79 Mbp with a GC content of 42.4%. Moreover, we report on three novel tryptophan-containing APTs: anabaenopeptin 788 (1), anabaenopeptin 802 (2) and anabaenopeptin 816 (3). Further, the structures of two homologues, i.e., anabaenopeptin 802 (2a) and anabaenopeptin 802 (2b), were determined by spectroscopic analysis (NMR and MS). Both compounds were shown to exert weak to moderate antiproliferative activity against HeLa cell lines. This study also describes the unique and diverse potential of the biosynthetic gene clusters and provides an assessment of the predicted chemical space yet to be discovered from this genus.

Genomic analysis of siderophore β-hydroxylases reveals divergent stereocontrol and expands the condensation domain family

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1903161116 ◽

2019 ◽

Vol 116 (40) ◽

pp. 19805-19814 ◽

Cited By ~ 8

Author(s):

Zachary L. Reitz ◽

Clifford D. Hardy ◽

Jaewon Suk ◽

Jean Bouvet ◽

Alison Butler

Keyword(s):

Predictive Power ◽

Genome Mining ◽

Genomic Analysis ◽

Gene Clusters ◽

Biosynthetic Gene Cluster ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Peptide Synthetase ◽

Condensation Domain

Genome mining of biosynthetic pathways streamlines discovery of secondary metabolites but can leave ambiguities in the predicted structures, which must be rectified experimentally. Through coupling the reactivity predicted by biosynthetic gene clusters with verified structures, the origin of the β-hydroxyaspartic acid diastereomers in siderophores is reported herein. Two functional subtypes of nonheme Fe(II)/α-ketoglutarate–dependent aspartyl β-hydroxylases are identified in siderophore biosynthetic gene clusters, which differ in genomic organization—existing either as fused domains (IβHAsp) at the carboxyl terminus of a nonribosomal peptide synthetase (NRPS) or as stand-alone enzymes (TβHAsp)—and each directs opposite stereoselectivity of Asp β-hydroxylation. The predictive power of this subtype delineation is confirmed by the stereochemical characterization of β-OHAsp residues in pyoverdine GB-1, delftibactin, histicorrugatin, and cupriachelin. The l-threo (2S, 3S) β-OHAsp residues of alterobactin arise from hydroxylation by the β-hydroxylase domain integrated into NRPS AltH, while l-erythro (2S, 3R) β-OHAsp in delftibactin arises from the stand-alone β-hydroxylase DelD. Cupriachelin contains both l-threo and l-erythro β-OHAsp, consistent with the presence of both types of β-hydroxylases in the biosynthetic gene cluster. A third subtype of nonheme Fe(II)/α-ketoglutarate–dependent enzymes (IβHHis) hydroxylates histidyl residues with l-threo stereospecificity. A previously undescribed, noncanonical member of the NRPS condensation domain superfamily is identified, named the interface domain, which is proposed to position the β-hydroxylase and the NRPS-bound amino acid prior to hydroxylation. Through mapping characterized β-OHAsp diastereomers to the phylogenetic tree of siderophore β-hydroxylases, methods to predict β-OHAsp stereochemistry in silico are realized.

A Deep Learning Genome-Mining Strategy Improves Biosynthetic Gene Cluster Prediction

10.1101/500694 ◽

2018 ◽

Author(s):

Geoffrey D. Hannigan ◽

David Prihoda ◽

Andrej Palicka ◽

Jindrich Soukup ◽

Ondrej Klempir ◽

...

Keyword(s):

Natural Products ◽

Deep Learning ◽

Learning Strategy ◽

Genome Mining ◽

Gene Clusters ◽

Biosynthetic Gene Cluster ◽

Biosynthetic Gene ◽

Antimicrobial Drugs ◽

Drug Candidates ◽

Significant Step

Natural products represent a rich reservoir of small molecule drug candidates utilized as antimicrobial drugs, anticancer therapies, and immunomodulatory agents. These molecules are microbial secondary metabolites synthesized by co-localized genes termed Biosynthetic Gene Clusters (BGCs). The increase in full microbial genomes and similar resources has led to development of BGC prediction algorithms, although their precision and ability to identify novel BGC classes could be improved. Here we present a deep learning strategy (DeepBGC) that offers more accurate BGC identification and an improved ability to extrapolate and identify novel BGC classes compared to existing tools. We supplemented this with downstream random forest classifiers that accurately predicted BGC product classes and potential chemical activity. Application of DeepBGC to bacterial genomes uncovered previously undetectable BGCs that may code for natural products with novel biologic activities. The improved accuracy and classification ability of DeepBGC represent a significant step forward for in silico BGC identification.

Recent development of antiSMASH and other computational approaches to mine secondary metabolite biosynthetic gene clusters

Briefings in Bioinformatics ◽

10.1093/bib/bbx146 ◽

2017 ◽

Vol 20 (4) ◽

pp. 1103-1113 ◽

Cited By ~ 37

Author(s):

Kai Blin ◽

Hyun Uk Kim ◽

Marnix H Medema ◽

Tilmann Weber

Keyword(s):

Natural Products ◽

Small Molecules ◽

Sequence Similarity ◽

Genome Mining ◽

Gene Clusters ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Rule Based ◽

Chemical Structures ◽

Annotation Quality

Many drugs are derived from small molecules produced by microorganisms and plants, so-called natural products. Natural products have diverse chemical structures, but the biosynthetic pathways producing those compounds are often organized as biosynthetic gene clusters (BGCs) and follow a highly conserved biosynthetic logic. This allows for the identification of core biosynthetic enzymes using genome mining strategies that are based on the sequence similarity of the involved enzymes/genes. However, mining for a variety of BGCs quickly approaches a complexity level where manual analyses are no longer possible and require the use of automated genome mining pipelines, such as the antiSMASH software. In this review, we discuss the principles underlying the predictions of antiSMASH and other tools and provide practical advice for their application. Furthermore, we discuss important caveats such as rule-based BGC detection, sequence and annotation quality and cluster boundary prediction, which all have to be considered while planning for, performing and analyzing the results of genome mining studies.

MIBiG 2.0: a repository for biosynthetic gene clusters of known function

Nucleic Acids Research ◽

10.1093/nar/gkz882 ◽

2019 ◽

Cited By ~ 31

Author(s):

Satria A Kautsar ◽

Kai Blin ◽

Simon Shaw ◽

Jorge C Navarro-Muñoz ◽

Barbara R Terlouw ◽

...

Keyword(s):

Genome Mining ◽

Gene Clusters ◽

Biosynthetic Gene Cluster ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Data Schema ◽

Cluster Data ◽

Structure Databases ◽

And Storage

Fueled by the explosion of (meta)genomic data, genome mining of specialized metabolites has become a major technology for drug discovery and studying microbiome ecology. In these efforts, computational tools like antiSMASH have played a central role through the analysis of Biosynthetic Gene Clusters (BGCs). Thousands of candidate BGCs from microbial genomes have been identified and stored in public databases. Interpreting the function and novelty of these predicted BGCs requires comparison with a well-documented set of BGCs of known function. The MIBiG (Minimum Information about a Biosynthetic Gene Cluster) Data Standard and Repository was established in 2015 to enable curation and storage of known BGCs. Here, we present MIBiG 2.0, which encompasses major updates to the schema, the data, and the online repository itself. Over the past five years, 851 new BGCs have been added. Additionally, we performed extensive manual data curation of all entries to improve the annotation quality of our repository. We also redesigned the data schema to ensure the compliance of future annotations. Finally, we improved the user experience by adding new features such as query searches and a statistics page, and enabled direct link-outs to chemical structure databases. The repository is accessible online at https://mibig.secondarymetabolites.org/.

Further Biochemical Profiling of Hypholoma fasciculare Metabolome Reveals Its Chemogenetic Diversity

Frontiers in Bioengineering and Biotechnology ◽

10.3389/fbioe.2021.567384 ◽

2021 ◽

Vol 9 ◽

Author(s):

Suhad A. A. Al-Salihi ◽

Ian D. Bull ◽

Raghad Al-Salhi ◽

Paul J. Gates ◽

Kifah S. M. Salih ◽

...

Keyword(s):

Natural Products ◽

Chemical Properties ◽

Gene Clusters ◽

Biosynthetic Gene Cluster ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Microbial Drug Resistance ◽

Bioactive Natural Products ◽

Highly Active ◽

Hypholoma Fasciculare

Natural products with novel chemistry are urgently needed to battle the continued increase in microbial drug resistance. Mushroom-forming fungi are underutilized as a source of novel antibiotics in the literature due to their challenging culture preparation and genetic intractability. However, modern fungal molecular and synthetic biology tools have renewed interest in exploring mushroom fungi for novel therapeutic agents. The aims of this study were to investigate the secondary metabolites of nine basidiomycetes, screen their biological and chemical properties, and then investigate the genetic pathways associated with their production. Of the nine fungi selected, Hypholoma fasciculare was revealed to be a highly active antagonistic species, with antimicrobial activity against three different microorganisms: Bacillus subtilis, Escherichia coli, and Saccharomyces cerevisiae. Genomic comparisons and chromatographic studies were employed to characterize more than 15 biosynthetic gene clusters and resulted in the identification of 3,5-dichloromethoxy benzoic acid as a potential antibacterial compound. The biosynthetic gene cluster for this product is also predicted. This study reinforces the potential of mushroom-forming fungi as an underexplored reservoir of bioactive natural products. Access to genomic data, and chemical-based frameworks, will assist the development and application of novel molecules with applications in both the pharmaceutical and agrochemical industries.
