The antiSMASH database version 3: increased taxonomic coverage and new query features for modular enzymes

2020 ◽  
Vol 49 (D1) ◽  
pp. D639-D643 ◽  
Author(s):  
Kai Blin ◽  
Simon Shaw ◽  
Satria A Kautsar ◽  
Marnix H Medema ◽  
Tilmann Weber

Abstract Microorganisms produce natural products that are frequently used in the development of antibacterial, antiviral, and anticancer drugs, pesticides, herbicides, or fungicides. In recent years, genome mining has evolved into a prominent method to access this potential. antiSMASH is one of the most popular tools for this task. Here, we present version 3 of the antiSMASH database, providing a means to access and query precomputed antiSMASH-5.2-detected biosynthetic gene clusters from representative, publicly available, high-quality microbial genomes via an interactive graphical user interface. In version 3, the database contains 147 517 high-quality BGC regions from 388 archaeal, 25 236 bacterial and 177 fungal genomes and is available at https://antismash-db.secondarymetabolites.org/.

2018 ◽  
Vol 16 (10) ◽  
pp. 1620-1626 ◽  
Author(s):  
Cameron L. M. Gilchrist ◽  
Hang Li ◽  
Yit-Heng Chooi

A perspective on existing and emerging strategies for the prioritisation of secondary metabolite biosynthetic gene clusters (BGCs) to increase the odds of fruitful mining of fungal genomes.


Author(s):  
Satria A. Kautsar ◽  
Justin J. J. van der Hooft ◽  
Dick de Ridder ◽  
Marnix H. Medema

Abstract Background Genome mining for biosynthetic gene clusters (BGCs) has become an integral part of natural product discovery. The >200,000 microbial genomes now publicly available hold information on abundant novel chemistry. One way to navigate this vast genomic diversity is through comparative analysis of homologous BGCs, which allows identification of cross-species patterns that can be matched to the presence of metabolites or biological activities. However, current tools suffer from a bottleneck caused by the expensive network-based approach used to group these BGCs into gene cluster families (GCFs). Results Here, we introduce BiG-SLiCE, a tool designed to cluster massive numbers of BGCs. By representing them in Euclidean space, BiG-SLiCE can group BGCs into GCFs in a non-pairwise, near-linear fashion. We used BiG-SLiCE to analyze 1,225,071 BGCs collected from 209,206 publicly available microbial genomes and metagenome-assembled genomes (MAGs) within ten days on a typical 36-core CPU server. We demonstrate the utility of such analyses by reconstructing a global map of secondary metabolic diversity across taxonomy to identify uncharted biosynthetic potential. BiG-SLiCE also provides a "query mode" that can efficiently place newly sequenced BGCs into previously computed GCFs, plus a powerful output visualization engine that facilitates user-friendly data exploration. Conclusions BiG-SLiCE opens up new possibilities to accelerate natural product discovery and offers a first step towards constructing a global, searchable interconnected network of BGCs. As more genomes get sequenced from understudied taxa, more information can be mined to highlight their potentially novel chemistry. BiG-SLiCE is available via https://github.com/medema-group/bigslice.


2021 ◽  
Vol 118 (19) ◽  
pp. e2020230118
Author(s):  
Matthew T. Robey ◽  
Lindsay K. Caesar ◽  
Milton T. Drott ◽  
Nancy P. Keller ◽  
Neil L. Kelleher

Fungi are prolific producers of natural products, compounds which have had a large societal impact as pharmaceuticals, mycotoxins, and agrochemicals. Despite the availability of over 1,000 fungal genomes and several decades of compound discovery efforts from fungi, the biosynthetic gene clusters (BGCs) encoded by these genomes and the associated chemical space have yet to be analyzed systematically. Here, we provide detailed annotation and analyses of fungal biosynthetic and chemical space to enable genome mining and discovery of fungal natural products. Using 1,037 genomes from species across the fungal kingdom (e.g., Ascomycota, Basidiomycota, and non-Dikarya taxa), 36,399 predicted BGCs were organized into a network of 12,067 gene cluster families (GCFs). Anchoring these GCFs with reference BGCs enabled automated annotation of 2,026 BGCs with predicted metabolite scaffolds. We performed parallel analyses of the chemical repertoire of fungi, organizing 15,213 fungal compounds into 2,945 molecular families (MFs). The taxonomic landscape of fungal GCFs is largely species specific, though select families such as the equisetin GCF are present across vast phylogenetic distances with parallel diversifications in the GCF and MF. We compare these fungal datasets with a set of 5,453 bacterial genomes and their BGCs and 9,382 bacterial compounds, revealing dramatic differences between bacterial and fungal biosynthetic logic and chemical space. These genomics and cheminformatics analyses reveal the large extent to which fungal and bacterial sources represent distinct compound reservoirs. With a >10-fold increase in the number of interpreted strains and annotated BGCs, this work better regularizes the biosynthetic potential of fungi for rational compound discovery.


2020 ◽  
Author(s):  
Matthew T. Robey ◽  
Lindsay K. Caesar ◽  
Milton T. Drott ◽  
Nancy P. Keller ◽  
Neil L. Kelleher

Abstract Fungi are prolific producers of natural products, compounds which have had a large societal impact as pharmaceuticals, mycotoxins, and agrochemicals. Despite the availability of over 1000 fungal genomes and several decades of compound discovery efforts from fungi, the biosynthetic gene clusters (BGCs) encoded by these genomes and the associated chemical space have yet to be analyzed systematically. Here we provide detailed annotation and analyses of fungal biosynthetic and chemical space to enable genome mining and discovery of fungal natural products. Using 1037 genomes from species across the fungal kingdom (e.g., Ascomycota, Basidiomycota, and non-Dikarya taxa), 36,399 predicted BGCs were organized into a network of 12,067 gene cluster families (GCFs). Anchoring these GCFs with reference BGCs enabled automated annotation of 2,026 BGCs with predicted metabolite scaffolds. We performed parallel analyses of the chemical repertoire of fungi, organizing 15,213 fungal compounds into 2,945 molecular families (MFs). The taxonomic landscape of fungal GCFs is largely species-specific, though select families such as the equisetin GCF are present across vast phylogenetic distances with parallel diversifications in the GCF and MF. We compare these fungal datasets with a set of 5,453 bacterial genomes and their BGCs and 9,382 bacterial compounds, revealing dramatic differences between bacterial and fungal biosynthetic logic and chemical space. These genomics and cheminformatics analyses reveal the large extent to which fungal and bacterial sources represent distinct compound reservoirs. With a >10-fold increase in the number of interpreted strains and annotated BGCs, this work better regularizes the biosynthetic potential of fungi for rational compound discovery. Significance Statement Fungi represent an underexploited resource for new compounds with applications in the pharmaceutical and agriscience industries. Despite the availability of >1000 fungal genomes, our knowledge of the biosynthetic space encoded by these genomes is limited and ad hoc. We present results from systematically organizing the biosynthetic content of 1037 fungal genomes, providing a resource for data-driven genome mining and large-scale comparison of the genetic and molecular repertoires produced in fungi with those present in bacteria.


GigaScience ◽  
2021 ◽  
Vol 10 (1) ◽  
Author(s):  
Satria A Kautsar ◽  
Justin J J van der Hooft ◽  
Dick de Ridder ◽  
Marnix H Medema

Abstract Background Genome mining for biosynthetic gene clusters (BGCs) has become an integral part of natural product discovery. The >200,000 microbial genomes now publicly available hold information on abundant novel chemistry. One way to navigate this vast genomic diversity is through comparative analysis of homologous BGCs, which allows identification of cross-species patterns that can be matched to the presence of metabolites or biological activities. However, current tools are hindered by a bottleneck caused by the expensive network-based approach used to group these BGCs into gene cluster families (GCFs). Results Here, we introduce BiG-SLiCE, a tool designed to cluster massive numbers of BGCs. By representing them in Euclidean space, BiG-SLiCE can group BGCs into GCFs in a non-pairwise, near-linear fashion. We used BiG-SLiCE to analyze 1,225,071 BGCs collected from 209,206 publicly available microbial genomes and metagenome-assembled genomes within 10 days on a typical 36-core CPU server. We demonstrate the utility of such analyses by reconstructing a global map of secondary metabolic diversity across taxonomy to identify uncharted biosynthetic potential. BiG-SLiCE also provides a “query mode” that can efficiently place newly sequenced BGCs into previously computed GCFs, plus a powerful output visualization engine that facilitates user-friendly data exploration. Conclusions BiG-SLiCE opens up new possibilities to accelerate natural product discovery and offers a first step towards constructing a global and searchable interconnected network of BGCs. As more genomes are sequenced from understudied taxa, more information can be mined to highlight their potentially novel chemistry. BiG-SLiCE is available via https://github.com/medema-group/bigslice.
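The "non-pairwise, near-linear" grouping described in the abstract above can be illustrated with a small sketch. This is not BiG-SLiCE's implementation: it is a simplified single-pass, centroid-based grouping that assumes each BGC has already been turned into a numeric feature vector. The function names, the toy two-dimensional vectors, and the distance threshold are all hypothetical. The point it shows is that each vector is compared with the current family centroids only, never with every other vector, and that a "query" step can reuse the stored centroids.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_centroid(vec, centroids):
    """Return (index, distance) of the closest centroid, or (None, inf) if there are none."""
    best_idx, best_dist = None, float("inf")
    for idx, centroid in enumerate(centroids):
        dist = euclidean(vec, centroid)
        if dist < best_dist:
            best_idx, best_dist = idx, dist
    return best_idx, best_dist


def group_into_families(vectors, threshold):
    """Single pass over the data: join the nearest family if it is within
    `threshold` (hypothetical cutoff), otherwise start a new family."""
    centroids, counts, labels = [], [], []
    for vec in vectors:
        idx, dist = nearest_centroid(vec, centroids)
        if idx is None or dist > threshold:
            centroids.append(list(vec))
            counts.append(1)
            labels.append(len(centroids) - 1)
        else:
            counts[idx] += 1
            n = counts[idx]
            # Update the running mean of the family centroid.
            centroids[idx] = [c + (x - c) / n for c, x in zip(centroids[idx], vec)]
            labels.append(idx)
    return labels, centroids


def query_family(vec, centroids, threshold):
    """Place a new vector into a previously computed family, or return None."""
    idx, dist = nearest_centroid(vec, centroids)
    return idx if idx is not None and dist <= threshold else None


# Toy feature vectors standing in for BGCs (illustrative values only).
toy_vectors = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (0.0, 0.1)]
toy_labels, toy_centroids = group_into_families(toy_vectors, threshold=1.0)
```

Because the work per BGC scales with the number of families and not with the number of BGCs, the cost grows far more slowly than an all-against-all distance network. The result of such a greedy pass depends on input order, which is one reason to treat it as an illustration only.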


Author(s):  
Patrick Videau ◽  
Kaitlyn Wells ◽  
Arun Singh ◽  
Jessie Eiting ◽  
Philip Proteau ◽  
...  

Cyanobacteria are prolific producers of natural products, and genome mining has shown that many orphan biosynthetic gene clusters can be found in sequenced cyanobacterial genomes. New tools and methodologies are required to investigate these biosynthetic gene clusters, and here we present the use of Anabaena sp. strain PCC 7120 as a host for combinatorial biosynthesis of natural products using the indolactam natural products (lyngbyatoxin A, pendolmycin, and teleocidin B-4) as a test case. We were able to successfully produce all three compounds using codon-optimized genes from Actinobacteria. We also introduce a new plasmid backbone based on the native Anabaena 7120 plasmid pCC7120ζ and show that production of teleocidin B-4 can be accomplished using a two-plasmid system, which can be introduced by co-conjugation.


Author(s):  
Subhasish Saha ◽  
Germana Esposito ◽  
Petra Urajova ◽  
Jan Mareš ◽  
Daniela Ewe ◽  
...  

Heterocytous cyanobacteria are among the most prolific sources of bioactive secondary metabolites, including anabaenopeptins (APTs). A terrestrial filamentous Brasilonema sp. CT11, collected as a black mat in a bamboo forest in Costa Rica, was studied using a multidisciplinary approach: genome mining and HPLC-HRMS/MS coupled with bioinformatic analyses. Herein, we report the nearly complete genome, consisting of 8.79 Mbp with a GC content of 42.4%. Moreover, we report on three novel tryptophan-containing APTs: anabaenopeptin 788 (1), anabaenopeptin 802 (2) and anabaenopeptin 816 (3). Further, the structure of two homologues, i.e., anabaenopeptin 802 (2a) and anabaenopeptin 802 (2b), was determined by spectroscopic analysis (NMR and MS). Both compounds were shown to exert weak to moderate antiproliferative activity against HeLa cell lines. This study also highlights the unique and diverse potential of the biosynthetic gene clusters and provides an assessment of the predicted chemical space yet to be discovered from this genus.


Antibiotics ◽  
2020 ◽  
Vol 9 (8) ◽  
pp. 494
Author(s):  
Lena Mitousis ◽  
Yvonne Thoma ◽  
Ewa M. Musiol-Kroll

The first antibiotic-producing actinomycete (Streptomyces antibioticus) was described by Waksman and Woodruff in 1940. This discovery initiated the “actinomycetes era”, in which several species were identified and demonstrated to be a great source of bioactive compounds. However, this remarkable group of microorganisms and its potential for the production of bioactive agents were only partially exploited. This is because the growth of many actinomycetes cannot be reproduced on artificial media under laboratory conditions. In addition, sequencing, genome mining and bioactivity screening disclosed that numerous biosynthetic gene clusters (BGCs) encoded in actinomycete genomes are not expressed and, thus, the respective potential products remain uncharacterized. Therefore, a lot of effort was put into the development of technologies that facilitate access to actinomycete genomes and activation of their biosynthetic pathways. In this review, we mainly focus on molecular tools and methods for genetic engineering of actinomycetes that have emerged in the field in the past five years (2015–2020). In addition, we highlight examples of successful application of the recently developed technologies in genetic engineering of actinomycetes for activation and/or improvement of the biosynthesis of secondary metabolites.


2020 ◽  
Vol 9 (42) ◽  
Author(s):  
Alex J. Mullins ◽  
Cerith Jones ◽  
Matthew J. Bull ◽  
Gordon Webster ◽  
Julian Parkhill ◽  
...  

ABSTRACT The genomes of 450 members of Burkholderiaceae, isolated from clinical and environmental sources, were sequenced and assembled as a resource for genome mining. Genomic analysis of the collection has enabled the identification of multiple metabolites and their biosynthetic gene clusters, including the antibiotics gladiolin, icosalide A, enacyloxin, and cepacin A.


2019 ◽  
Vol 116 (40) ◽  
pp. 19805-19814 ◽  
Author(s):  
Zachary L. Reitz ◽  
Clifford D. Hardy ◽  
Jaewon Suk ◽  
Jean Bouvet ◽  
Alison Butler

Genome mining of biosynthetic pathways streamlines discovery of secondary metabolites but can leave ambiguities in the predicted structures, which must be rectified experimentally. Through coupling the reactivity predicted by biosynthetic gene clusters with verified structures, the origin of the β-hydroxyaspartic acid diastereomers in siderophores is reported herein. Two functional subtypes of nonheme Fe(II)/α-ketoglutarate–dependent aspartyl β-hydroxylases are identified in siderophore biosynthetic gene clusters, which differ in genomic organization—existing either as fused domains (IβHAsp) at the carboxyl terminus of a nonribosomal peptide synthetase (NRPS) or as stand-alone enzymes (TβHAsp)—and each directs opposite stereoselectivity of Asp β-hydroxylation. The predictive power of this subtype delineation is confirmed by the stereochemical characterization of β-OHAsp residues in pyoverdine GB-1, delftibactin, histicorrugatin, and cupriachelin. The l-threo (2S, 3S) β-OHAsp residues of alterobactin arise from hydroxylation by the β-hydroxylase domain integrated into NRPS AltH, while l-erythro (2S, 3R) β-OHAsp in delftibactin arises from the stand-alone β-hydroxylase DelD. Cupriachelin contains both l-threo and l-erythro β-OHAsp, consistent with the presence of both types of β-hydroxylases in the biosynthetic gene cluster. A third subtype of nonheme Fe(II)/α-ketoglutarate–dependent enzymes (IβHHis) hydroxylates histidyl residues with l-threo stereospecificity. A previously undescribed, noncanonical member of the NRPS condensation domain superfamily is identified, named the interface domain, which is proposed to position the β-hydroxylase and the NRPS-bound amino acid prior to hydroxylation. Through mapping characterized β-OHAsp diastereomers to the phylogenetic tree of siderophore β-hydroxylases, methods to predict β-OHAsp stereochemistry in silico are realized.
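The subtype-to-stereochemistry rule stated in the abstract above can be written as a small lookup, shown here as a minimal sketch. The table entries are transcribed from the abstract (fused IβHAsp domains give l-threo β-OHAsp, stand-alone TβHAsp enzymes give l-erythro β-OHAsp, and IβHHis hydroxylates histidyl residues with l-threo stereospecificity). The function name and the input format, a list of subtype labels found in a cluster, are assumptions made for illustration, and the sketch does not detect the subtypes from sequence.

```python
# Rule table transcribed from the abstract; keys are the subtype names it uses.
STEREOCHEMISTRY_BY_SUBTYPE = {
    "IβHAsp": ("Asp", "l-threo (2S, 3S)"),    # fused to the NRPS carboxyl terminus
    "TβHAsp": ("Asp", "l-erythro (2S, 3R)"),  # stand-alone enzyme
    "IβHHis": ("His", "l-threo"),             # histidyl β-hydroxylation
}


def predict_hydroxylated_residues(subtypes):
    """Hypothetical helper: map the β-hydroxylase subtypes annotated in a
    cluster to the (residue, configuration) pairs the rule predicts."""
    predictions = set()
    for subtype in subtypes:
        if subtype not in STEREOCHEMISTRY_BY_SUBTYPE:
            raise ValueError(f"unknown β-hydroxylase subtype: {subtype}")
        predictions.add(STEREOCHEMISTRY_BY_SUBTYPE[subtype])
    return predictions


# A cluster carrying both Asp hydroxylase subtypes, as reported for cupriachelin.
both_subtypes = predict_hydroxylated_residues(["IβHAsp", "TβHAsp"])
```

For a cluster with both Asp hydroxylase subtypes the lookup returns both diastereomers, which matches the abstract's statement that cupriachelin contains both l-threo and l-erythro β-OHAsp.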

