Transporter Protein-Guided Genome Mining for Head-to-Tail Cyclized Bacteriocins

Head-to-tail cyclized bacteriocins are ribosomally synthesized antimicrobial peptides that are defined by peptide backbone cyclization involving the N- and C-terminal amino acids. Their cyclic nature and overall three-dimensional fold confer superior stability against extreme pH and temperature conditions, and against protease degradation. Most of the characterized head-to-tail cyclized bacteriocins were discovered through a traditional approach that involved screening bacterial isolates for antimicrobial activity, followed by isolation and characterization of the active molecule. In this study, we performed genome mining using transporter protein sequences associated with experimentally validated head-to-tail cyclized bacteriocins as driver sequences to search for novel bacteriocins. Biosynthetic gene cluster analysis was then performed to select high-probability functional gene clusters. A total of 387 producer strains that encode putative head-to-tail cyclized bacteriocins were identified. Sequence and phylogenetic analyses revealed that this class of bacteriocins is more diverse than previously thought. Furthermore, our genome mining strategy captured hits that were not identified in precursor-based bioprospecting, showcasing the utility of this approach for expanding the repertoire of head-to-tail cyclized bacteriocins. This work sets the stage for future isolation of novel head-to-tail cyclized bacteriocins to serve as possible alternatives to traditional antibiotics and potentially help address the increasing threat posed by resistant pathogens.
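The two-step idea in this abstract (find transporter homologs, then inspect the surrounding genes) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `Gene` record, the `candidate_clusters` function, the flank of five genes, and the 120-residue cutoff for a possible precursor peptide are hypothetical and are not taken from the study, which does not state its selection criteria in the abstract.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Gene:
    """One gene in genome order (hypothetical record, for illustration only)."""
    name: str
    length_aa: int
    is_transporter_hit: bool = False  # True if it matched a driver transporter sequence


def candidate_clusters(
    genes: List[Gene],
    flank: int = 5,               # assumed neighbourhood size, in genes
    max_precursor_aa: int = 120,  # assumed upper length for a precursor peptide
) -> List[List[Gene]]:
    """Return the neighbourhood of each transporter hit that also holds a short ORF."""
    clusters = []
    for i, gene in enumerate(genes):
        if not gene.is_transporter_hit:
            continue
        start = max(0, i - flank)
        neighbourhood = genes[start:i + flank + 1]
        # Keep the neighbourhood only if it contains a short, non-transporter ORF
        # that could plausibly encode a bacteriocin precursor.
        has_short_orf = any(
            g.length_aa <= max_precursor_aa and not g.is_transporter_hit
            for g in neighbourhood
        )
        if has_short_orf:
            clusters.append(neighbourhood)
    return clusters
```

A real pipeline would obtain the transporter hits from a homology search and would apply further checks on the neighbouring genes; the sketch only shows the shape of a transporter-first search followed by a neighbourhood filter.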

Biosynthetic potential of uncultured Antarctic soil bacteria revealed through long-read metagenomic sequencing

The ISME Journal ◽

10.1038/s41396-021-01052-3 ◽

2021 ◽

Author(s):

Valentin Waschulin ◽

Chiara Borsetto ◽

Robert James ◽

Kevin K. Newsham ◽

Stefano Donadio ◽

...

Keyword(s):

Genome Mining ◽

Gene Clusters ◽

Biosynthetic Gene Cluster ◽

Full Length ◽

Metagenomic Sequencing ◽

Short Read ◽

Short Read Sequencing ◽

Rich Diversity ◽

Long Read ◽

The Rich

The growing problem of antibiotic resistance has led to the exploration of uncultured bacteria as potential sources of new antimicrobials. PCR amplicon analyses and short-read sequencing studies of samples from different environments have reported evidence of high biosynthetic gene cluster (BGC) diversity in metagenomes, indicating their potential for producing novel and useful compounds. However, recovering full-length BGC sequences from uncultivated bacteria remains a challenge due to the technological restraints of short-read sequencing, thus making assessment of BGC diversity difficult. Here, long-read sequencing and genome mining were used to recover >1400 mostly full-length BGCs that demonstrate the rich diversity of BGCs from uncultivated lineages present in soil from Mars Oasis, Antarctica. A large number of highly divergent BGCs were not only found in the phyla Acidobacteriota, Verrucomicrobiota and Gemmatimonadota but also in the actinobacterial classes Acidimicrobiia and Thermoleophilia and the gammaproteobacterial order UBA7966. The latter furthermore contained a potential novel family of RiPPs. Our findings underline the biosynthetic potential of underexplored phyla as well as unexplored lineages within seemingly well-studied producer phyla. They also showcase long-read metagenomic sequencing as a promising way to access the untapped genetic reservoir of specialised metabolite gene clusters of the uncultured majority of microbes.

Genomic analysis of siderophore β-hydroxylases reveals divergent stereocontrol and expands the condensation domain family

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1903161116 ◽

2019 ◽

Vol 116 (40) ◽

pp. 19805-19814 ◽

Cited By ~ 8

Author(s):

Zachary L. Reitz ◽

Clifford D. Hardy ◽

Jaewon Suk ◽

Jean Bouvet ◽

Alison Butler

Keyword(s):

Predictive Power ◽

Genome Mining ◽

Genomic Analysis ◽

Gene Clusters ◽

Biosynthetic Gene Cluster ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Peptide Synthetase ◽

Condensation Domain

Genome mining of biosynthetic pathways streamlines discovery of secondary metabolites but can leave ambiguities in the predicted structures, which must be rectified experimentally. Through coupling the reactivity predicted by biosynthetic gene clusters with verified structures, the origin of the β-hydroxyaspartic acid diastereomers in siderophores is reported herein. Two functional subtypes of nonheme Fe(II)/α-ketoglutarate–dependent aspartyl β-hydroxylases are identified in siderophore biosynthetic gene clusters, which differ in genomic organization—existing either as fused domains (IβHAsp) at the carboxyl terminus of a nonribosomal peptide synthetase (NRPS) or as stand-alone enzymes (TβHAsp)—and each directs opposite stereoselectivity of Asp β-hydroxylation. The predictive power of this subtype delineation is confirmed by the stereochemical characterization of β-OHAsp residues in pyoverdine GB-1, delftibactin, histicorrugatin, and cupriachelin. The l-threo (2S, 3S) β-OHAsp residues of alterobactin arise from hydroxylation by the β-hydroxylase domain integrated into NRPS AltH, while l-erythro (2S, 3R) β-OHAsp in delftibactin arises from the stand-alone β-hydroxylase DelD. Cupriachelin contains both l-threo and l-erythro β-OHAsp, consistent with the presence of both types of β-hydroxylases in the biosynthetic gene cluster. A third subtype of nonheme Fe(II)/α-ketoglutarate–dependent enzymes (IβHHis) hydroxylates histidyl residues with l-threo stereospecificity. A previously undescribed, noncanonical member of the NRPS condensation domain superfamily is identified, named the interface domain, which is proposed to position the β-hydroxylase and the NRPS-bound amino acid prior to hydroxylation. Through mapping characterized β-OHAsp diastereomers to the phylogenetic tree of siderophore β-hydroxylases, methods to predict β-OHAsp stereochemistry in silico are realized.
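The subtype-to-stereochemistry rule described in this abstract can be written as a small lookup in Python. This is only a sketch: the dictionary, the ASCII subtype keys (`IbHAsp`, `TbHAsp`, `IbHHis`), and both function names are hypothetical, and the mapping simply restates the assignments given in the abstract (integrated domain gives l-threo, stand-alone enzyme gives l-erythro). The paper's actual prediction relies on phylogenetic placement, which is not reproduced here.

```python
from typing import Iterable, List, Tuple

# Assignments restated from the abstract; keys are ASCII stand-ins for the
# subtype names (IβHAsp, TβHAsp, IβHHis) and are illustrative only.
PREDICTED_STEREOCHEMISTRY = {
    "IbHAsp": ("Asp", "l-threo (2S, 3S)"),    # domain fused to an NRPS
    "TbHAsp": ("Asp", "l-erythro (2S, 3R)"),  # stand-alone enzyme
    "IbHHis": ("His", "l-threo"),             # histidyl β-hydroxylase
}


def predict_residue(subtype: str) -> Tuple[str, str]:
    """Return (hydroxylated residue, predicted diastereomer) for one subtype."""
    try:
        return PREDICTED_STEREOCHEMISTRY[subtype]
    except KeyError:
        raise ValueError(f"unknown β-hydroxylase subtype: {subtype!r}") from None


def predict_cluster(subtypes: Iterable[str]) -> List[Tuple[str, str]]:
    """Return the distinct predictions for all hydroxylases found in one cluster."""
    return sorted({predict_residue(s) for s in subtypes})
```

For a cluster that encodes both an integrated and a stand-alone aspartyl β-hydroxylase, as the abstract reports for cupriachelin, `predict_cluster(["IbHAsp", "TbHAsp"])` returns both Asp diastereomers.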

Characterization of the Nodularin Synthetase Gene Cluster and Proposed Theory of the Evolution of Cyanobacterial Hepatotoxins

Applied and Environmental Microbiology ◽

10.1128/aem.70.11.6353-6362.2004 ◽

2004 ◽

Vol 70 (11) ◽

pp. 6353-6362 ◽

Cited By ~ 169

Author(s):

Michelle C. Moffitt ◽

Brett A. Neilan

Keyword(s):

Gene Cluster ◽

Rational Design ◽

Polyketide Synthase ◽

Alternative Hypothesis ◽

Phylogenetic Analyses ◽

Cyanobacterial Bloom ◽

Gene Clusters ◽

Biosynthetic Gene Cluster ◽

Biosynthetic Gene ◽

Microcystin Synthetase

Nodularia spumigena is a bloom-forming cyanobacterium which produces the hepatotoxin nodularin. The complete gene cluster encoding the enzymatic machinery required for the biosynthesis of nodularin in N. spumigena strain NSOR10 was sequenced and characterized. The 48-kb gene cluster consists of nine open reading frames (ORFs), ndaA to ndaI, which are transcribed from a bidirectional regulatory promoter region and encode nonribosomal peptide synthetase modules, polyketide synthase modules, and tailoring enzymes. The ORFs flanking the nda gene cluster in the genome of N. spumigena strain NSOR10 were identified, and one of them was found to encode a protein with homology to previously characterized transposases. Putative transposases are also associated with the structurally related microcystin synthetase (mcy) gene clusters derived from three cyanobacterial strains, indicating a possible mechanism for the distribution of these biosynthetic gene clusters between various cyanobacterial genera. We propose an alternative hypothesis for hepatotoxin evolution in cyanobacteria based on the results of comparative and phylogenetic analyses of the nda and mcy gene clusters. These analyses suggested that nodularin synthetase evolved from a microcystin synthetase progenitor. The identification of the nodularin biosynthetic gene cluster and evolution of hepatotoxicity in cyanobacteria reported in this study may be valuable for future studies on toxic cyanobacterial bloom formation. In addition, an appreciation of the natural evolution of nonribosomal biosynthetic pathways will be vital for future combinatorial engineering and rational design of novel metabolites and pharmaceuticals.

A Deep Learning Genome-Mining Strategy Improves Biosynthetic Gene Cluster Prediction

10.1101/500694 ◽

2018 ◽

Author(s):

Geoffrey D. Hannigan ◽

David Prihoda ◽

Andrej Palicka ◽

Jindrich Soukup ◽

Ondrej Klempir ◽

...

Keyword(s):

Natural Products ◽

Deep Learning ◽

Learning Strategy ◽

Genome Mining ◽

Gene Clusters ◽

Biosynthetic Gene Cluster ◽

Biosynthetic Gene ◽

Antimicrobial Drugs ◽

Drug Candidates ◽

Significant Step

Natural products represent a rich reservoir of small molecule drug candidates utilized as antimicrobial drugs, anticancer therapies, and immunomodulatory agents. These molecules are microbial secondary metabolites synthesized by co-localized genes termed Biosynthetic Gene Clusters (BGCs). The increase in full microbial genomes and similar resources has led to development of BGC prediction algorithms, although their precision and ability to identify novel BGC classes could be improved. Here we present a deep learning strategy (DeepBGC) that offers more accurate BGC identification and an improved ability to extrapolate and identify novel BGC classes compared to existing tools. We supplemented this with downstream random forest classifiers that accurately predicted BGC product classes and potential chemical activity. Application of DeepBGC to bacterial genomes uncovered previously undetectable BGCs that may code for natural products with novel biologic activities. The improved accuracy and classification ability of DeepBGC represent a significant step forward for in silico BGC identification.

MIBiG 2.0: a repository for biosynthetic gene clusters of known function

Nucleic Acids Research ◽

10.1093/nar/gkz882 ◽

2019 ◽

Cited By ~ 31

Author(s):

Satria A Kautsar ◽

Kai Blin ◽

Simon Shaw ◽

Jorge C Navarro-Muñoz ◽

Barbara R Terlouw ◽

...

Keyword(s):

Genome Mining ◽

Gene Clusters ◽

Biosynthetic Gene Cluster ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Data Schema ◽

Cluster Data ◽

Structure Databases ◽

And Storage

Fueled by the explosion of (meta)genomic data, genome mining of specialized metabolites has become a major technology for drug discovery and studying microbiome ecology. In these efforts, computational tools like antiSMASH have played a central role through the analysis of Biosynthetic Gene Clusters (BGCs). Thousands of candidate BGCs from microbial genomes have been identified and stored in public databases. Interpreting the function and novelty of these predicted BGCs requires comparison with a well-documented set of BGCs of known function. The MIBiG (Minimum Information about a Biosynthetic Gene Cluster) Data Standard and Repository was established in 2015 to enable curation and storage of known BGCs. Here, we present MIBiG 2.0, which encompasses major updates to the schema, the data, and the online repository itself. Over the past five years, 851 new BGCs have been added. Additionally, we performed extensive manual data curation of all entries to improve the annotation quality of our repository. We also redesigned the data schema to ensure the compliance of future annotations. Finally, we improved the user experience by adding new features such as query searches and a statistics page, and enabled direct link-outs to chemical structure databases. The repository is accessible online at https://mibig.secondarymetabolites.org/.

New Kendomycin Derivative Isolated from Streptomyces sp. Cl 58-27

Molecules ◽

10.3390/molecules26226834 ◽

2021 ◽

Vol 26 (22) ◽

pp. 6834

Author(s):

Constanze Paulus ◽

Oleksandr Gromyko ◽

Andriy Luzhetskyy

Keyword(s):

Nmr Spectroscopy ◽

Natural Product ◽

Gene Clusters ◽

Biosynthetic Gene Cluster ◽

Biosynthetic Gene ◽

Streptomyces Sp ◽

Bioinformatic Tools ◽

Isolation And Characterization ◽

Metabolism Gene

In the course of screening new streptomycete strains, the strain Streptomyces sp. Cl 58-27 caught our attention due to its interesting secondary metabolite production profile. Here, we report the isolation and characterization of an ansamycin natural product that belongs structurally to the already known kendomycins. The structure of the new kendomycin E was elucidated using NMR spectroscopy, and the corresponding biosynthetic gene cluster was identified by sequencing the genome of Streptomyces sp. Cl 58-27 and conducting a detailed analysis of secondary metabolism gene clusters using bioinformatic tools.

Global Genome Mining Reveals the Distribution of Diverse Thioamidated RiPP Biosynthesis Gene Clusters

Frontiers in Microbiology ◽

10.3389/fmicb.2021.635389 ◽

2021 ◽

Vol 12 ◽

Author(s):

Jessie James Limlingan Malit ◽

Chuanhai Wu ◽

Ling-Li Liu ◽

Pei-Yuan Qian

Keyword(s):

Protein Pair ◽

Genome Mining ◽

Gene Clusters ◽

Cytotoxic Activities ◽

Related Gene ◽

Peptide Backbone ◽

Precursor Peptide ◽

Wide Range ◽

Biosynthesis Gene ◽

Modified Peptides

Thioamidated ribosomally synthesized and post-translationally modified peptides (RiPPs) are recently characterized natural products with a wide range of potent bioactivities, such as antibiotic, antiproliferative, and cytotoxic activities. These peptides are distinguished by the presence of thioamide bonds in the peptide backbone, whose formation is catalyzed by the YcaO-TfuA protein pair, which is encoded by adjacent genes. Genome mining has facilitated an in silico approach to identify biosynthesis gene clusters (BGCs) responsible for thioamidated RiPP production. In this work, publicly available genomic data were used to detect and illustrate the diversity of putative BGCs encoding thioamidated RiPPs. AntiSMASH and RiPPER analysis identified 613 unique TfuA-related gene cluster families (GCFs) and 797 precursor peptide families, even in phyla where the presence of these clusters has not been previously described. Several additional biosynthesis genes are colocalized with the detected BGCs, suggesting an array of possible chemical modifications. This study shows that thioamidated RiPPs occupy a widely unexplored chemical landscape.

Deep-BGCpred: A unified deep learning genome-mining framework for biosynthetic gene cluster prediction

10.1101/2021.11.15.468547 ◽

2021 ◽

Author(s):

Ziyi Yang ◽

Benben Liao ◽

Changyu Hsieh ◽

Chao Han ◽

Liang Fang ◽

...

Keyword(s):

Natural Products ◽

Deep Learning ◽

High Throughput Sequencing ◽

Short Term Memory ◽

Genome Mining ◽

Gene Clusters ◽

Biosynthetic Gene Cluster ◽

Bioactive Molecules ◽

Dual Model ◽

Biosynthetic Gene

Natural products produced by microorganisms constitute an important source of essential pharmaceuticals, including antimicrobial and anti-tumor drugs. These bioactive molecules are microbial secondary metabolites synthesized by co-localized genes termed Biosynthetic Gene Clusters (BGCs). The rapid increase of microbial genomics resources, due to the availability of high-throughput sequencing technologies, has spurred the development of computational methods for microbial genome mining for BGC discovery. Current machine learning methods, however, have had limited success in uncovering novel BGCs due to an excessive number of false positives in their predictions. To this end, we propose Deep-BGCpred, a framework that addresses this issue by improving a deep learning model termed DeepBGC. The new model embeds multi-source protein family domains and employs a stacked Bidirectional Long Short-Term Memory model to boost accuracy for BGC identification. In particular, it integrates two customized strategies, a sliding window strategy and dual-model serial screening, to improve the model's performance stability and reduce the number of false positives in BGC predictions. We compare the proposed model against other well-established methods on common benchmarks and achieve new state-of-the-art results with convincing evidence. We expect that researchers working on genome mining for natural products may benefit greatly from our newly proposed method, Deep-BGCpred.
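The abstract names a sliding window strategy without describing it, so the following Python sketch only shows a generic sliding window over an ordered list of protein family domains. The function name, the window size, and the stride are assumptions for illustration and do not describe how Deep-BGCpred actually defines or scores its windows.

```python
from typing import Iterator, Sequence, Tuple


def sliding_windows(
    domains: Sequence[str],
    size: int,    # assumed window length, in domains
    stride: int,  # assumed step between window starts
) -> Iterator[Tuple[int, Sequence[str]]]:
    """Yield (start index, window) pairs that together cover the whole sequence."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    if not domains:
        return
    if len(domains) <= size:
        # A short sequence fits into a single window.
        yield 0, domains
        return
    last_start = len(domains) - size
    for start in range(0, last_start + 1, stride):
        yield start, domains[start:start + size]
    if last_start % stride != 0:
        # Add a final window so the tail of the sequence is not skipped.
        yield last_start, domains[last_start:]
```

In a setting like the one described, each window would be passed to a trained model and the per-window scores combined; that scoring step is outside the scope of this sketch.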

Reconstitution of polythioamide antibiotic backbone formation reveals unusual thiotemplated assembly strategy

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1918759117 ◽

2020 ◽

Vol 117 (16) ◽

pp. 8850-8858

Author(s):

Kyle L. Dunbar ◽

Maria Dell ◽

Finn Gude ◽

Christian Hertweck

Keyword(s):

Secondary Metabolites ◽

Anaerobic Bacteria ◽

Gene Clusters ◽

Biosynthetic Gene Cluster ◽

Biosynthetic Gene ◽

Peptide Backbone ◽

Nonribosomal Peptide ◽

Bacterial Phyla ◽

Biochemical Assays

Closthioamide (CTA) is a rare example of a thioamide-containing nonribosomal peptide and is one of only a handful of secondary metabolites described from obligately anaerobic bacteria. Although the biosynthetic gene cluster responsible for CTA production and the thioamide synthetase that catalyzes sulfur incorporation were recently discovered, the logic for peptide backbone assembly has remained a mystery. Here, through the use of in vitro biochemical assays, we demonstrate that the amide backbone of CTA is assembled in an unusual thiotemplated pathway involving the cooperation of a transacylating member of the papain-like cysteine protease family and an iteratively acting ATP-grasp protein. Using the ATP-grasp protein as a bioinformatic handle, we identified hundreds of such thiotemplated yet nonribosomal peptide synthetase (NRPS)-independent biosynthetic gene clusters across diverse bacterial phyla. The data presented herein not only clarify the pathway for the biosynthesis of CTA, but also provide a foundation for the discovery of additional secondary metabolites produced by noncanonical biosynthetic pathways.

Recapitulation of the evolution of biosynthetic gene clusters reveals hidden chemical diversity on bacterial genomes

10.1101/020503 ◽

2015 ◽

Cited By ~ 6

Author(s):

Pablo Cruz-Morales ◽

Christian E. Martínez-Guerrero ◽

Marco A. Morales-Escalante ◽

Luis Yáñez-Guerra ◽

Johannes Florian Kopp ◽

...

Keyword(s):

Natural Products ◽

Chemical Space ◽

Streptomyces Coelicolor ◽

Genome Mining ◽

Gene Clusters ◽

Biosynthetic Gene Cluster ◽

Chemical Diversity ◽

Biosynthetic Gene ◽

Bacterial Genomes ◽

Biosynthetic Gene Clusters

Natural products have provided humans with antibiotics for millennia. However, a decline in the pace of chemical discovery exerts pressure on human health as antibiotic resistance spreads. The empirical nature of current genome mining approaches used for natural products research limits the chemical space that is explored. By integration of evolutionary concepts related to emergence of metabolism, we have gained fundamental insights that are translated into an alternative genome mining approach, termed EvoMining. As the founding assumption of EvoMining is the evolution of enzymes, we solved two milestone problems revealing unprecedented conversions. First, we report the biosynthetic gene cluster of the ‘orphan’ metabolite leupeptin in Streptomyces roseus. Second, we discover an enzyme involved in formation of an arsenic-carbon bond in Streptomyces coelicolor and Streptomyces lividans. This work provides evidence that bacterial chemical repertoire is underexploited, as well as an approach to accelerate the discovery of novel antibiotics from bacterial genomes.
