The Tripod for Bacterial Natural Product Discovery: Genome Mining, Silent Pathway Induction, and Mass Spectrometry-Based Molecular Networking

ABSTRACT Natural products are the richest source of chemical compounds for drug discovery. Bacterial secondary metabolites in particular are in the spotlight owing to advances in genome sequencing and mining, and to the potential of biosynthetic pathway manipulation to awaken silent (cryptic) gene clusters under laboratory cultivation. Further progress in compound detection, such as the development of the tandem mass spectrometry (MS/MS) molecular networking approach, has contributed to the discovery of novel bacterial natural products. Molecular networking can be applied directly to bacterial crude extracts to identify and dereplicate known compounds, which helps prioritize extracts containing novel natural products. In our opinion, these three approaches (genome mining, silent pathway induction, and MS-based molecular networking) compose the tripod for modern bacterial natural product discovery and will be discussed in this perspective.
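
The molecular networking idea mentioned in this abstract can be illustrated with a minimal sketch: spectra are compared pairwise, and pairs whose similarity passes a cutoff become edges in a network. The plain cosine score, the 1 Da bin width, the 0.7 threshold, and the spectrum names and peak lists below are all assumptions made for illustration; published workflows generally use more elaborate scoring and are not reproduced here.

```python
import math
from itertools import combinations


def bin_spectrum(peaks, bin_width=1.0):
    """Collapse (m/z, intensity) peaks into fixed-width m/z bins."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz // bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned


def cosine(a, b):
    """Cosine similarity between two binned spectra stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_network(spectra, threshold=0.7):
    """Return (name_a, name_b, score) edges for pairs at or above the threshold."""
    binned = {name: bin_spectrum(peaks) for name, peaks in spectra.items()}
    edges = []
    for x, y in combinations(sorted(binned), 2):
        score = cosine(binned[x], binned[y])
        if score >= threshold:
            edges.append((x, y, round(score, 3)))
    return edges


# Hypothetical toy spectra: one known compound, one analog, one unrelated.
spectra = {
    "known_A": [(100.1, 50.0), (150.2, 100.0), (200.3, 30.0)],
    "analog_A": [(100.2, 45.0), (150.1, 90.0), (214.3, 25.0)],
    "unrelated": [(300.5, 80.0), (410.7, 60.0)],
}
edges = build_network(spectra)
print(edges)
```

In this toy case only the known compound and its analog share fragment bins, so they are the only pair connected. An unknown spectrum that clusters with a library-matched spectrum in this way is a candidate analog, which is the basis for dereplication.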

Coupling Mass Spectral and Genomic Information to Improve Bacterial Natural Product Discovery Workflows

Marine Drugs ◽

10.3390/md19030142 ◽

2021 ◽

Vol 19 (3) ◽

pp. 142 ◽

Cited By ~ 1

Author(s):

Max Crüsemann

Keyword(s):

Mass Spectrometry ◽

Natural Products ◽

Natural Product ◽

Large Scale ◽

Genome Mining ◽

Structural Diversity ◽

Gene Clusters ◽

Genomic Information ◽

Mass Spectral ◽

Natural Product Discovery

Bacterial natural products possess potent bioactivities and high structural diversity and are typically encoded in biosynthetic gene clusters. Traditional natural product discovery approaches rely on UV- and bioassay-guided fractionation and are limited in terms of dereplication. Recent advances in mass spectrometry, sequencing and bioinformatics have led to large-scale accumulation of genomic and mass spectral data. These data are increasingly used in signature-based or correlation-based mass spectrometry genome mining approaches, which enable rapid linking of metabolomic and genomic information to accelerate and rationalize natural product discovery. In this mini-review, these approaches are presented and discovery examples are provided. Finally, future opportunities and challenges for paired omics-based natural products discovery workflows are discussed.
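
A correlation-based link of the kind described here can be sketched as a comparison of strain presence/absence patterns: a gene cluster family and a molecular family found in the same set of strains are a plausible pair. The Jaccard index used as the score, and all identifiers and strain sets below, are hypothetical choices for illustration and are not taken from the review.

```python
def jaccard(a, b):
    """Jaccard index of two strain sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_links(gcf_strains, mf_strains):
    """Score every (gene cluster family, molecular family) pair, best first."""
    links = []
    for gcf, g_set in gcf_strains.items():
        for mf, m_set in mf_strains.items():
            links.append((gcf, mf, jaccard(g_set, m_set)))
    return sorted(links, key=lambda item: item[2], reverse=True)


# Hypothetical data: strains carrying each gene cluster family (GCF)
# and strains in which each molecular family (MF) was detected.
gcf_strains = {"GCF_1": {"s1", "s2", "s3"}, "GCF_2": {"s4"}}
mf_strains = {"MF_a": {"s1", "s2", "s3"}, "MF_b": {"s2", "s4", "s5"}}

ranked = rank_links(gcf_strains, mf_strains)
for gcf, mf, score in ranked:
    print(gcf, mf, round(score, 2))
```

The top-ranked pair is the one whose strain distributions coincide exactly. Such a score only suggests a hypothesis; shared phylogeny can produce the same pattern without any biosynthetic relationship.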

Synthetic Biology Advanced Natural Product Discovery

Metabolites ◽

10.3390/metabo11110785 ◽

2021 ◽

Vol 11 (11) ◽

pp. 785

Author(s):

Junyang Wang ◽

Jens Nielsen ◽

Zihe Liu

Keyword(s):

Natural Products ◽

Synthetic Biology ◽

Natural Product ◽

Rapid Development ◽

Genome Mining ◽

Gene Clusters ◽

Future Research ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Natural Product Discovery

A wide variety of bacteria, fungi and plants can produce bioactive secondary metabolites, which are often referred to as natural products. With the rapid development of DNA sequencing technology and bioinformatics, a large number of putative biosynthetic gene clusters have been reported. However, only a limited number of natural products have been discovered, as most biosynthetic gene clusters are not expressed or are expressed at extremely low levels under conventional laboratory conditions. With the rapid development of synthetic biology, advanced genome mining and engineering strategies have been reported that provide new opportunities for the discovery of natural products. This review discusses recent advances that can accelerate the design, build, test, and learn (DBTL) cycle of natural product discovery, and outlines trends and key challenges for future research.

A supervised fingerprint-based strategy to connect natural product mass spectrometry fragmentation data to their biosynthetic gene clusters

10.1101/2021.10.05.463235 ◽

2021 ◽

Author(s):

Tiago F. Leao ◽

Mingxun Wang ◽

Ricardo da Silva ◽

Justin J.J. van der Hooft ◽

Anelize Bauermeister ◽

...

Keyword(s):

Mass Spectrometry ◽

Natural Products ◽

Natural Product ◽

Nearest Neighbor ◽

Gene Clusters ◽

Supervised Machine Learning ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Biosynthetic Genes ◽

Natural Product Discovery

ABSTRACT Microbial natural products, in particular secondary or specialized metabolites, are an important source and inspiration for many pharmaceutical and biotechnological products. However, bioactivity-guided methods widely employed in natural product discovery programs do not explore the full biosynthetic potential of microorganisms, and they usually miss metabolites that are produced at low titer. As a complementary method, the use of genome-based mining in natural products research has facilitated the charting of many novel natural products in the form of predicted biosynthetic gene clusters that encode their production. Linking the biosynthetic potential inferred from genomics to the specialized metabolome measured by metabolomics would accelerate natural product discovery programs. Here, we applied a supervised machine learning approach, the K-Nearest Neighbor (KNN) classifier, for systematically connecting metabolite mass spectrometry data to their biosynthetic gene clusters. This pipeline offers a method for annotating the biosynthetic genes for known, analogous-to-known, and cryptic metabolites that are detected via mass spectrometry. We demonstrate this approach by automated linking of six different natural product mass spectra, and their analogs, to their corresponding biosynthetic genes. Our approach can be applied to bacterial, fungal, algal and plant systems where genomes are paired with corresponding MS/MS spectra. Additionally, an approach that connects known metabolites to their biosynthetic genes potentially allows for bulk production via heterologous expression, and it is especially useful for cases where the metabolites are produced in low amounts in the original producer.

SIGNIFICANCE The pace of natural products discovery has remained relatively constant over the last two decades. At the same time, there is an urgent need to find new therapeutics to fight antibiotic-resistant bacteria, cancer, tropical parasites, pathogenic viruses, and other severe diseases. To spark the enhanced discovery of structurally novel and bioactive natural products, we here introduce a supervised learning algorithm (K-Nearest Neighbor) that can connect MS/MS spectra of known compounds, of analogs of known compounds, and of as-yet-unknown compounds to their corresponding biosynthetic gene clusters. Our Natural Products Mixed Omics tool provides access to genomic information for bioactivity prediction, class prediction, substrate predictions, and stereochemistry predictions to prioritize relevant metabolite products and facilitate their structural elucidation.
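
The K-Nearest Neighbor step named in this abstract can be sketched in a few lines: a query is assigned the label held by the majority of its k closest training examples. The abstract does not state which features or distance the authors use, so the binary fingerprints (sets of feature names), the Jaccard distance, the labels, and k = 3 below are assumptions for illustration only.

```python
from collections import Counter


def jaccard_distance(a, b):
    """1 minus the Jaccard index of two feature sets."""
    union = a | b
    return 1.0 - (len(a & b) / len(union)) if union else 0.0


def knn_predict(query, training, k=3):
    """Return the majority label among the k training items nearest to query."""
    nearest = sorted(
        training, key=lambda item: jaccard_distance(query, item[0])
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training set: (fingerprint, gene cluster family label).
training = [
    ({"f1", "f2", "f3"}, "GCF_nrps"),
    ({"f1", "f2", "f4"}, "GCF_nrps"),
    ({"f2", "f3", "f5"}, "GCF_nrps"),
    ({"f7", "f8", "f9"}, "GCF_pks"),
    ({"f7", "f8", "f10"}, "GCF_pks"),
]

# Hypothetical fingerprint derived from an unannotated MS/MS spectrum.
query = {"f1", "f2", "f5"}
predicted = knn_predict(query, training, k=3)
print(predicted)
```

Because the prediction is a vote among neighbors, its reliability depends on how well the training set covers the chemistry of the query; a spectrum from a class absent from the training data is still assigned to some label.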

Genomic Assemblies of Members of Burkholderia and Related Genera as a Resource for Natural Product Discovery

Microbiology Resource Announcements ◽

10.1128/mra.00485-20 ◽

2020 ◽

Vol 9 (42) ◽

Author(s):

Alex J. Mullins ◽

Cerith Jones ◽

Matthew J. Bull ◽

Gordon Webster ◽

Julian Parkhill ◽

...

Keyword(s):

Natural Product ◽

Genome Mining ◽

Genomic Analysis ◽

Gene Clusters ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Natural Product Discovery

ABSTRACT The genomes of 450 members of Burkholderiaceae, isolated from clinical and environmental sources, were sequenced and assembled as a resource for genome mining. Genomic analysis of the collection has enabled the identification of multiple metabolites and their biosynthetic gene clusters, including the antibiotics gladiolin, icosalide A, enacyloxin, and cepacin A.

A Multi-Omics Characterization of the Natural Product Potential of Tropical Filamentous Marine Cyanobacteria

Marine Drugs ◽

10.3390/md19010020 ◽

2021 ◽

Vol 19 (1) ◽

pp. 20

Author(s):

Tiago Leão ◽

Mingxun Wang ◽

Nathan Moss ◽

Ricardo da Silva ◽

Jon Sanders ◽

...

Keyword(s):

Natural Products ◽

Natural Product ◽

Genome Mining ◽

Gene Clusters ◽

Microbial Interactions ◽

Marine Cyanobacteria ◽

Pharmaceutical Drugs ◽

Genes Encoding ◽

Microbial Natural Products

Microbial natural products are important for the understanding of microbial interactions, chemical defense and communication, and have also served as an inspirational source for numerous pharmaceutical drugs. Tropical marine cyanobacteria have been highlighted as a great source of new natural products; however, few reports have appeared in which a multi-omics approach has been used to study their natural products potential (i.e., reports are often focused on an individual natural product and its biosynthesis). This study focuses on describing the natural product genetic potential as well as the expressed natural product molecules in benthic tropical cyanobacteria. We collected 24 tropical filamentous marine cyanobacteria from several sites around the world and sequenced their genomes. The informatics program antiSMASH was used to annotate the major classes of gene clusters. BiG-SCAPE phylum-wide analysis revealed the most promising strains for natural product discovery among these cyanobacteria. LC-MS/MS-based metabolomics highlighted the most abundant molecules and molecular classes among 10 of these marine cyanobacterial samples. We observed that, despite many genes encoding peptidic natural products, peptides were not as abundant as lipids and lipopeptides in the chemical extracts. Our results highlight a number of highly interesting biosynthetic gene clusters for genome mining among these cyanobacterial samples.

Expansion of RiPP biosynthetic space through integration of pan-genomics and machine learning uncovers a novel class of lanthipeptides

PLoS Biology ◽

10.1371/journal.pbio.3001026 ◽

2020 ◽

Vol 18 (12) ◽

pp. e3001026

Author(s):

Alexander M. Kloosterman ◽

Peter Cimermancic ◽

Somayah S. Elsayed ◽

Chao Du ◽

Michalis Hadjithomas ◽

...

Keyword(s):

Natural Products ◽

Natural Product ◽

Sequence Similarity ◽

Genome Mining ◽

Gene Clusters ◽

Support Vector ◽

Chemical Labeling ◽

Product Families ◽

Anticancer Properties ◽

Modified Peptides

Microbial natural products constitute a wide variety of chemical compounds, many of which have antibiotic, antiviral, or anticancer properties that make them interesting for clinical purposes. Natural product classes include polyketides (PKs), nonribosomal peptides (NRPs), and ribosomally synthesized and post-translationally modified peptides (RiPPs). While variants of biosynthetic gene clusters (BGCs) for known classes of natural products are easy to identify in genome sequences, BGCs for new compound classes escape attention. In particular, evidence is accumulating that for RiPPs, subclasses known thus far may only represent the tip of an iceberg. Here, we present decRiPPter (Data-driven Exploratory Class-independent RiPP TrackER), a RiPP genome mining algorithm aimed at the discovery of novel RiPP classes. DecRiPPter combines a Support Vector Machine (SVM) that identifies candidate RiPP precursors with pan-genomic analyses to identify which of these are encoded within operon-like structures that are part of the accessory genome of a genus. Subsequently, it prioritizes such regions based on the presence of new enzymology and based on patterns of gene cluster and precursor peptide conservation across species. We then applied decRiPPter to mine 1,295 Streptomyces genomes, which led to the identification of 42 new candidate RiPP families that could not be found by existing programs. One of these was studied further and elucidated as a representative of a novel subfamily of lanthipeptides, which we designate class V. The 2D structure of the new RiPP, which we name pristinin A3 (1), was solved using nuclear magnetic resonance (NMR), tandem mass spectrometry (MS/MS) data, and chemical labeling. Two previously unidentified modifying enzymes are proposed to create the hallmark lanthionine bridges. Taken together, our work highlights how novel natural product families can be discovered by methods going beyond sequence similarity searches to integrate multiple pathway discovery criteria.
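
The pan-genomic filter described in this abstract, which keeps candidates that lie in the accessory genome of a genus, can be illustrated with a minimal sketch. It flags gene families that occur in only a fraction of the genomes examined. The 0.5 cutoff, the family names, and the genome identifiers are hypothetical, and the SVM precursor classifier and operon detection that decRiPPter also uses are not reproduced here.

```python
def accessory_families(presence, n_genomes, max_fraction=0.5):
    """Return gene families found in at most max_fraction of the genomes."""
    return sorted(
        family
        for family, genomes in presence.items()
        if len(genomes) / n_genomes <= max_fraction
    )


# Hypothetical presence table: gene family -> genomes that contain it.
presence = {
    "ribosomal_protein": {"g1", "g2", "g3", "g4"},
    "candidate_precursor": {"g2"},
    "novel_enzyme": {"g2", "g3"},
}

accessory = accessory_families(presence, n_genomes=4)
print(accessory)
```

Families present in every genome are treated as core and dropped, which removes housekeeping genes from the candidate list while retaining genes with a patchy, cluster-like distribution.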

BiG-SLiCE: A Highly Scalable Tool Maps the Diversity of 1.2 Million Biosynthetic Gene Clusters

10.1101/2020.08.17.240838 ◽

2020 ◽

Cited By ~ 3

Author(s):

Satria A. Kautsar ◽

Justin J. J. van der Hooft ◽

Dick de Ridder ◽

Marnix H. Medema

Keyword(s):

Natural Product ◽

Biological Activities ◽

Genome Mining ◽

Gene Clusters ◽

Genomic Diversity ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Microbial Genomes ◽

Natural Product Discovery ◽

User Friendly

ABSTRACT Background: Genome mining for Biosynthetic Gene Clusters (BGCs) has become an integral part of natural product discovery. The >200,000 microbial genomes now publicly available hold information on abundant novel chemistry. One way to navigate this vast genomic diversity is through comparative analysis of homologous BGCs, which allows identification of cross-species patterns that can be matched to the presence of metabolites or biological activities. However, current tools suffer from a bottleneck caused by the expensive network-based approach used to group these BGCs into Gene Cluster Families (GCFs). Results: Here, we introduce BiG-SLiCE, a tool designed to cluster massive numbers of BGCs. By representing them in Euclidean space, BiG-SLiCE can group BGCs into GCFs in a non-pairwise, near-linear fashion. We used BiG-SLiCE to analyze 1,225,071 BGCs collected from 209,206 publicly available microbial genomes and metagenome-assembled genomes (MAGs) within ten days on a typical 36-core CPU server. We demonstrate the utility of such analyses by reconstructing a global map of secondary metabolic diversity across taxonomy to identify uncharted biosynthetic potential. BiG-SLiCE also provides a "query mode" that can efficiently place newly sequenced BGCs into previously computed GCFs, plus a powerful output visualization engine that facilitates user-friendly data exploration. Conclusions: BiG-SLiCE opens up new possibilities to accelerate natural product discovery and offers a first step towards constructing a global, searchable interconnected network of BGCs. As more genomes get sequenced from understudied taxa, more information can be mined to highlight their potentially novel chemistry. BiG-SLiCE is available via https://github.com/medema-group/bigslice.
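
The "query mode" idea, placing a new BGC into an existing family without comparing it to every other BGC, can be sketched as a nearest-centroid lookup in Euclidean space. This is an illustrative simplification and not the BiG-SLiCE implementation: the three-dimensional feature vectors, the centroid values, the family names, and the distance cutoff are all hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def place_bgc(vector, centroids, max_distance):
    """Assign a BGC vector to its nearest family centroid.

    Returns (family, distance), or (None, distance) when the nearest
    centroid is farther away than max_distance.
    """
    best_family, best_distance = None, float("inf")
    for family, centroid in centroids.items():
        distance = euclidean(vector, centroid)
        if distance < best_distance:
            best_family, best_distance = family, distance
    if best_distance <= max_distance:
        return best_family, best_distance
    return None, best_distance


# Hypothetical precomputed family centroids.
centroids = {
    "GCF_001": [1.0, 0.0, 0.0],
    "GCF_002": [0.0, 1.0, 1.0],
}

hit = place_bgc([0.9, 0.1, 0.0], centroids, max_distance=0.5)
miss = place_bgc([5.0, 5.0, 5.0], centroids, max_distance=0.5)
print(hit, miss)
```

Each placement costs one pass over the centroids rather than a comparison against every stored BGC, which is the intuition behind the non-pairwise, near-linear scaling the abstract describes. A query that falls outside the cutoff for every family is reported as unassigned and may represent a new family.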

Imaging mass spectrometry for natural products discovery: a review of ionization methods

Natural Product Reports ◽

10.1039/c9np00038k ◽

2020 ◽

Vol 37 (2) ◽

pp. 150-162 ◽

Cited By ~ 10

Author(s):

Joseph E. Spraker ◽

Gordon T. Luu ◽

Laura M. Sanchez

Keyword(s):

Mass Spectrometry ◽

Natural Products ◽

Natural Product ◽

Imaging Mass Spectrometry ◽

Natural Product Discovery ◽

Ionization Sources ◽

Natural Products Discovery ◽

Ionization Methods

This mini review discusses advantages, limitations, and examples of different mass spectrometry ionization sources applicable to natural product discovery workflows.

A Single Biosynthetic Gene Cluster Is Responsible for the Production of Bagremycin Antibiotics and Ferroverdin Iron Chelators

mBio ◽

10.1128/mbio.01230-19 ◽

2019 ◽

Vol 10 (4) ◽

Cited By ~ 6

Author(s):

Loïc Martinet ◽

Aymeric Naômé ◽

Benoit Deflandre ◽

Marta Maciejewska ◽

Déborah Tellatin ◽

...

Keyword(s):

Natural Product ◽

Gene Cluster ◽

Biosynthetic Pathway ◽

Genome Mining ◽

Gene Clusters ◽

Bioactive Molecules ◽

Bioactive Metabolites ◽

Biosynthetic Gene ◽

Single Family ◽

Biosynthetic Gene Clusters

ABSTRACT Biosynthetic gene clusters (BGCs) are organized groups of genes involved in the production of specialized metabolites. Typically, one BGC is responsible for the production of one or several similar compounds with bioactivities that usually only vary in terms of strength and/or specificity. Here we show that the previously described ferroverdins and bagremycins, which are families of metabolites with different bioactivities, are produced from the same BGC, whereby the fate of the biosynthetic pathway depends on iron availability. Under conditions of iron depletion, the monomeric bagremycins are formed, representing amino-aromatic antibiotics resulting from the condensation of 3-amino-4-hydroxybenzoic acid with p-vinylphenol. Conversely, when iron is abundantly available, the biosynthetic pathway additionally produces a molecule based on p-vinylphenyl-3-nitroso-4-hydroxybenzoate, which complexes iron to form the trimeric ferroverdins that have anticholesterol activity. Thus, our work shows a unique exception to the concept that BGCs should only produce a single family of molecules with one type of bioactivity and that in fact different bioactive molecules may be produced depending on the environmental conditions. IMPORTANCE Access to whole-genome sequences has exposed the general incidence of the so-called cryptic biosynthetic gene clusters (BGCs), thereby renewing interest in them for natural product discovery. As a consequence, genome mining is often the first approach implemented to assess the potential of a microorganism for producing novel bioactive metabolites. By revealing a new level of complexity of natural product biosynthesis, we further illustrate the difficulty of estimating the panel of molecules associated with a BGC based on genomic information alone. Indeed, we found that the same gene cluster is responsible for the production of compounds which differ in terms of structure and bioactivity. The production of these different compounds responds to different environmental triggers, which suggests that testing multiple culture conditions is essential for revealing the entire panel of molecules made by a single BGC.

Computational Genomics of Specialized Metabolism: from Natural Product Discovery to Microbiome Ecology

mSystems ◽

10.1128/msystems.00182-17 ◽

2018 ◽

Vol 3 (2) ◽

Cited By ~ 14

Author(s):

Marnix H. Medema

Keyword(s):

Natural Product ◽

Genome Mining ◽

Gene Clusters ◽

Biosynthetic Gene Clusters ◽

Natural Product Discovery ◽

The Past ◽

Leading Role ◽

Large Sets ◽

Microbe Interactions ◽

Host Microbe Interactions

ABSTRACT Microbial and plant specialized metabolites, also known as natural products, are key mediators of microbe-microbe and host-microbe interactions and constitute a rich resource for drug development. In the past decade, genome mining has emerged as a prominent strategy for natural product discovery. Initially, such mining was performed on the basis of individual microbial genome sequences. Now, these efforts are being scaled up to fully genome-sequenced strain collections, pangenomes of bacterial genera, and large sets of metagenome-assembled genomes from microbial communities. The Medema research group aims to play a leading role in these developments by developing and applying computational approaches to identify, classify, and prioritize specialized metabolite biosynthetic gene clusters and pathways and to connect them to specific molecules and microbiome-associated phenotypes. Moreover, we are extending the scope of genome mining from microbes to plants, which will allow more comprehensive interpretation of the chemical language between hosts and microbes in a microbiome setting.
