Identification of a Tambjamine Gene Cluster in Streptomyces Reveals Convergent Evolution of the Biosynthetic Pathway

Bacterial natural products are an immensely valuable source of therapeutics. As modern DNA sequencing efforts provide increasing numbers of microbial genomes, it is clear that the molecules produced by most natural product biosynthetic gene clusters (BGCs) remain unknown. Genome mining makes use of bioinformatic techniques to elucidate the natural products produced by these “orphan” BGCs. Here, we report the use of sequence similarity networks (SSNs) and genome neighborhood networks (GNNs) to identify an orphan BGC that is responsible for the production of the antitumor tambjamine BE-18591 in Streptomyces albus NRRL B-2362. Although BE-18591 is a close structural analogue of tambjamine YP1 produced by Pseudoalteromonas tunicata, the biosynthetic routes to produce these molecules differ significantly. Notably, the C12-alkylamine tail that is appended onto the bipyrrole core of tambjamine YP1 is derived from fatty acids siphoned from the primary metabolism of the pseudoalteromonad, whilst the S. albus NRRL B-2362 BGC encodes a dedicated system for the de novo biosynthesis of the alkylamine portion of tambjamine BE-18591. These remarkably different biosynthetic strategies represent a striking example of convergent BGC evolution, with selective pressure for the production of tambjamines seemingly leading to the emergence of separate biosynthetic pathways in pseudoalteromonads and streptomycetes that ultimately produce closely related compounds.
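
The general idea behind a sequence similarity network can be illustrated with a minimal sketch, assuming pairwise similarity scores between proteins are already available. The protein names, the scores, and the 60% threshold are hypothetical and are not taken from the study; this is not the authors' SSN/GNN workflow, only an illustration of thresholding pairwise similarity and reading clusters off the resulting graph.

```python
def build_ssn(scores, threshold):
    """Build an undirected graph: nodes are proteins, edges join pairs
    whose similarity score is at or above the threshold."""
    graph = {}
    for (a, b), score in scores.items():
        # Register both proteins as nodes even if no edge is drawn.
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if score >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Return the clusters of the network as sorted lists of node names."""
    seen = set()
    clusters = []
    for node in sorted(graph):
        if node in seen:
            continue
        stack = [node]
        component = set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            component.add(current)
            stack.extend(graph[current] - seen)
        clusters.append(sorted(component))
    return clusters


# Hypothetical pairwise percent identities (illustrative values only).
scores = {
    ("protA", "protB"): 72.0,
    ("protB", "protC"): 65.0,
    ("protA", "protC"): 41.0,
    ("protC", "protD"): 30.0,
}

ssn = build_ssn(scores, threshold=60.0)
clusters = connected_components(ssn)
print(clusters)
```

Raising the threshold splits clusters into smaller, more closely related groups, while lowering it merges them; choosing that cutoff is a judgment call in any analysis of this kind.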

2020, DOI: 10.26434/chemrxiv.12899264

Author(s): Neil L Grenade, Dragos S. Chiriac, Graeme W. Howe, Avena Ross

Keyword(s): Natural Products, De Novo, Sequence Similarity, Genome Mining, Gene Clusters, Primary Metabolism, Biosynthetic Gene Clusters, Microbial Genomes, Sequence Similarity Networks, Related Compounds

Recent development of antiSMASH and other computational approaches to mine secondary metabolite biosynthetic gene clusters

Briefings in Bioinformatics, 2017, Vol 20 (4), pp. 1103-1113, DOI: 10.1093/bib/bbx146, Cited By ~ 37

Author(s): Kai Blin, Hyun Uk Kim, Marnix H Medema, Tilmann Weber

Keyword(s): Natural Products, Small Molecules, Sequence Similarity, Genome Mining, Gene Clusters, Biosynthetic Gene, Biosynthetic Gene Clusters, Rule Based, Chemical Structures, Annotation Quality

Many drugs are derived from small molecules produced by microorganisms and plants, so-called natural products. Natural products have diverse chemical structures, but the biosynthetic pathways producing those compounds are often organized as biosynthetic gene clusters (BGCs) and follow a highly conserved biosynthetic logic. This allows for the identification of core biosynthetic enzymes using genome mining strategies that are based on the sequence similarity of the involved enzymes/genes. However, mining for a variety of BGCs quickly approaches a complexity level where manual analyses are no longer possible and automated genome mining pipelines, such as the antiSMASH software, are required. In this review, we discuss the principles underlying the predictions of antiSMASH and other tools and provide practical advice for their application. Furthermore, we discuss important caveats such as rule-based BGC detection, sequence and annotation quality and cluster boundary prediction, which all have to be considered while planning for, performing and analyzing the results of genome mining studies.

Expanding the Natural Products Heterologous Expression Repertoire in the Model Cyanobacterium Anabaena sp. Strain PCC 7120: Production of Pendolmycin and Teleocidin B-4

2019, DOI: 10.26434/chemrxiv.11316098.v1, Cited By ~ 1

Author(s): Patrick Videau, Kaitlyn Wells, Arun Singh, Jessie Eiting, Philip Proteau, ...

Keyword(s): Natural Products, Genome Mining, Gene Clusters, Combinatorial Biosynthesis, Test Case, Biosynthetic Gene, Biosynthetic Gene Clusters, Cyanobacterium Anabaena, Anabaena Sp, Pcc 7120

Cyanobacteria are prolific producers of natural products, and genome mining has shown that many orphan biosynthetic gene clusters can be found in sequenced cyanobacterial genomes. New tools and methodologies are required to investigate these biosynthetic gene clusters, and here we present the use of Anabaena sp. strain PCC 7120 as a host for combinatorial biosynthesis of natural products using the indolactam natural products (lyngbyatoxin A, pendolmycin, and teleocidin B-4) as a test case. We were able to successfully produce all three compounds using codon-optimized genes from Actinobacteria. We also introduce a new plasmid backbone based on the native Anabaena 7120 plasmid pCC7120ζ and show that production of teleocidin B-4 can be accomplished using a two-plasmid system, which can be introduced by co-conjugation.

Glutamic acid is a carrier for hydrazine during the biosyntheses of fosfazinomycin and kinamycin

2018, DOI: 10.1101/365031

Author(s): Kwo-Kwang Abraham Wang, Tai L. Ng, Peng Wang, Zedu Huang, Emily P. Balskus, ...

Keyword(s): Natural Products, Nitrous Acid, Bond Formation, Gene Clusters, Biosynthetic Pathways, Biosynthetic Gene, Biosynthetic Gene Clusters, Structural Differences, Related Compounds, Made In

Fosfazinomycin and kinamycin are natural products that contain nitrogen-nitrogen (N-N) bonds but that are otherwise structurally unrelated. Despite their considerable structural differences, their biosynthetic gene clusters share a set of genes predicted to facilitate N-N bond formation. In this study, we show that for both compounds, one of the nitrogen atoms in the N-N bond originates from nitrous acid. Furthermore, we show that for both compounds, an acetylhydrazine biosynthetic synthon is generated first and then funneled via a glutamyl carrier into the respective biosynthetic pathways. Therefore, unlike other pathways to N-N bond-containing natural products wherein the N-N bond is formed directly on a biosynthetic intermediate, during the biosyntheses of fosfazinomycin, kinamycin, and related compounds, the N-N bond is made in an independent pathway that forms a branch of a convergent route to structurally complex natural products.

The antiSMASH database version 3: increased taxonomic coverage and new query features for modular enzymes

Nucleic Acids Research, 2020, Vol 49 (D1), pp. D639-D643, DOI: 10.1093/nar/gkaa978, Cited By ~ 1

Author(s): Kai Blin, Simon Shaw, Satria A Kautsar, Marnix H Medema, Tilmann Weber

Keyword(s): User Interface, Graphical User Interface, Genome Mining, Gene Clusters, Biosynthetic Gene, Biosynthetic Gene Clusters, High Quality, Microbial Genomes, Fungal Genomes, Interactive Graphical User Interface

Microorganisms produce natural products that are frequently used in the development of antibacterial, antiviral, and anticancer drugs, pesticides, herbicides, or fungicides. In recent years, genome mining has evolved into a prominent method to access this potential. antiSMASH is one of the most popular tools for this task. Here, we present version 3 of the antiSMASH database, providing a means to access and query precomputed antiSMASH-5.2-detected biosynthetic gene clusters from representative, publicly available, high-quality microbial genomes via an interactive graphical user interface. In version 3, the database contains 147,517 high-quality BGC regions from 388 archaeal, 25,236 bacterial and 177 fungal genomes and is available at https://antismash-db.secondarymetabolites.org/.

Expansion of RiPP biosynthetic space through integration of pan-genomics and machine learning uncovers a novel class of lanthipeptides

PLoS Biology, 2020, Vol 18 (12), pp. e3001026, DOI: 10.1371/journal.pbio.3001026

Author(s): Alexander M. Kloosterman, Peter Cimermancic, Somayah S. Elsayed, Chao Du, Michalis Hadjithomas, ...

Keyword(s): Natural Products, Natural Product, Sequence Similarity, Genome Mining, Gene Clusters, Support Vector, Chemical Labeling, Product Families, Anticancer Properties, Modified Peptides

Microbial natural products constitute a wide variety of chemical compounds, many of which can have antibiotic, antiviral, or anticancer properties that make them interesting for clinical purposes. Natural product classes include polyketides (PKs), nonribosomal peptides (NRPs), and ribosomally synthesized and post-translationally modified peptides (RiPPs). While variants of biosynthetic gene clusters (BGCs) for known classes of natural products are easy to identify in genome sequences, BGCs for new compound classes escape attention. In particular, evidence is accumulating that for RiPPs, subclasses known thus far may only represent the tip of an iceberg. Here, we present decRiPPter (Data-driven Exploratory Class-independent RiPP TrackER), a RiPP genome mining algorithm aimed at the discovery of novel RiPP classes. DecRiPPter combines a Support Vector Machine (SVM) that identifies candidate RiPP precursors with pan-genomic analyses to identify which of these are encoded within operon-like structures that are part of the accessory genome of a genus. Subsequently, it prioritizes such regions based on the presence of new enzymology and based on patterns of gene cluster and precursor peptide conservation across species. We then applied decRiPPter to mine 1,295 Streptomyces genomes, which led to the identification of 42 new candidate RiPP families that could not be found by existing programs. One of these was studied further and elucidated as a representative of a novel subfamily of lanthipeptides, which we designate class V. The 2D structure of the new RiPP, which we name pristinin A3 (1), was solved using nuclear magnetic resonance (NMR), tandem mass spectrometry (MS/MS) data, and chemical labeling. Two previously unidentified modifying enzymes are proposed to create the hallmark lanthionine bridges. Taken together, our work highlights how novel natural product families can be discovered by methods going beyond sequence similarity searches to integrate multiple pathway discovery criteria.

BiG-SLiCE: A Highly Scalable Tool Maps the Diversity of 1.2 Million Biosynthetic Gene Clusters

2020, DOI: 10.1101/2020.08.17.240838, Cited By ~ 3

Author(s): Satria A. Kautsar, Justin J. J. van der Hooft, Dick de Ridder, Marnix H. Medema

Keyword(s): Natural Product, Biological Activities, Genome Mining, Gene Clusters, Genomic Diversity, Biosynthetic Gene, Biosynthetic Gene Clusters, Microbial Genomes, Natural Product Discovery, User Friendly

Background: Genome mining for Biosynthetic Gene Clusters (BGCs) has become an integral part of natural product discovery. The >200,000 microbial genomes now publicly available hold information on abundant novel chemistry. One way to navigate this vast genomic diversity is through comparative analysis of homologous BGCs, which allows identification of cross-species patterns that can be matched to the presence of metabolites or biological activities. However, current tools suffer from a bottleneck caused by the expensive network-based approach used to group these BGCs into Gene Cluster Families (GCFs). Results: Here, we introduce BiG-SLiCE, a tool designed to cluster massive numbers of BGCs. By representing them in Euclidean space, BiG-SLiCE can group BGCs into GCFs in a non-pairwise, near-linear fashion. We used BiG-SLiCE to analyze 1,225,071 BGCs collected from 209,206 publicly available microbial genomes and metagenome-assembled genomes (MAGs) within ten days on a typical 36-core CPU server. We demonstrate the utility of such analyses by reconstructing a global map of secondary metabolic diversity across taxonomy to identify uncharted biosynthetic potential. BiG-SLiCE also provides a "query mode" that can efficiently place newly sequenced BGCs into previously computed GCFs, plus a powerful output visualization engine that facilitates user-friendly data exploration. Conclusions: BiG-SLiCE opens up new possibilities to accelerate natural product discovery and offers a first step towards constructing a global, searchable interconnected network of BGCs. As more genomes get sequenced from understudied taxa, more information can be mined to highlight their potentially novel chemistry. BiG-SLiCE is available via https://github.com/medema-group/bigslice.
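
To make the "non-pairwise" idea concrete, here is a minimal sketch, assuming each BGC has already been converted to a numeric feature vector and each GCF is summarized by a centroid. It is not BiG-SLiCE's implementation or API: the function name, the three-dimensional vectors, the family labels, and the distance cutoff are all hypothetical. The point is only that a new BGC is compared against a small set of family centroids rather than against every other BGC.

```python
import math


def nearest_family(vector, centroids, max_distance):
    """Return (family_name, distance) for the closest centroid, or
    (None, distance) when even the closest centroid is too far away."""
    best_name, best_dist = None, math.inf
    for name, centroid in centroids.items():
        dist = math.dist(vector, centroid)  # Euclidean distance
        if dist < best_dist:
            best_name, best_dist = name, dist
    if best_dist <= max_distance:
        return best_name, best_dist
    return None, best_dist


# Hypothetical family centroids in a toy 3-dimensional feature space.
centroids = {
    "GCF_1": (1.0, 0.0, 0.0),
    "GCF_2": (0.0, 1.0, 1.0),
}

# Hypothetical feature vectors for newly sequenced BGCs.
queries = {
    "bgc_a": (0.9, 0.1, 0.0),
    "bgc_b": (0.0, 1.0, 0.8),
    "bgc_c": (5.0, 5.0, 5.0),
}

# Each query costs one pass over the centroids, not over all known BGCs.
assignments = {
    name: nearest_family(vec, centroids, max_distance=0.5)[0]
    for name, vec in queries.items()
}
print(assignments)
```

In this toy setup, a query with no centroid within the cutoff is left unassigned, which is one plausible way to flag a BGC that may belong to a family not yet seen.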

On the Risks of Phylogeny-Based Strain Prioritization for Drug Discovery: Streptomyces lunaelactis as a Case Study

Biomolecules, 2020, Vol 10 (7), pp. 1027, DOI: 10.3390/biom10071027, Cited By ~ 1

Author(s): Loïc Martinet, Aymeric Naômé, Dominique Baiwir, Edwin De Pauw, Gabriel Mazzucchelli, ...

Keyword(s): Natural Products, Drug Discovery, Reference Strain, Pattern Analysis, Genome Mining, Gene Clusters, Biosynthetic Gene, Biosynthetic Gene Clusters, Representative Strain

Strain prioritization for drug discovery aims at excluding redundant strains of a collection in order to limit the repetitive identification of the same molecules. In this work, we estimate what may remain unexploited in terms of the amount, diversity, and novelty of compounds if the search is focused on a single representative strain of a species, taking Streptomyces lunaelactis as a model. For this purpose, we selected 18 S. lunaelactis strains taxonomically clustered with the archetype strain S. lunaelactis MM109T. Genome mining of all S. lunaelactis isolated from the same cave revealed that 54% of the 42 biosynthetic gene clusters (BGCs) are strain specific, and five BGCs are not present in the reference strain MM109T. In addition, even when a BGC is conserved in all strains, such as the bag/fev cluster involved in bagremycin and ferroverdin production, the compounds produced differ markedly between strains, and previously unreported compounds are not produced by the archetype MM109T. Moreover, metabolomic pattern analysis uncovered important profile heterogeneity, confirming that identical BGC predisposition between two strains does not automatically imply chemical uniformity. In conclusion, trying to avoid strain redundancy based on phylogeny and genome mining information alone can compromise the discovery of new natural products and might prevent the exploitation of the best naturally engineered producers of specific molecules.

Synthetic Biology Advanced Natural Product Discovery

Metabolites, 2021, Vol 11 (11), pp. 785, DOI: 10.3390/metabo11110785

Author(s): Junyang Wang, Jens Nielsen, Zihe Liu

Keyword(s): Natural Products, Synthetic Biology, Natural Product, Rapid Development, Genome Mining, Gene Clusters, Future Research, Biosynthetic Gene, Biosynthetic Gene Clusters, Natural Product Discovery

A wide variety of bacteria, fungi and plants can produce bioactive secondary metabolites, which are often referred to as natural products. With the rapid development of DNA sequencing technology and bioinformatics, a large number of putative biosynthetic gene clusters have been reported. However, only a limited number of natural products have been discovered, as most biosynthetic gene clusters are not expressed or are expressed at extremely low levels under conventional laboratory conditions. With the rapid development of synthetic biology, advanced genome mining and engineering strategies have been reported, and they provide new opportunities for the discovery of natural products. This review discusses advances in recent years that can accelerate the design, build, test, and learn (DBTL) cycle of natural product discovery, and outlines prospective trends and key challenges for future research directions.

EvoMining reveals the origin and fate of natural products biosynthetic enzymes

2018, DOI: 10.1101/482273

Author(s): Nelly Sélem-Mojica, César Aguilar, Karina Gutiérrez-García, Christian E. Martínez-Guerrero, Francisco Barona-Gómez

Keyword(s): Natural Products, Evolutionary Dynamics, Sequence Similarity, Genome Mining, Gene Clusters, Lessons Learned, Metabolic Enzyme, Link Type, A Genome, Enzyme Families

Natural products, or specialized metabolites, are important for medicine and agriculture alike, as well as for the fitness of the organisms that produce them. Microbial genome mining aims at extracting metabolic information from genomes of microbes presumed to produce these compounds. Typically, canonical enzyme sequences from known biosynthetic systems are identified after sequence similarity searches. Despite this being an efficient process, the likelihood of identifying truly novel biosynthetic systems is low. To overcome this limitation, we previously introduced EvoMining, a genome mining approach that incorporates evolutionary principles. Here, we release and use our latest version of EvoMining, which includes novel visualization features and customizable databases, to analyze 42 central metabolic enzyme families conserved throughout Actinobacteria, Cyanobacteria, Pseudomonas and Archaea. We found that expansion-and-recruitment profiles of these enzyme families are lineage specific, opening a new metabolic space related to ‘shell’ enzymes, which have been overlooked to date. As a case study of canonical shell enzymes, we characterized the expansion and recruitment of glutamate dehydrogenase and acetolactate synthase into scytonemin biosynthesis, and into other central metabolic pathways driving microbial adaptive evolution. By defining the origins and fates of metabolic enzymes, EvoMining not only complements traditional genome mining approaches as an unbiased and rule-independent strategy, but it opens the door to gain insights into the evolution of natural products biosynthesis. We anticipate that EvoMining will be broadly used for metabolic evolutionary studies, and to generate genome-mining predictions leading to unprecedented chemical scaffolds and new antibiotics.

DATA SUMMARY: Databases have been deposited at Zenodo; DOI: 10.5281/zenodo.1162336 (http://zenodo.org/deposit/1219709). Trees and metadata have been deposited in MicroReact: GDH Actinobacteria (https://microreact.org/project/r1IhjVm6X); GDH Cyanobacteria (https://microreact.org/project/HyjYUN7pQ); GDH Pseudomonas (https://microreact.org/project/rJPC4EQa7); GDH Archaea (https://microreact.org/project/ByUcvNmaX); ALS Cyanobacteria (https://microreact.org/project/B11HkUtdm). EvoMining code has been deposited in GitHub (https://github/nselem/evomining). Docker container in Dockerhub (https://hub.docker.com/r/nselem/evomining/). We confirm all supporting data, code and protocols have been provided within the article or through supplementary data files.

IMPACT STATEMENT: EvoMining allows studying expansion-and-recruitment events of enzyme families in prokaryotic lineages, with the goal of providing both evolutionary insights and a genome mining approach for the discovery of truly novel natural products biosynthetic gene clusters. Thus, by better understanding the origin and fate of gene copies within enzyme families, this work contributes towards the identification of lineage-dependent enzymes that we call ‘shell’ enzymes, which are ideal beacons to unveil ‘chemical dark matter’. We show that enzyme functionality is a continuum, including transition enzymes located between central and specialized metabolism. To exemplify these evolutionary dynamics, we focused on the genes directing the synthesis of the sunscreen peptide scytonemin, as the two key enzymes of this biosynthetic pathway behave as shell enzymes and were correctly identified by EvoMining. We also show how evolutionary approaches are better suited to study unexplored lineages, such as those belonging to the Archaea domain, which is systematically mined here for novel natural products for the first time. The release of EvoMining as a stand-alone tool will allow researchers to explore their own enzyme families of interest, within their own genomic lineages of expertise, by taking into account the lessons learned from this work.
