Automatic reconstruction of metabolic pathways from identified biosynthetic gene clusters

Abstract

Background: A wide range of bioactive compounds is produced by enzymes and enzymatic complexes encoded in biosynthetic gene clusters (BGCs). These BGCs can be identified and functionally annotated based on their DNA sequence. Candidates for further research and development may be prioritized based on properties such as their functional annotation, (dis)similarity to known BGCs, and bioactivity assays. Production of the target compound in the native strain is often not achievable, making heterologous expression in an optimized host strain a promising alternative. Genome-scale metabolic models are frequently used to guide strain development, but large-scale incorporation and testing of heterologous production of complex natural products in this framework is hampered by the amount of manual work required to translate annotated BGCs to metabolic pathways. To this end, we have developed a pipeline for the automated reconstruction of BGC-associated metabolic pathways responsible for the synthesis of non-ribosomal peptides and polyketides, two of the dominant classes of bioactive compounds.

Results: The developed pipeline correctly predicts 72.8% of the metabolic reactions in a detailed evaluation of 8 different BGCs comprising 228 functional domains. By introducing the reconstructed pathways into a genome-scale metabolic model, we demonstrate that this level of accuracy is sufficient to make reliable in silico predictions with respect to production rate and gene knockout targets. Furthermore, we apply the pipeline to a large BGC database and reconstruct 943 metabolic pathways. Using a high-throughput assessment, we identify 17 enzymatic reactions as potential knockout targets for increasing the production of any of the associated compounds. However, the targets only provide a relative increase of up to 6% compared to wild-type production rates.

Conclusion: With this pipeline we pave the way for an extended use of genome-scale metabolic models in strain design of heterologous expression hosts. In this context, we identified generic knockout targets for the increased production of heterologous compounds. However, as the predicted increase is minor for any of the single-reaction knockout targets, these results indicate that more sophisticated strain-engineering strategies are necessary for the development of efficient BGC expression hosts.


10.1101/2020.11.24.395400 ◽

2020 ◽

Author(s):

Snorre Sulheim ◽

Fredrik A. Fossheim ◽

Alexander Wentzel ◽

Eivind Almaas

Keyword(s):

Heterologous Expression ◽

Bioactive Compounds ◽

Metabolic Pathways ◽

Gene Clusters ◽

Strain Engineering ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Wide Range ◽

Metabolic Models ◽

Genome Scale


Detecting and prioritizing biosynthetic gene clusters for bioactive compounds in bacteria and fungi

Applied Microbiology and Biotechnology ◽

10.1007/s00253-019-09708-z ◽

2019 ◽

Vol 103 (8) ◽

pp. 3277-3287 ◽

Cited By ~ 19

Author(s):

Phuong Nguyen Tran ◽

Ming-Ren Yen ◽

Chen-Yu Chiang ◽

Hsiao-Ching Lin ◽

Pao-Yang Chen

Keyword(s):

Bioactive Compounds ◽

Gene Clusters ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Bacteria And Fungi


Global analysis of adenylate-forming enzymes reveals β-lactone biosynthesis pathway in pathogenic Nocardia

Journal of Biological Chemistry ◽

10.1074/jbc.ra120.013528 ◽

2020 ◽

Vol 295 (44) ◽

pp. 14826-14839

Author(s):

Serina L. Robinson ◽

Barbara R. Terlouw ◽

Megan D. Smith ◽

Sacha J. Pidot ◽

Timothy P. Stinear ◽

...

Keyword(s):

Machine Learning ◽

Natural Product ◽

Global Analysis ◽

Gene Clusters ◽

Divergent Evolution ◽

Acid Activation ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Predictive Tool ◽

Wide Range

Enzymes that cleave ATP to activate carboxylic acids play essential roles in primary and secondary metabolism in all domains of life. Class I adenylate-forming enzymes share a conserved structural fold but act on a wide range of substrates to catalyze reactions involved in bioluminescence, nonribosomal peptide biosynthesis, fatty acid activation, and β-lactone formation. Despite their metabolic importance, the substrates and functions of the vast majority of adenylate-forming enzymes are unknown without tools available to accurately predict them. Given the crucial roles of adenylate-forming enzymes in biosynthesis, this also severely limits our ability to predict natural product structures from biosynthetic gene clusters. Here we used machine learning to predict adenylate-forming enzyme function and substrate specificity from protein sequences. We built a web-based predictive tool and used it to comprehensively map the biochemical diversity of adenylate-forming enzymes across >50,000 candidate biosynthetic gene clusters in bacterial, fungal, and plant genomes. Ancestral phylogenetic reconstruction and sequence similarity networking of enzymes from these clusters suggested divergent evolution of the adenylate-forming superfamily from a core enzyme scaffold most related to contemporary CoA ligases toward more specialized functions including β-lactone synthetases. Our classifier predicted β-lactone synthetases in uncharacterized biosynthetic gene clusters conserved in >90 different strains of Nocardia. To test our prediction, we purified a candidate β-lactone synthetase from Nocardia brasiliensis and reconstituted the biosynthetic pathway in vitro to link the gene cluster to the β-lactone natural product, nocardiolactone. We anticipate that our machine learning approach will aid in functional classification of enzymes and advance natural product discovery.


An anaerobic bacterium host system for heterologous expression of natural product biosynthetic gene clusters

Nature Communications ◽

10.1038/s41467-019-11673-0 ◽

2019 ◽

Vol 10 (1) ◽

Cited By ~ 10

Author(s):

Tingting Hao ◽

Zhoujie Xie ◽

Min Wang ◽

Liwei Liu ◽

Yuwei Zhang ◽

...

Keyword(s):

Heterologous Expression ◽

Natural Product ◽

Anaerobic Bacterium ◽

Gene Clusters ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Host System


RecET direct cloning and Redαβ recombineering of biosynthetic gene clusters, large operons or single genes for heterologous expression

Nature Protocols ◽

10.1038/nprot.2016.054 ◽

2016 ◽

Vol 11 (7) ◽

pp. 1175-1190 ◽

Cited By ~ 54

Author(s):

Hailong Wang ◽

Zhen Li ◽

Ruonan Jia ◽

Yu Hou ◽

Jia Yin ◽

...

Keyword(s):

Heterologous Expression ◽

Gene Clusters ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Direct Cloning


Heterologous expression of the biosynthetic gene clusters of coumermycin A1, clorobiocin and caprazamycins in genetically modified Streptomyces coelicolor strains

Biopolymers ◽

10.1002/bip.21493 ◽

2010 ◽

Vol 93 (9) ◽

pp. 823-832 ◽

Cited By ~ 33

Author(s):

Katrin Flinspach ◽

Lucia Westrich ◽

Leonard Kaysser ◽

Stefanie Siebenberg ◽

Juan Pablo Gomez-Escribano ◽

...

Keyword(s):

Heterologous Expression ◽

Streptomyces Coelicolor ◽

Genetically Modified ◽

Gene Clusters ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters


Functional Genome Mining for Metabolites Encoded by Large Gene Clusters through Heterologous Expression of a Whole-Genome Bacterial Artificial Chromosome Library in Streptomyces spp.

Applied and Environmental Microbiology ◽

10.1128/aem.01383-16 ◽

2016 ◽

Vol 82 (19) ◽

pp. 5795-5805 ◽

Cited By ~ 31

Author(s):

Min Xu ◽

Yemin Wang ◽

Zhilong Zhao ◽

Guixi Gao ◽

Sheng-Xiong Huang ◽

...

Keyword(s):

Heterologous Expression ◽

Genome Mining ◽

Gene Clusters ◽

Artificial Chromosome ◽

Functional Screening ◽

Biosynthetic Pathways ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Content Type ◽

Bac Libraries

ABSTRACT Genome sequencing projects in the last decade revealed numerous cryptic biosynthetic pathways for unknown secondary metabolites in microbes, revitalizing drug discovery from microbial metabolites by approaches called genome mining. In this work, we developed a heterologous expression and functional screening approach for genome mining from genomic bacterial artificial chromosome (BAC) libraries in Streptomyces spp. We demonstrate mining from a strain of Streptomyces rochei, which is known to produce streptothricins and borrelidin, by expressing its BAC library in the surrogate host Streptomyces lividans SBT5, and screening for antimicrobial activity. In addition to the successful capture of the streptothricin and borrelidin biosynthetic gene clusters, we discovered two novel linear lipopeptides and their corresponding biosynthetic gene cluster, as well as a novel cryptic gene cluster for an unknown antibiotic from S. rochei. This high-throughput functional genome mining approach can be easily applied to other streptomycetes, and it is very suitable for the large-scale screening of genomic BAC libraries for bioactive natural products and the corresponding biosynthetic pathways. IMPORTANCE Microbial genomes encode numerous cryptic biosynthetic gene clusters for unknown small metabolites with potential biological activities. Several genome mining approaches have been developed to activate and bring these cryptic metabolites to biological tests for future drug discovery. Previous sequence-guided procedures relied on bioinformatic analysis to predict potentially interesting biosynthetic gene clusters. In this study, we describe an efficient approach based on heterologous expression and functional screening of a whole-genome library for the mining of bioactive metabolites from Streptomyces. The usefulness of this function-driven approach was demonstrated by the capture of four large biosynthetic gene clusters for metabolites of various chemical types, including streptothricins, borrelidin, two novel lipopeptides, and one unknown antibiotic from Streptomyces rochei Sal35. The transfer, expression, and screening of the library were all performed in a high-throughput way, so that this approach is scalable and adaptable to industrial automation for next-generation antibiotic discovery.


Heterologous expression system in Aspergillus oryzae for fungal biosynthetic gene clusters of secondary metabolites

Applied Microbiology and Biotechnology ◽

10.1007/s00253-011-3657-9 ◽

2011 ◽

Vol 93 (5) ◽

pp. 2011-2022 ◽

Cited By ~ 48

Author(s):

Kanae Sakai ◽

Hiroshi Kinoshita ◽

Takuya Nihira

Keyword(s):

Secondary Metabolites ◽

Heterologous Expression ◽

Aspergillus Oryzae ◽

Expression System ◽

Gene Clusters ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Heterologous Expression System


Engineering Salinispora tropica for heterologous expression of natural product biosynthetic gene clusters

Applied Microbiology and Biotechnology ◽

10.1007/s00253-018-9283-z ◽

2018 ◽

Vol 102 (19) ◽

pp. 8437-8446 ◽

Cited By ~ 6

Author(s):

Jia Jia Zhang ◽

Bradley S. Moore ◽

Xiaoyu Tang

Keyword(s):

Heterologous Expression ◽

Natural Product ◽

Gene Clusters ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Salinispora Tropica


Reclassification of the Specialized Metabolite Producer Pseudomonas mesoacidophila ATCC 31433 as a Member of the Burkholderia cepacia Complex

Journal of Bacteriology ◽

10.1128/jb.00125-17 ◽

2017 ◽

Vol 199 (13) ◽

Cited By ~ 14

Author(s):

E. Joel Loveridge ◽

Cerith Jones ◽

Matthew J. Bull ◽

Suzy C. Moody ◽

Małgorzata W. Kahl ◽

...

Keyword(s):

Phylogenetic Analysis ◽

Bioactive Compounds ◽

Single Molecule ◽

Complete Genome ◽

Burkholderia Cepacia ◽

Gene Clusters ◽

Burkholderia Cepacia Complex ◽

Biosynthetic Gene ◽

Biosynthetic Gene Clusters ◽

Content Type

ABSTRACT Pseudomonas mesoacidophila ATCC 31433 is a Gram-negative bacterium, first isolated from Japanese soil samples, that produces the monobactam isosulfazecin and the β-lactam-potentiating bulgecins. To characterize the biosynthetic potential of P. mesoacidophila ATCC 31433, its complete genome was determined using single-molecule real-time DNA sequence analysis. The 7.8-Mb genome comprised four replicons, three chromosomal (each encoding rRNA) and one plasmid. Phylogenetic analysis demonstrated that P. mesoacidophila ATCC 31433 was misclassified at the time of its deposition and is a member of the Burkholderia cepacia complex, most closely related to Burkholderia ubonensis. The sequenced genome shows considerable additional biosynthetic potential; known gene clusters for malleilactone, ornibactin, isosulfazecin, alkylhydroxyquinoline, and pyrrolnitrin biosynthesis and several uncharacterized biosynthetic gene clusters for polyketides, nonribosomal peptides, and other metabolites were identified. Furthermore, P. mesoacidophila ATCC 31433 harbors many genes associated with environmental resilience and antibiotic resistance and was resistant to a range of antibiotics and metal ions. In summary, this bioactive strain should be designated B. cepacia complex strain ATCC 31433, pending further detailed taxonomic characterization. IMPORTANCE This work reports the complete genome sequence of Pseudomonas mesoacidophila ATCC 31433, a known producer of bioactive compounds. Large numbers of both known and novel biosynthetic gene clusters were identified, indicating that P. mesoacidophila ATCC 31433 is an untapped resource for discovery of novel bioactive compounds. Phylogenetic analysis demonstrated that P. mesoacidophila ATCC 31433 is in fact a member of the Burkholderia cepacia complex, most closely related to the species Burkholderia ubonensis. Further investigation of the classification and biosynthetic potential of P. mesoacidophila ATCC 31433 is warranted.
