FunOrder: A robust and semi-automated method for the identification of essential biosynthetic genes through computational molecular co-evolution

2021 ◽  
Vol 17 (9) ◽  
pp. e1009372
Author(s):  
Gabriel A. Vignolle ◽  
Denise Schaffer ◽  
Leopold Zehetner ◽  
Robert L. Mach ◽  
Astrid R. Mach-Aigner ◽  
...  

Secondary metabolites (SMs) are a vast group of compounds with different structures and properties that have been utilized as drugs, food additives, dyes, and as monomers for novel plastics. In many cases, the biosynthesis of SMs is catalysed by enzymes whose corresponding genes are co-localized in the genome in biosynthetic gene clusters (BGCs). Notably, BGCs may contain so-called gap genes that are not involved in the biosynthesis of the SM. Current genome mining tools can identify BGCs, but they struggle to distinguish essential genes from gap genes. At present, this distinction must be made by expensive, laborious, and time-consuming comparative genomic approaches or transcriptome analyses. In this study, we developed a method that allows semi-automated identification of essential genes in a BGC based on co-evolution analysis. To this end, the protein sequences of a BGC are searched with BLAST against a suitable proteome database. For each protein, a phylogenetic tree is created. The trees are compared by treeKO to detect co-evolution. The results of this comparison are visualized in different output formats, which are compared visually. Our results suggest that co-evolution commonly occurs within BGCs, although not in all of them, and that especially those genes that encode enzymes of the biosynthetic pathway are co-evolutionarily linked and can be identified with FunOrder. In light of the growing amount of genomic data available, this will contribute to the study of BGCs in native hosts and facilitate heterologous expression in other organisms with the aim of discovering novel SMs.
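
The final comparison step can be illustrated with a small sketch. Assuming a tree-comparison tool has already produced a distance for every pair of protein trees in a cluster (lower values meaning more similar evolutionary histories), genes can be grouped by linking every pair whose distance falls below a cutoff. The gene names, the distance values, the 0.5 cutoff, and the single-linkage grouping are all illustrative assumptions; this is not FunOrder's actual implementation, and the numbers are not treeKO output.

```python
from itertools import combinations


def coevolving_groups(genes, distance, threshold=0.5):
    """Group genes linked by pairwise tree distances below `threshold`.

    `distance` maps unordered gene pairs (as tuples) to a tree distance.
    Grouping is single-linkage: two genes share a group if a chain of
    below-threshold pairs connects them (union-find).
    """
    parent = {g: g for g in genes}

    def find(g):
        # Follow parent links to the group representative, compressing the path.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(genes, 2):
        d = distance.get((a, b), distance.get((b, a)))
        if d is not None and d < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for g in genes:
        groups.setdefault(find(g), []).append(g)
    # Largest group first, ties broken alphabetically.
    return sorted((sorted(m) for m in groups.values()), key=lambda m: (-len(m), m))


# Hypothetical cluster of four genes with made-up pairwise tree distances.
genes = ["pks", "oxidase", "transporter", "gap1"]
distance = {
    ("pks", "oxidase"): 0.2,
    ("pks", "transporter"): 0.4,
    ("pks", "gap1"): 0.9,
    ("oxidase", "transporter"): 0.6,
    ("oxidase", "gap1"): 0.8,
    ("transporter", "gap1"): 0.95,
}
groups = coevolving_groups(genes, distance)
print(groups)
```

In this toy input, the gene that is distant from all others ends up in its own group, which is how a candidate gap gene would stand out under these assumptions.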

2021 ◽  
Author(s):  
Gabriel A. Vignolle ◽  
Denise Schaffer ◽  
Robert L. Mach ◽  
Astrid R. Mach-Aigner ◽  
Christian Derntl

Secondary metabolites (SMs) are a vast group of compounds with different structures and properties. Humankind uses SMs as drugs, food additives, dyes, and as monomers for novel plastics. In many cases, the biosynthesis of SMs is catalysed by enzymes whose corresponding genes are co-localized in the genome in biosynthetic gene clusters (BGCs). Notably, BGCs may contain so-called gap genes that are not involved in the biosynthesis of the SM. Current genome mining tools can identify BGCs, but they struggle to distinguish essential genes from gap genes and to define the borders of a BGC. At present, this must be done by expensive, laborious, and time-consuming comparative genomic approaches or co-expression analyses. In this study, we developed a novel tool that allows automated identification of essential genes in a BGC based solely on genomic data. The Functional Order (FunOrder) tool, for the identification of essential biosynthetic genes through computational molecular co-evolution, searches for co-evolutionarily linked genes in the BGCs. In light of the growing amount of genomic data available, this will contribute to the study of BGCs in native hosts and facilitate heterologous expression in other organisms with the aim of discovering novel SMs, including antibiotics and other pharmaceuticals.


2016 ◽  
Author(s):  
Satria A. Kautsar ◽  
Hernando G. Suarez Duran ◽  
Kai Blin ◽  
Anne Osbourn ◽  
Marnix H. Medema

Plant specialized metabolites are chemically highly diverse, play key roles in host-microbe interactions, have important nutritional value in crops and are frequently applied as medicines. It has recently become clear that plant biosynthetic pathway-encoding genes are sometimes densely clustered in specific genomic loci: biosynthetic gene clusters (BGCs). Here, we introduce plantiSMASH, a versatile online analysis platform that automates the identification of candidate plant BGCs. Moreover, it allows integration of transcriptomic data to prioritize candidate BGCs based on the coexpression patterns of predicted biosynthetic enzyme-coding genes, and facilitates comparative genomic analysis to study the evolutionary conservation of each cluster. Applied to 48 high-quality plant genomes, plantiSMASH identifies a rich diversity of candidate plant BGCs. These results will guide further experimental exploration of the nature and dynamics of gene clustering in plant metabolism. Moreover, spurred by the continuing decrease in costs of plant genome sequencing, they will allow genome mining technologies to be applied to plant natural product discovery. The plantiSMASH web server, precalculated results and source code are freely available from http://plantismash.secondarymetabolites.org.


2020 ◽  
Author(s):  
Dina Kačar ◽  
Librada M Cañedo ◽  
Pilar Rodríguez ◽  
Elena Gonzalez ◽  
Beatriz Galán ◽  
...  

Glutaramide-containing polyketides are known as potent antitumoral and antimetastatic agents. However, the associated gene clusters have only been identified and studied in a few Streptomyces producers and a sole Burkholderia gladioli symbiont. The new glutaramide-family polyketides, denominated sesbanimides D, E and F, along with the previously known sesbanimides A and C, were isolated from two marine alphaproteobacteria, Stappia indica PHM037 and Labrenzia aggregata PHM038. Structures of the isolated compounds were elucidated based on 1D and 2D homo- and heteronuclear NMR analyses and ESI-MS. All compounds exhibited strong antitumor activity in lung, breast and colorectal cancer cell lines. Subsequent whole genome sequencing and genome mining revealed the presence of the trans-AT PKS gene cluster responsible for sesbanimide biosynthesis, described as the sbn cluster, and a modular assembly of sesbanimide is proposed. Interestingly, numerous homologous orphan gene clusters were localized in distantly related bacteria and used as comparative genomic assets for a more global characterization of sbn-like clusters. Strikingly, the modular architecture of the downstream mixed-type PKS/NRPS, SbnQ, revealed high similarity to PedH in the pederin and Lab13 in the labrenzin gene clusters, although those clusters are responsible for the production of structurally completely different molecules. The unexpected presence of SbnQ homologs in unrelated polyketide gene clusters across phylogenetically distant bacteria raises intriguing questions about the evolutionary relationship between glutaramide-like and pederin-like pathways, as well as the functionality of their synthetic products.

Significance: Glutaramide-containing polyketides are still a largely understudied group of polyketides, produced mainly by the genus Streptomyces, with a great potential for antitumor drug production. Here, we describe the genomes of two cultivable marine bacteria, Stappia indica PHM037 and Labrenzia aggregata PHM038, producers of the cytotoxic glutaramide-family polyketides sesbanimide A and C, with chemical elucidation of the newly identified analogs D, E and F. Genome mining revealed a trans-AT PKS gene cluster responsible for sesbanimide biosynthesis. Although there are numerous homologous gene clusters present in remarkably different bacteria, this is the first time that the biosynthesis product has been reported. The comparative genome analysis reveals a striking, cryptic evolutionary relationship between sesbanimides, glutaramides from Streptomyces spp. and the pederin-family gene clusters.


Genes ◽  
2020 ◽  
Vol 11 (10) ◽  
pp. 1166
Author(s):  
Adeel Malik ◽  
Yu Ri Kim ◽  
Seung Bum Kim

The genus Streptacidiphilus represents a group of acidophilic actinobacteria within the family Streptomycetaceae, and currently encompasses 15 validly named species, which include five recent additions within the last two years. Considering the potential of the related genera within the family, namely Streptomyces and Kitasatospora, these relatively new members of the family can also be a promising source for novel secondary metabolites. At present, 15 genome sequences for 11 species from this genus are available, which can provide valuable information on their biology, including the potential for metabolite production as well as enzymatic activities in comparison to the neighboring taxa. In this study, the genome sequences of 11 Streptacidiphilus species were subjected to comparative analysis together with selected Streptomyces and Kitasatospora genomes. This study represents the first comprehensive comparative genomic analysis of the genus Streptacidiphilus. The results indicate that the genomes of Streptacidiphilus contain various secondary metabolite (SM) producing biosynthetic gene clusters (BGCs), some of them identified exclusively in Streptacidiphilus. Several of these clusters may code for SMs with a broad range of bioactivities, such as antibacterial, antifungal, antimalarial and antitumor activities. The biodegradation capabilities of Streptacidiphilus were also explored by investigating the hydrolytic enzymes for complex carbohydrates. Although all genomes were enriched with carbohydrate-active enzymes (CAZymes), their numbers in the genomes of some strains such as Streptacidiphilus carbonis NBRC 100919T were higher as compared to well-known carbohydrate degrading organisms. These distinctive features of each Streptacidiphilus species make them interesting candidates for future studies with respect to their potential for SM production and enzymatic activities.


2021 ◽  
Author(s):  
Dingrong Kang ◽  
Saeed Shoaie ◽  
Samuel Jacquiod ◽  
Søren Johannes Sørensen ◽  
Rodrigo Ledesma-Amaro

Several efforts have been made to valorize keratinous materials, an abundant and renewable resource. Despite these attempts, products generated from keratin hydrolysate, either via chemical or microbial conversion, generally remain of low overall value. In this study, a promising keratinolytic strain from the genus Chryseobacterium (Chryseobacterium sp. KMC2) was investigated using comparative genomic tools against publicly available reference genomes to reveal the metabolic potential for biosynthesis of valuable secondary metabolites. Genome and metabolic features of four species were compared, showing different gene numbers but similar functional categories. We successfully mined eleven different secondary metabolite gene clusters of interest from the four genomes, including five common ones shared across all genomes. Among the common metabolites, we identified gene clusters involved in biosynthesis of flexirubin-type pigment, microviridin, and siderophore, all showing remarkable conservation across the four genomes. Unique secondary metabolite gene clusters were also discovered, for example, ladderane from Chryseobacterium sp. KMC2. Additionally, this study provides a more comprehensive understanding of the potential metabolic pathways of keratin utilization in Chryseobacterium sp. KMC2, with the involvement of amino acid metabolism, TCA cycle, glycolysis/gluconeogenesis, propanoate metabolism, and sulfate reduction. This work uncovers the biosynthesis of secondary metabolite gene clusters from four keratinolytic Chryseobacterium spp. and sheds light on the keratinolytic potential of Chryseobacterium sp. KMC2 from a genome-mining perspective, providing alternatives to valorize keratinous materials into high-value natural products.


2021 ◽  
Vol 9 (5) ◽  
pp. 1042
Author(s):  
Dingrong Kang ◽  
Saeed Shoaie ◽  
Samuel Jacquiod ◽  
Søren J. Sørensen ◽  
Rodrigo Ledesma-Amaro

A promising keratin-degrading strain from the genus Chryseobacterium (Chryseobacterium sp. KMC2) was investigated using comparative genomic tools against three publicly available reference genomes to reveal the keratinolytic potential for biosynthesis of valuable secondary metabolites. Genomic features and metabolic potential of four species were compared, showing genomic differences but similar functional categories. Eleven different secondary metabolite gene clusters of interest were successfully mined from the four genomes, including five common ones shared across all genomes. Among the common metabolites, we identified gene clusters involved in biosynthesis of flexirubin-type pigment, microviridin, and siderophore, showing remarkable conservation across the four genomes. Unique secondary metabolite gene clusters were also discovered, for example, ladderane from Chryseobacterium sp. KMC2. Additionally, this study provides a more comprehensive understanding of the potential metabolic pathways of keratin utilization in Chryseobacterium sp. KMC2, with the involvement of amino acid metabolism, TCA cycle, glycolysis/gluconeogenesis, propanoate metabolism, and sulfate reduction. This work uncovers the biosynthesis of secondary metabolite gene clusters from four keratinolytic Chryseobacterium species and sheds light on the keratinolytic potential of Chryseobacterium sp. KMC2 from a genome-mining perspective, which can provide alternatives to valorize keratinous materials into high-value bioactive natural products.


Author(s):  
Patrick Videau ◽  
Kaitlyn Wells ◽  
Arun Singh ◽  
Jessie Eiting ◽  
Philip Proteau ◽  
...  

Cyanobacteria are prolific producers of natural products and genome mining has shown that many orphan biosynthetic gene clusters can be found in sequenced cyanobacterial genomes. New tools and methodologies are required to investigate these biosynthetic gene clusters and here we present the use of Anabaena sp. strain PCC 7120 as a host for combinatorial biosynthesis of natural products using the indolactam natural products (lyngbyatoxin A, pendolmycin, and teleocidin B-4) as a test case. We were able to successfully produce all three compounds using codon-optimized genes from Actinobacteria. We also introduce a new plasmid backbone based on the native Anabaena 7120 plasmid pCC7120ζ and show that production of teleocidin B-4 can be accomplished using a two-plasmid system, which can be introduced by co-conjugation.


2021 ◽  
Vol 7 (5) ◽  
pp. 337
Author(s):  
Daniel Peterson ◽  
Tang Li ◽  
Ana M. Calvo ◽  
Yanbin Yin

Phytopathogenic Ascomycota are responsible for substantial economic losses each year, destroying valuable crops. The present study aims to provide new insights into phytopathogenicity in Ascomycota from a comparative genomic perspective. This has been achieved by categorizing orthologous gene groups (orthogroups) from 68 phytopathogenic and 24 non-phytopathogenic Ascomycota genomes into three classes: core, (pathogen or non-pathogen) group-specific, and genome-specific accessory orthogroups. We found that (i) ~20% of orthogroups are group-specific and accessory in the 92 Ascomycota genomes, (ii) phytopathogenicity is not phylogenetically determined, (iii) group-specific orthogroups have more enriched functional terms than accessory orthogroups and this trend is particularly evident in phytopathogenic fungi, (iv) secreted proteins with signal peptides and horizontal gene transfers (HGTs) are the two functional terms that show the highest occurrence and significance in group-specific orthogroups, and (v) a number of other functional terms are also identified to have higher significance and occurrence in group-specific orthogroups. Overall, our comparative genomics analysis determined positive enrichment between orthogroup classes and revealed a prediction of what genomic characteristics make an Ascomycete phytopathogenic. We conclude that genes shared by multiple phytopathogenic genomes are more important for phytopathogenicity than those that are unique to each genome.
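
The three-class categorization can be sketched as a simple presence/absence rule. The abstract does not give the exact criteria, so the definitions below are assumptions made for illustration: "core" means present in every genome, "genome-specific accessory" means present in exactly one genome, and "group-specific" means present in at least two genomes that all belong to the same group. Anything else is labeled "other". The genome and orthogroup names are hypothetical.

```python
def classify_orthogroup(members, pathogens, non_pathogens):
    """Classify one orthogroup by the set of genomes that contain it.

    The class definitions are assumed for illustration and may differ
    from the criteria used in the study.
    """
    members = set(members)
    all_genomes = pathogens | non_pathogens
    if members == all_genomes:
        return "core"
    if len(members) == 1:
        return "genome-specific accessory"
    if members <= pathogens:
        return "pathogen group-specific"
    if members <= non_pathogens:
        return "non-pathogen group-specific"
    return "other"


# Toy data: three pathogenic and two non-pathogenic genomes.
pathogens = {"P1", "P2", "P3"}
non_pathogens = {"N1", "N2"}
orthogroups = {
    "OG1": {"P1", "P2", "P3", "N1", "N2"},
    "OG2": {"P1", "P3"},
    "OG3": {"N1", "N2"},
    "OG4": {"P2"},
    "OG5": {"P1", "N2"},
}
labels = {
    og: classify_orthogroup(genomes, pathogens, non_pathogens)
    for og, genomes in orthogroups.items()
}
for og, label in labels.items():
    print(og, label)
```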


Marine Drugs ◽  
2021 ◽  
Vol 19 (6) ◽  
pp. 298
Author(s):  
Despoina Konstantinou ◽  
Rafael V. Popin ◽  
David P. Fewer ◽  
Kaarina Sivonen ◽  
Spyros Gkelis

Sponges form symbiotic relationships with diverse and abundant microbial communities. Cyanobacteria are among the most important members of the microbial communities that are associated with sponges. Here, we performed a genus-wide comparative genomic analysis of the newly described marine benthic cyanobacterial genus Leptothoe (Synechococcales). We obtained draft genomes from Le. kymatousa TAU-MAC 1615 and Le. spongobia TAU-MAC 1115, isolated from marine sponges. We identified five additional Leptothoe genomes, host-associated or free-living, using a phylogenomic approach, and the comparison of all genomes showed that the sponge-associated strains display features of a symbiotic lifestyle. Le. kymatousa and Le. spongobia have undergone genome reduction; they harbored considerably fewer genes encoding (i) cofactors, vitamins, prosthetic groups, pigments, proteins, and amino acid biosynthesis; (ii) DNA repair; (iii) antioxidant enzymes; and (iv) biosynthesis of capsular and extracellular polysaccharides. They have also lost several genes related to chemotaxis and motility. Eukaryotic-like proteins, such as ankyrin repeats, playing important roles in sponge-symbiont interactions, were identified in sponge-associated Leptothoe genomes. The sponge-associated Leptothoe strains harbored biosynthetic gene clusters encoding novel natural products despite genome reduction. Comparisons of the biosynthetic capacities of Leptothoe with chemically rich cyanobacteria revealed that Leptothoe is another promising marine cyanobacterium for the biosynthesis of novel natural products.


2021 ◽  
Vol 7 (6) ◽  
pp. 485
Author(s):  
Boxun Li ◽  
Yang Yang ◽  
Jimiao Cai ◽  
Xianbao Liu ◽  
Tao Shi ◽  
...  

Rubber tree Corynespora leaf fall (CLF) disease, caused by the fungus Corynespora cassiicola, is one of the most damaging diseases in rubber tree plantations in Asia and Africa, and this disease also threatens rubber nurseries and young rubber plantations in China. C. cassiicola isolates display high genetic diversity, and virulence profiles vary significantly depending on cultivar. Although one phytotoxin (cassiicolin) has been identified, it cannot fully explain the diversity in pathogenicity between C. cassiicola species, and some virulent C. cassiicola strains do not contain the cassiicolin gene. In the present study, we report high-quality gapless genome sequences, obtained using short-read sequencing and single-molecule long-read sequencing, of two virulent Chinese C. cassiicola strains. Comparative genomics of gene families in these two strains and a virulent CCP strain from the Philippines showed that all three strains experienced different selective pressures, and metabolism-related gene families vary between the strains. Secreted protein analysis indicated that the quantities of secreted cell wall-degrading enzymes were correlated with pathogenesis, and the most aggressive CCP strain (cassiicolin toxin type 1) encoded 27.34% and 39.74% more secreted carbohydrate-active enzymes (CAZymes) than Chinese strains YN49 and CC01, respectively, both of which can only infect rubber tree saplings. The results of antiSMASH analysis showed that all three strains encode ~60 secondary metabolite biosynthesis gene clusters (SM BGCs). Phylogenomic and domain structure analyses of core synthesis genes, together with synteny analysis of polyketide synthase (PKS) and non-ribosomal peptide synthetase (NRPS) gene clusters, revealed diversity in the distribution of SM BGCs between strains, as well as SM polymorphisms, which may play an important role in pathogenic progress. The results expand our understanding of the C. cassiicola genome. Further comparative genomic analysis indicates that secreted CAZymes and SMs may influence pathogenicity in rubber tree plantations. The findings facilitate future exploration of the molecular pathogenic mechanism of C. cassiicola.

