Genome-guided discovery of natural products through multiplexed low coverage whole-genome sequencing of soil Actinomycetes on Oxford Nanopore Flongle

2021 ◽  
Author(s):  
Rahim Rajwani ◽  
Shannon I Ohlemacher ◽  
Gengxiang Zhao ◽  
Hong-Bing Liu ◽  
Carole A. Bewley

Genome-mining is an important tool for discovery of new natural products; however, the number of publicly available genomes for natural product-rich microbes such as Actinomycetes, relative to human pathogens with smaller genomes, is small. To obtain contiguous DNA assemblies and identify large (ca. 10 to greater than 100 Kb) biosynthetic gene clusters (BGCs) with high-GC (>70%) and -repeat content, it is necessary to use long-read sequencing methods when sequencing Actinomycete genomes. One of the hurdles to long-read sequencing is the higher cost. In the current study, we assessed Flongle, a recently launched platform by Oxford Nanopore Technologies, as a low-cost DNA sequencing option to obtain contiguous DNA assemblies and analyze BGCs. To make the workflow more cost-effective, we multiplexed up to four samples in a single Flongle sequencing experiment while expecting low-sequencing coverage per sample. We hypothesized that contiguous DNA assemblies might enable analysis of BGCs even at low sequencing depth. To assess the value of these assemblies, we collected high-resolution mass-spectrometry data and conducted a multi-omics analysis to connect BGCs to secondary metabolites. In total, we assembled genomes for 20 distinct strains across seven sequencing experiments. In each experiment, 50% of the bases were in reads longer than 10 Kb, which facilitated the assembly of reads into contigs with an average N50 value of 3.5 Mb. The programs antiSMASH and PRISM predicted 629 and 295 BGCs, respectively. We connected BGCs to metabolites for N,N-dimethyl cyclic-ditryptophan, a novel lassopeptide and three known Actinomycete-associated siderophores, namely mirubactin, heterobactin and salinichelin.
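The two length statistics quoted above (the share of bases in reads longer than 10 Kb, and the contig N50) can be computed from a plain list of sequence lengths. The following is a minimal sketch, not the authors' pipeline; the function names and all example lengths are hypothetical and chosen only for illustration.

```python
def n50(lengths):
    """Return the N50 of a set of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of all bases.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def fraction_of_bases_in_long_reads(read_lengths, min_length):
    """Return the fraction of all bases found in reads longer than min_length."""
    total = sum(read_lengths)
    if total == 0:
        raise ValueError("read_lengths must contain at least one base")
    long_bases = sum(length for length in read_lengths if length > min_length)
    return long_bases / total


# Hypothetical contig lengths (bp) for one assembly, illustration only.
contigs = [4_000_000, 3_500_000, 1_500_000, 700_000, 300_000]
print("contig N50:", n50(contigs))

# Hypothetical read lengths (bp), illustration only.
reads = [2_000, 5_000, 8_000, 12_000, 23_000]
print("fraction of bases in reads > 10 kb:",
      fraction_of_bases_in_long_reads(reads, 10_000))
```

In practice the lengths would be read from a FASTA or FASTQ file; that parsing step is left out here to keep the sketch self-contained.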

2018 ◽  
Author(s):  
S. Michelle Todd ◽  
Robert E. Settlage ◽  
Kevin K. Lahmers ◽  
Daniel J. Slade

Understanding the virulence mechanisms of human pathogens from the genus Fusobacterium has been hindered by a lack of properly assembled and annotated genomes. Here we report the first complete genomes for seven Fusobacterium strains, as well as resequencing of the reference strain F. nucleatum subsp. nucleatum ATCC 25586 (seven total species, eight total genomes). A highly efficient and cost-effective sequencing pipeline was achieved using sample multiplexing for short-read Illumina (150 bp) and long-read Oxford Nanopore MinION (>80 kbp) platforms, coupled with genome assembly using the open-source software Unicycler. When compared to currently available draft assemblies (previously 24-67 contigs), these genomes are highly accurate and consist of only one complete chromosome. We present the complete genome sequence of F. nucleatum 23726, a genetically tractable and biomedically important strain, and in addition, reveal that the previous F. nucleatum 25586 genome assembly contains a 452 kb genomic inversion that has been corrected using our sequencing and assembly pipeline. To enable the scientific community, we concurrently use these genomes to launch FusoPortal, a repository of interactive and downloadable genomic data, genome maps, gene annotations, and protein functional analysis and classification. In summary, this study provides detailed methods for accurately sequencing, assembling, and annotating Fusobacterium genomes, which will enhance efforts to properly identify virulence proteins that may contribute to a repertoire of diseases including periodontitis, pre-term birth, and colorectal cancer.


Author(s):  
Patrick Videau ◽  
Kaitlyn Wells ◽  
Arun Singh ◽  
Jessie Eiting ◽  
Philip Proteau ◽  
...  

Cyanobacteria are prolific producers of natural products and genome mining has shown that many orphan biosynthetic gene clusters can be found in sequenced cyanobacterial genomes. New tools and methodologies are required to investigate these biosynthetic gene clusters and here we present the use of Anabaena sp. strain PCC 7120 as a host for combinatorial biosynthesis of natural products using the indolactam natural products (lyngbyatoxin A, pendolmycin, and teleocidin B-4) as a test case. We were able to successfully produce all three compounds using codon optimized genes from Actinobacteria. We also introduce a new plasmid backbone based on the native Anabaena 7120 plasmid pCC7120ζ and show that production of teleocidin B-4 can be accomplished using a two-plasmid system, which can be introduced by co-conjugation.


2021 ◽  
Author(s):  
Valentin Waschulin ◽  
Chiara Borsetto ◽  
Robert James ◽  
Kevin K. Newsham ◽  
Stefano Donadio ◽  
...  

The growing problem of antibiotic resistance has led to the exploration of uncultured bacteria as potential sources of new antimicrobials. PCR amplicon analyses and short-read sequencing studies of samples from different environments have reported evidence of high biosynthetic gene cluster (BGC) diversity in metagenomes, indicating their potential for producing novel and useful compounds. However, recovering full-length BGC sequences from uncultivated bacteria remains a challenge due to the technological restraints of short-read sequencing, thus making assessment of BGC diversity difficult. Here, long-read sequencing and genome mining were used to recover >1400 mostly full-length BGCs that demonstrate the rich diversity of BGCs from uncultivated lineages present in soil from Mars Oasis, Antarctica. A large number of highly divergent BGCs were not only found in the phyla Acidobacteriota, Verrucomicrobiota and Gemmatimonadota but also in the actinobacterial classes Acidimicrobiia and Thermoleophilia and the gammaproteobacterial order UBA7966. The latter furthermore contained a potential novel family of RiPPs. Our findings underline the biosynthetic potential of underexplored phyla as well as unexplored lineages within seemingly well-studied producer phyla. They also showcase long-read metagenomic sequencing as a promising way to access the untapped genetic reservoir of specialised metabolite gene clusters of the uncultured majority of microbes.


mSystems ◽  
2018 ◽  
Vol 3 (2) ◽  
Author(s):  
Daniela B. B. Trivella ◽  
Rafael de Felicio

Natural products are the richest source of chemical compounds for drug discovery. In particular, bacterial secondary metabolites are in the spotlight due to advances in genome sequencing and mining, as well as the potential of biosynthetic pathway manipulation to awaken silent (cryptic) gene clusters under laboratory cultivation. Further progress in compound detection, such as the development of the tandem mass spectrometry (MS/MS) molecular networking approach, has contributed to the discovery of novel bacterial natural products. Molecular networking can be applied directly to bacterial crude extracts to identify and dereplicate known compounds, thereby assisting the prioritization of extracts containing novel natural products. In our opinion, these three approaches (genome mining, silent pathway induction, and MS-based molecular networking) compose the tripod for modern bacterial natural product discovery and will be discussed in this perspective.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Albina Nowak ◽  
Omer Murik ◽  
Tzvia Mann ◽  
David A. Zeevi ◽  
Gheona Altarescu

More than 900 variants have been described in the GLA gene. Some intronic variants and copy number variants in GLA can cause Fabry disease but will not be detected by classical Sanger sequencing. We aimed to design and validate a method for sequencing the GLA gene using long-read Oxford Nanopore sequencing technology. Twelve Fabry patients were blindly analyzed, both by conventional Sanger sequencing and by long-read sequencing of a 13 kb PCR amplicon. We used minimap2 to align the long-read data and Nanopolish and Sniffles to call variants. All the variants detected by Sanger (including a deep intronic variant) were also detected by long-read sequencing. One patient had a deletion that was not detected by Sanger sequencing but was detected by the new technology. Our long-read sequencing-based method was able to detect missense variants and an exonic deletion, with the added advantage of intronic analysis. It can be used as an efficient and cost-effective tool for screening and diagnosing Fabry disease.


2018 ◽  
Author(s):  
Geoffrey D. Hannigan ◽  
David Prihoda ◽  
Andrej Palicka ◽  
Jindrich Soukup ◽  
Ondrej Klempir ◽  
...  

Natural products represent a rich reservoir of small molecule drug candidates utilized as antimicrobial drugs, anticancer therapies, and immunomodulatory agents. These molecules are microbial secondary metabolites synthesized by co-localized genes termed Biosynthetic Gene Clusters (BGCs). The increase in full microbial genomes and similar resources has led to development of BGC prediction algorithms, although their precision and ability to identify novel BGC classes could be improved. Here we present a deep learning strategy (DeepBGC) that offers more accurate BGC identification and an improved ability to extrapolate and identify novel BGC classes compared to existing tools. We supplemented this with downstream random forest classifiers that accurately predicted BGC product classes and potential chemical activity. Application of DeepBGC to bacterial genomes uncovered previously undetectable BGCs that may code for natural products with novel biologic activities. The improved accuracy and classification ability of DeepBGC represents a significant step forward for in silico BGC identification.


2021 ◽  
Author(s):  
Albina Nowak ◽  
Omer Murik ◽  
Tzvia Mann ◽  
David A. Zeevi ◽  
Gheona Altarescu

Introduction: More than one thousand variants have been described in the GLA gene. Some intronic variants and copy number variants in GLA can cause Fabry disease but will not be detected by classical Sanger sequencing. Aims: We aimed to design and validate a method for sequencing the GLA gene using long-read Oxford Nanopore sequencing technology. Methods: Twelve Fabry patients were blindly analyzed, both by conventional Sanger sequencing and by long-read sequencing of a 13 kb PCR amplicon. We used minimap2 to align the long-read data and Nanopolish and Sniffles to call variants. Results: All the variants detected by Sanger (including a deep intronic variant) were also detected by long-read sequencing. One patient had a deletion that was not detected by Sanger sequencing but was detected by the new technology. Conclusions: Our long-read sequencing-based method was able to detect missense variants and an exonic deletion, with the added advantage of intronic analysis. It can be used as an efficient and cost-effective tool for screening and diagnosing Fabry disease.


mSystems ◽  
2021 ◽  
Author(s):  
Rahim Rajwani ◽  
Shannon I. Ohlemacher ◽  
Gengxiang Zhao ◽  
Hong-Bing Liu ◽  
Carole A. Bewley

Short-read sequencing of GC-rich genomes such as those from actinomycetes results in a fragmented genome assembly and truncated biosynthetic gene clusters (often 10 to >100 kb long), which hinders our ability to understand the biosynthetic potential of a given strain and predict the molecules that can be produced. The current study demonstrates that contiguous DNA assemblies, suitable for analysis of BGCs, can be obtained through low-coverage, multiplexed sequencing on Flongle, which provides a new low-cost workflow ($30 to 40 per strain) for sequencing actinomycete strain libraries.
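To make "low coverage" concrete, expected per-sample depth in a multiplexed run can be estimated from total yield, the number of barcoded samples, and genome size. The sketch below assumes yield is split evenly across barcodes, which real runs rarely achieve; the yield and genome size in the example are hypothetical placeholders, not figures reported in the study.

```python
def mean_coverage_per_sample(total_yield_bases, n_samples, genome_size_bases):
    """Estimate mean sequencing depth per sample in a multiplexed run.

    Assumes (for illustration) that bases are distributed evenly
    across all barcoded samples.
    """
    if n_samples <= 0 or genome_size_bases <= 0:
        raise ValueError("n_samples and genome_size_bases must be positive")
    bases_per_sample = total_yield_bases / n_samples
    return bases_per_sample / genome_size_bases


# Hypothetical inputs: 1 Gb total run yield, four multiplexed strains,
# and an 8 Mb actinomycete genome.
depth = mean_coverage_per_sample(
    total_yield_bases=1_000_000_000,
    n_samples=4,
    genome_size_bases=8_000_000,
)
print(f"estimated mean depth per strain: {depth:.2f}x")
```

Adding samples to a run lowers the cost per strain but divides the depth proportionally, which is the trade-off the workflow relies on.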


2019 ◽  
Author(s):  
Zhoutao Chen ◽  
Long Pham ◽  
Tsai-Chin Wu ◽  
Guoya Mo ◽  
Yu Xia ◽  
...  

Long-range sequencing information is required for haplotype phasing, de novo assembly and structural variation detection. Current long-read sequencing technologies can provide valuable long-range information but at a high cost with low accuracy and high DNA input requirement. We have developed a single-tube Transposase Enzyme Linked Long-read Sequencing (TELL-Seq™) technology, which enables a low-cost, high-accuracy and high-throughput short-read next generation sequencer to routinely generate over 100 Kb long-range sequencing information with as little as 0.1 ng input material. In a PCR tube, millions of clonally barcoded beads are used to uniquely barcode long DNA molecules in an open bulk reaction without dilution and compartmentation. The barcode linked reads are used to successfully assemble genomes ranging from microbes to human. These linked-reads also generate mega-base-long phased blocks and provide a cost-effective tool for detecting structural variants in a genome, which are important to identify compound heterozygosity in recessive Mendelian diseases and discover genetic drivers and diagnostic biomarkers in cancers.


2020 ◽  
Author(s):  
Yunchang Xie ◽  
Jiawen Chen ◽  
Bo Wang ◽  
Tai Chen ◽  
Junyu Chen ◽  
...  

Background: Activation of silent biosynthetic gene clusters (BGCs) in marine-derived actinomycete strains is a feasible strategy to discover bioactive natural products. Actinoalloteichus sp. AHMU CJ021, isolated from the seashore, was shown to contain an intact but silent caerulomycin A (CRM A) BGC, cam, in its genome. Thus, genome mining work was performed to activate the strain’s bioproduction of CRM A, an immunosuppressive drug lead with diverse bioactivities. Results: To activate the expression of cam, ribosomal engineering was adopted to treat the wild type Actinoalloteichus sp. AHMU CJ021. The initial mutant strain XC-11G, with gentamycin resistance and a CRM A bioproduction titer of 42.51 ± 4.22 mg/L, was selected from all generated mutant strains by comparing expression of the essential biosynthetic gene camE. The titer of CRM A bioproduction was then improved by two strain breeding methods, UV mutagenesis and cofactor engineering directed at increasing intracellular riboflavin, which finally generated the optimal mutant strain XC-11GUR with a CRM A bioproduction titer of 113.91 ± 7.58 mg/L. Subsequently, the titer of strain XC-11GUR was improved to 618.61 ± 16.29 mg/L through medium optimization together with further adjustment derived from response surface methodology. Given this 14.7-fold increase in the titer of CRM A compared to the initial value, strain XC-11GUR could be a good alternative strain for CRM A development. Conclusions: Our work has constructed an ideal CRM A producer. More importantly, our efforts have also demonstrated the effectiveness of the abovementioned combinatorial strategies, which are applicable to the genome mining of bioactive natural products from abundant actinomycete strains.

