Biosynthetic potential of uncultured Antarctic soil bacteria revealed through long-read metagenomic sequencing

The ISME Journal ◽

10.1038/s41396-021-01052-3 ◽

2021 ◽

Author(s):

Valentin Waschulin ◽

Chiara Borsetto ◽

Robert James ◽

Kevin K. Newsham ◽

Stefano Donadio ◽

...

Keyword(s):

Genome Mining ◽

Gene Clusters ◽

Biosynthetic Gene Cluster ◽

Full Length ◽

Metagenomic Sequencing ◽

Short Read ◽

Short Read Sequencing ◽

Rich Diversity ◽

Long Read ◽

The Rich

Abstract: The growing problem of antibiotic resistance has led to the exploration of uncultured bacteria as potential sources of new antimicrobials. PCR amplicon analyses and short-read sequencing studies of samples from different environments have reported evidence of high biosynthetic gene cluster (BGC) diversity in metagenomes, indicating their potential for producing novel and useful compounds. However, recovering full-length BGC sequences from uncultivated bacteria remains a challenge due to the technological restraints of short-read sequencing, thus making assessment of BGC diversity difficult. Here, long-read sequencing and genome mining were used to recover >1400 mostly full-length BGCs that demonstrate the rich diversity of BGCs from uncultivated lineages present in soil from Mars Oasis, Antarctica. A large number of highly divergent BGCs were not only found in the phyla Acidobacteriota, Verrucomicrobiota and Gemmatimonadota but also in the actinobacterial classes Acidimicrobiia and Thermoleophilia and the gammaproteobacterial order UBA7966. The latter furthermore contained a potential novel family of RiPPs. Our findings underline the biosynthetic potential of underexplored phyla as well as unexplored lineages within seemingly well-studied producer phyla. They also showcase long-read metagenomic sequencing as a promising way to access the untapped genetic reservoir of specialised metabolite gene clusters of the uncultured majority of microbes.


Metabolic potential of uncultured Antarctic soil bacteria revealed through long-read metagenomic sequencing

10.1101/2020.12.09.416412 ◽

2020 ◽

Author(s):

Valentin Waschulin ◽

Chiara Borsetto ◽

Robert James ◽

Kevin K. Newsham ◽

Stefano Donadio ◽

...

Keyword(s):

Genome Mining ◽

Biosynthetic Gene Cluster ◽

Metagenomic Sequencing ◽

Metabolic Potential ◽

Uncultured Bacteria ◽

Rich Diversity ◽

Sequencing Studies ◽

Long Read ◽

The Rich ◽

Uncultivated Bacteria

Abstract: The growing problem of antibiotic resistance has led to the exploration of uncultured bacteria as potential sources of new antimicrobials. PCR amplicon analyses and short-read sequencing studies of samples from different environments have reported evidence of high biosynthetic gene cluster (BGC) diversity in metagenomes. However, few complete BGCs from uncultivated bacteria have been recovered, making assessment of BGC diversity difficult. Here, long-read sequencing and genome mining were used to recover >1400 mostly complete BGCs that demonstrate the rich diversity of BGCs from uncultivated lineages present in soil from Mars Oasis, Antarctica. The phyla Acidobacteriota, Verrucomicrobiota and Gemmatimonadota, but also the actinobacterial classes Acidimicrobiia and Thermoleophilia and the gammaproteobacterial order UBA7966, were found to encode a large number of highly divergent BGCs. Our findings underline the biosynthetic potential of underexplored phyla as well as unexplored lineages within seemingly well-studied producer phyla. They also showcase long-read metagenomic sequencing as a promising way to access the untapped reservoir of specialised metabolites of the uncultured majority of microbes.


Qualitative De Novo Analysis of Full Length cDNA and Quantitative Analysis of Gene Expression for Common Marmoset (Callithrix jacchus) Transcriptomes Using Parallel Long-Read Technology and Short-Read Sequencing

PLoS ONE ◽

10.1371/journal.pone.0100936 ◽

2014 ◽

Vol 9 (6) ◽

pp. e100936 ◽

Cited By ~ 24

Author(s):

Makiko Shimizu ◽

Shunsuke Iwano ◽

Yasuhiro Uno ◽

Shotaro Uehara ◽

Takashi Inoue ◽

...

Keyword(s):

Gene Expression ◽

Quantitative Analysis ◽

De Novo ◽

Common Marmoset ◽

Callithrix Jacchus ◽

Full Length ◽

Short Read ◽

Full Length Cdna ◽

Short Read Sequencing ◽

Long Read


Improving nanopore read accuracy with the R2C2 method enables the sequencing of highly multiplexed full-length single-cell cDNA

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1806447115 ◽

2018 ◽

Vol 115 (39) ◽

pp. 9726-9731 ◽

Cited By ~ 65

Author(s):

Roger Volden ◽

Theron Palmer ◽

Ashley Byrne ◽

Charles Cole ◽

Robert J. Schmitz ◽

...

Keyword(s):

Single Cell ◽

Full Length ◽

Long Distance ◽

Distance Information ◽

Short Read ◽

Transcript Isoforms ◽

Short Read Sequencing ◽

Sequencing Method ◽

Long Read ◽

Rna Transcript

High-throughput short-read sequencing has revolutionized how transcriptomes are quantified and annotated. However, while Illumina short-read sequencers can be used to analyze entire transcriptomes down to the level of individual splicing events with great accuracy, they fall short of analyzing how these individual events are combined into complete RNA transcript isoforms. Because of this shortfall, long-distance information is required to complement short-read sequencing to analyze transcriptomes on the level of full-length RNA transcript isoforms. While long-read sequencing technology can provide this long-distance information, there are issues with both Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) long-read sequencing technologies that prevent their widespread adoption. Briefly, PacBio sequencers produce low numbers of reads with high accuracy, while ONT sequencers produce higher numbers of reads with lower accuracy. Here, we introduce and validate a long-read ONT-based sequencing method. At the same cost, our Rolling Circle Amplification to Concatemeric Consensus (R2C2) method generates more accurate reads of full-length RNA transcript isoforms than any other available long-read sequencing method. These reads can then be used to generate isoform-level transcriptomes for both genome annotation and differential expression analysis in bulk or single-cell samples.
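The abstract above does not spell out how concatemeric reads become a more accurate consensus, so the following is only a toy sketch of the general idea: when one raw read contains several error-prone copies of the same molecule, a per-position majority vote can cancel out errors that hit different copies. It assumes the copies have already been split out and aligned to equal length, which is a strong simplification. The function name `majority_consensus` and the example sequences are hypothetical and are not taken from the R2C2 software.

```python
from collections import Counter


def majority_consensus(copies):
    """Return the per-position majority base across equal-length copies.

    Toy illustration only: assumes the repeat copies of one molecule are
    already split out and aligned, which a real pipeline would have to do.
    """
    if not copies:
        raise ValueError("need at least one copy")
    length = len(copies[0])
    if any(len(copy) != length for copy in copies):
        raise ValueError("this toy version expects equal-length, pre-aligned copies")
    consensus = []
    for column in zip(*copies):
        # Pick the most frequent base in this column.
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


# Four hypothetical copies of the same molecule, two with a single error.
copies = ["ACGTTGCA", "ACGATGCA", "ACGTTGCA", "ACGTTGGA"]
consensus = majority_consensus(copies)
```

In this example the two substitution errors fall in different copies, so each is outvoted by the other three copies at that position.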


Full-length 16S rRNA gene amplicon analysis of human gut microbiota using MinION™ nanopore sequencing confers species-level resolution

BMC Microbiology ◽

10.1186/s12866-021-02094-5 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Yoshiyuki Matsuo ◽

Shinnosuke Komiya ◽

Yoshiaki Yasumizu ◽

Yuki Yasuoka ◽

Katsura Mizushima ◽

...

Keyword(s):

16S Rrna ◽

16S Rrna Gene ◽

Amplicon Sequencing ◽

Species Level ◽

Full Length ◽

Rrna Gene ◽

Short Read ◽

Short Read Sequencing ◽

Long Read

Abstract. Background: Species-level genetic characterization of complex bacterial communities has important clinical applications in both diagnosis and treatment. Amplicon sequencing of the 16S ribosomal RNA (rRNA) gene has proven to be a powerful strategy for the taxonomic classification of bacteria. This study aims to improve the method for full-length 16S rRNA gene analysis using the nanopore long-read sequencer MinION™. We compared it to the conventional short-read sequencing method in both a mock bacterial community and human fecal samples. Results: We modified our existing protocol for full-length 16S rRNA gene amplicon sequencing by MinION™. A new strategy for library construction with an optimized primer set overcame PCR-associated bias and enabled taxonomic classification across a broad range of bacterial species. We compared the performance of full-length and short-read 16S rRNA gene amplicon sequencing for the characterization of human gut microbiota with a complex bacterial composition. The relative abundance of dominant bacterial genera was highly similar between full-length and short-read sequencing. At the species level, MinION™ long-read sequencing had better resolution for discriminating between members of particular taxa such as Bifidobacterium, allowing an accurate representation of the sample bacterial composition. Conclusions: Our present microbiome study, comparing the discriminatory power of full-length and short-read sequencing, clearly illustrated the analytical advantage of sequencing the full-length 16S rRNA gene.


scCAT-seq: single-cell identification and quantification of mRNA isoforms by cost-effective short-read sequencing of cap and tail

10.1101/2019.12.11.873505 ◽

2019 ◽

Author(s):

Youjin Hu ◽

Jiawei Zhong ◽

Yuhua Xiao ◽

Zheng Xing ◽

Katherine Sheu ◽

...

Keyword(s):

Single Cell ◽

Learning Algorithm ◽

Single Cells ◽

Full Length ◽

Translation Efficiency ◽

Mrna Isoforms ◽

Short Read ◽

Short Read Sequencing ◽

Long Read ◽

Identification And Quantification

Abstract: The differences in transcription start sites (TSS) and transcription end sites (TES) among gene isoforms can affect the stability, localization, and translation efficiency of mRNA. Isoforms also allow a single gene to perform different functions across various tissues and cells. However, methods for efficient genome-wide identification and quantification of RNA isoforms in single cells are still lacking. Here, we introduce single cell Cap And Tail sequencing (scCAT-seq). In conjunction with a novel machine learning algorithm developed for TSS/TES characterization, scCAT-seq can demarcate the boundaries of RNA transcripts, providing an unprecedented way to identify and quantify single-cell full-length RNA isoforms based on short-read sequencing. Compared with existing long-read sequencing methods, scCAT-seq has higher efficiency with lower cost. Using scCAT-seq, we identified hundreds of previously uncharacterized full-length transcripts and thousands of alternative transcripts for known genes, quantitatively revealed cell-type specific isoforms with alternative TSSs/TESs in dorsal root ganglion (DRG) neurons, mature oocytes and ageing oocytes, and generated the first atlas of the non-human primate cornea. The approach described here can be widely adapted to other short-read or long-read methods to improve accuracy and efficiency in assessing RNA isoform dynamics among single cells.


Full-length 16S rRNA gene amplicon analysis of human gut microbiota using MinION™ nanopore sequencing confers species-level resolution

10.1101/2020.05.06.078147 ◽

2020 ◽

Cited By ~ 3

Author(s):

Yoshiyuki Matsuo ◽

Shinnosuke Komiya ◽

Yoshiaki Yasumizu ◽

Yuki Yasuoka ◽

Katsura Mizushima ◽

...

Keyword(s):

16S Rrna ◽

16S Rrna Gene ◽

Amplicon Sequencing ◽

Species Level ◽

Full Length ◽

Rrna Gene ◽

Short Read ◽

Short Read Sequencing ◽

16S Amplicon Sequencing ◽

Long Read

Abstract. Background: Species-level genetic characterization of complex bacterial communities has important clinical applications in both diagnosis and treatment. Amplicon sequencing of the 16S ribosomal RNA (rRNA) gene has proven to be a powerful strategy for the taxonomic classification of bacteria. This study aims to improve the method for full-length 16S rRNA gene analysis using the nanopore long-read sequencer MinION™. We compared it to the conventional short-read sequencing method in both a mock bacterial community and human fecal samples. Results: We modified our existing protocol for full-length 16S amplicon sequencing by MinION™. A new strategy for library construction with an optimized primer set overcame PCR-associated bias and enabled taxonomic classification across a broad range of bacterial species. We compared the performance of full-length and short-read 16S amplicon sequencing for the characterization of human gut microbiota with a complex bacterial composition. The relative abundance of dominant bacterial genera was highly similar between full-length and short-read sequencing. At the species level, MinION™ long-read sequencing had better resolution for discriminating between members of particular taxa such as Bifidobacterium, allowing an accurate representation of the sample bacterial composition. Conclusions: Our present microbiome study, comparing the discriminatory power of full-length and short-read sequencing, clearly illustrated the analytical advantage of sequencing the full-length 16S rRNA gene, which provided the requisite species-level resolution and accuracy in clinical settings.


R2C2: Improving nanopore read accuracy enables the sequencing of highly-multiplexed full-length single-cell cDNA

10.1101/338020 ◽

2018 ◽

Cited By ~ 1

Author(s):

Roger Volden ◽

Theron Palmer ◽

Ashley Byrne ◽

Charles Cole ◽

Robert J Schmitz ◽

...

Keyword(s):

Quantitative Analysis ◽

Single Cell ◽

Cancer Biology ◽

Full Length ◽

Short Read ◽

Transcript Isoforms ◽

Short Read Sequencing ◽

Sequencing Method ◽

Long Read ◽

Rna Transcript

Abstract: High-throughput short-read sequencing has revolutionized how transcriptomes are quantified and annotated. However, while Illumina short-read sequencers can be used to analyze entire transcriptomes down to the level of individual splicing events with great accuracy, they fall short of analyzing how these individual events are combined into complete RNA transcript isoforms. Because of this shortfall, long-read sequencing is required to complement short-read sequencing to analyze transcriptomes on the level of full-length RNA transcript isoforms. However, there are issues with both Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT) long-read sequencing technologies that prevent their widespread adoption. Briefly, PacBio sequencers produce low numbers of reads with high accuracy, while ONT sequencers produce higher numbers of reads with lower accuracy. Here we introduce and validate a new long-read ONT-based sequencing method. At the same cost, our Rolling Circle Amplification to Concatemeric Consensus (R2C2) method generates more accurate reads of full-length RNA transcript isoforms than any other available long-read sequencing method. These reads can then be used to generate isoform-level transcriptomes for both genome annotation and differential expression analysis in bulk or single-cell samples. Significance Statement: Subtle changes in RNA transcript isoform expression can have dramatic effects on cellular behaviors in both health and disease. As such, comprehensive and quantitative analysis of isoform-level transcriptomes would open an entirely new window into cellular diversity in fields ranging from developmental to cancer biology. The R2C2 method we are presenting here is the first method with sufficient throughput and accuracy to make the comprehensive and quantitative analysis of RNA transcript isoforms in bulk and single-cell samples economically feasible.


A Single-Molecule Long-Read Survey of Human Transcriptomes using LoopSeq Synthetic Long Read Sequencing

10.1101/532135 ◽

2019 ◽

Cited By ~ 5

Author(s):

Indira Wu ◽

Tuval Ben-Yehezkel

Keyword(s):

Single Molecule ◽

Transcriptome Sequencing ◽

Splice Variants ◽

Error Rates ◽

Full Length ◽

Tissue Samples ◽

Short Read ◽

Short Read Sequencing ◽

Long Read ◽

Sequence Reconstruction

Abstract: State-of-the-art short-read transcriptome sequencing methods employ unique molecular identifiers (UMIs) to accurately classify and count mRNA transcripts. A fundamental limitation of UMI-based short-read transcriptome sequencing is that each read typically covers a small fraction of the transcript sequence. Efforts to accurately characterize splicing isoforms, arguably the largest source of variation in human gene expression, using short-read sequencing have therefore largely relied on computational predictions of transcript isoforms based on indirect observations. Here we describe a transcript-counting, synthetic long-read method for sequencing whole transcriptomes using short-read sequencing platforms and no additional hardware. The method enables full-length mRNA sequence reconstruction at single-nucleotide resolution with high throughput, low error rates and UMI-based transcript counting using any Illumina sequencer. We describe results from whole-transcriptome sequencing of total RNA extracted from 3 human tissue samples: brain, liver, and blood. Reconstructed transcript sequences are characterized and annotated using SQANTI, an analysis pipeline for assessing the sequence quality of long-read transcriptomes. Our results demonstrate that LoopSeq synthetic long-read sequencing can reconstruct full-length transcript contigs of up to 3,900 nt from tissue-extracted RNA, as well as identify novel splice variants of known junction donors and acceptors.
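As a rough illustration of the UMI-based transcript counting the abstract refers to, the sketch below counts distinct UMIs per transcript, so that several reads carrying the same UMI are treated as copies of one original molecule. It assumes each read has already been assigned a transcript ID and a UMI, and it ignores sequencing errors within UMIs. The function name `count_molecules` and the example reads are hypothetical and do not come from LoopSeq.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per transcript.

    `reads` is an iterable of (transcript_id, umi) pairs. Reads that share
    both values are treated as duplicates of a single original molecule.
    """
    umis_by_transcript = defaultdict(set)
    for transcript_id, umi in reads:
        umis_by_transcript[transcript_id].add(umi)
    return {tid: len(umis) for tid, umis in umis_by_transcript.items()}


# Hypothetical input: two reads of tx1 share a UMI and count once.
reads = [
    ("tx1", "AACG"),
    ("tx1", "AACG"),
    ("tx1", "GGTA"),
    ("tx2", "CCAT"),
]
counts = count_molecules(reads)
```

Counting sets of UMIs rather than raw reads is what keeps amplification duplicates from inflating the estimate for a transcript.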


Metagenomic exploration of the marine sponge Mycale hentscheli uncovers multiple polyketide-producing bacterial symbionts

10.26686/wgtn.12444086 ◽

2020 ◽

Author(s):

Mathew Storey ◽

SK Andreassend ◽

Joe Bracegirdle ◽

Alistair Brown ◽

Robert Keyzers ◽

...

Keyword(s):

Marine Sponge ◽

Bacterial Species ◽

Gene Clusters ◽

Taxonomic Diversity ◽

Marine Sponges ◽

Metagenomic Sequencing ◽

Defensive Symbiosis ◽

Long Read ◽

Comprehensive Picture ◽

Mycale Hentscheli

© 2020 Storey et al. Marine sponges have been a prolific source of unique bioactive compounds that are presumed to act as a deterrent to predation. Many of these compounds have potential therapeutic applications; however, the lack of efficient and sustainable synthetic routes frequently limits clinical development. Here, we describe a metagenomic investigation of Mycale hentscheli, a chemically gifted marine sponge that possesses multiple distinct chemotypes. We applied shotgun metagenomic sequencing, hybrid assembly of short- and long-read data, and metagenomic binning to obtain a comprehensive picture of the microbiome of five specimens, spanning three chemotypes. Our data revealed multiple producing species, each having relatively modest secondary metabolomes, that contribute collectively to the chemical arsenal of the holobiont. We assembled complete genomes for multiple new genera, including two species that produce the cytotoxic polyketides pateamine and mycalamide, as well as a third high-abundance symbiont harboring a proteusin-type biosynthetic pathway that appears to encode a new polytheonamide-like compound. We also identified an additional 188 biosynthetic gene clusters, including a pathway for biosynthesis of peloruside. These results suggest that multiple species cooperatively contribute to defensive symbiosis in M. hentscheli and reveal that the taxonomic diversity of secondary-metabolite-producing sponge symbionts is larger and richer than previously recognized. IMPORTANCE: Mycale hentscheli is a marine sponge that is rich in bioactive small molecules. Here, we use direct metagenomic sequencing to elucidate highly complete and contiguous genomes for the major symbiotic bacteria of this sponge. We identify complete biosynthetic pathways for the three potent cytotoxic polyketides which have previously been isolated from M. hentscheli. Remarkably, and in contrast to previous studies of marine sponges, we attribute each of these metabolites to a different producing microbe. We also find that the microbiome of M. hentscheli is stably maintained among individuals, even over long periods of time. Collectively, our data suggest a cooperative mode of defensive symbiosis in which multiple symbiotic bacterial species cooperatively contribute to the defensive chemical arsenal of the holobiont.


Rapid Mycobacterium tuberculosis spoligotyping from uncorrected long reads using Galru

10.1101/2020.05.31.126490 ◽

2020 ◽

Author(s):

Andrew J. Page ◽

Nabil-Fareed Alikhan ◽

Michael Strinden ◽

Thanh Le Viet ◽

Timofey Skvortsov

Keyword(s):

Mycobacterium Tuberculosis ◽

State Of The Art ◽

Sequence Data ◽

Human Pathogen ◽

Sequencing Data ◽

Short Read ◽

Short Read Sequencing ◽

Long Reads ◽

Long Read

Abstract: Spoligotyping of Mycobacterium tuberculosis provides a subspecies classification of this major human pathogen. Spoligotypes can be predicted from short-read genome sequencing data; however, no methods exist for long-read sequence data such as from Nanopore or PacBio. We present a novel software package, Galru, which can rapidly detect the spoligotype of a Mycobacterium tuberculosis sample from as little as a single uncorrected long read. It allows for near real-time spoligotyping from long-read data as it is being sequenced, giving rapid sample typing. We compare it to the existing state-of-the-art software and find that its results are identical to those obtained from short-read sequencing data. Galru is freely available from https://github.com/quadram-institute-bioscience/galru under the GPLv3 open source licence.
