Native molecule sequencing by nano-ID reveals synthesis and stability of RNA isoforms

Abstract Eukaryotic genes often generate a variety of RNA isoforms that can lead to functionally distinct protein variants. The synthesis and stability of RNA isoforms are, however, poorly characterized, because current methods to quantify RNA metabolism use ‘short-read’ sequencing that cannot detect RNA isoforms. Here we present nanopore sequencing-based Isoform Dynamics (nano-ID), a method that detects newly synthesized RNA isoforms and monitors isoform metabolism. nano-ID combines metabolic RNA labeling, ‘long-read’ nanopore sequencing of native RNA molecules and machine learning. Application of nano-ID to the heat shock response in human cells reveals that many RNA isoforms change their synthesis rate, stability, and splicing pattern. nano-ID also shows that the metabolism of individual RNA isoforms differs strongly from that estimated for the combined RNA signal at a specific gene locus. Although combined RNA stability correlates with poly(A)-tail length, individual RNA isoforms can deviate significantly. nano-ID enables studies of RNA metabolism on the level of single RNA molecules and isoforms in different cell states and conditions.

MBRS-47. RAPID MOLECULAR SUBGROUPING OF MEDULLOBLASTOMA BASED ON DNA METHYLATION BY NANOPORE SEQUENCING

Neuro-Oncology ◽ 10.1093/neuonc/noaa222.556 ◽ 2020 ◽ Vol 22 (Supplement_3) ◽ pp. iii406-iii406

Author(s): Julien Masliah-Planchon ◽ Elodie Girard ◽ Philipp Euskirchen ◽ Christine Bourneix ◽ Delphine Lequin ◽ ...

Keyword(s): DNA Methylation ◽ Single Molecule ◽ Nanopore Sequencing ◽ Molecular Subgroup ◽ Group Assignment ◽ Group 4 ◽ Methylation Assay ◽ Tumor Group ◽ Long Read ◽ Group 3

Abstract Medulloblastoma (MB) can be classified into four molecular subgroups (WNT group, SHH group, group 3, and group 4). The gold standard for assigning the molecular subgroup through DNA methylation profiling uses the Illumina EPIC array. However, this tool has limitations in cost and turnaround time that make it difficult to obtain results soon enough for clinical use. We present an alternative DNA methylation assay based on nanopore sequencing that enables rapid, cheaper, and reliable subgrouping of clinical MB samples. Low-depth whole-genome sequencing with long-read single-molecule nanopore technology was used to simultaneously assess copy number profile and MB subgrouping based on DNA methylation. The DNA methylation data generated by nanopore sequencing were compared to a publicly available reference cohort comprising over 2,800 brain tumors including the four subgroups of MB (Capper et al. Nature; 2018) to generate a score that estimates the confidence of the tumor group assignment. All 24 MBs analyzed with nanopore sequencing (six WNT, nine SHH, five group 3, and four group 4) were classified in the appropriate subgroup established by expression-based Nanostring subgrouping. In addition to the subgrouping, we also examined the genomic profile: all previously identified clinically relevant genomic rearrangements (mostly MYC and MYCN amplifications) were also detected with our assay. In conclusion, we confirm the full reliability of nanopore sequencing as a novel rapid and cheap assay for methylation-based MB subgrouping. We now plan to extend this technology to other embryonal tumors of the central nervous system.

Nanopore sequencing of single-cell transcriptomes with scCOLOR-seq

Nature Biotechnology ◽ 10.1038/s41587-021-00965-w ◽ 2021

Author(s): Martin Philpott ◽ Jonathan Watson ◽ Anjan Thakurta ◽ Tom Brown ◽ ...

Keyword(s): Single Cell ◽ Error Detection ◽ Single Cells ◽ Fusion Transcript ◽ Building Blocks ◽ Myeloma Cell ◽ Nanopore Sequencing ◽ Long Read ◽ Unique Molecular Identifier ◽ Transcript Detection

Abstract Here we describe single-cell corrected long-read sequencing (scCOLOR-seq), which enables error correction of barcode and unique molecular identifier oligonucleotide sequences and permits standalone cDNA nanopore sequencing of single cells. Barcodes and unique molecular identifiers are synthesized using dimeric nucleotide building blocks that allow error detection. We illustrate the use of the method for evaluating barcode assignment accuracy, differential isoform usage in myeloma cell lines, and fusion transcript detection in a sarcoma cell line.
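
The error-detection idea behind dimeric building blocks can be illustrated with a minimal Python sketch. It assumes that every block is a homodimer (the same base written twice, e.g. `AA` or `CC`), which the abstract does not spell out; the function name `check_dimer_barcode` is hypothetical and is not part of any scCOLOR-seq software. Under that assumption, any position where the two bases of a block disagree flags a sequencing error.

```python
def check_dimer_barcode(seq: str) -> tuple[bool, str]:
    """Validate a barcode read assumed to be built from homodimer blocks.

    Returns (True, collapsed_barcode) if every consecutive pair of bases
    matches, otherwise (False, "") to signal a detected error.
    """
    # A read made of two-base blocks must have an even length.
    if len(seq) % 2 != 0:
        return False, ""
    collapsed = []
    for i in range(0, len(seq), 2):
        first, second = seq[i], seq[i + 1]
        # Assumption: both bases of a block are identical, so a mismatch
        # inside a block can only come from an error.
        if first != second:
            return False, ""
        collapsed.append(first)
    return True, "".join(collapsed)


if __name__ == "__main__":
    print(check_dimer_barcode("AACCGGTT"))  # error-free read
    print(check_dimer_barcode("AACGGGTT"))  # substitution inside the second block
```

A read that fails the check could be discarded or passed to a correction step; the sketch only covers detection, which is the property the abstract names.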

Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network

Nature Communications ◽ 10.1038/s41467-021-23143-7 ◽ 2021 ◽ Vol 12 (1)

Author(s): Mathys Grapotte ◽ Manu Saraswat ◽ Chloé Bessière ◽ Christophe Menichelli ◽ Jordan A. Ramilowski ◽ ...

Keyword(s): Transcription Initiation ◽ Tandem Repeats ◽ Specific Gene ◽ RNA Seq ◽ Transcription Start Sites ◽ Long Read ◽ Cap Analysis ◽ DNA Tandem Repeats ◽ Short Tandem ◽ STR Polymorphism

Abstract Using the Cap Analysis of Gene Expression (CAGE) technology, the FANTOM5 consortium provided one of the most comprehensive maps of transcription start sites (TSSs) in several species. Strikingly, ~72% of them could not be assigned to a specific gene and initiate at unconventional regions, outside promoters or enhancers. Here, we probe these unassigned TSSs and show that, in all species studied, a significant fraction of CAGE peaks initiate at microsatellites, also called short tandem repeats (STRs). To confirm this transcription, we develop Cap Trap RNA-seq, a technology which combines cap trapping and long read MinION sequencing. We train sequence-based deep learning models able to predict CAGE signal at STRs with high accuracy. These models unveil the importance of STR surrounding sequences not only to distinguish STR classes, but also to predict the level of transcription initiation. Importantly, genetic variants linked to human diseases are preferentially found at STRs with high transcription initiation level, supporting the biological and clinical relevance of transcription initiation at STRs. Together, our results extend the repertoire of non-coding transcription associated with DNA tandem repeats and complexify STR polymorphism.

Dual Isoform Sequencing Reveals a Multifaceted Transcriptional Architecture of a Prototype Baculovirus

10.21203/rs.3.rs-637036/v1 ◽ 2021

Author(s): Gábor Torma ◽ Dóra Tombácz ◽ Norbert Moldován ◽ Ádám Fülöp ◽ István Prazsák ◽ ...

Keyword(s): Protein Coding ◽ RNA Molecules ◽ Non Coding RNA ◽ Oxford Nanopore ◽ The Pacific ◽ Viral Genes ◽ Long Read ◽ Oxford Nanopore Technologies ◽ Overlapping Transcripts

Abstract In this study, we used two long-read sequencing (LRS) techniques, Sequel from Pacific Biosciences and MinION from Oxford Nanopore Technologies, for the transcriptional characterization of a prototype baculovirus, Autographa californica multiple nucleopolyhedrovirus. LRS is able to read full-length RNA molecules, and thereby to distinguish between transcript isoforms, mono- and polycistronic RNAs, and overlapping transcripts. Altogether, we detected 875 transcripts, of which 759 are novel and 116 have been annotated previously. These RNA molecules include 41 novel putative protein-coding transcripts (each containing 5’-truncated in-frame ORFs), 14 monocistronic transcripts, 99 multicistronic RNAs, 101 non-coding RNAs, and 504 length isoforms. We also detected RNA methylation in 12 viral genes and RNA hyper-editing in the longer 5’-UTR transcript isoform of the ORF 19 gene.
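
The transcript counts quoted above can be cross-checked with a short tally. This is only an arithmetic sketch in Python: the variable names are illustrative, and the reading that the five listed categories together account for the novel transcripts is an inference from the sums, not something the abstract states.

```python
# Counts copied from the abstract above.
total, novel, annotated = 875, 759, 116

# Category counts as listed in the abstract (keys are illustrative labels).
categories = {
    "putative protein-coding": 41,
    "monocistronic": 14,
    "multicistronic": 99,
    "non-coding": 101,
    "length isoforms": 504,
}

category_sum = sum(categories.values())

# Novel plus previously annotated transcripts give the reported total.
print(novel + annotated == total)  # True
# The five categories add up to the number of novel transcripts.
print(category_sum)                # 759
```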

How Long Are Long Tandem Repeats? A Challenge for Current Methods of Whole-Genome Sequence Assembly: The Case of Satellites in Caenorhabditis elegans

Genes ◽ 10.3390/genes9100500 ◽ 2018 ◽ Vol 9 (10) ◽ pp. 500

Author(s): Juan A. Subirana ◽ Xavier Messeguer

Keyword(s): Caenorhabditis Elegans ◽ Sanger Sequencing ◽ Tandem Repeats ◽ Whole Genome Sequence ◽ Nanopore Sequencing ◽ Original Sequence ◽ Genome Sequence Assembly ◽ Long Read ◽ Genomic Regions ◽ Caenorhabditis Elegans Genome

Repetitive genome regions have been difficult to sequence, mainly because of the comparatively small size of the fragments used in assembly. Satellites or tandem repeats are very abundant in nematodes and offer an excellent playground to evaluate different assembly methods. Here, we compare the structure of satellites found in three different assemblies of the Caenorhabditis elegans genome: the original sequence obtained by Sanger sequencing, an assembly based on PacBio technology, and an assembly using Nanopore sequencing reads. In general, satellites were found in equivalent genomic regions, but the new long-read methods (PacBio and Nanopore) tended to result in longer assembled satellites. Important differences exist between the assemblies resulting from the two long-read technologies, such as the sizes of long satellites. Our results also suggest that the lengths of some annotated genes with internal repeats which were assembled using Sanger sequencing are likely to be incorrect.

High resolution species detection: accurate long read eDNA metabarcoding of North Sea fish using Oxford Nanopore sequencing

10.22541/au.163828284.40261296/v1 ◽ 2021

Author(s): Karlijn Doorenspleet ◽ Lara Jansen ◽ Saskia Oosterbroek ◽ Oscar Bos ◽ Pauline Kamermans ◽ ...

Keyword(s): High Resolution ◽ North Sea ◽ Fish Species ◽ 16S rRNA Genes ◽ rRNA Genes ◽ Nanopore Sequencing ◽ Nature Restoration ◽ Oxford Nanopore ◽ Field Samples ◽ Long Read

To monitor the effect of nature restoration projects in North Sea ecosystems, accurate and intensive biodiversity assessments are vital. DNA-based techniques, and especially environmental DNA (eDNA) metabarcoding from seawater, are becoming a powerful monitoring tool. However, current approaches are based on genetic target regions of <500 nucleotides, which offer limited taxonomic resolution. This study aims to develop and validate a long-read nanopore sequencing method for eDNA that enables improved identification of fish species. We designed a universal primer pair targeting a 2 kb region covering the 12S and 16S rRNA genes of fish mitochondria. eDNA was amplified and sequenced using the Oxford Nanopore MinION. Sequence data were processed using the new pipeline Decona, and accurate consensus identities of above 99.9% were retrieved. The primer set efficiency was tested with eDNA from a 3,000,000 L zoo aquarium with 31 species of bony fish and elasmobranchs. Over 55% of the species present were identified at species level and over 75% at genus level. Next, our long-read eDNA metabarcoding approach was applied to North Sea eDNA field samples collected at shipwreck sites, the Gemini Offshore Wind Farm, the Borkum Reef Grounds and a bare sand bottom. Here, location-specific fish and vertebrate communities were obtained. Incomplete reference databases still form a major bottleneck in further developing high-resolution long-read metabarcoding. Yet, the method has great potential for rapid and accurate fish species monitoring in marine field studies.

Comprehensive genetic diagnosis of tandem repeat expansion disorders with programmable targeted nanopore sequencing

10.1101/2021.09.27.21263187 ◽ 2021

Author(s): Igor Stevanovski ◽ Sanjog R. Chintalaphani ◽ Hasindu Gamaarachchi ◽ James M. Ferguson ◽ Sandy S. Pineda ◽ ...

Keyword(s): Tandem Repeat ◽ Tandem Repeats ◽ Fragile X ◽ Genetic Diagnosis ◽ Neuromuscular Diseases ◽ Nanopore Sequencing ◽ Molecular Tests ◽ Genetic Landscape ◽ Long Read ◽ Short Tandem

ABSTRACT Short-tandem repeat (STR) expansions are an important class of pathogenic genetic variants. Over forty neurological and neuromuscular diseases are caused by STR expansions, with 37 different genes implicated to date. Here we describe the use of programmable targeted long-read sequencing with Oxford Nanopore’s ReadUntil function for parallel genotyping of all known neuropathogenic STRs in a single, simple assay. Our approach enables accurate, haplotype-resolved assembly and DNA methylation profiling of expanded and non-expanded STR sites. In doing so, the assay correctly diagnoses all individuals in a cohort of patients (n = 27) with various neurogenetic diseases, including Huntington’s disease, fragile X syndrome and cerebellar ataxia (CANVAS) and others. Targeted long-read sequencing solves large and complex STR expansions that confound established molecular tests and short-read sequencing, and identifies non-canonical STR motif conformations and internal sequence interruptions. Even in our relatively small cohort, we observe a wide diversity of STR alleles of known and unknown pathogenicity, suggesting that long-read sequencing will redefine the genetic landscape of STR expansion disorders. Finally, we show how the flexible inclusion of pharmacogenomics (PGx) genes as secondary ReadUntil targets can identify clinically actionable PGx genotypes to further inform patient care, at no extra cost. Our study addresses the need for improved techniques for genetic diagnosis of STR expansion disorders and illustrates the broad utility of programmable long-read sequencing for clinical genomics.

One sentence summary: This study describes the development and validation of a programmable targeted nanopore sequencing assay for parallel genetic diagnosis of all known pathogenic short-tandem repeats (STRs) in a single, simple test.

Single-cell RNA counting at allele- and isoform-resolution using Smart-seq3

10.1101/817924 ◽ 2019 ◽ Cited By ~ 6

Author(s): Michael Hagemann-Jensen ◽ Christoph Ziegenhain ◽ Ping Chen ◽ Daniel Ramsköld ◽ Gert-Jan Hendriks ◽ ...

Keyword(s): Single Cell ◽ Large Scale ◽ Cell Types ◽ Mouse Strains ◽ RNA Molecules ◽ Counting Strategy ◽ Long Read ◽ Sequencing Strategy ◽ Transcriptome Coverage ◽ Scale Characterization

Abstract Large-scale sequencing of RNAs from individual cells can reveal patterns of gene, isoform and allelic expression across cell types and states [1]. However, current single-cell RNA-sequencing (scRNA-seq) methods have limited ability to count RNAs at allele and isoform resolution, and long-read sequencing techniques lack the depth required for large-scale applications across cells [2,3]. Here, we introduce Smart-seq3, which combines full-length transcriptome coverage with a 5’ unique molecular identifier (UMI) RNA counting strategy that enabled in silico reconstruction of thousands of RNA molecules per cell. Importantly, a large portion of counted and reconstructed RNA molecules could be directly assigned to specific isoforms and allelic origin, and we identified significant transcript isoform regulation in mouse strains and human cell types. Moreover, Smart-seq3 showed a dramatic increase in sensitivity and typically detected thousands more genes per cell than Smart-seq2. Altogether, we developed a short-read sequencing strategy for single-cell RNA counting at isoform and allele resolution applicable to large-scale characterization of cell types and states across tissues and organisms.
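
The general principle of UMI-based RNA counting mentioned above can be sketched in a few lines of Python. This is a generic illustration, not the Smart-seq3 pipeline: the function name `count_molecules` and the `(cell, gene, umi)` tuple layout are assumptions made for the example, and real workflows additionally handle UMI sequencing errors and read-to-gene assignment.

```python
from collections import defaultdict


def count_molecules(reads):
    """Count distinct UMIs per (cell, gene) pair.

    `reads` is an iterable of (cell, gene, umi) tuples. Reads sharing the
    same cell, gene and UMI are treated as copies of one original RNA
    molecule, so they contribute a single count.
    """
    umis = defaultdict(set)
    for cell, gene, umi in reads:
        umis[(cell, gene)].add(umi)
    return {key: len(seen) for key, seen in umis.items()}


if __name__ == "__main__":
    example_reads = [
        ("cell1", "GeneA", "ACGT"),
        ("cell1", "GeneA", "ACGT"),  # duplicate read of the same molecule
        ("cell1", "GeneA", "TTGA"),
        ("cell2", "GeneA", "ACGT"),
    ]
    print(count_molecules(example_reads))
```

Collapsing on the UMI rather than counting raw reads is what removes amplification duplicates from the molecule count.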

Nanopore sequencing provides rapid and reliable insight into microbial profiles of Intensive Care Units

10.1101/2021.05.14.444165 ◽ 2021

Author(s): Guilherme Marcelino Viana de Siqueira ◽ Felipe Marcelo Pereira-dos-Santos ◽ Rafael Silva-Rocha ◽ Maria-Eugenia Guazzaroni

Keyword(s): Intensive Care ◽ Intensive Care Units ◽ Nanopore Sequencing ◽ Accurate Identification ◽ Complex Samples ◽ Healthcare Settings ◽ Long Read ◽ Single Use ◽ Sequencing Platforms ◽ Insight Into

Fast and accurate identification of pathogens is an essential task in healthcare settings. Next-generation sequencing platforms such as Illumina have greatly expanded the capacity with which different organisms can be detected in hospital samples, and third-generation nanopore-driven sequencing devices such as Oxford Nanopore's MinION have recently emerged as ideal sequencing platforms for routine healthcare surveillance due to their long-read capacity and high portability. Despite their great potential, protocols and analysis pipelines for nanopore sequencing are still being extensively validated. In this work, we assess the ability of nanopore sequencing to provide reliable community profiles based on 16S rRNA sequencing in comparison to traditional Illumina platforms, using samples collected from Intensive Care Units of a hospital in Brazil. While our results indicate that lower throughput may be a shortcoming of the method in more complex samples, we show that the use of single-use Flongle flowcells in nanopore sequencing runs can provide insightful information on the community composition in healthcare settings.

Fast and sensitive mapping of error-prone nanopore sequencing reads with GraphMap

10.1101/020719 ◽ 2015 ◽ Cited By ~ 1

Author(s): Ivan Sovic ◽ Mile Sikic ◽ Andreas Wilm ◽ Shannon Nicole Fenlon ◽ Swaine Chen ◽ ...

Keyword(s): Human Genome ◽ Variant Calling ◽ Error Rates ◽ Nanopore Sequencing ◽ Structural Variants ◽ Specific Identification ◽ Long Reads ◽ Long Read ◽ Specific Error ◽ Very High

Exploiting the power of nanopore sequencing requires the development of new bioinformatics approaches to deal with its specific error characteristics. We present the first nanopore read mapper (GraphMap) that uses a read-funneling paradigm to robustly handle variable error rates and fast graph traversal to align long reads with speed and very high precision (>95%). Evaluation on MinION sequencing datasets against short and long-read mappers indicates that GraphMap increases mapping sensitivity by at least 15-80%. GraphMap alignments are the first to demonstrate consensus calling with <1 error in 100,000 bases, variant calling on the human genome with 76% improvement in sensitivity over the next best mapper (BWA-MEM), precise detection of structural variants from 100bp to 4kbp in length and species and strain-specific identification of pathogens using MinION reads. GraphMap is available open source under the MIT license at https://github.com/isovic/graphmap.
