Episo: quantitative estimation of RNA 5-methylcytosine at isoform level by high-throughput sequencing of RNA treated with bisulfite

2019 ◽  
Vol 36 (7) ◽  
pp. 2033-2039 ◽  
Author(s):  
Junfeng Liu ◽  
Ziyang An ◽  
Jianjun Luo ◽  
Jing Li ◽  
Feifei Li ◽  
...  

Abstract Motivation RNA 5-methylcytosine (m5C) is a type of post-transcriptional modification that may be involved in numerous biological processes and tumorigenesis. RNA m5C can be profiled at single-nucleotide resolution by high-throughput sequencing of RNA treated with bisulfite (RNA-BisSeq). However, the transcriptome-wide profile of m5C and its potential function in splicing remain to be elucidated, owing to the lack of an isoform-level m5C quantification tool. Results We developed a computational package to quantify Epitranscriptomal RNA m5C at the transcript isoform level (named Episo). Episo consists of three tools: mapper, quant and Bisulfitefq, for mapping, quantifying and simulating RNA-BisSeq data, respectively. The high accuracy of Episo was validated using an improved m5C-specific methylated RNA immunoprecipitation (meRIP) protocol, as well as a set of in silico experiments. By applying Episo to public human and mouse RNA-BisSeq data, we found that RNA m5C is not evenly distributed among the transcript isoforms, implying that m5C may be subject to regulation at the isoform level. Availability and implementation Episo is released under the GNU GPLv3+ license. The source code of Episo is freely accessible from https://github.com/liujunfengtop/Episo (with TopHat/Cufflinks) and https://github.com/liujunfengtop/Episo/tree/master/Episo_Kallisto (with Kallisto). Supplementary information Supplementary data are available at Bioinformatics online.
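To make the quantity being estimated concrete: in bisulfite sequencing, unmethylated cytosines are read as T while methylated cytosines remain C, so a per-site methylation level is commonly taken as C / (C + T). The sketch below is a minimal illustration of that ratio computed separately per isoform. It is not Episo's implementation: the function name, the isoform labels and the read bases are hypothetical, and the assignment of reads to isoforms is assumed to have been done already.

```python
from collections import Counter


def methylation_level(bases_at_cytosine):
    """Fraction of informative reads that still show C at a reference cytosine.

    Unconverted C counts as methylated, T counts as unmethylated;
    any other base is ignored as uninformative.
    """
    counts = Counter(base.upper() for base in bases_at_cytosine)
    covered = counts["C"] + counts["T"]
    if covered == 0:
        return None  # no informative reads at this site
    return counts["C"] / covered


# Hypothetical read bases observed at one cytosine, grouped by isoform.
# How reads are assigned to isoforms is assumed here, not shown.
reads_by_isoform = {
    "isoform_A": ["C", "C", "T", "C"],
    "isoform_B": ["T", "T", "T", "C", "T"],
}

levels = {iso: methylation_level(bases) for iso, bases in reads_by_isoform.items()}
```

With these made-up reads the same site has a different level in each isoform, which is the kind of uneven distribution the abstract reports.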

2019 ◽  
Vol 47 (18) ◽  
pp. e103-e103 ◽  
Author(s):  
Benjamin J Callahan ◽  
Joan Wong ◽  
Cheryl Heiner ◽  
Steve Oh ◽  
Casey M Theriot ◽  
...  

Abstract Targeted PCR amplification and high-throughput sequencing (amplicon sequencing) of 16S rRNA gene fragments is widely used to profile microbial communities. New long-read sequencing technologies can sequence the entire 16S rRNA gene, but higher error rates have limited their attractiveness when accuracy is important. Here we present a high-throughput amplicon sequencing methodology based on PacBio circular consensus sequencing and the DADA2 sample inference method that measures the full-length 16S rRNA gene with single-nucleotide resolution and a near-zero error rate. In two artificial communities of known composition, our method recovered the full complement of full-length 16S sequence variants from expected community members without residual errors. The measured abundances of intra-genomic sequence variants were in the integral ratios expected from the genuine allelic variants within a genome. The full-length 16S gene sequences recovered by our approach allowed Escherichia coli strains to be correctly classified to the O157:H7 and K12 sub-species clades. In human fecal samples, our method showed strong technical replication and was able to recover the full complement of 16S rRNA alleles in several E. coli strains. There are likely many applications beyond microbial profiling for which high-throughput amplicon sequencing of complete genes with single-nucleotide resolution will be of use.


Author(s):  
Quang Tran ◽  
Alexej Abyzov

Abstract Summary Defining the precise location of structural variations (SVs) at single-nucleotide breakpoint resolution is a challenging problem due to large gaps in alignment. Previously, Alignment with Gap Excision (AGE) enabled us to define breakpoints of SVs at single-nucleotide resolution; however, AGE requires a vast amount of memory when aligning a pair of long sequences. To address this, we developed a memory-efficient implementation—LongAGE—based on the classical Hirschberg algorithm. We demonstrate an application of LongAGE for resolving breakpoints of SVs embedded into segmental duplications on Pacific Biosciences (PacBio) reads that can be longer than 10 kb. Furthermore, we observed different breakpoints for a deletion and a duplication in the same locus, providing direct evidence that such multi-allelic copy number variants (mCNVs) arise from two or more independent ancestral mutations. Availability and implementation LongAGE is implemented in C++ and available on Github at https://github.com/Coaxecva/LongAGE. Supplementary information Supplementary data are available at Bioinformatics online.
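The memory saving that Hirschberg's algorithm provides comes from computing alignment scores with only two rows of the dynamic-programming matrix, then using a forward and a reverse pass to find a point where the problem can be cut in two. The sketch below illustrates that idea for plain global alignment with a linear gap cost. It is not the AGE or LongAGE scoring scheme (gap excision is not modelled); the function names and the scoring parameters are assumptions chosen for illustration.

```python
def nw_score_last_row(a, b, match=1, mismatch=-1, gap=-1):
    """Last row of the global-alignment score matrix, keeping only two rows."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                          prev[j] + gap,      # a[i-1] against a gap
                          curr[j - 1] + gap)  # b[j-1] against a gap
        prev = curr
    return prev


def hirschberg_split(a, b, **scores):
    """Return (mid, split, score): cut a at mid and b at split on an optimal path."""
    mid = len(a) // 2
    # Forward pass over the first half of a, reverse pass over the second half.
    left = nw_score_last_row(a[:mid], b, **scores)
    right = nw_score_last_row(a[mid:][::-1], b[::-1], **scores)
    totals = [left[j] + right[len(b) - j] for j in range(len(b) + 1)]
    split = max(range(len(b) + 1), key=totals.__getitem__)
    return mid, split, totals[split]


mid, split, score = hirschberg_split("GAT", "GT")
```

Recursing on the two halves (a[:mid] with b[:split], and a[mid:] with b[split:]) recovers the full alignment while memory stays proportional to the length of b rather than to the product of both lengths.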


2021 ◽  
Author(s):  
Meng Liu ◽  
Gangqiang Guo ◽  
Pengge Qian ◽  
Jianbing Mu ◽  
Binbin Lu ◽  
...  

5-methylcytosine (m5C) is an important epi-transcriptomic modification involved in mRNA stability and translation efficiency in various biological processes. However, it remains unclear if m5C modification contributes to the dynamic regulation of the transcriptome during the developmental cycles of Plasmodium parasites. Here, we characterize the landscape of m5C mRNA modifications at single nucleotide resolution in the asexual replication stages and gametocyte sexual stages of rodent (P. yoelii) and human (P. falciparum) malaria parasites. While different representations of m5C-modified mRNAs are associated with the different stages, the abundance of the m5C marker is strikingly enhanced in the transcriptomes of gametocytes. Our results show that m5C modifications confer stability to the Plasmodium transcripts and that a Plasmodium ortholog of NSUN2 is a major mRNA m5C methyltransferase in malaria parasites. Upon knock-out of P. yoelii nsun2 (pynsun2), marked reductions of m5C modification were observed in a panel of gametocytogenesis-associated transcripts. These reductions correlated with impaired gametocyte production in rodent and human malaria parasites. Restoration of the nsun2 gene in the knock-out parasites rescued the gametocyte production phenotype as well as m5C modification of the gametocytogenesis-associated transcripts. Together with the mRNA m5C profiles for two species of Plasmodium, our findings demonstrate a major role for NSUN2-mediated m5C modifications in mRNA transcript stability and sexual differentiation in malaria parasites.


2014 ◽  
Vol 4 (1) ◽  
Author(s):  
Nicholas C. Wu ◽  
Arthur P. Young ◽  
Laith Q. Al-Mawsawi ◽  
C. Anders Olson ◽  
Jun Feng ◽  
...  

2018 ◽  
Author(s):  
Benjamin J Callahan ◽  
Joan Wong ◽  
Cheryl Heiner ◽  
Steve Oh ◽  
Casey M Theriot ◽  
...  

Abstract Targeted PCR amplification and high-throughput sequencing (amplicon sequencing) of 16S rRNA gene fragments is widely used to profile microbial communities. New long-read sequencing technologies can sequence the entire 16S rRNA gene, but higher error rates have limited their attractiveness when accuracy is important. Here we present a high-throughput amplicon sequencing methodology based on PacBio circular consensus sequencing and the DADA2 sample inference method that measures the full-length 16S rRNA gene with single-nucleotide resolution and a near-zero error rate. In two artificial communities of known composition, our method recovered the full complement of full-length 16S sequence variants from expected community members without residual errors. The measured abundances of intra-genomic sequence variants were in the integral ratios expected from the genuine allelic variants within a genome. The full-length 16S gene sequences recovered by our approach allowed E. coli strains to be correctly classified to the O157:H7 and K12 sub-species clades. In human fecal samples, our method showed strong technical replication and was able to recover the full complement of 16S rRNA alleles in several E. coli strains. There are likely many applications beyond microbial profiling for which high-throughput amplicon sequencing of complete genes with single-nucleotide resolution will be of use.


2016 ◽  
Vol 113 (32) ◽  
pp. 9057-9062 ◽  
Author(s):  
Peng Mao ◽  
Michael J. Smerdon ◽  
Steven A. Roberts ◽  
John J. Wyrick

UV-induced DNA lesions are important contributors to mutagenesis and cancer, but it is not fully understood how the chromosomal landscape influences UV lesion formation and repair. Genome-wide profiling of repair activity in UV irradiated cells has revealed significant variations in repair kinetics across the genome, not only among large chromatin domains, but also at individual transcription factor binding sites. Here we report that there is also a striking but predictable variation in initial UV damage levels across a eukaryotic genome. We used a new high-throughput sequencing method, known as CPD-seq, to precisely map UV-induced cyclobutane pyrimidine dimers (CPDs) at single-nucleotide resolution throughout the yeast genome. This analysis revealed that individual nucleosomes significantly alter CPD formation, protecting nucleosomal DNA with an inward rotational setting, even though such DNA is, on average, more intrinsically prone to form CPD lesions. CPD formation is also inhibited by DNA-bound transcription factors, in effect shielding important DNA elements from UV damage. Analysis of CPD repair revealed that initial differences in CPD damage formation often persist, even at later repair time points. Furthermore, our high-resolution data demonstrate, to our knowledge for the first time, that CPD repair is significantly less efficient at translational positions near the dyad of strongly positioned nucleosomes in the yeast genome. These findings define the global roles of nucleosomes and transcription factors in both UV damage formation and repair, and have important implications for our understanding of UV-induced mutagenesis in human cancers.


2019 ◽  
Vol 35 (16) ◽  
pp. 2859-2861
Author(s):  
Linfang Jin ◽  
Jinhuo Lai ◽  
Yang Zhang ◽  
Ying Fu ◽  
Shuhang Wang ◽  
...  

Abstract Summary Here we developed a tool called Breakpoint Identification (BreakID) to identify fusion events from targeted sequencing data. Taking discordant read pairs and split reads as supporting evidence, BreakID can identify gene fusion breakpoints at single-nucleotide resolution. After validation with confirmed fusion events in cancer cell lines, we demonstrated that BreakID can achieve a high sensitivity of 90.63% along with a positive predictive value (PPV) of 100% at a sequencing depth of 500× and performs better than other available fusion detection tools. We anticipate that BreakID will be widely used for the detection and analysis of fusions in clinical and research sequencing scenarios. Availability and implementation Source code is freely available at https://github.com/SinOncology/BreakID. Supplementary information Supplementary data are available at Bioinformatics online.
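For readers less familiar with the two metrics quoted above, sensitivity is the fraction of true fusion events that are detected, and PPV is the fraction of reported events that are true. The sketch below shows the standard definitions only. The counts are hypothetical and are not taken from the BreakID validation, whose underlying numbers the abstract does not report.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of real events that were detected: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos, false_pos):
    """Fraction of reported events that are real: TP / (TP + FP)."""
    return true_pos / (true_pos + false_pos)


# Hypothetical counts for illustration only.
tp, fn, fp = 9, 1, 0
sens = sensitivity(tp, fn)
ppv = positive_predictive_value(tp, fp)
```

A PPV of 100% means that no false positives were reported; it says nothing about missed events, which is why the two figures are quoted together.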


2020 ◽  
Author(s):  
Justin T. Roberts ◽  
Allison M. Porman ◽  
Aaron M. Johnson

Abstract Methylation at the N6 position of adenosine (m6A) is one of the most abundant RNA modifications found in eukaryotes; however, accurate detection of specific m6A nucleotides within transcripts has been historically challenging because m6A and unmodified adenosine have virtually indistinguishable chemical properties. While previous strategies such as methyl-RNA immunoprecipitation and sequencing (MeRIP-Seq) have relied on m6A-specific antibodies to isolate RNA fragments containing the modification, these methods do not allow for precise identification of individual m6A residues. More recently, modified cross-linking and immunoprecipitation (CLIP) based approaches that rely on inducing specific mutations during reverse transcription via UV crosslinking of the anti-m6A antibody to methylated RNA have been employed to overcome this limitation. However, the most utilized version of this approach, miCLIP, can be technically challenging to use for achieving high-complexity libraries. Here we present an improved methodology that yields high library complexity and allows for the straightforward identification of individual m6A residues with reliable confidence metrics. Based on enhanced CLIP (eCLIP), our m6A-eCLIP (meCLIP) approach couples the improvements of eCLIP with the inclusion of an input sample and an easy-to-use computational pipeline to allow for precise calling of m6A sites at true single-nucleotide resolution. As the effort to accurately identify m6As in an efficient and straightforward way intensifies, this method is a valuable tool for investigators interested in unraveling the m6A epitranscriptome.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Michael A. Boemo

Abstract Background Measuring DNA replication dynamics with high throughput and single-molecule resolution is critical for understanding both the basic biology behind how cells replicate their DNA and how DNA replication can be used as a therapeutic target for diseases like cancer. In recent years, the detection of base analogues in Oxford Nanopore Technologies (ONT) sequencing reads has become a promising new method to supersede existing single-molecule methods such as DNA fibre analysis: ONT sequencing yields long reads with high throughput, and sequenced molecules can be mapped to the genome using standard sequence alignment software. Results This paper introduces DNAscent v2, software that uses a residual neural network to achieve fast, accurate detection of the thymidine analogue BrdU with single-nucleotide resolution. DNAscent v2 also comes equipped with an autoencoder that interprets the pattern of BrdU incorporation on each ONT-sequenced molecule into replication fork direction to call the location of replication origins and termination sites. DNAscent v2 surpasses previous versions of DNAscent in BrdU calling accuracy, origin calling accuracy, speed, and versatility across different experimental protocols. Unlike NanoMod, DNAscent v2 positively identifies BrdU without the need for sequencing unmodified DNA. Unlike RepNano, DNAscent v2 calls BrdU with single-nucleotide resolution and detects more origins than RepNano from the same sequencing data. DNAscent v2 is open-source and available at https://github.com/MBoemo/DNAscent. Conclusions This paper shows that DNAscent v2 is the new state-of-the-art in the high-throughput, single-molecule detection of replication fork dynamics. These improvements in DNAscent v2 mark an important step towards measuring DNA replication dynamics in large genomes with single-molecule resolution.
Looking forward, the increase in accuracy in single-nucleotide resolution BrdU calls will also allow DNAscent v2 to branch out into other areas of genome stability research, particularly the detection of DNA repair.

