Paired-end small RNA sequencing reveals a possible overestimation in the isomiR sequence repertoire previously reported from conventional single read data analysis

Author(s):  
Jose Francisco Sanchez-Herrero ◽  
Raquel Pluvinet ◽  
Antonio Luna-de Haro ◽  
Lauro Sumoy

Abstract Background: Next-generation sequencing has allowed the discovery of miRNA isoforms, termed isomiRs. Some isomiRs are derived from imprecise processing of pre-miRNA precursors, leading to length variants. Additional variability is introduced by non-templated addition of bases at the ends or editing of internal bases, resulting in base differences relative to the template DNA sequence. We hypothesized that some component of the isomiR variation reported so far could be due to systematic technical noise rather than real biological variation. Results: We have developed the XICRA pipeline to analyze small RNA sequencing data at the isomiR level. We exploited its ability to use single or merged reads to compare isomiR results derived from paired-end (PE) reads with those from single reads (SR), to address whether the sequence differences relative to canonical miRNAs detected in isomiRs are true biological variations or the result of sequencing errors. We detected non-negligible systematic differences between SR and PE data, which primarily affect putative internally edited isomiRs and, at a much lower frequency, terminal length-changing isomiRs. This is relevant for the identification of true isomiRs in small RNA sequencing datasets. Conclusions: We conclude that potential artifacts derived from sequencing errors and/or data processing could result in an overestimation of the abundance and diversity of miRNA isoforms. Efforts in annotating the isomiRnome should take this into account.
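The isomiR categories and the paired-end check described in the abstract can be illustrated with a minimal Python sketch. This is not the XICRA implementation: the function names, the example sequence, and the simplifying assumptions (reads already adapter-trimmed, compared directly against a single canonical sequence without alignment, and read 2 spanning the whole insert) are hypothetical and chosen only for illustration.

```python
# Minimal sketch, NOT the XICRA pipeline: all names here are hypothetical.

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def classify_read(read: str, canonical: str) -> str:
    """Crudely label a read relative to a canonical miRNA sequence.

    Assumes no alignment is needed: a length variant is detected only if
    the shorter sequence is an exact prefix or suffix of the longer one.
    """
    if read == canonical:
        return "canonical"
    if len(read) == len(canonical):
        # Same length but different bases: candidate internal edit
        # (or a sequencing error, which is the point of the abstract).
        return "internal_substitution"
    shorter, longer = sorted((read, canonical), key=len)
    if longer.startswith(shorter):
        return "3p_length_variant"
    if longer.endswith(shorter):
        return "5p_length_variant"
    return "other"


def pe_confirmed(read1: str, read2: str) -> bool:
    """True if read 2 (reverse strand) agrees base-for-base with read 1.

    Assumes both reads cover the entire short insert, as is typical for
    ~22 nt small RNAs after adapter trimming.
    """
    return read1.upper() == reverse_complement(read2)


# Illustrative 22 nt sequence used as the "canonical" miRNA (assumption).
canonical = "TGAGGTAGTAGGTTGTATAGTT"
edited = canonical[:9] + "C" + canonical[10:]  # one internal base changed

label = classify_read(edited, canonical)
# A variant seen in read 1 but contradicted by read 2 is more likely
# a sequencing error than a true isomiR.
supported = pe_confirmed(edited, reverse_complement(canonical))
print(label, supported)
```

In this toy case the read is labelled as a putative internally edited isomiR, but the second read of the pair does not support the changed base, which is the kind of SR versus PE discrepancy the abstract reports.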

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Jose Francisco Sanchez Herrero ◽  
Raquel Pluvinet ◽  
Antonio Luna de Haro ◽  
Lauro Sumoy


2019 ◽  
Author(s):  
Anna James-Bott ◽  
Adam P. Cribbs

Abstract Many tools have been developed to analyse small RNA sequencing data; however, it remains challenging to accurately process reads aligning to small RNAs because of their short length. Most pipelines have been developed with miRNA analysis in mind, and there are currently very few workflows focused on the analysis of transfer RNAs. Moreover, these workflows are low throughput, difficult to install, and lack sufficient visualisation to make the output interpretable. To address these issues, we have built a comprehensive and customisable small RNA-seq data analysis pipeline, with emphasis on the analysis of tRNAs. The pipeline takes as input a fastq file of small RNA sequencing reads and performs successive steps of mapping and alignment to transposable elements, gene transcripts, miRNAs, snRNAs, rRNAs and tRNAs. Subsequent steps generate summary statistics on reads of tRNA origin, which are then visualised in an HTML report. Unlike other low-throughput analysis tools currently available, our high-throughput method allows for the simultaneous analysis of multiple samples and scales with the number of input files. tRNAnalysis is runnable from the command line and is implemented predominantly in Python and R. The source code is available at https://github.com/Acribbs/tRNAnalysis.


2019 ◽  
Author(s):  
Jiang Li ◽  
Alvin T. Kho ◽  
Robert P. Chase ◽  
Lorena Pantano-Rubino ◽  
Leanna Farnam ◽  
...  

Abstract Background: Circulating RNAs are potential disease biomarkers and their function is being actively investigated. Next generation sequencing (NGS) is a common means to interrogate the small RNA'ome, or the full spectrum of small RNAs (<200 nucleotides in length) of a biological system. A pivotal problem in NGS-based small RNA analysis is identifying and quantifying the constituent components of the small RNA'ome. Most existing NGS data analysis tools focus on the microRNA component and a few other small RNA types such as piRNA, snRNA and snoRNA. A comprehensive platform is needed to interrogate the full small RNA'ome, a prerequisite for downstream data analysis. Results: We present COMPASS, a comprehensive modular stand-alone platform for identifying and quantifying small RNAs from small RNA sequencing data. COMPASS contains prebuilt customizable standard RNA databases and sequence processing tools to enable turnkey basic small RNA analysis. We evaluated COMPASS against comparable existing tools on a small RNA sequencing data set from serum samples of 12 healthy human controls, and COMPASS identified a greater diversity and abundance of small RNA molecules. Conclusion: COMPASS is modular and stand-alone, and integrates multiple customizable RNA databases and sequence processing tools. It is distributed under the GNU General Public License, free to non-commercial registered users, at https://regepi.bwh.harvard.edu/circurna/, and the source code is available at https://github.com/cougarlj/COMPASS.


Author(s):  
Vanika Garg ◽  
Rajeev K. Varshney

Abstract Over the past decades, next-generation sequencing (NGS) has been employed extensively for investigating the regulatory mechanisms of small RNAs. Several bioinformatics tools are available to help biologists extract meaningful information from the enormous amounts of data generated by NGS platforms. This chapter describes a detailed methodology for analyzing small RNA sequencing data using different open source tools. We elaborate on the various steps involved in the analysis, from processing the raw sequencing reads to identifying miRNAs and their targets and performing differential expression studies.


2017 ◽  
Vol 45 (21) ◽  
pp. 12140-12151 ◽  
Author(s):  
Xiaogang Wu ◽  
Taek-Kyun Kim ◽  
David Baxter ◽  
Kelsey Scherler ◽  
Aaron Gordon ◽  
...  

2020 ◽  
Vol 522 (3) ◽  
pp. 776-782
Author(s):  
Wei-Hao Lee ◽  
Kai-Pu Chen ◽  
Kai Wang ◽  
Hsuan-Cheng Huang ◽  
Hsueh-Fen Juan

2016 ◽  
Vol 13 (5) ◽  
Author(s):  
Matthew Kanke ◽  
Jeanette Baran-Gale ◽  
Jonathan Villanueva ◽  
Praveen Sethupathy

Summary Small non-coding RNAs, in particular microRNAs, are critical for normal physiology and are candidate biomarkers, regulators, and therapeutic targets for a wide variety of diseases. There is an ever-growing interest in the comprehensive and accurate annotation of microRNAs across diverse cell types, conditions, species, and disease states. High-throughput sequencing technology has emerged as the method of choice for profiling microRNAs. Specialized bioinformatic strategies are required to mine as much meaningful information as possible from the sequencing data to provide a comprehensive view of the microRNA landscape. Here we present miRquant 2.0, an expanded bioinformatics tool for accurate annotation and quantification of microRNAs and their isoforms (termed isomiRs) from small RNA-sequencing data. We anticipate that miRquant 2.0 will be useful for researchers interested not only in quantifying known microRNAs but also in mining the rich well of additional information embedded in small RNA-sequencing data.


2009 ◽  
Vol 37 (8) ◽  
pp. 2461-2470 ◽  
Author(s):  
H. Alexander Ebhardt ◽  
Herbert H. Tsang ◽  
Denny C. Dai ◽  
Yifeng Liu ◽  
Babak Bostan ◽  
...  
