sRNAnalyzer—a flexible and customizable small RNA sequencing data analysis pipeline

2017 ◽  
Vol 45 (21) ◽  
pp. 12140-12151 ◽  
Author(s):  
Xiaogang Wu ◽  
Taek-Kyun Kim ◽  
David Baxter ◽  
Kelsey Scherler ◽  
Aaron Gordon ◽  
...  
2019 ◽  
Author(s):  
Anna James-Bott ◽  
Adam P. Cribbs

Abstract Many tools have been developed to analyse small RNA sequencing data; however, it remains challenging to accurately process reads aligning to small RNAs because of their short length. Most pipelines have been developed with miRNA analysis in mind, and there are currently very few workflows focused on the analysis of transfer RNAs (tRNAs). Moreover, these workflows are low throughput, difficult to install, and lack sufficient visualisation to make the output interpretable. To address these issues, we have built a comprehensive and customisable small RNA-seq data analysis pipeline, with emphasis on the analysis of tRNAs. The pipeline takes as input a fastq file of small RNA sequencing reads and performs successive steps of mapping and alignment to transposable elements, gene transcripts, miRNAs, snRNAs, rRNAs and tRNAs. Subsequent steps then generate summary statistics on reads of tRNA origin, which are visualised in an HTML report. Unlike other low-throughput analysis tools currently available, our high-throughput method allows for the simultaneous analysis of multiple samples and scales with the number of input files. tRNAnalysis is runnable from the command line and is implemented predominantly in Python and R. The source code is available at https://github.com/Acribbs/tRNAnalysis.
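The successive mapping steps described in the abstract can be illustrated with a minimal sketch. This is not the tRNAnalysis implementation: the function name `assign_reads`, the toy reference sequences, and the use of exact substring matching in place of a real aligner are all assumptions made for illustration, as is the choice to pass only unassigned reads on to the next category.

```python
from collections import Counter

# Category order follows the abstract; everything else here is hypothetical.
CATEGORY_ORDER = [
    "transposable_element",
    "gene_transcript",
    "miRNA",
    "snRNA",
    "rRNA",
    "tRNA",
]


def assign_reads(reads, references, order=CATEGORY_ORDER):
    """Count reads per category, trying the categories one after another.

    Assumption: a read is assigned to the first category in which it occurs
    as an exact substring of any reference sequence, and only reads that
    remain unassigned are tried against the next category.
    """
    counts = Counter()
    remaining = list(reads)
    for category in order:
        still_unassigned = []
        for read in remaining:
            if any(read in ref for ref in references.get(category, [])):
                counts[category] += 1
            else:
                still_unassigned.append(read)
        remaining = still_unassigned
    counts["unassigned"] = len(remaining)
    return counts


# Toy input: two made-up reference sets and four short reads.
toy_references = {
    "miRNA": ["TGAGGTAGTAGGTTGTATAGTT"],
    "tRNA": ["GCATTGGTGGTTCAGTGGTAGAATTCTCGCC"],
}
toy_reads = ["GAGGTAGTAGG", "GGTTCAGTGG", "AAAAAAAAAA", "GGTTCAGTGG"]
toy_counts = assign_reads(toy_reads, toy_references)
```

A real workflow would replace the substring test with an aligner that tolerates mismatches and would read its input from fastq files; the sketch only shows how per-category tallies for a summary report could be accumulated step by step.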


2015 ◽  
Vol 10 ◽  
pp. BMI.S25132 ◽  
Author(s):  
Jun-ichi Satoh ◽  
Yoshihiro Kino ◽  
Shumpei Niida

Background Alzheimer's disease (AD) is the most common cause of dementia with no curative therapy currently available. Establishment of sensitive and non-invasive biomarkers that promote an early diagnosis of AD is crucial for the effective administration of disease-modifying drugs. MicroRNAs (miRNAs) mediate posttranscriptional repression of numerous target genes. Aberrant regulation of miRNA expression is implicated in AD pathogenesis, and circulating miRNAs serve as potential biomarkers for AD. However, data analysis of numerous AD-specific miRNAs derived from small RNA-sequencing (RNA-Seq) is often laborious. Methods To identify circulating miRNA biomarkers for AD, we reanalyzed a publicly available small RNA-Seq dataset, composed of blood samples derived from 48 AD patients and 22 normal control (NC) subjects, by a simple web-based miRNA data analysis pipeline that combines omiRas and DIANA miRPath. Results By using omiRas, we identified 27 miRNAs differentially expressed between the two groups, including upregulation in AD of miR-26b-3p, miR-28-3p, miR-30c-5p, miR-30d-5p, miR-148b-5p, miR-151a-3p, miR-186-5p, miR-425-5p, miR-550a-5p, miR-1468, miR-4781-3p, miR-5001-3p, and miR-6513-3p and downregulation in AD of let-7a-5p, let-7e-5p, let-7f-5p, let-7g-5p, miR-15a-5p, miR-17-3p, miR-29b-3p, miR-98-5p, miR-144-5p, miR-148a-3p, miR-502-3p, miR-660-5p, miR-1294, and miR-3200-3p. DIANA miRPath indicated that miRNA-regulated pathways potentially downregulated in AD are linked with neuronal synaptic functions, while those upregulated in AD are implicated in cell survival and cellular communication. Conclusions The simple web-based miRNA data analysis pipeline helps us to effortlessly identify candidates for miRNA biomarkers and pathways of AD from the complex small RNA-Seq data.


2021 ◽  
Author(s):  
Jose Francisco Sanchez-Herrero ◽  
Raquel Pluvinet ◽  
Antonio Luna-de Haro ◽  
Lauro Sumoy

Abstract Background Next generation sequencing has allowed the discovery of miRNA isoforms, termed isomirs. Some isomirs are derived from imprecise processing of pre-miRNA precursors, leading to length variants. Additional variability is introduced by non-templated addition of bases at the ends or editing of internal bases, resulting in base differences relative to the template DNA sequence. We hypothesized that some component of the isomir variation reported so far could be due to systematic technical noise and not real. Results We have developed the XICRA pipeline to analyze small RNA sequencing data at the isomir level. We exploited its ability to use single or merged reads to compare isomir results derived from paired-end (PE) reads with those from single reads (SR) to address whether detectable sequence differences relative to canonical miRNAs found in isomirs are true biological variations or the result of errors in sequencing. We have detected non-negligible systematic differences between SR and PE data which primarily affect putative internally edited isomirs, and at a much smaller frequency terminal length changing isomirs. This is relevant for the identification of true isomirs in small RNA sequencing datasets. Conclusions We conclude that potential artifacts derived from sequencing errors and/or data processing could result in an overestimation of abundance and diversity of miRNA isoforms. Efforts in annotating the isomirnome should take this into account.
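The isomiR categories named in the abstract (length variants, non-templated additions at the ends, and internal base edits) can be made concrete with a small sketch. It is not XICRA's method: the function `classify_isomir`, its label strings, and the toy precursor are hypothetical, and the read is assumed to align without gaps to the precursor at a known start position.

```python
def classify_isomir(precursor, canon_start, canon_end, read, read_start):
    """Label how `read` differs from the canonical miRNA.

    The canonical miRNA is precursor[canon_start:canon_end] (0-based,
    end-exclusive). Assumptions: ungapped alignment at `read_start`; a
    trailing run of mismatches at the 3' end is called a non-templated
    addition, so a single 3'-terminal edit is indistinguishable from one.
    """
    template = precursor[read_start:read_start + len(read)]
    # Positions where the read disagrees with the template, including
    # positions that run past the end of the precursor.
    mismatches = {
        i for i in range(len(read))
        if i >= len(template) or read[i] != template[i]
    }
    # Length of the mismatching run at the 3' end of the read.
    tail = 0
    while tail < len(read) and (len(read) - 1 - tail) in mismatches:
        tail += 1
    internal = [i for i in mismatches if i < len(read) - tail]
    templated_end = read_start + len(read) - tail

    labels = set()
    if read_start != canon_start:
        labels.add("5p_length_variant")
    if templated_end != canon_end:
        labels.add("3p_length_variant")
    if tail:
        labels.add("3p_nontemplated_addition")
    if internal:
        labels.add("internal_edit")
    return labels or {"canonical"}


# Toy precursor with the canonical miRNA at positions 3..25 (made up).
toy_precursor = "AAATGAGGTAGTAGGTTGTATAGTTCCC"
toy_canonical = toy_precursor[3:25]
edited_read = toy_canonical[:9] + "G" + toy_canonical[10:]  # A>G at offset 9
```

For example, `classify_isomir(toy_precursor, 3, 25, edited_read, 3)` reports an internal edit. A single sequencing error in a read produces exactly the same label, which is the kind of ambiguity the abstract raises when it compares single-read and paired-end data.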


2019 ◽  
Author(s):  
Jiang Li ◽  
Alvin T. Kho ◽  
Robert P. Chase ◽  
Lorena Pantano-Rubino ◽  
Leanna Farnam ◽  
...  

Abstract Background Circulating RNAs are potential disease biomarkers and their function is being actively investigated. Next generation sequencing (NGS) is a common means to interrogate the small RNA'ome or the full spectrum of small RNAs (<200 nucleotide length) of a biological system. A pivotal problem in NGS-based small RNA analysis is identifying and quantifying the small RNA'ome constituent components. Most existing NGS data analysis tools focus on the microRNA component and a few other small RNA types like piRNA, snRNA and snoRNA. A comprehensive platform is needed to interrogate the full small RNA'ome, a prerequisite for down-stream data analysis. Results We present COMPASS, a comprehensive modular stand-alone platform for identifying and quantifying small RNAs from small RNA sequencing data. COMPASS contains prebuilt customizable standard RNA databases and sequence processing tools to enable turnkey basic small RNA analysis. We evaluated COMPASS against comparable existing tools on a small RNA sequencing data set from serum samples of 12 healthy human controls, and COMPASS identified a greater diversity and abundance of small RNA molecules. Conclusion COMPASS is modular and stand-alone, and integrates multiple customizable RNA databases and sequence processing tools. It is distributed under the GNU General Public License free to non-commercial registered users at https://regepi.bwh.harvard.edu/circurna/ and the source code is available at https://github.com/cougarlj/COMPASS.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Jose Francisco Sanchez Herrero ◽  
Raquel Pluvinet ◽  
Antonio Luna de Haro ◽  
Lauro Sumoy

Abstract Background Next generation sequencing has allowed the discovery of miRNA isoforms, termed isomiRs. Some isomiRs are derived from imprecise processing of pre-miRNA precursors, leading to length variants. Additional variability is introduced by non-templated addition of bases at the ends or editing of internal bases, resulting in base differences relative to the template DNA sequence. We hypothesized that some component of the isomiR variation reported so far could be due to systematic technical noise and not real. Results We have developed the XICRA pipeline to analyze small RNA sequencing data at the isomiR level. We exploited its ability to use single or merged reads to compare isomiR results derived from paired-end (PE) reads with those from single reads (SR) to address whether detectable sequence differences relative to canonical miRNAs found in isomiRs are true biological variations or the result of errors in sequencing. We have detected non-negligible systematic differences between SR and PE data which primarily affect putative internally edited isomiRs, and at a much smaller frequency terminal length changing isomiRs. This is relevant for the identification of true isomiRs in small RNA sequencing datasets. Conclusions We conclude that potential artifacts derived from sequencing errors and/or data processing could result in an overestimation of abundance and diversity of miRNA isoforms. Efforts in annotating the isomiRnome should take this into account.


2022 ◽  
Author(s):  
Andreas B Diendorfer ◽  
Kseniya Khamina ◽  
Marianne Pultar

miND is an NGS data analysis pipeline for small RNA sequencing data. In this protocol, the pipeline is set up and run on an AWS EC2 instance with example data from a public repository. Please see the publication on F1000 for more details on the pipeline and how to use it.

