WIND (Workflow for pIRNAs aNd beyonD): a strategy for in-depth analysis of small RNA-seq data

F1000Research ◽  
2021 ◽  
Vol 10 ◽  
pp. 1
Author(s):  
Konstantinos Geles ◽  
Domenico Palumbo ◽  
Assunta Sellitto ◽  
Giorgio Giurato ◽  
Eleonora Cianflone ◽  
...  

Current bioinformatics workflows for PIWI-interacting RNA (piRNA) analysis focus primarily on germline-derived piRNAs and piRNA clusters. Frequently, they suffer from outdated piRNA databases, questionable quantification methods, and lack of reproducibility. Often, pipelines designed for miRNA analysis are used for in silico piRNA research. Furthermore, the absence of a well-established database for piRNA annotation, as exists for miRNA, leads to inconsistencies between studies and generates confusion for data analysts and biologists. For these reasons, we have developed WIND (Workflow for pIRNAs aNd beyonD), a bioinformatics workflow that addresses the crucial issue of piRNA annotation, thereby allowing a reliable analysis of small RNA sequencing data for the identification of piRNAs and other small non-coding RNAs (sncRNAs) that in the past have been incorrectly classified as piRNAs. WIND allows the creation of a comprehensive annotation track of sncRNAs by combining information available in RNAcentral with piRNA sequences from piRNABank, the first database dedicated to piRNA annotation. WIND was built with Docker containers for reproducibility and integrates widely used bioinformatics tools for sequence alignment and quantification. In addition, it includes Bioconductor packages for exploratory data analysis and differential expression analysis. Moreover, WIND implements a "dual" approach for the evaluation of sncRNA expression levels: quantifying the reads aligned to the annotated genome and carrying out an alignment-free transcript quantification using reads mapped to the transcriptome. Therefore, a broader range of piRNAs can be annotated, improving their quantification and easing the subsequent downstream analysis. WIND performance has been tested with several small RNA-seq datasets, demonstrating how our approach can be a useful and comprehensive resource to analyse piRNAs and other classes of sncRNAs.

2020 ◽  
Vol 21 (5) ◽  
pp. 1754 ◽  
Author(s):  
Enrico Gaffo ◽  
Michele Bortolomeazzi ◽  
Andrea Bisognin ◽  
Piero Di Battista ◽  
Federica Lovisa ◽  
...  

MicroRNA-offset RNAs (moRNAs) are microRNA-like small RNAs generated by microRNA precursors. To date, little is known about moRNAs and bioinformatics tools to inspect their expression are still missing. We developed miR&moRe2, the first bioinformatics method to consistently characterize microRNAs, moRNAs, and their isoforms from small RNA sequencing data. To illustrate miR&moRe2 discovery power, we applied it to several published datasets. MoRNAs identified by miR&moRe2 were in agreement with previous research findings. Moreover, we observed that moRNAs and new microRNAs predicted by miR&moRe2 were downregulated upon the silencing of the microRNA-biogenesis pathway. Further, in a sizeable dataset of human blood cell populations, tens of novel miRNAs and moRNAs were discovered, some of them with significantly varied expression levels among the cell types. Results demonstrate that miR&moRe2 is a valid tool for a comprehensive study of small RNAs generated from microRNA precursors and could help to investigate their biogenesis and function.


2019 ◽  
Vol 35 (22) ◽  
pp. 4834-4836
Author(s):  
Tim Jeske ◽  
Peter Huypens ◽  
Laura Stirm ◽  
Selina Höckele ◽  
Christine M Wurmser ◽  
...  

Abstract Summary: Despite the fundamental role of small RNAs in various biological processes, the analysis of small RNA sequencing data remains a challenging task. Major obstacles arise when short RNA sequences map to multiple locations in the genome, align to regions that are not annotated, or have undergone post-transcriptional changes that hamper accurate mapping. In order to tackle these issues, we present a novel profiling strategy that circumvents the need for read mapping to a reference genome by utilizing the actual read sequences to determine expression intensities. After differential expression analysis of individual sequence counts, significant sequences are annotated against user-defined feature databases and clustered by sequence similarity. This strategy enables a more comprehensive and concise representation of small RNA populations without any data loss or data distortion. Availability and implementation: Code and documentation of our R package at http://ibis.helmholtz-muenchen.de/deus/. Supplementary information: Supplementary data are available at Bioinformatics online.
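The mapping-free idea described above (using the read sequences themselves as features) can be illustrated with a minimal Python sketch. This is not the authors' R package; the function names `count_sequences` and `build_count_matrix` and the toy reads are assumptions made for illustration only. It shows just the first step, building a per-sequence count table across samples, which could then be passed to a differential expression tool.

```python
from collections import Counter


def count_sequences(reads):
    """Count how often each distinct read sequence occurs in one sample."""
    return Counter(reads)


def build_count_matrix(samples):
    """Combine per-sample counts into {sequence: [count in each sample]}.

    `samples` maps a sample name to a list of read sequences.
    Sample columns follow the sorted sample names returned alongside.
    """
    names = sorted(samples)
    per_sample = {name: count_sequences(samples[name]) for name in names}
    # Every sequence seen in any sample becomes one row of the matrix.
    sequences = sorted(set().union(*per_sample.values()))
    matrix = {seq: [per_sample[name][seq] for name in names] for seq in sequences}
    return names, matrix


# Toy input (hypothetical reads, not real data).
samples = {
    "control": ["TGAGGTAGTAGGTTGTATAGTT", "TGAGGTAGTAGGTTGTATAGTT", "ACGTACGT"],
    "treated": ["TGAGGTAGTAGGTTGTATAGTT", "ACGTACGT", "ACGTACGT", "GGGTTT"],
}
names, matrix = build_count_matrix(samples)
for seq, counts in matrix.items():
    print(seq, dict(zip(names, counts)))
```

Because no genome is involved, multi-mapping and unannotated reads are kept as ordinary rows; annotation happens only afterwards, for the sequences that turn out to be significant.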


2020 ◽  
Vol 2 (3) ◽  
Author(s):  
Stuart Lee ◽  
Albert Y Zhang ◽  
Shian Su ◽  
Ashley P Ng ◽  
Aliaksei Z Holik ◽  
...  

Abstract RNA-seq datasets can contain millions of intron reads per library that are typically removed from downstream analysis. Only reads overlapping annotated exons are considered to be informative since mature mRNA is assumed to be the major component sequenced, especially for poly(A) RNA libraries. In this study, we show that intron reads are informative and, through exploratory data analysis of read coverage, that intron signal is representative of both pre-mRNAs and intron retention. We demonstrate how intron reads can be utilized in differential expression analysis using our index method, where a unique set of differentially expressed genes can be detected using intron counts. In exploring read coverage, we also developed the superintronic software that quickly and robustly calculates user-defined summary statistics for exonic and intronic regions. Across multiple datasets, superintronic enabled us to identify several genes with distinctly retained introns that had similar coverage levels to that of neighbouring exons. The work and ideas presented in this paper are the first of their kind to consider multiple biological sources for intron reads through exploratory data analysis, minimizing bias in discovery and interpretation of results. Our findings open up possibilities for further methods development for intron reads and RNA-seq data in general.


2021 ◽  
Author(s):  
Tobias Fehlmann ◽  
Fabian Kern ◽  
Omar Laham ◽  
Christina Backes ◽  
Jeffrey Solomon ◽  
...  

Abstract Analyzing all features of small non-coding RNA sequencing data can be demanding and challenging. To facilitate this process, we developed miRMaster. After the analysis of over 125 000 human samples and 1.5 trillion human small RNA reads over 4 years, we present miRMaster 2 with a wide range of updates and new features. We extended our reference data sets so that miRMaster 2 now supports the analysis of eight species (e.g. human, mouse, chicken, dog, cow) and 10 non-coding RNA classes (e.g. microRNAs, piRNAs, tRNAs, rRNAs, circRNAs). We also incorporated new downstream analysis modules such as batch effect analysis or sample embeddings using UMAP, and updated the annotation databases included by default (miRBase, Ensembl, GtRNAdb). To accommodate the increasing popularity of single cell small-RNA sequencing data, we incorporated a module for unique molecular identifier (UMI) processing. Further, the output tables and graphics have been improved based on user feedback, and new output formats that emerged in the community are now supported (e.g. miRGFF3). Finally, we integrated differential expression analysis with the miRNA enrichment analysis tool miEAA. miRMaster is freely available at https://www.ccb.uni-saarland.de/mirmaster2.
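As a rough illustration of what UMI processing involves, the Python sketch below collapses reads that share both the same UMI and the same insert sequence, so that PCR duplicates are counted once. It is not miRMaster 2's implementation: the function name `collapse_umis`, the `(umi, sequence)` input layout, and the toy reads are assumptions, and the sketch deliberately omits UMI error correction.

```python
from collections import Counter


def collapse_umis(reads):
    """Return the number of distinct UMIs observed for each sequence.

    `reads` is an iterable of (umi, sequence) pairs. Identical pairs are
    treated as PCR duplicates of one original molecule and counted once.
    """
    unique_pairs = set(reads)
    return Counter(sequence for _umi, sequence in unique_pairs)


# Toy input (hypothetical UMIs and reads).
reads = [
    ("AACG", "TGAGGTAGTAGGTTGTATAGTT"),
    ("AACG", "TGAGGTAGTAGGTTGTATAGTT"),  # duplicate of the read above
    ("TTGC", "TGAGGTAGTAGGTTGTATAGTT"),  # same sequence, different molecule
    ("AACG", "ACGTACGT"),
]
counts = collapse_umis(reads)
print(counts)
```

Here four raw reads reduce to three molecules: two for the first sequence and one for the second.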


F1000Research ◽  
2016 ◽  
Vol 5 ◽  
pp. 721 ◽  
Author(s):  
Amanda Auerbach ◽  
Gopi Vyas ◽  
Anne Li ◽  
Marc Halushka ◽  
Kenneth W. Witwer

Breast milk is replete with nutritional content as well as nucleic acids including microRNAs (miRNAs). In a recent report, adult humans who drank bovine milk appeared to have increased circulating levels of miRNAs miR-29b-3p and miR-200c-3p. Since these miRNAs are homologous between human and cow, these results could be explained by xeno-miRNA influx, endogenous miRNA regulation, or both. More data were needed to validate the results and explore for additional milk-related alterations in circulating miRNAs. Samples from the published study were obtained, and 223 small RNA features were profiled with a custom OpenArray, followed by individual quantitative PCR assays for selected miRNAs. Additionally, small RNA sequencing (RNA-seq) data obtained from plasma samples of the same project were analyzed to find human and uniquely bovine miRNAs. OpenArray revealed no significantly altered miRNA signals after milk ingestion, and this was confirmed by qPCR. Plasma sequencing data contained no miR-29b or miR-200c reads and no intake-consistent mapping of uniquely bovine miRNAs. In conclusion, the results do not support transfer of dietary xenomiRs into the circulation of adult humans.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Matthew Chung ◽  
Vincent M. Bruno ◽  
David A. Rasko ◽  
Christina A. Cuomo ◽  
José F. Muñoz ◽  
...  

Abstract Advances in transcriptome sequencing allow for simultaneous interrogation of differentially expressed genes from multiple species originating from a single RNA sample, termed dual or multi-species transcriptomics. Compared to single-species differential expression analysis, the design of multi-species differential expression experiments must account for the relative abundances of each organism of interest within the sample, often requiring enrichment methods and yielding differences in total read counts across samples. The analysis of multi-species transcriptomics datasets requires modifications to the alignment, quantification, and downstream analysis steps compared to the single-species analysis pipelines. We describe best practices for multi-species transcriptomics and differential gene expression.


Plants ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 267
Author(s):  
Axel J. Giudicatti ◽  
Ariel H. Tomassi ◽  
Pablo A. Manavella ◽  
Agustin L. Arce

MicroRNAs are small regulatory RNAs involved in several processes in plants ranging from development and stress responses to defense against pathogens. In order to accomplish their molecular functions, miRNAs are methylated and loaded into one ARGONAUTE (AGO) protein, commonly known as AGO1, to stabilize and protect the molecule and to assemble a functional RNA-induced silencing complex (RISC). A specific machinery controls miRNA turnover to ensure the silencing release of targeted-genes in given circumstances. The trimming and tailing of miRNAs are fundamental modifications related to their turnover and, hence, to their action. In order to gain a better understanding of these modifications, we analyzed Arabidopsis thaliana small RNA sequencing data from a diversity of mutants, related to miRNA biogenesis, action, and turnover, and from different cellular fractions and immunoprecipitations. Besides confirming the effects of known players in these pathways, we found increased trimming and tailing in miRNA biogenesis mutants. More importantly, our analysis allowed us to reveal the importance of ARGONAUTE 1 (AGO1) loading, slicing activity, and cellular localization in trimming and tailing of miRNAs.
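To make the terms concrete, the Python sketch below scores a single read for 3' trimming (templated nucleotides missing from the end) and 3' tailing (extra nucleotides added after the last templated position) relative to a reference mature miRNA. It is a simplified illustration rather than the authors' pipeline: the function name `trimming_and_tailing` and the example sequence are assumptions, and the read is assumed to share the reference's 5' start.

```python
def trimming_and_tailing(reference, read):
    """Return (trimmed_nt, tail) for a read relative to a reference miRNA.

    trimmed_nt: reference nucleotides missing from the 3' end of the read.
    tail: read nucleotides following the longest shared 5' prefix.
    Assumes both sequences start at the same 5' position.
    """
    shared = 0
    for ref_nt, read_nt in zip(reference, read):
        if ref_nt != read_nt:
            break
        shared += 1
    trimmed_nt = len(reference) - shared
    tail = read[shared:]
    return trimmed_nt, tail


# Illustrative 21-nt reference sequence (DNA alphabet, as in sequencing reads).
reference = "TCGGACCAGGCTTCATTCCCC"

print(trimming_and_tailing(reference, reference))              # intact read
print(trimming_and_tailing(reference, reference[:20]))         # trimmed by 1 nt
print(trimming_and_tailing(reference, reference + "T"))        # tailed with T
print(trimming_and_tailing(reference, reference[:19] + "TT"))  # trimmed and tailed
```

One limitation of this prefix-based approach is that an added nucleotide which happens to match the template is indistinguishable from an untrimmed position, so tailing is slightly undercounted.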

