Limitation of alignment-free tools in total RNA-seq quantification

2018 ◽  
Author(s):  
Douglas C. Wu ◽  
Jun Yao ◽  
Kevin S. Ho ◽  
Alan M. Lambowitz ◽  
Claus O. Wilke

Abstract Background Alignment-free RNA quantification tools have significantly increased the speed of RNA-seq analysis. However, it is unclear whether these state-of-the-art RNA-seq analysis pipelines can quantify small RNAs as accurately as they quantify long RNAs in the context of total RNA quantification. Results We comprehensively tested and compared four RNA-seq pipelines on the accuracy of gene quantification and fold-change estimation using a novel total RNA benchmarking dataset, in which small non-coding RNAs are highly represented along with other long RNAs. The four pipelines comprised two commonly used alignment-free pipelines and two variants of alignment-based pipelines. We found that all pipelines showed high accuracy in quantifying the expression of long and highly abundant genes. However, alignment-free pipelines showed systematically poorer performance in quantifying lowly abundant and small RNAs. Conclusion We have shown that alignment-free and traditional alignment-based quantification methods performed similarly for common gene targets, such as protein-coding genes. However, we identified a potential pitfall in analyzing and quantifying lowly expressed genes and small RNAs with alignment-free pipelines, especially when these small RNAs contain mutations.

2020 ◽  
Vol 6 (2) ◽  
pp. 15 ◽  
Author(s):  
Lucas Maciel ◽  
David Morales-Vicente ◽  
Sergio Verjovski-Almeida

Schistosoma japonicum is a flatworm that causes schistosomiasis, a neglected tropical disease. S. japonicum RNA-Seq analyses have previously been reported for females and males obtained during sexual maturation, from 14 to 28 days post-infection in mice, resulting in the identification of protein-coding genes and pathways whose expression levels were related to sexual development. However, this work did not include an analysis of long non-coding RNAs (lncRNAs). Here, we applied a pipeline to identify and annotate lncRNAs in 66 publicly available S. japonicum RNA-Seq libraries from different life-cycle stages. We also performed co-expression analyses to find stage-specific lncRNAs possibly related to sexual maturation. We identified 12,291 expressed S. japonicum lncRNAs. Sequence similarity search and synteny conservation analysis indicated that about 14% of S. japonicum intergenic lncRNAs have synteny conservation with S. mansoni intergenic lncRNAs. Co-expression analyses showed that lncRNAs and protein-coding genes in S. japonicum males and females have a dynamic co-expression throughout sexual maturation, showing differential expression between the sexes; the protein-coding genes were related to nervous system development, lipid and drug metabolism, and overall parasite survival. The co-expression pattern suggests that lncRNAs possibly regulate these processes or are regulated by the same activation program as the protein-coding genes.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Frédéric Jehl ◽  
Kévin Muret ◽  
Maria Bernard ◽  
Morgane Boutin ◽  
Laetitia Lagoutte ◽  
...  

Abstract Long non-coding RNAs (LNC) regulate numerous biological processes. In contrast to human, the identification of LNC in farm species, such as chicken, is still incomplete. We propose a catalogue of 52,075 chicken genes enriched in LNC (http://www.fragencode.org/), built from the Ensembl reference extended using novel LNC modelled here from 364 RNA-seq samples and LNC from four public databases. The Ensembl reference grew from 4,643 to 30,084 LNC, of which 59% and 41% had expression ≥ 0.5 and ≥ 1 TPM, respectively. Characterization of these LNC relative to the closest protein-coding genes (PCG) revealed that 79% of LNC are in intergenic regions, as in other species. Expression analysis across 25 tissues revealed an enrichment of co-expressed LNC:PCG pairs, suggesting co-regulation and/or co-function. As expected, LNC were more tissue-specific than PCG (25% vs. 10%). As in human, 16% of chicken LNC hosted one or more miRNA. We highlighted a new chicken LNC, hosting miR155, that is conserved in human, highly expressed in immune tissues like miR155, and correlated with immunity-related PCG in both species. Among LNC:PCG pairs that are tissue-specific in the same tissue, we revealed an enrichment of divergent pairs in which the PCG encodes a transcription factor, for example LHX5, HXD3 and TBX4, in both human and chicken.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Martin Bilbao-Arribas ◽  
Endika Varela-Martínez ◽  
Naiara Abendaño ◽  
Damián de Andrés ◽  
Lluís Luján ◽  
...  

Abstract Background Long non-coding RNAs (lncRNAs) are involved in several immune processes, including the immune response to vaccination, but most of them remain uncharacterised in livestock species. The mechanism of action of aluminium adjuvants as vaccine components is also not fully understood. Results We built a transcriptome from sheep PBMC RNA-seq data in order to identify unannotated lncRNAs and analysed their expression patterns alongside protein-coding genes. We found 2284 novel lncRNAs and assessed their conservation in terms of sequence and synteny. Differential expression analysis between animals inoculated with commercial vaccines or aluminium adjuvant alone, together with co-expression analysis, revealed lncRNAs related to the immune response to vaccines and adjuvants. A group of co-expressed genes enriched in cytokine signalling and production highlighted the differences between treatments. A number of differentially expressed lncRNAs were correlated with a divergently located protein-coding gene, such as the OSM cytokine. Other lncRNAs were predicted to act as sponges of miRNAs involved in immune response regulation. Conclusions This work enlarges the lncRNA catalogue in sheep and emphasises their involvement in the immune response to repetitive vaccination, providing a basis for further characterisation of the non-coding sheep transcriptome within different immune cells.


2011 ◽  
Vol 18 (9) ◽  
pp. 1075-1082 ◽  
Author(s):  
Eivind Valen ◽  
Pascal Preker ◽  
Peter Refsing Andersen ◽  
Xiaobei Zhao ◽  
Yun Chen ◽  
...  

2021 ◽  
Vol 35 (S1) ◽  
Author(s):  
Hilary Coller ◽  
Huiling Huang ◽  
Mithun Mitra ◽  
Kaiser Atai ◽  
Kirthana Sarathy

2015 ◽  
Vol 12 (5) ◽  
pp. 6568-6576 ◽  
Author(s):  
QI LIAO ◽  
YUNLIANG WANG ◽  
JIA CHENG ◽  
DONGJUN DAI ◽  
XINGYU ZHOU ◽  
...  

Burns ◽  
2020 ◽  
Vol 46 (5) ◽  
pp. 1128-1135 ◽  
Author(s):  
Wenchang Yu ◽  
Zaiwen Guo ◽  
Pengfei Liang ◽  
Bimei Jiang ◽  
Le Guo ◽  
...  

2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Mikhail Pomaznoy ◽  
Ashu Sethi ◽  
Jason Greenbaum ◽  
Bjoern Peters

Abstract RNA-seq methods are widely utilized for transcriptomic profiling of biological samples. However, there are known caveats of this technology that can skew gene expression estimates. Specifically, if the library preparation protocol does not retain RNA strand information, some genes can be erroneously quantitated. Although strand-specific protocols have been established, a significant portion of RNA-seq data is generated in a non-strand-specific manner. We used a comprehensive stranded RNA-seq dataset of 15 blood cell types to identify genes for which expression would be erroneously estimated if strand information were not available. We found that about 10% of all genes and 2.5% of protein-coding genes have a two-fold or higher difference in estimated expression when the strand information of the reads is ignored. We used parameters of the read alignments of these genes to construct a machine learning model that can identify which genes in an unstranded dataset might have incorrect expression estimates and which do not. We also show that differential expression analysis of genes with biased expression estimates in unstranded read data can be recovered by limiting the reads considered to those that span exonic boundaries. The resulting approach is implemented as a package available at https://github.com/mikpom/uslcount.
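The two-fold criterion mentioned in the abstract above can be illustrated with a minimal sketch. This is not the uslcount package or the authors' method; the function name `flag_strand_biased`, the pseudocount, and the toy expression values are all hypothetical and chosen only to show how genes whose estimates differ at least two-fold between stranded and unstranded counting might be flagged.

```python
import math


def flag_strand_biased(stranded, unstranded, min_fold=2.0, pseudocount=1.0):
    """Return sorted gene IDs whose expression estimates differ by at least
    `min_fold` (in either direction) between the two counting modes.

    `stranded` and `unstranded` map gene ID -> expression estimate.
    The pseudocount (an assumption, not from the paper) avoids division by zero.
    """
    threshold = math.log2(min_fold)
    flagged = []
    for gene, s in stranded.items():
        u = unstranded.get(gene)
        if u is None:
            continue  # gene missing from the unstranded table; skip it
        log2_fc = math.log2((u + pseudocount) / (s + pseudocount))
        if abs(log2_fc) >= threshold:
            flagged.append(gene)
    return sorted(flagged)


# Toy values, invented for illustration only.
stranded = {"geneA": 100.0, "geneB": 10.0, "geneC": 50.0}
unstranded = {"geneA": 110.0, "geneB": 45.0, "geneC": 20.0}
print(flag_strand_biased(stranded, unstranded))
```

With these toy values, geneA changes by about 10% and is not flagged, while geneB and geneC differ by more than two-fold in opposite directions and are both reported.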

