scholarly journals pulseTD: RNA life cycle dynamics analysis based on pulse model of 4sU-seq time course sequencing data

PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e9371
Author(s):  
Xin Wang ◽  
Siyu He ◽  
Jian Li ◽  
Jun Wang ◽  
Chengyi Wang ◽  
...  

The life cycle of intracellular RNA mainly involves transcriptional production, splicing maturation and degradation processes. Their dynamic changes are termed as RNA life cycle dynamics (RLCD). It is still challenging for the accurate and robust identification of RLCD under unknow the functional form of RLCD. By using the pulse model, we developed an R package named pulseTD to identify RLCD by integrating 4sU-seq and RNA-seq data, and it provides flexible functions to capture continuous changes in RCLD rates. More importantly, it also can predict the trend of RNA transcription and expression changes in future time points. The pulseTD shows better accuracy and robustness than some other methods, and it is available on the GitHub repository (https://github.com/bioWzz/pulseTD_0.2.0).


2020 ◽  
Author(s):  
Maxim Ivanov ◽  
Albin Sandelin ◽  
Sebastian Marquardt

Abstract Background: The quality of gene annotation determines the interpretation of results obtained in transcriptomic studies. The growing number of genome sequence information calls for experimental and computational pipelines for de novo transcriptome annotation. Ideally, gene and transcript models should be called from a limited set of key experimental data. Results: We developed TranscriptomeReconstructoR, an R package which implements a pipeline for automated transcriptome annotation. It relies on integrating features from independent and complementary datasets: i) full-length RNA-seq for detection of splicing patterns and ii) high-throughput 5' and 3' tag sequencing data for accurate definition of gene borders. The pipeline can also take a nascent RNA-seq dataset to supplement the called gene model with transient transcripts.We reconstructed de novo the transcriptional landscape of wild type Arabidopsis thaliana seedlings as a proof-of-principle. A comparison to the existing transcriptome annotations revealed that our gene model is more accurate and comprehensive than the two most commonly used community gene models, TAIR10 and Araport11. In particular, we identify thousands of transient transcripts missing from the existing annotations. Our new annotation promises to improve the quality of A.thaliana genome research.Conclusions: Our proof-of-concept data suggest a cost-efficient strategy for rapid and accurate annotation of complex eukaryotic transcriptomes. We combine the choice of library preparation methods and sequencing platforms with the dedicated computational pipeline implemented in the TranscriptomeReconstructoR package. The pipeline only requires prior knowledge on the reference genomic DNA sequence, but not the transcriptome. The package seamlessly integrates with Bioconductor packages for downstream analysis.



2021 ◽  
Author(s):  
Daniel Osorio ◽  
Marieke Lydia Kuijjer ◽  
James J. Cai

Motivation: Characterizing cells with rare molecular phenotypes is one of the promises of high throughput single-cell RNA sequencing (scRNA-seq) techniques. However, collecting enough cells with the desired molecular phenotype in a single experiment is challenging, requiring several samples preprocessing steps to filter and collect the desired cells experimentally before sequencing. Data integration of multiple public single-cell experiments stands as a solution for this problem, allowing the collection of enough cells exhibiting the desired molecular signatures. By increasing the sample size of the desired cell type, this approach enables a robust cell type transcriptome characterization. Results: Here, we introduce rPanglaoDB, an R package to download and merge the uniformly processed and annotated scRNA-seq data provided by the PanglaoDB database. To show the potential of rPanglaoDB for collecting rare cell types by integrating multiple public datasets, we present a biological application collecting and characterizing a set of 157 fibrocytes. Fibrocytes are a rare monocyte-derived cell type, that exhibits both the inflammatory features of macrophages and the tissue remodeling properties of fibroblasts. This constitutes the first fibrocytes' unbiased transcriptome profile report. We compared the transcriptomic profile of the fibrocytes against the fibroblasts collected from the same tissue samples and confirm their associated relationship with healing processes in tissue damage and infection through the activation of the prostaglandin biosynthesis and regulation pathway. Availability and Implementation: rPanglaoDB is implemented as an R package available through the CRAN repositories https://CRAN.R-project.org/package=rPanglaoDB.



2018 ◽  
Author(s):  
Fatemeh Gholizadeh ◽  
Zahra Salehi ◽  
Ali Mohammad banaei-Moghaddam ◽  
Abbas Rahimi Foroushani ◽  
Kaveh kavousi

AbstractWith the advent of the Next Generation Sequencing technologies, RNA-seq has become known as an optimal approach for studying gene expression profiling. Particularly, time course RNA-seq differential expression analysis has been used in many studies to identify candidate genes. However, applying a statistical method to efficiently identify differentially expressed genes (DEGs) in time course studies is challenging due to inherent characteristics of such data including correlation and dependencies over time. Here we aim to relatively compare EBSeq-HMM, a Hidden Markov-based model, with multiDE, a Log-Linear-based model, in a real time course RNA sequencing data. In order to conduct the comparison, common DEGs detected by edgeR, DESeq2 and Voom (referred to as Benchmark DEGs) were utilized as a measure. Each of the two models were compared using different normalization methods. The findings revealed that multiDE identified more Benchmark DEGs and showed a higher agreement with them than EBSeq-HMM. Furthermore, multiDE and EBSeq-HMM displayed their best performance using TMM and Upper-Quartile normalization methods, respectively.



2020 ◽  
Author(s):  
Maxim Ivanov ◽  
Albin Sandelin ◽  
Sebastian Marquardt

AbstractBackgroundThe quality of gene annotation determines the interpretation of results obtained in transcriptomic studies. The growing number of genome sequence information calls for experimental and computational pipelines for de novo transcriptome annotation. Ideally, gene and transcript models should be called from a limited set of key experimental data.ResultsWe developed TranscriptomeReconstructoR, an R package which implements a pipeline for automated transcriptome annotation. It relies on integrating features from independent and complementary datasets: i) full-length RNA-seq for detection of splicing patterns and ii) high-throughput 5’ and 3’ tag sequencing data for accurate definition of gene borders. The pipeline can also take a nascent RNA-seq dataset to supplement the called gene model with transient transcripts.We reconstructed de novo the transcriptional landscape of wild type Arabidopsis thaliana seedlings as a proof-of-principle. A comparison to the existing transcriptome annotations revealed that our gene model is more accurate and comprehensive than the two most commonly used community gene models, TAIR10 and Araport11. In particular, we identify thousands of transient transcripts missing from the existing annotations. Our new annotation promises to improve the quality of A.thaliana genome research.ConclusionsOur proof-of-concept data suggest a cost-efficient strategy for rapid and accurate annotation of complex eukaryotic transcriptomes. We combine the choice of library preparation methods and sequencing platforms with the dedicated computational pipeline implemented in the TranscriptomeReconstructoR package. The pipeline only requires prior knowledge on the reference genomic DNA sequence, but not the transcriptome. The package seamlessly integrates with Bioconductor packages for downstream analysis.



2020 ◽  
Vol 36 (10) ◽  
pp. 3115-3123 ◽  
Author(s):  
Teng Fei ◽  
Tianwei Yu

Abstract Motivation Batch effect is a frequent challenge in deep sequencing data analysis that can lead to misleading conclusions. Existing methods do not correct batch effects satisfactorily, especially with single-cell RNA sequencing (RNA-seq) data. Results We present scBatch, a numerical algorithm for batch-effect correction on bulk and single-cell RNA-seq data with emphasis on improving both clustering and gene differential expression analysis. scBatch is not restricted by assumptions on the mechanism of batch-effect generation. As shown in simulations and real data analyses, scBatch outperforms benchmark batch-effect correction methods. Availability and implementation The R package is available at github.com/tengfei-emory/scBatch. The code to generate results and figures in this article is available at github.com/tengfei-emory/scBatch-paper-scripts. Supplementary information Supplementary data are available at Bioinformatics online.



2016 ◽  
Author(s):  
Ning Leng ◽  
Li-Fang Chu ◽  
Jeea Choi ◽  
Christina Kendziorski ◽  
James A. Thomson ◽  
...  

AbstractMotivationWith the development of single cell RNA-seq (scRNA-seq) technology, scRNA-seq experiments with ordered conditions (e.g. time-course) are becoming common. Methods developed for analyzing ordered bulk RNA-seq experiments are not applicable to scRNA-seq, since their distributional assumptions are often violated by additional heterogeneities prevalent in scRNA-seq. Here we present SC-Pattern - an empirical Bayes model to characterize genes with expression changes in ordered scRNA-seq experiments. SCPattern utilizes the non-parametrical Kolmogorov-Smirnov statistic, thus it has the flexibility to identify genes with a wide variety of types of changes. Additionally, the Bayes framework allows SCPattern to classify genes into expression patterns with probability estimates.ResultsSimulation results show that SCPattern is well powered for identifying genes with expression changes while the false discovery rate is well controlled. SCPattern is also able to accurately classify these dynamic genes into directional expression patterns. Applied to a scRNA-seq time course dataset studying human embryonic cell differentiation, SCPattern detected a group of important genes that are involved in mesendoderm and definitive endoderm cell fate decisions, positional patterning, and cell cycle.Availability and ImplementationThe SCPattern is implemented as an R package along with a user-friendly graphical interface, which are available at:https://github.com/lengning/SCPatternContact:[email protected]



2019 ◽  
Author(s):  
Kuan-Hao Chao ◽  
Yi-Wen Hsiao ◽  
Yi-Fang Lee ◽  
Chien-Yueh Lee ◽  
Liang-Chuan Lai ◽  
...  

RNA-Seq analysis has revolutionized researchers' understanding of the transcriptome in biological research. Assessing the differences in transcriptomic profiles between tissue samples or patient groups enables researchers to explore the underlying biological impact of transcription. RNA-Seq analysis requires multiple processing steps and huge computational capabilities. There are many well-developed R packages for individual steps; however, there are few R/Bioconductor packages that integrate existing software tools into a comprehensive RNA-Seq analysis and provide fundamental end-to-end results in pure R environment so that researchers can quickly and easily get fundamental information in big sequencing data. To address this need, we have developed the open source R/Bioconductor package, RNASeqR. It allows users to run an automated RNA-Seq analysis with only six steps, producing essential tabular and graphical results for further biological interpretation. The features of RNASeqR include: six-step analysis, comprehensive visualization, background execution version, and the integration of both R and command-line software. RNASeqR provides fast, light-weight, and easy-to-run RNA-Seq analysis pipeline in pure R environment. It allows users to efficiently utilize popular software tools, including both R/Bioconductor and command-line tools, without predefining the resources or environments. RNASeqR is freely available for Linux and macOS operating systems from Bioconductor (https://bioconductor.org/packages/release/bioc/html/RNASeqR.html).



2018 ◽  
Author(s):  
João Antonio Debarba ◽  
Karina Mariante Monteiro ◽  
Alexandra Lehmkuhl Gerber ◽  
Ana Tereza Ribeiro de Vasconcelos ◽  
Arnaldo Zaha

AbstractBackgroundEchinococcus granulosus has a complex life cycle involving two mammalian hosts. The transition from one host to another is accompanied by changes in gene expression, and the transcriptional events that underlie these processes have not yet been fully characterized.ResultsIn this study, RNA-seq is used to compare the transcription profiles of four time samples of E. granulosus protoscoleces in vitro induced to strobilar development. We identified 818 differentially expressed genes, which were divided into eight expression clusters formed over the entire 24 hours time course and indicated different transcriptional patterns. An enrichment of gene transcripts with molecular functions of signal transduction, enzymes and protein modifications was observed with progression of development.ConclusionThis transcriptomic study provides insight for understanding the complex life cycle of E. granulosus and contributes for searching for the key genes correlating with the strobilar development, providing interesting hints for further studies.



2021 ◽  
Author(s):  
Federico Agostinis ◽  
Chiara Romualdi ◽  
Gabriele Sales ◽  
Davide Risso

Summary: We present NewWave, a scalable R/Bioconductor package for the dimensionality reduction and batch effect removal of single-cell RNA sequencing data. To achieve scalability, NewWave uses mini-batch optimization and can work with out-of-memory data, enabling users to analyze datasets with millions of cells. Availability and implementation: NewWave is implemented as an open-source R package available through the Bioconductor project at https://bioconductor.org/packages/NewWave/ Supplementary information: Supplementary data are available at Bioinformatics online.



Sign in / Sign up

Export Citation Format

Share Document