Deconvolution of Expression for Nascent RNA Sequencing Data (DENR) Highlights Pre-RNA Isoform Diversity in Human Cells

2021 ◽  
Author(s):  
Yixin Zhao ◽  
Noah Dukler ◽  
Gilad Barshad ◽  
Shushan Toneyan ◽  
Charles G. Danko ◽  
...  

Abstract Quantification of mature-RNA isoform abundance from RNA-seq data has been extensively studied, but much less attention has been devoted to quantifying the abundance of distinct precursor RNAs based on nascent RNA sequencing data. Here we address this problem with a new computational method called Deconvolution of Expression for Nascent RNA sequencing data (DENR). DENR models the nascent RNA read counts at each locus as a mixture of user-provided isoforms. The performance of the baseline algorithm is enhanced by the use of machine-learning predictions of transcription start sites (TSSs) and an adjustment for the typical “shape profile” of read counts along a transcription unit. We show using simulated data that DENR clearly outperforms simple read-count-based methods for estimating the abundances of both whole genes and isoforms. By applying DENR to previously published PRO-seq data from K562 and CD4+ T cells, we find that transcription of multiple isoforms per gene is widespread, and the dominant isoform frequently makes use of an internal TSS. We also identify more than 200 genes whose dominant isoforms make use of different TSSs in these two cell types. Finally, we apply DENR and StringTie to newly generated PRO-seq and RNA-seq data, respectively, for human CD4+ T cells and CD14+ monocytes, and show that entropy at the pre-RNA level makes a disproportionate contribution to overall isoform diversity, especially across cell types. Altogether, DENR is the first computational tool to enable abundance quantification of pre-RNA isoforms based on nascent RNA sequencing data, and it reveals high levels of pre-RNA isoform diversity in human cells.
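The idea of modelling read counts at a locus as a mixture of isoforms can be illustrated with a toy non-negative least-squares fit. The sketch below is not DENR's implementation; the function name `deconvolve`, the binned design matrix, and the projected-gradient solver are all assumptions made for illustration. Each column of the design matrix marks the bins covered by one hypothetical isoform, and the fit recovers one abundance per isoform.

```python
def deconvolve(design, counts, steps=2000, learning_rate=0.1):
    """Fit non-negative isoform abundances by projected gradient descent.

    design[b][j] is 1 if hypothetical isoform j covers bin b, else 0.
    counts[b] is the observed read count in bin b.
    Minimises 0.5 * ||design @ abundances - counts||^2 subject to abundances >= 0.
    """
    n_isoforms = len(design[0])
    abundances = [0.0] * n_isoforms
    for _ in range(steps):
        # Residual of the current mixture prediction in every bin.
        residuals = [
            sum(row[j] * abundances[j] for j in range(n_isoforms)) - observed
            for row, observed in zip(design, counts)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        abundances = [
            max(0.0, abundances[j]
                - learning_rate * sum(row[j] * r for row, r in zip(design, residuals)))
            for j in range(n_isoforms)
        ]
    return abundances


# Toy locus with six bins: isoform 0 spans every bin (upstream TSS),
# isoform 1 spans only the last three bins (internal TSS).
toy_design = [[1, 0], [1, 0], [1, 0], [1, 1], [1, 1], [1, 1]]
toy_counts = [2, 2, 2, 7, 7, 7]
toy_abundances = deconvolve(toy_design, toy_counts)
```

In this toy case the jump in coverage at the fourth bin is what lets the fit separate the two isoforms; the fixed learning rate is chosen for this small matrix only and would need tuning for other inputs.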

2017 ◽  
Author(s):  
Luke Zappia ◽  
Belinda Phipson ◽  
Alicia Oshlack

Abstract As single-cell RNA sequencing technologies have rapidly developed, so have analysis methods. Many methods have been tested, developed and validated using simulated datasets. Unfortunately, current simulations are often poorly documented, their similarity to real data is not demonstrated, or reproducible code is not available. Here we present the Splatter Bioconductor package for simple, reproducible and well-documented simulation of single-cell RNA-seq data. Splatter provides an interface to multiple simulation methods including Splat, our own simulation, based on a gamma-Poisson distribution. Splat can simulate single populations of cells, populations with multiple cell types or differentiation paths.
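A gamma-Poisson simulation of the kind mentioned above can be sketched in a few lines. This is not Splatter's interface or the Splat model itself; the function names, the default shape and scale values, and the choice of one gamma-distributed mean per gene are assumptions made for illustration, using only the Python standard library.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw one Poisson variate with Knuth's multiplication method.

    Adequate for the modest means used in this sketch; large means would
    call for a different sampler.
    """
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_counts(n_genes, n_cells, shape=2.0, scale=2.5, seed=0):
    """Return a genes x cells count matrix from a gamma-Poisson hierarchy.

    Each gene receives a mean drawn from Gamma(shape, scale); each count is
    then a Poisson draw around that gene's mean.
    """
    rng = random.Random(seed)
    gene_means = [rng.gammavariate(shape, scale) for _ in range(n_genes)]
    return [[sample_poisson(mu, rng) for _ in range(n_cells)] for mu in gene_means]


toy_matrix = simulate_counts(n_genes=5, n_cells=4, seed=42)
```

Fixing the seed makes the simulated matrix reproducible, which is the property the abstract emphasises for simulation studies.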


2021 ◽  
Author(s):  
Andrey S Glotov ◽  
Irina E Zelenkova ◽  
Elena S Vashukova ◽  
Anna R Shuvalova ◽  
Alexandra D Zolotareva ◽  
...  

Objectives: Although high-altitude training has become increasingly popular among endurance athletes, the molecular and cellular bases of this adaptation remain poorly understood. We aimed to define the underlying physiological changes and screen for potential biomarkers of adaptation using transcriptional profiling of whole blood. More generally, we aimed to evaluate the utility of blood RNA sequencing as a modern and sensitive method for monitoring athletes' health. Methods: Seven elite female speed skaters were profiled before and after 1 h of intense exercise on the 18th day of a Live High, Train High (LHTH) training programme. Whole blood RNA sequencing (RNA-seq) with globin depletion was used to measure gene expression changes associated with high-intensity exercise at high altitude. Eight public microarray datasets were used to identify genes uniquely regulated at high altitude. Gene markers derived from single-cell RNA-seq data were used to evaluate the changes of individual cell types in the whole blood. Results: Using individual cell type signatures, we were able to deconvolute the changes of finely defined cell populations from the whole blood RNA-seq. We detected an increase in neutrophils, platelets, erythrocytes, and CD14 monocytes, and a decrease in natural killer cells, CD8 T cells, memory CD4 T cells, B cells, and plasmacytoid dendritic cells. The levels of naive CD4 T cells, CD16 monocytes, and myeloid dendritic cells were unchanged. Leveraging the previously published transcriptomic data allowed us to define the expression signature unique to high-altitude adaptation. Among the identified genes we highlight PHOSPHO1, which has a known role in erythropoiesis, and MARC1, with a proposed role in endogenous NO metabolism. Finally, we find that platelets and, to a lesser extent, erythrocytes are the two major cell types that uniquely respond to altitude exercise, while neutrophils represent a more generic marker of intense exercise.
Conclusions: Using publicly available data from both single-cell RNA-seq atlases and exercise-related blood profiling dramatically increases the value of whole blood RNA-seq for dynamic evaluation of physiological changes in an athlete's body. In addition to the measurement of individual gene expression changes, our approach allowed us to estimate changes in blood cell type counts from a small peripheral blood sample, without cell sorting or other expensive and impractical equipment. We also discuss a surprising parallel between hypoxia and increased thrombosis, and hypothesize about the role exercise can play in COVID-19 outcomes.


Author(s):  
Ploy N. Pratanwanich ◽  
Fei Yao ◽  
Ying Chen ◽  
Casslynn W.Q. Koh ◽  
Christopher Hendra ◽  
...  

Abstract Differences in RNA expression can provide insights into the molecular identity of a cell, pathways involved in human diseases, and variation in RNA levels across patients associated with clinical phenotypes. RNA modifications such as m6A have been found to contribute to molecular functions of RNAs. However, quantification of differences in RNA modifications has been challenging. Here we develop a computational method (xPore) to identify differential RNA modifications from direct RNA sequencing data. We evaluate our method on transcriptome-wide m6A profiling data, demonstrating that xPore identifies positions of m6A sites at single base resolution, estimates the fraction of modified RNAs in the cell, and quantifies the differential modification rate across conditions. We apply the method to direct RNA-Sequencing data from 6 cell lines and find that many m6A sites are preserved, while a subset of m6A sites show significant differences in their modification rates across cell types. Together, we show that RNA modifications can be identified from direct RNA-sequencing with high accuracy, enabling the analysis of differential modifications and expression from a single high throughput experiment. Availability: xPore is available as open source software (https://github.com/GoekeLab/xpore).


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Kolja Becker ◽  
Holger Klein ◽  
Eric Simon ◽  
Coralie Viollet ◽  
Christian Haslinger ◽  
...  

Abstract Diabetic Retinopathy (DR) is among the major global causes of vision loss. With the rise in diabetes prevalence, an increase in DR incidence is expected. Current understanding of both the molecular etiology and pathways involved in the initiation and progression of DR is limited. Via RNA-Sequencing, we analyzed mRNA and miRNA expression profiles of 80 human post-mortem retinal samples from 43 patients diagnosed with various stages of DR. We found differentially expressed transcripts to be predominantly associated with late stage DR and pathways such as hippo and gap junction signaling. A multivariate regression model identified transcripts with progressive changes throughout disease stages, which in turn displayed significant overlap with sphingolipid and cGMP–PKG signaling. Combined analysis of miRNA and mRNA expression further uncovered disease-relevant miRNA/mRNA associations as potential mechanisms of post-transcriptional regulation. Finally, integrating human retinal single cell RNA-Sequencing data revealed a continuous loss of retinal ganglion cells, and Müller cell mediated changes in histidine and β-alanine signaling. While previously considered primarily a vascular disease, attention in DR has shifted to additional mechanisms and cell-types. Our findings offer an unprecedented and unbiased insight into molecular pathways and cell-specific changes in the development of DR, and provide potential avenues for future therapeutic intervention.


Author(s):  
Yinlei Hu ◽  
Bin Li ◽  
Falai Chen ◽  
Kun Qu

Abstract Unsupervised clustering is a fundamental step of single-cell RNA sequencing data analysis. This issue has inspired several clustering methods to classify cells in single-cell RNA sequencing data. However, accurate prediction of the cell clusters remains a substantial challenge. In this study, we propose a new algorithm for single-cell RNA sequencing data clustering based on Sparse Optimization and low-rank matrix factorization (scSO). We applied our scSO algorithm to analyze multiple benchmark datasets and showed that the cluster number predicted by scSO was close to the number of reference cell types and that most cells were correctly classified. Our scSO algorithm is available at https://github.com/QuKunLab/scSO. Overall, this study demonstrates a potent cell clustering approach that can help researchers distinguish cell types in single-cell RNA sequencing data.


2019 ◽  
Author(s):  
Anne-Marie Madore ◽  
Lucile Pain ◽  
Anne-Marie Boucher-Lafleur ◽  
Jolyane Meloche ◽  
Andréanne Morin ◽  
...  

Abstract Background: The 17q12-21 locus is the most replicated association with asthma. However, no study had described the genetic mechanisms underlying this association considering all genes of the locus in immune cell samples isolated from asthmatic and non-asthmatic individuals. Objective: This study takes advantage of samples from naïve CD4+ T cells and eosinophils isolated from the same 200 individuals to describe specific interactions between genetic variants, gene expression and DNA methylation levels for the 17q12-21 asthma locus. Methods and Results: After isolation of naïve CD4+ T cells and eosinophils from blood samples, next generation sequencing was used to measure DNA methylation levels and gene expression counts. Genetic interactions were then evaluated considering genetic variants from imputed genotype data. In naïve CD4+ T cells but not eosinophils, 20 SNPs in the fourth and fifth haplotype blocks modulated both GSDMA expression and methylation levels, showing an opposite pattern of allele frequencies and expression counts in asthmatics compared to controls. Moreover, negative correlations were measured between methylation levels of CpG sites located within the 1.5 kb region from the transcription start site of GSDMA and its expression counts. Conclusion: Availability of sequencing data from two key cell types isolated from asthmatic and non-asthmatic individuals allowed the identification of a new gene in naïve CD4+ T cells that drives the association with the 17q12-21 locus, leading to a better understanding of the genetic mechanisms taking place in it.


2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Michael J. Petrany ◽  
Casey O. Swoboda ◽  
Chengyi Sun ◽  
Kashish Chetal ◽  
Xiaoting Chen ◽  
...  

Abstract While the majority of cells contain a single nucleus, cell types such as trophoblasts, osteoclasts, and skeletal myofibers require multinucleation. One advantage of multinucleation can be the assignment of distinct functions to different nuclei, but comprehensive interrogation of transcriptional heterogeneity within multinucleated tissues has been challenging due to the presence of a shared cytoplasm. Here, we utilized single-nucleus RNA-sequencing (snRNA-seq) to determine the extent of transcriptional diversity within multinucleated skeletal myofibers. Nuclei from mouse skeletal muscle were profiled across the lifespan, which revealed the presence of distinct myonuclear populations emerging in postnatal development as well as aging muscle. Our datasets also provided a platform for discovery of genes associated with rare specialized regions of the muscle cell, including markers of the myotendinous junction and functionally validated factors expressed at the neuromuscular junction. These findings reveal that myonuclei within syncytial muscle fibers possess distinct transcriptional profiles that regulate muscle biology.


2019 ◽  
Vol 21 (5) ◽  
pp. 1581-1595 ◽  
Author(s):  
Xinlei Zhao ◽  
Shuang Wu ◽  
Nan Fang ◽  
Xiao Sun ◽  
Jue Fan

Abstract Single-cell RNA sequencing (scRNA-seq) has been rapidly developing and widely applied in biological and medical research. Identification of cell types in scRNA-seq data sets is an essential step before in-depth investigations of their functional and pathological roles. However, the conventional workflow based on clustering and marker genes is not scalable for an increasingly large number of scRNA-seq data sets due to complicated procedures and manual annotation. Therefore, a number of tools have been developed recently to predict cell types in new data sets using reference data sets. These methods have not been generally adopted due to a lack of tool benchmarking and user guidance. In this article, we performed a comprehensive and impartial evaluation of nine classification software tools specifically designed for scRNA-seq data sets. Results showed that Seurat based on random forest, SingleR based on correlation analysis and CaSTLe based on XGBoost performed better than others. A simple ensemble voting of all tools can improve the predictive accuracy. Under nonideal situations, such as small-sized and class-imbalanced reference data sets, tools based on cluster-level similarities have superior performance. However, even with the function of assigning ‘unassigned’ labels, it is still challenging to catch novel cell types by solely using any of the single-cell classifiers. This article provides a guideline for researchers to select and apply suitable classification tools in their analysis workflows and sheds some light on potential directions for future improvement of classification tools.
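The "simple ensemble voting" mentioned above can be sketched as a per-cell majority vote over the labels returned by several classifiers. The abstract does not specify the voting rule, so the function name `majority_vote`, the tie-handling policy (ties become 'unassigned'), and the tool names in the example are assumptions made for illustration.

```python
from collections import Counter


def majority_vote(predictions, unassigned="unassigned"):
    """Combine per-cell labels from several classifiers by majority vote.

    predictions maps a tool name to its list of predicted labels, one per
    cell, with all lists in the same cell order. A tie for first place is
    reported as `unassigned` (an assumed policy, not taken from the article).
    """
    consensus = []
    for labels in zip(*predictions.values()):
        ranked = Counter(labels).most_common()
        top_label, top_votes = ranked[0]
        if len(ranked) > 1 and ranked[1][1] == top_votes:
            consensus.append(unassigned)
        else:
            consensus.append(top_label)
    return consensus


# Hypothetical output of three tools for three cells.
toy_predictions = {
    "tool_a": ["T cell", "B cell", "NK cell"],
    "tool_b": ["T cell", "B cell", "T cell"],
    "tool_c": ["T cell", "Monocyte", "B cell"],
}
toy_consensus = majority_vote(toy_predictions)
```

Treating ties as unassigned keeps the ensemble from inventing agreement where the individual tools disagree; a weighted vote based on each tool's benchmark accuracy would be an alternative design.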


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Li Yao ◽  
Teresa Shippy ◽  
Yongchao Li

Abstract In a developing nervous system, endogenous electric fields (EFs) influence embryonic growth. We reported the EF-directed migration of both rat Schwann cells (SCs) and oligodendrocyte precursor cells (OPCs) and explored the molecular mechanism using an RNA-sequencing assay. However, previous studies revealed the differentially expressed genes (DEGs) associated with EF-guided migration of SCs or OPCs alone. In this study, we performed joint differential expression analysis on the RNA-sequencing data from both cell types. We report a number of significantly enriched gene ontology (GO) terms that are related to the cytoskeleton, cell adhesion, and cell migration. Of the DEGs associated with these terms, nine up-regulated DEGs and 32 down-regulated DEGs showed the same direction of effect in both SCs and OPCs stimulated with EFs, while the remaining DEGs responded differently. Thus, our study reveals the similarities and differences in gene expression and cell migration regulation of different glial cell types in response to EF stimulation.


Author(s):  
Massimo Andreatta ◽  
Santiago J Carmona

Abstract Summary: STACAS is a computational method for the identification of integration anchors in the Seurat environment, optimized for the integration of single-cell (sc) RNA-seq datasets that share only a subset of cell types. We demonstrate that by (i) correcting batch effects while preserving relevant biological variability across datasets, (ii) filtering aberrant integration anchors with a quantitative distance measure and (iii) constructing optimal guide trees for integration, STACAS can accurately align scRNA-seq datasets composed of only partially overlapping cell populations. Availability and implementation: Source code and R package available at https://github.com/carmonalab/STACAS; Docker image available at https://hub.docker.com/repository/docker/mandrea1/stacas_demo.

