Estimating and Correcting for Off-Target Cellular Contamination in Brain Cell Type Specific RNA-Seq Data

Transcriptionally profiling minor cellular populations remains an ongoing challenge in molecular genomics. Single-cell RNA sequencing has provided valuable insights into a number of hypotheses, but practical and analytical challenges have limited its widespread adoption. A similar approach, which we term single-cell type RNA sequencing (sctRNA-seq), involves the enrichment and sequencing of a pool of cells, yielding cell type-level resolution transcriptomes. While this approach offers benefits in terms of mRNA sampling from targeted cell types, it is potentially affected by off-target contamination from surrounding cell types. Here, we leveraged single-cell sequencing datasets to apply a computational approach for estimating and controlling the amount of off-target cell type contamination in sctRNA-seq datasets. In datasets obtained using a number of technologies for cell purification, we found that most sctRNA-seq datasets tended to show some amount of off-target mRNA contamination from surrounding cells. However, using covariates for cellular contamination in downstream differential expression analyses increased the quality of our models for differential expression analysis in case/control comparisons and typically resulted in the discovery of more differentially expressed genes. In general, our method provides a flexible approach for detecting and controlling off-target cell type contamination in sctRNA-seq datasets.
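The effect of adding a contamination covariate to a case/control model can be illustrated with a small sketch. It is not the authors' pipeline: the contamination values, the toy expression data, and the helper `ols_fit` are all hypothetical, and ordinary least squares stands in for whatever differential expression model is actually used.

```python
def ols_fit(design, response):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(design, response)) for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    aug = [xtx[i] + [xty[i]] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(p):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][p] / aug[i][i] for i in range(p)]


# Hypothetical toy data: 3 controls (0) and 3 cases (1) for a single gene.
group = [0, 0, 0, 1, 1, 1]
# Assumed per-sample contamination estimates (e.g. derived from off-target marker genes).
contamination = [0.1, 0.2, 0.3, 0.2, 0.4, 0.6]
# Simulated expression: true group effect 2, contamination effect 5, no noise.
expression = [1 + 2 * g + 5 * c for g, c in zip(group, contamination)]

# Model without the covariate: intercept + group.
naive = ols_fit([[1, g] for g in group], expression)
# Model with the covariate: intercept + group + contamination.
adjusted = ols_fit([[1, g, c] for g, c in zip(group, contamination)], expression)

print("group effect without covariate:", naive[1])
print("group effect with covariate:   ", adjusted[1])
```

In this toy setup the cases happen to be more contaminated than the controls, so the model without the covariate attributes part of the contamination signal to the group effect, while the adjusted model recovers the simulated value.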

Localization of migraine susceptibility genes in human brain by single-cell RNA sequencing

Cephalalgia ◽ 10.1177/0333102418762476 ◽ 2018 ◽ Vol 38 (13) ◽ pp. 1976-1983 ◽ Cited By ~ 5

Author(s): William Renthal

Keyword(s): Human Brain ◽ Single Cell ◽ RNA Sequencing ◽ Expression Profiles ◽ Cell Types ◽ Susceptibility Genes ◽ Brain Cell ◽ Cell Type ◽ Single Cell RNA Sequencing ◽ Brain Cell Types

Background: Migraine is a debilitating disorder characterized by severe headaches and associated neurological symptoms. A key challenge to understanding migraine has been the cellular complexity of the human brain and the multiple cell types implicated in its pathophysiology. The present study leverages recent advances in single-cell transcriptomics to localize the specific human brain cell types in which putative migraine susceptibility genes are expressed. Methods: The cell-type specific expression of both familial and common migraine-associated genes was determined bioinformatically using data from 2,039 individual human brain cells across two published single-cell RNA sequencing datasets. Enrichment of migraine-associated genes was determined for each brain cell type. Results: Analysis of single-brain cell RNA sequencing data from five major subtypes of cells in the human cortex (neurons, oligodendrocytes, astrocytes, microglia, and endothelial cells) indicates that over 40% of known migraine-associated genes are enriched in the expression profiles of a specific brain cell type. Further analysis of neuronal migraine-associated genes demonstrated that approximately 70% were significantly enriched in inhibitory neurons and 30% in excitatory neurons. Conclusions: This study takes the next step in understanding the human brain cell types in which putative migraine susceptibility genes are expressed. Both familial and common migraine may arise from dysfunction of discrete cell types within the neurovascular unit, and localization of the affected cell type(s) in an individual patient may provide insight into their susceptibility to migraine.
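The abstract does not say which statistic was used to determine enrichment. One common choice for this kind of gene-set question is a one-sided hypergeometric test, sketched below as an assumption rather than as the study's method. The function name `hypergeom_enrichment_p` and all gene counts in the usage lines are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, cell_type_genes, disease_genes, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    total_genes:     number of genes in the background set
    cell_type_genes: background genes enriched in the cell type of interest
    disease_genes:   background genes in the disease-associated set
    overlap:         disease genes that are also cell-type enriched
    """
    max_overlap = min(cell_type_genes, disease_genes)
    denominator = comb(total_genes, disease_genes)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(cell_type_genes, k) * comb(total_genes - cell_type_genes, disease_genes - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / denominator


# Hypothetical counts: 20,000 background genes, 1,500 enriched in one cell type,
# 40 disease-associated genes, 9 of which are enriched in that cell type.
p_value = hypergeom_enrichment_p(20000, 1500, 40, 9)
print(f"enrichment p-value: {p_value:.3g}")
```

When the test is repeated for each cell type, the resulting p-values would normally be corrected for multiple testing.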

A Markov Random Field Model for Network-based Differential Expression Analysis of Single-cell RNA-seq Data

10.21203/rs.3.rs-116107/v1 ◽ 2020

Author(s): Hongyu Li ◽ Zhichao Xu ◽ Taylor Adams ◽ Naftali Kaminski ◽ Hongyu Zhao

Keyword(s): Random Field ◽ Single Cell ◽ Differential Expression ◽ Markov Random Field ◽ Statistical Power ◽ Differential Expression Analysis ◽ Cell Types ◽ Cell Type ◽ Markov Random ◽ Cell Type Specific

Abstract Background: Recent development of single cell sequencing technologies has made it possible to identify genes with different expression (DE) levels at the cell type level between different groups of samples. However, the often-low sample size of single cell data limits the statistical power to identify DE genes. In this article, we propose to borrow information through known biological networks. Results: We develop MRFscRNAseq, which is based on a Markov Random Field (MRF) model to appropriately accommodate gene network information as well as dependencies among cell types to identify cell-type specific DE genes. We implement an Expectation-Maximization (EM) algorithm with mean field-like approximation to estimate model parameters and a Gibbs sampler to infer DE status. A simulation study shows that our method has better power to detect cell-type specific DE genes than conventional methods while appropriately controlling the type I error rate. The usefulness of our method is demonstrated through its application to study the pathogenesis and biological processes of idiopathic pulmonary fibrosis (IPF) using a single-cell RNA-sequencing (scRNA-seq) data set, which contains 18,150 protein-coding genes across 38 cell types on lung tissues from 32 IPF patients and 28 normal controls. Conclusions: The proposed MRF model is implemented in the R package MRFscRNAseq available on GitHub. By utilizing gene-gene and cell-cell networks, our method provides differential expression analysis for scRNA-seq data with increased statistical power.

Accounting for technical noise in differential expression analysis of single-cell RNA sequencing data

Nucleic Acids Research ◽ 10.1093/nar/gkx754 ◽ 2017 ◽ Vol 45 (19) ◽ pp. 10978-10988 ◽ Cited By ~ 26

Author(s): Cheng Jia ◽ Yu Hu ◽ Derek Kelly ◽ Junhyong Kim ◽ Mingyao Li ◽ ...

Keyword(s): Single Cell ◽ Differential Expression ◽ RNA Sequencing ◽ Expression Analysis ◽ Differential Expression Analysis ◽ Sequencing Data ◽ Technical Noise ◽ Single Cell RNA Sequencing

Integrated Single Cell Atlas of Endothelial Cells of the Human Lung

Circulation ◽ 10.1161/circulationaha.120.052318 ◽ 2021

Author(s): Jonas C. Schupp ◽ Taylor S. Adams ◽ Carlos Cosme Jr. ◽ Micha Sam Brickman Raredon ◽ Yifan Yuan ◽ ...

Keyword(s): Endothelial Cells ◽ Pulmonary Hypertension ◽ Single Cell ◽ Differential Expression ◽ Human Lung ◽ Differential Expression Analysis ◽ Cell Types ◽ Marker Genes ◽ Lung Endothelium ◽ Lung Endothelial Cells

Background: The cellular diversity of the lung endothelium has not been systematically characterized in humans. Here, we provide a reference atlas of human lung endothelial cells (ECs) to facilitate a better understanding of the phenotypic diversity and composition of cells comprising the lung endothelium. Methods: We reprocessed human control single cell RNA sequencing (scRNAseq) data from six datasets. EC populations were characterized through iterative clustering with subsequent differential expression analysis. Marker genes were validated by fluorescent microscopy and in situ hybridization. scRNAseq of primary lung ECs cultured in-vitro was performed. The signaling network between different lung cell types was studied. For cross species analysis or disease relevance, we applied the same methods to scRNAseq data obtained from mouse lungs or from human lungs with pulmonary hypertension. Results: Six lung scRNAseq datasets were reanalyzed and annotated to identify over 15,000 vascular EC cells from 73 individuals. Differential expression analysis of EC revealed signatures corresponding to endothelial lineage, including pan-endothelial, pan-vascular and subpopulation-specific marker gene sets. Beyond the broad cellular categories of lymphatic, capillary, arterial and venous ECs, we found previously indistinguishable subpopulations: among venous EC, we identified two previously indistinguishable populations, pulmonary-venous ECs (COL15A1neg) localized to the lung parenchyma and systemic-venous ECs (COL15A1pos) localized to the airways and the visceral pleura; among capillary EC, we confirmed their subclassification into recently discovered aerocytes characterized by EDNRB, SOSTDC1 and TBX2 and general capillary EC. We confirmed that all six endothelial cell types, including the systemic-venous EC and aerocytes, are present in mice and identified endothelial marker genes conserved in humans and mice. 
Ligand-receptor connectome analysis revealed important homeostatic crosstalk of EC with other lung resident cell types. scRNAseq of commercially available primary lung ECs demonstrated a loss of their native lung phenotype in culture. scRNAseq revealed that the endothelial diversity is maintained in pulmonary hypertension. Our manuscript is accompanied by an online data mining tool (www.LungEndothelialCellAtlas.com). Conclusions: Our integrated analysis provides a comprehensive and well-crafted reference atlas of lung endothelial cells in the normal lung and confirms and describes in detail previously unrecognized endothelial populations across a large number of humans and mice.

Single-cell RNA sequencing reveals cell type- and artery type-specific vascular remodelling in male spontaneously hypertensive rats

Cardiovascular Research ◽ 10.1093/cvr/cvaa164 ◽ 2020 ◽ Cited By ~ 1

Author(s): Jun Cheng ◽ Wenduo Gu ◽ Ting Lan ◽ Jiacheng Deng ◽ Zhichao Ni ◽ ...

Keyword(s): Single Cell ◽ RNA Sequencing ◽ Spontaneously Hypertensive Rats ◽ Cell Types ◽ Vascular Remodelling ◽ Cell Type ◽ Hypertensive Rats ◽ Spontaneously Hypertensive ◽ Single Cell RNA Sequencing ◽ Cell Type Specific

Abstract Aims: Hypertension is a major risk factor for cardiovascular diseases. However, vascular remodelling, a hallmark of hypertension, has not been systematically characterized yet. We described systematic vascular remodelling, especially the artery type- and cell type-specific changes, in hypertension using spontaneously hypertensive rats (SHRs). Methods and results: Single-cell RNA sequencing was used to depict the cell atlas of mesenteric artery (MA) and aortic artery (AA) from SHRs. More than 20 000 cells were included in the analysis. The number of immune cells more than doubled in the aorta of SHRs compared to Wistar Kyoto controls, whereas an expansion of MA mesenchymal stromal cells (MSCs) was observed in SHRs. Comparison of corresponding artery types and cell types identified in integrated datasets unravelled dysregulated genes specific for artery types and cell types. Intersection of dysregulated genes with curated gene sets including cytokines, growth factors, extracellular matrix (ECM), receptors, etc. revealed vascular remodelling events involving cell–cell interaction and ECM re-organization. Particularly, AA remodelling encompasses upregulated cytokine genes in smooth muscle cells, endothelial cells, and especially MSCs, whereas in MA, change of genes involving the contractile machinery and downregulation of ECM-related genes were more prominent. Macrophages and T cells within the aorta demonstrated significant dysregulation of cellular interaction with vascular cells. Conclusion: Our findings provide the first cell landscape of resistant and conductive arteries in hypertensive animal models. Moreover, they offer a systematic characterization of the dysregulated gene profiles in an unbiased, artery type-specific and cell type-specific manner during hypertensive vascular remodelling.

CHETAH: a selective, hierarchical cell type identification method for single-cell RNA sequencing

Nucleic Acids Research ◽ 10.1093/nar/gkz543 ◽ 2019 ◽ Vol 47 (16) ◽ pp. e95-e95 ◽ Cited By ~ 30

Author(s): Jurrian K de Kanter ◽ Philip Lijnzaad ◽ Tito Candelli ◽ Thanasis Margaritis ◽ Frank C P Holstege

Keyword(s): Single Cell ◽ RNA Sequencing ◽ Reference Data ◽ Classification Tree ◽ Cell Types ◽ Biological Information ◽ Identification Algorithm ◽ Intermediate Cell ◽ Cell Type ◽ Single Cell RNA Sequencing

Abstract Cell type identification is essential for single-cell RNA sequencing (scRNA-seq) studies, currently transforming the life sciences. CHETAH (CHaracterization of cEll Types Aided by Hierarchical classification) is an accurate cell type identification algorithm that is rapid and selective, including the possibility of intermediate or unassigned categories. Evidence for assignment is based on a classification tree of previously available scRNA-seq reference data and includes a confidence score based on the variance in gene expression per cell type. For cell types represented in the reference data, CHETAH’s accuracy is as good as existing methods. Its specificity is superior when cells of an unknown type are encountered, such as malignant cells in tumor samples which it pinpoints as intermediate or unassigned. Although designed for tumor samples in particular, the use of unassigned and intermediate types is also valuable in other exploratory studies. This is exemplified in pancreas datasets where CHETAH highlights cell populations not well represented in the reference dataset, including cells with profiles that lie on a continuum between that of acinar and ductal cell types. Having the possibility of unassigned and intermediate cell types is pivotal for preventing misclassification and can yield important biological information for previously unexplored tissues.

UMI-count modeling and differential expression analysis for single-cell RNA sequencing

Genome Biology ◽ 10.1186/s13059-018-1438-9 ◽ 2018 ◽ Vol 19 (1) ◽ Cited By ~ 31

Author(s): Wenan Chen ◽ Yan Li ◽ John Easton ◽ David Finkelstein ◽ Gang Wu ◽ ...

Keyword(s): Single Cell ◽ Differential Expression ◽ RNA Sequencing ◽ Expression Analysis ◽ Differential Expression Analysis ◽ Single Cell RNA Sequencing

A Comprehensive Survey of Statistical Approaches for Differential Expression Analysis in Single-Cell RNA Sequencing Studies

Genes ◽ 10.3390/genes12121947 ◽ 2021 ◽ Vol 12 (12) ◽ pp. 1947

Author(s): Samarendra Das ◽ Anil Rai ◽ Michael L. Merchant ◽ Matthew C. Cave ◽ Shesh N. Rai

Keyword(s): Single Cell ◽ Differential Expression ◽ RNA Sequencing ◽ High Throughput Sequencing ◽ Performance Metrics ◽ Differential Expression Analysis ◽ Individual Performance ◽ RNA Seq ◽ Gene Expressions ◽ Single Cell RNA Sequencing

Single-cell RNA-sequencing (scRNA-seq) is a recent high-throughput sequencing technique for studying gene expression at the cell level. Differential Expression (DE) analysis is a major downstream analysis of scRNA-seq data. DE analysis in the presence of noise from different sources remains a key challenge in scRNA-seq. Earlier practices for addressing this involved borrowing methods from bulk RNA-seq, which are based on non-zero differences in the average expression of genes across cell populations. Later, several methods specifically designed for scRNA-seq were developed. To provide guidance on choosing an appropriate tool or developing a new one, it is necessary to comprehensively study the performance of DE analysis methods. Here, we provide a review and classification of different DE approaches adapted from bulk RNA-seq practice as well as those specifically designed for scRNA-seq. We also evaluate the performance of 19 widely used methods in terms of 13 performance metrics on 11 real scRNA-seq datasets. Our findings suggest that some bulk RNA-seq methods are quite competitive with the single-cell methods and that their performance depends on the underlying models, DE test statistic(s), and data characteristics. Further, it is difficult to identify a single method that performs best globally on the basis of any individual performance criterion. However, the multi-criteria and combined-data analysis indicates that DECENT and EBSeq are the best options for DE analysis. The results also reveal the similarities among the tested methods in terms of detecting common DE genes. Our evaluation provides guidelines for selecting the tool that performs best under particular experimental settings in the context of scRNA-seq.

RNA-seq library preparation from single pancreatic acinar cells

10.1101/085696 ◽ 2016

Author(s): Damian Wollny ◽ Sheng Zhao ◽ Ana Martin-Villalba

Keyword(s): Single Cell ◽ RNA Sequencing ◽ Acinar Cells ◽ Cell Types ◽ Cellular Heterogeneity ◽ Pancreatic Acinar Cells ◽ Library Preparation ◽ Cell Type ◽ Promising Tool ◽ Single Cell RNA Sequencing

Single cell RNA sequencing technology has emerged as a promising tool to uncover previously neglected cellular heterogeneity. Multiple methods and protocols have been developed to apply single cell sequencing to different cell types from various organs. However, library preparation for RNA sequencing remains challenging for cell types with high RNAse content due to rapid degradation of endogenous RNA molecules upon cell lysis. To this end, we developed a protocol based on the SMART-seq2 technology for single cell RNA sequencing of pancreatic acinar cells, the cell type with one of the highest ribonuclease concentrations measured to date. This protocol reliably produces high quality libraries from single acinar cells, reaching a total of 5 × 10^6 reads per cell and a ~80% transcript mapping rate with no detectable 3′-end bias. Thus, our protocol makes single cell transcriptomics accessible to cell types with very high RNAse content.

scConsensus: combining supervised and unsupervised clustering for cell type identification in single-cell RNA sequencing data

BMC Bioinformatics ◽ 10.1186/s12859-021-04028-4 ◽ 2021 ◽ Vol 22 (1)

Author(s): Bobby Ranjan ◽ Florian Schmidt ◽ Wenjie Sun ◽ Jinyu Park ◽ Mohammad Amin Honardoost ◽ ...

Keyword(s): Single Cell ◽ RNA Sequencing ◽ Differentially Expressed Genes ◽ Cell Types ◽ Unsupervised Clustering ◽ Differentially Expressed ◽ Consensus Clustering ◽ Cell Type ◽ Sequencing Data ◽ Single Cell RNA Sequencing

Abstract Background: Clustering is a crucial step in the analysis of single-cell data. Clusters identified in an unsupervised manner are typically annotated to cell types based on differentially expressed genes. In contrast, supervised methods use a reference panel of labelled transcriptomes to guide both clustering and cell type identification. Supervised and unsupervised clustering approaches have their distinct advantages and limitations. Therefore, they can lead to different but often complementary clustering results. Hence, a consensus approach leveraging the merits of both clustering paradigms could result in a more accurate clustering and a more precise cell type annotation. Results: We present scConsensus, an R framework for generating a consensus clustering by (1) integrating results from both unsupervised and supervised approaches and (2) refining the consensus clusters using differentially expressed genes. The value of our approach is demonstrated on several existing single-cell RNA sequencing datasets, including data from sorted PBMC sub-populations. Conclusions: scConsensus combines the merits of unsupervised and supervised approaches to partition cells with better cluster separation and homogeneity, thereby increasing our confidence in detecting distinct cell types. scConsensus is implemented in R and is freely available on GitHub at https://github.com/prabhakarlab/scConsensus.
