FIRM: Fast Integration of single-cell RNA-sequencing data across Multiple platforms

10.1101/2020.06.02.129031 ◽

2020 ◽

Author(s):

Jingsi Ming ◽

Zhixiang Lin ◽

Xiang Wan ◽

Can Yang ◽

Angela Ruohao Wu

Keyword(s):

Single Cell ◽

RNA Sequencing ◽

Cell Biology ◽

Developmental Trajectories ◽

Expression Patterns ◽

Cell Types ◽

Original Structure ◽

Cell Type ◽

Sequencing Data ◽

Single Cell RNA Sequencing

Abstract Single-cell RNA-sequencing (scRNA-seq) has now been used extensively to discover novel cell types and reconstruct developmental trajectories by measuring mRNA expression patterns of individual cells. However, datasets collected using different scRNA-seq technology platforms, including the popular SMART-Seq2 (SS2) and 10X platforms, are difficult to compare because of their heterogeneity. Each platform has unique advantages, and integration of these datasets would provide deeper insights into cell biology and gene regulation. Through comprehensive data exploration, we found that accurate integration is often hampered by differences in cell-type compositions. Herein we describe FIRM, an algorithm that addresses this problem and achieves efficient and accurate integration of heterogeneous scRNA-seq datasets across multiple platforms. We applied FIRM to numerous scRNA-seq datasets generated using SS2 and 10X from mouse, mouse lemur, and human, comparing its performance in dataset integration with other state-of-the-art methods. The integrated datasets generated using FIRM show accurate mixing of shared cell type identities and superior preservation of original structure for each dataset. FIRM not only generates robust integrated datasets for downstream analysis, but is also a facile way to transfer cell type labels and annotations from one dataset to another, making it a versatile and indispensable tool for scRNA-seq analysis.

scConsensus: combining supervised and unsupervised clustering for cell type identification in single-cell RNA sequencing data

BMC Bioinformatics ◽

10.1186/s12859-021-04028-4 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Bobby Ranjan ◽

Florian Schmidt ◽

Wenjie Sun ◽

Jinyu Park ◽

Mohammad Amin Honardoost ◽

...

Keyword(s):

Single Cell ◽

RNA Sequencing ◽

Differentially Expressed Genes ◽

Cell Types ◽

Unsupervised Clustering ◽

Differentially Expressed ◽

Consensus Clustering ◽

Cell Type ◽

Sequencing Data ◽

Single Cell RNA Sequencing

Abstract Background: Clustering is a crucial step in the analysis of single-cell data. Clusters identified in an unsupervised manner are typically annotated to cell types based on differentially expressed genes. In contrast, supervised methods use a reference panel of labelled transcriptomes to guide both clustering and cell type identification. Supervised and unsupervised clustering approaches have their distinct advantages and limitations. Therefore, they can lead to different but often complementary clustering results. Hence, a consensus approach leveraging the merits of both clustering paradigms could result in a more accurate clustering and a more precise cell type annotation. Results: We present scConsensus, an R framework for generating a consensus clustering by (1) integrating results from both unsupervised and supervised approaches and (2) refining the consensus clusters using differentially expressed genes. The value of our approach is demonstrated on several existing single-cell RNA sequencing datasets, including data from sorted PBMC sub-populations. Conclusions: scConsensus combines the merits of unsupervised and supervised approaches to partition cells with better cluster separation and homogeneity, thereby increasing our confidence in detecting distinct cell types. scConsensus is implemented in R and is freely available on GitHub at https://github.com/prabhakarlab/scConsensus.
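
Step (1) of the abstract above, combining an unsupervised and a supervised labelling, can be illustrated with a minimal Python sketch. This is not the scConsensus implementation (the package itself is written in R). The function name `consensus_labels` and the `min_cells` threshold are hypothetical, and the refinement with differentially expressed genes in step (2) is left out.

```python
from collections import Counter


def consensus_labels(unsupervised, supervised, min_cells=2):
    """Cross two labellings cell by cell into consensus clusters.

    Each cell is described by the pair (unsupervised cluster, supervised
    label). Pairs seen in fewer than `min_cells` cells are folded back into
    the plain unsupervised cluster. `min_cells` is a hypothetical threshold
    chosen for illustration only.
    """
    if len(unsupervised) != len(supervised):
        raise ValueError("both labellings must cover the same cells")
    pairs = list(zip(unsupervised, supervised))
    counts = Counter(pairs)  # the contingency table, stored sparsely
    return [
        f"{u}|{s}" if counts[(u, s)] >= min_cells else str(u)
        for u, s in pairs
    ]


# Toy input: seven cells, two unsupervised clusters, three reference labels.
unsup = [0, 0, 0, 1, 1, 1, 1]
sup = ["T", "T", "B", "B", "B", "NK", "NK"]
labels = consensus_labels(unsup, sup)
print(labels)
```

In this toy input, unsupervised cluster 1 is split into a B-like and an NK-like consensus cluster, while the single B-labelled cell in cluster 0 is too rare to form a cluster of its own and keeps its unsupervised label.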

scConsensus: combining supervised and unsupervised clustering for cell type identification in single-cell RNA sequencing data

10.1101/2020.04.22.056473 ◽

2020 ◽

Author(s):

Bobby Ranjan ◽

Florian Schmidt ◽

Wenjie Sun ◽

Jinyu Park ◽

Mohammad Amin Honardoost ◽

...

Keyword(s):

Single Cell ◽

RNA Sequencing ◽

Cell Types ◽

Unsupervised Clustering ◽

Differentially Expressed ◽

Consensus Clustering ◽

Cell Type ◽

Sequencing Data ◽

Single Cell RNA Sequencing ◽

Data Clusters

Clustering is a crucial step in the analysis of single-cell data. Clusters identified using unsupervised clustering are typically annotated to cell types based on differentially expressed genes. In contrast, supervised methods use a reference panel of labelled transcriptomes to guide both clustering and cell type identification. Supervised and unsupervised clustering strategies have their distinct advantages and limitations. Therefore, they can lead to different but often complementary clustering results. Hence, a consensus approach leveraging the merits of both clustering paradigms could result in a more accurate clustering and a more precise cell type annotation. We present scConsensus, an R framework for generating a consensus clustering by (i) integrating the results from both unsupervised and supervised approaches and (ii) refining the consensus clusters using differentially expressed (DE) genes. The value of our approach is demonstrated on several existing single-cell RNA sequencing datasets, including data from sorted PBMC sub-populations. scConsensus is freely available on GitHub at https://github.com/prabhakarlab/scConsensus.

Self-assembling Manifolds in Single-cell RNA Sequencing Data

10.1101/364166 ◽

2018 ◽

Cited By ~ 3

Author(s):

Alexander J. Tarashansky ◽

Yuan Xue ◽

Pengyang Li ◽

Stephen R. Quake ◽

Bo Wang

Keyword(s):

Single Cell ◽

RNA Sequencing ◽

Developmental Trajectories ◽

Cell Types ◽

Selection Strategy ◽

Sequencing Data ◽

Biologically Relevant ◽

Self Assembling ◽

Single Cell RNA Sequencing ◽

Stem Cell Populations

Abstract Single-cell RNA sequencing has spurred the development of computational methods that enable researchers to classify cell types, delineate developmental trajectories, and measure molecular responses to external perturbations. Many of these technologies rely on their ability to detect genes whose cell-to-cell variations arise from the biological processes of interest rather than transcriptional or technical noise. However, for datasets in which the biologically relevant differences between cells are subtle, identifying these genes is a challenging task. We present the self-assembling manifold (SAM) algorithm, an iterative soft feature selection strategy to quantify gene relevance and improve dimensionality reduction. We demonstrate its advantages over other state-of-the-art methods with experimental validation in identifying novel stem cell populations of Schistosoma, a prevalent parasite that infects hundreds of millions of people. Extending our analysis to a total of 56 datasets, we show that SAM is generalizable and consistently outperforms other methods in a variety of biological and quantitative benchmarks.

Self-assembling manifolds in single-cell RNA sequencing data

eLife ◽

10.7554/elife.48994 ◽

2019 ◽

Vol 8 ◽

Cited By ~ 8

Author(s):

Alexander J Tarashansky ◽

Yuan Xue ◽

Pengyang Li ◽

Stephen R Quake ◽

Bo Wang

Keyword(s):

Single Cell ◽

RNA Sequencing ◽

Developmental Trajectories ◽

Cell Types ◽

Selection Strategy ◽

Sequencing Data ◽

Biologically Relevant ◽

Self Assembling ◽

Single Cell RNA Sequencing ◽

Stem Cell Populations

Single-cell RNA sequencing has spurred the development of computational methods that enable researchers to classify cell types, delineate developmental trajectories, and measure molecular responses to external perturbations. Many of these technologies rely on their ability to detect genes whose cell-to-cell variations arise from the biological processes of interest rather than transcriptional or technical noise. However, for datasets in which the biologically relevant differences between cells are subtle, identifying these genes is challenging. We present the self-assembling manifold (SAM) algorithm, an iterative soft feature selection strategy to quantify gene relevance and improve dimensionality reduction. We demonstrate its advantages over other state-of-the-art methods with experimental validation in identifying novel stem cell populations of Schistosoma mansoni, a prevalent parasite that infects hundreds of millions of people. Extending our analysis to a total of 56 datasets, we show that SAM is generalizable and consistently outperforms other methods in a variety of biological and quantitative benchmarks.
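
The idea of an iterative soft feature selection can be illustrated with a small Python sketch. It is only loosely in the spirit of the abstract above and is not the SAM algorithm itself. The function names, the weighted Euclidean distance, the neighbourhood size `k`, the variance-based weight, and the fixed iteration count are all assumptions made for illustration.

```python
def knn(data, weights, k):
    """Return, for each cell, the indices of its k nearest cells under a
    gene-weighted squared Euclidean distance."""
    neighbours = []
    for i, cell in enumerate(data):
        dists = []
        for j, other in enumerate(data):
            if j != i:
                d = sum(w * (a - b) ** 2 for w, a, b in zip(weights, cell, other))
                dists.append((d, j))
        dists.sort()
        neighbours.append([j for _, j in dists[:k]])
    return neighbours


def update_weights(data, neighbours):
    """Weight each gene by the variance of its neighbourhood-averaged
    expression, then rescale so that the largest weight is 1."""
    n_cells, n_genes = len(data), len(data[0])
    weights = []
    for g in range(n_genes):
        smoothed = [
            (data[i][g] + sum(data[j][g] for j in neighbours[i]))
            / (len(neighbours[i]) + 1)
            for i in range(n_cells)
        ]
        mean = sum(smoothed) / n_cells
        weights.append(sum((x - mean) ** 2 for x in smoothed) / n_cells)
    top = max(weights) or 1.0  # avoid dividing by zero if every weight is 0
    return [w / top for w in weights]


def soft_feature_selection(data, k=2, iterations=5):
    """Alternate between building a neighbour graph and reweighting genes."""
    weights = [1.0] * len(data[0])
    for _ in range(iterations):
        weights = update_weights(data, knn(data, weights, k))
    return weights


# Toy data: six cells, two genes. Gene 0 separates two groups of cells,
# gene 1 fluctuates within each group.
cells = [
    [0.0, 3.0], [1.0, -3.0], [0.0, 0.0],
    [10.0, -3.0], [11.0, 3.0], [10.0, 0.0],
]
gene_weights = soft_feature_selection(cells)
print(gene_weights)
```

Genes whose expression varies smoothly across the neighbour graph keep a high weight, while genes that fluctuate between neighbouring cells are averaged out and down-weighted. The weights are soft, so no gene is discarded outright by a hard cutoff.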

Cell type-specific weighting-factors for accurate virtual single-cell RNA-sequencing of diverse organs

10.1101/2020.07.05.188276 ◽

2020 ◽

Author(s):

Kengo Tejima ◽

Satoshi Kozawa ◽

Thomas N. Sato

Keyword(s):

Single Cell ◽

RNA Sequencing ◽

Cell Types ◽

Cell Type ◽

Sequencing Data ◽

Human Organ ◽

Weighting Factors ◽

Single Cell RNA Sequencing ◽

Cell Type Specific ◽

Bulk Tissue

Abstract Computational deconvolution of transcriptome data of organs/tissues uncovers their structural and functional complexities at cellular resolution without performing single-cell RNA-sequencing experiments. However, the deconvolution of highly heterogeneous, diverse organs/tissues remains a challenge. Herein, we report “cell type-specific weighting-factors” that are essential for accurate deconvolution, but critically lacking in the existing methods. We computed such weighting-factors for 97 cell-types across 10 mouse organs and demonstrate their effective usage in the Bayesian framework to generate their virtual single-cell RNA-sequencing data, hence accurately estimating both cell-type ratios and the complete transcriptome of each cell-type in these organs. The method also efficiently detects the temporal changes of such cell type-profiles during organ pathogenesis in disease models. Furthermore, we present its potential utility for human organ/bulk-tissue deconvolution. Taken together, the weighting-factors reported herein and their computation for new cell-types and/or new species such as human are essential tools/resources for studying high-resolution biology and disease.

Single-cell data clustering based on sparse optimization and low-rank matrix factorization

G3 Genes|Genome|Genetics ◽

10.1093/g3journal/jkab098 ◽

2021 ◽

Author(s):

Yinlei Hu ◽

Bin Li ◽

Falai Chen ◽

Kun Qu

Keyword(s):

Single Cell ◽

RNA Sequencing ◽

Matrix Factorization ◽

Data Clustering ◽

Cell Types ◽

Low Rank ◽

Sequencing Data ◽

Rank Matrix ◽

Single Cell RNA Sequencing ◽

Low Rank Matrix

Abstract Unsupervised clustering is a fundamental step of single-cell RNA sequencing data analysis. This issue has inspired several clustering methods to classify cells in single-cell RNA sequencing data. However, accurate prediction of the cell clusters remains a substantial challenge. In this study, we propose a new algorithm for single-cell RNA sequencing data clustering based on Sparse Optimization and low-rank matrix factorization (scSO). We applied our scSO algorithm to analyze multiple benchmark datasets and showed that the cluster number predicted by scSO was close to the number of reference cell types and that most cells were correctly classified. Our scSO algorithm is available at https://github.com/QuKunLab/scSO. Overall, this study demonstrates a potent cell clustering approach that can help researchers distinguish cell types in single-cell RNA sequencing data.

Critical downstream analysis steps for single-cell RNA sequencing data

Briefings in Bioinformatics ◽

10.1093/bib/bbab105 ◽

2021 ◽

Author(s):

Zilong Zhang ◽

Feifei Cui ◽

Chen Lin ◽

Lingling Zhao ◽

Chunyu Wang ◽

...

Keyword(s):

Single Cell ◽

RNA Sequencing ◽

Noisy Data ◽

Single Cell Level ◽

Cell Type ◽

Sequencing Data ◽

Cell Level ◽

Bioinformatics Tool ◽

Single Cell RNA Sequencing ◽

Downstream Analysis

Abstract Single-cell RNA sequencing (scRNA-seq) has enabled us to study biological questions at the single-cell level. Currently, many analysis tools are available to better utilize these relatively noisy data. In this review, we summarize the most widely used methods for critical downstream analysis steps (i.e. clustering, trajectory inference, cell-type annotation and integrating datasets). The advantages and limitations are comprehensively discussed, and we provide suggestions for choosing proper methods in different situations. We hope this paper will be useful for scRNA-seq data analysts and bioinformatics tool developers.

Identifying common and novel cell types in single-cell RNA-sequencing data using FR-Match

10.1101/2021.10.17.464718 ◽

2021 ◽

Author(s):

Yun Zhang ◽

Brian Aevermann ◽

Rohan Gala ◽

Richard H. Scheuermann

Keyword(s):

Single Cell ◽

Cell Types ◽

Sample Type ◽

Cell Type ◽

Sequencing Data ◽

Excellent Performance ◽

Single Cell RNA Sequencing ◽

Accurate Performance ◽

Cross Platform ◽

Tissue Region

Reference cell type atlases powered by single cell transcriptomic profiling technologies have become available to study cellular diversity at a granular level. We present FR-Match for matching query datasets to reference atlases with robust and accurate performance for identifying novel cell types and non-optimally clustered cell types in the query data. This approach shows excellent performance for cross-platform, cross-sample type, cross-tissue region, and cross-data modality cell type matching.

Localization of migraine susceptibility genes in human brain by single-cell RNA sequencing

Cephalalgia ◽

10.1177/0333102418762476 ◽

2018 ◽

Vol 38 (13) ◽

pp. 1976-1983 ◽

Cited By ~ 5

Author(s):

William Renthal

Keyword(s):

Human Brain ◽

Single Cell ◽

RNA Sequencing ◽

Expression Profiles ◽

Cell Types ◽

Susceptibility Genes ◽

Brain Cell ◽

Cell Type ◽

Single Cell RNA Sequencing ◽

Brain Cell Types

Background: Migraine is a debilitating disorder characterized by severe headaches and associated neurological symptoms. A key challenge to understanding migraine has been the cellular complexity of the human brain and the multiple cell types implicated in its pathophysiology. The present study leverages recent advances in single-cell transcriptomics to localize the specific human brain cell types in which putative migraine susceptibility genes are expressed. Methods: The cell-type specific expression of both familial and common migraine-associated genes was determined bioinformatically using data from 2,039 individual human brain cells across two published single-cell RNA sequencing datasets. Enrichment of migraine-associated genes was determined for each brain cell type. Results: Analysis of single-brain cell RNA sequencing data from five major subtypes of cells in the human cortex (neurons, oligodendrocytes, astrocytes, microglia, and endothelial cells) indicates that over 40% of known migraine-associated genes are enriched in the expression profiles of a specific brain cell type. Further analysis of neuronal migraine-associated genes demonstrated that approximately 70% were significantly enriched in inhibitory neurons and 30% in excitatory neurons. Conclusions: This study takes the next step in understanding the human brain cell types in which putative migraine susceptibility genes are expressed. Both familial and common migraine may arise from dysfunction of discrete cell types within the neurovascular unit, and localization of the affected cell type(s) in an individual patient may provide insight into their susceptibility to migraine.
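
The abstract above does not say which statistical test was used to determine enrichment. One common choice for asking whether a gene set is over-represented among the genes characteristic of a cell type is a one-sided hypergeometric test, sketched here in Python under that assumption. The function name, its parameters, and the toy numbers are hypothetical and are not taken from the study.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, drawn, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    population: number of genes considered in total
    annotated:  genes characteristic of the cell type of interest
    drawn:      genes in the tested set (for example, disease-associated genes)
    overlap:    tested genes that are also characteristic of the cell type
    """
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(annotated, x) * comb(population - annotated, drawn - x)
        for x in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers only: 20 genes in total, 5 characteristic of a cell type,
# 4 genes in the tested set, 2 of which are characteristic of the cell type.
p_value = hypergeom_enrichment_p(population=20, annotated=5, drawn=4, overlap=2)
print(round(p_value, 4))
```

A small p-value would indicate that the overlap is larger than expected by chance. When the test is repeated for each cell type, the resulting p-values would normally be corrected for multiple testing.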

Single-cell RNA sequencing reveals cell type- and artery type-specific vascular remodelling in male spontaneously hypertensive rats

Cardiovascular Research ◽

10.1093/cvr/cvaa164 ◽

2020 ◽

Cited By ~ 1

Author(s):

Jun Cheng ◽

Wenduo Gu ◽

Ting Lan ◽

Jiacheng Deng ◽

Zhichao Ni ◽

...

Keyword(s):

Single Cell ◽

RNA Sequencing ◽

Spontaneously Hypertensive Rats ◽

Cell Types ◽

Vascular Remodelling ◽

Cell Type ◽

Hypertensive Rats ◽

Spontaneously Hypertensive ◽

Single Cell RNA Sequencing ◽

Cell Type Specific

Abstract Aims: Hypertension is a major risk factor for cardiovascular diseases. However, vascular remodelling, a hallmark of hypertension, has not yet been systematically characterized. We described systematic vascular remodelling, especially the artery type- and cell type-specific changes, in hypertension using spontaneously hypertensive rats (SHRs). Methods and results: Single-cell RNA sequencing was used to depict the cell atlas of mesenteric artery (MA) and aortic artery (AA) from SHRs. More than 20 000 cells were included in the analysis. The number of immune cells more than doubled in the aorta in SHRs compared to Wistar Kyoto controls, whereas an expansion of MA mesenchymal stromal cells (MSCs) was observed in SHRs. Comparison of corresponding artery types and cell types identified in integrated datasets unravels dysregulated genes specific for artery types and cell types. Intersection of dysregulated genes with curated gene sets including cytokines, growth factors, extracellular matrix (ECM), receptors, etc. revealed vascular remodelling events involving cell–cell interaction and ECM re-organization. Particularly, AA remodelling encompasses upregulated cytokine genes in smooth muscle cells, endothelial cells, and especially MSCs, whereas in MA, change of genes involving the contractile machinery and downregulation of ECM-related genes were more prominent. Macrophages and T cells within the aorta demonstrated significant dysregulation of cellular interaction with vascular cells. Conclusion: Our findings provide the first cell landscape of resistant and conductive arteries in hypertensive animal models. They also offer a systematic characterization of the dysregulated gene profiles in an unbiased, artery type-specific, and cell type-specific manner during hypertensive vascular remodelling.
