MetaCell: analysis of single-cell RNA-seq data using K-nn graph partitions

2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Yael Baran ◽  
Akhiad Bercovich ◽  
Arnau Sebe-Pedros ◽  
Yaniv Lubling ◽  
Amir Giladi ◽  
...  

Abstract scRNA-seq profiles each represent a highly partial sample of mRNA molecules from a unique cell that can never be resampled, and robust analysis must separate the sampling effect from biological variance. We describe a methodology for partitioning scRNA-seq datasets into metacells: disjoint and homogenous groups of profiles that could have been resampled from the same cell. Unlike clustering analysis, our algorithm specializes in obtaining granular as opposed to maximal groups. We show how to use metacells as building blocks for complex quantitative transcriptional maps while avoiding data smoothing. Our algorithms are implemented in the MetaCell R/C++ software package.

2018 ◽  
Author(s):  
Yael Baran ◽  
Arnau Sebe-Pedros ◽  
Yaniv Lubling ◽  
Amir Giladi ◽  
Elad Chomsky ◽  
...  

Abstract Single cell RNA-seq (scRNA-seq) has become the method of choice for analyzing mRNA distributions in heterogeneous cell populations. scRNA-seq only partially samples the cells in a tissue and the RNA in each cell, resulting in sparse data that challenge analysis. We develop a methodology that addresses scRNA-seq’s sparsity through partitioning the data into metacells: disjoint, homogenous and highly compact groups of cells, each exhibiting only sampling variance. Metacells constitute local building blocks for clustering and quantitative analysis of gene expression, while not enforcing any global structure on the data, thereby maintaining statistical control and minimizing biases. We illustrate the MetaCell framework by re-analyzing cell type and transcriptional gradients in peripheral blood and whole organism scRNA-seq maps. Our algorithms are implemented in the new MetaCell R/C++ software package.


2017 ◽  
Vol 18 (1) ◽  
Author(s):  
Zhuo Wang ◽  
Shuilin Jin ◽  
Guiyou Liu ◽  
Xiurui Zhang ◽  
Nan Wang ◽  
...  

Blood ◽  
2015 ◽  
Vol 126 (23) ◽  
pp. SCI-20-SCI-20
Author(s):  
H. Leighton Grimes ◽  
Singh Harinder ◽  
Andre Olsson ◽  
Nathan Salomonis ◽  
Bruce J. Aronow ◽  
...  

Abstract Single-cell RNA-Seq has the potential to become a dominant approach in probing diverse and complex developmental compartments. Its unbiased and comprehensive nature could enable developmental ordering of cellular and regulatory gene hierarchies without prior knowledge. To test general utility, we performed single-cell RNA-seq of murine hematopoietic progenitors, focusing on the myeloid developmental hierarchy. Using novel unsupervised clustering analysis, ICGS, we correctly ordered known hierarchical states as well as revealed rare intermediates. Regulatory state analysis suggested that the transcription factors Gfi1 and Irf8 function antagonistically to control homeostatic neutrophil and macrophage production, respectively. This prediction was validated by complementary genetic and genomic experiments in granulocyte-macrophage progenitors. Using knock-in reporters for Gfi1 and Irf8 and clonogenic analyses coupled with single-cell RNA-seq, we distinguished regulatory states of bi-potential progenitors from their lineage specifying or committed progeny. Thus, single-cell RNA-Seq is a powerful developmental tool to characterize hierarchical and rare cellular states along with the regulators that control their dynamics. Disclosures: No relevant conflicts of interest to declare.


2021 ◽  
Author(s):  
Yang Xu ◽  
Priyojit Das ◽  
Rachel Patton McCord

Deep learning approaches have empowered single-cell omics data analysis in many ways, generating new insights from complex cellular systems. As there is an increasing need for single cell omics data to be integrated across sources, types, and features of data, the challenges of integrating single-cell omics data are rising. Here, we present a deep clustering algorithm that learns discriminative representation for single-cell data via maximizing mutual information, SMILE (Single-cell Mutual Information Learning). Using a unique cell-pairing design, SMILE successfully integrates multi-source single-cell transcriptome data, removing batch effects and projecting similar cell types, even from different tissues, into the same representation space. SMILE can also integrate data from two or more modalities, such as joint profiling technologies using single-cell ATAC-seq, RNA-seq, DNA methylation, Hi-C, and ChIP data. SMILE works well even when feature types are unmatched, such as genes for RNA-seq and genome wide peaks for ATAC-seq.


2018 ◽  
Author(s):  
Erica A.K. DePasquale ◽  
Daniel J. Schnell ◽  
Íñigo Valiente-Alandí ◽  
Burns C. Blaxall ◽  
H. Leighton Grimes ◽  
...  

Summary Methods for single-cell RNA sequencing (scRNA-Seq) have greatly advanced in recent years. While droplet- and well-based methods have increased the capture frequency of cells for scRNA-Seq, these technologies readily produce technical artifacts, such as doublet-cell and multiplet-cell captures. Doublets occurring between distinct cell-types can appear as hybrid scRNA-Seq profiles, but do not have distinct transcriptomes from individual cell states. We introduce DoubletDecon, an approach that detects doublets with a combination of deconvolution analyses and the identification of unique cell-state gene expression. We demonstrate the ability of DoubletDecon to identify synthetic and cell-hashing cell singlets and doublets from scRNA-Seq datasets of varying cellular complexity. DoubletDecon is able to account for cell-cycle effects and is compatible with diverse species and unsupervised population detection algorithms (e.g., ICGS, Seurat). We believe this approach has the potential to become a standard quality control step for the accurate delineation of cell states.


2017 ◽  
Author(s):  
Jesse M. Zhang ◽  
Jue Fan ◽  
H. Christina Fan ◽  
David Rosenfeld ◽  
David N. Tse

Abstract Background: With the recent proliferation of single-cell RNA-Seq experiments, several methods have been developed for unsupervised analysis of the resulting datasets. These methods often rely on unintuitive hyperparameters and do not explicitly address the subjectivity associated with clustering. Results: In this work, we present DendroSplit, an interpretable framework for analyzing single-cell RNA-Seq datasets that addresses both the clustering interpretability and clustering subjectivity issues. DendroSplit offers a novel perspective on the single-cell RNA-Seq clustering problem motivated by the definition of “cell type,” allowing us to cluster using feature selection to uncover multiple levels of biologically meaningful populations in the data. We analyze several landmark single-cell datasets, demonstrating both the method’s efficacy and computational efficiency. Conclusion: DendroSplit offers a clustering framework that is comparable to existing methods in terms of accuracy and speed but is novel in its emphasis on interpretability. We provide the full DendroSplit software package at https://github.com/jessemzhang/dendrosplit.


2018 ◽  
Author(s):  
Fan Zhang ◽  
Kevin Wei ◽  
Kamil Slowikowski ◽  
Chamith Y. Fonseka ◽  
Deepak A. Rao ◽  
...  

Abstract To define the cell populations in rheumatoid arthritis (RA) driving joint inflammation, we applied single-cell RNA-seq (scRNA-seq), mass cytometry, bulk RNA-seq, and flow cytometry to sorted T cells, B cells, monocytes, and fibroblasts from 51 synovial tissue RA and osteoarthritis (OA) patient samples. Applying an integrated computational strategy based on canonical correlation analysis to 5,452 scRNA-seq profiles, we identified 18 unique cell populations. Combining mass cytometry and transcriptomics revealed cell states expanded in RA synovia: THY1+HLAhigh sublining fibroblasts (OR=33.8), IL1B+ pro-inflammatory monocytes (OR=7.8), CD11c+T-bet+ autoimmune-associated B cells (OR=5.7), and PD-1+Tph/Tfh (OR=3.0). We also defined CD8+ T cell subsets characterized by GZMK+, GZMB+, and GNLY+ expression. Using bulk and single-cell data, we mapped inflammatory mediators to source cell populations, for example attributing IL6 production to THY1+HLAhigh fibroblasts and naïve B cells, and IL1B to pro-inflammatory monocytes. These populations are potentially key mediators of RA pathogenesis.


2018 ◽  
Author(s):  
Amir Alavi ◽  
Matthew Ruffalo ◽  
Aiyappa Parvangada ◽  
Zhilin Huang ◽  
Ziv Bar-Joseph

Summary Single cell RNA-Seq (scRNA-seq) studies often profile upward of thousands of cells in heterogeneous environments. Current methods for characterizing cells perform unsupervised analysis followed by assignment using a small set of known marker genes. Such approaches are limited to a few, well characterized cell types. To enable large scale supervised characterization we developed an automated pipeline to download, process, and annotate publicly available scRNA-seq datasets. We extended supervised neural networks to obtain efficient and accurate representations for scRNA-seq data. We applied our pipeline to analyze data from over 500 different studies with over 300 unique cell types and show that supervised methods greatly outperform unsupervised methods for cell type identification. A case study of neural degeneration data highlights the ability of these methods to identify differences between cell type distributions in healthy and diseased mice. We implemented a web server that compares new datasets to collected data employing fast matching methods in order to determine cell types, key genes, similar prior studies, and more.


IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 129679-129688 ◽  
Author(s):  
Jiao Hua ◽  
Hongkun Liu ◽  
Boyang Zhang ◽  
Shuilin Jin
