Comparison of Principal Component Analysis and t-Stochastic Neighbor Embedding with Distance Metric Modifications for Single-cell RNA-sequencing Data Analysis

2017 ◽  
Author(s):  
Haejoon (Ellen) Kwon ◽  
Jean Fan ◽  
Peter Kharchenko

Abstract Recent developments in technological tools such as next-generation sequencing, along with peaking interest in the study of single cells, have enabled single-cell RNA-sequencing, in which whole transcriptomes are analyzed at the single-cell level. Studies, however, have been hindered by the difficulty of effectively analyzing these single-cell RNA-seq datasets, due to the high-dimensional nature of the data and its intrinsic noise. While many techniques have been introduced to reduce the dimensionality of such data for visualization and subpopulation identification, their utility in identifying new cellular subtypes in a reliable and robust manner remains unclear. Here, we compare dimensionality reduction visualization methods, including principal component analysis and t-stochastic neighbor embedding, along with various distance metric modifications, to visualize single-cell RNA-seq datasets and assess their performance in identifying known cellular subtypes. Our results suggest that selecting variable genes prior to analysis of single-cell RNA-seq data is vital to yield reliable classification, and that when variable genes are used, the choice of distance metric modification does not particularly influence the quality of classification. Still, in order to take advantage of all the gene expression information, alternative methods must be used for a reliable classification.
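The two steps this abstract emphasizes, selecting variable genes and then choosing a distance metric between cells, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the use of plain variance to rank genes, and the choice of Pearson correlation distance as the example metric are all assumptions made for illustration. The resulting distance matrix is the kind of input a method such as t-SNE could take in place of Euclidean distances.

```python
from math import sqrt
from statistics import mean, pvariance


def top_variable_genes(expr, n_top):
    """Return the indices of the n_top genes with the highest variance.

    expr is a list of cells; each cell is a list of per-gene expression
    values (hypothetical layout, chosen for this sketch).
    """
    n_genes = len(expr[0])
    variances = [pvariance([cell[g] for cell in expr]) for g in range(n_genes)]
    ranked = sorted(range(n_genes), key=lambda g: variances[g], reverse=True)
    return sorted(ranked[:n_top])


def pearson_distance(a, b):
    """Correlation distance 1 - r between two expression vectors."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    na = sqrt(sum((x - ma) ** 2 for x in a))
    nb = sqrt(sum((y - mb) ** 2 for y in b))
    if na == 0 or nb == 0:
        # Constant vector: correlation is undefined, treat as uncorrelated.
        return 1.0
    return 1.0 - cov / (na * nb)


def distance_matrix(expr, genes):
    """Pairwise cell-cell distances restricted to the selected genes."""
    sub = [[cell[g] for g in genes] for cell in expr]
    return [[pearson_distance(a, b) for b in sub] for a in sub]


# Toy data: two cells of one type, two of another; genes 2 and 4 are constant.
cells = [
    [5.0, 0.0, 1.0, 9.0, 2.0],
    [6.0, 0.0, 1.0, 8.0, 2.0],
    [0.0, 7.0, 1.0, 1.0, 2.0],
    [1.0, 8.0, 1.0, 0.0, 2.0],
]
genes = top_variable_genes(cells, 3)
dist = distance_matrix(cells, genes)
```

In this toy example the constant genes are dropped by the variance filter, and cells of the same type end up much closer to each other than to cells of the other type.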

Kidney360 ◽  
2021 ◽  
pp. 10.34067/KID.0003682021
Author(s):  
Rachel M B Bell ◽  
Laura Denby

Kidney disease represents a global health burden of increasing prevalence and is an independent risk factor for cardiovascular disease. Myeloid cells are a major cellular compartment of the immune system; they are found in the healthy kidney and in increased numbers in the damaged and/or diseased kidney, where they act as key players in the progression of injury, inflammation and fibrosis. They possess enormous plasticity and heterogeneity, adopting different phenotypic and functional characteristics in response to stimuli in the local milieu. Though this inherent complexity remains to be fully understood in the kidney, advances in single-cell genomics promise to change this. Specifically, single-cell RNA sequencing (scRNA-seq) has had a transformative effect on kidney research, enabling the profiling and analysis of the transcriptomes of single cells at unprecedented resolution and throughput, and subsequent generation of cell atlases. Moving forward, combining scRNA- and single-nuclear RNA-seq with greater resolution spatial transcriptomics will allow spatial mapping of kidney disease of varying aetiology to further reveal the patterning of immune cells and non-immune renal cells. This review summarises the roles of myeloid cells in kidney health and disease, the experimental workflow in currently available scRNA-seq technologies and published findings using scRNA-seq in the context of myeloid cells and the kidney.


2018 ◽  
Vol 31 (Supplement_1) ◽  
pp. 190-190
Author(s):  
Robert Walker ◽  
Maria Secrier ◽  
Jack Harrington ◽  
Rachel Parker ◽  
Jamie Kelly ◽  
...  

Abstract Background Esophageal adenocarcinoma (EAC) develops in a complex ecosystem that defines tumour evolution, response to treatment and patient outcomes. We have shown that high levels of cancer-associated fibroblasts (CAFs) in the tumour microenvironment (TME) predict poor outcome and are inversely related to tumour-infiltrating lymphocyte (TIL) abundance. Bulk sequencing studies lack the resolution to dissect the phenotypic and functional heterogeneity and cell-cell interactions of the TME. We have applied single-cell RNA sequencing to 12 EAC patients to address this challenge. Methods A total of 24 single-cell suspensions were prepared from resected specimens and paired normal tissue. Single cells and barcoded mRNA-binding micro-particles were combined in droplets containing cell lysis buffer using a custom-built microfluidic platform. Captured mRNA with a cell barcode and unique molecular identifier was reverse transcribed, amplified and sequenced. SEURAT (v2.1, R-package) was used to identify highly-variable genes and perform cell clustering. Results Analysis of 6859 of the highest quality cells using the 4167 most variable genes revealed 46 clusters which were divided into 12 broad populations. Antigen Presenting Cells (n = 189), B Cells (n = 265), Cancer Cells (n = 1449), Endothelial Cells (n = 321), Fibroblasts (n = 1690), Mast Cells (n = 184), Monocytes/Macrophages (n = 254), Plasma Cells (n = 208), Smooth Muscle (n = 115), Squamous Epithelium (n = 751), T Cells (n = 1433). Analysis of publicly available bulk RNAseq datasets (TCGA) of EAC showed that tumours that were ‘hot’ for a CAF gene signature were ‘cold’ for a T-Cell signature. Subset analysis of the fibroblasts from tumour samples that were enriched for the same CAF signature revealed 3 subtly different clusters. 
One of these sub-populations differentially expressed genes associated with the Gene Ontology terms GO:0040011 (locomotion), GO:0006928 (movement of cell or subcellular component) and GO:0048870 (cell motility). The two tumours with the highest ratio of this type of CAF to T-Cells were both found to have distant metastasis at resection. Conclusion EAC CAFs are a heterogeneous population with distinct biological functions which may have different implications for prognosis. These early results suggest that we are able to identify candidate biological processes that may describe the mechanisms through which CAFs and TILs influence outcome. Disclosure All authors have declared no conflicts of interest.


2021 ◽  
Author(s):  
Feiyang Ma ◽  
Patrice A Salomé ◽  
Sabeeha S Merchant ◽  
Matteo Pellegrini

Abstract The photosynthetic unicellular alga Chlamydomonas (Chlamydomonas reinhardtii) is a versatile reference for algal biology because of its ease of culture in the laboratory. Genomic and systems biology approaches have previously described transcriptome responses to environmental changes using bulk data, thus representing the average behavior from pools of cells. Here, we apply single-cell RNA sequencing (scRNA-seq) to probe the heterogeneity of Chlamydomonas cell populations under three environments and in two genotypes differing by the presence of a cell wall. First, we determined that RNA can be extracted from single algal cells with or without a cell wall, offering the possibility to sample natural algal communities. Second, scRNA-seq successfully separated single cells into non-overlapping cell clusters according to their growth conditions. Cells exposed to iron or nitrogen deficiency were easily distinguished despite a shared tendency to arrest photosynthesis and cell division to economize resources. Notably, these groups of cells recapitulated known patterns observed with bulk RNA-seq, but also revealed their inherent heterogeneity. A substantial source of variation between cells originated from their endogenous diurnal phase, although cultures were grown in constant light. We exploited this result to show that circadian iron responses may be conserved from algae to land plants. We document experimentally that bulk RNA-seq data represent an average of typically hidden heterogeneity in the population.


Author(s):  
Feiyang Ma ◽  
Patrice A. Salomé ◽  
Sabeeha S. Merchant ◽  
Matteo Pellegrini

ABSTRACT The photosynthetic unicellular alga Chlamydomonas (Chlamydomonas reinhardtii) is a versatile reference for algal biology because of the facility with which it can be cultured in the laboratory. Genomic and systems biology approaches have previously been used to describe how the transcriptome responds to environmental changes, but this analysis has been limited to bulk data, representing the average behavior from pools of cells. Here, we apply single-cell RNA sequencing (scRNA-seq) to probe the heterogeneity of Chlamydomonas cell populations under three environments and in two genotypes differing in the presence of a cell wall. First, we determined that RNA can be extracted from single algal cells with or without a cell wall, offering the possibility to sample algal communities in the wild. Second, scRNA-seq successfully separated single cells into non-overlapping cell clusters according to their growth conditions. Cells exposed to iron or nitrogen deficiency were easily distinguished despite a shared tendency to arrest cell division to economize resources. Notably, these groups of cells recapitulated known patterns observed with bulk RNA-seq, but also revealed their inherent heterogeneity. A substantial source of variation between cells originated from their endogenous diurnal phase, although cultures were grown in constant light. We exploited this result to show that circadian iron responses may be conserved from algae to land plants. We propose that bulk RNA-seq data represent an average of varied cell states that hides underappreciated heterogeneity.
One-sentence summary: We show that single-cell RNA-seq (scRNA-seq) can be applied to Chlamydomonas cultures to reveal that the heterogeneity in bulk cultures is largely driven by diurnal cycle phases.
The author responsible for distribution of materials integral to the findings presented in this article in accordance with the policy described in the Instructions for Authors (www.plantcell.org) is: Matteo Pellegrini.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Sunny Z. Wu ◽  
Daniel L. Roden ◽  
Ghamdan Al-Eryani ◽  
Nenad Bartonicek ◽  
Kate Harvey ◽  
...  

Abstract Background High-throughput single-cell RNA sequencing (scRNA-Seq) has emerged as a powerful tool for exploring cellular heterogeneity among complex human cancers. scRNA-Seq studies using fresh human surgical tissue are logistically difficult, preclude histopathological triage of samples, and limit the ability to perform batch processing. This hindrance can often introduce technical biases when integrating patient datasets and increase experimental costs. Although tissue preservation methods have been previously explored to address such issues, they have yet to be examined on complex human tissues, such as solid cancers, and on high-throughput scRNA-Seq platforms. Methods Using the Chromium 10X platform, we sequenced a total of ~120,000 cells from fresh and cryopreserved replicates across three primary breast cancers, two primary prostate cancers and a cutaneous melanoma. We performed detailed analyses between cells from each condition to assess the effects of cryopreservation on cellular heterogeneity, cell quality, clustering and the identification of gene ontologies. In addition, we performed single-cell immunophenotyping using CITE-Seq on a single breast cancer sample cryopreserved as solid tissue fragments. Results Tumour heterogeneity identified from fresh tissues was largely conserved in cryopreserved replicates. We show that sequencing of single cells prepared from cryopreserved tissue fragments or from cryopreserved cell suspensions is comparable to sequencing of cells prepared from fresh tissue, with cryopreserved cell suspensions displaying higher correlations with fresh tissue in gene expression. We showed that cryopreservation had minimal impact on the results of downstream analyses such as biological pathway enrichment. For some tumours, cryopreservation modestly increased cell stress signatures compared to freshly analysed tissue. Further, we demonstrate the advantage of cryopreserving whole cells for detecting cell-surface proteins using CITE-Seq, which is impossible using other preservation methods such as single-nuclei sequencing. Conclusions We show that the viable cryopreservation of human cancers provides high-quality single cells for multi-omics analysis. Our study guides new experimental designs for tissue biobanking for future clinical single-cell RNA sequencing studies.


2019 ◽  
Author(s):  
Imad Abugessaisa ◽  
Shuhei Noguchi ◽  
Melissa Cardon ◽  
Akira Hasegawa ◽  
Kazuhide Watanabe ◽  
...  

Abstract Analysis and interpretation of single-cell RNA-sequencing (scRNA-seq) experiments are compromised by the presence of poor quality cells. For meaningful analyses, such poor quality cells should be excluded to avoid biases and large variation. However, no clear guidelines exist. We introduce SkewC, a novel quality-assessment method to identify poor quality single cells in scRNA-seq experiments. The method is based on the assessment of gene coverage for each single cell and its skewness as a quality measure. To validate the method, we investigated the impact of poor quality cells on downstream analyses and compared biological differences between typical and poor quality cells. Moreover, we measured the ratio of intergenic expression, suggesting genomic contamination, and foreign organism contamination of single-cell samples. SkewC was tested on 37,993 single cells generated by 15 scRNA-seq protocols. We envision SkewC as an indispensable QC method to be incorporated into scRNA-seq experiments to preclude the possibility of scRNA-seq data misinterpretation.
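To make the idea of "skewness of gene coverage" concrete, here is a minimal sketch of one way such a score could be computed for a single cell. It is not the SkewC implementation: the abstract does not specify the calculation, so this sketch assumes the coverage is given as read counts in equal-width bins along the gene body (5' to 3') and treats those counts as weights when computing the standard moment-based skewness of bin position. The function name and the binning are hypothetical.

```python
def coverage_skewness(coverage):
    """Skewness of bin position weighted by read coverage.

    coverage: read counts in equal-width bins from the 5' end to the
    3' end of the gene body (assumed input layout for this sketch).
    A value near 0 means coverage is symmetric along the gene body;
    a negative value means reads pile up toward the 3' end, and a
    positive value means they pile up toward the 5' end.
    """
    total = sum(coverage)
    if total <= 0:
        raise ValueError("coverage must contain at least one read")
    positions = range(1, len(coverage) + 1)
    mean_pos = sum(p * c for p, c in zip(positions, coverage)) / total
    m2 = sum(c * (p - mean_pos) ** 2 for p, c in zip(positions, coverage)) / total
    m3 = sum(c * (p - mean_pos) ** 3 for p, c in zip(positions, coverage)) / total
    if m2 == 0:
        # All reads fall in one bin: no spread, so report no skew.
        return 0.0
    return m3 / m2 ** 1.5


uniform_cell = [1] * 10              # even coverage along the gene body
three_prime_cell = [1, 1, 1, 1, 10]  # reads concentrated at the 3' end
```

A quality filter built on such a score would flag cells whose skewness deviates strongly from that of typical cells in the same experiment; how the cutoff is chosen is outside what the abstract states.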


2016 ◽  
Author(s):  
Hannah R. Dueck ◽  
Rizi Ai ◽  
Adrian Camarena ◽  
Bo Ding ◽  
Reymundo Dominguez ◽  
...  

Abstract Recently, measurement of RNA at single-cell resolution has yielded surprising insights. Methods for single-cell RNA sequencing (scRNA-seq) have received considerable attention, but the broad reliability of single-cell methods and the factors governing their performance are still poorly understood. Here, we conducted a large-scale control experiment to assess the transfer function of three scRNA-seq methods and factors modulating the function. All three methods detected greater than 70% of the expected number of genes and had a 50% probability of detecting genes with abundance greater than 2 to 4 molecules. Despite the small number of molecules, sequencing depth significantly affected gene detection. While biases in detection and quantification were qualitatively similar across methods, the degree of bias differed, consistent with differences in molecular protocol. Measurement reliability increased with expression level for all methods and we conservatively estimate the measurement transfer functions to be linear above ~5-10 molecules. Based on these extensive control studies, we propose that RNA-seq of single cells has come of age, yielding quantitative biological information.


2021 ◽  
Author(s):  
Nicole C. Rondeau ◽  
JJ L. Miranda

We detected precise coordination of RNA levels between two latent genes of the Kaposi sarcoma-associated herpesvirus (KSHV) using single-cell RNA sequencing. LANA and vIL6 are expressed during latency by different promoters on remote regions of the episome.…


Genes ◽  
2020 ◽  
Vol 11 (3) ◽  
pp. 240 ◽  
Author(s):  
Prashant N. M. ◽  
Hongyu Liu ◽  
Pavlos Bousounis ◽  
Liam Spurr ◽  
Nawaf Alomran ◽  
...  

With the recent advances in single-cell RNA-sequencing (scRNA-seq) technologies, the estimation of allele expression from single cells is becoming increasingly reliable. Allele expression is both quantitative and dynamic and is an essential component of the genomic interactome. Here, we systematically estimate the allele expression from heterozygous single nucleotide variant (SNV) loci using scRNA-seq data generated on the 10× Genomics Chromium platform. We analyzed 26,640 human adipose-derived mesenchymal stem cells (from three healthy donors), sequenced to an average of 150K sequencing reads per cell (more than 4 billion scRNA-seq reads in total). High-quality SNV calls assessed in our study contained approximately 15% exonic and >50% intronic loci. To analyze the allele expression, we estimated the expressed variant allele fraction (VAF_RNA) from SNV-aware alignments and analyzed its variance and distribution (mono- and bi-allelic) at different minimum sequencing read thresholds. Our analysis shows that when assessing positions covered by a minimum of three unique sequencing reads, over 50% of the heterozygous SNVs show bi-allelic expression, while at a threshold of 10 reads, nearly 90% of the SNVs are bi-allelic. In addition, our analysis demonstrates the feasibility of scVAF_RNA estimation from current scRNA-seq datasets and shows that the 3′-based library generation protocol of 10× Genomics scRNA-seq data can be informative in SNV-based studies, including analyses of transcriptional kinetics.
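The quantity at the center of this abstract, the expressed variant allele fraction at a locus subject to a minimum read threshold, can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, the fraction is taken as variant reads over variant plus reference reads, and the rule used here to call a locus bi-allelic (a fraction strictly between 0 and 1) is an assumption, since the abstract does not give the authors' cutoffs.

```python
def vaf_rna(n_var, n_ref, min_reads=3):
    """Expressed variant allele fraction at one SNV locus in one cell.

    n_var / n_ref: unique reads supporting the variant / reference allele.
    Returns None when the locus is covered by fewer than min_reads reads,
    so that low-coverage positions are excluded rather than scored.
    """
    total = n_var + n_ref
    if total < min_reads:
        return None
    return n_var / total


def is_biallelic(vaf, low=0.0, high=1.0):
    """Assumed rule: both alleles observed, i.e. low < VAF < high."""
    return vaf is not None and low < vaf < high


def fraction_biallelic(counts, min_reads):
    """Share of sufficiently covered loci that look bi-allelic.

    counts: list of (n_var, n_ref) pairs, one per heterozygous SNV locus.
    Returns None if no locus passes the coverage threshold.
    """
    vafs = [vaf_rna(v, r, min_reads) for v, r in counts]
    covered = [v for v in vafs if v is not None]
    if not covered:
        return None
    return sum(is_biallelic(v) for v in covered) / len(covered)


# Hypothetical read counts at four loci: (variant reads, reference reads).
loci = [(3, 0), (2, 2), (6, 5), (0, 12)]
```

Raising `min_reads` removes loci where a single allele may have been seen only because few reads were sampled, which is consistent with the abstract's observation that the bi-allelic share rises with the read threshold.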


2018 ◽  
Author(s):  
Xianwen Ren ◽  
Liangtao Zheng ◽  
Zemin Zhang

ABSTRACT Clustering is a prevalent analytical means to analyze single-cell RNA sequencing data, but the rapidly expanding data volume can make this process computationally challenging. New methods for both accurate and efficient clustering are urgently needed. Here we propose a new clustering framework based on random projection and feature construction for large-scale single-cell RNA sequencing data, which greatly improves clustering accuracy, robustness and computational efficiency for various state-of-the-art algorithms benchmarked on multiple real datasets. On a dataset with 68,578 human blood cells, our method reached a 20% improvement in clustering accuracy and a 50-fold acceleration while consuming only 66% of the memory used by the widely used software package SC3. Compared to k-means, the accuracy improvement can reach 3-fold depending on the dataset. An R implementation of the framework is available from https://github.com/Japrin/sscClust.
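As a rough illustration of the random projection step named in this abstract, the sketch below maps each cell's expression vector to a lower-dimensional vector by multiplying it with a random Gaussian matrix. It is not the sscClust implementation (which is in R and whose details the abstract does not give); the function name, the Gaussian construction with 1/sqrt(k) scaling, and the fixed seed are assumptions for this sketch. A clustering algorithm would then run on the projected vectors instead of the full gene space.

```python
import random


def random_projection(cells, k, seed=0):
    """Project d-dimensional cells to k dimensions with a Gaussian matrix.

    cells: list of expression vectors, all of length d.
    Entries of the d x k matrix are drawn from N(0, 1) and scaled by
    1/sqrt(k), a common choice that roughly preserves pairwise distances
    on average. The seed makes the projection reproducible.
    """
    rng = random.Random(seed)
    d = len(cells[0])
    scale = 1.0 / k ** 0.5
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(k)] for _ in range(d)]
    return [
        [sum(cell[i] * matrix[i][j] for i in range(d)) for j in range(k)]
        for cell in cells
    ]


# Three toy cells over four genes, reduced to two dimensions.
toy_cells = [[1.0, 2.0, 3.0, 4.0], [0.0, 0.0, 0.0, 0.0], [1.0, 2.0, 3.0, 4.0]]
projected = random_projection(toy_cells, 2)
```

Because the projection is linear, identical cells stay identical and the cost of later distance computations scales with k rather than with the number of genes.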

