Nested Stochastic Block Models applied to the analysis of single cell data

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Leonardo Morelli ◽  
Valentina Giansanti ◽  
Davide Cittaro

Abstract Single cell profiling has proven to be a powerful tool in molecular biology for understanding the complex behaviours of heterogeneous systems. Defining the properties of single cells is the primary endpoint of such analysis; cells are typically clustered to identify the common determinants that can be used to describe functional properties of the cell mixture under investigation. Several approaches have been proposed to identify cell clusters; while this is a matter of active research, one popular approach is based on community detection in neighbourhood graphs by optimisation of modularity. In this paper we propose an alternative and principled solution to this problem, based on Stochastic Block Models. We show that this approach is not only suitable for identifying cell groups, but also provides a solid framework for other relevant tasks in single cell analysis, such as label transfer. To encourage the use of Stochastic Block Models, we developed a Python library, , that is compatible with the popular framework.
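As a point of reference for the modularity-based approach that the abstract contrasts with Stochastic Block Models, the quantity being optimised can be computed by hand on a toy graph. The sketch below is not taken from the paper or its library; the function name `modularity` and the six-node example graph are illustrative assumptions. It uses the standard Newman-Girvan definition for an undirected, unweighted graph: for each community, the fraction of edges falling inside it minus the squared fraction of edge endpoints attached to it.

```python
from collections import defaultdict


def modularity(edges, membership):
    """Newman-Girvan modularity of a partition (undirected, unweighted graph).

    edges: list of (u, v) pairs, each undirected edge listed once.
    membership: dict mapping node -> community label.
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both endpoints in community c
    degree_sum = defaultdict(int)  # summed node degrees in community c
    for u, v in edges:
        degree_sum[membership[u]] += 1
        degree_sum[membership[v]] += 1
        if membership[u] == membership[v]:
            internal[membership[u]] += 1
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Hypothetical neighbourhood graph: two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}

q_two = modularity(edges, two_groups)  # 5/14, roughly 0.357
q_one = modularity(edges, one_group)   # 0.0: a single community scores zero
```

Modularity-based clustering searches for the partition that maximises this score, whereas a Stochastic Block Model instead fits a generative model of the graph and selects the partition by statistical inference.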

2020 ◽  
Author(s):  
Leonardo Morelli ◽  
Valentina Giansanti ◽  
Davide Cittaro

Abstract Single cell profiling has proven to be a powerful tool in molecular biology for understanding the complex behaviours of heterogeneous systems. While the properties of single cells are the primary endpoint of such analysis, cells are typically clustered to identify the common determinants that can be used to describe functional properties of the cell mixture under investigation. Several approaches have been proposed to identify cell clusters; while this is a matter of active research, one popular approach is based on community detection in neighbourhood graphs by optimisation of modularity. In this paper we propose an alternative solution to this problem, based on nested Stochastic Block Models. We show a threefold advantage of our approach: it correctly identifies cell groups, it returns a meaningful hierarchical structure and, lastly, it provides a statistical measure of association between cells and their assigned clusters.


2020 ◽  
Vol 22 (Supplement_2) ◽  
pp. ii71-ii71
Author(s):  
Bharati Mehani ◽  
Hye-Jung Chung ◽  
Russell Bandle ◽  
Sarah Young ◽  
Michael Kelly ◽  
...  

Abstract Non-coding RNAs have critical functions across biological processes that regulate glioma initiation and progression, and deregulated expression of long non-coding RNAs (lncRNAs) has been implicated in the onset and progression of malignancies. The majority of these transcripts exhibit tissue- and cancer-specific expression, but little has been investigated at the single-cell level. We performed single cell RNA sequencing (10x Genomics) on 9 IDH-wild-type glioblastomas from 7 patients. In total, 66,825 unsorted cells dissociated from tumor tissues were included in this analysis, with a mean of 41,989 sequencing reads and a median of 2,619 coding genes per cell. Single cell analysis of lncRNAs captured a median of 190 lncRNAs per cell and demonstrated a distinct lncRNA expression profile for glioma cells compared to the non-tumor cells, with SOX2-OT significantly upregulated (2X) in glioma cells. Consistent with this finding, SOX2-OT is known to be overexpressed in a variety of cancers and has been previously implicated in glioma proliferation and migration. We then examined patterns of lncRNA expression in GBM expression subtypes. Subtype correlation indicated overexpression of RMST (classical subtype), PCED1B-AS1 (mesenchymal) and LINC00689 (proneural) lncRNAs in these expression subtypes. Consistent with these findings, upregulation of each of these 3 lncRNAs has previously been implicated in pro-tumorigenic effects, including in glioma. Examination of an independent published single cell GBM dataset also validated PCED1B-AS1 in the mesenchymal subtype. Comparison with bulk tumor GBM profiles (IDHwt TCGA GBM dataset) also showed correlations with the expression of RMST, PCED1B-AS1 and LINC00689 lncRNAs in the classical, mesenchymal and proneural subtypes respectively. Overall, these results indicate that lncRNA expression can be determined in 10x-generated glioma single cell data and may reveal additional insights about cellular state and glioma biology.


2021 ◽  
Author(s):  
Yidi Deng ◽  
Jarny Choi ◽  
Kim-Anh Le Cao

Characterizing the molecular identity of a cell is an essential step in single cell RNA-sequencing (scRNA-seq) data analysis. Numerous tools exist for predicting cell identity using single cell reference atlases. However, many challenges remain, including correcting for inherent batch effects between reference and query data and insufficient phenotype data from the reference. One solution is to project single cell data onto established bulk reference atlases to leverage their rich phenotype information. Sincast is a computational framework to query scRNA-seq data based on bulk reference atlases. Prior to projection, single cell data are transformed to be directly comparable to bulk data, either with pseudo-bulk aggregation or graph-based imputation to address sparse single cell expression profiles. Sincast avoids batch effect correction, and cell identity is predicted along a continuum to highlight new cell states not found in the reference atlas. In several case study scenarios, we show that Sincast projects single cells into the correct biological niches in the expression space of the bulk reference atlas. We demonstrate the effectiveness of our imputation approach that was specifically developed for querying scRNA-seq data based on bulk reference atlases. We show that Sincast is an efficient and powerful tool for single cell profiling that will facilitate downstream analysis of scRNA-seq data.
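The pseudo-bulk aggregation mentioned above can be illustrated in a few lines. This is a minimal sketch of the general idea only (summing per-cell counts within each group of cells so the result resembles a bulk profile); it is not Sincast's implementation, and the function name `pseudo_bulk`, the toy count matrix and the group labels are all hypothetical.

```python
from collections import defaultdict


def pseudo_bulk(counts, groups):
    """Sum per-cell count vectors within each group of cells.

    counts: list of per-cell count vectors (one entry per gene).
    groups: list of group labels, one per cell, in the same order.
    Returns a dict mapping group label -> summed count vector.
    """
    n_genes = len(counts[0])
    totals = defaultdict(lambda: [0] * n_genes)
    for row, group in zip(counts, groups):
        for gene_index, value in enumerate(row):
            totals[group][gene_index] += value
    return dict(totals)


# Hypothetical sparse single cell counts: 4 cells x 3 genes.
counts = [
    [0, 3, 1],
    [2, 0, 0],
    [1, 1, 0],
    [0, 0, 4],
]
groups = ["T cell", "T cell", "B cell", "B cell"]

profiles = pseudo_bulk(counts, groups)
# profiles["T cell"] is [2, 3, 1]; profiles["B cell"] is [1, 1, 4]
```

Aggregation trades single cell resolution for denser profiles; the graph-based imputation the abstract describes is the alternative that keeps one profile per cell.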


Author(s):  
Matthew P. Mulè ◽  
Andrew J. Martins ◽  
John S. Tsang

Abstract Recent methods enable simultaneous measurement of protein expression with the transcriptome in single cells by combining protein labeling with DNA barcoded antibodies followed by droplet based single cell capture and sequencing (e.g. CITE-seq). While data normalization and denoising have received considerable attention for single cell RNA-seq data, such methods for protein data have been less explored. Here we showed that a major source of noise in CITE-seq data originated from unbound antibody encapsulated in droplets. We also found that the counts of isotype controls and those of the “negative” population inferred from all protein counts of each cell are significantly correlated, suggesting that their covariation likely reflects cell-to-cell differences due to technical factors such as non-specific antibody binding and droplet-to-droplet differences in capture efficiency of the DNA tags. Motivated by these observations, we developed a normalization method for CITE-seq protein expression data called Denoised and Scaled by Background (DSB). DSB corrects for 1) protein-specific background noise as reflected by empty droplets, and 2) the technical cell-to-cell variation as captured by the latent noise component described above. DSB normalization improves separation between positive and negative populations for each protein, centers the negative-staining population around zero, and can improve unbiased protein expression-based clustering. DSB is available through the open source R package “DSB” via a single function call and can be readily integrated with existing single cell analysis workflows, including those in Bioconductor and Seurat.
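The first correction described above, rescaling each protein against its empty-droplet background, might look roughly like the following. This sketch rests on assumptions that the abstract does not state: that counts are log-transformed with a pseudocount, and that "scaled by background" means subtracting the mean and dividing by the standard deviation of the empty-droplet values. The function name `background_scale`, the pseudocount value and the toy counts are hypothetical, and the second correction (the per-cell technical component) is not shown. It is written in Python for illustration; the published package is in R.

```python
import math
from statistics import mean, stdev


def background_scale(cell_counts, empty_counts, pseudocount=10):
    """Rescale one protein's counts against its empty-droplet background.

    cell_counts: raw counts of the protein in cell-containing droplets.
    empty_counts: raw counts of the same protein in empty droplets.
    pseudocount: assumed offset added before the log transform.
    """
    log_cells = [math.log(c + pseudocount) for c in cell_counts]
    log_empty = [math.log(c + pseudocount) for c in empty_counts]
    background_mean = mean(log_empty)
    background_sd = stdev(log_empty)
    # Values near zero look like background; large values look like real signal.
    return [(x - background_mean) / background_sd for x in log_cells]


# Hypothetical counts for a single protein.
empty = [2, 5, 3, 4, 6]    # ambient antibody captured in empty droplets
cells = [4, 3, 250, 400]   # two negative-looking cells, two positive cells

scaled = background_scale(cells, empty)
```

Under these assumptions, cells whose counts resemble the empty droplets land close to zero, which matches the abstract's statement that the negative-staining population is centered around zero.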


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Jeremy A. Lombardo ◽  
Marzieh Aliaghaei ◽  
Quy H. Nguyen ◽  
Kai Kessenbrock ◽  
Jered B. Haun

Abstract Tissues are complex mixtures of different cell subtypes, and this diversity is increasingly characterized using high-throughput single cell analysis methods. However, these efforts are hindered, as tissues must first be dissociated into single cell suspensions using methods that are often inefficient, labor-intensive, highly variable, and potentially biased towards certain cell subtypes. Here, we present a microfluidic platform consisting of three tissue processing technologies that combine tissue digestion, disaggregation, and filtration. The platform is evaluated using a diverse array of tissues. For kidney and mammary tumor, microfluidic processing produces 2.5-fold more single cells. Single cell RNA sequencing further reveals that endothelial cells, fibroblasts, and basal epithelium are enriched without affecting stress response. For liver and heart, processing time is dramatically reduced. We also demonstrate that recovery of cells from the system at periodic intervals during processing increases hepatocyte and cardiomyocyte numbers, as well as increases reproducibility from batch-to-batch for all tissues.


2019 ◽  
Vol 116 (13) ◽  
pp. 5979-5984 ◽  
Author(s):  
Yahui Ji ◽  
Dongyuan Qi ◽  
Linmei Li ◽  
Haoran Su ◽  
Xiaojie Li ◽  
...  

Extracellular vesicles (EVs) are important intercellular mediators regulating health and diseases. Conventional methods for EV surface marker profiling, which were based on population measurements, masked the cell-to-cell heterogeneity in the quantity and phenotypes of EV secretion. Herein, by using spatially patterned antibody barcodes, we realized multiplexed profiling of single-cell EV secretion from more than 1,000 single cells simultaneously. Applying this platform to profile human oral squamous cell carcinoma (OSCC) cell lines led to a deep understanding of previously undifferentiated single-cell heterogeneity underlying EV secretion. Notably, we observed that the decrement of certain EV phenotypes (e.g., CD63+ EV) was associated with the invasive feature of both OSCC cell lines and primary OSCC cells. We also realized multiplexed detection of EV secretion and cytokine secretion simultaneously from the same single cells to investigate the multidimensional spectrum of cellular communications, from which we resolved tiered functional subgroups with distinct secretion profiles by visualized clustering and principal component analysis. In particular, we found that different cell subgroups dominated EV secretion and cytokine secretion. The technology introduced here enables a comprehensive evaluation of EV secretion heterogeneity at the single-cell level, which may become an indispensable tool to complement current single-cell analysis and EV research.


eLife ◽  
2013 ◽  
Vol 2 ◽  
Author(s):  
Daniel R Larson ◽  
Christoph Fritzsch ◽  
Liang Sun ◽  
Xiuhau Meng ◽  
David S Lawrence ◽  
...  

Single-cell analysis has revealed that transcription is dynamic and stochastic, but tools are lacking that can determine the mechanism operating at a single gene. Here we utilize single-molecule observations of RNA in fixed and living cells to develop a single-cell model of steroid-receptor mediated gene activation. We determine that steroids drive mRNA synthesis by frequency modulation of transcription. This digital behavior in single cells gives rise to the well-known analog dose response across the population. To test this model, we developed a light-activation technology to turn on a single steroid-responsive gene and follow dynamic synthesis of RNA from the activated locus.


2019 ◽  
Author(s):  
Wu Liu ◽  
Mehmet U. Caglar ◽  
Zhangming Mao ◽  
Andrew Woodman ◽  
Jamie J. Arnold ◽  
...  

Summary Development of antiviral therapeutics emphasizes minimization of the effective dose and maximization of the toxic dose, first in cell culture and later in animal models. Long-term success of an antiviral therapeutic is determined not only by its efficacy but also by the duration of time required for drug-resistance to evolve. We have developed a microfluidic device comprised of ~6000 wells, with each well containing a microstructure to capture single cells. We have used this device to characterize enterovirus inhibitors with distinct mechanisms of action. In contrast to population methods, single-cell analysis reveals that each class of inhibitor interferes with the viral infection cycle in a manner that can be distinguished by principal component analysis. Single-cell analysis of antiviral candidates reveals not only efficacy but also properties of the members of the viral population most sensitive to the drug, the stage of the lifecycle most affected by the drug, and perhaps even if the drug targets an interaction of the virus with its host.


2020 ◽  
Author(s):  
Tyler N. Chen ◽  
Anushka Gupta ◽  
Mansi Zalavadia ◽  
Aaron M. Streets

Abstract Single-cell RNA sequencing (scRNA-seq) enables the investigation of complex biological processes in multicellular organisms with high resolution. However, many phenotypic features that are critical to understanding the functional role of cells in a heterogeneous tissue or organ are not directly encoded in the genome and therefore cannot be profiled with scRNA-seq. Quantitative optical microscopy has long been a powerful approach for characterizing diverse cellular phenotypes including cell morphology, protein localization, and chemical composition. Combining scRNA-seq with optical imaging has the potential to provide comprehensive single-cell analysis, allowing for functional integration of gene expression profiling and cell-state characterization. However, it is difficult to track single cells through both measurements; therefore, coupling current scRNA-seq protocols with optical measurements remains a challenge. Here, we report Microfluidic Cell Barcoding and Sequencing (μCB-seq), a microfluidic platform that combines high-resolution imaging and sequencing of single cells. μCB-seq is enabled by a novel fabrication method that preloads primers with known barcode sequences inside addressable reaction chambers of a microfluidic device. In addition to enabling multi-modal single-cell analysis, μCB-seq improves gene detection sensitivity, providing a scalable and accurate method for information-rich characterization of single cells.

