Single-Cell Entropy to Quantify the Cellular Order Parameter from Single-Cell RNA-Seq Data

2020 ◽  
Vol 15 (01) ◽  
pp. 35-49
Author(s):  
Jingxin Liu ◽  
You Song ◽  
Jinzhi Lei

The cell is the basic functional and biological unit of life, and a complex system that contains a huge number of molecular components. How can we quantify the macroscopic state of a cell from the microscopic information of these molecular components? Answering this question is fundamental to understanding the human body. The recent maturation of single-cell RNA sequencing (scRNA-seq) technologies has allowed researchers to gain information on the transcriptomes of individual cells. Although considerable progress has been made in cell-type clustering over the past few years, there is no strong consensus about how to define a cell state from scRNA-seq data. Here, we present single-cell entropy (scEntropy) as an order parameter for cellular transcriptome profiles from scRNA-seq data. scEntropy is a straightforward parameter that defines the intrinsic transcriptional state of a cell and provides a quantity with which to measure the developmental process and to distinguish different cell types. scEntropy followed by a Gaussian mixture model (scEGMM) provides a coherent method of cell-type classification that is simple, includes no parameters or clustering, and is comparable to existing machine learning-based methods in benchmarking studies. The results of cell-type classification based on scEGMM are robust and easy to interpret biologically.
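The two-stage idea in the abstract (one entropy value per cell, then a Gaussian mixture over those values) can be illustrated with a minimal sketch. The abstract does not give the definition of scEntropy, so the code below assumes, purely for illustration, the Shannon entropy of a histogram of each cell's expression values; the names `cell_entropy`, `fit_gmm_1d`, and `classify`, the bin count, and the number of mixture components are all hypothetical choices of this sketch, not the authors' method.

```python
import math


def cell_entropy(expression, bins=10):
    """Shannon entropy (in nats) of a histogram of one cell's expression values.

    Assumption: a stand-in for scEntropy, whose definition is not in the abstract.
    """
    lo, hi = min(expression), max(expression)
    if hi == lo:
        return 0.0
    counts = [0] * bins
    for v in expression:
        idx = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    n = len(expression)
    return -sum(c / n * math.log(c / n) for c in counts if c)


def _weighted_densities(v, weights, means, variances):
    """Weighted Gaussian density of value v under each mixture component."""
    return [
        w * math.exp(-((v - m) ** 2) / (2 * s)) / math.sqrt(2 * math.pi * s)
        for w, m, s in zip(weights, means, variances)
    ]


def fit_gmm_1d(values, k=2, iters=100):
    """Fit a k-component 1D Gaussian mixture by EM; returns (weights, means, variances)."""
    srt = sorted(values)
    n = len(values)
    # Initialise the means at evenly spaced quantiles of the data.
    means = [srt[int((i + 0.5) / k * n)] for i in range(k)]
    mean_all = sum(values) / n
    var_all = sum((v - mean_all) ** 2 for v in values) / n or 1e-6
    variances = [var_all] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            dens = _weighted_densities(v, weights, means, variances)
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * v for r, v in zip(resp, values)) / nj
            var_j = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, values)) / nj
            variances[j] = max(var_j, 1e-6)  # floor to avoid collapse
    return weights, means, variances


def classify(values, weights, means, variances):
    """Assign each value to the component with the highest weighted density."""
    labels = []
    for v in values:
        dens = _weighted_densities(v, weights, means, variances)
        labels.append(max(range(len(dens)), key=dens.__getitem__))
    return labels


# Toy data: five sparse profiles and five near-uniform profiles (synthetic, not real cells).
sparse_cells = [[0] * (95 - i) + [9] * (5 + i) for i in range(5)]
uniform_cells = [[j % 10 for j in range(100 - i)] + [0] * i for i in range(5)]
entropies = [cell_entropy(c) for c in sparse_cells + uniform_cells]
params = fit_gmm_1d(entropies)
print(classify(entropies, *params))
```

Reducing each cell to a single scalar before fitting the mixture is what keeps the second stage one-dimensional; a real analysis would more likely use an established mixture-model implementation and the entropy definition given in the paper itself.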

2019 ◽  
Author(s):  
Jingxin Liu ◽  
You Song ◽  
Jinzhi Lei

We present the use of single-cell entropy (scEntropy) to measure the order of the cellular transcriptome profile from single-cell RNA-seq data, which leads to a method of unsupervised cell type classification through scEntropy followed by the Gaussian mixture model (scEGMM). scEntropy is straightforward in defining an intrinsic transcriptional state of a cell. scEGMM is a coherent method of cell type classification that includes no parameters and no clustering; however, it is comparable to existing machine learning-based methods in benchmarking studies and facilitates biological interpretation.


2020 ◽  
Author(s):  
Pawan K. Jha ◽  
Utham K. Valekunja ◽  
Sandipan Ray ◽  
Mathieu Nollet ◽  
Akhilesh B. Reddy

Every day, we sleep for a third of the day. Sleep is important for cognition, brain waste clearance, metabolism, and immune responses. The molecular mechanisms governing sleep are largely unknown. Here, we used a combination of single cell RNA sequencing and cell-type specific proteomics to interrogate the molecular underpinnings of sleep. Different cell types in three important brain regions for sleep (brainstem, cortex, and hypothalamus) exhibited diverse transcriptional responses to sleep need. Sleep restriction modulates astrocyte-neuron crosstalk and sleep need enhances expression of specific sets of transcription factors in different brain regions. In cortex, we also interrogated the proteome of two major cell types: astrocytes and neurons. Sleep deprivation differentially alters the expression of proteins in astrocytes and neurons. Similarly, phosphoproteomics revealed large shifts in cell-type specific protein phosphorylation. Our results indicate that sleep need regulates transcriptional, translational, and post-translational responses in a cell-specific manner.


2019 ◽  
Author(s):  
Matthew N. Bernstein ◽  
Zhongjie Ma ◽  
Michael Gleicher ◽  
Colin N. Dewey

Summary: Cell type annotation is a fundamental task in the analysis of single-cell RNA-sequencing data. In this work, we present CellO, a machine learning-based tool for annotating human RNA-seq data with the Cell Ontology. CellO enables accurate and standardized cell type classification by considering the rich hierarchical structure of known cell types, a source of prior knowledge that is not utilized by existing methods. Furthermore, CellO comes pre-trained on a novel, comprehensive dataset of human, healthy, untreated primary samples in the Sequence Read Archive, which, to the best of our knowledge, is the most diverse curated collection of primary cell data to date. CellO’s comprehensive training set enables it to run out-of-the-box on diverse cell types and to achieve superior or competitive performance when compared to existing state-of-the-art methods. Lastly, CellO’s linear models are easily interpreted, thereby enabling exploration of cell type-specific expression signatures across the ontology. To this end, we also present the CellO Viewer: a web application for exploring CellO’s models across the ontology.
Highlights: We present CellO, a tool for hierarchically classifying cell type from single-cell RNA-seq data against the graph-structured Cell Ontology. CellO is pre-trained on a comprehensive dataset comprising nearly all bulk RNA-seq primary cell samples in the Sequence Read Archive. CellO achieves superior or comparable performance with existing methods while featuring a more comprehensive pre-packaged training set. CellO is built with easily interpretable models, which we expose through a novel web application, the CellO Viewer, for exploring cell type-specific signatures across the Cell Ontology.


1985 ◽  
Vol 101 (4) ◽  
pp. 1442-1454 ◽  
Author(s):  
P Cowin ◽  
H P Kapprell ◽  
W W Franke

Desmosomal plaque proteins have been identified in immunoblotting and immunolocalization experiments on a wide range of cell types from several species, using a panel of monoclonal murine antibodies to desmoplakins I and II and a guinea pig antiserum to desmosomal band 5 protein. Specifically, we have taken advantage of the fact that certain antibodies react with both desmoplakins I and II, whereas others react only with desmoplakin I, indicating that desmoplakin I contains unique regions not present on the closely related desmoplakin II. While some of these antibodies recognize epitopes conserved between chick and man, others display a narrow species specificity. The results show that proteins whose size, charge, and biochemical behavior are very similar to those of desmoplakin I and band 5 protein of cow snout epidermis are present in all desmosomes examined. These include examples of simple and pseudostratified epithelia and myocardial tissue, in addition to those of stratified epithelia. In contrast, in immunoblotting experiments, we have detected desmoplakin II only among cells of stratified and pseudostratified epithelial tissues. This suggests that the desmosomal plaque structure varies in its complement of polypeptides in a cell-type specific manner. We conclude that the obligatory desmosomal plaque proteins, desmoplakin I and band 5 protein, are expressed in a coordinate fashion but independently from other differentiation programs of expression such as those specific for either epithelial or cardiac cells.


2019 ◽  
Author(s):  
Kai Yao ◽  
Nash D. Rochman ◽  
Sean X. Sun

Convolutional neural networks (ConvNets) have been used for both classification and semantic segmentation of cellular images. Here we establish a method for cell type classification utilizing images taken on a benchtop microscope directly from cell culture flasks, eliminating the need for a dedicated imaging platform. Significant flask-to-flask heterogeneity was discovered and overcome to support network generalization to novel data. Cell density was found to be a prominent source of heterogeneity even within the single-cell regime, indicating the presence of morphological effects due to diffusion-mediated cell-cell interaction. Expert classification was poor for single-cell images and excellent for multi-cell images, suggesting that experts rely on the identification of characteristic phenotypes within subsets of each population and not on ubiquitous identifiers. Finally, we introduce Self-Label Clustering, an unsupervised clustering method relying on ConvNet feature extraction that is able to identify distinct morphological phenotypes within a cell type, some of which are observed to be cell density dependent.
Author summary: K.Y., N.D.R., and S.X.S. designed experiments and computational analysis. K.Y. performed experiments and ConvNet design/training. K.Y., N.D.R., and S.X.S. wrote the paper.


Author(s):  
Yinghao Cao ◽  
Xiaoyue Wang ◽  
Gongxin Peng

Currently, most methods rely on manual strategies to annotate cell types after clustering single-cell RNA sequencing (scRNA-seq) data. Such methods are labor-intensive and rely heavily on user expertise, which may lead to inconsistent results. We present SCSA, an automatic tool to annotate cell types from scRNA-seq data, based on a score annotation model combining differentially expressed genes (DEGs) and confidence levels of cell markers from both known and user-defined information. Evaluation against other methods on real scRNA-seq datasets from different sources shows that SCSA is able to assign cells to the correct types in a fully automated mode with desirable precision.


2019 ◽  
Author(s):  
Xiaoyang Chen ◽  
Shengquan Chen ◽  
Rui Jiang

Background: In recent years, the rapid development of single-cell RNA-sequencing (scRNA-seq) techniques has enabled the quantitative characterization of cell types at single-cell resolution. With the explosive growth of the number of cells profiled in individual scRNA-seq experiments, there is a demand for novel computational methods for classifying newly generated scRNA-seq data onto annotated labels. Although several methods have recently been proposed for the cell-type classification of single-cell transcriptomic data, limitations such as inadequate accuracy, inferior robustness, and low stability greatly restrict their wide application.
Results: We propose a novel ensemble approach, named EnClaSC, for accurate and robust cell-type classification of single-cell transcriptomic data. Through comprehensive validation experiments, we demonstrate that EnClaSC can not only be applied to self-projection within a specific dataset and to cell-type classification across different datasets, but also scales well to various data dimensionalities and different data sparsity. We further illustrate the ability of EnClaSC to perform cross-species classification effectively, which may shed light on studies of the correlation between different species. EnClaSC is freely available at https://github.com/xy-chen16/EnClaSC.
Conclusions: EnClaSC enables highly accurate and robust cell-type classification of single-cell transcriptomic data via an ensemble learning method. We expect to see wide application of our method not only to transcriptome studies, but also to the classification of more general data.


BMC Genomics ◽  
2021 ◽  
Vol 22 (S3) ◽  
Author(s):  
Yueh-Hua Tu ◽  
Hsueh-Fen Juan ◽  
Hsuan-Cheng Huang

Background: A new class of regulatory elements called super-enhancers, composed of multiple neighboring enhancers, has recently been reported to be the key transcriptional driver of cellular, developmental, and disease states.
Results: Here, we defined super-enhancer RNAs as highly expressed enhancer RNAs that are transcribed from a cluster of localized genomic regions. Using the cap analysis of gene expression sequencing data from FANTOM5, we systematically explored the enhancer and messenger RNA landscapes in hundreds of different cell types in response to various environments. Applying non-negative matrix factorization (NMF) to super-enhancer RNA profiles, we found that different cell types were well classified. In addition, through the NMF of individual time-course profiles from a single cell type, super-enhancer RNAs were clustered into several states with progressive patterns. We further investigated the enriched biological functions of the proximal genes involved in each pattern, and found that they were associated with the corresponding developmental process.
Conclusions: The proposed super-enhancer RNAs can act as a good alternative, without the complicated measurement of histone modifications, for identifying important regulatory elements of cell type specification and identifying dynamic cell states.


2021 ◽  
Author(s):  
Wenxuan Deng ◽  
Biqing Zhu ◽  
Seyoung Park ◽  
Tomokazu S. Sumida ◽  
Avraham Unterman ◽  
...  

Compared with sequencing-based global genomic profiling, cytometry labels targeted surface markers on millions of cells in parallel either by conjugated rare earth metal particles or Unique Molecular Identifier (UMI) barcodes. Correct annotation of these cells to specific cell types is a key step in the analysis of these data. However, there is no computational tool that automatically annotates single cell proteomics data for cell type inference. In this manuscript, we propose an automated single cell proteomics data annotation approach called ProtAnno to facilitate cell type assignments without laborious manual gating. ProtAnno is designed to incorporate information from annotated single cell RNA-seq (scRNA-seq), CITE-seq, and prior data knowledge (which can be imprecise) on biomarkers for different cell types. We have performed extensive simulations to demonstrate the accuracy and robustness of ProtAnno. For several single cell proteomics datasets that have been manually labeled, ProtAnno was able to correctly label most single cells. In summary, ProtAnno offers an accurate and robust tool to automate cell type annotations for large single cell proteomics datasets, and the analysis of such annotated cell types can offer valuable biological insights.


2020 ◽  
Author(s):  
Jiaxin Fan ◽  
Xuran Wang ◽  
Rui Xiao ◽  
Mingyao Li

Allelic expression imbalance (AEI), quantified by the relative expression of two alleles of a gene in a diploid organism, can help explain phenotypic variations among individuals. Traditional methods detect AEI using bulk RNA sequencing (RNA-seq) data, a data type that averages out cell-to-cell heterogeneity in gene expression across cell types. Since the patterns of AEI may vary across different cell types, it is desirable to study AEI in a cell-type-specific manner. Although this can be achieved by single-cell RNA sequencing (scRNA-seq), it requires full-length transcripts to be sequenced in single cells from a large number of individuals, and such data are still cost-prohibitive to generate. To overcome this limitation and utilize the vast amount of existing disease-relevant bulk tissue RNA-seq data, we developed BSCET, which enables the characterization of cell-type-specific AEI in bulk RNA-seq data by integrating cell type composition information inferred from a small set of scRNA-seq samples, possibly obtained from an external dataset. By modeling covariate effects, BSCET can also detect genes whose cell-type-specific AEI is associated with clinical factors. Through extensive benchmark evaluations, we show that BSCET correctly detected genes with cell-type-specific AEI and differential AEI between healthy and diseased samples using bulk RNA-seq data. BSCET also uncovered cell-type-specific AEIs that were missed in bulk data analysis when the directions of AEI are opposite in different cell types. We further applied BSCET to two pancreatic islet bulk RNA-seq datasets, and detected genes showing cell-type-specific AEI that are related to the progression of type 2 diabetes. Since bulk RNA-seq data are easily accessible, BSCET provides a convenient tool to integrate information from scRNA-seq data to gain insight into AEI with cell type resolution. Results from such analysis will advance our understanding of cell type contributions in human diseases.
Author Summary: Detection of allelic expression imbalance (AEI), a phenomenon where the two alleles of a gene differ in their expression magnitude, is a key step towards the understanding of phenotypic variations among individuals. Existing methods detect AEI using bulk RNA sequencing (RNA-seq) data and ignore AEI variations among different cell types. Although single-cell RNA sequencing (scRNA-seq) has enabled the characterization of cell-to-cell heterogeneity in gene expression, the high costs have limited its application in AEI analysis. To overcome this limitation, we developed BSCET to characterize cell-type-specific AEI using the widely available bulk RNA-seq data by integrating cell-type composition information inferred from scRNA-seq samples. Since the degree of AEI may vary with disease phenotypes, we further extended BSCET to detect genes whose cell-type-specific AEIs are associated with clinical factors. Through extensive benchmark evaluations and analyses of two pancreatic islet bulk RNA-seq datasets, we demonstrated BSCET’s ability to refine bulk-level AEI to cell-type resolution, and to identify genes whose cell-type-specific AEIs are associated with the progression of type 2 diabetes. With the vast amount of easily accessible bulk RNA-seq data, we believe BSCET will be a valuable tool for elucidating cell type contributions in human diseases.
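The quantity described above, the relative expression of two alleles, and the way opposite-direction imbalances cancel in bulk data can be shown with a short sketch. This is not BSCET's model: the functions `allelic_ratio` and `binomial_two_sided_p` are hypothetical names, the exact binomial test against a balanced ratio of 0.5 is a generic choice assumed here for illustration, and the read counts are invented.

```python
import math


def allelic_ratio(ref_reads, alt_reads):
    """Fraction of allele-informative reads that carry the reference allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no allele-informative reads")
    return ref_reads / total


def binomial_two_sided_p(ref_reads, alt_reads, p0=0.5):
    """Exact two-sided binomial p-value for the null of balanced expression.

    Sums the probability of every outcome that is no more likely than the
    observed one. Assumption: a generic test, not the BSCET procedure.
    """
    n = ref_reads + alt_reads
    pmf = [math.comb(n, k) * p0 ** k * (1 - p0) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))


# Invented counts: two cell types with imbalance in opposite directions.
type_a = (80, 20)  # reference allele favoured
type_b = (20, 80)  # alternative allele favoured
bulk = (type_a[0] + type_b[0], type_a[1] + type_b[1])

for name, (ref, alt) in [("type A", type_a), ("type B", type_b), ("bulk", bulk)]:
    print(name, allelic_ratio(ref, alt), binomial_two_sided_p(ref, alt))
```

In this toy case each cell type is strongly imbalanced, yet the pooled bulk counts give a ratio of exactly 0.5, which is the kind of signal the abstract says is missed without cell-type resolution.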

