Cell type classification and unsupervised morphological phenotype identification from low-res images with deep learning

2019 ◽  
Author(s):  
Kai Yao ◽  
Nash D. Rochman ◽  
Sean X. Sun

Abstract: Convolutional neural networks (ConvNets) have been used for both classification and semantic segmentation of cellular images. Here we establish a method for cell type classification utilizing images taken on a benchtop microscope directly from cell culture flasks, eliminating the need for a dedicated imaging platform. Significant flask-to-flask heterogeneity was discovered and overcome to support network generalization to novel data. Cell density was found to be a prominent source of heterogeneity even within the single-cell regime, indicating the presence of morphological effects due to diffusion-mediated cell-cell interaction. Expert classification was poor for single-cell images and excellent for multi-cell images, suggesting experts rely on the identification of characteristic phenotypes within subsets of each population and not ubiquitous identifiers. Finally, we introduce Self-Label Clustering, an unsupervised clustering method relying on ConvNet feature extraction that is able to identify distinct morphological phenotypes within a cell type, some of which are observed to be cell density dependent. Author summary: K.Y., N.D.R., and S.X.S. designed experiments and computational analysis. K.Y. performed experiments and ConvNet design/training. K.Y., N.D.R., and S.X.S. wrote the paper.

2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Kai Yao ◽  
Nash D. Rochman ◽  
Sean X. Sun

Abstract Convolutional neural networks (ConvNets) have proven to be successful in both the classification and semantic segmentation of cell images. Here we establish a method for cell type classification utilizing images taken with a benchtop microscope directly from cell culture flasks, eliminating the need for a dedicated imaging platform. Significant flask-to-flask morphological heterogeneity was discovered and overcome to support network generalization to novel data. Cell density was found to be a prominent source of heterogeneity even when cells are not in contact. For the same cell types, expert classification was poor for single-cell images and better for multi-cell images, suggesting experts rely on the identification of characteristic phenotypes within subsets of each population. We also introduce Self-Label Clustering (SLC), an unsupervised clustering method relying on feature extraction from the hidden layers of a ConvNet, capable of cellular morphological phenotyping. This clustering approach is able to identify distinct morphological phenotypes within a cell type, some of which are observed to be cell density dependent. Finally, our cell classification algorithm was able to accurately identify cells in mixed populations, showing that ConvNet cell type classification can be a label-free alternative to traditional cell sorting and identification.


2019 ◽  
Author(s):  
Jingxin Liu ◽  
You Song ◽  
Jinzhi Lei

We present the use of single-cell entropy (scEntropy) to measure the order of the cellular transcriptome profile from single-cell RNA-seq data, which leads to a method of unsupervised cell type classification through scEntropy followed by the Gaussian mixture model (scEGMM). scEntropy is straightforward in defining an intrinsic transcriptional state of a cell. scEGMM is a coherent method of cell type classification that includes no parameters and no clustering; however, it is comparable to existing machine learning-based methods in benchmarking studies and facilitates biological interpretation.


2020 ◽  
Vol 15 (01) ◽  
pp. 35-49
Author(s):  
Jingxin Liu ◽  
You Song ◽  
Jinzhi Lei

The cell is the basic functional and biological unit of life, and a complex system that contains a huge number of molecular components. How can we quantify the macroscopic state of a cell from the microscopic information of these molecular components? This is a fundamental question to increase the understanding of the human body. The recent maturation of single-cell RNA sequencing (scRNA-seq) technologies has allowed researchers to gain information on the transcriptomes of individual cells. Although considerable progress has been made in terms of cell-type clustering over the past few years, there is no strong consensus about how to define a cell state from scRNA-seq data. Here, we present single-cell entropy (scEntropy) as an order parameter for cellular transcriptome profiles from scRNA-seq data. scEntropy is a straightforward parameter with which to define the intrinsic transcriptional state of a cell that can provide a quantity to measure the developmental process and to distinguish different cell types. The proposed scEntropy followed by Gaussian mixture model (scEGMM) provides a coherent method of cell-type classification that is simple, includes no parameters or clustering and is comparable to existing machine learning-based methods in benchmarking studies. The results of cell-type classification based on scEGMM are robust and easy to biologically interpret.
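The second stage of the pipeline described above, a Gaussian mixture model fitted to one scalar value per cell, can be illustrated with a minimal sketch. This is not the authors' implementation: the per-cell scores below are hypothetical placeholder values (the abstract does not define how scEntropy is computed), the function names are invented for illustration, and the number of mixture components is fixed at two purely for brevity.

```python
import math

def log_weighted_density(v, weight, mean, var):
    """Log of weight * Normal(v | mean, var)."""
    return (math.log(weight)
            - 0.5 * math.log(2.0 * math.pi * var)
            - (v - mean) ** 2 / (2.0 * var))

def fit_two_component_gmm(values, n_iter=100):
    """Fit a two-component 1-D Gaussian mixture with EM.

    Assumption: two components and min/max initialisation are
    illustrative choices, not taken from the source.
    """
    lo, hi = min(values), max(values)
    means = [lo, hi]
    start_var = max((hi - lo) ** 2 / 4.0, 1e-6)
    variances = [start_var, start_var]
    weights = [0.5, 0.5]
    n = len(values)
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability.
        resp = []
        for v in values:
            logs = [log_weighted_density(v, w, m, s)
                    for w, m, s in zip(weights, means, variances)]
            top = max(logs)
            exps = [math.exp(l - top) for l in logs]
            total = sum(exps)
            resp.append([e / total for e in exps])
        # M-step: update weights, means, and variances (with small floors).
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = max(nk / n, 1e-12)
            means[k] = sum(r[k] * v for r, v in zip(resp, values)) / nk
            var = sum(r[k] * (v - means[k]) ** 2
                      for r, v in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)
    return weights, means, variances

def assign_components(values, params):
    """Label each value with the index of its most probable component."""
    weights, means, variances = params
    labels = []
    for v in values:
        logs = [log_weighted_density(v, w, m, s)
                for w, m, s in zip(weights, means, variances)]
        labels.append(logs.index(max(logs)))
    return labels

# Hypothetical per-cell scalar scores for ten cells (placeholder data).
scores = [1.0, 1.1, 0.9, 1.05, 0.95, 5.0, 5.1, 4.9, 5.05, 4.95]
params = fit_two_component_gmm(scores)
labels = assign_components(scores, params)
```

Each cell receives the label of the mixture component under which its score is most probable, so cell types are separated along a single axis rather than by a distance-based clustering step.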


2019 ◽  
Author(s):  
Matthew N. Bernstein ◽  
Zhongjie Ma ◽  
Michael Gleicher ◽  
Colin N. Dewey

Summary: Cell type annotation is a fundamental task in the analysis of single-cell RNA-sequencing data. In this work, we present CellO, a machine learning-based tool for annotating human RNA-seq data with the Cell Ontology. CellO enables accurate and standardized cell type classification by considering the rich hierarchical structure of known cell types, a source of prior knowledge that is not utilized by existing methods. Furthermore, CellO comes pre-trained on a novel, comprehensive dataset of human, healthy, untreated primary samples in the Sequence Read Archive, which, to the best of our knowledge, is the most diverse curated collection of primary cell data to date. CellO's comprehensive training set enables it to run out-of-the-box on diverse cell types, and it achieves superior or competitive performance when compared to existing state-of-the-art methods. Lastly, CellO's linear models are easily interpreted, thereby enabling exploration of cell type-specific expression signatures across the ontology. To this end, we also present the CellO Viewer: a web application for exploring CellO's models across the ontology. Highlights: We present CellO, a tool for hierarchically classifying cell type from single-cell RNA-seq data against the graph-structured Cell Ontology. CellO is pre-trained on a comprehensive dataset comprising nearly all bulk RNA-seq primary cell samples in the Sequence Read Archive. CellO achieves superior or comparable performance with existing methods while featuring a more comprehensive pre-packaged training set. CellO is built with easily interpretable models, which we expose through a novel web application, the CellO Viewer, for exploring cell type-specific signatures across the Cell Ontology.


2020 ◽  
Author(s):  
Jinjin Tian ◽  
Jiebiao Wang ◽  
Kathryn Roeder

Abstract. Motivation: Gene-gene co-expression networks (GCN) are of biological interest for the useful information they provide for understanding gene-gene interactions. The advent of single cell RNA-sequencing allows us to examine more subtle gene co-expression occurring within a cell type. Many imputation and denoising methods have been developed to deal with the technical challenges observed in single cell data; meanwhile, several simulators have been developed for benchmarking and assessing these methods. Most of these simulators, however, either do not incorporate gene co-expression or generate co-expression in an inconvenient manner. Results: Therefore, with the focus on gene co-expression, we propose a new simulator, ESCO, which adopts the idea of the copula to impose gene co-expression while preserving the strengths of available simulators, which perform well at simulating marginal gene expression. Using ESCO, we assess the performance of imputation methods on GCN recovery and find that imputation generally helps GCN recovery when the data are not too sparse, and that the ensemble imputation method works best among leading methods. In contrast, imputation fails to help in the presence of an excessive fraction of zero counts, where simple data aggregating methods are a better choice. These findings are further verified with mouse and human brain cell data. Availability: The ESCO implementation is available as R package SplatterESCO (https://github.com/JINJINT/SplatterESCO).
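The copula idea mentioned in this abstract can be sketched generically: draw correlated standard normals, map them to uniforms with the normal CDF, then push each uniform through the inverse CDF of the desired marginal. The sketch below is not ESCO's implementation. It assumes a Gaussian copula, Poisson marginals, and only two genes, and every function name and parameter value is hypothetical.

```python
import math
import random

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def poisson_quantile(u, mean):
    """Smallest k with P(X <= k) >= u for X ~ Poisson(mean)."""
    u = min(u, 1.0 - 1e-12)  # guard against u == 1.0
    k = 0
    pmf = math.exp(-mean)
    cdf = pmf
    while cdf < u and k < 10_000:
        k += 1
        pmf *= mean / k
        cdf += pmf
    return k

def simulate_gene_pair(n_cells, rho, mean_a, mean_b, seed=0):
    """Simulate counts for two genes coupled through a Gaussian copula.

    Assumption: Poisson marginals with means mean_a and mean_b; rho is
    the correlation of the latent normals, not of the counts themselves.
    """
    rng = random.Random(seed)
    counts_a, counts_b = [], []
    for _ in range(n_cells):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        counts_a.append(poisson_quantile(normal_cdf(z1), mean_a))
        counts_b.append(poisson_quantile(normal_cdf(z2), mean_b))
    return counts_a, counts_b

def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

coexpressed = simulate_gene_pair(2000, rho=0.8, mean_a=5.0, mean_b=3.0, seed=1)
independent = simulate_gene_pair(2000, rho=0.0, mean_a=5.0, mean_b=3.0, seed=1)
```

Because the dependence lives entirely in the latent normals, the marginal count distribution of each gene is unchanged when `rho` is varied, which is the property that lets a copula add co-expression on top of an existing marginal simulator.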


2015 ◽  
Author(s):  
Flore Nallet-Staub ◽  
Xueqian Yin ◽  
Cristèle Gilbert ◽  
Véronique Marsaud ◽  
Saber Ben Mimoun ◽  
...  

2019 ◽  
Author(s):  
Hiraku Miyagi ◽  
Michio Hiroshima ◽  
Yasushi Sako

Abstract: Growth factors regulate cell fates, including their proliferation, differentiation, survival, and death, according to the cell type. Even when the response to a specific growth factor is deterministic for collective cell behavior, significant levels of fluctuation are often observed between single cells. Statistical analyses of single-cell responses provide insights into the mechanism of cell fate decisions, but very little is known about the distributions of the internal states of cells responding to growth factors. Using multi-color immunofluorescent staining, we have here detected the phosphorylation of seven elements in the early response of the ERBB–RAS–MAPK system to two growth factors. Among these seven elements, five were analyzed simultaneously in distinct combinations in the same single cells. Although principal component analysis suggested cell-type and input specific phosphorylation patterns, cell-to-cell fluctuation was large. Mutual information analysis suggested that cells use multitrack (bush-like) signal transduction pathways under conditions in which clear cell fate changes have been reported. The clustering of single-cell response patterns indicated that the fate change in a cell population correlates with the large entropy of the response, suggesting a bet-hedging strategy is used in decision making. A comparison of true and randomized datasets further indicated that this large variation is not produced by simple reaction noise, but is defined by the properties of the signal-processing network. Author Summary: How extracellular signals, such as growth factors (GFs), induce fate changes in biological cells is still not fully understood. Some GFs induce cell proliferation and others induce differentiation by stimulating a common reaction network. Although the response to each GF is reproducible for a cell population, not all single cells respond similarly.
The question that arises is whether a certain GF conducts all the responding cells in the same direction during a fate change, or if it initially stimulates a variety of behaviors among single cells, from which the cells that move in the appropriate direction are later selected. Our current statistical analysis of single-cell responses suggests that the latter process, which is called a bet-hedging mechanism, is plausible. The complex pathways of signal transmission seem to be responsible for this bet-hedging.
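The two information-theoretic quantities named in this abstract, mutual information between signaling elements and entropy of the response, can be illustrated with plug-in estimators on discretized data. This is a generic sketch rather than the authors' analysis: it assumes phosphorylation levels have already been binned into discrete states, and the example sequences are hypothetical.

```python
import math
from collections import Counter

def entropy(states):
    """Shannon entropy (in bits) of a sequence of discrete states."""
    n = len(states)
    return -sum((c / n) * math.log2(c / n) for c in Counter(states).values())

def mutual_information(xs, ys):
    """Plug-in mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    count_x = Counter(xs)
    count_y = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) / (p(x) p(y)) simplifies to c * n / (count_x * count_y).
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi

# Hypothetical binarized states (0 = low, 1 = high) in four cells.
upstream = [0, 0, 1, 1]
coupled_downstream = [0, 0, 1, 1]     # follows the upstream element exactly
uncoupled_downstream = [0, 1, 0, 1]   # statistically independent of it

mi_coupled = mutual_information(upstream, coupled_downstream)
mi_uncoupled = mutual_information(upstream, uncoupled_downstream)
response_entropy = entropy(upstream)
```

Plug-in estimates of this kind are biased upward for small samples, so comparing against a shuffled dataset, as the abstract describes for true versus randomized data, is a common sanity check.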


Author(s):  
Yinghao Cao ◽  
Xiaoyue Wang ◽  
Gongxin Peng

Abstract: Currently, most methods rely on manual strategies to annotate cell types after clustering single-cell RNA sequencing (scRNA-seq) data. Such methods are labor-intensive and rely heavily on user expertise, which may lead to inconsistent results. We present SCSA, an automatic tool to annotate cell types from scRNA-seq data, based on a score annotation model combining differentially expressed genes (DEGs) and confidence levels of cell markers from both known and user-defined information. Evaluation on real scRNA-seq datasets from different sources, in comparison with other methods, shows that SCSA is able to assign cells to the correct types in a fully automated mode with desirable precision.


2019 ◽  
Author(s):  
Xiaoyang Chen ◽  
Shengquan Chen ◽  
Rui Jiang

Abstract. Background: In recent years, the rapid development of single-cell RNA-sequencing (scRNA-seq) techniques has enabled the quantitative characterization of cell types at single-cell resolution. With the explosive growth of the number of cells profiled in individual scRNA-seq experiments, there is a demand for novel computational methods for classifying newly generated scRNA-seq data onto annotated labels. Although several methods have recently been proposed for the cell-type classification of single-cell transcriptomic data, limitations such as inadequate accuracy, inferior robustness, and low stability greatly restrict their wide application. Results: We propose a novel ensemble approach, named EnClaSC, for accurate and robust cell-type classification of single-cell transcriptomic data. Through comprehensive validation experiments, we demonstrate that EnClaSC can not only be applied to self-projection within a specific dataset and to cell-type classification across different datasets, but also scales well to various data dimensionalities and different data sparsities. We further illustrate the ability of EnClaSC to effectively perform cross-species classification, which may shed light on studies of the correlation between different species. EnClaSC is freely available at https://github.com/xy-chen16/EnClaSC. Conclusions: EnClaSC enables highly accurate and robust cell-type classification of single-cell transcriptomic data via an ensemble learning method. We expect to see wide applications of our method not only to transcriptome studies, but also to the classification of more general data.

