Using Neural Networks to Improve Single Cell RNA-Seq Data Analysis

2017 ◽  
Author(s):  
Chieh Lin ◽  
Siddhartha Jain ◽  
Hannah Kim ◽  
Ziv Bar-Joseph

Abstract: While only recently developed, the ability to profile expression data in single cells (scRNA-Seq) has already led to several important studies and findings. However, this technology has also raised several new computational challenges, including how to handle the noisy and sometimes incomplete data, how to identify unique groups of cells in such experiments, and how to determine the state or function of specific cells based on their expression profiles. To address these issues, we develop and test a method based on neural networks (NN) for the analysis and retrieval of single cell RNA-Seq data. We tested various NN architectures, some biologically motivated, and used these to obtain a reduced dimension representation of the single cell expression data. We show that the NN method improves upon prior methods in both the ability to correctly group cells in experiments not used in the training and the ability to correctly infer cell type or state by querying a database of tens of thousands of single cell profiles. Such database queries (which can be performed using our web server) will enable researchers to better characterize cells when analyzing heterogeneous scRNA-Seq samples.

Supporting website: http://sb.cs.cmu.edu/scnn/
Password for accessing the retrieval task webserver: scRNA-Seq
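The retrieval idea in this abstract (inferring a cell's type by querying a database of profiles in a reduced-dimension space) can be illustrated with a minimal nearest-neighbour sketch. This is not the authors' implementation: the Euclidean distance, the majority vote, the function names, and the toy 2-D embeddings and labels are all assumptions made for illustration, and the embeddings are taken as given rather than produced by a trained network.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def query_cell_type(query, database, k=3):
    """Return the majority label among the k database entries nearest to `query`.

    `database` is a list of (embedding, label) pairs. The embeddings are assumed
    to be reduced-dimension representations computed elsewhere.
    """
    nearest = sorted(database, key=lambda item: euclidean(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D embeddings with made-up labels, for illustration only.
reference = [
    ([0.1, 0.2], "T cell"),
    ([0.0, 0.3], "T cell"),
    ([0.2, 0.1], "T cell"),
    ([5.0, 5.1], "neuron"),
    ([5.2, 4.9], "neuron"),
]

print(query_cell_type([0.15, 0.15], reference))
```

A real database of tens of thousands of profiles would likely need an indexed or approximate search rather than the full sort used here.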

2021 ◽  
Author(s):  
Hongru Shen ◽  
Xilin Shen ◽  
Mengyao Feng ◽  
Dan Wu ◽  
Chao Zhang ◽  
...  

Advances in single-cell RNA sequencing have led to an exponential accumulation of single-cell expression data. However, there is still a lack of tools that can integrate this ever-growing body of single-cell expression data. Here, we present iSEEEK, a universal approach for integrating super large-scale single-cell expression data by exploring the expression rankings of top-expressing genes. We developed iSEEEK with 13.7 million single cells. We demonstrated the efficiency of iSEEEK with canonical single-cell downstream tasks on five heterogeneous datasets encompassing human and mouse samples. iSEEEK achieved good clustering performance when benchmarked against well-annotated cell labels. In addition, iSEEEK could transfer the knowledge it learned from large-scale expression data to a new dataset that was not involved in its development. iSEEEK enables identification of gene-gene interaction networks that are characteristic of specific cell types. Our study presents a simple yet effective method to integrate super large-scale single-cell transcriptomes and would facilitate translational single-cell research from bench to bedside.
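The abstract describes representing cells through the expression rankings of their top-expressing genes. A minimal sketch of that encoding step is shown below. The function name, the `top_n` parameter, the zero-expression filter, the alphabetical tie-break, and the toy expression values are assumptions for illustration; how iSEEEK consumes the resulting rankings is not described in this abstract and is not modelled here.

```python
def top_gene_ranking(expression, top_n=3):
    """Return the names of the `top_n` highest-expressed genes, highest first.

    `expression` maps gene name -> expression value for a single cell.
    Genes with zero expression are dropped; ties are broken alphabetically
    so that the output is deterministic.
    """
    expressed = [(gene, value) for gene, value in expression.items() if value > 0]
    expressed.sort(key=lambda gv: (-gv[1], gv[0]))
    return [gene for gene, _ in expressed[:top_n]]


# Made-up expression values for one hypothetical cell.
cell = {"CD3E": 12.0, "GAPDH": 40.0, "MS4A1": 0.0, "ACTB": 40.0, "CD8A": 3.5}

print(top_gene_ranking(cell, top_n=3))
```

Because only the order of genes is kept, such an encoding does not depend on the absolute scale of the expression values.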


2017 ◽  
Author(s):  
Tao Peng ◽  
Qing Nie

Abstract: Measurement of gene expression levels for multiple genes in single cells provides a powerful approach to study heterogeneity of cell populations and cellular plasticity. While the expression levels of multiple genes in each cell are available in such data, the potential connections among the cells (e.g. the cellular state transition relationship) are not directly evident from the measurement. Classifying the cellular states, identifying the transitions among those states, and extracting the pseudotime ordering of cells are challenging because of the noise in the data and the high dimensionality arising from the number of genes measured. In this paper we adapt the classical self-organizing-map (SOM) approach for single-cell gene expression data (SOMSC), such as those based on single cell qPCR and single cell RNA-seq. In SOMSC, a cellular state map (CSM) is derived and employed to identify cellular states inherited in the population of the measured single cells. Cells located in the same basin of the CSM are considered to be in one cellular state, while barriers among the basins in the CSM provide information on transitions among the cellular states. A cellular state transition path (e.g. differentiation) and a temporal ordering of the measured single cells are consequently obtained. In addition, SOMSC can estimate the cellular state replication probability and transition probabilities. Applied to a set of synthetic data, one single-cell qPCR data set on mouse early embryonic development and two single-cell RNA-seq data sets, SOMSC is effective in capturing the cellular states and their transitions present in the high-dimensional single-cell data. This approach will have broader applications to analyzing cellular fate specification and cell lineages using single cell gene expression data.
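For readers unfamiliar with the classical self-organizing map that SOMSC adapts, the following is a minimal sketch of generic SOM training. It is not the SOMSC method: the cellular state map, basins, and transition probabilities are not reproduced, and the grid size, linear decay schedules, Gaussian neighbourhood, function names, and toy data are all assumptions chosen for illustration.

```python
import math
import random


def best_matching_unit(weights, sample):
    """Index of the map node whose weight vector is closest to `sample`."""
    def sq_dist(w):
        return sum((wi - si) ** 2 for wi, si in zip(w, sample))
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i]))


def train_som(data, rows, cols, epochs=20, lr0=0.5, sigma0=1.0, seed=0):
    """Train a rows x cols SOM and return a flat list of node weight vectors.

    Node i sits at grid position (i // cols, i % cols). The learning rate and
    the neighbourhood width both decay linearly over the epochs.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        sigma = max(sigma0 * frac, 1e-3)
        for sample in rng.sample(data, len(data)):  # shuffled pass over the cells
            bmu = best_matching_unit(weights, sample)
            br, bc = divmod(bmu, cols)
            for i, w in enumerate(weights):
                r, c = divmod(i, cols)
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
                for j in range(dim):
                    w[j] += lr * h * (sample[j] - w[j])
    return weights


# Toy two-gene expression profiles forming two loose groups (made-up values).
cells = [[0.1, 0.1], [0.2, 0.1], [0.1, 0.2], [0.9, 0.8], [0.8, 0.9], [0.9, 0.9]]
som = train_som(cells, rows=2, cols=2)
assignments = [best_matching_unit(som, c) for c in cells]
```

After training, each cell is assigned to its best-matching node; any further analysis of the map, such as the basin and barrier structure described in the abstract, would be built on top of these node weights.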


Author(s):  
Yuanchao Zhang ◽  
Man S. Kim ◽  
Elizabeth Nguyen ◽  
Deanne M. Taylor

Abstract: Cellular metabolism encompasses the biochemical reactions and transportation of various metabolites in cells and their surroundings, which are integrated at all levels of cellular functions. We developed a method to systematically simulate cellular metabolism using single-cell RNA-seq (scRNA-seq) data through constraint-based context specific metabolic modeling. We simulated the NAD+ biosynthesis activity in 7 different mouse tissues, and the simulated NAD+ biosynthesis flux levels showed significant linear correlation with experimental measurements in previous research. We also show that the simulated NAD+ biosynthesis fluxes are reproducible using two additional scRNA-seq datasets.


2020 ◽  
Author(s):  
Snehalika Lall ◽  
Sumanta Ray ◽  
Sanghamitra Bandyopadhyay

Abstract: We propose RgCop, a novel regularized copula based method for gene selection from large single cell RNA-seq data. RgCop utilizes copula correlation (Ccor), a robust equitable dependence measure that captures multivariate dependency among a set of genes in single cell expression data. We formulate an objective function by adding an l1 regularization term to Ccor to penalize the redundant coefficients of features/genes, resulting in a non-redundant, effective set of features/genes. Results show a significant improvement in the clustering/classification performance on real-life scRNA-seq data over other state-of-the-art methods. RgCop performs extremely well in capturing dependence among the features of noisy data due to the scale invariant property of copula, thereby improving the stability of the method. Moreover, the differentially expressed (DE) genes identified from the clusters of scRNA-seq data are found to provide an accurate annotation of cells. Finally, the features/genes obtained from RgCop are able to annotate unknown cells with high accuracy.

The corresponding software is available at: https://github.com/Snehalikalall/RgCop


2019 ◽  
Author(s):  
Debajyoti Sinha ◽  
Pradyumn Sinha ◽  
Ritwik Saha ◽  
Sanghamitra Bandyopadhyay ◽  
Debarka Sengupta

Abstract

Summary: DropClust leverages Locality Sensitive Hashing (LSH) to speed up clustering of large scale single cell expression data. Here we present the improved dropClust, a complete R package that is fast, interoperable and minimally resource intensive. The new dropClust features a novel batch effect removal algorithm that allows integrative analysis of single cell RNA-seq (scRNA-seq) datasets.

Availability and implementation: dropClust is freely available at https://github.com/debsin/dropClust as an R package. A lightweight online version of dropClust is available at https://debsinha.shinyapps.io/dropClust/.

Supplementary information: Supplementary data are available at Bioinformatics online.
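To give a sense of how Locality Sensitive Hashing can cut down the work of comparing cells, here is a minimal sketch of one common LSH family, random-hyperplane (sign) hashing. It is written in Python purely for illustration, whereas dropClust itself is an R package. The abstract does not say which LSH scheme dropClust uses, so the hashing scheme, the function names, and the toy expression vectors are all assumptions and should not be read as dropClust's implementation.

```python
import random
from collections import defaultdict


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw `n_bits` random hyperplane normals in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def signature(vector, hyperplanes):
    """Bit tuple: 1 where the vector lies on the non-negative side of a hyperplane."""
    return tuple(
        1 if sum(h * v for h, v in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def bucket_cells(cells, hyperplanes):
    """Group cell names by hash signature; similar directions tend to collide."""
    buckets = defaultdict(list)
    for name, vector in cells.items():
        buckets[signature(vector, hyperplanes)].append(name)
    return dict(buckets)


# Made-up three-gene expression vectors; cell_b is cell_a scaled by 2.
cells = {
    "cell_a": [1.0, 2.0, 3.0],
    "cell_b": [2.0, 4.0, 6.0],
    "cell_c": [-1.0, -2.0, -3.0],
}
planes = make_hyperplanes(dim=3, n_bits=4)
buckets = bucket_cells(cells, planes)
```

Cells that share a bucket become candidate neighbours, so exact distance computations can be restricted to cells within the same bucket instead of all pairs.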


2021 ◽  
Vol 7 (8) ◽  
pp. eabe3610
Author(s):  
Conor J. Kearney ◽  
Stephin J. Vervoort ◽  
Kelly M. Ramsbottom ◽  
Izabela Todorovski ◽  
Emily J. Lelliott ◽  
...  

Multimodal single-cell RNA sequencing enables the precise mapping of transcriptional and phenotypic features of cellular differentiation states but does not allow for simultaneous integration of critical posttranslational modification data. Here, we describe SUrface-protein Glycan And RNA-seq (SUGAR-seq), a method that enables detection and analysis of N-linked glycosylation, extracellular epitopes, and the transcriptome at the single-cell level. Integrated SUGAR-seq and glycoproteome analysis identified tumor-infiltrating T cells with unique surface glycan properties that report their epigenetic and functional state.


2017 ◽  
Vol 45 (17) ◽  
pp. e156-e156 ◽  
Author(s):  
Chieh Lin ◽  
Siddhartha Jain ◽  
Hannah Kim ◽  
Ziv Bar-Joseph

2018 ◽  
Vol 9 (1) ◽  
Author(s):  
Aashi Jindal ◽  
Prashant Gupta ◽  
Jayadeva ◽  
Debarka Sengupta

2018 ◽  
Vol 47 (D1) ◽  
pp. D711-D715 ◽  
Author(s):  
Awais Athar ◽  
Anja Füllgrabe ◽  
Nancy George ◽  
Haider Iqbal ◽  
Laura Huerta ◽  
...  

2017 ◽  
Vol 45 (22) ◽  
pp. e179-e179 ◽  
Author(s):  
Shun H. Yip ◽  
Panwen Wang ◽  
Jean-Pierre A. Kocher ◽  
Pak Chung Sham ◽  
Junwen Wang
