BD-08 A novel approach to analyze single cell RNA-Seq data from lupus nephritis samples

2018
Author(s): Brian J Kegerreis, Amrie C Grammer, Peter E Lipsky

2019
Author(s): Marcus Alvarez, Elior Rahmani, Brandon Jew, Kristina M. Garske, Zong Miao, ...

Abstract: Single-nucleus RNA sequencing (snRNA-seq) measures gene expression in individual nuclei instead of cells, allowing for unbiased cell type characterization in solid tissues. In contrast to single-cell RNA-seq (scRNA-seq), we observe that snRNA-seq is commonly subject to contamination by high amounts of extranuclear background RNA, which, if overlooked, can lead to the identification of spurious cell types in downstream clustering analyses. We present a novel approach to remove debris-contaminated droplets in snRNA-seq experiments, called Debris Identification using Expectation Maximization (DIEM). Our likelihood-based approach models the gene expression distributions of debris and cell types, which are estimated using EM. We evaluated DIEM using three snRNA-seq data sets: 1) human differentiating preadipocytes in vitro, 2) fresh mouse brain tissue, and 3) human frozen adipose tissue (AT) from six individuals. All three data sets showed various degrees of extranuclear RNA contamination. We observed that existing methods failed to account for contaminated droplets and led to spurious cell types. Compared with filtering using these state-of-the-art methods, DIEM better removed droplets containing high levels of extranuclear RNA and led to higher-quality clusters. Although DIEM was designed for snRNA-seq data, we also successfully applied it to single-cell data. To conclude, our novel method DIEM removes debris-contaminated droplets from single-cell-based data quickly and effectively, leading to cleaner downstream analysis. Our code is freely available at https://github.com/marcalva/diem.
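The idea of estimating debris and cell-type expression profiles with EM can be illustrated with a small sketch. The code below is not the DIEM implementation (which lives in the repository linked above); it assumes a plain multinomial mixture, a hypothetical function name `em_multinomial_mixture`, toy counts, and an initialization based on total counts per droplet, all of which are assumptions made for illustration only.

```python
import math

def em_multinomial_mixture(counts, init_resp, n_iter=50, pseudocount=1e-3):
    """Fit a k-component multinomial mixture to droplet-by-gene counts with EM.

    counts:    list of droplets, each a list of per-gene counts
    init_resp: initial responsibilities, one list of k values per droplet
    Returns (weights, profiles, responsibilities).
    """
    n, g = len(counts), len(counts[0])
    k = len(init_resp[0])
    resp = [row[:] for row in init_resp]
    for _ in range(n_iter):
        # M-step: mixing weights and per-component gene profiles
        weights = [sum(resp[i][c] for i in range(n)) / n for c in range(k)]
        profiles = []
        for c in range(k):
            totals = [pseudocount + sum(resp[i][c] * counts[i][j] for i in range(n))
                      for j in range(g)]
            s = sum(totals)
            profiles.append([t / s for t in totals])
        # E-step: posterior probability of each component for each droplet
        for i in range(n):
            log_post = [math.log(weights[c] + 1e-300)
                        + sum(counts[i][j] * math.log(profiles[c][j]) for j in range(g))
                        for c in range(k)]
            m = max(log_post)  # subtract the max for numerical stability
            unnorm = [math.exp(lp - m) for lp in log_post]
            z = sum(unnorm)
            resp[i] = [u / z for u in unnorm]
    return weights, profiles, resp

# Toy data (hypothetical): gene 0 stands in for extranuclear background RNA.
counts = [
    [9, 1, 0], [8, 2, 0], [10, 0, 1],       # debris-like droplets
    [1, 20, 19], [0, 25, 15], [2, 18, 22],  # nucleus-like droplets
    [7, 1, 1],                              # another low-count droplet
]
# Assumed initialization: low total count -> mostly component 0 (debris).
init = [[0.9, 0.1] if sum(row) < 15 else [0.1, 0.9] for row in counts]
weights, profiles, resp = em_multinomial_mixture(counts, init)
labels = [max(range(2), key=lambda c: r[c]) for r in resp]
print(labels)
```

In this toy setting, droplets assigned to component 0 would be the ones filtered out before clustering; real data would need many more genes and a more careful choice of starting values.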


Author(s): Kaikun Xie, Zehua Liu, Ning Chen, Ting Chen

Abstract: Recent advances in single-cell RNA-seq technology facilitate the study of cell lineages in developmental processes as well as in cancer. In this manuscript, we developed a computational method, called redPATH, to reconstruct the pseudo developmental time of cell lineages using a consensus asymmetric Hamiltonian path algorithm. We also implemented a novel approach to visualize the trajectory development of cells, along with visualization methods that provide biological insights. We validated the performance of redPATH by segmenting different stages of cell development on multiple neural stem cell and cancerous datasets, as well as other single-cell transcriptome data. In particular, we identified a subpopulation of malignant glioma cells that are stem cell-like. These cells express known proliferative markers such as GFAP (ATP1A2, IGFBPL1, and ALDOC were also identified), while quiescence markers such as ID3 remain silenced. Furthermore, MCL1 is identified as a significant gene that regulates cell apoptosis, and the identification of CSF1R is consistent with previous studies on reprogramming macrophages to control tumor growth. In conclusion, redPATH is a comprehensive tool for analyzing single-cell RNA-Seq datasets along a pseudo developmental time. The software is available via http://github.com/tinglab/redPATH.
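To make the Hamiltonian-path view of pseudotime concrete, here is a minimal sketch that orders a handful of cells by finding the cheapest path that visits each cell exactly once. It is a brute-force illustration, not redPATH's consensus asymmetric algorithm; the function name `shortest_hamiltonian_path`, the one-dimensional toy expression values, and the absolute-difference cost are all assumptions.

```python
from itertools import permutations

def shortest_hamiltonian_path(cost):
    """Return (order, total_cost) of the cheapest path visiting every node once.

    cost[a][b] is the cost of stepping from cell a to cell b; the matrix may
    be asymmetric. Brute force over all orderings, so only usable for a
    handful of cells.
    """
    n = len(cost)
    best_order, best_cost = None, float("inf")
    for order in permutations(range(n)):
        total = sum(cost[a][b] for a, b in zip(order, order[1:]))
        if total < best_cost:
            best_order, best_cost = order, total
    return list(best_order), best_cost

# Toy data (hypothetical): one expression value per cell, shuffled in "time".
expression = [0.0, 3.0, 1.0, 2.0]
cost = [[abs(a - b) for b in expression] for a in expression]
order, total = shortest_hamiltonian_path(cost)
print(order, total)
```

The recovered order walks the cells monotonically along the expression value, which is the intuition behind reading a Hamiltonian path as a pseudo developmental time. With a symmetric cost the path direction is arbitrary, so the start and end of the trajectory would have to be fixed by other information.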


2019
Author(s): Magdalena E Strauss, Paul D W Kirk, John E Reid, Lorenz Wernisch

Abstract

Motivation: Many methods have been developed to cluster genes on the basis of their changes in mRNA expression over time, using bulk RNA-seq or microarray data. However, single-cell data may present a particular challenge for these algorithms, since the temporal ordering of cells is not directly observed. One way to address this is to first use pseudotime methods to order the cells, and then apply clustering techniques for time course data. However, pseudotime estimates are subject to high levels of uncertainty, and failing to account for this uncertainty is liable to lead to erroneous and/or over-confident gene clusters.

Results: The proposed method, GPseudoClust, is a novel approach that jointly infers pseudotemporal ordering and gene clusters, and quantifies the uncertainty in both. GPseudoClust combines a recent method for pseudotime inference with nonparametric Bayesian clustering methods, efficient MCMC sampling, and novel subsampling strategies which aid computation. We consider a broad array of simulated and experimental datasets to demonstrate the effectiveness of GPseudoClust in a range of settings.

Availability: An implementation is available on GitHub: https://github.com/magStra/nonparametricSummaryPSM and https://github.com/magStra/GPseudoClust.

Supplementary Information: Supplementary data are available at Bioinformatics online.


2020
Author(s): Luqin Gan, Giuseppe Vinci, Genevera I. Allen

Abstract: Single-cell RNA sequencing is a powerful technique that measures the gene expression of individual cells in a high-throughput fashion. However, because of sequencing inefficiency, the data are made unreliable by dropout events, technical artifacts in which genes erroneously appear to have zero expression. Many data imputation methods have been proposed to alleviate this issue. Yet effective imputation can be difficult and biased because the data are sparse and high-dimensional, resulting in major distortions in downstream analyses. In this paper, we propose a novel approach that imputes the gene-by-gene correlations rather than the data itself. We call this method SCENA: Single cell RNA-seq Correlation completion by ENsemble learning and Auxiliary information. The SCENA gene-by-gene correlation matrix estimate is obtained by model stacking of multiple imputed correlation matrices based on known auxiliary information about gene connections. In an extensive simulation study based on real scRNA-seq data, we demonstrate that SCENA not only accurately imputes gene correlations but also outperforms existing imputation approaches in downstream analyses such as dimension reduction, cell clustering, and graphical model estimation.
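The stacking step can be pictured as a weighted combination of candidate correlation matrices. The sketch below is only an illustration under strong assumptions: the function name `stack_correlations` is hypothetical, the weights are simply supplied by the caller, and how SCENA actually derives its weights from auxiliary information is not described in the abstract and is not reproduced here.

```python
def stack_correlations(matrices, weights):
    """Combine several gene-by-gene correlation matrices into one estimate.

    matrices: list of square matrices (lists of lists) of equal size
    weights:  one non-negative weight per matrix; normalized to sum to 1
    """
    if len(matrices) != len(weights):
        raise ValueError("need exactly one weight per matrix")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    g = len(matrices[0])
    stacked = [[0.0] * g for _ in range(g)]
    for matrix, w in zip(matrices, weights):
        for i in range(g):
            for j in range(g):
                stacked[i][j] += (w / total) * matrix[i][j]
    return stacked

# Toy data (hypothetical): two imputed correlation estimates for two genes.
estimate_a = [[1.0, 0.2], [0.2, 1.0]]
estimate_b = [[1.0, 0.6], [0.6, 1.0]]
stacked = stack_correlations([estimate_a, estimate_b], weights=[1, 3])
print(stacked)
```

Because the weights are normalized to a convex combination, the stacked matrix keeps a unit diagonal and stays symmetric whenever the inputs are.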

