UMI-count modeling and differential expression analysis for single-cell RNA sequencing

Single-cell RNA sequencing (scRNA-seq) is a recent high-throughput sequencing technique for studying gene expression at the level of individual cells. Differential expression (DE) analysis is a major downstream analysis of scRNA-seq data. Performing DE analysis in the presence of noise from different sources remains a key challenge in scRNA-seq. Early practice addressed this by borrowing methods from bulk RNA-seq, which test for non-zero differences in the average expression of genes across cell populations. Later, several methods designed specifically for scRNA-seq were developed. To provide guidance on choosing an appropriate tool or developing a new one, the performance of DE analysis methods needs to be studied comprehensively. Here, we review and classify DE approaches adapted from bulk RNA-seq practice as well as those designed specifically for scRNA-seq. We also evaluate the performance of 19 widely used methods in terms of 13 performance metrics on 11 real scRNA-seq datasets. Our findings suggest that some bulk RNA-seq methods are quite competitive with the single-cell methods and that their performance depends on the underlying models, DE test statistic(s), and data characteristics. Further, it is difficult to identify a single method that performs best globally on the basis of any individual performance criterion. However, the multi-criteria and combined-data analysis indicates that DECENT and EBSeq are the best options for DE analysis. The results also reveal similarities among the tested methods in terms of the common DE genes they detect. Our evaluation provides guidelines for selecting the tool that performs best under particular experimental settings in the context of scRNA-seq.

Differential dropout analysis captures biological variation in single-cell RNA sequencing data

10.1101/2021.02.01.429187 ◽

2021 ◽

Author(s):

Gerard A. Bouland ◽

Ahmed Mahfouz ◽

Marcel J.T. Reinders

Keyword(s):

Single Cell ◽

Differential Expression ◽

Rna Sequencing ◽

Relative Abundance ◽

Differential Expression Analysis ◽

Biological Variation ◽

Sequencing Data ◽

Single Cell Rna Sequencing ◽

Technical Artifacts ◽

Zero Counts

Abstract: Single-cell RNA sequencing data are characterized by a large number of zero counts, yet there is growing evidence that these zeros reflect biological variation rather than technical artifacts. We propose differential dropout analysis (DDA) as an alternative to differential expression analysis (DEA) to identify the effects of biological variation in single-cell RNA sequencing data. Using 16 publicly available datasets, we show that dropout patterns are biological in nature and can assess the relative abundance of transcripts more robustly than counts.
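The idea of testing dropout patterns instead of counts can be illustrated with a minimal sketch. It binarizes one gene's counts into detected versus zero and compares the detection rates of two cell groups. The abstract does not state which test the authors use, so Fisher's exact test is swapped in here purely as an assumption, and the function names (`binarize`, `fisher_exact_two_sided`, `differential_dropout`) are hypothetical.

```python
from math import comb


def binarize(counts):
    """Map each count to 1 (transcript detected) or 0 (zero count)."""
    return [1 if c > 0 else 0 for c in counts]


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def differential_dropout(counts_a, counts_b):
    """Compare detection rates of one gene between two groups of cells.

    Assumption: Fisher's exact test stands in for whatever test the
    original method uses; this is an illustration, not the authors' code.
    """
    det_a, det_b = binarize(counts_a), binarize(counts_b)
    a, c = sum(det_a), sum(det_b)
    b, d = len(det_a) - a, len(det_b) - c
    return {
        "detected_frac_a": a / len(det_a),
        "detected_frac_b": c / len(det_b),
        "p_value": fisher_exact_two_sided(a, b, c, d),
    }
```

Calling `differential_dropout(counts_a, counts_b)` on the raw counts of a single gene returns the two detection fractions and a p-value. In a real analysis this would be repeated per gene and followed by multiple-testing correction.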

CONICS integrates scRNA-seq with DNA sequencing to map gene expression to tumor sub-clones

Bioinformatics ◽

10.1093/bioinformatics/bty316 ◽

2018 ◽

Vol 34 (18) ◽

pp. 3217-3219 ◽

Cited By ~ 21

Author(s):

Sören Müller ◽

Ara Cho ◽

Siyuan J Liu ◽

Daniel A Lim ◽

Aaron Diaz

Keyword(s):

Gene Expression ◽

Single Cell ◽

Rna Sequencing ◽

Expression Analysis ◽

Copy Number ◽

Differential Expression Analysis ◽

Software Tool ◽

Supplementary Information ◽

Single Cell Rna Sequencing ◽

Robust Separation

Abstract. Motivation: Single-cell RNA-sequencing (scRNA-seq) has enabled studies of tissue composition at unprecedented resolution. However, the application of scRNA-seq to clinical cancer samples has been limited, partly due to a lack of scRNA-seq algorithms that integrate genomic mutation data. Results: To address this, we present CONICS: COpy-Number analysis In single-Cell RNA-Sequencing. CONICS is a software tool for mapping gene expression from scRNA-seq to tumor clones and phylogenies, with routines enabling the quantitation of copy-number alterations in scRNA-seq, robust separation of neoplastic cells from tumor-infiltrating stroma, inter-clone differential-expression analysis and intra-clone co-expression analysis. Availability and implementation: CONICS is written in Python and R, and is available from https://github.com/diazlab/CONICS. Supplementary information: Supplementary data are available at Bioinformatics online.

Analysis of Single-Cell RNA-Sequencing Data: A Step-by-Step Guide

BioMedInformatics ◽

10.3390/biomedinformatics2010003 ◽

2021 ◽

Vol 2 (1) ◽

pp. 43-61

Author(s):

Aanchal Malhotra ◽

Samarendra Das ◽

Shesh N. Rai

Keyword(s):

Single Cell ◽

Differential Expression ◽

Rna Sequencing ◽

Count Data ◽

Negative Binomial ◽

Expression Profiles ◽

Differential Expression Analysis ◽

Sequencing Data ◽

Zero Inflation ◽

Single Cell Rna Sequencing

Single-cell RNA-sequencing (scRNA-seq) technology provides an excellent platform for measuring the expression profiles of genes in heterogeneous cell populations. Multiple tools for the analysis of scRNA-seq data have been developed over the years. These tools require complicated commands and steps that are not easy for genome researchers and experimental biologists to follow. Therefore, we describe a step-by-step workflow for processing and analyzing scRNA-seq unique molecular identifier (UMI) data from human lung adenocarcinoma cell lines. Using a suitable real-data example, we demonstrate the basic analyses needed to obtain UMI count data, including quality checks, mapping and quantification of transcript abundance. Further, we performed basic statistical analyses, such as zero-inflation, differential expression and clustering analyses, on the obtained count data. We studied the effects of the zero-inflation present in scRNA-seq data on the downstream analyses. Our findings indicate that the zero-inflation associated with UMI data had little or no effect on clustering, while it had a significant effect on identifying differentially expressed genes. We also provide insight into the comparative performance of differential expression analysis tools based on zero-inflated negative binomial and negative binomial models on scRNA-seq data. The sensitivity analysis supported our findings, in that the negative binomial model-based tool did not provide an accurate and efficient way to analyze the scRNA-seq data. This study provides a set of guidelines to help users handle and analyze real scRNA-seq data more easily.
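One simple way to picture a zero-inflation check is to compare the observed fraction of zeros for a gene with the fraction a plain negative binomial model would predict. The sketch below is an illustration only, not the workflow described in the abstract: it assumes a method-of-moments fit (mean `mu`, size `theta`, variance `mu + mu**2 / theta`), and the function names are hypothetical. Because the moment fit absorbs part of the excess zeros into the dispersion, it gives only a crude, conservative indication.

```python
from math import exp
from statistics import mean, variance


def nb_expected_zero_fraction(mu, theta):
    """P(count = 0) under a negative binomial with mean mu and size theta."""
    return (theta / (theta + mu)) ** theta


def zero_inflation_summary(counts):
    """Compare observed and model-expected zero fractions for one gene.

    Assumption: parameters come from a method-of-moments fit; when the
    data are not overdispersed, a Poisson model is used as a fallback.
    """
    mu = mean(counts)
    var = variance(counts)  # sample variance
    observed = sum(1 for c in counts if c == 0) / len(counts)
    if mu > 0 and var > mu:
        theta = mu ** 2 / (var - mu)  # from var = mu + mu^2 / theta
        expected = nb_expected_zero_fraction(mu, theta)
        model = "negative binomial"
    else:
        expected = exp(-mu)  # Poisson zero probability
        model = "poisson"
    return {
        "model": model,
        "observed_zero_frac": observed,
        "expected_zero_frac": expected,
        "excess_zeros": observed - expected,
    }
```

A clearly positive `excess_zeros` suggests more zeros than the fitted count model accounts for, which is the situation in which a zero-inflated model might be considered.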

McImpute: Matrix completion based imputation for single cell RNA-seq data

10.1101/361980 ◽

2018 ◽

Cited By ~ 3

Author(s):

Aanchal Mongia ◽

Debarka Sengupta ◽

Angshul Majumdar

Keyword(s):

Single Cell ◽

Rna Sequencing ◽

Expression Analysis ◽

Matrix Completion ◽

Differential Expression Analysis ◽

Low Rank ◽

Specific Cell ◽

Sequencing Data ◽

Reduction Techniques ◽

Single Cell Rna Sequencing

Abstract. Motivation: Single-cell RNA sequencing has proved revolutionary for its potential to zoom into complex biological systems. Genome-wide expression analysis at single-cell resolution provides a window into the dynamics of cellular phenotypes. This facilitates characterization of transcriptional heterogeneity in normal and diseased tissues under various conditions. It also sheds light on the development or emergence of specific cell populations and phenotypes. However, owing to the paucity of input RNA, a typical single-cell RNA sequencing dataset features a high number of dropout events, where transcripts fail to get amplified. Results: We introduce mcImpute, a low-rank matrix completion based technique to impute dropouts in single-cell expression data. On a number of real datasets, application of mcImpute yields significant improvements in separation of true zeros from dropouts, cell clustering, differential expression analysis, cell type separability, performance of dimensionality reduction techniques for cell visualization, and gene distribution. Availability and implementation: https://github.com/aanchalMongia/McImpute_scRNAseq
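To give a feel for what low-rank matrix completion does, here is a toy sketch that is not mcImpute's algorithm. It fits a rank-1 model by alternating least squares on the non-zero entries only and fills the zeros with the fitted values. Treating every zero as missing and fixing the rank at 1 are simplifying assumptions made for illustration, and `rank1_complete` is a hypothetical name.

```python
def rank1_complete(matrix, n_iter=50):
    """Fill zero entries of a genes-by-cells matrix with a rank-1 fit.

    Models matrix[i][j] ~ u[i] * v[j] and fits u and v by alternating
    least squares over the observed (non-zero) entries. Assumption: all
    zeros are treated as missing, which real data would not justify.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    observed = [[matrix[i][j] > 0 for j in range(n_cols)] for i in range(n_rows)]
    u = [1.0] * n_rows
    v = [1.0] * n_cols
    for _ in range(n_iter):
        # Update each row factor with the column factors held fixed.
        for i in range(n_rows):
            num = sum(matrix[i][j] * v[j] for j in range(n_cols) if observed[i][j])
            den = sum(v[j] ** 2 for j in range(n_cols) if observed[i][j])
            if den > 0:
                u[i] = num / den
        # Update each column factor with the row factors held fixed.
        for j in range(n_cols):
            num = sum(matrix[i][j] * u[i] for i in range(n_rows) if observed[i][j])
            den = sum(u[i] ** 2 for i in range(n_rows) if observed[i][j])
            if den > 0:
                v[j] = num / den
    # Keep observed values as they are; impute only the zero entries.
    return [
        [matrix[i][j] if observed[i][j] else u[i] * v[j] for j in range(n_cols)]
        for i in range(n_rows)
    ]
```

Observed values are returned unchanged, so only the zero entries are altered. A practical method would also need to choose the rank (or a nuclear-norm penalty) and decide which zeros are genuine.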
