UMI-count modeling and differential expression analysis for single-cell RNA sequencing

2018 ◽  
Vol 19 (1) ◽  
Author(s):  
Wenan Chen ◽  
Yan Li ◽  
John Easton ◽  
David Finkelstein ◽  
Gang Wu ◽  
...  
Genes ◽  
2021 ◽  
Vol 12 (12) ◽  
pp. 1947
Author(s):  
Samarendra Das ◽  
Anil Rai ◽  
Michael L. Merchant ◽  
Matthew C. Cave ◽  
Shesh N. Rai

Single-cell RNA-sequencing (scRNA-seq) is a recent high-throughput sequencing technique for studying gene expression at the cell level. Differential Expression (DE) analysis is a major downstream analysis of scRNA-seq data. DE analysis in the presence of noise from different sources remains a key challenge in scRNA-seq. Earlier practices for addressing this involved borrowing methods from bulk RNA-seq, which are based on non-zero differences in the average expression of genes across cell populations. Later, several methods specifically designed for scRNA-seq were developed. To provide guidance on choosing an appropriate tool or developing a new one, it is necessary to comprehensively study the performance of DE analysis methods. Here, we provide a review and classification of different DE approaches adapted from bulk RNA-seq practice as well as those specifically designed for scRNA-seq. We also evaluate the performance of 19 widely used methods in terms of 13 performance metrics on 11 real scRNA-seq datasets. Our findings suggest that some bulk RNA-seq methods are quite competitive with the single-cell methods and that their performance depends on the underlying models, DE test statistic(s), and data characteristics. Further, it is difficult to identify a single globally best-performing method from any individual performance criterion. However, the multi-criteria and combined-data analysis indicates that DECENT and EBSeq are the best options for DE analysis. The results also reveal the similarities among the tested methods in terms of detecting common DE genes. Our evaluation provides guidelines for selecting the tool that performs best under particular experimental settings in the context of scRNA-seq.


2021 ◽  
Author(s):  
Gerard A. Bouland ◽  
Ahmed Mahfouz ◽  
Marcel J.T. Reinders

Abstract: Single-cell RNA sequencing data is characterized by a large number of zero counts, yet there is growing evidence that these zeros reflect biological variation rather than technical artifacts. We propose differential dropout analysis (DDA), as an alternative to differential expression analysis (DEA), to identify the effects of biological variation in single-cell RNA sequencing data. Using 16 publicly available datasets, we show that dropout patterns are biological in nature and can assess the relative abundance of transcripts more robustly than counts.
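The idea of comparing dropout patterns rather than counts can be illustrated with a minimal sketch. It assumes that "dropout" means a zero count and that two cell groups are compared gene by gene; the two-proportion z-test and the function names `binarize` and `differential_dropout` are illustrative choices, not necessarily the test or API used by the authors.

```python
import math


def binarize(counts):
    """Map raw counts to 1 (detected) or 0 (zero count, i.e. dropout)."""
    return [1 if c > 0 else 0 for c in counts]


def differential_dropout(counts_a, counts_b):
    """Compare the detection rate of one gene between two cell groups.

    Uses a pooled two-proportion z-test (an assumption of this sketch).
    Returns (rate_a, rate_b, z, two_sided_p).
    """
    det_a, det_b = binarize(counts_a), binarize(counts_b)
    n_a, n_b = len(det_a), len(det_b)
    rate_a, rate_b = sum(det_a) / n_a, sum(det_b) / n_b
    pooled = (sum(det_a) + sum(det_b)) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Gene detected in all cells or in none: no evidence of a difference.
        return rate_a, rate_b, 0.0, 1.0
    z = (rate_a - rate_b) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return rate_a, rate_b, z, p


# Toy data (hypothetical): one gene, ten cells per group.
group_a = [0, 0, 0, 0, 0, 0, 0, 1, 0, 2]
group_b = [3, 1, 0, 2, 5, 1, 1, 0, 4, 2]
rate_a, rate_b, z, p = differential_dropout(group_a, group_b)
```

Only the zero/non-zero pattern enters the test, so the magnitude of the counts in `group_b` has no influence on the result.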


2018 ◽  
Vol 34 (18) ◽  
pp. 3217-3219 ◽  
Author(s):  
Sören Müller ◽  
Ara Cho ◽  
Siyuan J Liu ◽  
Daniel A Lim ◽  
Aaron Diaz

Abstract. Motivation: Single-cell RNA-sequencing (scRNA-seq) has enabled studies of tissue composition at unprecedented resolution. However, the application of scRNA-seq to clinical cancer samples has been limited, partly due to a lack of scRNA-seq algorithms that integrate genomic mutation data. Results: To address this, we present CONICS (COpy-Number analysis In single-Cell RNA-Sequencing). CONICS is a software tool for mapping gene expression from scRNA-seq to tumor clones and phylogenies, with routines enabling: the quantitation of copy-number alterations in scRNA-seq, robust separation of neoplastic cells from tumor-infiltrating stroma, inter-clone differential-expression analysis and intra-clone co-expression analysis. Availability and implementation: CONICS is written in Python and R, and is available from https://github.com/diazlab/CONICS. Supplementary information: Supplementary data are available at Bioinformatics online.


2021 ◽  
Vol 2 (1) ◽  
pp. 43-61
Author(s):  
Aanchal Malhotra ◽  
Samarendra Das ◽  
Shesh N. Rai

Single-cell RNA-sequencing (scRNA-seq) technology provides an excellent platform for measuring the expression profiles of genes in heterogeneous cell populations. Multiple tools for the analysis of scRNA-seq data have been developed over the years. The tools require complicated commands and steps to analyze the underlying data, which are not easy for genome researchers and experimental biologists to follow. Therefore, we describe a step-by-step workflow for processing and analyzing the scRNA-seq unique molecular identifier (UMI) data from Human Lung Adenocarcinoma cell lines. We demonstrate the basic analyses, including quality check, mapping and quantification of transcript abundance, through a suitable real-data example to obtain UMI count data. Further, we performed basic statistical analyses, such as zero-inflation, differential expression and clustering analyses, on the obtained count data. We studied the effects of the excess zero-inflation present in scRNA-seq data on the downstream analyses. Our findings indicate that the zero-inflation associated with UMI data had little or no role in clustering, while it had a significant effect on identifying differentially expressed genes. We also provide an insight into the comparative analysis of differential expression analysis tools based on zero-inflated negative binomial and negative binomial models on scRNA-seq data. The sensitivity analysis reinforced our findings: the negative binomial model-based tool did not provide an accurate and efficient way to analyze the scRNA-seq data. This study provides a set of guidelines for users to handle and analyze real scRNA-seq data more easily.
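One simple way to look for zero-inflation in a gene's UMI counts is to compare the observed fraction of zeros with the fraction a plain negative binomial would predict. The sketch below assumes a method-of-moments fit of the negative binomial, which is a crude stand-in for the model fitting done by dedicated tools; the function names are hypothetical and not taken from the workflow described above.

```python
import math
import statistics


def nb_expected_zero_fraction(counts):
    """Expected P(count == 0) under a negative binomial fitted by moments.

    With mean mu and variance var, the size parameter is
    mu^2 / (var - mu) and P(0) = (size / (size + mu)) ** size.
    Falls back to the Poisson value exp(-mu) when var <= mu.
    """
    mu = statistics.fmean(counts)
    var = statistics.pvariance(counts)
    if mu == 0:
        return 1.0
    if var <= mu:
        return math.exp(-mu)  # no overdispersion: Poisson limit
    size = mu * mu / (var - mu)
    return (size / (size + mu)) ** size


def zero_excess(counts):
    """Observed zero fraction minus the negative binomial expectation."""
    observed = sum(1 for c in counts if c == 0) / len(counts)
    return observed - nb_expected_zero_fraction(counts)


# Toy data (hypothetical): many zeros alongside a few high counts.
toy_counts = [0, 0, 0, 0, 0, 0, 10, 10, 10, 10]
excess = zero_excess(toy_counts)
```

A clearly positive `zero_excess` suggests more zeros than the fitted negative binomial accounts for, which is the situation a zero-inflated model is meant to capture.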


2018 ◽  
Author(s):  
Aanchal Mongia ◽  
Debarka Sengupta ◽  
Angshul Majumdar

Abstract. Motivation: Single cell RNA sequencing has proved revolutionary for its potential to zoom into complex biological systems. Genome-wide expression analysis at single cell resolution provides a window into the dynamics of cellular phenotypes. This facilitates characterization of transcriptional heterogeneity in normal and diseased tissues under various conditions. It also sheds light on the development or emergence of specific cell populations and phenotypes. However, owing to the paucity of input RNA, typical single cell RNA sequencing data feature a high number of dropout events, in which transcripts fail to get amplified. Results: We introduce mcImpute, a low-rank matrix completion based technique to impute dropouts in single cell expression data. On a number of real datasets, application of mcImpute yields significant improvements in separation of true zeros from dropouts, cell-clustering, differential expression analysis, cell type separability, performance of dimensionality reduction techniques for cell visualization and gene distribution. Availability and implementation: https://github.com/aanchalMongia/McImpute_scRNAseq
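The general idea of imputing dropouts by low-rank matrix completion can be sketched with a deliberately simplified stand-in: a rank-1 fit by alternating least squares, estimated from the non-zero entries only. This is not the mcImpute algorithm itself; treating every zero as a missing value, fixing the rank at 1, and the function name `rank1_complete` are all assumptions made for illustration.

```python
def rank1_complete(matrix, n_iter=50):
    """Fill zero entries of a non-negative matrix with a rank-1 estimate.

    Fits matrix[i][j] ~ u[i] * v[j] on the observed (non-zero) entries by
    alternating least squares, then replaces each zero with u[i] * v[j].
    Observed entries are returned unchanged.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    u = [1.0] * n_rows
    v = [1.0] * n_cols
    for _ in range(n_iter):
        # Update row factors with column factors held fixed.
        for i in range(n_rows):
            obs = [j for j in range(n_cols) if matrix[i][j] > 0]
            den = sum(v[j] ** 2 for j in obs)
            if den > 0:
                u[i] = sum(matrix[i][j] * v[j] for j in obs) / den
        # Update column factors with row factors held fixed.
        for j in range(n_cols):
            obs = [i for i in range(n_rows) if matrix[i][j] > 0]
            den = sum(u[i] ** 2 for i in obs)
            if den > 0:
                v[j] = sum(matrix[i][j] * u[i] for i in obs) / den
    return [
        [matrix[i][j] if matrix[i][j] > 0 else u[i] * v[j]
         for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Toy data (hypothetical): an exactly rank-1 matrix with one entry zeroed out.
observed = [
    [2, 4, 6],
    [4, 0, 12],  # the true value at the masked position is 8
    [6, 12, 18],
]
completed = rank1_complete(observed)
```

On this toy matrix the masked entry is recovered as approximately 8, because the remaining entries determine the rank-1 structure. Real expression matrices are noisy and of higher rank, so a practical method needs a more flexible low-rank model and a way to distinguish true zeros from dropouts.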


2018 ◽  
Vol 34 (19) ◽  
pp. 3340-3348 ◽  
Author(s):  
Zhijin Wu ◽  
Yi Zhang ◽  
Michael L Stitzel ◽  
Hao Wu
