SCHNAPPs - Single Cell sHiNy APPlication(s)

Abstract

Motivation: Single-cell RNA-sequencing (scRNAseq) experiments are becoming a standard tool for bench scientists to explore the cellular diversity present in all tissues. On the one hand, the data produced by scRNAseq are technically complex, with analytical workflows that are still very much an active field of bioinformatics research; on the other hand, a wealth of biological background knowledge is often needed to guide the investigation. There is therefore an increasing need for applications geared towards bench scientists that abstract away the technical challenges of the analysis, so that they can focus on the science at play. Such applications are also expected to support closer collaboration between bioinformaticians and bench scientists by providing tools for reproducible science.

Results: We present SCHNAPPs, a computer program designed to enable bench scientists to autonomously explore and interpret single-cell RNA-seq expression data and associated annotations. The Shiny-based application allows users to select genes and cells of interest; perform quality control, normalization, clustering, and differential expression analyses; apply standard workflows from the Seurat (Stuart et al., 2019) or Scran (Lun et al., 2016) packages; and produce most of the common visualizations. An R Markdown report can be generated that tracks the modifications and selected visualizations, facilitating communication and reproducibility between bench scientist and bioinformatician. The modular design of the tool allows bioinformaticians to easily integrate new visualizations and analyses. We still recommend that a data analysis specialist oversee the analysis and interpretation.

Availability: The SCHNAPPs application, Docker file, and documentation are available on GitHub: https://c3bi-pasteur-fr.github.io/UTechSCB-SCHNAPPs. Example contributions are available at: https://github.com/baj12/SCHNAPPsContributions.


Genotype-free demultiplexing of pooled single-cell RNA-seq

Genome Biology ◽

10.1186/s13059-019-1852-7 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 6

Author(s):

Jun Xu ◽

Caitlin Falconer ◽

Quan Nguyen ◽

Joanna Crawford ◽

Brett D. McKinnon ◽

...

Keyword(s):

Single Cell ◽

Rna Sequencing ◽

Genetic Differences ◽

Rna Seq ◽

Link Type ◽

Genotype Information ◽

Single Cell Rna Sequencing ◽

Pooled Samples

Abstract: A variety of methods have been developed to demultiplex pooled samples in a single-cell RNA sequencing (scRNA-seq) experiment, but they require either hashtag barcodes or sample genotypes prior to pooling. We introduce scSplit, which uses genetic differences inferred from scRNA-seq data alone to demultiplex pooled samples. scSplit also enables mapping clusters to the original samples. Using simulated, merged, and pooled multi-individual datasets, we show that scSplit predictions are highly concordant with demuxlet predictions and highly consistent with the known truth in a cell-hashing dataset. scSplit is ideally suited to samples without external genotype information and is available at: https://github.com/jon-xu/scSplit


Linnorm: improved statistical analysis for single cell RNA-seq expression data

Nucleic Acids Research ◽

10.1093/nar/gkx828 ◽

2017 ◽

Vol 45 (22) ◽

pp. e179-e179 ◽

Cited By ~ 38

Author(s):

Shun H. Yip ◽

Panwen Wang ◽

Jean-Pierre A. Kocher ◽

Pak Chung Sham ◽

Junwen Wang

Keyword(s):

Statistical Analysis ◽

Single Cell ◽

Expression Data ◽

Rna Seq


Flexible comparison of batch correction methods for single-cell RNA-seq using BatchBench

10.1101/2020.05.22.111211 ◽

2020 ◽

Author(s):

Ruben Chazarra-Gil ◽

Stijn van Dongen ◽

Vladimir Yu Kiselev ◽

Martin Hemberg

Keyword(s):

Single Cell ◽

Computational Methods ◽

Rna Seq ◽

Batch Effects ◽

Systematic Comparison ◽

Batch Correction ◽

Link Type ◽

Biological Signals ◽

The Cost

Abstract: As the cost of single-cell RNA-seq experiments has decreased, an increasing number of datasets have become available. Combining newly generated and publicly accessible datasets is challenging due to non-biological signals, commonly known as batch effects. Although several computational methods can remove batch effects, evaluating which method performs best is not straightforward. Here we present BatchBench (https://github.com/cellgeni/batchbench), a modular and flexible pipeline for comparing batch correction methods for single-cell RNA-seq data. We apply BatchBench to eight methods, highlighting their methodological differences and assessing their performance and computational requirements on a compendium of well-studied datasets. This systematic comparison guides users in the choice of batch correction tool, and the pipeline makes it easy to evaluate other datasets.


scMerge leverages factor analysis, stable expression, and pseudoreplication to merge multiple single-cell RNA-seq datasets

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1820006116 ◽

2019 ◽

Vol 116 (20) ◽

pp. 9775-9784 ◽

Cited By ~ 38

Author(s):

Yingxin Lin ◽

Shila Ghazanfar ◽

Kevin Y. X. Wang ◽

Johann A. Gagnon-Bartsch ◽

Kitty K. Lo ◽

...

Keyword(s):

Factor Analysis ◽

Data Integration ◽

Single Cell ◽

Rna Seq ◽

Cell Type ◽

Large Collection ◽

Single Cell Rna Sequencing ◽

Development Trajectory ◽

Biological Discovery ◽

Public Datasets

Concerted examination of multiple collections of single-cell RNA sequencing (RNA-seq) data promises further biological insights that cannot be uncovered with individual datasets. Here we present scMerge, an algorithm that integrates multiple single-cell RNA-seq datasets using factor analysis of stably expressed genes and pseudoreplicates across datasets. Using a large collection of public datasets, we benchmark scMerge against published methods and demonstrate that it consistently provides improved cell type separation by removing unwanted factors; scMerge can also enhance biological discovery through robust data integration, which we show through the inference of development trajectory in a liver dataset collection.


SSCC: a novel computational framework for rapid and accurate clustering large single cell RNA-seq data

10.1101/344242 ◽

2018 ◽

Cited By ~ 2

Author(s):

Xianwen Ren ◽

Liangtao Zheng ◽

Zemin Zhang

Keyword(s):

Single Cell ◽

Rna Sequencing ◽

Large Scale ◽

Random Projection ◽

Rna Seq ◽

Sequencing Data ◽

Computational Framework ◽

Human Blood Cells ◽

Single Cell Rna Sequencing ◽

Data Volume

Abstract: Clustering is a prevalent means of analyzing single-cell RNA sequencing data, but the rapidly expanding data volume can make this process computationally challenging. New methods for both accurate and efficient clustering are urgently needed. Here we propose a new clustering framework based on random projection and feature construction for large-scale single-cell RNA sequencing data, which greatly improves clustering accuracy, robustness, and computational efficiency for various state-of-the-art algorithms benchmarked on multiple real datasets. On a dataset with 68,578 human blood cells, our method achieved a 20% improvement in clustering accuracy and a 50-fold acceleration while consuming only 66% of the memory used by the widely used software package SC3. Compared to k-means, the accuracy improvement can reach 3-fold, depending on the dataset. An R implementation of the framework is available from https://github.com/Japrin/sscClust.
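The random projection step named in the abstract can be illustrated with a minimal sketch. This is not the sscClust implementation (which is an R package) and it omits the feature construction step entirely; the function name `random_projection`, the Gaussian projection matrix, and the toy data are assumptions chosen for illustration only.

```python
import math
import random


def random_projection(cells, target_dim, seed=0):
    """Project each cell (a list of gene values) onto target_dim random axes.

    Hypothetical helper: a plain Gaussian random projection, not the
    method used by sscClust.
    """
    rng = random.Random(seed)
    n_genes = len(cells[0])
    # Scale entries so squared distances are preserved in expectation.
    scale = 1.0 / math.sqrt(target_dim)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(n_genes)]
              for _ in range(target_dim)]
    # Each projected coordinate is a dot product of the cell with one random axis.
    return [[sum(w * x for w, x in zip(row, cell)) for row in matrix]
            for cell in cells]


# Toy expression matrix: 3 cells x 6 genes (made-up values).
toy_cells = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 3.0, 0.0, 2.0, 0.0],
    [2.0, 0.0, 6.0, 0.0, 4.0, 0.0],
]
projected = random_projection(toy_cells, target_dim=2)
```

Because the projection is linear, downstream clustering can then run on far fewer dimensions than the original gene count, which is where the memory and speed savings of such an approach would come from.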


Single Cell Viewer (SCV): An interactive visualization data portal for single cell RNA sequence data

10.1101/664789 ◽

2019 ◽

Cited By ~ 2

Author(s):

Shuoguo Wang ◽

Constance Brett ◽

Mohan Bolisetty ◽

Ryan Golhar ◽

Isaac Neuhaus ◽

...

Keyword(s):

Single Cell ◽

Sequence Data ◽

Single Cells ◽

Link Type ◽

Technological Advances ◽

R Shiny ◽

Data Volume ◽

Exploratory Data ◽

Cell Data ◽

Shiny Application

Abstract

Motivation: Thanks to technological advances made in the last few years, we are now able to study transcriptomes from thousands of single cells. These technologies have been applied widely to study various aspects of biology. Nevertheless, comprehending these large datasets and inferring meaningful biological insights from them is still a challenge. Although tools are being developed to deal with the data complexity and data volume, we do not yet have effective visualization and comparative analysis tools to realize the full value of these datasets.

Results: To address this gap, we implemented a single-cell data visualization portal called Single Cell Viewer (SCV). SCV is an R Shiny application that offers users rich visualization and exploratory data analysis options for single-cell datasets.

Availability: Source code for the application is available online at GitHub (http://www.github.com/neuhausi/single-cell-viewer) and there is a hosted exploration application using the same example dataset as this publication at http://periscopeapps.org/[email protected]; [email protected]


Ultra-high throughput single-cell RNA sequencing by combinatorial fluidic indexing

10.1101/2019.12.17.879304 ◽

2019 ◽

Cited By ~ 4

Author(s):

Paul Datlinger ◽

André F Rendeiro ◽

Thorina Boenke ◽

Thomas Krausgruber ◽

Daniele Barreca ◽

...

Keyword(s):

Single Cell ◽

Rna Sequencing ◽

Population Genomics ◽

Cost Effective ◽

Mouse Cell ◽

Droplet Microfluidics ◽

Rna Seq ◽

Single Cell Rna Sequencing ◽

Massive Scale ◽

Tcr Activation

Abstract: Cell atlas projects and single-cell CRISPR screens hit the limits of current technology, as they require cost-effective profiling for millions of individual cells. To satisfy these enormous throughput requirements, we developed “single-cell combinatorial fluidic indexing” (scifi) and applied it to single-cell RNA sequencing. The resulting scifi-RNA-seq assay combines one-step combinatorial pre-indexing of single-cell transcriptomes with subsequent single-cell RNA-seq using widely available droplet microfluidics. Pre-indexing allows us to load multiple cells per droplet, which increases the throughput of droplet-based single-cell RNA-seq up to 15-fold, and it provides a straightforward way of multiplexing hundreds of samples in a single scifi-RNA-seq experiment. Compared to multi-round combinatorial indexing, scifi-RNA-seq provides an easier, faster, and more efficient workflow, thereby enabling massive-scale scRNA-seq experiments for a broad range of applications ranging from population genomics to drug screens with scRNA-seq readout. We benchmarked scifi-RNA-seq on various human and mouse cell lines, and we demonstrated its feasibility for human primary material by profiling TCR activation in T cells.
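Why pre-indexing makes it possible to load several cells per droplet can be seen in a small simulation. This is a simplified sketch, not the scifi-RNA-seq pipeline: it assumes each cell receives a uniformly random pre-index and a uniformly random droplet, and the cell, droplet, and pre-index counts below are made-up illustrative values.

```python
import random
from collections import Counter


def simulate_loading(n_cells, n_preindexes, n_droplets, seed=0):
    """Assign each cell a (pre-index, droplet barcode) pair at random."""
    rng = random.Random(seed)
    return [(rng.randrange(n_preindexes), rng.randrange(n_droplets))
            for _ in range(n_cells)]


def resolved_fraction(labels):
    """Fraction of cells whose (pre-index, droplet) pair is unique.

    Cells sharing both barcodes cannot be told apart after sequencing.
    """
    pair_counts = Counter(labels)
    unique = sum(1 for pair in labels if pair_counts[pair] == 1)
    return unique / len(labels)


# Overloaded droplets: 1000 cells into 200 droplets (about 5 cells per droplet).
without_preindex = resolved_fraction(simulate_loading(1000, 1, 200))
with_preindex = resolved_fraction(simulate_loading(1000, 96, 200))
print(f"resolved without pre-indexing: {without_preindex:.3f}")
print(f"resolved with 96 pre-indexes:  {with_preindex:.3f}")
```

With a single pre-index, almost every cell shares its droplet barcode with another cell; with many pre-indexes, the combined barcode space is large enough that most cells remain distinguishable even when droplets are overloaded.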


A Comprehensive Survey of Statistical Approaches for Differential Expression Analysis in Single-Cell RNA Sequencing Studies

Genes ◽

10.3390/genes12121947 ◽

2021 ◽

Vol 12 (12) ◽

pp. 1947

Author(s):

Samarendra Das ◽

Anil Rai ◽

Michael L. Merchant ◽

Matthew C. Cave ◽

Shesh N. Rai

Keyword(s):

Single Cell ◽

Differential Expression ◽

Rna Sequencing ◽

High Throughput Sequencing ◽

Performance Metrics ◽

Differential Expression Analysis ◽

Individual Performance ◽

Rna Seq ◽

Gene Expressions ◽

Single Cell Rna Sequencing

Single-cell RNA-sequencing (scRNA-seq) is a recent high-throughput sequencing technique for studying gene expression at the cell level. Differential Expression (DE) analysis is a major downstream analysis of scRNA-seq data. DE analysis in the presence of noise from different sources remains a key challenge in scRNA-seq. Earlier practices for addressing this involved borrowing methods from bulk RNA-seq, which are based on non-zero differences in average expression of genes across cell populations. Later, several methods specifically designed for scRNA-seq were developed. To provide guidance on choosing an appropriate tool or developing a new one, it is necessary to comprehensively study the performance of DE analysis methods. Here, we provide a review and classification of different DE approaches adapted from bulk RNA-seq practice as well as those specifically designed for scRNA-seq. We also evaluate the performance of 19 widely used methods in terms of 13 performance metrics on 11 real scRNA-seq datasets. Our findings suggest that some bulk RNA-seq methods are quite competitive with the single-cell methods and that their performance depends on the underlying models, DE test statistic(s), and data characteristics. Further, it is difficult to identify a single globally best-performing method using any individual performance criterion. However, the multi-criteria and combined-data analysis indicates that DECENT and EBSeq are the best options for DE analysis. The results also reveal the similarities among the tested methods in terms of detecting common DE genes. Our evaluation provides guidelines for selecting the tool that performs best under particular experimental settings in the context of scRNA-seq.


Subpopulation identification for single-cell RNA-sequencing data using functional data analysis

10.1101/760413 ◽

2019 ◽

Author(s):

Kyungmin Ahn ◽

Hironobu Fujiwara

Keyword(s):

Gene Expression ◽

Data Analysis ◽

Single Cell ◽

Gene Expression Data ◽

Functional Data Analysis ◽

Functional Data ◽

Clustering Algorithms ◽

Expression Data ◽

Clustering Methods ◽

Single Cell Rna Sequencing

Abstract

Background: In single-cell RNA-sequencing (scRNA-seq) data analysis, a number of statistical tools in multivariate data analysis (MDA) have been developed to help analyze the gene expression data. This MDA approach typically focuses on examining genes as discrete genomic units and ignores the dependency between the data components. In this paper, we propose a functional data analysis (FDA) approach to scRNA-seq data whereby we consider each cell as a single function. To avoid a large number of dropouts (zero or near-zero values) and reduce the high dimensionality of the data, we first perform a principal component analysis (PCA) and assign PCs to be the amplitude of the function. Then we use the index of PCs directly from PCA for the phase components. This approach allows us to apply FDA clustering methods to scRNA-seq data analysis.

Results: To demonstrate the robustness of our method, we apply several existing FDA clustering algorithms to the gene expression data to improve the accuracy of cell type classification compared with the conventional clustering methods in MDA. As a result, the FDA clustering algorithms achieve superior accuracy on simulated data as well as on real data such as human and mouse scRNA-seq data.

Conclusions: This new statistical technique enhances the classification performance and ultimately improves the understanding of stochastic biological processes. This new framework provides an essentially different analytical approach to scRNA-seq data, which can complement conventional MDA methods. It can be truly effective when current MDA methods cannot detect or uncover the hidden functional nature of the gene expression dynamics.


Splatter: simulation of single-cell RNA sequencing data

10.1101/133173 ◽

2017 ◽

Cited By ~ 8

Author(s):

Luke Zappia ◽

Belinda Phipson ◽

Alicia Oshlack

Keyword(s):

Single Cell ◽

Rna Sequencing ◽

Real Data ◽

Cell Types ◽

Rna Seq ◽

Sequencing Data ◽

Sequencing Technologies ◽

Simulation Based ◽

Single Cell Rna Sequencing ◽

Multiple Cell

Abstract: As single-cell RNA sequencing technologies have rapidly developed, so have analysis methods. Many methods have been tested, developed and validated using simulated datasets. Unfortunately, current simulations are often poorly documented, their similarity to real data is not demonstrated, or reproducible code is not available. Here we present the Splatter Bioconductor package for simple, reproducible and well-documented simulation of single-cell RNA-seq data. Splatter provides an interface to multiple simulation methods including Splat, our own simulation, based on a gamma-Poisson distribution. Splat can simulate single populations of cells, populations with multiple cell types or differentiation paths.
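The gamma-Poisson idea mentioned in the abstract can be sketched in a few lines. This is a toy illustration, not the Splatter API (Splatter is an R/Bioconductor package) and not the full Splat model: it only draws a mean for each gene from a gamma distribution and then Poisson counts around that mean. The function names and the `shape` and `rate` defaults are assumptions made for this example.

```python
import math
import random


def poisson_sample(rng, lam):
    """Draw one Poisson(lam) variate with Knuth's method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_counts(n_genes, n_cells, shape=0.6, rate=0.3, seed=0):
    """Return an n_genes x n_cells matrix of gamma-Poisson counts.

    Hypothetical helper: parameter values are illustrative, not fitted
    to any real dataset.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_genes):
        # Gene mean from a gamma distribution; gammavariate takes
        # (shape, scale), so scale = 1 / rate.
        gene_mean = rng.gammavariate(shape, 1.0 / rate)
        # Observed counts for this gene across cells are Poisson around that mean.
        counts.append([poisson_sample(rng, gene_mean) for _ in range(n_cells)])
    return counts


counts = simulate_counts(n_genes=5, n_cells=8)
for gene_row in counts:
    print(gene_row)
```

Fixing the seed keeps the simulated matrix reproducible, which mirrors the abstract's emphasis on reproducible simulation; library size, dispersion, multiple cell types, and differentiation paths are all left out of this sketch.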
