A Multi-center Cross-platform Single-cell RNA Sequencing Reference Dataset
Abstract
Single-cell RNA sequencing (scRNA-seq) is developing rapidly, and investigators seeking to use this technology face a variety of options for both experimental platforms and bioinformatics methods. There is an urgent need for scRNA-seq reference datasets for benchmarking different scRNA-seq platforms and bioinformatics methods. To be broadly applicable, these datasets should be generated from renewable, well-characterized reference samples and processed in multiple centers across different platforms. Here we present a benchmarking scRNA-seq dataset that includes 20 scRNA-seq datasets acquired either as mixtures or as individual samples from two biologically distinct cell lines for which a large amount of multi-platform whole-genome sequencing data are also available. These scRNA-seq datasets were generated on multiple popular platforms across four sequencing centers. Our benchmark datasets provide a resource that we believe will have great value for the single-cell community by serving as a reference for evaluating various bioinformatics methods for scRNA-seq analyses, including but not limited to data preprocessing, imputation, normalization, clustering, batch correction, and differential analysis.