scGate: marker-based purification of cell types from heterogeneous single-cell RNA-seq datasets
A common bioinformatics task in single-cell data analysis is to purify a cell type or cell population of interest from heterogeneous datasets. Here we present scGate, an algorithm that automates marker-based purification of specific cell populations without requiring training data or reference gene expression profiles. scGate purifies a cell population of interest using a set of markers organized in a hierarchical structure, akin to the gating strategies employed in flow cytometry. In our benchmark on blood-derived and tumor-infiltrating immune cells, scGate outperforms SingleR, a state-of-the-art classifier for single-cell data. scGate is implemented as an R package and integrated with the Seurat framework, providing an intuitive tool to isolate cell populations of interest from complex scRNA-seq datasets. Availability: R package source code and reproducible tutorials are available at https://github.com/carmonalab/scGate
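
To make the idea of hierarchical, flow-cytometry-like gating concrete, the following is a minimal toy sketch in Python. It is not scGate's implementation or API: the expression values, the function names (`signature_score`, `passes_level`, `gate`), the mean-expression score, and the threshold of 1.0 are all assumptions chosen for illustration. Only the general principle comes from the text above, namely that cells pass through successive marker-defined gates and only cells passing every level are retained.

```python
from statistics import mean

# Hypothetical toy expression values (arbitrary units); not real data.
cells = {
    "cell_1": {"PTPRC": 5.0, "CD3E": 4.0, "CD19": 0.0},  # T-cell-like
    "cell_2": {"PTPRC": 4.0, "CD3E": 0.0, "CD19": 6.0},  # B-cell-like
    "cell_3": {"PTPRC": 0.0, "CD3E": 0.0, "CD19": 0.0},  # non-immune-like
}

# Illustrative gating model: levels are applied in order, like successive gates.
# Each level lists markers that should be expressed (positive) or absent (negative).
gating_model = [
    {"name": "immune", "positive": ["PTPRC"], "negative": []},
    {"name": "T cell", "positive": ["CD3E"], "negative": ["CD19"]},
]


def signature_score(expr, genes):
    """Mean expression over a gene list (assumed stand-in for a real scoring method)."""
    if not genes:
        return 0.0
    return mean(expr.get(gene, 0.0) for gene in genes)


def passes_level(expr, level, threshold=1.0):
    """A cell passes if positive markers score high and negative markers score low."""
    pos_ok = (not level["positive"]
              or signature_score(expr, level["positive"]) >= threshold)
    neg_ok = signature_score(expr, level["negative"]) < threshold
    return pos_ok and neg_ok


def gate(cells, model, threshold=1.0):
    """Apply each level in turn; return sorted IDs of cells that pass all levels."""
    kept = dict(cells)
    for level in model:
        kept = {cell_id: expr for cell_id, expr in kept.items()
                if passes_level(expr, level, threshold)}
    return sorted(kept)


pure_cells = gate(cells, gating_model)
print(pure_cells)  # ['cell_1']
```

In this toy example, the first level keeps the two cells expressing the pan-immune marker PTPRC, and the second level keeps only the cell that expresses CD3E and lacks CD19. A real analysis would operate on a full expression matrix and use the scoring and thresholds provided by the package itself.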