ReadZS detects developmentally regulated RNA processing programs in single-cell RNA-seq and defines subpopulations independent of gene expression
Post-transcriptional regulation of RNA processing (RNAP), including splicing and alternative polyadenylation (APA), controls eukaryotic gene function. Conservative estimates from bulk tissue studies indicate that at least 50% of mammalian genes undergo APA. Single-cell RNA sequencing (scRNA-seq) could enable a near-complete estimate of the extent, function, and regulation of these and other forms of RNA processing. Yet statistical methods to detect regulated RNAP have limited detection power because they rely on (a) incomplete annotations of 3' untranslated regions (3' UTRs), (b) peak-calling heuristics, (c) analysis of measurements collapsed over all cells in a cell type (pseudobulking), or (d) APA-specific detection. Here, we introduce ReadZS, a computationally efficient, annotation-free statistical approach to identify regulated RNAP, including but not limited to APA, in single cells. ReadZS rediscovers and substantially extends the scope of known cell type-specific RNAP in the human lung and during human spermatogenesis. The single-cell resolution and statistical properties of ReadZS enable the discovery of new evolutionarily conserved, developmentally regulated RNAP, as well as subpopulations of lung-resident macrophages that are homogeneous by gene expression alone.