ABSTRACT

The gene expression profiles of human breast tumours fall into three main groups that have been called luminal, basal and either HER2-enriched or molecular apocrine. To escape from the circularity of descriptive classifications based purely on gene signatures, I describe a biological classification based on a model of the mammary lineage. In this model, I propose that the third group comprises tumours derived from a mammary hormone-sensing cell that has undergone apocrine metaplasia. I first split tumours into hormone-sensing and milk-secreting cell types based on the expression of transcription factors linked to cell identity (the luminal progenitor split), then split the hormone-sensing group into luminal and apocrine groups based on oestrogen receptor activity (the luminal-apocrine split). I show that the luminal-apocrine-basal (LAB) approach can be applied to microarray data (186 tumours) from an EORTC trial and to RNA-seq data from TCGA (674 tumours), and I compare the results obtained with the LAB and PAM50 approaches. Unlike purely signature-based approaches, classification based on an explicit biological model is both refutable and capable of meaningful improvement as biological understanding of mammary tumorigenesis improves.
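
The two-step logic of the LAB classification can be illustrated with a minimal sketch. It is not the implementation used in this work. The per-tumour scores (`hormone_sensing`, `milk_secreting`, `er_activity`), the threshold `er_threshold` and the function name `classify_lab` are hypothetical placeholders, and the mapping of the milk-secreting group to the basal label is inferred from the three group names above. How the scores would be derived from transcription factor expression or oestrogen receptor target genes is left unspecified.

```python
from dataclasses import dataclass


@dataclass
class TumourScores:
    """Hypothetical per-tumour summary scores (illustrative assumptions only)."""
    hormone_sensing: float  # assumed score for hormone-sensing cell identity
    milk_secreting: float   # assumed score for milk-secreting cell identity
    er_activity: float      # assumed score for oestrogen receptor activity


def classify_lab(scores: TumourScores, er_threshold: float = 0.0) -> str:
    """Assign a tumour to luminal, apocrine or basal with two successive splits.

    er_threshold is an arbitrary placeholder cut-off, not a published value.
    """
    # Step 1 (luminal progenitor split): compare the two cell-identity scores.
    if scores.milk_secreting > scores.hormone_sensing:
        return "basal"
    # Step 2 (luminal-apocrine split): within the hormone-sensing group,
    # separate by oestrogen receptor activity.
    if scores.er_activity > er_threshold:
        return "luminal"
    return "apocrine"


# Toy inputs, invented purely to exercise each branch.
examples = {
    "tumour_A": TumourScores(hormone_sensing=2.0, milk_secreting=-1.0, er_activity=1.5),
    "tumour_B": TumourScores(hormone_sensing=1.0, milk_secreting=-0.5, er_activity=-2.0),
    "tumour_C": TumourScores(hormone_sensing=-1.5, milk_secreting=2.5, er_activity=-1.0),
}
labels = {name: classify_lab(s) for name, s in examples.items()}
```

Because each split is an explicit rule tied to a biological claim, either rule can be tested and replaced independently, which is the sense in which a model-based classification is refutable.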