Differential privacy in the 2020 Census will distort COVID-19 rates
Scientists and policy makers rely on accurate population and mortality data to inform their response to the coronavirus disease 2019 (COVID-19) pandemic. Age-specific mortality rates are especially important because COVID-19 deaths are concentrated at older ages. Population counts, the principal denominators for calculating age-specific mortality rates, will be subject to noise infusion in the 2020 United States Census through a disclosure avoidance system based on differential privacy. Using COVID-19 mortality curves from the CDC, we show that differential privacy will introduce substantial distortion in COVID-19 mortality rates, sometimes causing mortality rates to exceed 100\%, and that this distortion hinders our ability to understand the pandemic. The distortion is particularly large for population groupings with fewer than 1000 persons, which account for 40\% of all county-level age-sex groupings and 60\% of race groupings. The US Census Bureau should consider a larger privacy budget, and data users should consider pooling data to increase population sizes and thereby minimize the distortion that differential privacy introduces.
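
To make the denominator effect concrete, the following minimal sketch computes a crude mortality rate as deaths divided by population and compares the rate under a true count with the rate under a perturbed count. All numbers are hypothetical and chosen only for illustration. The fixed perturbation of 16 persons is an assumption, not a draw from the Census Bureau's disclosure avoidance system, and the function names are ours. The same absolute perturbation pushes the rate for a very small group above 1 (that is, above 100 percent) while barely moving the rate for a large group.

```python
def mortality_rate(deaths: int, population: int) -> float:
    """Crude mortality rate as a fraction: deaths / population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths / population


def relative_distortion(deaths: int, true_pop: int, noisy_pop: int) -> float:
    """Relative change in the rate when the denominator is perturbed."""
    true_rate = mortality_rate(deaths, true_pop)
    noisy_rate = mortality_rate(deaths, noisy_pop)
    return (noisy_rate - true_rate) / true_rate


# Hypothetical perturbation: the published count is 16 persons below the truth.
NOISE = -16

# Small group: 5 deaths among 20 persons (true rate 0.25).
small_noisy = mortality_rate(5, 20 + NOISE)           # 5 / 4 = 1.25
small_rel = relative_distortion(5, 20, 20 + NOISE)    # rate is 5x the truth

# Large group: 500 deaths among 20000 persons (true rate 0.025).
large_rel = relative_distortion(500, 20000, 20000 + NOISE)  # under 0.1 percent

print(f"small group noisy rate: {small_noisy:.2f}")
print(f"small group relative distortion: {small_rel:.2f}")
print(f"large group relative distortion: {large_rel:.5f}")
```

Pooling small groups into larger ones moves a calculation from the first case toward the second, which is the rationale for the recommendation to data users.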