Mapping Chronic Disease Prevalence Based on Medication Use and Socio-demographic Variables: An Application of LASSO in Healthcare in the Netherlands
Abstract

Objectives
Policymakers generally lack sufficiently detailed health information to develop localized health policy plans. Mapping chronic disease prevalence is difficult because accurate direct sources are often lacking. Estimates can be improved by adding information such as medication use and demographic characteristics to identify disease. The aim of this study was to apply a LASSO (Least Absolute Shrinkage and Selection Operator) model to a wide set of variables, including medication use, to obtain small-area prevalence estimates for four common chronic diseases and to investigate regional patterns of disease.

Methods
Administrative hospital records and general practitioner (GP) registry data were linked to medication use and socio-economic characteristics. The training set (n = 707,021) used GP diagnosis and/or hospital admission diagnosis as the reference standard for disease prevalence. For the entire Dutch population (n = 16,777,888), all information except GP and hospital admission data was available. A LASSO regression model for binary outcomes was used to select variables strongly associated with disease. Standardized and non-standardized prevalence estimates for stroke, coronary heart disease (CHD), chronic obstructive pulmonary disease (COPD) and diabetes in Dutch municipalities were then based on the average of individual predicted probabilities.

Results
Adding medication use data as a predictor substantially improved model performance. Estimates at the municipality level were best for diabetes, with a weighted percentage error (WPE) of 6.8%, and worst for COPD, with a WPE of 14.5%. Disease prevalence showed clear regional patterns, also after standardization for age.

Conclusion
Adding medication use as an indicator of disease prevalence alongside socio-economic variables substantially improved estimates at the municipality level. The resulting individual disease probabilities can be aggregated to any desired regional level and provide a useful tool to identify regional patterns and subsequently inform local policy.
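The estimation approach described in the Methods can be illustrated with a minimal sketch: fit an L1-penalized (LASSO) logistic regression on a labeled training subset, predict a disease probability for every individual, and average those probabilities per municipality. Everything specific here is an assumption made for illustration and is not taken from the study: the data are synthetic, the feature names, municipality codes, penalty strength and step size are hypothetical, and the simple proximal-gradient solver stands in for whatever software the authors actually used.

```python
import math
import random
from collections import defaultdict

# Hypothetical binary predictors; the study used a much wider variable set.
FEATURES = ["age_65_plus", "uses_medication", "low_income", "noise"]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink a coefficient towards zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def predict(x, weights, intercept):
    """Predicted disease probability for one individual."""
    return sigmoid(intercept + sum(w * v for w, v in zip(weights, x)))


def fit_l1_logistic(X, y, lam=0.02, step=0.5, n_iter=300):
    """L1-penalized logistic regression via proximal gradient descent.

    The intercept is left unpenalized; lam, step and n_iter are illustrative.
    """
    n, p = len(X), len(X[0])
    weights, intercept = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for x, target in zip(X, y):
            residual = predict(x, weights, intercept) - target
            grad_b += residual
            for j in range(p):
                grad_w[j] += residual * x[j]
        intercept -= step * grad_b / n
        # Gradient step on the log-loss, then shrinkage from the L1 penalty.
        weights = [soft_threshold(w - step * g / n, step * lam)
                   for w, g in zip(weights, grad_w)]
    return weights, intercept


# Synthetic population: disease depends on age and medication use only.
rng = random.Random(0)
population = []
for _ in range(800):
    x = [float(rng.random() < 0.4), float(rng.random() < 0.3),
         float(rng.random() < 0.5), float(rng.random() < 0.5)]
    has_disease = float(rng.random() < sigmoid(-3.0 + 1.0 * x[0] + 3.0 * x[1]))
    population.append((x, has_disease, rng.choice(["A", "B", "C"])))

# Only a subset has a diagnosis label, mirroring the GP/hospital training set.
training = population[:400]
weights, intercept = fit_l1_logistic([row[0] for row in training],
                                     [row[1] for row in training])

# Predict for everyone, then average probabilities within each municipality.
probabilities = [predict(x, weights, intercept) for x, _, _ in population]
grouped = defaultdict(list)
for (_, _, municipality), prob in zip(population, probabilities):
    grouped[municipality].append(prob)
prevalence_by_municipality = {m: sum(v) / len(v) for m, v in sorted(grouped.items())}

print(dict(zip(FEATURES, weights)))
print(prevalence_by_municipality)
```

Because each individual keeps a predicted probability, the same averaging step can be repeated for any other grouping variable (for example a hypothetical neighbourhood or region code) without refitting the model.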