Abstract P030: Complementary Variable Selection Methods Highlight Joint Contribution Of Cystatin C And Apolipoprotein B For Cardiovascular Risk Prediction
Introduction: Variable selection methods can provide an unbiased means of identifying informative predictors but have rarely been applied to CVD risk prediction. Hypothesis: Additional variables beyond those in pooled cohort equations may improve CVD risk prediction. Methods: We used two complementary variable selection methods, LASSO stability selection (parametric) and survival random forests (non-parametric), to identify jointly informative sets of predictors for CVD risk and to rank them by their contribution to predictive accuracy. We used a prospective cohort (UK Biobank) of 304,839 participants aged 40-69 years at enrollment (2006-2010) without prior CVD, with follow-up to March 2017. Candidate variables comprised those in pooled cohort equations plus additional biochemistry and hematology data and polygenic risk scores for CVD. Outcomes were CVD hospitalization, procedure/operation or mortality. Data were sex-stratified and divided into independent variable selection (40%), training (30%) and test (30%) sets. Variables were selected via penalized (LASSO) Cox regression with stability analysis, and were ranked according to the mean change in C statistic after variable permutation in survival random forests. Results: Mean age was 55.9 years; there were 10,267 CVD events (6,277 men [59.0%]) over a median follow-up of 8.1 years. The Figure summarizes results from LASSO stability selection. Jointly informative predictors for both men and women were cystatin C, apolipoprotein B, family history of coronary artery disease and polygenic risk score, in addition to age, systolic blood pressure, antihypertensive use and current smoking, which are used in pooled cohort equations. Other than variables already included in pooled cohort equations, cystatin C and apolipoprotein B ranked highest in random forests for both men and women. Conclusions: Two complementary data-driven variable selection methods identified informative predictors of CVD beyond those included in pooled cohort equations, notably cystatin C and apolipoprotein B.
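The ranking step described in the Methods (mean change in C statistic after variable permutation) can be illustrated with a minimal sketch. This is not the study's code: it uses synthetic data, a hypothetical fixed linear risk score in place of a survival random forest, and made-up variable names (`signal`, `noise`), and it computes Harrell's C directly with the Python standard library.

```python
import math
import random

def harrell_c(times, events, risks):
    """Harrell's C: share of comparable pairs in which the subject with the
    earlier observed event also has the higher risk score (ties count 0.5)."""
    concordant, comparable = 0.0, 0
    for i in range(len(times)):
        if not events[i]:
            continue  # a pair must be anchored on an observed event
        for j in range(len(times)):
            if times[j] > times[i]:  # j was still event-free at i's event time
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable

def permutation_importance(rows, times, events, score, n_repeats=20, seed=0):
    """Mean drop in C statistic when each variable is shuffled in turn."""
    rng = random.Random(seed)
    baseline = harrell_c(times, events, [score(r) for r in rows])
    importance = {}
    for name in rows[0]:
        drops = []
        for _ in range(n_repeats):
            shuffled = [r[name] for r in rows]
            rng.shuffle(shuffled)
            permuted = [dict(r, **{name: v}) for r, v in zip(rows, shuffled)]
            c_perm = harrell_c(times, events, [score(r) for r in permuted])
            drops.append(baseline - c_perm)
        importance[name] = sum(drops) / n_repeats
    return baseline, importance

# Synthetic cohort (assumption): one informative variable, one noise variable.
rng = random.Random(42)
rows = [{"signal": rng.gauss(0, 1), "noise": rng.gauss(0, 1)} for _ in range(200)]
raw_times = [rng.expovariate(0.2 * math.exp(r["signal"])) for r in rows]
CENSOR_AT = 2.0  # administrative censoring time (assumption)
events = [t <= CENSOR_AT for t in raw_times]
times = [min(t, CENSOR_AT) for t in raw_times]

def score(row):
    # Hypothetical stand-in for a fitted model's predicted risk.
    return 1.0 * row["signal"] + 0.05 * row["noise"]

baseline, importance = permutation_importance(rows, times, events, score)
print("baseline C:", round(baseline, 3))
for name, drop in sorted(importance.items(), key=lambda kv: -kv[1]):
    print(name, "mean drop in C:", round(drop, 3))
```

Variables whose permutation produces the largest mean drop in C are ranked highest. In practice the score would come from a model fitted on the training set and evaluated on held-out data, as in the 40/30/30 split described above.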