Abstract 008: Machine Learning For Sudden Cardiac Death Prediction: The Atherosclerosis Risk In Communities Study
Introduction: Sudden cardiac death (SCD) is the leading cause of death in the US and has significant public health impact. However, effective risk stratification for SCD remains lacking because current prediction models do not address the dynamic impact of time-varying risk factors, including interim clinical events, on SCD risk. Hypothesis: A recently developed machine learning approach, Random Forest for Survival, Longitudinal, and Multivariate Data (RF-SLAM), which uses time-dependent variables and incorporates complex interactions between risk factors, will improve SCD risk prediction. Methods: ARIC study participants were followed for adjudicated SCD. RF-SLAM partitions the information for each individual into multiple units (analogous to risk sets) and uses the Poisson regression log-likelihood as the split statistic, thus allowing time-varying variables to be modeled. RF-SLAM was compared with a Poisson regression model with stepwise selection for predicting SCD. Time-varying variables collected at four visits were used as candidate predictors for both prediction models, including demographics and clinical characteristics, anthropometric variables, lifestyle factors, cardiac risk factors, medication, laboratory values and biomarkers, electrophysiologic variables, and other cardiac functional indices. Predictive accuracy was assessed by the area under the receiver operating characteristic curve (AUC), using out-of-bag prediction for the RF-SLAM model and 10-fold cross-validation for the Poisson regression model. Results: Over 25 years of follow-up, 590 SCD events occurred among 15,792 ARIC participants (mean age 54 years, 55% women). Compared with Poisson regression (cross-validated mean AUC 0.75), the RF-SLAM model improved prediction (mean AUC 0.83). The RF-SLAM model identified prior coronary heart disease (CHD) as the top predictor for SCD.
Other predictors selected by RF-SLAM included clinical characteristics (diabetes, prior myocardial infarction, prior stroke, and prior heart failure), electrophysiologic variables (T wave abnormality in any of leads I, aVL, or V6, and ST junction and segment depression in any of leads I, aVL, or V6), medication (anti-hypertensive medications and anti-diabetic medications), biomarkers (N-terminal pro-B-type natriuretic peptide, troponin T, troponin I, and creatinine), subclinical atherosclerotic indices (carotid intima-media thickness), as well as race, sex, and visit. Fitting a Poisson regression model (a generalized linear model) with the 17 predictors selected by the RF-SLAM model yielded a mean AUC of 0.73, suggesting that the interactions captured by the random forest improve prediction performance. Conclusions: Applying a novel machine-learning approach with time-varying predictors improves the prediction of SCD. Clinical characteristics, especially prior CHD, are important for predicting SCD in the general population.
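
Illustrative sketch: The Methods describe two ideas that a small example may make more concrete: partitioning each participant's follow-up into multiple units, and scoring a candidate split by the Poisson log-likelihood. The Python below is a simplified sketch under stated assumptions, not the RF-SLAM implementation. The function names, the fixed 5-year interval, and the use of a plain maximum-likelihood event rate in each node are all hypothetical choices made for illustration; the abstract does not specify these details.

```python
import math


def split_units(follow_up_years, had_event, interval=5.0):
    """Partition one participant's follow-up into (exposure, event) units.

    Assumption: fixed-length intervals; the event, if any, is assigned
    to the final unit. The interval length is a hypothetical choice.
    """
    units = []
    start = 0.0
    while start < follow_up_years:
        stop = min(start + interval, follow_up_years)
        is_last = stop >= follow_up_years
        units.append((stop - start, 1 if (had_event and is_last) else 0))
        start = stop
    return units


def poisson_loglik(events, exposure):
    """Maximized Poisson log-likelihood of a node with one constant rate.

    Constant terms that do not depend on the rate are dropped.
    A node with no events (or no exposure) contributes 0.
    """
    if events == 0 or exposure <= 0:
        return 0.0
    rate = events / exposure
    return events * math.log(rate) - rate * exposure


def split_statistic(units, goes_left):
    """Gain in Poisson log-likelihood from splitting units into two nodes."""
    def node_loglik(rows):
        exposure = sum(t for t, _ in rows)
        events = sum(d for _, d in rows)
        return poisson_loglik(events, exposure)

    left = [u for u, g in zip(units, goes_left) if g]
    right = [u for u, g in zip(units, goes_left) if not g]
    return node_loglik(left) + node_loglik(right) - node_loglik(units)


# One participant with 12 years of follow-up ending in an event,
# and one with 5 years of follow-up and no event.
units = split_units(12.0, True) + split_units(5.0, False)
# Hypothetical time-varying flag (e.g. an interim clinical event)
# recorded per unit, so it can differ within one participant.
flag = [False, False, True, False]
gain = split_statistic(units, flag)
```

Because each unit carries its own covariate values, a predictor that changes between visits can send different units from the same participant to different child nodes, which is how a split statistic of this kind accommodates time-varying variables.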