BACKGROUND
The Multidimensional Prognostic Index (MPI) is an aggregate comprehensive geriatric assessment scoring system, derived from eight domains, that predicts adverse outcomes including 12-month mortality (12MM). However, in a recent study of older hospitalized Australian patients, prediction accuracy using the three MPI categories (mild, moderate and severe risk), as in previous investigations, was relatively poor. Prediction modelling that uses the component domains of the MPI together with additional clinical features and machine learning (ML) algorithms might improve prediction accuracy.
OBJECTIVE
To assess whether the accuracy of 12MM prediction by logistic regression with maximum likelihood estimation (LR-MLE) using the 3-category MPI together with age and gender (feature-set 1) can be improved by adding 10 clinical features (sodium, haemoglobin, albumin, creatinine, urea, urea/creatinine ratio, estimated glomerular filtration rate, C-reactive protein, body mass index and anticholinergic risk score) (feature-set 2), and by replacing the 3-category MPI in feature-sets 1 and 2 with the eight separate MPI domains (feature-sets 3 and 4 respectively). Also, to assess the prediction accuracy of ML algorithms using the same feature-sets.
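The four feature-sets can be written out explicitly, which makes the 2x2 design (MPI category versus MPI domains, with or without clinical features) easier to see. The following is a minimal sketch only: every column name is a hypothetical placeholder, since the dataset's field names are not given here, and the eight MPI domains are numbered rather than named.

```python
# Hypothetical column names; the dataset's actual field names are not stated.
DEMOGRAPHICS = ["age", "gender"]
MPI_CATEGORY = ["mpi_category"]  # one 3-level feature: mild, moderate, severe
MPI_DOMAINS = [f"mpi_domain_{i}" for i in range(1, 9)]  # eight domain scores
CLINICAL = [
    "sodium", "haemoglobin", "albumin", "creatinine", "urea",
    "urea_creatinine_ratio", "egfr", "crp", "bmi",
    "anticholinergic_risk_score",
]

# Feature-sets 1 to 4 as described in the objective.
FEATURE_SETS = {
    1: MPI_CATEGORY + DEMOGRAPHICS,
    2: MPI_CATEGORY + DEMOGRAPHICS + CLINICAL,
    3: MPI_DOMAINS + DEMOGRAPHICS,
    4: MPI_DOMAINS + DEMOGRAPHICS + CLINICAL,
}

for number, columns in FEATURE_SETS.items():
    print(f"feature-set {number}: {len(columns)} features")
```

Under these assumptions, feature-sets 1 and 3 differ only in how the MPI is represented, as do feature-sets 2 and 4, so each pair isolates the effect of replacing the category with its domains.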
METHODS
MPI and clinical features were collected from patients aged ≥65 years admitted to either the General Medical or the Acute Care of the Elderly wards of a South Australian hospital between September 2015 and February 2017. The diagnostic accuracy of LR-MLE was assessed together with that of nine ML algorithms: decision-trees, random-forests, eXtreme gradient-boosting (XGBoost), support-vector-machines, naïve-Bayes, k-nearest-neighbours, ridge regression, logistic regression without regularisation and neural-networks. A 70:30 Training:Test split of the data and a grid-search of hyper-parameters with 10-fold cross-validation were employed during training of the ML algorithms. The area under the receiver operating characteristic curve (AUC) was used to assess prediction accuracy.
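The evaluation steps described above (a 70:30 split, 10-fold cross-validation within the training portion, and AUC as the metric) can be illustrated with a short standard-library sketch. This is not the study's code: the software used is not stated here, the function names are hypothetical, the split is a plain random shuffle (whether the study stratified by outcome is not stated), and the AUC is computed by the pairwise ranking definition rather than by any particular library.

```python
import random


def train_test_split_70_30(n_rows, seed=0):
    """Shuffle row indices and split them 70:30 into training and test sets."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    cut = round(0.7 * n_rows)
    return idx[:cut], idx[cut:]


def k_fold_indices(train_idx, k=10):
    """Yield (fit, validate) index lists for k-fold cross-validation."""
    folds = [train_idx[i::k] for i in range(k)]
    for i in range(k):
        validate = folds[i]
        fit = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield fit, validate


def auc(y_true, y_score):
    """AUC as the probability that a random positive case is scored above
    a random negative case, counting ties as one half."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 737 is the cohort size reported in the results.
train_idx, test_idx = train_test_split_70_30(737)
print(len(train_idx), len(test_idx))
```

In a grid-search, each candidate hyper-parameter setting would be fitted on every `fit` subset, scored with `auc` on the matching `validate` subset, and the setting with the best mean score refitted on the whole training portion before a single evaluation on the held-out test portion.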
RESULTS
A total of 737 patients (50.2% female, 49.8% male) with a median (IQR) age of 80 (72-86) years had complete MPI data recorded on admission and complete 12-month follow-up. The AUC for LR-MLE was 0.632, 0.688, 0.738 and 0.757 for feature-sets 1 to 4 respectively. Among the nine ML algorithms, the best overall accuracy was obtained with XGBoost (AUC 0.635, 0.706, 0.756 and 0.757 for feature-sets 1 to 4 respectively).
CONCLUSIONS
The use of MPI domains (feature-sets 3 and 4) with LR-MLE considerably improved prediction accuracy compared with that obtained using the traditional 3-category MPI. The XGBoost ML algorithm slightly improved accuracy compared with LR-MLE for feature-sets 1 to 3 but not for feature-set 4. Adding clinical data also provided small gains in accuracy for LR-MLE and for some, but not all, ML algorithms. Consideration should be given to using the underlying domains of aggregate scoring systems such as the MPI, additional clinical data and ML algorithms when assessing the risk of 12MM.
CLINICALTRIAL
N/A
KEYWORDS
Machine learning, Multidimensional Prognostic Index, mortality, diagnostic accuracy, XGBoost