Patient Similarity Analytics for Explainable Clinical Risk Prediction
Abstract
Background: A Clinical Risk Prediction Model (CRPM) uses patient characteristics to estimate the probability of having or developing a particular disease and/or outcome. While CRPMs are gaining in popularity, they have yet to be adopted routinely in clinical practice. Their lack of explainability and interpretability has limited their utility. Explainability is the extent to which a model’s prediction process can be described. Interpretability is the degree to which a user can understand the predictions made by a model.
Methods: The study aimed to demonstrate the utility of patient similarity analytics in developing an explainable and interpretable CRPM. Data was extracted from the electronic medical records of patients with type-2 diabetes mellitus, hypertension and dyslipidaemia in a Singapore public primary care clinic. We used various techniques, including patient similarity analytics, to develop various models on this real-world training dataset (n=7,041) and validated each of them on the same test dataset (n=3,018). The results were compared with those of logistic regression, random forest and support vector machine models developed on the same dataset. The CRPM was then implemented in a prototype system to demonstrate the identification, explainability and interpretability of similar patients and the prediction process.
Results: The patient similarity model (AUROC=0.718) was comparable to the logistic regression (AUROC=0.695), random forest (AUROC=0.764) and support vector machine (AUROC=0.766) models. We incorporated the patient similarity model in a prototype web application. A case study demonstrated how the application provided both quantitative and qualitative information, in the form of patient narratives. This information was used to better inform and influence clinical decision-making, such as getting a patient to agree to start insulin therapy.
Conclusions: A patient similarity approach is a feasible way to develop an explainable and interpretable CRPM. It is a general approach that can be used to develop locally relevant information, based on the database it searches. Ultimately, such an approach can generate more informative CRPMs which can be deployed as part of clinical decision support tools to better facilitate shared decision-making in clinical practice.
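To make the idea of a patient similarity CRPM concrete, the following is a minimal sketch and not the study's implementation. The abstract does not specify the similarity measure or the features, so the sketch assumes Euclidean distance on standardized features, a k-nearest-neighbour risk estimate, and three hypothetical features (age, HbA1c, systolic blood pressure) with made-up toy values. The function names are illustrative only.

```python
import math

def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid division by zero.
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]

    def scale(row):
        return [(x - m) / s for x, m, s in zip(row, means, sds)]

    return [scale(r) for r in rows], scale

def similar_patients(query, cohort, k):
    """Indices of the k cohort patients closest to the query (assumed metric: Euclidean)."""
    order = sorted(range(len(cohort)), key=lambda i: math.dist(query, cohort[i]))
    return order[:k]

def predicted_risk(query, cohort, outcomes, k):
    """Risk estimate = share of the k most similar patients who had the outcome."""
    idx = similar_patients(query, cohort, k)
    return sum(outcomes[i] for i in idx) / k, idx

def auroc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy cohort with hypothetical values: [age, HbA1c (%), systolic BP (mmHg)].
train = [[50, 6.5, 120], [62, 9.1, 150], [58, 8.7, 145],
         [45, 6.0, 118], [70, 9.5, 160], [52, 6.8, 125]]
outcomes = [0, 1, 1, 0, 1, 0]  # 1 = outcome occurred (illustrative labels)

scaled, scale = standardize(train)
risk, idx = predicted_risk(scale([60, 9.0, 148]), scaled, outcomes, k=3)
print(f"similar patients: {idx}, estimated risk: {risk:.2f}")
```

Because the prediction is simply the outcome rate among the retrieved patients, the list of indices in `idx` doubles as the explanation: a clinician can inspect exactly which similar patients the estimate rests on. The `auroc` helper shows one way the discrimination of such a model could be scored on a held-out test set.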