Improving Clinical Translation of Machine Learning Approaches Through Clinician-Tailored Visual Displays of Black Box Algorithms: Development and Validation

Background Despite the promise of machine learning (ML) to inform individualized medical care, the clinical utility of ML in medicine has been limited by the minimal interpretability and black box nature of these algorithms. Objective The study aimed to demonstrate a general and simple framework for generating clinically relevant and interpretable visualizations of black box predictions to aid in the clinical translation of ML. Methods To obtain improved transparency of ML, simplified models and visual displays can be generated using common methods from clinical practice such as decision trees and effect plots. We illustrated the approach based on postprocessing of ML predictions, in this case random forest predictions, and applied the method to data from the Left Ventricular (LV) Structural Predictors of Sudden Cardiac Death (SCD) Registry for individualized risk prediction of SCD, a leading cause of death. Results With the LV Structural Predictors of SCD Registry data, SCD risk predictions are obtained from a random forest algorithm that identifies the most important predictors, nonlinearities, and interactions among a large number of variables while naturally accounting for missing data. The black box predictions are postprocessed using classification and regression trees into a clinically relevant and interpretable visualization. The method also quantifies the relative importance of an individual or a combination of predictors. Several risk factors (heart failure hospitalization, cardiac magnetic resonance imaging indices, and serum concentration of systemic inflammation) can be clearly visualized as branch points of a decision tree to discriminate between low-, intermediate-, and high-risk patients. Conclusions Through a clinically important example, we illustrate a general and simple approach to increase the clinical translation of ML through clinician-tailored visual displays of results from black box algorithms. We illustrate this general model-agnostic framework by applying it to SCD risk prediction. Although we illustrate the methods using SCD prediction with random forest, the methods presented are applicable more broadly to improving the clinical translation of ML, regardless of the specific ML algorithm or clinical application. As any trained predictive model can be summarized in this manner to a prespecified level of precision, we encourage the use of simplified visual displays as an adjunct to the complex predictive model. Overall, this framework can allow clinicians to peek inside the black box and develop a deeper understanding of the most important features from a model to gain trust in the predictions and confidence in applying them to clinical care.

Download Full-text

Improving Clinical Translation of Machine Learning Approaches Through Clinician-Tailored Visual Displays of Black Box Algorithms: Development and Validation (Preprint)

10.2196/preprints.15791 ◽

2019 ◽

Author(s):

Shannon Wongvibulsin ◽

Katherine C Wu ◽

Scott L Zeger

Keyword(s):

Machine Learning ◽

Random Forest ◽

Predictive Model ◽

Risk Prediction ◽

Clinical Care ◽

Left Ventricular ◽

Black Box ◽

Clinical Translation ◽

Learning Approaches ◽

Visual Displays

BACKGROUND Despite the promise of machine learning (ML) to inform individualized medical care, the clinical utility of ML in medicine has been limited by the minimal interpretability and black box nature of these algorithms. OBJECTIVE The study aimed to demonstrate a general and simple framework for generating clinically relevant and interpretable visualizations of black box predictions to aid in the clinical translation of ML. METHODS To obtain improved transparency of ML, simplified models and visual displays can be generated using common methods from clinical practice such as decision trees and effect plots. We illustrated the approach based on postprocessing of ML predictions, in this case random forest predictions, and applied the method to data from the Left Ventricular (LV) Structural Predictors of Sudden Cardiac Death (SCD) Registry for individualized risk prediction of SCD, a leading cause of death. RESULTS With the LV Structural Predictors of SCD Registry data, SCD risk predictions are obtained from a random forest algorithm that identifies the most important predictors, nonlinearities, and interactions among a large number of variables while naturally accounting for missing data. The black box predictions are postprocessed using classification and regression trees into a clinically relevant and interpretable visualization. The method also quantifies the relative importance of an individual or a combination of predictors. Several risk factors (heart failure hospitalization, cardiac magnetic resonance imaging indices, and serum concentration of systemic inflammation) can be clearly visualized as branch points of a decision tree to discriminate between low-, intermediate-, and high-risk patients. CONCLUSIONS Through a clinically important example, we illustrate a general and simple approach to increase the clinical translation of ML through clinician-tailored visual displays of results from black box algorithms. We illustrate this general model-agnostic framework by applying it to SCD risk prediction. Although we illustrate the methods using SCD prediction with random forest, the methods presented are applicable more broadly to improving the clinical translation of ML, regardless of the specific ML algorithm or clinical application. As any trained predictive model can be summarized in this manner to a prespecified level of precision, we encourage the use of simplified visual displays as an adjunct to the complex predictive model. Overall, this framework can allow clinicians to peek inside the black box and develop a deeper understanding of the most important features from a model to gain trust in the predictions and confidence in applying them to clinical care.

Download Full-text

Clinical risk prediction with random forests for survival, longitudinal, and multivariate (RF-SLAM) data analysis

BMC Medical Research Methodology ◽

10.1186/s12874-019-0863-0 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 8

Author(s):

Shannon Wongvibulsin ◽

Katherine C. Wu ◽

Scott L. Zeger

Keyword(s):

Machine Learning ◽

Data Analysis ◽

Random Forest ◽

Risk Prediction ◽

Left Ventricular ◽

Machine Learning Algorithms ◽

Superior Performance ◽

Time Varying ◽

Clinical Risk ◽

Multiple Variables

Abstract Background Clinical research and medical practice can be advanced through the prediction of an individual’s health state, trajectory, and responses to treatments. However, the majority of current clinical risk prediction models are based on regression approaches or machine learning algorithms that are static, rather than dynamic. To benefit from the increasing emergence of large, heterogeneous data sets, such as electronic health records (EHRs), novel tools to support improved clinical decision making through methods for individual-level risk prediction that can handle multiple variables, their interactions, and time-varying values are necessary. Methods We introduce a novel dynamic approach to clinical risk prediction for survival, longitudinal, and multivariate (SLAM) outcomes, called random forest for SLAM data analysis (RF-SLAM). RF-SLAM is a continuous-time, random forest method for survival analysis that combines the strengths of existing statistical and machine learning methods to produce individualized Bayes estimates of piecewise-constant hazard rates. We also present a method-agnostic approach for time-varying evaluation of model performance. Results We derive and illustrate the method by predicting sudden cardiac arrest (SCA) in the Left Ventricular Structural (LV) Predictors of Sudden Cardiac Death (SCD) Registry. We demonstrate superior performance relative to standard random forest methods for survival data. We illustrate the importance of the number of preceding heart failure hospitalizations as a time-dependent predictor in SCA risk assessment. Conclusions RF-SLAM is a novel statistical and machine learning method that improves risk prediction by incorporating time-varying information and accommodating a large number of predictors, their interactions, and missing values. RF-SLAM is designed to easily extend to simultaneous predictions of multiple, possibly competing, events and/or repeated measurements of discrete or continuous variables over time.Trial registration: LV Structural Predictors of SCD Registry (clinicaltrials.gov, NCT01076660), retrospectively registered 25 February 2010

Download Full-text

Machine Learning Approaches to Traffic Accident Analysis and Hotspot Prediction

Computers ◽

10.3390/computers10120157 ◽

2021 ◽

Vol 10 (12) ◽

pp. 157

Author(s):

Daniel Santos ◽

José Saias ◽

Paulo Quaresma ◽

Vítor Beires Nogueira

Keyword(s):

Machine Learning ◽

Predictive Model ◽

Road Accident ◽

Influential Factors ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Economic Losses ◽

Learning Approaches ◽

Road Accidents ◽

Accident Data

Traffic accidents are one of the most important concerns of the world, since they result in numerous casualties, injuries, and fatalities each year, as well as significant economic losses. There are many factors that are responsible for causing road accidents. If these factors can be better understood and predicted, it might be possible to take measures to mitigate the damages and its severity. The purpose of this work is to identify these factors using accident data from 2016 to 2019 from the district of Setúbal, Portugal. This work aims at developing models that can select a set of influential factors that may be used to classify the severity of an accident, supporting an analysis on the accident data. In addition, this study also proposes a predictive model for future road accidents based on past data. Various machine learning approaches are used to create these models. Supervised machine learning methods such as decision trees (DT), random forests (RF), logistic regression (LR), and naive Bayes (NB) are used, as well as unsupervised machine learning techniques including DBSCAN and hierarchical clustering. Results show that a rule-based model using the C5.0 algorithm is capable of accurately detecting the most relevant factors describing a road accident severity. Further, the results of the predictive model suggests the RF model could be a useful tool for forecasting accident hotspots.

Download Full-text

Machine Learning Approaches to Retrieve High-Quality, Clinically Relevant, Evidence from the Biomedical Literature: A Systematic Review (Preprint)

10.2196/preprints.30401 ◽

2021 ◽

Author(s):

Wael Abdelkader ◽

Tamara Navarro ◽

Rick Parrish ◽

Chris Cotoi ◽

Federico Germini ◽

...

Keyword(s):

Machine Learning ◽

Systematic Review ◽

Strong Evidence ◽

Clinical Care ◽

Biomedical Literature ◽

Supervised Machine Learning ◽

Machine Learning Techniques ◽

Learning Approaches ◽

High Quality ◽

Applied Machine Learning

BACKGROUND The rapid growth of the biomedical literature makes identifying strong evidence a time-consuming task. Applying machine learning to the process could be a viable solution that limits effort while maintaining accuracy. OBJECTIVE To summarize the nature and comparative performance of machine learning approaches that have been applied to retrieve high-quality evidence for clinical consideration from the biomedical literature. METHODS We conducted a systematic review of studies that applied machine learning techniques to identify high-quality clinical articles in the biomedical literature. Multiple databases were searched to July 2020. Extracted data focused on the applied machine learning model, steps in the development of the models, and model performance. RESULTS From 3918 retrieved studies, 10 met our inclusion criteria. All followed a supervised machine learning approach and applied, from a limited range of options, a high-quality standard for the training of their model. The results show that machine learning can achieve a sensitivity of 95% while maintaining a high precision of 86%. CONCLUSIONS Applying machine learning to distinguish studies with strong evidence for clinical care has the potential to decrease the workload of manually identifying these. The evidence base is active and evolving. Reported methods were variable across the studies but focused on supervised machine learning approaches. Performance may improve by applying more sophisticated approaches such as active learning, auto-machine learning, and unsupervised machine learning approaches.

Download Full-text

EAT-Rice: A predictive model for flanking gene expression of T-DNA insertion activation-tagged rice mutants by machine learning approaches

PLoS Computational Biology ◽

10.1371/journal.pcbi.1006942 ◽

2019 ◽

Vol 15 (5) ◽

pp. e1006942 ◽

Cited By ~ 1

Author(s):

Chi-Chou Liao ◽

Liang-Jwu Chen ◽

Shuen-Fang Lo ◽

Chi-Wei Chen ◽

Yen-Wei Chu

Keyword(s):

Gene Expression ◽

Machine Learning ◽

Predictive Model ◽

Learning Approaches ◽

Dna Insertion

Download Full-text

Application of machine learning approaches for osteoporosis risk prediction in postmenopausal women

Archives of Osteoporosis ◽

10.1007/s11657-020-00802-8 ◽

2020 ◽

Vol 15 (1) ◽

Author(s):

Jae-Geum Shim ◽

Dong Woo Kim ◽

Kyoung-Ho Ryu ◽

Eun-Ah Cho ◽

Jin-Hee Ahn ◽

...

Keyword(s):

Machine Learning ◽

Postmenopausal Women ◽

Risk Prediction ◽

Learning Approaches ◽

Osteoporosis Risk

Download Full-text

Which PHQ-9 Items Can Effectively Screen for Suicide? Machine Learning Approaches

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph18073339 ◽

2021 ◽

Vol 18 (7) ◽

pp. 3339

Author(s):

Sunhae Kim ◽

Hye-Kyung Lee ◽

Kounseok Lee

Keyword(s):

Machine Learning ◽

Primary Care ◽

Suicidal Ideation ◽

Random Forest ◽

Suicide Ideation ◽

Area Under The Curve ◽

Learning Approaches ◽

K Nearest Neighbors ◽

Predictive Values ◽

Linear Discriminant

(1) Background: The Patient Health Questionnaire-9 (PHQ-9) is a tool that screens patients for depression in primary care settings. In this study, we evaluated the efficacy of PHQ-9 in evaluating suicidal ideation (2) Methods: A total of 8760 completed questionnaires collected from college students were analyzed. The PHQ-9 was scored in combination with and evaluated against four categories (PHQ-2, PHQ-8, PHQ-9, and PHQ-10). Suicidal ideations were evaluated using the Mini-International Neuropsychiatric Interview suicidality module. Analyses used suicide ideation as the dependent variable, and machine learning (ML) algorithms, k-nearest neighbors, linear discriminant analysis (LDA), and random forest. (3) Results: Random forest application using the nine items of the PHQ-9 revealed an excellent area under the curve with a value of 0.841, with 94.3% accuracy. The positive and negative predictive values were 84.95% (95% CI = 76.03–91.52) and 95.54% (95% CI = 94.42–96.48), respectively. (4) Conclusion: This study confirmed that ML algorithms using PHQ-9 in the primary care field are reliably accurate in screening individuals with suicidal ideation.

Download Full-text

Predictive modelling of groundwater nitrate pollution at a regional scale using machine learning and feature selection

10.5194/egusphere-egu2020-5414 ◽

2020 ◽

Author(s):

Aaron Cardenas-Martinez ◽

Victor Rodriguez-Galiano ◽

Juan Antonio Luque-Espinar ◽

Maria Paula Mendes

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Random Forest ◽

Vegetation Index ◽

Regional Scale ◽

Developed Countries ◽

Quality Standard ◽

Nitrate Pollution ◽

Learning Approaches ◽

Environmental Measures

The establishment of the sources and driven-forces of groundwater nitrate pollution is of paramount importance, contributing to agro-environmental measures implementation and evaluation. High concentrations of nitrates in groundwater occur all around the world, in rich and less developed countries.In the case of Spain, 21.5% of the wells of the groundwater quality monitoring network showed mean concentrations above the quality standard (QS) of 50 mg/l. The objectives of this work were: i) to predict the current probability of having nitrate concentrations above the QS in Andalusian groundwater bodies (Spain) using past time features, being some of them obtained from satellite observations; ii) to assess the importance of features in the prediction; iii) to evaluate different machine learning approaches (ML) and feature selection techniques (FS).Several predictive models based on an ML algorithm, the Random Forest, were used, as well as, FS techniques. 321 nitrate samples and respective predictive features were obtained from different groundwater bodies. These predictive features were divided into three groups, regarding their focus: agricultural production (phenology); livestock pressure (excretion rates); and environmental settings (soil characteristics and texture, geomorphology, and local climate conditions). Models were trained with the features of a year [YEAR (t0)], and then applied to new features obtained for the next year &#8211; [YEAR(t0+1)], performing k-fold cross-validation. Additionally, a further prediction was carried out for a present time &#8211; [YEAR(t0+n)], validating with an independent test. This methodology examined the use of a model, trained with previous nitrates concentrations and predictive features, for the prediction of current nitrates concentrations based on present features. Our findings showed an improvement in the predictive performance when using a wrapper with sequential search for FS when compared to the use alone of the Random Forest algorithm. Phenology features, derived from remotely sensed variables, were the most explanative features, performing better than the use of static land-use maps or vegetation index images (e.g., NDVI). They also provided much more comprehensive information, and more importantly, employing only extrinsic features of groundwater bodies.

Download Full-text

Machine Learning Approaches as a Tool for Effective Offender Risk Prediction

Criminology & Public Policy ◽

10.1111/1745-9133.12060 ◽

2013 ◽

Vol 12 (3) ◽

pp. 507-510 ◽

Cited By ~ 4

Author(s):

William Rhodes

Keyword(s):

Machine Learning ◽

Risk Prediction ◽

Learning Approaches

Download Full-text

MITRE: predicting host status from microbiota time-series data

10.1101/447250 ◽

2018 ◽

Author(s):

Elijah Bogart ◽

Richard Creswell ◽

Georg K. Gerber

Keyword(s):

Machine Learning ◽

Time Series ◽

Time Series Data ◽

Synthetic Data ◽

Black Box ◽

Series Data ◽

Learning Approaches ◽

Rule Engine ◽

Microbiome Composition ◽

Host Status

AbstractLongitudinal studies are crucial for discovering casual relationships between the microbiome and human disease. We present Microbiome Interpretable Temporal Rule Engine (MITRE), the first machine learning method specifically designed for predicting host status from microbiome time-series data. Our method maintains interpretability by learning predictive rules over automatically inferred time-periods and phylogenetically related microbes. We validate MITRE’s performance on semi-synthetic data, and five real datasets measuring microbiome composition over time in infant and adult cohorts. Our results demonstrate that MITRE performs on par or outperforms “black box” machine learning approaches, providing a powerful new tool enabling discovery of biologically interpretable relationships between microbiome and human host.

Download Full-text