scholarly journals A Path for Translation of Machine Learning Products into Healthcare Delivery

Despite enormous enthusiasm, machine learning models are rarely translated into clinical care and there is minimal evidence of clinical or economic impact. New conference venues and academic journals have emerged to promote the proliferating research; however, the translational path remains unclear. This review undertakes the first in-depth study to identify how machine learning models that ingest structured electronic health record data can be applied to clinical decision support tasks and translated into clinical practice. The authors complement their own work with the experience of 21 machine learning products that address problems across clinical domains and across geographic populations. Four phases of translation emerge: design and develop, evaluate and validate, diffuse and scale, and continuing monitoring and maintenance. The review highlights the varying approaches taken across each phase by teams building machine learning products and presents a discussion of challenges and opportunities. The translational path and associated findings are instructive to researchers and developers building machine learning products, policy makers regulating machine learning products, and health system leaders who are considering adopting a machine learning product.

Author(s):  
Emily Kogan ◽  
Kathryn Twyman ◽  
Jesse Heap ◽  
Dejan Milentijevic ◽  
Jennifer H. Lin ◽  
...  

Abstract Background Stroke severity is an important predictor of patient outcomes and is commonly measured with the National Institutes of Health Stroke Scale (NIHSS) scores. Because these scores are often recorded as free text in physician reports, structured real-world evidence databases seldom include the severity. The aim of this study was to use machine learning models to impute NIHSS scores for all patients with newly diagnosed stroke from multi-institution electronic health record (EHR) data. Methods NIHSS scores available in the Optum© de-identified Integrated Claims-Clinical dataset were extracted from physician notes by applying natural language processing (NLP) methods. The cohort analyzed in the study consists of the 7149 patients with an inpatient or emergency room diagnosis of ischemic stroke, hemorrhagic stroke, or transient ischemic attack and a corresponding NLP-extracted NIHSS score. A subset of these patients (n = 1033, 14%) were held out for independent validation of model performance and the remaining patients (n = 6116, 86%) were used for training the model. Several machine learning models were evaluated, and parameters optimized using cross-validation on the training set. The model with optimal performance, a random forest model, was ultimately evaluated on the holdout set. Results Leveraging machine learning we identified the main factors in electronic health record data for assessing stroke severity, including death within the same month as stroke occurrence, length of hospital stay following stroke occurrence, aphagia/dysphagia diagnosis, hemiplegia diagnosis, and whether a patient was discharged to home or self-care. Comparing the imputed NIHSS scores to the NLP-extracted NIHSS scores on the holdout data set yielded an R2 (coefficient of determination) of 0.57, an R (Pearson correlation coefficient) of 0.76, and a root-mean-squared error of 4.5. Conclusions Machine learning models built on EHR data can be used to determine proxies for stroke severity. This enables severity to be incorporated in studies of stroke patient outcomes using administrative and EHR databases.


2020 ◽  
Author(s):  
Tahmina Nasrin Poly ◽  
Md.Mohaimenul Islam ◽  
Muhammad Solihuddin Muhtar ◽  
Hsuan-Chia Yang ◽  
Phung Anh (Alex) Nguyen ◽  
...  

BACKGROUND Computerized physician order entry (CPOE) systems are incorporated into clinical decision support systems (CDSSs) to reduce medication errors and improve patient safety. Automatic alerts generated from CDSSs can directly assist physicians in making useful clinical decisions and can help shape prescribing behavior. Multiple studies reported that approximately 90%-96% of alerts are overridden by physicians, which raises questions about the effectiveness of CDSSs. There is intense interest in developing sophisticated methods to combat alert fatigue, but there is no consensus on the optimal approaches so far. OBJECTIVE Our objective was to develop machine learning prediction models to predict physicians’ responses in order to reduce alert fatigue from disease medication–related CDSSs. METHODS We collected data from a disease medication–related CDSS from a university teaching hospital in Taiwan. We considered prescriptions that triggered alerts in the CDSS between August 2018 and May 2019. Machine learning models, such as artificial neural network (ANN), random forest (RF), naïve Bayes (NB), gradient boosting (GB), and support vector machine (SVM), were used to develop prediction models. The data were randomly split into training (80%) and testing (20%) datasets. RESULTS A total of 6453 prescriptions were used in our model. The ANN machine learning prediction model demonstrated excellent discrimination (area under the receiver operating characteristic curve [AUROC] 0.94; accuracy 0.85), whereas the RF, NB, GB, and SVM models had AUROCs of 0.93, 0.91, 0.91, and 0.80, respectively. The sensitivity and specificity of the ANN model were 0.87 and 0.83, respectively. CONCLUSIONS In this study, ANN showed substantially better performance in predicting individual physician responses to an alert from a disease medication–related CDSS, as compared to the other models. To our knowledge, this is the first study to use machine learning models to predict physician responses to alerts; furthermore, it can help to develop sophisticated CDSSs in real-world clinical settings.


10.2196/19489 ◽  
2020 ◽  
Vol 8 (11) ◽  
pp. e19489
Author(s):  
Tahmina Nasrin Poly ◽  
Md.Mohaimenul Islam ◽  
Muhammad Solihuddin Muhtar ◽  
Hsuan-Chia Yang ◽  
Phung Anh (Alex) Nguyen ◽  
...  

Background Computerized physician order entry (CPOE) systems are incorporated into clinical decision support systems (CDSSs) to reduce medication errors and improve patient safety. Automatic alerts generated from CDSSs can directly assist physicians in making useful clinical decisions and can help shape prescribing behavior. Multiple studies reported that approximately 90%-96% of alerts are overridden by physicians, which raises questions about the effectiveness of CDSSs. There is intense interest in developing sophisticated methods to combat alert fatigue, but there is no consensus on the optimal approaches so far. Objective Our objective was to develop machine learning prediction models to predict physicians’ responses in order to reduce alert fatigue from disease medication–related CDSSs. Methods We collected data from a disease medication–related CDSS from a university teaching hospital in Taiwan. We considered prescriptions that triggered alerts in the CDSS between August 2018 and May 2019. Machine learning models, such as artificial neural network (ANN), random forest (RF), naïve Bayes (NB), gradient boosting (GB), and support vector machine (SVM), were used to develop prediction models. The data were randomly split into training (80%) and testing (20%) datasets. Results A total of 6453 prescriptions were used in our model. The ANN machine learning prediction model demonstrated excellent discrimination (area under the receiver operating characteristic curve [AUROC] 0.94; accuracy 0.85), whereas the RF, NB, GB, and SVM models had AUROCs of 0.93, 0.91, 0.91, and 0.80, respectively. The sensitivity and specificity of the ANN model were 0.87 and 0.83, respectively. Conclusions In this study, ANN showed substantially better performance in predicting individual physician responses to an alert from a disease medication–related CDSS, as compared to the other models. To our knowledge, this is the first study to use machine learning models to predict physician responses to alerts; furthermore, it can help to develop sophisticated CDSSs in real-world clinical settings.


Author(s):  
M. VALKEMA ◽  
H. LINGSMA ◽  
P. LAMBIN ◽  
J. VAN LANSCHOT

Biostatistics versus machine learning: from traditional prediction models to automated medical analysis Machine learning is increasingly applied to medical data to develop clinical prediction models. This paper discusses the application of machine learning in comparison with traditional biostatistical methods. Biostatistics is well-suited for structured datasets. The selection of variables for a biostatistical prediction model is primarily knowledge-driven. A similar approach is possible with machine learning. But in addition, machine learning allows for analysis of unstructured datasets, which are e.g. derived from medical imaging and written texts in patient records. In contrast to biostatistics, the selection of variables with machine learning is mainly data-driven. Complex machine learning models are able to detect nonlinear patterns and interactions in data. However, this requires large datasets to prevent overfitting. For both machine learning and biostatistics, external validation of a developed model in a comparable setting is required to evaluate a model’s reproducibility. Machine learning models are not easily implemented in clinical practice, since they are recognized as black boxes (i.e. non-intuitive). For this purpose, research initiatives are ongoing within the field of explainable artificial intelligence. Finally, the application of machine learning for automated imaging analysis and development of clinical decision support systems is discussed.


2018 ◽  
Vol 1 (1) ◽  
pp. 53-68 ◽  
Author(s):  
Juan M. Banda ◽  
Martin Seneviratne ◽  
Tina Hernandez-Boussard ◽  
Nigam H. Shah

With the widespread adoption of electronic health records (EHRs), large repositories of structured and unstructured patient data are becoming available to conduct observational studies. Finding patients with specific conditions or outcomes, known as phenotyping, is one of the most fundamental research problems encountered when using these new EHR data. Phenotyping forms the basis of translational research, comparative effectiveness studies, clinical decision support, and population health analyses using routinely collected EHR data. We review the evolution of electronic phenotyping, from the early rule-based methods to the cutting edge of supervised and unsupervised machine learning models. We aim to cover the most influential papers in commensurate detail, with a focus on both methodology and implementation. Finally, future research directions are explored.


Sign in / Sign up

Export Citation Format

Share Document