Machine Learning to Monitor Diagnostic Safety Risks in Emergency Departments: A Study Protocol (Preprint)

2020 ◽  
Author(s):  
Moein Enayati ◽  
Mustafa Sir ◽  
Xingyu Zhang ◽  
Sarah Parker ◽  
Elizabeth Duffy ◽  
...  

BACKGROUND Diagnostic decision-making, especially in emergency departments (EDs), is a highly complex cognitive process involving uncertainty and susceptibility to error. A combination of parameters including patient factors (e.g., history, behaviors, complexity, and comorbidity), provider/care-team factors (e.g., cognitive load, information gathering, and synthesis), and system factors (e.g., health information technology, crowding, shift-based work, and interruptions) may contribute to diagnostic errors. Records with potential diagnostic errors have been identified using electronic triggers, which flag certain patterns of care such as the escalation of care or death after ED discharge. Sophisticated data analytics and machine learning techniques applied to existing electronic health record (EHR) datasets could shed light on potential risk factors influencing diagnostic decision-making. OBJECTIVE To identify variables contributing to potential diagnostic errors in the ED using large-scale EHR data. METHODS We will apply trigger algorithms to EHR data repositories to generate a large dataset of trigger-positive and trigger-negative encounters. Samples from both sets will be validated using medical record reviews, in which we expect to find a higher number of diagnostic safety problems in the trigger-positive subset. Advanced data mining and machine learning techniques will be used to evaluate relationships between certain patient, provider/care-team, and system risk factors and diagnostic safety signals in statistically matched groups of trigger-positive and trigger-negative charts. RESULTS This study received funding in February 2019 and has been approved by the Institutional Review Board at two health systems. Trigger queries are being developed at both organizations, and sample cohorts are being labeled using the triggers. Once completed, study data can inform important parameters for future clinical decision support systems to help identify risks that contribute to diagnostic errors. CONCLUSIONS Using large datasets to investigate risk factors (patient, provider/care-team, and system-level) in the diagnostic process can provide mechanisms for future monitoring of diagnostic safety.
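The trigger step described in the Methods can be pictured with a minimal sketch. The protocol does not specify record layouts or time windows, so the `Encounter` fields, the function names, and the 7-day window below are all hypothetical and serve only to illustrate how an escalation-of-care or death-after-discharge rule might split encounters into trigger-positive and trigger-negative sets.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional, Tuple


@dataclass
class Encounter:
    # Hypothetical, simplified ED encounter record; real EHR schemas differ.
    encounter_id: str
    discharge_date: date
    readmit_date: Optional[date] = None  # unplanned admission after ED discharge
    death_date: Optional[date] = None


def is_trigger_positive(enc: Encounter, window_days: int = 7) -> bool:
    """True if care escalated or the patient died within the window after discharge.

    The 7-day default is an assumption for illustration, not a value from the protocol.
    """
    cutoff = enc.discharge_date + timedelta(days=window_days)
    for event in (enc.readmit_date, enc.death_date):
        if event is not None and enc.discharge_date <= event <= cutoff:
            return True
    return False


def split_by_trigger(
    encounters: List[Encounter], window_days: int = 7
) -> Tuple[List[Encounter], List[Encounter]]:
    """Partition encounters into (trigger-positive, trigger-negative) lists."""
    positive = [e for e in encounters if is_trigger_positive(e, window_days)]
    negative = [e for e in encounters if not is_trigger_positive(e, window_days)]
    return positive, negative
```

In a study of this kind, samples from both lists would then go to manual chart review, since a trigger only flags a pattern of care and does not by itself establish a diagnostic error.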

2020 ◽  
Vol 79 (Suppl 1) ◽  
pp. 897.2-897
Author(s):  
M. Maurits ◽  
T. Huizinga ◽  
M. Reinders ◽  
S. Raychaudhuri ◽  
E. Karlson ◽  
...  

Background: Heterogeneity in disease populations complicates discovery of risk factors. To identify risk factors for subpopulations of diseases, we need analytical methods that can deal with unidentified disease subgroups.
Objectives: Inspired by successful approaches from the Big Data field, we developed a high-throughput approach to identify subpopulations within patients with heterogeneous, complex diseases using the wealth of information available in Electronic Medical Records (EMRs).
Methods: We extracted longitudinal healthcare-interaction records, coded by 1,853 PheCodes[1], for the 64,819 patients in Boston's Partners Biobank. Through dimensionality reduction using t-SNE[2] we created a 2D embedding of 32,424 of these patients (set A). We then identified distinct clusters in the t-SNE embedding using DBSCAN[3] and visualized the relative importance of individual PheCodes within them using specialized spectrographs. We replicated this procedure in the remaining 32,395 records (set B).
Results: Summary statistics of both sets were comparable (Table 1).

Table 1. Summary statistics of the total Partners Biobank dataset and the 2 partitions.

                    Set A        Set B        Total
Entries             12,200,311   12,177,131   24,377,442
Patients            32,424       32,395       64,819
Patient-years       369,546.33   368,597.92   738,144.2
Unique ICD codes    25,056       24,953       26,305
Unique PheCodes     1,851        1,853        1,853

We found 284 clusters in set A and 295 in set B; 63.4% of the clusters from set A could be mapped to a cluster in set B, with a median (range) correlation of 0.24 (0.03 – 0.58). Clusters represented similar yet distinct clinical phenotypes; e.g., patients diagnosed with “other headache syndrome” were separated into four distinct clusters characterized by migraines, neurofibromatosis, epilepsy or brain cancer, all resulting in patients presenting with headaches (Fig. 1 & 2).
Though EMR databases tend to be noisy, our method was also able to differentiate misclassification from true cases; SLE patients with RA codes clustered separately from true RA cases.

Figure 1. Two-dimensional representation of Set A generated using dimensionality reduction (t-SNE) and clustering (DBSCAN).
Figure 2. Phenotype Spectrographs (PheSpecs) of four clusters characterized by “Other headache syndromes”, driven by codes relating to migraine, epilepsy, neurofibromatosis or brain cancer.

Conclusion: We have shown that EMR data can be used to identify and visualize latent structure in patient categorizations, using an approach based on dimension reduction and clustering machine learning techniques. Our method can identify misclassified patients as well as separate patients with similar problems into subsets with different associated medical problems. Our approach adds a new and powerful tool to aid in the discovery of novel risk factors in complex, heterogeneous diseases.

References:
[1] Denny, J.C. et al. Bioinformatics (2010)
[2] van der Maaten et al. Journal of Machine Learning Research (2008)
[3] Ester, M. et al. Proceedings of the Second International Conference on Knowledge Discovery and Data Mining (1996)

Disclosure of Interests: Marc Maurits: None declared, Thomas Huizinga Grant/research support from: Ablynx, Bristol-Myers Squibb, Roche, Sanofi, Consultant of: Ablynx, Bristol-Myers Squibb, Roche, Sanofi, Marcel Reinders: None declared, Soumya Raychaudhuri: None declared, Elizabeth Karlson: None declared, Erik van den Akker: None declared, Rachel Knevel: None declared
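To make the clustering step concrete, here is a minimal pure-Python sketch of density-based clustering in the style of DBSCAN, applied to points that are assumed to be already embedded in 2D. The abstract does not report the neighborhood radius or minimum cluster size, so `eps` and `min_pts` are illustrative assumptions, and the t-SNE step itself is omitted.

```python
from math import dist  # Euclidean distance, Python 3.8+


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


# Toy 2D coordinates standing in for an embedding; not data from the study.
embedding = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
             (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
             (10.0, 0.0)]
labels = dbscan(embedding, eps=0.5, min_pts=3)
```

The brute-force neighbor search is quadratic in the number of points, which is fine for a sketch but would need a spatial index at the scale of tens of thousands of patients.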


2018 ◽  
Vol 31 (3) ◽  
pp. 429-435 ◽  
Author(s):  
Kathryn Rendell ◽  
Irena Koprinska ◽  
Andre Kyme ◽  
Anja A Ebker‐White ◽  
Michael M Dinh

10.2196/16047 ◽  
2019 ◽  
Vol 8 (11) ◽  
pp. e16047 ◽  
Author(s):  
Don Roosan ◽  
Anandi V Law ◽  
Mazharul Karim ◽  
Moom Roosan

Background According to the September 2015 Institute of Medicine report, Improving Diagnosis in Health Care, each of us is likely to experience one diagnostic error in our lifetime, often with devastating consequences. Traditionally, diagnostic decision making has been the sole responsibility of an individual clinician. However, diagnosis involves an interaction among interprofessional team members with different training, skills, cultures, knowledge, and backgrounds. Moreover, diagnostic error is prevalent in interruption-prone environments such as the emergency department, where the loss of information may hinder a correct diagnosis. Objective The overall purpose of this protocol is to improve team-based diagnostic decision making by focusing on data analytics and informatics tools that improve collective information management. Methods To achieve this goal, we will identify the factors contributing to failures in team-based diagnostic decision making (aim 1), understand the barriers to using current health information technology tools for team collaboration (aim 2), and develop and evaluate a collaborative decision-making prototype that can improve team-based diagnostic decision making (aim 3). Results Data for this study are being collected between 2019 and 2020. The results are anticipated to be published between 2020 and 2021. Conclusions The results from this study can shed light on improving diagnostic decision making by incorporating diagnostic rationale from team members. We believe that involving all team members and using informatics is a positive way forward in addressing diagnostic errors. International Registered Report Identifier (IRRID) DERR1-10.2196/16047


2016 ◽  
Author(s):  
Ευτύχιος Πρωτοπαπαδάκης

The term semi-supervised learning refers to a broad family of machine learning techniques that use unlabeled data to extract additional useful information. Semi-supervision addresses problems related to processing and exploiting large volumes of data and the costs associated with them (e.g., processing time, human error). The ultimate goal is the safe derivation of conclusions, rules, or recommendations. Decision-making models that use semi-supervised learning techniques have several advantages. Initially, they need only a small number of labeled data points for their initialization. New data are then exploited as they arrive and modify the model accordingly. As a result, we obtain a continuously evolving decision-making model with the least possible effort. Techniques that adapt easily and cheaply are particularly well suited to monitoring systems whose mode of operation changes frequently. Indicative application areas for flexible, semi-supervised decision support systems are: production line monitoring, maritime border surveillance, elderly care, financial risk assessment, inspection for structural defects, and the preservation of cultural heritage.
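One common way to realize this idea is self-training: a model is initialized from a few labeled examples and then repeatedly absorbs the unlabeled examples it can label with confidence. The sketch below is an illustration only, not the method of the thesis; the one-dimensional nearest-centroid classifier, the `margin` confidence rule, and the "ok"/"fault" labels are all assumptions chosen to keep the example small.

```python
def centroids(labeled):
    """Mean feature value per class from (value, label) pairs."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def self_train(labeled, unlabeled, margin=1.0, max_rounds=10):
    """Grow the labeled set with confidently pseudo-labeled points.

    Returns the final class centroids and the points left unlabeled.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        cents = centroids(labeled)
        confident, remaining = [], []
        for x in pool:
            # Distance to each class centroid, nearest first.
            ranked = sorted((abs(x - c), y) for y, c in cents.items())
            # Confident when the nearest centroid is clearly closer than the runner-up.
            if len(ranked) == 1 or ranked[1][0] - ranked[0][0] >= margin:
                confident.append((x, ranked[0][1]))
            else:
                remaining.append(x)
        if not confident:
            break  # nothing new to learn from in this round
        labeled.extend(confident)
        pool = remaining
    return centroids(labeled), pool


# Toy sensor readings (made up): two labeled examples, five unlabeled ones.
seed = [(0.0, "ok"), (10.0, "fault")]
readings = [1.0, 2.0, 9.0, 8.0, 5.0]
model, undecided = self_train(seed, readings)
```

Points that never clear the confidence margin stay in `undecided`, which is where a human annotator's limited effort would be best spent.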


2021 ◽  
Vol 11 (2) ◽  
pp. 38-52
Author(s):  
Abhinav Juneja ◽  
Sapna Juneja ◽  
Sehajpreet Kaur ◽  
Vivek Kumar

Diabetes has become one of the most common health issues in people of all age groups. The disease causes many lifestyle difficulties and is characterized by elevated blood glucose (hyperglycemia). If left untreated, diabetes can raise the chance of heart attack, diabetic nephropathy, and other disorders. Early diagnosis of diabetes helps to maintain a healthy lifestyle. Machine learning is the capability of a machine to learn from past patterns and occurrences, improving with experience in order to optimise and make decisions. In the current research, the authors have employed machine learning techniques and used a multi-criteria decision-making approach on the Pima Indian diabetes dataset. To classify the patients, they examined several different supervised and unsupervised predictive models. After detailed analysis, it has been observed that the supervised learning algorithms outperform the unsupervised algorithms because the output class is nominal (categorical).


2021 ◽  
Author(s):  
Serkan Varol ◽  
Serkan Catma ◽  
Diana Reindl ◽  
Elizabeth Serieux

BACKGROUND Vaccine refusal still poses a risk to reaching herd immunity in the United States. The existing literature focuses on identifying predictors of the willingness to accept (WTA) vaccines using survey data. These variables range from the sociodemographic characteristics of the participants to their perceptions of and attitudes towards the vaccines, so that each variable’s statistical relationship with the WTA can be investigated. However, while the results of these studies may have important implications for understanding vaccine hesitancy by offering interpretation of the statistical relationships, the prediction of vaccine decision-making has rarely been investigated. OBJECTIVE We aimed to identify the factors that contribute to the prediction of COVID-19 vaccine acceptors and refusers using machine learning. METHODS A nationwide survey was administered online in November 2020 to assess American public perceptions of and attitudes towards COVID-19 vaccines. Seven machine learning techniques were utilized to identify the model with the highest predictive power. Moreover, the set of variables contributing the most to the predictions of vaccine acceptors and refusers was identified using Gini importance based on the Random Forest structure. RESULTS The resulting machine learning algorithm predicts willingness to accept (82%) better than willingness to reject (51%) a COVID-19 vaccine. In terms of predictive success, the Random Forest model outperformed the other machine learning techniques with a 69.52% accuracy rate. Worry about (re)contracting COVID-19 and opinions regarding mandatory face covering were identified as the most important predictors of vaccine decision-making. CONCLUSIONS The complexity of vaccine hesitancy needs to be investigated thoroughly before the threshold needed to reach population immunity can be achieved. Predictive analytics can help public health officials design and deliver individually tailored vaccination programs that would increase overall vaccine uptake.
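Gini importance in a random forest is built from the drop in Gini impurity that each split achieves, accumulated per variable across the trees. The sketch below shows only that building block for a single split. The labels are made-up toy data, not survey results, and the function names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def impurity_decrease(parent, left, right):
    """Weighted drop in Gini impurity when `parent` is split into `left` and `right`."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children


# Toy node: 10 hypothetical respondents split on one yes/no survey item.
parent = ["accept"] * 6 + ["refuse"] * 4
left = ["accept"] * 5 + ["refuse"] * 1   # answered "yes"
right = ["accept"] * 1 + ["refuse"] * 3  # answered "no"
decrease = impurity_decrease(parent, left, right)
```

A variable that produces large decreases at many nodes ends up with a high importance score, which is how a ranking of predictors like the one reported in the Results can be obtained.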


2020 ◽  
Vol 20 (1) ◽  
Author(s):  
Georgios Kantidakis ◽  
Hein Putter ◽  
Carlo Lancia ◽  
Jacob de Boer ◽  
Andries E. Braat ◽  
...  

Abstract Background Predicting survival of recipients after liver transplantation is regarded as one of the most important challenges in contemporary medicine. Hence, improving on current prediction models is of great interest. There is currently a strong discussion in the medical field about machine learning (ML) and whether it has greater potential than traditional regression models when dealing with complex data. Criticism of ML relates to unsuitable performance measures and a lack of interpretability, which is important for clinicians. Methods In this paper, ML techniques such as random forests and neural networks are applied to a large dataset of 62,294 patients from the United States, with 97 predictors selected on clinical/statistical grounds from more than 600, to predict survival from transplantation. Of particular interest is also the identification of potential risk factors. A comparison is performed between 3 different Cox models (with all variables, backward selection and LASSO) and 3 machine learning techniques: a random survival forest and 2 partial logistic artificial neural networks (PLANNs). For PLANNs, novel extensions to their original specification are tested. Emphasis is placed on the advantages and pitfalls of each method and on the interpretability of the ML techniques. Results Well-established predictive measures from the survival field are employed (C-index, Brier score and Integrated Brier Score) and the strongest prognostic factors are identified for each model. The clinical endpoint is overall graft survival, defined as the time between transplantation and the date of graft failure or death. The random survival forest shows slightly better predictive performance than Cox models based on the C-index. Neural networks show better performance than both Cox models and random survival forest based on the Integrated Brier Score at 10 years. Conclusion In this work, it is shown that machine learning techniques can be a useful tool for both prediction and interpretation in the survival context. Of the ML techniques examined here, PLANN with 1 hidden layer predicts survival probabilities the most accurately, being as well calibrated as the Cox model with all variables. Trial registration Retrospective data were provided by the Scientific Registry of Transplant Recipients under Data Use Agreement number 9477 for analysis of risk factors after liver transplantation.
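As an illustration of one of the measures named in the Results, the following sketch computes Harrell's concordance index (C-index) for right-censored data. It is a plain quadratic-time version written for clarity, not the implementation used in the paper, and the follow-up times, event flags, and risk scores are toy values. The Brier score and Integrated Brier Score are not covered here because they require additional handling of censoring.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs in which the higher-risk subject fails first.

    times       -- follow-up time per subject
    events      -- 1 if graft failure/death was observed, 0 if censored
    risk_scores -- model output where a higher value means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i has an observed event
            # strictly before subject j's event or censoring time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: four subjects, the third one censored at time 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.4, 0.5, 0.1]
c_index = concordance_index(times, events, risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why the C-index measures discrimination only and says nothing about calibration.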

