Predicting Inpatient Falls Using Natural Language Processing of Nursing Records Obtained From Japanese Electronic Medical Records: Case-Control Study (Preprint)

2019 ◽  
Author(s):  
Hayao Nakatani ◽  
Masatoshi Nakao ◽  
Hidefumi Uchiyama ◽  
Hiroyoshi Toyoshiba ◽  
Chikayuki Ochiai

BACKGROUND: Falls in hospitals are the most common risk factor that affects the safety of inpatients and can result in severe harm. Therefore, preventing falls is one of the most important areas of risk management for health care organizations. However, existing methods for predicting falls are laborious and costly.
OBJECTIVE: The objective of this study is to verify whether hospital inpatient falls can be predicted through the analysis of a single input, unstructured nursing records obtained from Japanese electronic medical records (EMRs), using a natural language processing (NLP) algorithm and machine learning.
METHODS: The nursing records of 335 fallers and 408 nonfallers for a 12-month period were extracted from the EMRs of an acute care hospital and randomly divided into a learning data set and test data set. The former data set was subjected to NLP and machine learning to extract morphemes that contributed to separating fallers from nonfallers to construct a model for predicting falls. Then, the latter data set was used to determine the predictive value of the model using receiver operating characteristic (ROC) analysis.
RESULTS: The prediction of falls using the test data set showed high accuracy, with an area under the ROC curve, sensitivity, specificity, and odds ratio of mean 0.834 (SD 0.005), mean 0.769 (SD 0.013), mean 0.785 (SD 0.020), and mean 12.27 (SD 1.11) for five independent experiments, respectively. The morphemes incorporated into the final model included many words closely related to known risk factors for falls, such as the use of psychotropic drugs, state of consciousness, and mobility, thereby demonstrating that an NLP algorithm combined with machine learning can effectively extract risk factors for falls from nursing records.
CONCLUSIONS: We successfully established that falls among hospital inpatients can be predicted by analyzing nursing records using an NLP algorithm and machine learning. Therefore, it may be possible to develop a fall risk monitoring system that analyzes nursing records daily and alerts health care professionals when the fall risk of an inpatient is increased.
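The evaluation metrics named in this abstract (area under the ROC curve, sensitivity, specificity, and odds ratio) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the 0/1 label encoding, and the score threshold are assumptions made for illustration, and the AUC is computed with the standard pairwise-rank definition rather than any method stated in the abstract.

```python
from typing import Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), treating score >= threshold as a predicted faller."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = score >= threshold
        if label and predicted:
            tp += 1
        elif label and not predicted:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of actual fallers that the model flags."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of actual nonfallers that the model clears."""
    return tn / (tn + fp)


def diagnostic_odds_ratio(sens: float, spec: float) -> float:
    """Odds of a positive prediction among fallers over the odds among nonfallers."""
    return (sens / (1.0 - sens)) * (spec / (1.0 - spec))


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a random faller outscores a random nonfaller (ties count half)."""
    pos = [s for y, s in zip(y_true, y_score) if y]
    neg = [s for y, s in zip(y_true, y_score) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Calling `diagnostic_odds_ratio(0.769, 0.785)` with the reported mean sensitivity and specificity gives about 12.15, close to the reported mean odds ratio of 12.27. The small gap is presumably because the abstract averages the odds ratios of five experiments rather than computing one ratio from the averaged rates.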

10.2196/16970 ◽  
2020 ◽  
Vol 8 (4) ◽  
pp. e16970 ◽  
Author(s):  
Hayao Nakatani ◽  
Masatoshi Nakao ◽  
Hidefumi Uchiyama ◽  
Hiroyoshi Toyoshiba ◽  
Chikayuki Ochiai



2015 ◽  
Vol 23 (2) ◽  
pp. 276-282 ◽  
Author(s):  
Allison Marier ◽  
Lauren E.W. Olsho ◽  
William Rhodes ◽  
William D. Spector

Abstract
Objective: Falls are physically and financially costly, but may be preventable with targeted intervention. The Minimum Data Set (MDS) is one potential source of information on fall risk factors among nursing home residents, but its limited breadth and relatively infrequent updates may limit its practical utility. Richer, more frequently updated data from electronic medical records (EMRs) may improve ability to identify individuals at highest risk for falls.
Methods: The authors applied a repeated events survival model to analyze MDS 3.0 and EMR data for 5129 residents in 13 nursing homes within a single large California chain that uses a centralized EMR system from a leading vendor. Estimated regression parameters were used to project resident fall probability. The authors examined the proportion of observed falls within each projected fall risk decile to assess improvements in predictive power from including EMR data.
Results: In a model incorporating fall risk factors from the MDS only, 28.6% of observed falls occurred among residents in the highest projected risk decile. In an alternative specification incorporating more frequently updated measures for the same risk factors from the EMR data, 32.3% of observed falls occurred among residents in the highest projected risk decile, a 13% increase over the base MDS-only specification.
Conclusions: Incorporating EMR data improves ability to identify those at highest risk for falls relative to prediction using MDS data alone. These improvements stem chiefly from the greater frequency with which EMR data are updated, with minimal additional gains from availability of additional risk factor variables.
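The decile-based comparison in this abstract can be sketched in Python as follows. This is an illustration only: the function names and input layout are hypothetical, and the projected probabilities are taken as given rather than estimated with the repeated events survival model the authors describe.

```python
from typing import Sequence


def share_of_falls_in_top_decile(
    projected_risk: Sequence[float], observed_falls: Sequence[int]
) -> float:
    """Fraction of all observed falls among the 10% of residents with highest projected risk."""
    if len(projected_risk) != len(observed_falls):
        raise ValueError("inputs must have one entry per resident")
    # Rank residents from highest to lowest projected fall probability.
    order = sorted(
        range(len(projected_risk)), key=lambda i: projected_risk[i], reverse=True
    )
    top_n = max(1, len(order) // 10)
    total_falls = sum(observed_falls)
    if total_falls == 0:
        return 0.0
    return sum(observed_falls[i] for i in order[:top_n]) / total_falls


def relative_increase(base: float, new: float) -> float:
    """Relative change from base to new, as a fraction of base."""
    return (new - base) / base
```

With the figures from the abstract, `relative_increase(28.6, 32.3)` is roughly 0.129, which matches the reported 13% increase: a relative gain over the MDS-only share, not a gain of 13 percentage points.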


2019 ◽  
Vol 3 (Supplement_1) ◽  
pp. S480-S480
Author(s):  
Robert Lucero ◽  
Ragnhildur Bjarnadottir

Abstract Two hundred and fifty thousand older adults die annually in United States hospitals because of iatrogenic conditions (ICs). Clinicians, aging experts, patient advocates and federal policy makers agree that there is a need to enhance the safety of hospitalized older adults through improved identification and prevention of ICs. To this end, we are building a research program with the goal of enhancing the safety of hospitalized older adults by reducing ICs through an effective learning health system. Leveraging unique electronic data and healthcare system and human resources at the University of Florida, we are applying a state-of-the-art practice-based data science approach to identify risk factors of ICs (e.g., falls) from structured (i.e., nursing, clinical, administrative) and unstructured or text (i.e., registered nurse’s progress notes) data. Our interdisciplinary academic-clinical partnership includes scientific and clinical experts in patient safety, care quality, health outcomes, nursing and health informatics, natural language processing, data science, aging, standardized terminology, clinical decision support, statistics, machine learning, and hospital operations. Results to date have uncovered previously unknown fall risk factors within nursing (i.e., physical therapy initiation), clinical (i.e., number of fall risk increasing drugs, hemoglobin level), and administrative (i.e., Charlson Comorbidity Index, nurse skill mix, and registered nurse staffing ratio) structured data as well as patient cognitive, environmental, workflow, and communication factors in text data. The application of data science methods (i.e., machine learning and text-mining) and findings from this research will be used to develop text-mining pipelines to support sustained data-driven interdisciplinary aging studies to reduce ICs.


10.2196/29120 ◽  
2021 ◽  
Vol 9 (11) ◽  
pp. e29120
Author(s):  
Bruna Stella Zanotto ◽  
Ana Paula Beck da Silva Etges ◽  
Avner dal Bosco ◽  
Eduardo Gabriel Cortes ◽  
Renata Ruschel ◽  
...  

Background: With the rapid adoption of electronic medical records (EMRs), there is an ever-increasing opportunity to collect data and extract knowledge from EMRs to support patient-centered stroke management.
Objective: This study aims to compare the effectiveness of state-of-the-art automatic text classification methods in classifying data to support the prediction of clinical patient outcomes and the extraction of patient characteristics from EMRs.
Methods: Our study addressed the computational problems of information extraction and automatic text classification. We identified essential tasks to be considered in an ischemic stroke value-based program. The 30 selected tasks were classified (manually labeled by specialists) according to the following value agenda: tier 1 (achieved health care status), tier 2 (recovery process), care related (clinical management and risk scores), and baseline characteristics. The analyzed data set was retrospectively extracted from the EMRs of patients with stroke from a private Brazilian hospital between 2018 and 2019. A total of 44,206 sentences from free-text medical records in Portuguese were used to train and develop 10 supervised computational machine learning methods, including state-of-the-art neural and nonneural methods, along with ontological rules. As an experimental protocol, we used a 5-fold cross-validation procedure repeated 6 times, along with subject-wise sampling. A heatmap was used to display comparative result analyses according to the best algorithmic effectiveness (F1 score), supported by statistical significance tests. A feature importance analysis was conducted to provide insights into the results.
Results: The top-performing models were support vector machines trained with lexical and semantic textual features, showing the importance of dealing with noise in EMR textual representations. The support vector machine models produced statistically superior results in 71% (17/24) of tasks, with an F1 score >80% regarding care-related tasks (patient treatment location, fall risk, thrombolytic therapy, and pressure ulcer risk), the process of recovery (ability to feed orally or ambulate and communicate), health care status achieved (mortality), and baseline characteristics (diabetes, obesity, dyslipidemia, and smoking status). Neural methods were largely outperformed by more traditional nonneural methods, given the characteristics of the data set. Ontological rules were also effective in tasks such as baseline characteristics (alcoholism, atrial fibrillation, and coronary artery disease) and the Rankin scale. The complementarity in effectiveness among models suggests that a combination of models could enhance the results and cover more tasks in the future.
Conclusions: Advances in information technology capacity are essential for scalability and agility in measuring health status outcomes. This study allowed us to measure effectiveness and identify opportunities for automating the classification of outcomes of specific tasks related to clinical conditions of stroke victims, and thus ultimately assess the possibility of proactively using these machine learning techniques in real-world situations.
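The evaluation protocol described here (5-fold cross-validation repeated 6 times with subject-wise sampling, scored by F1) can be sketched in Python. The abstract does not give the exact splitting procedure, so this is one plausible reading in which every sentence from a given patient stays on the same side of each split. The function names, the seed, and the round-robin fold assignment are assumptions.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, Iterator, List, Sequence, Tuple


def subject_wise_folds(
    subject_ids: Sequence[Hashable], k: int = 5, repeats: int = 6, seed: int = 0
) -> Iterator[Tuple[int, int, List[int], List[int]]]:
    """Yield (repeat, fold, train_idx, test_idx); no subject appears in both index lists."""
    by_subject: Dict[Hashable, List[int]] = defaultdict(list)
    for idx, sid in enumerate(subject_ids):
        by_subject[sid].append(idx)
    subjects = sorted(by_subject)  # assumes subject ids are mutually comparable
    rng = random.Random(seed)
    for rep in range(repeats):
        shuffled = subjects[:]
        rng.shuffle(shuffled)  # a fresh subject ordering for every repeat
        for fold in range(k):
            held_out = shuffled[fold::k]  # round-robin assignment of subjects to folds
            held_out_set = set(held_out)
            test_idx = [i for s in held_out for i in by_subject[s]]
            train_idx = [
                i for s in shuffled if s not in held_out_set for i in by_subject[s]
            ]
            yield rep, fold, train_idx, test_idx


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, written in terms of raw counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0
```

Splitting by subject rather than by sentence avoids placing sentences from one patient's record in both training and test data, which would tend to inflate the measured F1.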


Author(s):  
Mohamed Abdalla ◽  
Hong Lu ◽  
Bogdan Pinzaru ◽  
Liisa Jaakkimainen

Introduction: Reliable information about the time spent waiting for health care services is a critical metric for measuring health system performance. Wait times are a useful measure of access to various health care sectors. Alongside the increased adoption of electronic medical records (EMRs) by Canadian family physicians (FPs) has come the secondary use of FP EMR data for research. However, FP EMR data can be challenging to use because much of it is in an unstructured, free-text format.
Objectives and Approach: Our objective was to identify the target specialist physician type from the FP EMR referral note and then calculate wait times from an FP referral to a specialist physician visit. We used FP EMR data linked to Ontario, Canada health administrative data (called EMRPC). EMRPC collects the entire clinical record from patients, including the content of FP referral notes. We used machine learning (ML) methods to identify the type of specialist physician for which the referral was intended. Labels to test the ML methods were created from physicians’ claims data. Wait times were calculated from the FP EMR referral note date to the specialist physician claim date in administrative data.
Results: Our ML models’ ability to classify 2016 FP EMR referral notes to selected medical and surgical specialists achieved sensitivity and positive predictive values ranging from the high 70s to low 80s. Compared to earlier analyses from 2008, we observed a similar relative ordering of wait times to see specific specialist physicians. Overall, the median wait times have increased by 14 days on average, with a maximum increase of 28 days to see a gastroenterologist.
Conclusion / Implications: The accuracy of ML on unstructured FP EMR data is high, thereby providing a mechanism for “codifying” information in a timely manner. This information can help inform decision makers and providers about which patients or FP practices are experiencing long wait times in seeing specialist physicians.
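The wait-time calculation described here (days from the referral note date to the specialist claim date, summarized as a median per specialist type) can be sketched in Python. The record layout, the function names, and the choice to skip claims dated before the referral are assumptions for illustration; the abstract does not describe how such cases were handled.

```python
from datetime import date
from statistics import median
from typing import Dict, Iterable, List, Tuple

# Hypothetical record layout: (specialist type, referral note date, specialist claim date)
Referral = Tuple[str, date, date]


def wait_days(referral_date: date, claim_date: date) -> int:
    """Days elapsed between the referral note and the specialist visit claim."""
    return (claim_date - referral_date).days


def median_wait_by_specialty(records: Iterable[Referral]) -> Dict[str, float]:
    """Median wait in days for each specialist type."""
    waits: Dict[str, List[int]] = {}
    for specialty, referred, seen in records:
        days = wait_days(referred, seen)
        if days < 0:
            continue  # assumption: a claim dated before the referral is not the matching visit
        waits.setdefault(specialty, []).append(days)
    return {specialty: median(days) for specialty, days in waits.items()}
```

In this sketch the specialist type would come from the ML classifier applied to the referral note, and the claim date from the linked administrative data.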

