Ascertaining Framingham heart failure phenotype from inpatient electronic health record data using natural language processing: a multicentre Atherosclerosis Risk in Communities (ARIC) validation study

BMJ Open ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. e047356
Author(s):  
Carlton R Moore ◽  
Saumya Jain ◽  
Stephanie Haas ◽  
Harish Yadav ◽  
Eric Whitsel ◽  
...  

Objectives: Using free-text clinical notes and reports from hospitalised patients, determine the performance of natural language processing (NLP) ascertainment of Framingham heart failure (HF) criteria and phenotype. Study design: A retrospective observational study design of patients hospitalised in 2015 from four hospitals participating in the Atherosclerosis Risk in Communities (ARIC) study was used to determine NLP performance in the ascertainment of Framingham HF criteria and phenotype. Setting: Four ARIC study hospitals, each representing an ARIC study region in the USA. Participants: A stratified random sample of hospitalisations identified using a broad range of International Classification of Diseases, ninth revision, diagnostic codes indicative of an HF event and occurring during 2015 was drawn for this study. A randomly selected set of 394 hospitalisations was used as the derivation dataset and 406 hospitalisations were used as the validation dataset. Intervention: Use of NLP on free-text clinical notes and reports to ascertain Framingham HF criteria and phenotype. Primary and secondary outcome measures: NLP performance as measured by sensitivity, specificity, positive-predictive value (PPV) and agreement in ascertainment of Framingham HF criteria and phenotype. Manual medical record review by trained ARIC abstractors was used as the reference standard. Results: Overall, performance of NLP ascertainment of Framingham HF phenotype in the validation dataset was good, with 78.8%, 81.7%, 84.4% and 80.0% for sensitivity, specificity, PPV and agreement, respectively. Conclusions: By decreasing the need for manual chart review, our results on the use of NLP to ascertain Framingham HF phenotype from free-text electronic health record data suggest that validated NLP technology holds the potential for significantly improving the feasibility and efficiency of conducting large-scale epidemiologic surveillance of HF prevalence and incidence.
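The four outcome measures named in this abstract can be computed from paired labels, one from the reference standard (manual review) and one from the NLP system, for each hospitalisation. The sketch below is illustrative only: the function name `diagnostic_metrics` and the toy labels are assumptions and are not taken from the study, and "agreement" is assumed here to mean simple percent agreement, which the abstract does not specify.

```python
from typing import Dict, Sequence


def diagnostic_metrics(reference: Sequence[bool],
                       predicted: Sequence[bool]) -> Dict[str, float]:
    """Compare predicted labels against a reference standard.

    reference: True where manual review found the phenotype.
    predicted: True where the automated method found the phenotype.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    pairs = [(bool(r), bool(p)) for r, p in zip(reference, predicted)]
    tp = sum(1 for r, p in pairs if r and p)          # true positives
    fn = sum(1 for r, p in pairs if r and not p)      # false negatives
    fp = sum(1 for r, p in pairs if not r and p)      # false positives
    tn = sum(1 for r, p in pairs if not r and not p)  # true negatives

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN rather than raising when a denominator is empty.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        # Assumed definition: proportion of cases where both labels match.
        "agreement": ratio(tp + tn, len(pairs)),
    }


# Toy labels for illustration only (not study data).
toy_reference = [True, True, True, False, False, False, True, False]
toy_predicted = [True, True, False, False, False, True, True, False]
toy_metrics = diagnostic_metrics(toy_reference, toy_predicted)
```

PPV depends on how common the phenotype is in the sample, so values obtained from a stratified sample such as the one described may not carry over unchanged to an unselected population.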

Circulation ◽  
2018 ◽  
Vol 137 (suppl_1) ◽  
Author(s):  
Brittany M Bogle ◽  
Wayne D Rosamond ◽  
Aaron R Folsom ◽  
Paul Sorlie ◽  
Elsayed Z Soliman ◽  
...  

Background: Accurate community surveillance of cardiovascular disease requires hospital record abstraction, which is typically a manual process. The costly and time-intensive nature of manual abstraction precludes its use on a regional or national scale in the US. Whether an efficient system can accurately reproduce traditional community surveillance methods by processing electronic health records (EHRs) has not been established. Objective: We sought to develop and test an EHR-based system to reproduce abstraction and classification procedures for acute myocardial infarction (MI) as defined by the Atherosclerosis Risk in Communities (ARIC) Study. Methods: Records from hospitalizations in 2014 within ARIC community surveillance areas were sampled using a broad set of ICD discharge codes likely to harbor MI. These records were manually abstracted by ARIC study personnel and used to classify MI according to ARIC protocols. We requested EHRs in a unified data structure for the same hospitalizations at 6 hospitals and built programs to convert free text and structured data into the ARIC criteria elements necessary for MI classification. Per ARIC protocol, MI was classified based on cardiac biomarkers, cardiac pain, and Minnesota-coded electrocardiogram abnormalities. We compared MI classified from manually abstracted data to (1) EHR-based classification and (2) final ICD-9 coded discharge diagnoses (410-414). Results: These preliminary results are based on hospitalizations from 1 hospital. Of 684 hospitalizations, 355 qualified for full manual abstraction; 83 (23%) of these were classified as definite MI and 78 (22%) as probable MI. Our EHR-based abstraction is sensitive (>75%) and highly specific (>83%) in classifying ARIC-defined definite MI and definite or probable MI (Table). 
Conclusions: Our results support the potential of a process to extract comprehensive sets of data elements from EHR from different hospitals, with completeness and accuracy sufficient for a standardized definition of hospitalized MI.


Diabetes ◽  
2020 ◽  
Vol 69 (Supplement 1) ◽  
pp. 1398-P
Author(s):  
MARY R. ROONEY ◽  
OLIVE TANG ◽  
B. GWEN WINDHAM ◽  
JUSTIN B. ECHOUFFO TCHEUGUI ◽  
PAMELA LUTSEY ◽  
...  

2020 ◽  
Vol 41 (S1) ◽  
pp. s39-s39
Author(s):  
Pontus Naucler ◽  
Suzanne D. van der Werff ◽  
John Valik ◽  
Logan Ward ◽  
Anders Ternhag ◽  
...  

Background: Healthcare-associated infection (HAI) surveillance is essential for most infection prevention programs and continuous epidemiological data can be used to inform healthcare personnel, allocate resources, and evaluate interventions to prevent HAIs. Many HAI surveillance systems today are based on time-consuming and resource-intensive manual reviews of patient records. The objective of HAI-proactive, a Swedish triple-helix innovation project, is to develop and implement a fully automated HAI surveillance system based on electronic health record data. Furthermore, the project aims to develop machine-learning–based screening algorithms for early prediction of HAI at the individual patient level. Methods: The project is performed with support from Sweden’s Innovation Agency in collaboration among academic, health, and industry partners. Development of rule-based and machine-learning algorithms is performed within a research database, which consists of all electronic health record data from patients admitted to the Karolinska University Hospital. Natural language processing is used for processing free-text medical notes. To validate algorithm performance, manual annotation was performed based on international HAI definitions from the European Centre for Disease Prevention and Control, Centers for Disease Control and Prevention, and Sepsis-3 criteria. Currently, the project is building a platform for real-time data access to implement the algorithms within Region Stockholm. Results: The project has developed a rule-based surveillance algorithm for sepsis that continuously monitors patients admitted to the hospital, with a sensitivity of 0.89 (95% CI, 0.85–0.93), a specificity of 0.99 (0.98–0.99), a positive predictive value of 0.88 (0.83–0.93), and a negative predictive value of 0.99 (0.98–0.99). 
The healthcare-associated urinary tract infection surveillance algorithm, which is based on free-text analysis and negations to define symptoms, had a sensitivity of 0.73 (0.66–0.80) and a positive predictive value of 0.68 (0.61–0.75). The sensitivity and positive predictive value of an algorithm based on significant bacterial growth in urine culture only were 0.99 (0.97–1.00) and 0.39 (0.34–0.44), respectively. The surveillance system detected differences in incidences between hospital wards and over time. Development of surveillance algorithms for pneumonia, catheter-related infections and Clostridioides difficile infections, as well as machine-learning–based models for early prediction, is ongoing. We intend to present results from all algorithms. Conclusions: With access to electronic health record data, we have shown that it is feasible to develop a fully automated HAI surveillance system based on algorithms using both structured data and free text for the main healthcare-associated infections. Funding: Sweden’s Innovation Agency and Stockholm County Council. Disclosures: None.
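The abstract above mentions free-text analysis with negations to define symptoms but gives no implementation details. As a rough illustration of the general idea only, the sketch below flags a symptom term as affirmed unless a negation cue appears within a few words before it in the same sentence. The term list, the cue list, the window size and the function name `affirmed_symptoms` are all hypothetical, the example text is English rather than clinical notes from the project, and this is not the project's algorithm.

```python
import re
from typing import Set

# Hypothetical vocabularies, chosen only for illustration.
SYMPTOM_TERMS = ["dysuria", "urinary frequency", "fever", "flank pain"]
NEGATION_CUES = ["no", "not", "denies", "without", "negative for"]


def affirmed_symptoms(note: str, window: int = 5) -> Set[str]:
    """Return symptom terms mentioned without a preceding negation cue.

    A mention counts as negated when any cue occurs within `window`
    words before it in the same sentence.
    """
    found: Set[str] = set()
    # Split on simple sentence boundaries so a cue cannot negate
    # a symptom mentioned in a different sentence.
    for sentence in re.split(r"[.;\n]+", note.lower()):
        tokens = re.findall(r"[a-z]+", sentence)
        for term in SYMPTOM_TERMS:
            term_tokens = term.split()
            n = len(term_tokens)
            for i in range(len(tokens) - n + 1):
                if tokens[i:i + n] != term_tokens:
                    continue
                preceding = " ".join(tokens[max(0, i - window):i])
                negated = any(
                    re.search(rf"\b{re.escape(cue)}\b", preceding)
                    for cue in NEGATION_CUES
                )
                if not negated:
                    found.add(term)
    return found


# Made-up example note (not patient data).
example_note = "Patient reports dysuria and fever. Denies flank pain; no urinary frequency."
example_result = affirmed_symptoms(example_note)
```

A fixed look-back window is a deliberately crude choice: it misses negations that follow the symptom and can wrongly negate a symptom listed after an unrelated negated one.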


Neurology ◽  
2011 ◽  
Vol 78 (2) ◽  
pp. 102-108 ◽  
Author(s):  
D. C. Bezerra ◽  
A. R. Sharrett ◽  
K. Matsushita ◽  
R. F. Gottesman ◽  
D. Shibata ◽  
...  
