Ascertaining Framingham heart failure phenotype from inpatient electronic health record data using natural language processing: a multicentre Atherosclerosis Risk in Communities (ARIC) validation study

Objectives: Using free-text clinical notes and reports from hospitalised patients, determine the performance of natural language processing (NLP) ascertainment of Framingham heart failure (HF) criteria and phenotype. Study design: A retrospective observational study of patients hospitalised in 2015 at four hospitals participating in the Atherosclerosis Risk in Communities (ARIC) study was used to determine NLP performance in the ascertainment of Framingham HF criteria and phenotype. Setting: Four ARIC study hospitals, each representing an ARIC study region in the USA. Participants: A stratified random sample of hospitalisations identified using a broad range of International Classification of Diseases, Ninth Revision, diagnostic codes indicative of an HF event and occurring during 2015 was drawn for this study. A randomly selected set of 394 hospitalisations was used as the derivation dataset and 406 hospitalisations were used as the validation dataset. Intervention: Use of NLP on free-text clinical notes and reports to ascertain Framingham HF criteria and phenotype. Primary and secondary outcome measures: NLP performance as measured by sensitivity, specificity, positive-predictive value (PPV) and agreement in ascertainment of Framingham HF criteria and phenotype. Manual medical record review by trained ARIC abstractors was used as the reference standard. Results: Overall, performance of NLP ascertainment of Framingham HF phenotype in the validation dataset was good, with sensitivity, specificity, PPV and agreement of 78.8%, 81.7%, 84.4% and 80.0%, respectively. Conclusions: Our results on the use of NLP to ascertain Framingham HF phenotype from free-text electronic health record data suggest that validated NLP technology, by decreasing the need for manual chart review, holds the potential to significantly improve the feasibility and efficiency of conducting large-scale epidemiologic surveillance of HF prevalence and incidence.
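
The four performance measures named in this abstract can all be derived from a 2x2 table that compares NLP output against the manual-review reference standard. The sketch below is only an illustration of that arithmetic: the class name, function name and the counts are hypothetical and are not taken from the study, and "agreement" is assumed here to mean simple percent agreement (the abstract does not define it further).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of NLP results versus the manual-review reference standard."""
    tp: int  # NLP positive, reference positive
    fp: int  # NLP positive, reference negative
    fn: int  # NLP negative, reference positive
    tn: int  # NLP negative, reference negative


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a margin of the table is empty.
    return numerator / denominator if denominator else float("nan")


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Compute sensitivity, specificity, PPV and percent agreement."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "agreement": _ratio(c.tp + c.tn, total),
    }


# Hypothetical counts chosen for illustration only (not the study's data).
example = ConfusionCounts(tp=80, fp=15, fn=20, tn=85)
metrics = diagnostic_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Note that PPV, unlike sensitivity and specificity, depends on how common the phenotype is in the sample, so a PPV estimated from a stratified sample may not carry over directly to an unselected population.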

Abstract MP21: Feasibility of Electronic Health Records-based community surveillance of cardiovascular disease: Findings from the Atherosclerosis Risk in Communities Study.

Circulation ◽

10.1161/circ.137.suppl_1.mp21 ◽

2018 ◽

Vol 137 (suppl_1) ◽

Author(s):

Brittany M Bogle ◽

Wayne D Rosamond ◽

Aaron R Folsom ◽

Paul Sorlie ◽

Elsayed Z Soliman ◽

...

Keyword(s):

Cardiovascular Disease ◽

Electronic Health Records ◽

Cardiac Biomarkers ◽

Free Text ◽

Health Records ◽

Efficient System ◽

Atherosclerosis Risk In Communities ◽

Atherosclerosis Risk ◽

Electronic Health ◽

Aric Study

Background: Accurate community surveillance of cardiovascular disease requires hospital record abstraction, which is typically a manual process. The costly and time-intensive nature of manual abstraction precludes its use on a regional or national scale in the US. Whether an efficient system can accurately reproduce traditional community surveillance methods by processing electronic health records (EHRs) has not been established. Objective: We sought to develop and test an EHR-based system to reproduce abstraction and classification procedures for acute myocardial infarction (MI) as defined by the Atherosclerosis Risk in Communities (ARIC) Study. Methods: Records from hospitalizations in 2014 within ARIC community surveillance areas were sampled using a broad set of ICD discharge codes likely to harbor MI. These records were manually abstracted by ARIC study personnel and used to classify MI according to ARIC protocols. We requested EHRs in a unified data structure for the same hospitalizations at 6 hospitals and built programs to convert free text and structured data into the ARIC criteria elements necessary for MI classification. Per ARIC protocol, MI was classified based on cardiac biomarkers, cardiac pain, and Minnesota-coded electrocardiogram abnormalities. We compared MI classified from manually abstracted data to (1) EHR-based classification and (2) final ICD-9 coded discharge diagnoses (410-414). Results: These preliminary results are based on hospitalizations from 1 hospital. Of 684 hospitalizations, 355 qualified for full manual abstraction; 83 (23%) of these were classified as definite MI and 78 (22%) as probable MI. Our EHR-based abstraction is sensitive (>75%) and highly specific (>83%) in classifying ARIC-defined definite MI and definite or probable MI (Table). 
Conclusions: Our results support the potential of a process to extract comprehensive sets of data elements from EHRs at different hospitals, with completeness and accuracy sufficient for a standardized definition of hospitalized MI.

Global Initiative on Obstructive Lung Disease (GOLD) classification of lung disease and mortality: findings from the Atherosclerosis Risk in Communities (ARIC) study

Yearbook of Medicine ◽

10.1016/s0084-3873(08)70163-9 ◽

2007 ◽

Vol 2007 ◽

pp. 243-244

Author(s):

J.R. Maurer

Keyword(s):

Lung Disease ◽

Obstructive Lung Disease ◽

Atherosclerosis Risk In Communities ◽

Gold Classification ◽

Atherosclerosis Risk ◽

Global Initiative ◽

Aric Study

1398-P: Mortality Implications of Comorbid Health Status in Older Adults with Diabetes: The Atherosclerosis Risk in Communities (ARIC) Study

Diabetes ◽

10.2337/db20-1398-p ◽

2020 ◽

Vol 69 (Supplement 1) ◽

pp. 1398-P

Author(s):

MARY R. ROONEY ◽

OLIVE TANG ◽

B. GWEN WINDHAM ◽

JUSTIN B. ECHOUFFO TCHEUGUI ◽

PAMELA LUTSEY ◽

...

Keyword(s):

Older Adults ◽

Health Status ◽

Atherosclerosis Risk In Communities ◽

Atherosclerosis Risk ◽

Aric Study

Faculty Opinions recommendation of Race and sex differences in the incidence and prognostic significance of silent myocardial infarction in the atherosclerosis risk in communities (ARIC) study.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.726360307.793520035 ◽

2016 ◽

Author(s):

Wilbert Aronow

Keyword(s):

Myocardial Infarction ◽

Sex Differences ◽

Prognostic Significance ◽

Atherosclerosis Risk In Communities ◽

Atherosclerosis Risk ◽

Aric Study ◽

Silent Myocardial Infarction

Faculty Opinions recommendation of Lifetime Risk and Risk Factors for Abdominal Aortic Aneurysm in a 24-Year Prospective Study: The ARIC Study (Atherosclerosis Risk in Communities).

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.726956841.793535495 ◽

2017 ◽

Author(s):

Norman Hertzer

Keyword(s):

Risk Factors ◽

Abdominal Aortic Aneurysm ◽

Aortic Aneurysm ◽

Prospective Study ◽

Lifetime Risk ◽

Atherosclerosis Risk In Communities ◽

Abdominal Aortic ◽

Atherosclerosis Risk ◽

Aric Study

Faculty Opinions recommendation of Relation of Elevated Resting Heart Rate in Mid-Life to Cognitive Decline Over 20 Years (from the Atherosclerosis Risk in Communities [ARIC] Study).

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.734445964.793556643 ◽

2019 ◽

Author(s):

Wilbert Aronow

Keyword(s):

Heart Rate ◽

Cognitive Decline ◽

Resting Heart Rate ◽

Atherosclerosis Risk In Communities ◽

Atherosclerosis Risk ◽

Aric Study

HAI-Proactive: Development of an Automated Surveillance System for Healthcare-Associated Infections in Sweden

Infection Control and Hospital Epidemiology ◽

10.1017/ice.2020.519 ◽

2020 ◽

Vol 41 (S1) ◽

pp. s39-s39

Author(s):

Pontus Naucler ◽

Suzanne D. van der Werff ◽

John Valik ◽

Logan Ward ◽

Anders Ternhag ◽

...

Keyword(s):

Positive Predictive Value ◽

Electronic Health Record ◽

Predictive Value ◽

Surveillance System ◽

Free Text ◽

Health Record ◽

Electronic Health Record Data ◽

Record Data ◽

Electronic Health ◽

Healthcare Associated

Background: Healthcare-associated infection (HAI) surveillance is essential for most infection prevention programs, and continuous epidemiological data can be used to inform healthcare personnel, allocate resources, and evaluate interventions to prevent HAIs. Many HAI surveillance systems today are based on time-consuming and resource-intensive manual reviews of patient records. The objective of HAI-proactive, a Swedish triple-helix innovation project, is to develop and implement a fully automated HAI surveillance system based on electronic health record data. Furthermore, the project aims to develop machine-learning–based screening algorithms for early prediction of HAI at the individual patient level. Methods: The project is performed with support from Sweden’s Innovation Agency in collaboration among academic, health, and industry partners. Development of rule-based and machine-learning algorithms is performed within a research database, which consists of all electronic health record data from patients admitted to the Karolinska University Hospital. Natural language processing is used for processing free-text medical notes. To validate algorithm performance, manual annotation was performed based on international HAI definitions from the European Centre for Disease Prevention and Control, Centers for Disease Control and Prevention, and Sepsis-3 criteria. Currently, the project is building a platform for real-time data access to implement the algorithms within Region Stockholm. Results: The project has developed a rule-based surveillance algorithm for sepsis that continuously monitors patients admitted to the hospital, with a sensitivity of 0.89 (95% CI, 0.85–0.93), a specificity of 0.99 (0.98–0.99), a positive predictive value of 0.88 (0.83–0.93), and a negative predictive value of 0.99 (0.98–0.99).
The healthcare-associated urinary tract infection surveillance algorithm, which is based on free-text analysis and negations to define symptoms, had a sensitivity of 0.73 (0.66–0.80) and a positive predictive value of 0.68 (0.61–0.75). The sensitivity and positive predictive value of an algorithm based only on significant bacterial growth in urine culture were 0.99 (0.97–1.00) and 0.39 (0.34–0.44), respectively. The surveillance system detected differences in incidence between hospital wards and over time. Development of surveillance algorithms for pneumonia, catheter-related infections and Clostridioides difficile infections, as well as machine-learning–based models for early prediction, is ongoing. We intend to present results from all algorithms. Conclusions: With access to electronic health record data, we have shown that it is feasible to develop a fully automated HAI surveillance system based on algorithms using both structured data and free text for the main healthcare-associated infections. Funding: Sweden’s Innovation Agency and Stockholm County Council. Disclosures: None.
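
The abstract describes the urinary tract infection algorithm as relying on "free-text analysis and negations to define symptoms" but gives no implementation detail. The sketch below shows one simple way such negation handling is often approached: a symptom mention counts only if no negation cue appears in a short window of words before it. Everything here is an assumption for illustration, including the English term lists, the window size and the function name; it is not the HAI-Proactive algorithm.

```python
import re

# Hypothetical vocabularies, for illustration only.
SYMPTOM_TERMS = ["dysuria", "urinary frequency", "urgency",
                 "suprapubic tenderness", "fever"]
NEGATION_CUES = ["no", "not", "denies", "without", "negative for"]
WINDOW = 5  # number of words before a symptom that are searched for a cue


def find_affirmed_symptoms(note: str) -> set:
    """Return symptom terms mentioned in the note without a preceding negation cue."""
    affirmed = set()
    # Split on sentence-like boundaries so a cue cannot negate the next sentence.
    for sentence in re.split(r"[.;\n]+", note.lower()):
        tokens = re.findall(r"[a-z]+", sentence)
        for term in SYMPTOM_TERMS:
            term_tokens = term.split()
            n = len(term_tokens)
            for i in range(len(tokens) - n + 1):
                if tokens[i:i + n] != term_tokens:
                    continue
                preceding = " ".join(tokens[max(0, i - WINDOW):i])
                negated = any(
                    re.search(rf"\b{re.escape(cue)}\b", preceding)
                    for cue in NEGATION_CUES
                )
                if not negated:
                    affirmed.add(term)
    return affirmed


note = ("Patient reports dysuria and urinary frequency. "
        "Denies fever. No suprapubic tenderness.")
print(sorted(find_affirmed_symptoms(note)))
```

A fixed look-back window is crude: in "no fever but reports dysuria" the cue "no" would also suppress "dysuria". Handling such cases requires scope-terminating words or a parser, which is one reason free-text symptom detection tends to trade sensitivity against positive predictive value.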
