The Use of Natural Language Processing to Ascertain Suicide Ideation/Attempt from Clinical Notes Within a Large Integrated Healthcare System (Preprint)

Mapping Intimacies ◽

10.2196/preprints.28060 ◽

2021 ◽

Author(s):

Fagen Xie ◽

Deborah S Ling-Grant ◽

John Chang ◽

Britta I Amundsen ◽

Rulin C Hechter

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Electronic Health Record ◽

Language Processing ◽

Suicide Ideation ◽

Validation Dataset ◽

Health Record ◽

Integrated Healthcare ◽

Entire Study ◽

Clinical Notes

Purpose: Identifying risk factors for suicide using progress notes and administrative data is time-consuming and usually requires manual case review. In this study, a natural language processing computerized algorithm was developed and implemented to automatically ascertain suicide ideation/attempt from clinical notes in a large integrated healthcare system, Kaiser Permanente Southern California. Methods: Clinical notes containing prespecified relevant keywords and phrases related to suicidal ideation/attempt between 2010 and 2018 were extracted from our organization’s electronic health record system. A random sample of 864 clinical notes was selected and equally divided into four subsets. Experienced research chart abstractors reviewed these subsets and classified each note into one of three suicide ideation/attempt categories: “Current”, “Historical” and “No”. The first three subsets served as training datasets to develop the rule-based computerized algorithm sequentially, and the fourth served as the validation dataset to evaluate the algorithm's performance. The validated algorithm was then applied to the entire study sample of clinical notes. Results: The computerized algorithm ascertained 23 of the 26 confirmed “Current” suicide ideation/attempt events and all 10 confirmed “Historical” suicide ideation/attempt events in the validation dataset. This algorithm produced an 88.5% sensitivity and 100.0% positive predictive value (PPV) for “Current” suicide ideation/attempt, and a 100.0% sensitivity and 100.0% PPV for “Historical” suicide ideation/attempt. After applying the computerized process to the entire study population sample, we identified a total of 1,050,289 “Current” ideation/attempt events and 293,038 “Historical” ideation/attempt events during the study period. Among the 400,436 individuals who were identified as having a “Current” suicide ideation/attempt event, 115,197 (28.8%) were 15-24 years old at the first event, 234,924 (58.7%) were female, 165,084 (41.7%) were Hispanic, and 150,645 (37.6%) had two or more events in the study period. Conclusions: Our study demonstrated that a natural language processing computerized algorithm can effectively ascertain suicide ideation/attempt from the free-text clinical notes in the electronic health record of a diverse patient population. This algorithm can be utilized in support of suicide prevention programs and patient care management.
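The validation figures in this abstract can be reproduced from the reported counts. The short Python sketch below is illustrative only and is not the authors' code: the function and variable names are hypothetical, and the false-positive count of zero is an assumption inferred from the reported 100.0% PPV rather than a number stated in the abstract.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Percentage of confirmed events that the algorithm found."""
    return 100.0 * true_pos / (true_pos + false_neg)


def ppv(true_pos: int, false_pos: int) -> float:
    """Percentage of algorithm-flagged events that were confirmed."""
    return 100.0 * true_pos / (true_pos + false_pos)


# "Current" category in the validation dataset, as reported in the abstract:
# 23 of the 26 confirmed events were ascertained by the algorithm.
current_tp = 23
current_fn = 26 - 23
current_fp = 0  # assumption: implied by the reported 100.0% PPV

current_sensitivity = round(sensitivity(current_tp, current_fn), 1)
current_ppv = round(ppv(current_tp, current_fp), 1)

print(f"Current sensitivity: {current_sensitivity}%")
print(f"Current PPV: {current_ppv}%")
```

With these inputs the sensitivity rounds to the 88.5% given in the abstract. The same two functions apply to the "Historical" category, where all 10 confirmed events were found.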


Extraction of Geriatric Syndromes From Electronic Health Record Clinical Notes: Assessment of Statistical Natural Language Processing Methods

JMIR Medical Informatics ◽

10.2196/13039 ◽

2019 ◽

Vol 7 (1) ◽

pp. e13039 ◽

Cited By ~ 7

Author(s):

Tao Chen ◽

Mark Dredze ◽

Jonathan P Weiner ◽

Leilani Hernandez ◽

Joe Kimura ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Electronic Health Record ◽

Language Processing ◽

Health Record ◽

Geriatric Syndromes ◽

Processing Methods ◽

Clinical Notes ◽

Statistical Natural Language Processing ◽

Electronic Health


Natural language processing and machine learning to identify alcohol misuse from the electronic health record in trauma patients: development and internal validation

Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocy166 ◽

2019 ◽

Vol 26 (3) ◽

pp. 254-261 ◽

Cited By ~ 12

Author(s):

Majid Afshar ◽

Andrew Phillips ◽

Niranjan Karnik ◽

Jeanne Mueller ◽

Daniel To ◽

...

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Electronic Health Record ◽

Language Processing ◽

Alcohol Misuse ◽

Health Record ◽

Trauma Patients ◽

Clinical Notes ◽

Electronic Health

Objective: Alcohol misuse is present in over a quarter of trauma patients. Information in the clinical notes of the electronic health record of trauma patients may be used for phenotyping tasks with natural language processing (NLP) and supervised machine learning. The objective of this study is to train and validate an NLP classifier for identifying patients with alcohol misuse. Materials and Methods: An observational cohort of 1422 adult patients admitted to a trauma center between April 2013 and November 2016. Linguistic processing of clinical notes was performed using the clinical Text Analysis and Knowledge Extraction System. The primary analysis was the binary classification of alcohol misuse. The Alcohol Use Disorders Identification Test served as the reference standard. Results: The data corpus comprised 91 045 electronic health record notes and 16 091 features. In the final machine learning classifier, 16 features were selected from the first 24 hours of notes for identifying alcohol misuse. The classifier’s performance in the validation cohort had an area under the receiver-operating characteristic curve of 0.78 (95% confidence interval [CI], 0.72 to 0.85). Sensitivity and specificity were at 56.0% (95% CI, 44.1% to 68.0%) and 88.9% (95% CI, 84.4% to 92.8%). The Hosmer-Lemeshow goodness-of-fit test demonstrates the classifier fits the data well (P = .17). A simpler rule-based keyword approach had a decrease in sensitivity when compared with the NLP classifier from 56.0% to 18.2%. Conclusions: The NLP classifier has adequate predictive validity for identifying alcohol misuse in trauma centers. External validation is needed before its application to augment screening.


PMH52 Use of a Natural Language Processing-Based Approach to Extract Suicide Ideation and Behavior from Clinical Notes to Support Depression Research

Value in Health ◽

10.1016/j.jval.2021.04.674 ◽

2021 ◽

Vol 24 ◽

pp. S137

Author(s):

N. Palmon ◽

S. Momen ◽

M. Leavy ◽

G. Curhan ◽

C. Boussios ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Suicide Ideation ◽

Clinical Notes ◽

And Behavior


Development and Validation of a Natural Language Processing Algorithm to Extract Descriptors of Microbial Keratitis From the Electronic Health Record

Cornea ◽

10.1097/ico.0000000000002755 ◽

2021 ◽

Vol Publish Ahead of Print ◽

Author(s):

Maria A. Woodward ◽

Nenita Maganti ◽

Leslie M. Niziol ◽

Sejal Amin ◽

Andrew Hou ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Electronic Health Record ◽

Language Processing ◽

Processing Algorithm ◽

Health Record ◽

Microbial Keratitis ◽

Electronic Health ◽

Development And Validation ◽

Natural Language Processing Algorithm


Semantic computational analysis of anticoagulation use in atrial fibrillation from real world data

10.1101/19011643 ◽

2019 ◽

Author(s):

Daniel M. Bean ◽

James Teo ◽

Honghan Wu ◽

Ricardo Oliveira ◽

Raj Patel ◽

...

Keyword(s):

Atrial Fibrillation ◽

Natural Language Processing ◽

Natural Language ◽

Electronic Health Record ◽

Open Source ◽

Language Processing ◽

Risk Scores ◽

Free Text ◽

Health Record ◽

Electronic Health

Atrial fibrillation (AF) is the most common arrhythmia and significantly increases stroke risk. This risk is effectively managed by oral anticoagulation. Recent studies using national registry data indicate increased use of anticoagulation resulting from changes in guidelines and the availability of newer drugs. The aim of this study is to develop and validate an open source risk scoring pipeline for free-text electronic health record data using natural language processing. AF patients discharged from 1st January 2011 to 1st October 2017 were identified from discharge summaries (N=10,030, 64.6% male, average age 75.3 ± 12.3 years). A natural language processing pipeline was developed to identify risk factors in clinical text and calculate risk for ischaemic stroke (CHA2DS2-VASc) and bleeding (HAS-BLED). Scores were validated vs two independent experts for 40 patients. Automatic risk scores were in strong agreement with the two independent experts for CHA2DS2-VASc (average kappa 0.78 vs experts, compared to 0.85 between experts). Agreement was lower for HAS-BLED (average kappa 0.54 vs experts, compared to 0.74 between experts). In high-risk patients (CHA2DS2-VASc ≥2) OAC use has increased significantly over the last 7 years, driven by the availability of DOACs and the transitioning of patients from AP medication alone to OAC. Factors independently associated with OAC use included components of the CHA2DS2-VASc and HAS-BLED scores as well as discharging specialty and frailty. OAC use was highest in patients discharged under cardiology (69%). Electronic health record text can be used for automatic calculation of clinical risk scores at scale. Open source tools are available today for this task but require further validation. Analysis of routinely-collected EHR data can replicate findings from large-scale curated registries.
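Once risk factors have been extracted from text, the CHA2DS2-VASc calculation mentioned in this abstract is simple arithmetic. The Python sketch below shows only that final scoring step, using the standard published point weights. It is not the authors' pipeline: the `RiskFactors` record and the `cha2ds2_vasc` function are hypothetical names, and the boolean flags are assumed to come from some upstream extraction step that is not shown.

```python
from dataclasses import dataclass


@dataclass
class RiskFactors:
    """Hypothetical per-patient record, assumed filled by an upstream NLP step."""
    age: int
    female: bool
    congestive_heart_failure: bool = False
    hypertension: bool = False
    diabetes: bool = False
    stroke_tia_thromboembolism: bool = False
    vascular_disease: bool = False


def cha2ds2_vasc(p: RiskFactors) -> int:
    """Sum the standard CHA2DS2-VASc points (range 0 to 9)."""
    score = 0
    score += 1 if p.congestive_heart_failure else 0    # C
    score += 1 if p.hypertension else 0                # H
    if p.age >= 75:                                    # A2: age 75 or older
        score += 2
    elif p.age >= 65:                                  # A: age 65 to 74
        score += 1
    score += 1 if p.diabetes else 0                    # D
    score += 2 if p.stroke_tia_thromboembolism else 0  # S2
    score += 1 if p.vascular_disease else 0            # V
    score += 1 if p.female else 0                      # Sc: sex category
    return score


# Illustrative patient, not taken from the study data.
example = RiskFactors(age=78, female=True, hypertension=True,
                      stroke_tia_thromboembolism=True)
example_score = cha2ds2_vasc(example)
print(example_score)
```

Keeping the scoring separate from the text extraction means the arithmetic can be checked on its own, independently of the extraction step whose accuracy the abstract evaluates against expert review.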


Performance of a Natural Language Processing Method to Extract Stone Composition From the Electronic Health Record

Urology ◽

10.1016/j.urology.2019.07.007 ◽

2019 ◽

Vol 132 ◽

pp. 56-62 ◽

Cited By ~ 1

Author(s):

Cosmin A. Bejan ◽

Daniel J. Lee ◽

Yaomin Xu ◽

Ryan S. Hsi

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Electronic Health Record ◽

Language Processing ◽

Processing Method ◽

Health Record ◽

Stone Composition ◽

Electronic Health


O-037: Surveillance of adverse events in elderly patients: A study on the accuracy of applying natural language processing techniques to electronic health record data

European Geriatric Medicine ◽

10.1016/s1878-7649(15)30050-4 ◽

2015 ◽

Vol 6 ◽

pp. S15 ◽

Cited By ~ 2

Author(s):

C. Rochefort ◽

A. Verma ◽

T. Eguale ◽

D. Buckeridge

Keyword(s):

Natural Language Processing ◽

Adverse Events ◽

Elderly Patients ◽

Natural Language ◽

Electronic Health Record ◽

Language Processing ◽

Health Record ◽

Electronic Health Record Data ◽

Record Data ◽

Processing Techniques


Early recognition of multiple sclerosis using natural language processing of the electronic health record

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-017-0418-4 ◽

2017 ◽

Vol 17 (1) ◽

Cited By ~ 18

Author(s):

Herbert S. Chase ◽

Lindsey R. Mitrani ◽

Gabriel G. Lu ◽

Dominick J. Fulgieri

Keyword(s):

Multiple Sclerosis ◽

Natural Language Processing ◽

Natural Language ◽

Electronic Health Record ◽

Language Processing ◽

Early Recognition ◽

Health Record ◽

Electronic Health


Using Natural Language Processing and the Electronic Health Record for Appendicitis Risk Stratification

2012 IEEE Second International Conference on Healthcare Informatics, Imaging and Systems Biology ◽

10.1109/hisb.2012.15 ◽

2012 ◽

Author(s):

Louise Deleger ◽

Holly Brodzinski ◽

Haijun Zhai ◽

Qi Li ◽

Todd Lingren ◽

...

Keyword(s):

Natural Language Processing ◽

Risk Stratification ◽

Natural Language ◽

Electronic Health Record ◽

Language Processing ◽

Health Record ◽

Electronic Health


PNS98 Natural Language Processing (NLP)-Based Detection of Transgender and Gender Non-Conforming Patients in Electronic Health Record (EHR)-Derived Data

Value in Health ◽

10.1016/j.jval.2021.04.952 ◽

2021 ◽

Vol 24 ◽

pp. S190

Author(s):

I. Hooley ◽

K. Maignan ◽

D. Ngai ◽

B. Ackerman

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Electronic Health Record ◽

Language Processing ◽

Health Record ◽

Electronic Health ◽

And Gender ◽

Derived Data
