Rule-based Natural Language Processing for Automation of Stroke Data Extraction: A Validation Study (Preprint)

BACKGROUND Data extraction from radiology free-text reports is time-consuming when performed manually. Recently, more automated extraction methods using natural language processing (NLP) are proposed. A previously developed rule-based NLP algorithm showed promise in its ability to extract stroke-related data from radiology reports. OBJECTIVE We aimed to externally validate the accuracy of CHARTextract, a rule-based NLP algorithm, to extract stroke-related data from free-text radiology reports. METHODS Free-text reports of CT angiography (CTA) and perfusion (CTP) studies of consecutive patients with acute ischemic stroke admitted to a regional Stroke center for endovascular thrombectomy were analyzed from January 2015 - 2021. Stroke-related variables were manually extracted (reference standard) from the reports, including proximal and distal anterior circulation occlusion, posterior circulation occlusion, presence of ischemia, hemorrhage, Alberta stroke program early CT score (ASPECTS), and collateral status. These variables were simultaneously extracted using a rule-based NLP algorithm. The NLP algorithm's accuracy, specificity, sensitivity, positive predictive value (PPV), and negative predictive value (NPV) were assessed. RESULTS The NLP algorithm's accuracy was >90% for identifying distal anterior occlusion, posterior circulation occlusion, hemorrhage, and ASPECTS. Accuracy was 85%, 74%, and 79% for proximal anterior circulation occlusion, presence of ischemia, and collateral status respectively. The algorithm had an accuracy of 87-100% for the detection of variables not reported in radiology reports. CONCLUSIONS Rule-based NLP has a moderate to good performance for stroke-related data extraction from free-text imaging reports. The algorithm's accuracy was affected by inconsistent report styles and lexicon among reporting radiologists.

Download Full-text

Automating Stroke Data Extraction From Free-Text Radiology Reports Using Natural Language Processing: Instrument Validation Study (Preprint)

10.2196/preprints.24381 ◽

2020 ◽

Author(s):

Amy Y X Yu ◽

Zhongyu A Liu ◽

Chloe Pou-Prom ◽

Kaitlyn Lopes ◽

Moira K Kapral ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Predictive Value ◽

Large Vessel ◽

Free Text ◽

Imaging Data ◽

Large Vessel Occlusion ◽

Vessel Occlusion ◽

Radiology Reports

BACKGROUND Diagnostic neurovascular imaging data are important in stroke research, but obtaining these data typically requires laborious manual chart reviews. OBJECTIVE We aimed to determine the accuracy of a natural language processing (NLP) approach to extract information on the presence and location of vascular occlusions as well as other stroke-related attributes based on free-text reports. METHODS From the full reports of 1320 consecutive computed tomography (CT), CT angiography, and CT perfusion scans of the head and neck performed at a tertiary stroke center between October 2017 and January 2019, we manually extracted data on the presence of proximal large vessel occlusion (primary outcome), as well as distal vessel occlusion, ischemia, hemorrhage, Alberta stroke program early CT score (ASPECTS), and collateral status (secondary outcomes). Reports were randomly split into training (n=921) and validation (n=399) sets, and attributes were extracted using rule-based NLP. We reported the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and the overall accuracy of the NLP approach relative to the manually extracted data. RESULTS The overall prevalence of large vessel occlusion was 12.2%. In the training sample, the NLP approach identified this attribute with an overall accuracy of 97.3% (95.5% sensitivity, 98.1% specificity, 84.1% PPV, and 99.4% NPV). In the validation set, the overall accuracy was 95.2% (90.0% sensitivity, 97.4% specificity, 76.3% PPV, and 98.5% NPV). The accuracy of identifying distal or basilar occlusion as well as hemorrhage was also high, but there were limitations in identifying cerebral ischemia, ASPECTS, and collateral status. CONCLUSIONS NLP may improve the efficiency of large-scale imaging data collection for stroke surveillance and research.

Download Full-text

Automating Stroke Data Extraction From Free-Text Radiology Reports Using Natural Language Processing: Instrument Validation Study

JMIR Medical Informatics ◽

10.2196/24381 ◽

2021 ◽

Vol 9 (5) ◽

pp. e24381

Author(s):

Amy Y X Yu ◽

Zhongyu A Liu ◽

Chloe Pou-Prom ◽

Kaitlyn Lopes ◽

Moira K Kapral ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Predictive Value ◽

Large Vessel ◽

Free Text ◽

Imaging Data ◽

Large Vessel Occlusion ◽

Vessel Occlusion ◽

Radiology Reports

Background Diagnostic neurovascular imaging data are important in stroke research, but obtaining these data typically requires laborious manual chart reviews. Objective We aimed to determine the accuracy of a natural language processing (NLP) approach to extract information on the presence and location of vascular occlusions as well as other stroke-related attributes based on free-text reports. Methods From the full reports of 1320 consecutive computed tomography (CT), CT angiography, and CT perfusion scans of the head and neck performed at a tertiary stroke center between October 2017 and January 2019, we manually extracted data on the presence of proximal large vessel occlusion (primary outcome), as well as distal vessel occlusion, ischemia, hemorrhage, Alberta stroke program early CT score (ASPECTS), and collateral status (secondary outcomes). Reports were randomly split into training (n=921) and validation (n=399) sets, and attributes were extracted using rule-based NLP. We reported the sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and the overall accuracy of the NLP approach relative to the manually extracted data. Results The overall prevalence of large vessel occlusion was 12.2%. In the training sample, the NLP approach identified this attribute with an overall accuracy of 97.3% (95.5% sensitivity, 98.1% specificity, 84.1% PPV, and 99.4% NPV). In the validation set, the overall accuracy was 95.2% (90.0% sensitivity, 97.4% specificity, 76.3% PPV, and 98.5% NPV). The accuracy of identifying distal or basilar occlusion as well as hemorrhage was also high, but there were limitations in identifying cerebral ischemia, ASPECTS, and collateral status. Conclusions NLP may improve the efficiency of large-scale imaging data collection for stroke surveillance and research.

Download Full-text

A natural language processing tool for automatic identification of new disease and disease progression: Parsing text in multi-institutional radiology reports to facilitate clinical trial eligibility screening.

Journal of Clinical Oncology ◽

10.1200/jco.2021.39.15_suppl.1555 ◽

2021 ◽

Vol 39 (15_suppl) ◽

pp. 1555-1555

Author(s):

Eric J. Clayton ◽

Imon Banerjee ◽

Patrick J. Ward ◽

Maggie D Howell ◽

Beth Lohmueller ◽

...

Keyword(s):

Clinical Trial ◽

Clinical Trials ◽

Natural Language Processing ◽

Natural Language ◽

Disease Progression ◽

Language Processing ◽

Free Text ◽

New Disease ◽

Radiology Reports ◽

Precision And Accuracy

1555 Background: Screening every patient for clinical trials is time-consuming, costly and inefficient. Developing an automated method for identifying patients who have potential disease progression, at the point where the practice first receives their radiology reports, but prior to the patient’s office visit, would greatly increase the efficiency of clinical trial operations and likely result in more patients being offered trial opportunities. Methods: Using Natural Language Processing (NLP) methodology, we developed a text parsing algorithm to automatically extract information about potential new disease or disease progression from multi-institutional, free-text radiology reports (CT, PET, bone scan, MRI or x-ray). We combined semantic dictionary mapping and machine learning techniques to normalize the linguistic and formatting variations in the text, training the XGBoost model particularly to achieve a high precision and accuracy to satisfy clinical trial screening requirements. In order to be comprehensive, we enhanced the model vocabulary using a multi-institutional dataset which includes reports from two academic institutions. Results: A dataset of 732 de-identified radiology reports were curated (two MDs agreed on potential new disease/dz progression vs stable) and the model was repeatedly re-trained for each fold where the folds were randomly selected. The final model achieved consistent precision (>0.87 precision) and accuracy (>0.87 accuracy). See the table for a summary of the results, by radiology report type. We are continuing work on the model to validate accuracy and precision using a new and unique set of reports. Conclusions: NLP systems can be used to identify patients who potentially have suffered new disease or disease progression and reduce the human effort in screening or clinical trials. Efforts are ongoing to integrate the NLP process into existing EHR reporting. New imaging reports sent via interface to the EHR will be extracted daily using a database query and will be provided via secure electronic transport to the NLP system. Patients with higher likelihood of disease progression will be automatically identified, and their reports routed to the clinical trials office for clinical trial screening parallel to physician EHR mailbox reporting. The over-arching goal of the project is to increase clinical trial enrollment. 5-fold cross-validation performance of the NLP model in terms of accuracy, precision and recall averaged across all the folds.[Table: see text]

Download Full-text

1180 The Use Of Natural Language Processing To Extract Data From Psg Sleep Study Reports Using National Vha Electronic Medical Record Data

SLEEP ◽

10.1093/sleep/zsaa056.1174 ◽

2020 ◽

Vol 43 (Supplement_1) ◽

pp. A450-A451

Author(s):

S Nowakowski ◽

J Razjouyan ◽

A D Naik ◽

R Agrawal ◽

K Velamuri ◽

...

Keyword(s):

Natural Language Processing ◽

Language Processing ◽

Veterans Health Administration ◽

Free Text ◽

Veterans Health ◽

Health Administration ◽

Rule Based ◽

Veteran Affairs ◽

Department Of Veteran Affairs ◽

Sleep Parameters

Abstract Introduction In 2007, Congress asked the Department of Veteran Affairs to pay closer attention to the incidence of sleep disorders among veterans. We aimed to use natural language processing (NLP), a method that applies algorithms to understand the meaning and structure of sentences within Electronic Health Record (EHR) patient free-text notes, to identify the number of attended polysomnography (PSG) studies conducted in the Veterans Health Administration (VHA) and to evaluate the performance of NLP in extracting sleep data from the notes. Methods We identified 481,115 sleep studies using CPT code 95810 from 2000-19 in the national VHA. We used rule-based regular expression method (phrases: “sleep stage” and “arousal index”) to identify attended PSG reports in the patient free-text notes in the EHR, of which 69,847 records met the rule-based criteria. We randomly selected 178 notes to compare the accuracy of the algorithm in mining sleep parameters: total sleep time (TST), sleep efficiency (SE) and sleep onset latency (SOL) compared to human manual chart review. Results The number of documented PSG studies increased each year from 963 in 2000 to 14,209 in 2018. System performance of NLP compared to manually annotated reference standard in detecting sleep parameters was 83% for TST, 87% for SE, and 81% for SOL (accuracy benchmark ≥ 80%). Conclusion This study showed that NLP is a useful technique to mine EHR and extract data from patients’ free-text notes. Reasons that NLP is not 100% accurate included, the note authors used different phrasing (e.g., “recording duration”) which the NLP algorithm did not detect/extract or authors omitting sleep continuity variables from the notes. Nevertheless, this automated strategy to identify and extract sleep data can serve as an effective tool in large health care systems to be used for research and evaluation to improve sleep medicine patient care and outcomes. Support This material is based upon work supported in part by the Department of Veteran Affairs, Veterans Health Administration, Office of Research and Development, and the Center for Innovations in Quality, Effectiveness and Safety (CIN 13-413). Dr. Nowakowski is also supported by a National Institutes of Health (NIH) Grant (R01NR018342).

Download Full-text

Deep Learning-Based Natural Language Processing in Radiology: The Impact of Report Complexity, Disease Prevalence, Dataset Size, and Algorithm Type on Model Performance

Journal of Medical Systems ◽

10.1007/s10916-021-01761-4 ◽

2021 ◽

Vol 45 (10) ◽

Author(s):

A. W. Olthof ◽

P. M. A. van Ooijen ◽

L. J. Cornelissen

Keyword(s):

Neural Network ◽

Deep Learning ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Predictive Value ◽

Training Data ◽

Radiology Reports ◽

Dataset Size ◽

The Impact

AbstractIn radiology, natural language processing (NLP) allows the extraction of valuable information from radiology reports. It can be used for various downstream tasks such as quality improvement, epidemiological research, and monitoring guideline adherence. Class imbalance, variation in dataset size, variation in report complexity, and algorithm type all influence NLP performance but have not yet been systematically and interrelatedly evaluated. In this study, we investigate these factors on the performance of four types [a fully connected neural network (Dense), a long short-term memory recurrent neural network (LSTM), a convolutional neural network (CNN), and a Bidirectional Encoder Representations from Transformers (BERT)] of deep learning-based NLP. Two datasets consisting of radiologist-annotated reports of both trauma radiographs (n = 2469) and chest radiographs and computer tomography (CT) studies (n = 2255) were split into training sets (80%) and testing sets (20%). The training data was used as a source to train all four model types in 84 experiments (Fracture-data) and 45 experiments (Chest-data) with variation in size and prevalence. The performance was evaluated on sensitivity, specificity, positive predictive value, negative predictive value, area under the curve, and F score. After the NLP of radiology reports, all four model-architectures demonstrated high performance with metrics up to > 0.90. CNN, LSTM, and Dense were outperformed by the BERT algorithm because of its stable results despite variation in training size and prevalence. Awareness of variation in prevalence is warranted because it impacts sensitivity and specificity in opposite directions.

Download Full-text

A Highly Generalizable Natural Language Processing Algorithm for the Diagnosis of Pulmonary Embolism from Radiology Reports

10.1101/2020.10.13.20211961 ◽

2020 ◽

Author(s):

Jacob Johnson ◽

Grace Qiu ◽

Christine Lamoureux ◽

Jennifer Ngo ◽

Lawrence Ngo

Keyword(s):

Pulmonary Embolism ◽

Deep Learning ◽

Natural Language Processing ◽

Sample Size ◽

Language Processing ◽

High Accuracy ◽

Free Text ◽

Radiology Reports ◽

Natural Language Processing Algorithm

AbstractThough sophisticated algorithms have been developed for the classification of free-text radiology reports for pulmonary embolism (PE), their overall generalizability remains unvalidated given limitations in sample size and data homogeneity. We developed and validated a highly generalizable deep-learning based NLP algorithm for this purpose with data sourced from over 2,000 hospital sites and 500 radiologists. The algorithm achieved an AUCROC of 0.995 on chest angiography studies and 0.994 on non-angiography studies for the presence or absence of PE. The high accuracy achieved on this large and heterogeneous dataset allows for the possibility of application in large multi-center radiology practices as well as for deployment at novel sites without significant degradation in performance.

Download Full-text

Classifying abnormalities in computed tomography radiology reports with rule-based and natural language processing models

Medical Imaging 2019: Computer-Aided Diagnosis ◽

10.1117/12.2513577 ◽

2019 ◽

Author(s):

Songyue Han ◽

James Tian ◽

Mark Kelly ◽

Vignesh Selvakumaran ◽

Ricardo Henao ◽

...

Keyword(s):

Computed Tomography ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Rule Based ◽

Radiology Reports

Download Full-text

Natural Language Processing for Automated Quantification of Brain Metastases Reported in Free-Text Radiology Reports

JCO Clinical Cancer Informatics ◽

10.1200/cci.18.00138 ◽

2019 ◽

pp. 1-9 ◽

Cited By ~ 8

Author(s):

Joeky T. Senders ◽

Aditya V. Karhade ◽

David J. Cote ◽

Alireza Mehrtash ◽

Nayan Lamba ◽

...

Keyword(s):

Natural Language Processing ◽

Brain Metastases ◽

Regression Model ◽

Language Processing ◽

Text Analysis ◽

Free Text ◽

Bag Of Words ◽

Lasso Regression ◽

Radiology Reports ◽

Overall Performance

PURPOSE Although the bulk of patient-generated health data are increasing exponentially, their use is impeded because most data come in unstructured format, namely as free-text clinical reports. A variety of natural language processing (NLP) methods have emerged to automate the processing of free text ranging from statistical to deep learning–based models; however, the optimal approach for medical text analysis remains to be determined. The aim of this study was to provide a head-to-head comparison of novel NLP techniques and inform future studies about their utility for automated medical text analysis. PATIENTS AND METHODS Magnetic resonance imaging reports of patients with brain metastases treated in two tertiary centers were retrieved and manually annotated using a binary classification (single metastasis v two or more metastases). Multiple bag-of-words and sequence-based NLP models were developed and compared after randomly splitting the annotated reports into training and test sets in an 80:20 ratio. RESULTS A total of 1,479 radiology reports of patients diagnosed with brain metastases were retrieved. The least absolute shrinkage and selection operator (LASSO) regression model demonstrated the best overall performance on the hold-out test set with an area under the receiver operating characteristic curve of 0.92 (95% CI, 0.89 to 0.94), accuracy of 83% (95% CI, 80% to 87%), calibration intercept of –0.06 (95% CI, –0.14 to 0.01), and calibration slope of 1.06 (95% CI, 0.95 to 1.17). CONCLUSION Among various NLP techniques, the bag-of-words approach combined with a LASSO regression model demonstrated the best overall performance in extracting binary outcomes from free-text clinical reports. This study provides a framework for the development of machine learning-based NLP models as well as a clinical vignette of patients diagnosed with brain metastases.

Download Full-text

A Natural Language Processing Pipeline of Chinese Free-Text Radiology Reports for Liver Cancer Diagnosis

IEEE Access ◽

10.1109/access.2020.3020138 ◽

2020 ◽

Vol 8 ◽

pp. 159110-159119

Author(s):

Honglei Liu ◽

Yan Xu ◽

Zhiqiang Zhang ◽

Ni Wang ◽

Yanqun Huang ◽

...

Keyword(s):

Natural Language Processing ◽

Liver Cancer ◽

Natural Language ◽

Language Processing ◽

Cancer Diagnosis ◽

Free Text ◽

Processing Pipeline ◽

Radiology Reports

Download Full-text

Data for registry and quality review can be retrospectively collected using natural language processing from unstructured charts of arthroplasty patients

The Bone & Joint Journal ◽

10.1302/0301-620x.102b7.bjj-2019-1574.r1 ◽

2020 ◽

Vol 102-B (7_Supple_B) ◽

pp. 99-104

Author(s):

Romil F. Shah ◽

Stefano Bini ◽

Thomas Vail

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Data Extraction ◽

Test Sample ◽

Free Text ◽

Range Of Movement ◽

Random Test ◽

Automated Method ◽

Primary Arthroplasty

Aims Natural Language Processing (NLP) offers an automated method to extract data from unstructured free text fields for arthroplasty registry participation. Our objective was to investigate how accurately NLP can be used to extract structured clinical data from unstructured clinical notes when compared with manual data extraction. Methods A group of 1,000 randomly selected clinical and hospital notes from eight different surgeons were collected for patients undergoing primary arthroplasty between 2012 and 2018. In all, 19 preoperative, 17 operative, and two postoperative variables of interest were manually extracted from these notes. A NLP algorithm was created to automatically extract these variables from a training sample of these notes, and the algorithm was tested on a random test sample of notes. Performance of the NLP algorithm was measured in Statistical Analysis System (SAS) by calculating the accuracy of the variables collected, the ability of the algorithm to collect the correct information when it was indeed in the note (sensitivity), and the ability of the algorithm to not collect a certain data element when it was not in the note (specificity). Results The NLP algorithm performed well at extracting variables from unstructured data in our random test dataset (accuracy = 96.3%, sensitivity = 95.2%, and specificity = 97.4%). It performed better at extracting data that were in a structured, templated format such as range of movement (ROM) (accuracy = 98%) and implant brand (accuracy = 98%) than data that were entered with variation depending on the author of the note such as the presence of deep-vein thrombosis (DVT) (accuracy = 90%). Conclusion The NLP algorithm used in this study was able to identify a subset of variables from randomly selected unstructured notes in arthroplasty with an accuracy above 90%. For some variables, such as objective exam data, the accuracy was very high. Our findings suggest that automated algorithms using NLP can help orthopaedic practices retrospectively collect information for registries and quality improvement (QI) efforts. Cite this article: Bone Joint J 2020;102-B(7 Supple B):99–104.

Download Full-text