Classification of Noisy Free-Text Prostate Cancer Pathology Reports Using Natural Language Processing

Author(s):  
Anjani Dhrangadhariya ◽  
Sebastian Otálora ◽  
Manfredo Atzori ◽  
Henning Müller
2017 ◽  
Vol 35 (8_suppl) ◽  
pp. 232-232 ◽  
Author(s):  
Tina Hernandez-Boussard ◽  
Panagiotis Kourdis ◽  
Rajendra Dulal ◽  
Michelle Ferrari ◽  
Solomon Henry ◽  
...  

232 Background: Electronic health records (EHRs) are a widely adopted but underutilized source of data for systematic assessment of healthcare quality. Barriers to the use of this data source include its vast complexity, lack of structure, and inconsistent use of standardized vocabulary and terminology by clinicians. This project aims to develop generalizable algorithms to extract useful knowledge regarding prostate cancer quality metrics from EHRs. Methods: We used EHR ICD-9/10 codes to identify prostate cancer patients receiving care at our academic medical center. Patients were confirmed in the California Cancer Registry (CCR), which provided data on tumor characteristics, treatment, treatment outcomes, and survival. We focused on three potential pretreatment process quality measures: documentation, within 6 months prior to initial treatment, of prostate-specific antigen (PSA), digital rectal exam (DRE) performance, and Gleason score. Each quality metric was defined using target terms and concepts to extract from the EHRs. Terms were mapped to a standardized medical vocabulary or ontology, enabling us to represent the metric elements by a concept domain and its permissible values. The structured representation of the quality metric included rules that accounted for the temporal order of the metric components. Our algorithms used natural language processing for free-text annotation and negation detection, to ensure terms such as ‘DRE deferred’ are appropriately categorized. Results: We identified 2,123 patients receiving prostate cancer treatment between 2008 and 2016, of whom 1,413 (67%) were matched in the CCR. We compared the accuracy of our data mining algorithm, a random sample of manual chart review, and the CCR. (See Table.) Conclusions: EHR systems can be used to assess and report quality metrics systematically, efficiently, and with high accuracy. The development of such systems can improve quality reporting, reduce its burden, and potentially reduce the costs of measuring quality metrics through automation. [Table: see text]
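The negation handling described above (classifying ‘DRE deferred’ as a non-performed exam) can be sketched as a minimal NegEx-style cue check. The cue list, whole-sentence matching window, and function name are illustrative assumptions, not the authors' implementation.

```python
# Minimal NegEx-style negation check (illustrative sketch; the cue list and
# the whole-sentence matching window are assumptions, not the study's rules).
NEGATION_CUES = ("deferred", "not performed", "declined", "refused", "no evidence of")

def is_negated(sentence: str, term: str) -> bool:
    """Return True if `term` occurs in `sentence` alongside a negation cue."""
    s = sentence.lower()
    if term.lower() not in s:
        return False
    return any(cue in s for cue in NEGATION_CUES)

# 'DRE deferred' should not be counted as a documented DRE.
print(is_negated("DRE deferred at this visit.", "DRE"))      # True
print(is_negated("DRE performed; prostate smooth.", "DRE"))  # False
```

A production system would restrict the cue to a token window around the term and handle double negation, but the categorization logic is the same.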


2018 ◽  
pp. 1-8 ◽  
Author(s):  
Alexander P. Glaser ◽  
Brian J. Jordan ◽  
Jason Cohen ◽  
Anuj Desai ◽  
Philip Silberman ◽  
...  

Purpose Bladder cancer is initially diagnosed and staged with a transurethral resection of bladder tumor (TURBT). Patient survival is dependent on appropriate sampling of layers of the bladder, but pathology reports are dictated as free text, making large-scale data extraction for quality improvement challenging. We sought to automate extraction of stage, grade, and quality information from TURBT pathology reports using natural language processing (NLP). Methods Patients undergoing TURBT were retrospectively identified using the Northwestern Enterprise Data Warehouse. An NLP algorithm was then created to extract information from free-text pathology reports and was iteratively improved using a training set of manually reviewed TURBTs. NLP accuracy was then validated using another set of manually reviewed TURBTs, and reliability was calculated using Cohen’s κ. Results Of 3,042 TURBTs identified from 2006 to 2016, 39% were classified as benign, 35% as Ta, 11% as T1, 4% as T2, and 10% as isolated carcinoma in situ. Of 500 randomly selected manually reviewed TURBTs, NLP correctly staged 88% of specimens (κ = 0.82; 95% CI, 0.78 to 0.86). Of 272 manually reviewed T1 tumors, NLP correctly categorized grade in 100% of tumors (κ = 1), correctly categorized if muscularis propria was reported by the pathologist in 98% of tumors (κ = 0.81; 95% CI, 0.62 to 0.99), and correctly categorized if muscularis propria was present or absent in the resection specimen in 82% of tumors (κ = 0.62; 95% CI, 0.55 to 0.73). Discrepancy analysis revealed pathologist notes and deeper resection specimens as frequent reasons for NLP misclassifications. Conclusion We developed an NLP algorithm that demonstrates a high degree of reliability in extracting stage, grade, and presence of muscularis propria from TURBT pathology reports. 
Future iterations can continue to improve performance, but automated extraction of oncologic information already shows promise for improving quality and assisting physicians in the delivery of care.
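Cohen's κ, used above to quantify agreement between the NLP output and manual review, can be computed directly from the two label sequences. This is a generic sketch, not the study's evaluation code; the example labels are hypothetical.

```python
def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two label sequences
    (e.g., NLP-assigned vs. manually reviewed stage)."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    labels = set(rater_a) | set(rater_b)
    # Observed agreement: fraction of cases where the raters match.
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Expected agreement by chance, from each rater's label frequencies.
    expected = sum((rater_a.count(l) / n) * (rater_b.count(l) / n) for l in labels)
    return (observed - expected) / (1 - expected)

nlp    = ["Ta", "T1", "Ta", "T2"]  # hypothetical NLP stage calls
manual = ["Ta", "T1", "Ta", "Ta"]  # hypothetical manual review
print(round(cohens_kappa(nlp, manual), 3))  # 0.556
```

The chance correction is why κ (0.82 for staging here) is a stricter measure than raw percent agreement (88%).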


2020 ◽  
Vol 6 (4) ◽  
pp. 192-198
Author(s):  
Joeky T Senders ◽  
David J Cote ◽  
Alireza Mehrtash ◽  
Robert Wiemann ◽  
William B Gormley ◽  
...  

Introduction: Although clinically derived information could improve patient care, its full potential remains unrealised because most of it is stored in free-text clinical reports, a format unsuitable for traditional methods of analysis. Various studies have already demonstrated the utility of natural language processing algorithms for medical text analysis, yet evidence on their learning efficiency is still lacking. This study aimed to compare the learning curves of various algorithms and develop an open-source framework for text mining in healthcare. Methods: Deep learning and regression-based models were developed to determine the histopathological diagnosis of patients with brain tumour based on free-text pathology reports. For each model, we characterised the learning curve and the minimal number of training examples required to reach area under the curve (AUC) performance thresholds of 0.95 and 0.98. Results: In total, we retrieved 7000 reports on 5242 patients with brain tumour (2316 with glioma, 1412 with meningioma and 1514 with cerebral metastasis). Conventional regression-based models required 200–400 and 800–1500 training examples to reach the AUC thresholds of 0.95 and 0.98, respectively. The deep learning architecture utilised in the current study required 100 and 200 examples, respectively, corresponding to a learning capacity that is two to eight times more efficient. Conclusions: This open-source framework enables the development of high-performing and fast-learning natural language processing models. The steep learning curve can be valuable for contexts with limited training examples (eg, rare diseases and events or institutions with lower patient volumes). The resultant models could accelerate retrospective chart review, assemble clinical registries and facilitate a rapid learning healthcare system.
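The learning-curve comparison above reduces to finding the smallest training-set size at which held-out performance crosses a threshold. A generic sketch, assuming a caller-supplied `fit_and_score` routine (hypothetical) that trains a model on the first `n` reports and returns its AUC on a fixed held-out set:

```python
from typing import Callable, Iterable, Optional

def examples_to_threshold(train_sizes: Iterable[int],
                          fit_and_score: Callable[[int], float],
                          threshold: float) -> Optional[int]:
    """Smallest training size whose held-out AUC reaches `threshold`.

    `fit_and_score(n)` is assumed to train on the first n examples and
    return held-out AUC; None is returned if no size reaches the bar.
    """
    for n in sorted(train_sizes):
        if fit_and_score(n) >= threshold:
            return n
    return None

# Toy example with a made-up, monotonically improving score function.
toy_auc = lambda n: min(0.99, 0.80 + n / 2000)
print(examples_to_threshold([100, 200, 400, 800, 1500], toy_auc, 0.95))  # 400
```

Running this for each model family (regression vs. deep learning) and each threshold (0.95, 0.98) yields exactly the efficiency comparison the abstract reports.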


2019 ◽  
Vol 5 (suppl) ◽  
pp. 49-49
Author(s):  
Christi French ◽  
Dax Kurbegov ◽  
David R. Spigel ◽  
Maciek Makowski ◽  
Samantha Terker ◽  
...  

49 Background: Pulmonary nodule incidental findings challenge providers to balance resource efficiency and high clinical quality. Incidental findings tend to be under-evaluated, with studies reporting appropriate follow-up rates as low as 29%. The efficient identification of patients with high-risk nodules is foundational to ensuring appropriate follow-up and requires the clinical reading and classification of radiology reports. We tested the feasibility of automating this process with natural language processing (NLP) and machine learning (ML). Methods: In cooperation with Sarah Cannon, the Cancer Institute of HCA Healthcare, we conducted a series of experiments on 8,879 free-text, narrative CT radiology reports. A representative sample of health system emergency department, inpatient, and outpatient reports dated from December 2015 to April 2017 was divided into a development set for model training and validation, and a test set to evaluate model performance. A “Nodule Model” was trained to detect the reported presence of a pulmonary nodule, and a rules-based “Size Model” was developed to extract the size of the nodule in mm. Reports were bucketed into three prediction groups: ≥ 6 mm, < 6 mm, and no size indicated. Nodules were placed in a queue for follow-up if the nodule was predicted ≥ 6 mm, or if the nodule had no size indicated and the report contained the word “mass.” The Fleischner Society Guidelines and clinical review informed these definitions. Results: Precision and recall metrics were calculated for multiple model thresholds. A threshold was selected based on the validation set calculations, with a success criterion of 90% queue precision chosen to minimize false positives. On the test dataset, the F1 measure of the entire pipeline was 72.9%, recall was 60.3%, and queue precision was 90.2%, exceeding the success criterion.
Conclusions: The experiments demonstrate the feasibility of technology to automate the detection and classification of pulmonary nodule incidental findings in radiology reports. This approach promises to improve healthcare quality by increasing the rate of appropriate lung nodule incidental finding follow-up and treatment without excessive labor or risking overutilization.
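The rules-based “Size Model” and the triage buckets described above can be sketched as follows. The regular expression and the `mass` cue are illustrative assumptions that mirror the described bucketing, not the study's production rules.

```python
import re

# Illustrative size extraction; the regex and the 'mass' fallback follow the
# bucketing described above but are not the study's actual Size Model.
SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(mm|cm)\b", re.IGNORECASE)

def largest_size_mm(report: str):
    """Largest reported size in millimetres, or None if no size is given."""
    sizes = [float(value) * (10.0 if unit.lower() == "cm" else 1.0)
             for value, unit in SIZE_RE.findall(report)]
    return max(sizes) if sizes else None

def triage(report: str) -> str:
    """Queue for follow-up if >= 6 mm, or if sizeless but mentioning 'mass'."""
    size = largest_size_mm(report)
    if size is None:
        return "follow-up" if "mass" in report.lower() else "no size indicated"
    return "follow-up" if size >= 6.0 else "< 6 mm"

print(triage("8 mm pulmonary nodule in the right upper lobe"))  # follow-up
print(triage("3 mm nodule, likely benign"))                     # < 6 mm
```

In the actual pipeline the size rules run only on reports the ML “Nodule Model” has already flagged as nodule-positive.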


Information ◽  
2021 ◽  
Vol 12 (11) ◽  
pp. 451
Author(s):  
Okechinyere J. Achilonu ◽  
Victor Olago ◽  
Elvira Singh ◽  
René M. J. C. Eijkemans ◽  
Gideon Nimako ◽  
...  

A cancer pathology report is a valuable medical document that provides information for clinical management of the patient and evaluation of health care. However, there are variations in the quality of reporting in free-text style formats, ranging from comprehensive to incomplete reporting. Moreover, the increasing incidence of cancer has generated a high throughput of pathology reports. Hence, manual extraction and classification of information from these reports can be intrinsically complex and resource-intensive. This study aimed to (i) evaluate the quality of over 80,000 breast, colorectal, and prostate cancer free-text pathology reports and (ii) assess the effectiveness of random forest (RF) and variants of support vector machine (SVM) in the classification of reports into benign and malignant classes. The study approach comprises data preprocessing, visualisation, feature selection, text classification, and evaluation of performance metrics. The performance of the classifiers was evaluated across various feature sizes, which were jointly selected by four filter feature selection methods. The feature selection methods identified established clinical terms, which are synonymous with each of the three cancers. Uni-gram tokenisation using the classifiers showed that the predictive power of the RF model was consistent across various feature sizes, with overall F-scores of 95.2%, 94.0%, and 95.3% for breast, colorectal, and prostate cancer classification, respectively. The radial SVM achieved better classification performance compared with its linear variant for most of the feature sizes. The classifiers also achieved high precision, recall, and accuracy. This study supports a nationally agreed standard in pathology reporting and the use of text mining for encoding, classifying, and producing high-quality information abstractions for cancer prognosis and research.
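The uni-gram tokenisation step above can be sketched as a simple bag-of-words builder. This covers the tokenisation stage only; whitespace splitting and frequency-ranked vocabulary truncation are simplifying assumptions, and the filter feature selection and RF/SVM classifiers (e.g., via scikit-learn) are omitted.

```python
from collections import Counter

def unigram_vectors(reports, vocab_size=1000):
    """Build a uni-gram vocabulary and per-report term-count vectors.

    Sketch only: real pipelines add stop-word removal, filter feature
    selection, and feed the vectors to an RF or SVM classifier.
    """
    # Rank tokens by corpus frequency and keep the top `vocab_size`.
    counts = Counter(tok for r in reports for tok in r.lower().split())
    vocab = [w for w, _ in counts.most_common(vocab_size)]
    index = {w: i for i, w in enumerate(vocab)}
    # One count vector per report, aligned to the vocabulary.
    vectors = []
    for r in reports:
        v = [0] * len(vocab)
        for tok in r.lower().split():
            if tok in index:
                v[index[tok]] += 1
        vectors.append(v)
    return vocab, vectors

reports = ["invasive ductal carcinoma",        # hypothetical snippets
           "benign breast tissue",
           "ductal carcinoma in situ"]
vocab, X = unigram_vectors(reports)
print("carcinoma" in vocab, X[0][vocab.index("carcinoma")])  # True 1
```

The resulting count matrix is exactly the kind of input over which filter methods score and select the clinically meaningful uni-grams the study reports.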

