Development of a Natural Language Processing Algorithm for the Classification of Suspicious Liver Lesions from Radiology Reports

Natural Language ◽

Language Processing ◽

Processing Algorithm ◽

Liver Lesions ◽

Radiology Reports ◽

Here, we developed and validated a highly generalizable natural language processing algorithm based on deep learning. Our algorithm was trained and tested on a highly diverse dataset from over 2,000 hospital sites and 500 radiologists. The resulting algorithm achieved an AUROC of 0.96 for the presence or absence of liver lesions, with a specificity of 0.99 and a sensitivity of 0.6.
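The three metrics reported here measure different things: AUROC summarizes how well the scores rank positive reports above negative ones across all thresholds, while sensitivity and specificity describe a single operating point. The following is a minimal Python sketch of how these quantities are computed. The helper names `auroc` and `sens_spec`, the threshold of 0.5, and the toy labels and scores are illustrative assumptions, not the study's code or data.

```python
from typing import Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a
    random negative (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sens_spec(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Sensitivity and specificity when scores >= threshold are called positive."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data (hypothetical): 1 = lesion present, 0 = lesion absent.
toy_labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.95, 0.90, 0.80, 0.40, 0.30, 0.35, 0.20, 0.15, 0.10, 0.05]

toy_auroc = auroc(toy_labels, toy_scores)
toy_sens, toy_spec = sens_spec(toy_labels, toy_scores, threshold=0.5)
print(toy_auroc, toy_sens, toy_spec)
```

On this toy data the ranking is nearly perfect (24 of 25 positive-negative pairs are ordered correctly), yet two of the five positives fall below the threshold. This illustrates how a high AUROC can coexist with a modest sensitivity when the operating point favors specificity.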

Automated Detection of Measurements and Their Descriptors in Radiology Reports Using a Hybrid Natural Language Processing Algorithm

Journal of Digital Imaging ◽

10.1007/s10278-019-00237-9 ◽

2019 ◽

Vol 32 (4) ◽

pp. 544-553 ◽

Cited By ~ 5

Author(s):

Selen Bozkurt ◽

Emel Alkim ◽

Imon Banerjee ◽

Daniel L. Rubin

Keyword(s):

Natural Language ◽

Language Processing ◽

Automated Detection ◽

Processing Algorithm ◽

Radiology Reports ◽

A Highly Generalizable Natural Language Processing Algorithm for the Diagnosis of Pulmonary Embolism from Radiology Reports

10.1101/2020.10.13.20211961 ◽

2020 ◽

Author(s):

Jacob Johnson ◽

Grace Qiu ◽

Christine Lamoureux ◽

Jennifer Ngo ◽

Lawrence Ngo

Keyword(s):

Pulmonary Embolism ◽

Deep Learning ◽

Sample Size ◽

Language Processing ◽

High Accuracy ◽

Free Text ◽

Radiology Reports ◽

Abstract: Though sophisticated algorithms have been developed for the classification of free-text radiology reports for pulmonary embolism (PE), their overall generalizability remains unvalidated given limitations in sample size and data homogeneity. We developed and validated a highly generalizable deep learning-based NLP algorithm for this purpose with data sourced from over 2,000 hospital sites and 500 radiologists. The algorithm achieved an AUROC of 0.995 on chest angiography studies and 0.994 on non-angiography studies for the presence or absence of PE. The high accuracy achieved on this large and heterogeneous dataset allows for the possibility of application in large multi-center radiology practices as well as for deployment at novel sites without significant degradation in performance.

Developing A Deep Learning Natural Language Processing Algorithm For Automated Reporting Of Adverse Drug Reactions

10.1101/2021.12.11.21267504 ◽

2021 ◽

Author(s):

Christopher McMaster ◽

Julia Chan ◽

David FL Liew ◽

Elizabeth Su ◽

Albert G Frauman ◽

...

Keyword(s):

Deep Learning ◽

Adverse Events ◽

Natural Language ◽

Adverse Drug Reactions ◽

Language Processing ◽

Processing Algorithm ◽

Drug Reactions ◽

Discharge Summaries ◽

The detection of adverse drug reactions (ADRs) is critical to our understanding of the safety and risk-benefit profile of medications. With an incidence that has not changed over the last 30 years, ADRs are a significant source of patient morbidity, responsible for 5-10% of acute care hospital admissions worldwide. Spontaneous reporting of ADRs has long been the standard method of reporting; however, this approach is known to have high rates of under-reporting, a problem that limits pharmacovigilance efforts. Automated ADR reporting presents an alternative pathway to increase reporting rates, although this may be limited by over-reporting of other drug-related adverse events. We developed a deep learning natural language processing algorithm to identify ADRs in discharge summaries at a single academic hospital centre. Our model was developed in two stages: first, a pre-trained model (DeBERTa) was further pre-trained on 150,000 unlabelled discharge summaries; second, this model was fine-tuned to detect ADR mentions in a corpus of 861 annotated discharge summaries. To ensure that our algorithm could differentiate ADRs from other drug-related adverse events, the annotated corpus was enriched for both validated ADR reports and confounding drug-related adverse events using. The final model demonstrated good performance with a ROC-AUC of 0.934 (95% CI 0.931 - 0.955) for the task of identifying discharge summaries containing ADR mentions.

Entity Extraction of Electrical Equipment Malfunction Text by a Hybrid Natural Language Processing Algorithm

IEEE Access ◽

10.1109/access.2021.3063354 ◽

2021 ◽

Vol 9 ◽

pp. 40216-40226

Author(s):

Zhe Kong ◽

Changxi Yue ◽

Ying Shi ◽

Jicheng Yu ◽

Changjun Xie ◽

...

Keyword(s):

Natural Language ◽

Language Processing ◽

Electrical Equipment ◽

Processing Algorithm ◽

Entity Extraction ◽

BERT for the Processing of Radiological Reports: An Attention-based Natural Language Processing Algorithm

Academic Radiology ◽

10.1016/j.acra.2021.03.036 ◽

2021 ◽

Author(s):

Shelly Soffer ◽

Benjamin S. Glicksberg ◽

Eyal Zimlichman ◽

Eyal Klang

Keyword(s):

Natural Language ◽

Language Processing ◽

Processing Algorithm ◽

Development and Validation of a Natural Language Processing Algorithm to Extract Descriptors of Microbial Keratitis From the Electronic Health Record

Cornea ◽

10.1097/ico.0000000000002755 ◽

2021 ◽

Vol Publish Ahead of Print ◽

Author(s):

Maria A. Woodward ◽

Nenita Maganti ◽

Leslie M. Niziol ◽

Sejal Amin ◽

Andrew Hou ◽

...

Keyword(s):

Natural Language ◽

Electronic Health Record ◽

Language Processing ◽

Processing Algorithm ◽

Health Record ◽

Microbial Keratitis ◽

Electronic Health ◽

Development And Validation ◽

Natural Language Processing for Surveillance of Cervical and Anal Cancer and Precancer: Algorithm Development and Split-Validation Study (Preprint)

10.2196/preprints.20826 ◽

2020 ◽

Author(s):

Carlos R Oliveira ◽

Patrick Niccolai ◽

Anette Michelle Ortiz ◽

Sangini S Sheth ◽

Eugene D Shapiro ◽

...

Keyword(s):

Human Papillomavirus ◽

Natural Language ◽

Language Processing ◽

Medical Records ◽

Processing Algorithm ◽

Accurate Identification ◽

Pathology Reports ◽

Manual Review ◽

BACKGROUND Accurate identification of new diagnoses of human papillomavirus–associated cancers and precancers is an important step toward the development of strategies that optimize the use of human papillomavirus vaccines. The diagnosis of human papillomavirus cancers hinges on a histopathologic report, which is typically stored in electronic medical records as free-form, or unstructured, narrative text. Previous efforts to perform surveillance for human papillomavirus cancers have relied on the manual review of pathology reports to extract diagnostic information, a process that is both labor- and resource-intensive. Natural language processing can be used to automate the structuring and extraction of clinical data from unstructured narrative text in medical records and may provide a practical and effective method for identifying patients with vaccine-preventable human papillomavirus disease for surveillance and research.
OBJECTIVE This study's objective was to develop and assess the accuracy of a natural language processing algorithm for the identification of individuals with cancer or precancer of the cervix and anus.
METHODS A pipeline-based natural language processing algorithm was developed, which incorporated machine learning and rule-based methods to extract diagnostic elements from the narrative pathology reports. To test the algorithm’s classification accuracy, we used a split-validation study design. Full-length cervical and anal pathology reports were randomly selected from 4 clinical pathology laboratories. Two study team members, blinded to the classifications produced by the natural language processing algorithm, manually and independently reviewed all reports and classified them at the document level according to 2 domains (diagnosis and human papillomavirus testing results). Using the manual review as the gold standard, the algorithm’s performance was evaluated using standard measurements of accuracy, recall, precision, and F-measure.
RESULTS The natural language processing algorithm’s performance was validated on 949 pathology reports. The algorithm demonstrated accurate identification of abnormal cytology, histology, and positive human papillomavirus tests with accuracies greater than 0.91. Precision was lowest for anal histology reports (0.87, 95% CI 0.59-0.98) and highest for cervical cytology (0.98, 95% CI 0.95-0.99). The natural language processing algorithm missed 2 out of the 15 abnormal anal histology reports, which led to a relatively low recall (0.68, 95% CI 0.43-0.87).
CONCLUSIONS This study outlines the development and validation of a freely available and easily implementable natural language processing algorithm that can automate the extraction and classification of clinical data from cervical and anal cytology and histology.
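The evaluation described in this abstract scores document-level classifications against manual review as the gold standard. A minimal Python sketch of the four named measures for a single binary label follows. The function name `evaluate` and the toy labels are illustrative assumptions and do not reproduce the study's pipeline or data.

```python
from typing import Dict, Sequence


def evaluate(gold: Sequence[int], predicted: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F-measure (F1) for binary labels,
    treating the manually reviewed `gold` labels as ground truth."""
    if len(gold) != len(predicted) or not gold:
        raise ValueError("gold and predicted must be non-empty and equal length")

    tp = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, predicted) if g == 1 and p == 0)
    tn = sum(1 for g, p in zip(gold, predicted) if g == 0 and p == 0)

    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(gold)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Toy example (hypothetical): 1 = abnormal report, 0 = normal report.
toy_gold = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
toy_metrics = evaluate(toy_gold, toy_pred)
print(toy_metrics)
```

Recall depends only on how many gold-standard positives are found, so with a small number of abnormal reports each missed report moves the estimate substantially. This is consistent with the wide confidence intervals reported for anal histology.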

Understanding Legal Documents: Classification of Rhetorical Role of Sentences Using Deep Learning and Natural Language Processing

2020 IEEE 14th International Conference on Semantic Computing (ICSC) ◽

10.1109/icsc.2020.00089 ◽

2020 ◽

Author(s):

Syed Rameel Ahmad ◽

Deborah Harris ◽

Ibrahim Sahibzada

Keyword(s):

Deep Learning ◽

Natural Language ◽

Language Processing ◽

Legal Documents

279 – Validation of a Natural Language Processing Algorithm to Identify Colonic Adenomas Across a Health System

Gastroenterology ◽

10.1016/s0016-5085(19)36923-9 ◽

2019 ◽

Vol 156 (6) ◽

pp. S-56

Author(s):

David G. Morgan ◽

Kathy Chorneyko ◽

Deepak Swain ◽

Barbara Bowes ◽

Vicki Lee ◽

...

Keyword(s):

Natural Language ◽

Health System ◽

Language Processing ◽

Processing Algorithm ◽

Colonic Adenomas ◽

SentiMental: An emotional profiling algorithm for identifying affect patterns in text

10.31219/osf.io/cun5x ◽

2018 ◽

Author(s):

Massimo Stella

Keyword(s):

Natural Language ◽

Language Processing ◽

Processing Algorithm ◽

The Novel ◽

Novel Approach ◽

Technical Report ◽

Potential Applications ◽

This technical report outlines the mechanisms and potential applications of SentiMental, a suite of natural language processing algorithms designed and implemented by Massimo Stella, Complex Science Consulting. The following technical report briefly outlines the novel approach of SentiMental in performing sentiment and emotional analysis by directly harnessing the whole structure of the mental lexicon rather than by using affect norms. Furthermore, this technical report outlines the direct emotional profiling and the visualisations currently implemented in version 0.1 of SentiMental. Features under development and current limitations are also outlined and discussed. This technical report is not meant as a publication. The author holds full copyright and any reproduction of parts of this report must be authorised by the copyright holder. SentiMental represents a work in progress, so do not hesitate to get in touch with the author for any potential feedback.