A fast, accurate, and generalisable heuristic-based negation detection algorithm for clinical text

2021 ◽  
Vol 130 ◽  
pp. 104216
Author(s):  
Luke T. Slater ◽  
William Bradlow ◽  
Dino FA Motti ◽  
Robert Hoehndorf ◽  
Simon Ball ◽  
...  

Abstract
Background: Negation detection is an important task in biomedical text mining. Particularly in clinical settings, it is of critical importance to determine whether findings mentioned in text are present or absent. Rule-based negation detection algorithms are a common approach to the task, and more recent investigations have resulted in the development of rule-based systems utilising the rich grammatical information afforded by typed dependency graphs. However, interacting with these complex representations inevitably necessitates complex rules, which are time-consuming to develop and do not generalise well. We hypothesise that a heuristic approach to determining negation via dependency graphs could offer a powerful alternative.
Results: We describe and implement an algorithm for negation detection based on grammatical distance from a negatory construct in a typed dependency graph. To evaluate the algorithm, we develop two testing corpora comprising sentences of clinical text extracted from the MIMIC-III database and documents relating to hypertrophic cardiomyopathy (HCM) patients routinely collected at University Hospitals Birmingham NHS Trust. Gold-standard validation datasets were built through a combination of human annotation and examination of algorithm errors. Finally, we compare the performance of our approach with that of four other rule-based algorithms on both gold-standard corpora.
Conclusions: The presented algorithm exhibits the best performance by F-measure over the MIMIC-III dataset, and performance similar to that of the syntactic negation detection systems over the HCM dataset. It is also the fastest of the dependency-based negation systems explored in this study. Our results show that while a single-heuristic approach to dependency-based negation detection is blind to certain advanced cases, it nevertheless forms a powerful and stable method, requiring minimal training and adaptation between datasets. As such, it could serve as a drop-in replacement or augmentation for many-rule negation approaches in clinical text-mining pipelines, particularly where adaptation and rule development are not required or not possible.
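The core of such a heuristic, marking a concept as negated when it sits close to a negatory construct in the dependency graph, is simple enough to sketch. Below is a minimal Python illustration using spaCy and networkx; the trigger words, the undirected treatment of dependency edges, the single-token concept matching, and the distance threshold of 4 are assumptions for demonstration, not the authors' exact parameters.

```python
# Minimal sketch of distance-based negation detection over a typed dependency
# graph. Trigger list, threshold, and single-token concepts are illustrative.
import networkx as nx
import spacy

nlp = spacy.load("en_core_web_sm")

def is_negated(sentence: str, concept: str, max_distance: int = 4) -> bool:
    """True if a single-token `concept` lies within `max_distance` edges of a
    negatory construct in the sentence's dependency graph."""
    doc = nlp(sentence)
    # Treat the dependency tree as an undirected graph over token indices.
    graph = nx.Graph((tok.i, tok.head.i) for tok in doc if tok.i != tok.head.i)
    negators = [tok.i for tok in doc
                if tok.dep_ == "neg" or tok.lower_ in {"no", "without", "denies"}]
    targets = [tok.i for tok in doc if tok.lower_ == concept.lower()]
    return any(
        graph.has_node(n) and graph.has_node(t)
        and nx.shortest_path_length(graph, n, t) <= max_distance
        for n in negators for t in targets
    )

print(is_negated("There is no evidence of pneumothorax.", "pneumothorax"))  # True
print(is_negated("The patient has pneumonia.", "pneumonia"))                # False
```

Because the decision reduces to a single shortest-path query per concept, the approach needs no rule inventory beyond the trigger list, which is consistent with the speed and portability findings reported above.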


2017 ◽  
Vol 13 (4) ◽  
Author(s):  
J. Manimaran ◽  
T. Velmurugan

Abstract
Background: Clinical Text Analysis and Knowledge Extraction System (cTAKES) is an open-source natural language processing (NLP) system. In recent cTAKES modules, a negation detection (ND) algorithm is used to improve annotation capabilities and simplify the automatic identification of negative contexts in large clinical documents. In this research, two types of ND algorithms, lexicon-based and syntax-based, are analysed using a database made openly available by the National Center for Biomedical Computing. The aim of this analysis is to identify the pros and cons of these algorithms.
Methods: Patient medical reports were collected from the three institutions included in the 2010 i2b2/VA Clinical NLP Challenge; these form the input data for this analysis. The database includes patient discharge summaries and progress notes. The patient data are fed into five ND algorithms: NegEx, ConText, pyConTextNLP, DEEPEN, and Negation Resolution (NR). NegEx, ConText, and pyConTextNLP are lexicon-based, whereas DEEPEN and NR are syntax-based. The results from these five ND algorithms are post-processed and compared with the annotated data. Finally, the performance of the ND algorithms is evaluated by computing standard measures, including F-measure, kappa statistics, and ROC, as well as the execution time of each algorithm.
Results: The research is tested through practical implementation, evaluating each algorithm's accuracy and computation time in order to identify a robust and reliable ND algorithm.
Conclusions: The performance of the chosen ND algorithms is analysed on the basis of the results produced by this research approach. The time and accuracy of each algorithm are calculated and compared in order to suggest the best method.
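For contrast with the syntax-based systems, the lexicon-based idea behind NegEx-style algorithms can be illustrated in a few lines of Python: a concept is flagged as negated when a trigger term precedes it within a fixed token window. The four triggers and five-token window below are illustrative assumptions; real NegEx lexicons contain hundreds of (often multi-word) triggers and also handle post-negation and pseudo-negation phrases.

```python
import re

# Illustrative single-word pre-negation triggers; real NegEx lexicons are far
# larger and include multi-word phrases such as "no evidence of".
TRIGGERS = {"no", "denies", "without", "negative"}
WINDOW = 5  # max tokens allowed between trigger and concept (assumed value)

def negex_style_negated(sentence: str, concept: str) -> bool:
    """Flag `concept` as negated if a trigger precedes it within WINDOW tokens."""
    tokens = re.findall(r"\w+", sentence.lower())
    first = concept.lower().split()[0]
    for pos, tok in enumerate(tokens):
        if tok == first and TRIGGERS & set(tokens[max(0, pos - WINDOW):pos]):
            return True
    return False

print(negex_style_negated("Patient denies chest pain.", "chest pain"))  # True
print(negex_style_negated("Chest pain on exertion.", "chest pain"))     # False
```

Syntax-based systems such as DEEPEN refine exactly the cases this window heuristic gets wrong, for example when an intervening conjunction breaks the scope of the trigger.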


2020 ◽  
Vol 27 (4) ◽  
pp. 584-591 ◽  
Author(s):  
Chen Lin ◽  
Steven Bethard ◽  
Dmitriy Dligach ◽  
Farig Sadeque ◽  
Guergana Savova ◽  
...  

Abstract
Introduction: Classifying whether concepts in unstructured clinical text are negated is an important unsolved task. New domain adaptation and transfer learning methods can potentially address this issue.
Objective: We examine neural unsupervised domain adaptation methods, introducing a novel combination of domain adaptation with transformer-based transfer learning methods to improve negation detection. We also want to better understand the interaction between the widely used bidirectional encoder representations from transformers (BERT) system and domain adaptation methods.
Materials and Methods: We use 4 clinical text datasets that are annotated with negation status. We evaluate a neural unsupervised domain adaptation algorithm and BERT, a transformer-based model pretrained on massive general-text datasets. We develop an extension to BERT that uses domain-adversarial training, a neural domain adaptation method that adds to the negation task an auxiliary objective: the classifier should not be able to distinguish between instances from 2 different domains.
Results: The domain adaptation methods we describe show positive results, but, on average, the best performance is obtained by plain BERT (without the extension). We provide evidence that the gains from BERT are likely not additive with the gains from domain adaptation.
Discussion: Our results suggest that, at least for the task of clinical negation detection, BERT subsumes domain adaptation, implying that BERT already learns representations of negation phenomena general enough that fine-tuning even on a specific corpus does not lead to much overfitting.
Conclusion: Despite being trained on nonclinical text, models like BERT, by virtue of their large training sets, yield large gains in performance on the clinical negation detection task.
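Domain-adversarial training of the kind described here is commonly implemented with a gradient reversal layer. The sketch below (PyTorch with the Hugging Face transformers library) shows the general pattern; the model name, head sizes, and loss weighting lambda are illustrative choices, not the paper's exact configuration.

```python
# Sketch of a BERT encoder with a gradient-reversal domain head (DANN-style).
# Hyperparameters and the base checkpoint are illustrative assumptions.
import torch
from torch import nn
from transformers import AutoModel

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; negates and scales gradients on backward."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class DannNegationModel(nn.Module):
    def __init__(self, model_name: str = "bert-base-uncased", lam: float = 0.1):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        hidden = self.encoder.config.hidden_size
        self.lam = lam
        self.negation_head = nn.Linear(hidden, 2)  # negated vs. not negated
        self.domain_head = nn.Linear(hidden, 2)    # source vs. target domain

    def forward(self, input_ids, attention_mask):
        cls = self.encoder(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state[:, 0]
        negation_logits = self.negation_head(cls)
        # Reversed gradients push the encoder toward domain-invariant features.
        domain_logits = self.domain_head(GradReverse.apply(cls, self.lam))
        return negation_logits, domain_logits

# Training would sum a cross-entropy loss on negation labels (source domain
# only) with a cross-entropy loss on domain labels (both domains, which need
# no negation annotation), which is what makes the adaptation unsupervised.
```

The finding that plain BERT outperforms this extension on average suggests the adversarial objective buys little once the encoder's pretrained representations are already domain-robust.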


2015 ◽  
Author(s):  
Chaitanya Shivade ◽  
Marie-Catherine de Marneffe ◽  
Eric Fosler-Lussier ◽  
Albert M. Lai

2015 ◽  
Vol 54 ◽  
pp. 213-219 ◽  
Author(s):  
Saeed Mehrabi ◽  
Anand Krishnan ◽  
Sunghwan Sohn ◽  
Alexandra M. Roch ◽  
Heidi Schmidt ◽  
...  

Author(s):  
Mike Conway ◽  
Howard Burkom ◽  
Amy Ising

Objective
This abstract describes an ISDS initiative to bring together public health practitioners and analytics solution developers from both academia and industry to define a roadmap for the development of algorithms, tools, and datasets that improve the ability of current text processing algorithms to identify negated terms (i.e. negation detection).

Introduction
Despite considerable effort since the turn of the century to develop Natural Language Processing (NLP) methods and tools for detecting negated terms in chief complaints, few standardised methods have emerged. Those that have (e.g. the NegEx algorithm [1]) are confined to local implementations with customised solutions. Important reasons for this lack of progress include (a) limited shareable datasets for developing and testing methods, (b) jurisdictional data silos, and (c) the gap between resource-constrained public health practitioners and technical solution developers, typically university researchers and industry developers. To address these three problems, ISDS, funded by a grant from the Defense Threat Reduction Agency, organised a consultancy meeting at the University of Utah designed to bring together (a) representatives from public health departments, (b) university researchers focused on the development of computational methods for public health surveillance, (c) members of public-health-oriented non-governmental organisations, and (d) industry representatives, with the goal of developing a roadmap towards validated, standardised, and portable resources (methods and datasets) for negation detection in the clinical text used for public health surveillance.

Methods
Free-text chief complaints remain a vital resource for syndromic surveillance. However, the widespread adoption of Electronic Health Records (and federal Meaningful Use requirements) has changed the syndromic surveillance practice ecosystem. These changes include the widespread use of EHR-generated chief complaint "pick lists" (pre-defined chief complaints selected by the user rather than text typed at a keyboard), templated triage-note text, and free-text triage notes (typically much more comprehensive than traditional chief complaints). A key requirement for a negation detection algorithm is the ability to process these new and challenging data streams successfully and accurately. Preparations for the consultancy included an email thread and a shared website for published articles and data samples, leading to a structured pre-consultancy call designed to inform participants of the purpose of the consultancy and to align expectations. Health department users were then asked to provide data samples exemplifying negation issues in the classification process, and presenting developers were asked to explain their underlying ideas, the details of their method implementations, the size and composition of the corpora used for evaluation, and their classification performance results.

Results
The consultancy was held on January 19-20, 2017 at the University of Utah's Department of Biomedical Informatics and comprised 25 participants, drawn from ISDS (2), the Defense Threat Reduction Agency (1), universities and research institutes (10), public health departments (5), the Department of Veterans Affairs (4), non-profit organisations (2), and technology firms (1). Professional backgrounds included research scientists, software developers, public health executives, epidemiologists, and analysts. Day 1 of the consultancy was devoted to an overview of NLP and current trends in negation detection, including a detailed description of widely used algorithms and tools for the task. Key questions included: Should our focus be chief complaints only, or should we widen our scope to emergency department triage notes? How many other NLP tasks (e.g. reliable concept recognition) must be addressed on the road to improved negation detection? With this background established, Day 2 centred on presentations from five United States local and regional health departments (King County WA, Boston MA, North Carolina, Georgia, and Tennessee) on the approaches to text processing and negation detection used across their jurisdictions. Several key areas of focus emerged from the discussion. First, there is a clear need for a large, easily accessible corpus of free-text chief complaints that can form a standardised testbed for negation detection algorithm development and evaluation. Annotated data, in this context, consists of chief complaints annotated for concepts (e.g. vomiting, pain in chest) and the negation status of those concepts. The annotation must include both the clinical concepts and their negation status so that candidate negation detection algorithms can be evaluated and compared uniformly; one possible shape for such a record is sketched below. Further, the annotated corpus should consist of several thousand (as opposed to several hundred) distinct and representative chief complaints, so that algorithms are compared against a sufficient variety and volume of negation patterns.
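To make the proposed annotation concrete, the Python sketch below shows one plausible record format for a single annotated chief complaint. The field names and the [start, end) character-span convention are hypothetical, since the consultancy did not prescribe a schema.

```python
# Hypothetical record for one annotated chief complaint. Field names and the
# [start, end) character-span convention are illustrative assumptions only.
record = {
    "chief_complaint": "denies chest pain, c/o nausea and vomiting",
    "annotations": [
        {"concept": "chest pain", "span": [7, 17], "negated": True},
        {"concept": "nausea", "span": [23, 29], "negated": False},
        {"concept": "vomiting", "span": [34, 42], "negated": False},
    ],
}
```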
Conclusions
The consultancy was stimulating and eye-opening for both the public health practitioners and the technology developers who attended. Developers unfamiliar with the everyday health-monitoring context gained an appreciation of the difficulty of deriving useful indicators from chief complaints; also highlighted was the challenge of processing triage notes and other free-text fields that often go unused for surveillance purposes. Practitioners received concise explanations and evaluations of recent NLP approaches applicable to negation processing. The event afforded the direct dialogue that communication across professional cultures requires. A journal paper describing the consultancy has been published in the Online Journal of Public Health Informatics [2].

References
[1] Chapman W, Bridewell W, Hanbury P, Cooper G, Buchanan B. A simple algorithm for identifying negated findings and diseases in discharge summaries. J Biomed Inform. 2001;34(5):301-310.
[2] Conway M, Mowery D, Ising A, Velupillai S, Doan S, Gunn J, Donovan M, Wiedeman C, Ballester L, Soetebier K, Tong C, Burkom H. Cross-disciplinary consultancy to bridge public health technical needs and analytic developers: negation detection use case. Online Journal of Public Health Informatics. 2018;10(2).

