Improved Identification of Noun Phrases in Clinical Radiology Reports Using a High-Performance Statistical Natural Language Parser Augmented with the UMLS Specialist Lexicon

2005 ◽  
Vol 12 (3) ◽  
pp. 275-285 ◽  
Author(s):  
Y. Huang
Radiology ◽  
2018 ◽  
Vol 287 (2) ◽  
pp. 570-580 ◽  
Author(s):  
John Zech ◽  
Margaret Pain ◽  
Joseph Titano ◽  
Marcus Badgeley ◽  
Javin Schefflein ◽  
...  

2011 ◽  
Vol 37 (4) ◽  
pp. 753-809 ◽  
Author(s):  
David Vadas ◽  
James R. Curran

Noun phrases (NPs) are a crucial part of natural language and can have very complex structure. However, this NP structure is largely ignored in statistical parsing, as the most widely used corpus is not annotated with it. This lack of gold-standard data has restricted previous efforts to parse NPs, making it impossible to perform the supervised experiments that have achieved high performance in so many Natural Language Processing (NLP) tasks. We comprehensively solve this problem by manually annotating NP structure for the entire Wall Street Journal section of the Penn Treebank. The inter-annotator agreement scores that we attain dispel the belief that the task is too difficult, and demonstrate that consistent NP annotation is possible. Our gold-standard NP data is now available for use in all parsers. We experiment with this new data, applying the Collins (2003) parsing model, and find that its recovery of NP structure is significantly worse than its overall performance. The parser's F-score is up to 5.69% lower than a baseline that uses deterministic rules. Through extensive experimentation, we determine that this result is primarily caused by a lack of lexical information. To solve this problem we construct a wide-coverage, large-scale NP bracketing system. With our Penn Treebank data set, which is orders of magnitude larger than those used previously, we build a supervised model that achieves excellent results. Our model performs at 93.8% F-score on the simple task that most previous work has undertaken, and extends to bracket longer, more complex NPs that are rarely dealt with in the literature. We attain 89.14% F-score on this much more difficult task. Finally, we implement a post-processing module that brackets NPs identified by the Bikel (2004) parser.
Our NP bracketing model includes a wide variety of features that provide the lexical information that was missing during the parser experiments, and as a result, we outperform the parser's F-score by 9.04%. These experiments demonstrate the utility of the corpus, and show that many NLP applications can now make use of NP structure.
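The bracketing task described above is easy to illustrate. The sketch below is an illustration only, not the paper's actual rule set: it applies simple left-branching attachment, the conventional deterministic baseline for English noun compounds, to a flat NP.

```python
def left_branching(tokens):
    """Bracket a flat noun phrase with the left-branching default:
    each new word attaches to the bracket built so far."""
    if len(tokens) <= 2:
        return tuple(tokens)
    return (left_branching(tokens[:-1]), tokens[-1])

# "lung cancer deaths" = ((lung cancer) deaths): deaths from lung cancer
print(left_branching(["lung", "cancer", "deaths"]))
```

Right-branching compounds such as (plastic (water bottle)) are exactly the cases a deterministic rule gets wrong, which is why the lexical features in a supervised bracketing model matter.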


2020 ◽  
Author(s):  
Shintaro Tsuji ◽  
Andrew Wen ◽  
Naoki Takahashi ◽  
Hongjian Zhang ◽  
Katsuhiko Ogasawara ◽  
...  

BACKGROUND Named entity recognition (NER) plays an important role in extracting descriptive features when mining free-text radiology reports. However, the performance of existing NER tools is limited because the number of recognizable entities depends on dictionary lookup. In particular, recognizing compound terms is difficult because they follow a wide variety of patterns. OBJECTIVE The objective of this study is to develop and evaluate an NER tool that handles compound terms, using RadLex, for mining free-text radiology reports. METHODS We leveraged the clinical Text Analysis and Knowledge Extraction System (cTAKES) to develop customized pipelines using both RadLex and SentiWordNet (a general-purpose dictionary, GPD). We manually annotated 400 radiology reports for compound terms (CTs) in noun phrases and used them as the gold standard for performance evaluation (precision, recall, and F-measure). Additionally, we created a compound-term-enhanced dictionary (CtED) by analyzing false negatives (FNs) and false positives (FPs), and applied it to another 100 radiology reports for validation. We also evaluated the stem terms of compound terms by defining two measures: an occurrence ratio (OR) and a matching ratio (MR). RESULTS The F-measure of cTAKES+RadLex+GPD was 32.2% (precision 92.1%, recall 19.6%), and that of the pipeline combined with the CtED was 67.1% (precision 98.1%, recall 51.0%). The OR indicated that the stem terms "effusion", "node", "tube", and "disease" were used frequently, but the pipeline still failed to capture many CTs. The MR showed that 71.9% of stem terms matched those in the ontologies, and that RadLex improved the MR by about 22% over the cTAKES default dictionary. The OR and MR revealed that the characteristics of stem terms have the potential to help generate synonymous phrases using ontologies.
CONCLUSIONS We developed a RadLex-based customized pipeline for parsing radiology reports and demonstrated that the CtED and stem-term analysis have the potential to improve dictionary-based NER performance by expanding vocabularies.
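The reported F-measures follow from the standard harmonic mean of precision and recall; a quick check using only the precision/recall pairs reported in the abstract:

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall, in percent."""
    return 2 * precision * recall / (precision + recall)

# cTAKES+RadLex+GPD: precision 92.1%, recall 19.6%
print(round(f_measure(92.1, 19.6), 1))  # ~32.3, close to the reported 32.2%
# With the CtED: precision 98.1%, recall 51.0%
print(round(f_measure(98.1, 51.0), 1))  # 67.1, matching the reported 67.1%
```

The jump from 32.2% to 67.1% comes almost entirely from recall (19.6% to 51.0%), which is what one would expect from enlarging the lookup dictionary rather than changing the matching logic.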


CHEST Journal ◽  
2021 ◽  
Author(s):  
Chengyi Zheng ◽  
Brian Z. Huang ◽  
Andranik A. Agazaryan ◽  
Beth Creekmur ◽  
Thearis Osuj ◽  
...  

2021 ◽  
Vol 16 (1) ◽  
Author(s):  
Nithin Kolanu ◽  
A Shane Brown ◽  
Amanda Beech ◽  
Jacqueline R. Center ◽  
Christopher P. White

Author(s):  
Friederike Moltmann

Natural language, it appears, reflects in part our conception of the world. Natural language displays a great range of types of referential noun phrases that seem to stand for objects of various ontological categories and types, and it also involves constructions, categories, and expressions that appear to convey ontological or metaphysical notions. Natural language thus reflects its own ontology, an ontology that may differ from the one a philosopher, or even a nonphilosopher reflecting on what there is, may be willing to accept, and of course it may differ from the ontology of what there really is. This chapter characterizes the ontology implicit in natural language and the entities it involves, situates natural language ontology within metaphysics, discusses what sorts of data may be considered reflective of the ontology of natural language, and addresses Chomsky's dismissal of externalist semantics.


2020 ◽  
Vol 33 (5) ◽  
pp. 1194-1201
Author(s):  
Andrew L. Callen ◽  
Sara M. Dupont ◽  
Adi Price ◽  
Ben Laguna ◽  
David McCoy ◽  
...  
