Constituency Parser for Clinical Narratives using NLP

Clinical parsing is useful in the medical domain. Clinical narratives are difficult to understand because they are stored in an unstructured format. Medical natural language processing systems are used to render these clinical narratives in a readable format. A clinical parser combines natural language processing with a medical lexicon. To make clinical narratives understandable, a parsing technique is used. In this paper we discuss a constituency parser for clinical narratives, which is based on phrase structure grammar. This parser converts unstructured clinical narratives into structured reports. The paper focuses on clinical sentences that are in an unstructured format and are converted into a structured format after parsing. For each sentence, recall, precision, and bracketing F-measure are calculated.
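The per-sentence recall, precision, and bracketing F-measure mentioned above can be sketched as a minimal PARSEVAL-style computation over constituent spans. The bracket spans below are hypothetical examples, not output from the actual parser.

```python
# Minimal sketch of bracketing precision/recall/F-measure (PARSEVAL-style):
# a bracket is a (label, start, end) constituent span; a predicted bracket
# counts as correct only if it appears in the gold parse.

def bracketing_scores(gold, predicted):
    """gold, predicted: sets of (label, start, end) spans.

    Returns (precision, recall, f_measure)."""
    matched = len(gold & predicted)
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# Hypothetical gold vs predicted parses of a 5-token sentence:
gold = {("NP", 0, 2), ("VP", 2, 5), ("S", 0, 5)}
pred = {("NP", 0, 2), ("VP", 3, 5), ("S", 0, 5)}
p, r, f = bracketing_scores(gold, pred)  # 2 of 3 brackets match each way
print(p, r, f)
```

Since both parses have three brackets and two match, precision, recall, and F-measure all equal 2/3 here; in general precision and recall differ when the parses propose different numbers of brackets.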

2014 ◽  
Vol 22 (1) ◽  
pp. 132-142 ◽  
Author(s):  
Ching-Heng Lin ◽  
Nai-Yuan Wu ◽  
Wei-Shao Lai ◽  
Der-Ming Liou

Abstract Background and objective Electronic medical records with encoded entries should enhance the semantic interoperability of document exchange. However, it remains a challenge to encode the narrative concept and to transform the coded concepts into a standard entry-level document. This study aimed to use a novel approach for the generation of entry-level interoperable clinical documents. Methods Using HL7 clinical document architecture (CDA) as the example, we developed three pipelines to generate entry-level CDA documents. The first approach was a semi-automatic annotation pipeline (SAAP), the second was a natural language processing (NLP) pipeline, and the third merged the above two pipelines. We randomly selected 50 test documents from the i2b2 corpora to evaluate the performance of the three pipelines. Results The 50 randomly selected test documents contained 9365 words, including 588 Observation terms and 123 Procedure terms. For the Observation terms, the merged pipeline had a significantly higher F-measure than the NLP pipeline (0.89 vs 0.80, p<0.0001), but a similar F-measure to that of the SAAP (0.89 vs 0.87). For the Procedure terms, the F-measure was not significantly different among the three pipelines. Conclusions The combination of a semi-automatic annotation approach and the NLP application seems to be a solution for generating entry-level interoperable clinical documents.


2018 ◽  
Author(s):  
Shoko Wakamiya ◽  
Mizuki Morita ◽  
Yoshinobu Kano ◽  
Tomoko Ohkuma ◽  
Eiji Aramaki

BACKGROUND The amount of medical and clinical-related information on the Web is increasing. Among the different types of information available, social media–based data obtained directly from people are particularly valuable and are attracting significant attention. To encourage medical natural language processing (NLP) research exploiting social media data, the 13th NII Testbeds and Community for Information access Research (NTCIR-13) Medical natural language processing for Web document (MedWeb) task provides pseudo-Twitter messages in a cross-language and multi-label corpus, covering 3 languages (Japanese, English, and Chinese) and annotated with 8 symptom labels (such as cold, fever, and flu). Participants then classify each tweet into 1 of 2 categories: those containing a patient’s symptom and those that do not. OBJECTIVE This study aimed to present the results of groups participating in the Japanese, English, and Chinese subtasks along with discussions, to clarify the issues that need to be resolved in the field of medical NLP. METHODS In summary, 8 groups (19 systems) participated in the Japanese subtask, 4 groups (12 systems) participated in the English subtask, and 2 groups (6 systems) participated in the Chinese subtask. In total, 2 baseline systems were constructed for each subtask. The performance of the participant and baseline systems was assessed using the exact match accuracy, F-measure based on precision and recall, and Hamming loss. RESULTS The best system achieved an exact match accuracy of 0.880, an F-measure of 0.920, and a Hamming loss of 0.019. The averages of exact match accuracy, F-measure, and Hamming loss for the Japanese subtask were 0.720, 0.820, and 0.051; those for the English subtask were 0.770, 0.850, and 0.037; and those for the Chinese subtask were 0.810, 0.880, and 0.032, respectively. CONCLUSIONS This paper presented and discussed the performance of systems participating in the NTCIR-13 MedWeb task.
As the MedWeb task settings can be formalized as the factualization of text, the achievement of this task could be directly applied to practical clinical applications.
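The two multi-label metrics named in the abstract, exact match accuracy and Hamming loss, can be sketched as follows. The label vectors are illustrative stand-ins for the 8-symptom annotations, not data from the actual MedWeb corpus.

```python
# Sketch of multi-label evaluation metrics: exact match accuracy counts a
# sample as correct only if its entire label vector matches; Hamming loss
# is the fraction of individual label positions that are wrong.

def exact_match_accuracy(y_true, y_pred):
    """y_true, y_pred: equal-length lists of binary label tuples."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def hamming_loss(y_true, y_pred):
    n_labels = len(y_true[0])
    wrong = sum(sum(ti != pi for ti, pi in zip(t, p))
                for t, p in zip(y_true, y_pred))
    return wrong / (len(y_true) * n_labels)

# Two hypothetical tweets with 4 symptom labels each; the second tweet
# has one false-positive label.
y_true = [(1, 0, 0, 1), (0, 0, 1, 0)]
y_pred = [(1, 0, 0, 1), (0, 1, 1, 0)]
print(exact_match_accuracy(y_true, y_pred))  # 0.5 (one of two vectors exact)
print(hamming_loss(y_true, y_pred))          # 0.125 (1 of 8 labels wrong)
```

This illustrates why Hamming loss is the gentler metric: a single wrong label costs one full sample under exact match but only 1/(n·labels) under Hamming loss.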


2020 ◽  
Author(s):  
Matheus Henrique Cardoso ◽  
Anita Maria da Rocha Fernandes

Sentiment analysis aims to extract subjective information from texts, which, when written in Portuguese, face a number of difficulties related to grammatical structure and vocabulary diversity. To contribute to research in this area, this paper proposes to evaluate the performance of the committee approach relative to traditional sentiment analysis approaches in the context of the Portuguese language. As the object of application, we chose tweets on a volleyball theme, which serve as the basis for applying the approaches. These texts are treated using natural language processing to improve the performance of the algorithms. Other approaches are also used in this study to assist in the evaluation of the committee, with the aid of metrics such as accuracy, precision, recall, and F-measure.


2018 ◽  
Vol 25 (4) ◽  
pp. 1846-1862 ◽  
Author(s):  
Yaoyun Zhang ◽  
Olivia R Zhang ◽  
Rui Li ◽  
Aaron Flores ◽  
Salih Selek ◽  
...  

Suicide takes the lives of nearly a million people each year and imposes a tremendous economic burden globally. One important type of suicide risk factor is psychiatric stress. Prior studies have mainly used survey data to investigate the association between suicide and stressors. Very few studies have investigated stressor data in electronic health records, mostly because the data are recorded in narrative text. This study takes the initiative to automatically extract and classify psychiatric stressors from clinical text using natural language processing–based methods. Suicidal behaviors were also identified by keywords. Then, a statistical association analysis between suicide ideations/attempts and stressors extracted from a clinical corpus was conducted. Experimental results show that our natural language processing method could recognize stressor entities with an F-measure of 89.01 percent. Mentions of suicidal behaviors were identified with an F-measure of 97.3 percent. The top three significant stressors associated with suicide are health, pressure, and death, which are similar to those in previous studies. This study demonstrates the feasibility of using natural language processing approaches to unlock information from psychiatric notes in electronic health records, to facilitate large-scale studies of the associations between suicide and psychiatric stressors.


2021 ◽  
Author(s):  
Xie-Yuan Xie

Abstract Named Entity Recognition (NER) is a key task in Natural Language Processing (NLP). In the medical domain, NER is a very important phase in all end-to-end systems. In this paper, we investigate the performance of NER for diseases (D-NER). TaggerOne was evaluated on 52 cardiovascular-related clinical case reports against hand annotations of diseases. Different training sets were used to evaluate the performance of TaggerOne, a well-known tool for NER in the biomedical domain.


2014 ◽  
Vol 695 ◽  
pp. 548-552 ◽  
Author(s):  
Myintzu Phyo Aung ◽  
Aung Lwin Moe

Chunking is the subdivision of sentences into non-recursive regular syntactic groups: verbal chunks, nominal chunks, adjective chunks, adverbial chunks, prepositional chunks, etc. A chunker can operate as a preprocessor for natural language processing systems. This study proposes a new phrase-chunking algorithm for Myanmar natural language processing. The new algorithm accepts a tagged Myanmar sentence as input and generates chunks as output. The input Myanmar sentence is split into chunks using chunk markers such as postpositions, particles, and conjunctions, and each chunk is labeled as a noun chunk, verb chunk, adjective chunk, adverb chunk, or conjunction chunk. The algorithm was evaluated on POS-tagged Myanmar sentences using three measures. According to the results, the newly developed algorithm obtained good precision, recall, and F-measure scores.
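The marker-based chunking described above can be sketched as a single left-to-right pass: accumulate tokens until a marker tag closes the current chunk, then label the chunk by its leading tag. The tag names, marker set, and chunk-label mapping below are hypothetical stand-ins, not the actual Myanmar tagset used in the study.

```python
# Illustrative sketch of marker-based chunking: split a POS-tagged sentence
# at marker tags (postpositions, particles, conjunctions) and label each
# chunk by the tag of its first token.

CHUNK_MARKERS = {"ppm", "part", "conj"}   # assumed marker tags
HEAD_TO_CHUNK = {"n": "NC", "v": "VC",    # assumed tag -> chunk-label map
                 "adj": "AC", "adv": "RC", "conj": "CC"}

def chunk(tagged_sentence):
    """tagged_sentence: list of (word, tag) pairs.

    Returns a list of (chunk_type, words) pairs."""
    chunks, current = [], []
    for word, tag in tagged_sentence:
        current.append((word, tag))
        if tag in CHUNK_MARKERS:          # a marker closes the open chunk
            head_tag = current[0][1]
            chunks.append((HEAD_TO_CHUNK.get(head_tag, "CC"),
                           [w for w, _ in current]))
            current = []
    if current:                           # flush any trailing chunk
        head_tag = current[0][1]
        chunks.append((HEAD_TO_CHUNK.get(head_tag, "CC"),
                       [w for w, _ in current]))
    return chunks

# A made-up SOV-style tagged sentence: noun + postposition, then verb.
print(chunk([("school", "n"), ("to", "ppm"), ("go", "v")]))
```

Because the markers themselves terminate chunks, the pass is linear in sentence length and needs no recursion, which matches the "non-recursive groups" definition of chunking.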


2021 ◽  
Author(s):  
Melissa P. Resnick ◽  
Frank LeHouillier ◽  
Steven H. Brown ◽  
Keith E. Campbell ◽  
Diane Montella ◽  
...  

Objective: One important concept in informatics is data that meet the principles of Findability, Accessibility, Interoperability and Reusability (FAIR). Standards, such as terminologies (findability), assist with important tasks like interoperability, Natural Language Processing (NLP) (accessibility) and decision support (reusability). One terminology, Solor, integrates SNOMED CT, LOINC and RxNorm. We describe Solor, HL7 Analysis Normal Form (ANF), and their use with the high definition natural language processing (HD-NLP) program. Methods: We used HD-NLP to process 694 clinical narratives previously modeled by human experts into Solor and ANF. We compared the HD-NLP output to the expert gold standard for 20% of the sample. Each clinical statement was judged “correct” if the HD-NLP output matched the ANF structure and Solor concepts, or “incorrect” if any ANF structure or Solor concepts were missing or incorrect. Judgements were summed to give totals for “correct” and “incorrect”. Results: 113 statements (80.7%) were correct, 26 (18.6%) were incorrect, and 1 produced an error. Inter-rater reliability was 97.5%, with a Cohen’s kappa of 0.948. Conclusion: The HD-NLP software provides usable, complex, standards-based representations of important clinical statements designed to drive CDS.
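The agreement statistics reported above, percent agreement and Cohen's kappa, can be sketched as follows. The rating lists are illustrative, not the study's actual judgements; kappa corrects raw agreement for the agreement two raters would reach by chance given their label frequencies.

```python
# Minimal sketch of inter-rater agreement: observed percent agreement and
# Cohen's kappa for two raters assigning "correct"/"incorrect" labels.
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """rater_a, rater_b: equal-length lists of labels, one per item."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    ca, cb = Counter(rater_a), Counter(rater_b)
    expected = sum((ca[label] / n) * (cb[label] / n)
                   for label in set(ca) | set(cb))
    if expected == 1.0:        # degenerate case: both raters use one label
        return 1.0
    return (observed - expected) / (1 - expected)

# Hypothetical judgements over four clinical statements:
a = ["correct", "correct", "incorrect", "incorrect"]
b = ["correct", "correct", "correct", "incorrect"]
print(cohens_kappa(a, b))  # agreement on 3 of 4, corrected for chance
```

With balanced labels as above, 75% raw agreement corresponds to a kappa of only 0.5, which is why a kappa of 0.948 alongside 97.5% agreement indicates genuinely strong reliability rather than chance.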


2020 ◽  
pp. 3-17
Author(s):  
Peter Nabende

Natural Language Processing for under-resourced languages is now a mainstream research area. However, there are limited studies on Natural Language Processing applications for many indigenous East African languages. As a contribution to closing this gap, this paper focuses on evaluating the application of well-established machine translation methods for one heavily under-resourced indigenous East African language called Lumasaaba. Specifically, we review the most common machine translation methods in the context of Lumasaaba, including both rule-based and data-driven methods. We then apply a state-of-the-art data-driven machine translation method to learn models for automating translation between Lumasaaba and English using a very limited data set of parallel sentences. Automatic evaluation results show that a transformer-based Neural Machine Translation model architecture leads to consistently better BLEU scores than the recurrent neural network-based models. Moreover, the automatically generated translations can be comprehended to a reasonable extent and are usually associated with the source language input.
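The BLEU scores used for automatic evaluation can be sketched as modified n-gram precision combined with a brevity penalty. This is a compact illustration up to bigrams with made-up sentences, not a replacement for standard toolkits (which typically use up to 4-grams with smoothing).

```python
# Compact sentence-level BLEU sketch: geometric mean of clipped n-gram
# precisions, scaled by a brevity penalty that punishes short hypotheses.
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(reference, hypothesis, max_n=2):
    """reference, hypothesis: token lists. Returns a score in [0, 1]."""
    precisions = []
    for n in range(1, max_n + 1):
        hyp, ref = ngrams(hypothesis, n), ngrams(reference, n)
        overlap = sum(min(count, ref[g]) for g, count in hyp.items())
        total = sum(hyp.values())
        if total == 0 or overlap == 0:
            return 0.0
        precisions.append(overlap / total)
    # Brevity penalty: 1 if the hypothesis is at least reference length.
    bp = min(1.0, math.exp(1 - len(reference) / len(hypothesis)))
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

ref = "the cat is on the mat".split()
print(bleu(ref, ref))                          # identical output scores 1.0
print(bleu(ref, "the cat on the mat".split())) # dropped word lowers score
```

Clipping each n-gram count at its reference count prevents a hypothesis from scoring well by repeating a common word, and the brevity penalty prevents trivially short but precise outputs from winning.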

