Enriching contextualized language model from knowledge graph for biomedical information extraction

Briefings in Bioinformatics ◽

10.1093/bib/bbaa110 ◽

2020 ◽

Author(s):

Hao Fei ◽

Yafeng Ren ◽

Yue Zhang ◽

Donghong Ji ◽

Xiaohui Liang

Keyword(s):

Information Extraction ◽

Large Scale ◽

Language Model ◽

Relation Extraction ◽

Event Extraction ◽

Entity Recognition ◽

Language Models ◽

Training Procedure ◽

Biomedical Knowledge ◽

Biomedical Texts

Abstract Biomedical information extraction (BioIE) is an important task. The aim is to analyze biomedical texts and extract structured information such as named entities and the semantic relations between them. In recent years, pre-trained language models have substantially improved the performance of BioIE. However, they do not incorporate external structural knowledge, which can provide rich factual information to support the underlying understanding and reasoning for biomedical information extraction. In this paper, we first evaluate current extraction methods, including vanilla neural networks, general language models and pre-trained contextualized language models, on biomedical information extraction tasks, including named entity recognition, relation extraction and event extraction. We then propose to enrich a contextualized language model by integrating large-scale biomedical knowledge graphs (namely, BioKGLM). To encode knowledge effectively, we explore a three-stage training procedure and introduce different fusion strategies to facilitate knowledge injection. Experimental results on multiple tasks show that BioKGLM consistently outperforms state-of-the-art extraction models. Further analysis shows that BioKGLM can capture the underlying relations between biomedical knowledge concepts, which are crucial for BioIE.

Deep learning with language models improves named entity recognition for PharmaCoNER

BMC Bioinformatics ◽

10.1186/s12859-021-04260-y ◽

2021 ◽

Vol 22 (S1) ◽

Author(s):

Cong Sun ◽

Zhihao Yang ◽

Lei Wang ◽

Yin Zhang ◽

Hongfei Lin ◽

...

Keyword(s):

Deep Learning ◽

Language Processing ◽

Domain Knowledge ◽

Named Entity Recognition ◽

Model Performance ◽

Relation Extraction ◽

Entity Recognition ◽

Language Models ◽

Named Entity ◽

Biomedical Texts

Abstract Background: The recognition of pharmacological substances, compounds and proteins is essential for biomedical relation extraction, knowledge graph construction, drug discovery, and medical question answering. Although considerable efforts have been made to recognize biomedical entities in English texts, to date only a few limited attempts have been made to recognize them in biomedical texts in other languages. PharmaCoNER is a named entity recognition challenge to recognize pharmacological entities in Spanish texts. Because abundant resources are currently available in the field of natural language processing, how to leverage these resources for the PharmaCoNER challenge is a meaningful question to study. Methods: Inspired by the success of deep learning with language models, we compare and explore various representative BERT models to promote the development of the PharmaCoNER task. Results: The experimental results show that deep learning with language models can effectively improve model performance on the PharmaCoNER dataset. Our method achieves state-of-the-art performance on the PharmaCoNER dataset, with a maximum F1-score of 92.01%. Conclusion: For the BERT models on the PharmaCoNER dataset, biomedical domain knowledge has a greater impact on model performance than the native language (i.e., Spanish). The BERT models can obtain competitive performance by using WordPiece to alleviate the out-of-vocabulary limitation. The performance of the BERT model can be further improved by constructing a specific vocabulary based on domain knowledge. Moreover, character case also has a certain impact on model performance.
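
The abstract above credits WordPiece with easing the out-of-vocabulary problem. As a rough illustration of why, the sketch below implements the greedy longest-match-first splitting commonly associated with WordPiece. The function name and the toy vocabulary are assumptions made for this example; they are not taken from the paper or from any BERT release.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Greedy longest-match-first split of a single word into subword pieces."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a word-internal piece
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece fits at this position: the whole word is unknown.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


# Toy vocabulary (hypothetical, for illustration only).
TOY_VOCAB = {"para", "##ce", "##ta", "##mol", "ibu", "##pro", "##feno"}

print(wordpiece_tokenize("paracetamol", TOY_VOCAB))
print(wordpiece_tokenize("ibuprofeno", TOY_VOCAB))
```

A drug name that is absent from the vocabulary as a whole word can still be represented through its pieces, which is one plausible reading of how a domain-specific vocabulary could help further.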

Eliciting Attribute-Level User Needs from Online Reviews with Deep Language Models and Information Extraction

Journal of Mechanical Design ◽

10.1115/1.4048819 ◽

2020 ◽

pp. 1-34

Author(s):

Yi Han ◽

Mohsen Moghaddam

Keyword(s):

Sentiment Analysis ◽

Large Scale ◽

User Behavior ◽

Language Model ◽

Named Entity Recognition ◽

Online Reviews ◽

Entity Recognition ◽

Language Models ◽

Attribute Level ◽

User Needs

Abstract Eliciting user needs for individual components and features of a product or a service on a large scale is a key requirement for innovative design. Gathering and analyzing data as an initial discovery phase of a design process is usually accomplished with a small number of participants, employing qualitative research methods such as observations, focus groups, and interviews. This leaves a wide swath of pertinent user behavior, preferences, and opinions uncaptured. Sentiment analysis is a key enabler for large-scale need finding from online user reviews generated on a regular basis. A major limitation of current sentiment analysis approaches used in design sciences, however, is the need for laborious labeling and annotation of large review datasets for training, which in turn hinders their scalability and transferability across different domains. This article proposes an efficient and scalable methodology for automated and large-scale elicitation of attribute-level user needs. The methodology builds on the state-of-the-art pretrained deep language model, BERT (Bidirectional Encoder Representations from Transformers), with new convolutional net and named-entity recognition (NER) layers for extracting attribute, description, and sentiment words from online user review corpora. BLEU (BiLingual Evaluation Understudy), a metric from machine translation evaluation, is utilized to extract need expressions in the form of predefined part-of-speech combinations (e.g., adjective-noun, verb-noun). Numerical experiments are conducted on a large dataset scraped from a major e-commerce retail store for apparel and footwear to demonstrate the performance, feasibility, and potential of the developed methodology.
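
To make the idea of "predefined part-of-speech combinations" concrete, here is a minimal sketch that scans already-tagged tokens for adjective-noun and verb-noun pairs. It is not the BLEU-based procedure the abstract describes; the function name, the tag labels, and the sample review are all assumptions for illustration, and a real pipeline would obtain the tags from a part-of-speech tagger.

```python
# Part-of-speech bigrams treated as candidate need expressions (assumed tag names).
NEED_PATTERNS = {("ADJ", "NOUN"), ("VERB", "NOUN")}


def extract_need_expressions(tagged_tokens, patterns=NEED_PATTERNS):
    """Return adjacent word pairs whose tag pair matches one of the patterns."""
    expressions = []
    for (word1, tag1), (word2, tag2) in zip(tagged_tokens, tagged_tokens[1:]):
        if (tag1, tag2) in patterns:
            expressions.append(f"{word1} {word2}")
    return expressions


# Hypothetical, hand-tagged review fragment.
review = [
    ("love", "VERB"), ("fit", "NOUN"), (",", "PUNCT"),
    ("comfortable", "ADJ"), ("sole", "NOUN"),
]

print(extract_need_expressions(review))
```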

From Tokenization to Self-Supervision: Building a High-Performance Information Extraction System for Chemical Reactions in Patents

Frontiers in Research Metrics and Analytics ◽

10.3389/frma.2021.691105 ◽

2021 ◽

Vol 6 ◽

Author(s):

Jingqi Wang ◽

Yuankai Ren ◽

Zhi Zhang ◽

Hua Xu ◽

Yaoyun Zhang

Keyword(s):

Information Extraction ◽

Chemical Reactions ◽

Chemical Reaction ◽

High Performance ◽

Event Extraction ◽

Entity Recognition ◽

Language Models ◽

Accurate Information ◽

Free Text ◽

Semantic Roles

Chemical reactions and experimental conditions are fundamental information for chemical research and pharmaceutical applications. However, the latest information of chemical reactions is usually embedded in the free text of patents. The rapidly accumulating chemical patents urge automatic tools based on natural language processing (NLP) techniques for efficient and accurate information extraction. This work describes the participation of the Melax Tech team in the CLEF 2020—ChEMU Task of Chemical Reaction Extraction from Patent. The task consisted of two subtasks: (1) named entity recognition to identify compounds and different semantic roles in the chemical reaction and (2) event extraction to identify event triggers of chemical reaction and their relations with the semantic roles recognized in subtask 1. To build an end-to-end system with high performance, multiple strategies tailored to chemical patents were applied and evaluated, ranging from optimizing the tokenization, pre-training patent language models based on self-supervision, to domain knowledge-based rules. Our hybrid approaches combining different strategies achieved state-of-the-art results in both subtasks, with the top-ranked F1 of 0.957 for entity recognition and the top-ranked F1 of 0.9536 for event extraction, indicating that the proposed approaches are promising.

An Attention-Based Model Using Character Composition of Entities in Chinese Relation Extraction

Information ◽

10.3390/info11020079 ◽

2020 ◽

Vol 11 (2) ◽

pp. 79 ◽

Cited By ~ 2

Author(s):

Xiaoyu Han ◽

Yue Zhang ◽

Wenkai Zhang ◽

Tinglei Huang

Keyword(s):

Language Processing ◽

Large Scale ◽

Named Entity Recognition ◽

Relation Extraction ◽

Entity Recognition ◽

Additional Information ◽

Named Entity ◽

Proposed Model ◽

The Relationship ◽

Crucial Part

Relation extraction is a vital task in natural language processing. It aims to identify the relationship between two specified entities in a sentence. Besides the information contained in the sentence, additional information about the entities has been shown to be helpful in relation extraction. However, additional information such as the entity type obtained by NER (Named Entity Recognition) and the description provided by a knowledge base both have their limitations. In Chinese relation extraction, there is another way to provide additional information that can overcome these limitations. Because Chinese characters usually have explicit meanings and can carry more information than English letters, we suggest that the characters that constitute the entities can provide additional information that is helpful for the relation extraction task, especially in large-scale datasets. This assumption has never been verified before; the main obstacle is the lack of large-scale Chinese relation datasets. In this paper, we first generate a large-scale Chinese relation extraction dataset based on a Chinese encyclopedia. Second, we propose an attention-based model using the characters that compose the entities. The results on the generated dataset show that these characters can provide useful information for the Chinese relation extraction task. By using this information, our attention mechanism can recognize the crucial part of the sentence that expresses the relation. The proposed model outperforms other baseline models on our Chinese relation extraction dataset.

Overview of CCKS 2020 Task 3: Named Entity Recognition and Event Extraction in Chinese Electronic Medical Records

Data Intelligence ◽

10.1162/dint_a_00093 ◽

2021 ◽

pp. 1-13

Author(s):

Xia Li ◽

Qinghua Wen ◽

Zengtao Jiao ◽

Jiangtao Zhang

Keyword(s):

Electronic Medical Records ◽

Medical Records ◽

Named Entity Recognition ◽

Event Extraction ◽

Entity Recognition ◽

Language Models ◽

Data Sets ◽

External Resources ◽

Named Entity ◽

Evaluation Task

Abstract The China Conference on Knowledge Graph and Semantic Computing (CCKS) 2020 Evaluation Task 3 presented clinical named entity recognition and event extraction for Chinese electronic medical records. Two annotated data sets and some additional resources for these two subtasks were provided to participants. This evaluation competition attracted 354 teams, 46 of which successfully submitted valid results. Pre-trained language models were widely applied in this evaluation task. Data augmentation and external resources were also helpful.

Information Extraction from Electronic Medical Records Using Multitask Recurrent Neural Network with Contextual Word Embedding

Applied Sciences ◽

10.3390/app9183658 ◽

2019 ◽

Vol 9 (18) ◽

pp. 3658 ◽

Cited By ~ 6

Author(s):

Jianliang Yang ◽

Yuenan Liu ◽

Minghui Qian ◽

Chenghua Guan ◽

Xiangfei Yuan

Keyword(s):

Electronic Medical Records ◽

Medical Records ◽

Large Scale ◽

Short Term Memory ◽

Conditional Random Field ◽

Named Entity Recognition ◽

Recognition Task ◽

Entity Recognition ◽

Language Models ◽

Named Entity

Clinical named entity recognition is an essential task for humans to analyze large-scale electronic medical records efficiently. Traditional rule-based solutions need considerable human effort to build rules and dictionaries; machine learning-based solutions need laborious feature engineering. Currently, deep learning solutions such as Long Short-term Memory with Conditional Random Field (LSTM–CRF) have achieved considerable performance on many datasets. In this paper, we developed a multitask attention-based bidirectional LSTM–CRF (Att-biLSTM–CRF) model with pretrained Embeddings from Language Models (ELMo) in order to achieve better performance. In the multitask system, an additional task named entity discovery was designed to enhance the model’s perception of unknown entities. Experiments were conducted on the 2010 Informatics for Integrating Biology & the Bedside/Veterans Affairs (I2B2/VA) dataset. Experimental results show that our model outperforms the state-of-the-art solution in both the single-model and ensemble-model settings. Our work proposes an approach to improve recall in the clinical named entity recognition task based on the multitask mechanism.

Pushdown Automata in Statistical Machine Translation

Computational Linguistics ◽

10.1162/coli_a_00197 ◽

2014 ◽

Vol 40 (3) ◽

pp. 687-723 ◽

Cited By ~ 3

Author(s):

Cyril Allauzen ◽

Bill Byrne ◽

Adrià de Gispert ◽

Gonzalo Iglesias ◽

Michael Riley

Keyword(s):

Machine Translation ◽

Large Scale ◽

Complexity Analysis ◽

Statistical Machine Translation ◽

Language Model ◽

General Purpose ◽

Language Models ◽

Experimental Conditions ◽

Context Free ◽

Pushdown Automata

This article describes the use of pushdown automata (PDA) in the context of statistical machine translation and alignment under a synchronous context-free grammar. We use PDAs to compactly represent the space of candidate translations generated by the grammar when applied to an input sentence. General-purpose PDA algorithms for replacement, composition, shortest path, and expansion are presented. We describe HiPDT, a hierarchical phrase-based decoder using the PDA representation and these algorithms. We contrast the complexity of this decoder with a decoder based on a finite state automata representation, showing that PDAs provide a more suitable framework to achieve exact decoding for larger synchronous context-free grammars and smaller language models. We assess this experimentally on a large-scale Chinese-to-English alignment and translation task. In translation, we propose a two-pass decoding strategy involving a weaker language model in the first-pass to address the results of PDA complexity analysis. We study in depth the experimental conditions and tradeoffs in which HiPDT can achieve state-of-the-art performance for large-scale SMT.

Injecting Event Knowledge into Pre-Trained Language Models for Event Extraction

10.5121/csit.2020.101404 ◽

2020 ◽

Author(s):

Zining Yang ◽

Siyu Zhan ◽

Mengshu Hou ◽

Xiaoyang Zeng ◽

Hao Zhu

Keyword(s):

Language Model ◽

Empirical Evaluation ◽

Event Extraction ◽

Training Data ◽

Language Models ◽

Extraction System ◽

Training Dataset ◽

Great Success ◽

Event Knowledge ◽

Event Trigger

Recent pre-trained language models have achieved great success in many NLP tasks. In this paper, we propose an event extraction system based on the novel pre-trained language model BERT to extract both event triggers and arguments. Because the system is a deep-learning-based method, the size of the training dataset has a crucial impact on its performance. To address the lack of training data for event extraction, we further train the pre-trained language model on a carefully constructed in-domain corpus to inject event knowledge into our event extraction system with minimal effort. Empirical evaluation on the ACE2005 dataset shows that injecting event knowledge can significantly improve the performance of event extraction.

Building Graph for Events and Time in Natural Language Text

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.c8419.019320 ◽

2020 ◽

Vol 9 (3) ◽

pp. 581-586

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Information Extraction ◽

Language Processing ◽

Question Answering ◽

Relation Extraction ◽

Event Extraction ◽

Event Time ◽

Time Graph ◽

Question Answering Systems

Events and time are two key concepts in natural language processing; because of the many event-oriented tasks, they have become essential to information extraction. In natural language processing and information extraction or retrieval, events and time support applications such as text summarization, document summarization, and question answering systems. In this paper, we present the event-time graph as a new way of constructing event-time information from text. In this graph, nodes are events, whereas edges represent the temporal and co-reference relations between events. Much previous research in natural language processing has focused on individual extraction tasks in a domain-specific way; in this work, we instead present the extraction and representation of the relationship between events and time by constructing an event-time graph. Our overall system is a three-step process that performs event extraction, time extraction, and relation extraction and representation. Each step is at a performance level comparable with the state of the art. We present event extraction on the MUC data corpus annotated with event mentions, on which we train and evaluate our model. Next, we present time extraction, with the model tested on several news articles from a Wikipedia corpus. We then represent event-time relations by constructing event-time graphs. Finally, we evaluate the overall quality of the event graphs with the evaluation metrics and conclude with observations on the entire work.
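
The graph described in this abstract (events as nodes, temporal and co-reference relations as edges) can be pictured with a small data structure. The sketch below is only one possible representation; the class, its method names, the relation labels, and the sample events are hypothetical and do not come from the paper.

```python
from collections import defaultdict


class EventTimeGraph:
    """Events are nodes; labeled edges hold temporal or co-reference relations."""

    def __init__(self):
        self.events = {}                # event id -> attributes
        self.edges = defaultdict(list)  # event id -> [(relation, target id)]

    def add_event(self, event_id, trigger, time=None):
        # `time` holds an extracted time expression, if one was found.
        self.events[event_id] = {"trigger": trigger, "time": time}

    def add_relation(self, source, target, relation):
        if source not in self.events or target not in self.events:
            raise KeyError("both events must be added before linking them")
        self.edges[source].append((relation, target))

    def related(self, event_id, relation):
        # .get avoids creating empty entries for events without edges.
        return [t for r, t in self.edges.get(event_id, []) if r == relation]


# Hypothetical events and relation labels, for illustration only.
graph = EventTimeGraph()
graph.add_event("e1", "earthquake", time="Monday")
graph.add_event("e2", "evacuation", time="Tuesday")
graph.add_event("e3", "quake")
graph.add_relation("e1", "e2", "BEFORE")
graph.add_relation("e1", "e3", "COREFERENCE")

print(graph.related("e1", "BEFORE"))
```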

Mining microbe–disease interactions from literature via a transfer learning model

BMC Bioinformatics ◽

10.1186/s12859-021-04346-7 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Chengkun Wu ◽

Xinyi Xiao ◽

Canqun Yang ◽

JinXiang Chen ◽

Jiacai Yi ◽

...

Keyword(s):

Text Mining ◽

Large Scale ◽

Named Entity Recognition ◽

Learning Model ◽

Biomedical Literature ◽

Fine Tuning ◽

Entity Recognition ◽

Interaction Extraction ◽

Biomedical Texts ◽

Data Browsing

Abstract Background: Interactions between microbes and diseases are of great importance for biomedical research. However, a large number of microbe–disease interactions are hidden in the biomedical literature, and structured databases of such interactions are limited. In this paper, we aim to construct a large-scale database of microbe–disease interactions automatically. We attained this goal by applying text mining methods based on a deep learning model with a moderate curation cost. We also built a user-friendly web interface that allows researchers to navigate and query the required information. Results: First, we manually constructed a gold-standard corpus and a silver-standard corpus (SSC) of microbe–disease interactions for curation. Moreover, we proposed a text mining framework for microbe–disease interaction extraction based on the pretrained model BERE. We applied named entity recognition tools to detect microbe and disease mentions in free biomedical texts. After that, we fine-tuned the pretrained model BERE, which was originally built for drug–target interactions or drug–drug interactions, to recognize relations between the targeted entities. The introduction of the SSC for model fine-tuning greatly improved detection performance for microbe–disease interactions, with an average reduction in error of approximately 10%. The MDIDB website offers data browsing, custom searching for specific diseases or microbes, and batch downloading. Conclusions: Evaluation results demonstrate that our method outperforms the baseline model (rule-based PKDE4J), with an average F1-score of 73.81%. For further validation, we randomly sampled nearly 1000 interactions predicted by our model and manually checked the correctness of each, which gives an accuracy of 73%. The MDIDB website is freely available through http://dbmdi.com/index/
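
Several abstracts in this listing report F1-scores. For reference, F1 is the harmonic mean of precision and recall, and the sketch below computes it from prediction counts. The function name and the counts in the usage line are invented for illustration and are unrelated to the results reported above.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall, computed from raw counts."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        return 0.0  # convention used here when precision or recall is undefined
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Made-up counts: 80 correct predictions, 20 spurious, 20 missed.
print(round(f1_score(80, 20, 20), 4))
```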
