Integrating Deep Learning with Logic Fusion for Information Extraction

Information extraction (IE) aims to produce structured information from an input text; representative tasks include Named Entity Recognition and Relation Extraction. Various approaches to IE have been proposed, based on feature engineering or deep learning. However, most of them fail to capture the complex relationships inherent in the task itself, which have proven to be especially crucial. For example, the relation between two entities is highly dependent on their entity types. These dependencies can be regarded as complex constraints that can be efficiently expressed as logical rules. To combine such logic reasoning capabilities with the learning capabilities of deep neural networks, we propose to integrate logical knowledge in the form of first-order logic into a deep learning system, which can be trained jointly in an end-to-end manner. The integrated framework is able to enhance neural outputs with knowledge regularization via logic rules, and at the same time update the weights of logic rules to comply with the characteristics of the training data. We demonstrate the effectiveness and generalization of the proposed model on multiple IE tasks.
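The idea that a relation constrains the types of its arguments can be illustrated with a soft logic penalty. The sketch below is not the authors' model: it assumes Lukasiewicz fuzzy logic, a single hypothetical rule ("WorksFor(e1, e2) implies e1 is a PERSON and e2 is an ORGANIZATION"), and a fixed rule weight, whereas the abstract describes rule weights that are updated during training. All function and parameter names are illustrative.

```python
def lukasiewicz_and(a: float, b: float) -> float:
    """Soft conjunction of two truth values in [0, 1]."""
    return max(0.0, a + b - 1.0)


def lukasiewicz_implies(a: float, b: float) -> float:
    """Soft implication a -> b; equals 1.0 whenever b >= a."""
    return min(1.0, 1.0 - a + b)


def rule_penalty(p_relation: float, p_head_type: float,
                 p_tail_type: float, weight: float = 1.0) -> float:
    """Penalty for violating: relation -> (head type AND tail type).

    The inputs stand for probabilities predicted by a neural model
    (hypothetical here). The penalty is 0 when the rule is fully
    satisfied and grows as the predictions contradict it.
    """
    required_types = lukasiewicz_and(p_head_type, p_tail_type)
    satisfaction = lukasiewicz_implies(p_relation, required_types)
    return weight * (1.0 - satisfaction)


# Confident relation but an implausible tail type: large penalty.
inconsistent = rule_penalty(p_relation=0.9, p_head_type=0.8, p_tail_type=0.3)
# Confident relation with matching entity types: small penalty.
consistent = rule_penalty(p_relation=0.9, p_head_type=0.95, p_tail_type=0.9)
print(inconsistent > consistent)
```

A penalty of this kind could be added to the usual training loss so that predictions violating the rule are discouraged; how the published framework actually combines the two is not stated in the abstract.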

Named Entity Recognition and Relation Extraction

ACM Computing Surveys ◽

10.1145/3445965 ◽

2021 ◽

Vol 54 (1) ◽

pp. 1-39

Author(s):

Zara Nasar ◽

Syed Waqar Jaffry ◽

Muhammad Kamran Malik

Keyword(s):

Deep Learning ◽

State Of The Art ◽

Named Entity Recognition ◽

Relation Extraction ◽

The State ◽

Entity Recognition ◽

Joint Models ◽

Named Entity ◽

Textual Data ◽

Benchmark Datasets

With the advent of Web 2.0, many online platforms have emerged that produce massive amounts of textual data. With ever-increasing textual data at hand, it is of immense importance to extract information nuggets from this data. One approach to harnessing this unstructured textual data effectively is to transform it into structured text. Hence, this study presents an overview of approaches that can be applied to extract key insights from textual data in a structured way. To this end, the review focuses mainly on Named Entity Recognition and Relation Extraction. The former deals with the identification of named entities, and the latter with the problem of extracting relations between sets of entities. This study covers early approaches as well as the developments made to date using machine learning models. The survey concludes that deep-learning-based hybrid and joint models currently dominate the state of the art. It is also observed that annotated benchmark datasets for various textual-data generators such as Twitter and other social forums are not available. This scarcity of datasets has resulted in relatively little progress in these domains. Additionally, the majority of the state-of-the-art techniques are offline and computationally expensive. Last, with the increasing focus on deep-learning frameworks, there is a need to understand and explain the underlying processes in deep architectures.

A step towards information extraction: Named entity recognition in Bangla using deep learning

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-179349 ◽

2019 ◽

Vol 37 (6) ◽

pp. 7401-7413 ◽

Cited By ~ 1

Author(s):

Redwanul Karim ◽

M. A. Muhiminul Islam ◽

Sazid Rahman Simanto ◽

Saif Ahmed Chowdhury ◽

Kalyan Roy ◽

...

Keyword(s):

Deep Learning ◽

Information Extraction ◽

Named Entity Recognition ◽

Entity Recognition ◽

Named Entity

Machine learning-based named entity recognition via effective integration of various evidences

Natural Language Engineering ◽

10.1017/s1351324904003559 ◽

2005 ◽

Vol 11 (2) ◽

pp. 189-206 ◽

Cited By ~ 4

Author(s):

GUODONG ZHOU ◽

JIAN SU

Keyword(s):

Machine Learning ◽

Named Entity Recognition ◽

Learning System ◽

Training Data ◽

Entity Recognition ◽

Named Entity ◽

Data Sparseness ◽

Constraint Relaxation ◽

Text Document ◽

F Measure

Named entity recognition identifies and classifies entity names in a text document into some predefined categories. It resolves the “who”, “where” and “how much” problems in information extraction and leads to the resolution of the “what” and “how” problems in further processing. This paper presents a Hidden Markov Model (HMM) and proposes an HMM-based named entity recognizer implemented as the system PowerNE. Through the HMM and an effective constraint relaxation algorithm to deal with the data sparseness problem, PowerNE is able to effectively apply and integrate various internal and external evidence of entity names. Currently, four types of evidence are included: (1) a simple deterministic internal feature of the words, such as capitalization and digitalization; (2) an internal semantic feature of the important triggers; (3) an internal gazetteer feature, which determines the appearance of the current word string in the provided gazetteer list; and (4) an external macro context feature, which deals with the name alias phenomena. In this way, the named entity recognition problem is resolved effectively. PowerNE has been benchmarked with the Message Understanding Conferences (MUC) data. The evaluation shows that, using the formal training and test data of the MUC-6 and MUC-7 English named entity tasks, it achieves F-measures of 96.6 and 94.1, respectively. Compared with the best reported machine learning system, it achieves a 1.7 higher F-measure with one quarter of the training data on MUC-6, and a 3.6 higher F-measure with one ninth of the training data on MUC-7. In addition, it performs slightly better than the best reported handcrafted rule-based systems on MUC-6 and MUC-7.
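The first kind of evidence, deterministic surface features such as capitalization and digit patterns, can be computed directly from the word string. The following is a minimal sketch under assumed feature names; it is not PowerNE's actual feature set, which the abstract does not enumerate.

```python
def surface_features(word: str) -> dict:
    """Deterministic internal features of a single token.

    The feature names are hypothetical; they only illustrate the kind
    of capitalization and digit cues the abstract mentions.
    """
    return {
        "init_cap": word[:1].isupper(),                 # e.g. "London"
        "all_caps": word.isalpha() and word.isupper(),  # e.g. "IBM"
        "all_digits": word.isdigit(),                   # e.g. "1990"
        "contains_digit": any(ch.isdigit() for ch in word),
        "has_period": "." in word,                      # e.g. "Mr."
    }


for token in ["IBM", "1990", "Mr.", "said"]:
    print(token, surface_features(token))
```

In an HMM-based tagger, features like these would typically be folded into the observation associated with each token, alongside the word itself.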

An Evaluation of State-of-the-Art Approaches to Relation Extraction for Usage on Domain-Specific Corpora

10.5121/csit.2021.112006 ◽

2021 ◽

Author(s):

Christoph Brandl ◽

Jens Albrecht ◽

Renato Budinich

Keyword(s):

State Of The Art ◽

Extraction Procedure ◽

Named Entity Recognition ◽

Relation Extraction ◽

Building Blocks ◽

Training Data ◽

Entity Recognition ◽

Data Set ◽

Named Entity ◽

Domain Specific

The task of relation extraction aims at classifying the semantic relations between entities in a text. When coupled with named-entity recognition, these can be used as the building blocks for an information extraction procedure that results in the construction of a Knowledge Graph. While many NLP libraries support named-entity recognition, there is no off-the-shelf solution for relation extraction. In this paper, we evaluate and compare several state-of-the-art approaches on a subset of the FewRel data set as well as a manually annotated corpus. The custom corpus contains six relations from the area of market research and is available for public use. Our approach provides guidance for the selection of models and training data for relation extraction in real-world projects.

Deep learning with language models improves named entity recognition for PharmaCoNER

BMC Bioinformatics ◽

10.1186/s12859-021-04260-y ◽

2021 ◽

Vol 22 (S1) ◽

Author(s):

Cong Sun ◽

Zhihao Yang ◽

Lei Wang ◽

Yin Zhang ◽

Hongfei Lin ◽

...

Keyword(s):

Deep Learning ◽

Language Processing ◽

Domain Knowledge ◽

Named Entity Recognition ◽

Model Performance ◽

Relation Extraction ◽

Entity Recognition ◽

Language Models ◽

Named Entity ◽

Biomedical Texts

Background: The recognition of pharmacological substances, compounds and proteins is essential for biomedical relation extraction, knowledge graph construction, drug discovery, as well as medical question answering. Although considerable efforts have been made to recognize biomedical entities in English texts, to date, only a few limited attempts have been made to recognize them in biomedical texts in other languages. PharmaCoNER is a named entity recognition challenge to recognize pharmacological entities from Spanish texts. Because there are currently abundant resources in the field of natural language processing, how to leverage these resources for the PharmaCoNER challenge is a meaningful question to study.

Methods: Inspired by the success of deep learning with language models, we compare and explore various representative BERT models to promote the development of the PharmaCoNER task.

Results: The experimental results show that deep learning with language models can effectively improve model performance on the PharmaCoNER dataset. Our method achieves state-of-the-art performance on the PharmaCoNER dataset, with a max F1-score of 92.01%.

Conclusion: For the BERT models on the PharmaCoNER dataset, biomedical domain knowledge has a greater impact on model performance than the native language (i.e., Spanish). The BERT models can obtain competitive performance by using WordPiece to alleviate the out-of-vocabulary limitation. The performance of the BERT model can be further improved by constructing a specific vocabulary based on domain knowledge. Moreover, the character case also has a certain impact on model performance.
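To see why WordPiece eases the out-of-vocabulary problem, it helps to look at how a word is split at inference time: the tokenizer greedily takes the longest vocabulary entry that matches the start of the remaining characters, marking non-initial pieces with a "##" prefix. The sketch below is a simplified pure-Python version of that greedy longest-match-first procedure; the toy vocabulary is invented for illustration and is not the vocabulary used in the paper.

```python
def wordpiece_tokenize(word: str, vocab: set, unk: str = "[UNK]") -> list:
    """Split one word into subword pieces by greedy longest match."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation marker
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece matches at this position: the whole word is unknown.
            return [unk]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical domain vocabulary with drug-name fragments.
toy_vocab = {"para", "##ceta", "##mol", "ibu", "##pro", "##fen"}
print(wordpiece_tokenize("paracetamol", toy_vocab))
print(wordpiece_tokenize("ibuprofen", toy_vocab))
print(wordpiece_tokenize("xyz", toy_vocab))
```

A word that is absent from the vocabulary as a whole can still be represented as long as its fragments are present, which is consistent with the abstract's observation that a domain-specific vocabulary can help further.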

DocNER: A Deep Learning System for Named Entity Recognition in Handwritten Document Images

10.1007/978-3-030-92310-5_28 ◽

2021 ◽

pp. 239-246

Author(s):

Marwa Dhiaf ◽

Sana Khamekhem Jemni ◽

Yousri Kessentini

Keyword(s):

Deep Learning ◽

Named Entity Recognition ◽

Learning System ◽

Entity Recognition ◽

Document Images ◽

Named Entity ◽

Handwritten Document

Transfer Learning for Named Entity Recognition in Financial and Biomedical Documents

Information ◽

10.3390/info10080248 ◽

2019 ◽

Vol 10 (8) ◽

pp. 248 ◽

Cited By ~ 3

Author(s):

Sumam Francis ◽

Jordy Van Landeghem ◽

Marie-Francine Moens

Keyword(s):

Deep Learning ◽

Transfer Learning ◽

State Of The Art ◽

Named Entity Recognition ◽

Training Data ◽

Entity Recognition ◽

Language Models ◽

Reasonable Assumption ◽

Target Domain ◽

Named Entity

Recent deep learning approaches have shown promising results for named entity recognition (NER). A reasonable assumption for training robust deep learning models is that a sufficient amount of high-quality annotated training data is available. However, in many real-world scenarios, labeled training data is scarce. In this paper we consider two use cases: generic entity extraction from financial and from biomedical documents. First, we have developed a character-based model for NER in financial documents and a word- and character-based model with attention for NER in biomedical documents. Further, we have analyzed how transfer learning addresses the problem of limited training data in a target domain. We demonstrate through experiments that NER models trained on labeled data from a source domain can be used as base models and then be fine-tuned with a small amount of labeled data for recognition of different named entity classes in a target domain. We also witness an interest in language models to improve NER as a way of coping with limited labeled data. The current most successful language model is BERT. Because of its success in state-of-the-art models, we integrate representations based on BERT in our biomedical NER model along with word and character information. The results are compared with a state-of-the-art model applied to a benchmark biomedical corpus.

Combining Contextualized Embeddings and Prior Knowledge for Clinical Named Entity Recognition: Evaluation Study

JMIR Medical Informatics ◽

10.2196/14850 ◽

2019 ◽

Vol 7 (4) ◽

pp. e14850 ◽

Cited By ~ 4

Author(s):

Min Jiang ◽

Todd Sanger ◽

Xiong Liu

Keyword(s):

Deep Learning ◽

Prior Knowledge ◽

Language Processing ◽

Named Entity Recognition ◽

Word Embedding ◽

Training Data ◽

Entity Recognition ◽

Named Entities ◽

Clinical Text ◽

Named Entity

Background: Named entity recognition (NER) is a key step in clinical natural language processing (NLP). Traditionally, rule-based systems leverage prior knowledge to define rules to identify named entities. Recently, deep learning–based NER systems have become more and more popular. Contextualized word embedding, as a new type of representation of the word, has been proposed to dynamically capture word sense using context information and has proven successful in many deep learning–based systems in both the general and medical domains. However, there are very few studies that investigate the effects of combining multiple contextualized embeddings and prior knowledge on the clinical NER task.

Objective: This study aims to improve the performance of NER in clinical text by combining multiple contextual embeddings and prior knowledge.

Methods: In this study, we investigate the effects of combining multiple contextualized word embeddings with classic word embedding in deep neural networks to predict named entities in clinical text. We also investigate whether using a semantic lexicon could further improve the performance of the clinical NER system.

Results: By combining contextualized embeddings such as ELMo and Flair, our system achieves an F-1 score of 87.30% when trained on only a portion of the 2010 Informatics for Integrating Biology and the Bedside NER task dataset. After incorporating the medical lexicon into the word embedding, the F-1 score was further increased to 87.44%. Another finding was that our system could still achieve an F-1 score of 85.36% when the size of the training data was reduced to 40%.

Conclusions: Combined contextualized embedding could be beneficial for the clinical NER task. Moreover, the semantic lexicon could be used to further improve the performance of the clinical NER system.

Probabilistic vs deep learning based approaches for narrow domain NER in Spanish

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-179868 ◽

2020 ◽

Vol 39 (2) ◽

pp. 2015-2025

Author(s):

Orlando Ramos-Flores ◽

David Pinto ◽

Manuel Montes-y-Gómez ◽

Andrés Vázquez

Keyword(s):

Deep Learning ◽

Conditional Random Fields ◽

Short Term Memory ◽

Named Entity Recognition ◽

Training Data ◽

Entity Recognition ◽

Mexican Spanish ◽

Named Entity ◽

Long Short Term Memory ◽

Deep Learning Model

This work presents an experimental study on the task of Named Entity Recognition (NER) for a narrow domain in the Spanish language. This study considers two approaches commonly used in this kind of problem, namely, a Conditional Random Fields (CRF) model and a Recurrent Neural Network (RNN). For the latter, we employed a bidirectional Long Short-Term Memory (BiLSTM) with ELMo’s pre-trained word embeddings for Spanish. The comparison between the probabilistic model and the deep learning model was carried out on two collections: the Spanish dataset from CoNLL-2002, considering four classes under the IOB tagging scheme, and a Mexican Spanish news dataset with seventeen classes under the IOBES scheme. The paper presents an analysis of the scalability, robustness, and common errors of both models. This analysis indicates in general that the BiLSTM-ELMo model is more suitable than the CRF model when there is “enough” training data, and also that it is more scalable, as its performance was not significantly affected in the incremental experiments (by adding one class at a time). On the other hand, results indicate that the CRF model is more adequate for scenarios having small training datasets and many classes.
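The two tagging schemes mentioned above differ only in how entity boundaries are marked: IOBES adds an explicit end tag (E-) and a tag for single-token entities (S-). A minimal conversion sketch follows, assuming the input uses the IOB2 convention in which every entity starts with a B- tag; the function name and example tags are illustrative and not taken from the paper.

```python
def iob_to_iobes(tags: list) -> list:
    """Convert a sequence of IOB2 tags to IOBES tags."""
    converted = []
    for i, tag in enumerate(tags):
        if tag == "O":
            converted.append(tag)
            continue
        prefix, label = tag.split("-", 1)
        next_tag = tags[i + 1] if i + 1 < len(tags) else "O"
        # The entity continues only if the next tag is I- with the same label.
        continues = next_tag == "I-" + label
        if prefix == "B":
            converted.append(("B-" if continues else "S-") + label)
        else:  # prefix == "I"
            converted.append(("I-" if continues else "E-") + label)
    return converted


example = ["B-PER", "I-PER", "O", "B-LOC", "B-ORG", "I-ORG", "I-ORG"]
print(iob_to_iobes(example))
# → ['B-PER', 'E-PER', 'O', 'S-LOC', 'B-ORG', 'I-ORG', 'E-ORG']
```

Because IOBES uses four boundary tags per class instead of two, a dataset with seventeen classes yields a considerably larger label set than the same classes under IOB.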

Extraction of Information Related to Adverse Drug Events from Electronic Health Record Notes: Design of an End-to-End Model Based on Deep Learning (Preprint)

10.2196/preprints.12159 ◽

2018 ◽

Author(s):

Fei Li ◽

Weisong Liu ◽

Hong Yu

Keyword(s):

Deep Learning ◽

Adverse Drug Events ◽

Named Entity Recognition ◽

Relation Extraction ◽

Learning Model ◽

Entity Recognition ◽

Health Record ◽

Named Entity ◽

Related Information ◽

Deep Learning Model

BACKGROUND: Pharmacovigilance and drug-safety surveillance are crucial for monitoring adverse drug events (ADEs), but the main ADE-reporting systems such as the Food and Drug Administration Adverse Event Reporting System face challenges such as underreporting. Therefore, as complementary surveillance, data on ADEs are extracted from electronic health record (EHR) notes via natural language processing (NLP). As NLP develops, many up-to-date machine-learning techniques are introduced in this field, such as deep learning and multi-task learning (MTL). However, only a few studies have focused on employing such techniques to extract ADEs.

OBJECTIVE: We aimed to design a deep learning model for extracting ADEs and related information such as medications and indications. Since extraction of ADE-related information includes two steps (named entity recognition and relation extraction), our second objective was to improve the deep learning model using multi-task learning between the two steps.

METHODS: We employed the dataset from the Medication, Indication and Adverse Drug Events (MADE) 1.0 challenge to train and test our models. This dataset consists of 1089 EHR notes of cancer patients and includes 9 entity types such as Medication, Indication, and ADE and 7 types of relations between these entities. To extract information from the dataset, we proposed a deep-learning model that uses a bidirectional long short-term memory (BiLSTM) conditional random field network to recognize entities and a BiLSTM-Attention network to extract relations. To further improve the deep-learning model, we employed three typical MTL methods, namely, hard parameter sharing, parameter regularization, and task relation learning, to build three MTL models, called HardMTL, RegMTL, and LearnMTL, respectively.

RESULTS: Since extraction of ADE-related information is a two-step task, the result of the second step (i.e., relation extraction) was used to compare all models. We used microaveraged precision, recall, and F1 as evaluation metrics. Our deep learning model achieved state-of-the-art results (F1=65.9%), which is significantly higher than that (F1=61.7%) of the best system in the MADE 1.0 challenge. HardMTL further improved the F1 by 0.8 percentage points, boosting the F1 to 66.7%, whereas RegMTL and LearnMTL failed to boost the performance.

CONCLUSIONS: Deep learning models can significantly improve the performance of ADE-related information extraction. MTL may be effective for named entity recognition and relation extraction, but it depends on the methods, data, and other factors. Our results can facilitate research on ADE detection, NLP, and machine learning.
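Microaveraged metrics pool the true-positive, false-positive, and false-negative counts over all classes before computing precision, recall, and F1, so frequent classes weigh more than rare ones. The sketch below shows the arithmetic; the class names and counts are invented for illustration and are not results from the study.

```python
def micro_prf(counts: dict) -> tuple:
    """Microaveraged precision, recall and F1.

    `counts` maps each class to a dict with keys "tp", "fp" and "fn".
    """
    tp = sum(c["tp"] for c in counts.values())
    fp = sum(c["fp"] for c in counts.values())
    fn = sum(c["fn"] for c in counts.values())
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical per-relation counts (not taken from the paper).
toy_counts = {
    "Medication-ADE": {"tp": 2, "fp": 3, "fn": 3},
    "Medication-Indication": {"tp": 8, "fp": 2, "fn": 2},
}
precision, recall, f1 = micro_prf(toy_counts)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

In this toy case the pooled counts are 10 true positives, 5 false positives, and 5 false negatives, so the frequent class dominates the pooled score even though the rare class is predicted poorly.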
