Relation Extraction Exploiting Full Dependency Forests

Dependency syntax has long been recognized as a crucial source of features for relation extraction. Previous work considers 1-best trees produced by a parser during preprocessing. However, error propagation from the out-of-domain parser may impact the relation extraction performance. We propose to leverage full dependency forests for this task, where a full dependency forest encodes all possible trees. Such representations of full dependency forests provide a differentiable connection between a parser and a relation extraction model, and thus we are also able to study adjusting the parser parameters based on end-task loss. Experiments on three datasets show that full dependency forests and parser adjustment give significant improvements over carefully designed baselines, showing state-of-the-art or competitive performances on biomedical or newswire benchmarks.

Download Full-text

Joint Extraction of Entities and Overlapping Relations Using Position-Attentive Sequence Labeling

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33016300 ◽

2019 ◽

Vol 33 ◽

pp. 6300-6308 ◽

Cited By ~ 5

Author(s):

Dai Dai ◽

Xinyan Xiao ◽

Yajuan Lyu ◽

Shan Dou ◽

Qiaoqiao She ◽

...

Keyword(s):

Long Range ◽

State Of The Art ◽

Relation Extraction ◽

Attention Mechanism ◽

Single Model ◽

Sequence Labeling ◽

Word Position ◽

Query Word ◽

Public Datasets ◽

Extraction Model

Joint entity and relation extraction is to detect entity and relation using a single model. In this paper, we present a novel unified joint extraction model which directly tags entity and relation labels according to a query word position p, i.e., detecting entity at p, and identifying entities at other positions that have relationship with the former. To this end, we first design a tagging scheme to generate n tag sequences for an n-word sentence. Then a position-attention mechanism is introduced to produce different sentence representations for every query position to model these n tag sequences. In this way, our method can simultaneously extract all entities and their type, as well as all overlapping relations. Experiment results show that our framework performances significantly better on extracting overlapping relations as well as detecting long-range relation, and thus we achieve state-of-the-art performance on two public datasets.

Download Full-text

Learning Latent Forests for Medical Relation Extraction

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/505 ◽

2020 ◽

Author(s):

Zhijiang Guo ◽

Guoshun Nan ◽

Wei LU ◽

Shay B. Cohen

Keyword(s):

Error Propagation ◽

Latent Variable ◽

State Of The Art ◽

Relation Extraction ◽

Medical Texts ◽

Tree Structures ◽

Dependency Structure ◽

Dependency Tree ◽

Non Local ◽

Novel Model

The goal of medical relation extraction is to detect relations among entities, such as genes, mutations and drugs in medical texts. Dependency tree structures have been proven useful for this task. Existing approaches to such relation extraction leverage off-the-shelf dependency parsers to obtain a syntactic tree or forest for the text. However, for the medical domain, low parsing accuracy may lead to error propagation downstream the relation extraction pipeline. In this work, we propose a novel model which treats the dependency structure as a latent variable and induces it from the unstructured text in an end-to-end fashion. Our model can be understood as composing task-specific dependency forests that capture non-local interactions for better relation extraction. Extensive results on four datasets show that our model is able to significantly outperform state-of-the-art systems without relying on any direct tree supervision or pre-training.

Download Full-text

A Boundary Determined Neural Model for Relation Extraction

International Journal of Computers Communications & Control ◽

10.15837/ijccc.2021.3.4235 ◽

2021 ◽

Vol 16 (3) ◽

Author(s):

RUI XUE TANG

Keyword(s):

Neural Network ◽

Error Propagation ◽

State Of The Art ◽

Relation Extraction ◽

Neural Model ◽

Relevant Information ◽

Propagation Problem ◽

Distributed Representation ◽

High Quality ◽

Information Encoding

Existing models extract entity relations only after two entity spans have been precisely extracted that influenced the performance of relation extraction. Compared with recognizing entity spans, because the boundary has a small granularity and a less ambiguity, it can be detected precisely and incorporated to learn better representation. Motivated by the strengths of boundary, we propose a boundary determined neural (BDN) model, which leverages boundaries as task-related cues to predict the relation labels. Our model can predict high-quality relation instance via the pairs of boundaries, which can relieve error propagation problem. Moreover, our model fuses with boundary-relevant information encoding to represent distributed representation to improve the ability of capturing semantic and dependency information, which can increase the discriminability of neural network. Experiments show that our model achieves state-of-the-art performances on ACE05 corpus.

Download Full-text

Utilizing Entity-Based Gated Convolution and Multilevel Sentence Attention to Improve Distantly Supervised Relation Extraction

Computational Intelligence and Neuroscience ◽

10.1155/2021/6110885 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Qian Yi ◽

Guixuan Zhang ◽

Shuwu Zhang

Keyword(s):

Selective Attention ◽

Large Scale ◽

State Of The Art ◽

Relation Extraction ◽

Benchmark Dataset ◽

Experimental Results ◽

Fine Grained ◽

Convolution Operation ◽

Distant Supervision ◽

Extraction Model

Distant supervision is an effective method to automatically collect large-scale datasets for relation extraction (RE). Automatically constructed datasets usually comprise two types of noise: the intrasentence noise and the wrongly labeled noisy sentence. To address issues caused by the above two types of noise and improve distantly supervised relation extraction, this paper proposes a novel distantly supervised relation extraction model, which consists of an entity-based gated convolution sentence encoder and a multilevel sentence selective attention (Matt) module. Specifically, we first apply an entity-based gated convolution operation to force the sentence encoder to extract entity-pair-related features and filter out useless intrasentence noise information. Furthermore, the multilevel attention schema fuses the bag information to obtain a fine-grained bag-specific query vector, which can better identify valid sentences and reduce the influence of wrongly labeled sentences. Experimental results on a large-scale benchmark dataset show that our model can effectively reduce the influence of the above two types of noise and achieves state-of-the-art performance in relation extraction.

Download Full-text

Named Entity Recognition and Relation Extraction

ACM Computing Surveys ◽

10.1145/3445965 ◽

2021 ◽

Vol 54 (1) ◽

pp. 1-39

Author(s):

Zara Nasar ◽

Syed Waqar Jaffry ◽

Muhammad Kamran Malik

Keyword(s):

Deep Learning ◽

State Of The Art ◽

Named Entity Recognition ◽

Relation Extraction ◽

The State ◽

Entity Recognition ◽

Joint Models ◽

Named Entity ◽

Textual Data ◽

Benchmark Datasets

With the advent of Web 2.0, there exist many online platforms that result in massive textual-data production. With ever-increasing textual data at hand, it is of immense importance to extract information nuggets from this data. One approach towards effective harnessing of this unstructured textual data could be its transformation into structured text. Hence, this study aims to present an overview of approaches that can be applied to extract key insights from textual data in a structured way. For this, Named Entity Recognition and Relation Extraction are being majorly addressed in this review study. The former deals with identification of named entities, and the latter deals with problem of extracting relation between set of entities. This study covers early approaches as well as the developments made up till now using machine learning models. Survey findings conclude that deep-learning-based hybrid and joint models are currently governing the state-of-the-art. It is also observed that annotated benchmark datasets for various textual-data generators such as Twitter and other social forums are not available. This scarcity of dataset has resulted into relatively less progress in these domains. Additionally, the majority of the state-of-the-art techniques are offline and computationally expensive. Last, with increasing focus on deep-learning frameworks, there is need to understand and explain the under-going processes in deep architectures.

Download Full-text

Extraction of causal relations based on SBEL and BERT model

Database ◽

10.1093/database/baab005 ◽

2021 ◽

Vol 2021 ◽

Author(s):

Yifan Shao ◽

Haoru Li ◽

Jinghang Gu ◽

Longhua Qian ◽

Guodong Zhou

Keyword(s):

State Of The Art ◽

Causal Relation ◽

Relation Extraction ◽

The Other ◽

Biomedical Text ◽

Intermediate Form ◽

Biomedical Text Mining ◽

Causal Relations ◽

The One ◽

Stage 1

Abstract Extraction of causal relations between biomedical entities in the form of Biological Expression Language (BEL) poses a new challenge to the community of biomedical text mining due to the complexity of BEL statements. We propose a simplified form of BEL statements [Simplified Biological Expression Language (SBEL)] to facilitate BEL extraction and employ BERT (Bidirectional Encoder Representation from Transformers) to improve the performance of causal relation extraction (RE). On the one hand, BEL statement extraction is transformed into the extraction of an intermediate form—SBEL statement, which is then further decomposed into two subtasks: entity RE and entity function detection. On the other hand, we use a powerful pretrained BERT model to both extract entity relations and detect entity functions, aiming to improve the performance of two subtasks. Entity relations and functions are then combined into SBEL statements and finally merged into BEL statements. Experimental results on the BioCreative-V Track 4 corpus demonstrate that our method achieves the state-of-the-art performance in BEL statement extraction with F1 scores of 54.8% in Stage 2 evaluation and of 30.1% in Stage 1 evaluation, respectively. Database URL: https://github.com/grapeff/SBEL_datasets

Download Full-text

Ontology of human relation extraction based on dependency syntax rules

Proceedings of the International Conference on Web Intelligence - WI '17 ◽

10.1145/3106426.3109050 ◽

2017 ◽

Author(s):

Long He ◽

Likun Qiu

Keyword(s):

Relation Extraction ◽

Human Relation ◽

Dependency Syntax

Download Full-text

Multi-Graph Cooperative Learning Towards Distant Supervised Relation Extraction

ACM Transactions on Intelligent Systems and Technology ◽

10.1145/3466560 ◽

2021 ◽

Vol 12 (5) ◽

pp. 1-21

Author(s):

Changsen Yuan ◽

Heyan Huang ◽

Chong Feng

Keyword(s):

Cooperative Learning ◽

State Of The Art ◽

Relation Extraction ◽

Sentence Length ◽

Universal Relation ◽

Dependency Parsing ◽

Convolutional Network ◽

Syntactic Features ◽

Use Dependency

The Graph Convolutional Network (GCN) is a universal relation extraction method that can predict relations of entity pairs by capturing sentences’ syntactic features. However, existing GCN methods often use dependency parsing to generate graph matrices and learn syntactic features. The quality of the dependency parsing will directly affect the accuracy of the graph matrix and change the whole GCN’s performance. Because of the influence of noisy words and sentence length in the distant supervised dataset, using dependency parsing on sentences causes errors and leads to unreliable information. Therefore, it is difficult to obtain credible graph matrices and relational features for some special sentences. In this article, we present a Multi-Graph Cooperative Learning model (MGCL), which focuses on extracting the reliable syntactic features of relations by different graphs and harnessing them to improve the representations of sentences. We conduct experiments on a widely used real-world dataset, and the experimental results show that our model achieves the state-of-the-art performance of relation extraction.

Download Full-text