Multi-Graph Cooperative Learning Towards Distant Supervised Relation Extraction

2021 ◽  
Vol 12 (5) ◽  
pp. 1-21
Author(s):  
Changsen Yuan ◽  
Heyan Huang ◽  
Chong Feng

The Graph Convolutional Network (GCN) is a general-purpose relation extraction method that predicts the relations of entity pairs by capturing sentences’ syntactic features. However, existing GCN methods typically use dependency parsing to generate the graph matrices from which syntactic features are learned, so the quality of the dependency parse directly affects the accuracy of the graph matrix and, in turn, the performance of the whole GCN. Because of noisy words and long sentences in distantly supervised datasets, dependency parsing introduces errors and yields unreliable information, which makes it difficult to obtain credible graph matrices and relational features for such sentences. In this article, we present a Multi-Graph Cooperative Learning model (MGCL), which extracts reliable syntactic features of relations from different graphs and harnesses them to improve sentence representations. We conduct experiments on a widely used real-world dataset, and the experimental results show that our model achieves state-of-the-art performance on relation extraction.
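
As a concrete reference for the graph-matrix idea above, here is a minimal NumPy sketch of a single GCN layer applied over an adjacency matrix built from a dependency parse. The hard-coded edges, dimensions, and the `gcn_layer` helper are illustrative assumptions, not the MGCL model itself.

```python
import numpy as np

def gcn_layer(adj, h, w):
    """One GCN layer: H' = ReLU(D^-1/2 (A + I) D^-1/2 H W)."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(a_hat.sum(axis=1)))
    a_norm = d_inv_sqrt @ a_hat @ d_inv_sqrt    # symmetric normalisation
    return np.maximum(0.0, a_norm @ h @ w)      # ReLU activation

# Toy sentence of 4 tokens; the edges would come from a dependency parse
# (here hard-coded), token dimension 8, output dimension 4.
adj = np.zeros((4, 4))
for head, dep in [(0, 1), (1, 2), (1, 3)]:
    adj[head, dep] = adj[dep, head] = 1.0       # treat the tree as undirected

rng = np.random.default_rng(0)
h = rng.normal(size=(4, 8))                     # token representations
w = rng.normal(size=(8, 4))                     # layer weights
print(gcn_layer(adj, h, w).shape)               # (4, 4)
```

In the distantly supervised setting the paper targets, the reliability of `adj` is exactly what parsing errors compromise, which is why MGCL combines multiple graphs rather than trusting a single parse.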

Author(s):  
Shengqiong Wu ◽  
Hao Fei ◽  
Yafeng Ren ◽  
Donghong Ji ◽  
Jingye Li

In this paper, we propose to enhance the pair-wise aspect and opinion term extraction (PAOTE) task by incorporating rich syntactic knowledge. We first build a syntax fusion encoder for encoding syntactic features, including a label-aware graph convolutional network (LAGCN) that models dependency edges, dependency labels, and POS tags in a unified manner, and a local-attention module that encodes POS tags for better term boundary detection. During pairing, we then adopt Biaffine and Triaffine scoring for high-order aspect-opinion term pairing, while re-harnessing the syntax-enriched representations from the LAGCN for syntax-aware scoring. Experimental results on four benchmark datasets demonstrate that our model outperforms current state-of-the-art baselines while yielding explainable predictions grounded in syntactic knowledge.
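
To make the pairing step concrete, below is a minimal NumPy sketch of a biaffine score between one aspect-term vector and one opinion-term vector. The variable names and dimensions are assumptions for illustration; the Triaffine extension and the LAGCN encoder are not reproduced here.

```python
import numpy as np

def biaffine_score(h_aspect, h_opinion, U, W, b):
    """Biaffine pairing score: h_a^T U h_o + W [h_a; h_o] + b."""
    bilinear = h_aspect @ U @ h_opinion
    linear = W @ np.concatenate([h_aspect, h_opinion])
    return bilinear + linear + b

rng = np.random.default_rng(0)
d = 16
h_a, h_o = rng.normal(size=d), rng.normal(size=d)   # syntax-enriched term vectors
U = rng.normal(size=(d, d))                         # bilinear weight
W = rng.normal(size=2 * d)                          # linear weight
print(biaffine_score(h_a, h_o, U, W, 0.0))          # scalar pairing score
```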


Author(s):  
Yujin Yuan ◽  
Liyuan Liu ◽  
Siliang Tang ◽  
Zhongfei Zhang ◽  
Yueting Zhuang ◽  
...  

Distant supervision leverages knowledge bases to automatically label instances, thus allowing us to train a relation extractor without human annotations. However, the generated training data typically contain massive noise and may result in poor performance under vanilla supervised learning. In this paper, we propose to conduct multi-instance learning with a novel Cross-relation Cross-bag Selective Attention (C2SA), which leads to noise-robust training for the distantly supervised relation extractor. Specifically, we employ sentence-level selective attention to reduce the effect of noisy or mismatched sentences, while the correlation among relations is captured to improve the quality of the attention weights. Moreover, instead of treating all entity pairs equally, we pay more attention to higher-quality entity pairs, again using the selective attention mechanism to achieve this goal. Experiments with two types of relation extractors demonstrate the superiority of the proposed approach over the state of the art, while further ablation studies verify our intuitions and demonstrate the effectiveness of the two proposed techniques.
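
The following is a minimal NumPy sketch of the sentence-level selective attention idea: each sentence in a bag for one entity pair is weighted by how well it matches a relation query vector, so noisy sentences receive low weight. The vectors and dimensions are invented for illustration, and the cross-relation and cross-bag extensions of C2SA are not shown.

```python
import numpy as np

def selective_attention(sent_reprs, rel_query):
    """Weight each sentence in a bag by its match with a relation query vector."""
    scores = sent_reprs @ rel_query                 # (num_sentences,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                        # softmax over the bag
    return weights @ sent_reprs                     # bag representation

rng = np.random.default_rng(0)
bag = rng.normal(size=(5, 32))      # 5 noisy sentences mentioning one entity pair
query = rng.normal(size=32)         # embedding of the candidate relation
print(selective_attention(bag, query).shape)        # (32,)
```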


2019 ◽  
Vol 55 (2) ◽  
pp. 305-337 ◽  
Author(s):  
Alina Wróblewska ◽  
Piotr Rybak

The predicate-argument structure transparently encoded in dependency-based syntactic representations supports machine translation, question answering, information extraction, etc. The quality of dependency parsing is therefore a crucial issue in natural language processing. In the current paper we discuss the fundamental ideas of dependency theory and provide an overview of selected dependency-based resources for Polish. Furthermore, we present some state-of-the-art dependency parsing systems whose models can be estimated on correctly annotated data. In the experimental part, we provide an in-depth evaluation of these systems on Polish data. Our results show that graph-based parsers, even those without any neural component, are better suited for Polish than transition-based parsing systems.
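
Parser comparisons of this kind are usually reported as attachment scores. The short sketch below computes unlabeled and labeled attachment scores (UAS/LAS) for a toy parse, assuming one head index and one dependency label per token; it is a generic evaluation helper, not the evaluation code used in the paper.

```python
def uas_las(gold_heads, gold_labels, pred_heads, pred_labels):
    """Unlabeled/labeled attachment scores over one set of parsed tokens."""
    total = len(gold_heads)
    uas = sum(g == p for g, p in zip(gold_heads, pred_heads)) / total
    las = sum(gh == ph and gl == pl
              for gh, gl, ph, pl in zip(gold_heads, gold_labels,
                                        pred_heads, pred_labels)) / total
    return uas, las

# Toy 4-token sentence: each entry is the index of the token's syntactic head.
gold_h, gold_l = [0, 1, 1, 2], ["root", "nsubj", "obj", "amod"]
pred_h, pred_l = [0, 1, 2, 2], ["root", "nsubj", "obj", "nmod"]
print(uas_las(gold_h, gold_l, pred_h, pred_l))   # (0.75, 0.5)
```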


2017 ◽  
Vol 13 (3) ◽  
pp. 281-301 ◽  
Author(s):  
Omar El Idrissi Esserhrouchni ◽  
Bouchra Frikh ◽  
Brahim Ouhbi ◽  
Ismail Khalil Ibrahim

Purpose The aim of this paper is to present an online framework, called TaxoLine, for automatically building a domain taxonomy from Web documents. Design/methodology/approach TaxoLine proposes an innovative methodology that combines frequency and conditional mutual information to improve the quality of the domain taxonomy. The system also includes a set of mechanisms that reduce the execution time needed to build the ontology. Findings The TaxoLine framework was applied to nine different financial corpora. The generated taxonomies are evaluated against a gold-standard ontology and compared to state-of-the-art ontology learning methods. Originality/value The experimental results show that TaxoLine achieves higher precision and recall for both concept and relation extraction than well-known ontology learning algorithms. Furthermore, it also shows promising results in terms of the execution time needed to build the domain taxonomy.
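
As background for the frequency-plus-information-theoretic scoring mentioned above, here is a minimal NumPy sketch of a discrete conditional mutual information estimator computed from a co-occurrence count table. The choice of variables (two candidate terms conditioned on a topic indicator) and the toy counts are assumptions; TaxoLine's actual term-scoring formula is not reproduced.

```python
import numpy as np

def conditional_mutual_information(joint):
    """I(X;Y|Z) from a joint count array of shape (|X|, |Y|, |Z|)."""
    p = joint / joint.sum()
    p_z = p.sum(axis=(0, 1), keepdims=True)
    p_xz = p.sum(axis=1, keepdims=True)
    p_yz = p.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = np.where(p > 0, p * p_z / (p_xz * p_yz), 1.0)
    return float(np.sum(np.where(p > 0, p * np.log(ratio), 0.0)))

# Toy co-occurrence counts of two candidate terms conditioned on a topic flag.
counts = np.array([[[8, 1], [2, 3]],
                   [[1, 4], [3, 8]]], dtype=float)
print(conditional_mutual_information(counts))
```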


Author(s):  
Min Shi ◽  
Yufei Tang ◽  
Xingquan Zhu ◽  
David Wilson ◽  
Jianxun Liu

Networked data often demonstrate the Pareto principle (i.e., the 80/20 rule) with skewed class distributions, where most vertices belong to a few majority classes and minority classes contain only a handful of instances. When presented with imbalanced class distributions, existing graph embedding learning tends to be biased toward nodes from majority classes, leaving nodes from minority classes under-trained. In this paper, we propose Dual-Regularized Graph Convolutional Networks (DR-GCN) to handle multi-class imbalanced graphs, where two types of regularization are imposed to tackle class-imbalanced representation learning. To ensure that all classes are equally represented, we propose a class-conditioned adversarial training process to facilitate the separation of labeled nodes. Meanwhile, to maintain training equilibrium (i.e., retaining quality of fit across all classes), we force unlabeled nodes to follow a latent distribution similar to that of the labeled nodes by minimizing their difference in the embedding space. Experiments on real-world imbalanced graphs demonstrate that DR-GCN outperforms state-of-the-art methods in node classification, graph clustering, and visualization.
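
The distribution-matching regularizer described above can be illustrated with a simple moment-matching penalty between labeled and unlabeled node embeddings, sketched below in NumPy. This is a stand-in for the idea of minimizing their difference in the embedding space, not DR-GCN's exact loss, and all array shapes are invented.

```python
import numpy as np

def alignment_loss(z_labeled, z_unlabeled):
    """Penalise the gap between labeled and unlabeled embedding statistics
    (first and second moments): a simple moment-matching regulariser."""
    mean_gap = np.sum((z_labeled.mean(axis=0) - z_unlabeled.mean(axis=0)) ** 2)
    var_gap = np.sum((z_labeled.var(axis=0) - z_unlabeled.var(axis=0)) ** 2)
    return mean_gap + var_gap

rng = np.random.default_rng(0)
z_lab = rng.normal(loc=0.0, size=(30, 16))    # embeddings of labeled nodes
z_unl = rng.normal(loc=0.5, size=(200, 16))   # embeddings of unlabeled nodes
print(alignment_loss(z_lab, z_unl))           # shrinks as the distributions match
```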


2020 ◽  
Vol 34 (05) ◽  
pp. 9106-9113
Author(s):  
Amir Veyseh ◽  
Franck Dernoncourt ◽  
My Thai ◽  
Dejing Dou ◽  
Thien Nguyen

Relation Extraction (RE) is one of the fundamental tasks in Information Extraction. The goal of this task is to find the semantic relations between entity mentions in text. It has been shown in much previous work that the structure of the sentences (i.e., dependency trees) can provide important information/features for RE models. However, a common limitation of previous work on RE is the reliance on external parsers to obtain the syntactic trees for the sentence structures. On the one hand, it is not guaranteed that independent external parsers offer the optimal sentence structures for RE, and structures customized for RE might further improve performance. On the other hand, the quality of the external parsers might suffer when they are applied to different domains, thus also affecting the performance of the RE models on such domains. In order to overcome this issue, we introduce a novel method for RE that simultaneously induces the structures and predicts the relations for the input sentences, thus avoiding external parsers and potentially leading to better sentence structures for RE. Our general strategy to learn the RE-specific structures is to apply two different methods to infer the structures for the input sentences (i.e., two views). We then introduce several mechanisms to encourage structural and semantic consistency between these two views so that effective structure and semantic representations for RE can emerge. We perform extensive experiments on the ACE 2005 and SemEval 2010 datasets to demonstrate the advantages of the proposed method, which achieves state-of-the-art performance on these datasets.
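
One common way to induce sentence structure without an external parser is to read a soft token-to-token graph off attention scores and then penalize disagreement between two such induced graphs. The NumPy sketch below illustrates that pattern with a KL-based consistency term; the projection matrices, dimensions, and the specific consistency measure are assumptions, not the paper's exact mechanisms.

```python
import numpy as np

def induce_structure(h, w_q, w_k):
    """Soft token-to-token structure from scaled dot-product attention scores."""
    q, k = h @ w_q, h @ w_k
    scores = q @ k.T / np.sqrt(q.shape[1])
    scores -= scores.max(axis=1, keepdims=True)
    a = np.exp(scores)
    return a / a.sum(axis=1, keepdims=True)     # each row: a distribution over heads

def structure_consistency(a1, a2, eps=1e-9):
    """Mean KL divergence between two induced structures (two views)."""
    return float(np.mean(np.sum(a1 * np.log((a1 + eps) / (a2 + eps)), axis=1)))

rng = np.random.default_rng(0)
h = rng.normal(size=(6, 32))                    # contextual token representations
view1 = induce_structure(h, rng.normal(size=(32, 16)), rng.normal(size=(32, 16)))
view2 = induce_structure(h, rng.normal(size=(32, 16)), rng.normal(size=(32, 16)))
print(structure_consistency(view1, view2))      # pushed toward zero during training
```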


2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Pengjun Zhai ◽  
Xin Huang ◽  
Beibei Zhang ◽  
Yu Fang

The Electronic Medical Record (EMR) contains a great deal of medical knowledge related to patients and has been widely used in the construction of medical knowledge graphs. Previous studies mainly focus on features based on the surface semantics of EMRs for relation extraction, such as contextual features, while the features of sentence structure in Chinese EMRs have been neglected. In this paper, a relation extraction method that fuses dependency parsing is proposed. Specifically, this paper extends the basic features with medical record features and indicator features that are applicable to Chinese EMRs. Furthermore, dependency syntactic features are introduced to analyse the dependency structure of sentences. The F1 value of relation extraction based on the extended features is 4.87% higher than that of relation extraction based on the basic features, and, compared with the former, the F1 value of relation extraction based on fused dependency parsing increases by a further 4.39%. The results of experiments performed on a Chinese EMR data set show that both the extended features and dependency parsing contribute to relation extraction.
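
A typical dependency-syntactic feature for relation extraction is the shortest dependency path between two entity mentions; the sketch below derives it from per-token head indices for a toy sentence. The example parse, token indices, and helper names are illustrative assumptions rather than the feature set used in the paper.

```python
def path_to_root(heads, idx):
    """Indices from a token up to the root (heads[i] is the head of token i; root = -1)."""
    path = [idx]
    while heads[idx] != -1:
        idx = heads[idx]
        path.append(idx)
    return path

def shortest_dependency_path(heads, i, j):
    """Shortest dependency path between two entity tokens via their lowest common ancestor."""
    pi, pj = path_to_root(heads, i), path_to_root(heads, j)
    common = set(pi) & set(pj)
    lca = min(common, key=lambda n: pi.index(n))
    return pi[:pi.index(lca) + 1] + list(reversed(pj[:pj.index(lca)]))

# Toy parse of a 4-token clinical sentence; token 1 is the root predicate.
heads = [1, -1, 3, 1]          # 0<-1 (root), 2<-3, 3<-1
print(shortest_dependency_path(heads, 0, 3))   # [0, 1, 3]
```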


Author(s):  
Mohammad Amimul Ihsan Aquil ◽  
Wan Hussain Wan Ishak

Plant diseases are a major cause of destruction and death of most plants, and especially trees. However, with the help of early detection, this issue can be solved and treated appropriately. A timely and accurate diagnosis is critical in maintaining the quality of crops. Recent innovations in the field of deep learning (DL), especially in convolutional neural networks (CNNs), have achieved great breakthroughs across different applications such as the classification of plant diseases. This study aims to evaluate CNNs trained from scratch and pre-trained CNNs in the classification of tomato plant diseases by comparing some of the state-of-the-art architectures, including the densely connected convolutional network (DenseNet) 120, residual network (ResNet) 101, ResNet 50, ResNet 30, ResNet 18, SqueezeNet, and VGGNet. The comparison was evaluated using a multiclass statistical analysis based on the F-score, specificity, sensitivity, precision, and accuracy. The dataset used for the experiments was drawn from 9 classes of tomato diseases and a healthy class from PlantVillage. The findings show that the pre-trained DenseNet-120 performed excellently, with 99.68% precision, a 99.84% F1 score, and 99.81% accuracy, which is higher than its from-scratch counterpart, showing the effectiveness of combining a CNN model with fine-tuning in classifying crop diseases.
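
For readers who want a starting point for the transfer-learning setup described above, here is a minimal PyTorch/torchvision sketch that loads an ImageNet-pretrained DenseNet-121 (used as a stand-in for the DenseNet variant in the study), freezes the convolutional backbone, and replaces the classifier head for the 10 tomato classes (9 diseases plus healthy). The hyperparameters and dummy batch are illustrative, and the weights enum requires a recent torchvision release.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained DenseNet-121 and replace its classification head.
model = models.densenet121(weights=models.DenseNet121_Weights.IMAGENET1K_V1)
for param in model.features.parameters():
    param.requires_grad = False              # freeze the convolutional backbone
model.classifier = nn.Linear(model.classifier.in_features, 10)

optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative fine-tuning step on a dummy batch of 224x224 RGB images.
images, labels = torch.randn(4, 3, 224, 224), torch.randint(0, 10, (4,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
```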


2021 ◽  
Vol 11 (4) ◽  
pp. 1480
Author(s):  
Haiyang Zhang ◽  
Guanqun Zhang ◽  
Ricardo Ma

Current state-of-the-art joint entity and relation extraction frameworks are based on span-level entity classification and relation identification between pairs of entity mentions. However, while maintaining an efficient exhaustive search over spans, they do not take the importance of syntactic features into consideration. This can lead to a relation between two entities being predicted because their entity types suggest one, even though the entities are not actually related in the sentence. In addition, although previous work has shown that extracting the local context is beneficial for the task, it still lacks in-depth learning of contextual features within that local context. In this paper, we propose to incorporate syntactic knowledge into multi-head self-attention by employing part of the heads to focus on the syntactic parents of each token, obtained from pruned dependency trees, and we use this to model the global context and fuse syntactic and semantic features. In addition, in order to obtain richer contextual features from the local context, we apply a local focus mechanism to entity pairs and their corresponding context. Combining the two strategies, we perform joint entity and relation extraction at the span level. Experimental results show that our model achieves significant improvements on both the CoNLL04 and SciERC datasets compared to strong competitors.
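
The syntax-guided head idea can be illustrated by masking the attention scores of one head so that each token attends only to itself and its syntactic parent, as in the NumPy sketch below. The projection matrices, the toy head indices, and the single-parent mask are simplifying assumptions (a pruned dependency tree would typically allow a somewhat larger neighbourhood).

```python
import numpy as np

def syntax_masked_attention(h, heads, w_q, w_k, w_v):
    """One attention head whose scores are masked so each token can only
    attend to itself and its syntactic parent from the dependency tree."""
    n = h.shape[0]
    mask = np.eye(n, dtype=bool)
    for i, parent in enumerate(heads):
        if parent >= 0:
            mask[i, parent] = True
    q, k, v = h @ w_q, h @ w_k, h @ w_v
    scores = q @ k.T / np.sqrt(q.shape[1])
    scores = np.where(mask, scores, -1e9)       # block non-syntactic positions
    a = np.exp(scores - scores.max(axis=1, keepdims=True))
    a /= a.sum(axis=1, keepdims=True)
    return a @ v

rng = np.random.default_rng(0)
h = rng.normal(size=(5, 32))                    # token representations
heads = [1, -1, 1, 4, 1]                        # dependency head of each token (-1 = root)
out = syntax_masked_attention(h, heads,
                              rng.normal(size=(32, 8)),
                              rng.normal(size=(32, 8)),
                              rng.normal(size=(32, 8)))
print(out.shape)                                # (5, 8)
```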


2019 ◽  
Vol 26 (1) ◽  
pp. 73-94
Author(s):  
Arda Tezcan ◽  
Véronique Hoste ◽  
Lieve Macken

Various studies show that statistical machine translation (SMT) systems suffer from fluency errors, especially in the form of grammatical errors and errors related to idiomatic word choices. In this study, we investigate the effectiveness of using monolingual information contained in the machine-translated text to estimate the word-level quality of SMT output. We propose a recurrent neural network architecture which uses morpho-syntactic features and word embeddings as word representations within surface and syntactic n-grams. We test the proposed method on two language pairs and for two tasks, namely detecting fluency errors and predicting overall post-editing effort. Our results show that this method is effective for capturing all types of fluency errors at once. Moreover, on the task of predicting post-editing effort, while relying solely on monolingual information, it achieves on-par results with state-of-the-art quality estimation systems which use both bilingual and monolingual information.
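
A minimal PyTorch sketch of the kind of word-level quality estimation tagger described above is given below: a bidirectional GRU reads each machine-translated word's embedding concatenated with a small morpho-syntactic feature vector and predicts an OK/BAD label per word. All sizes, the feature encoding, and the `WordLevelQE` class are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class WordLevelQE(nn.Module):
    """Tag each machine-translated word as OK/BAD from its embedding
    concatenated with a small morpho-syntactic feature vector."""
    def __init__(self, vocab_size, emb_dim=64, feat_dim=8, hidden=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim + feat_dim, hidden,
                          batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * hidden, 2)          # OK vs BAD

    def forward(self, word_ids, syn_feats):
        x = torch.cat([self.embed(word_ids), syn_feats], dim=-1)
        h, _ = self.rnn(x)
        return self.out(h)                           # (batch, seq_len, 2)

model = WordLevelQE(vocab_size=1000)
words = torch.randint(0, 1000, (2, 7))               # a batch of 2 MT sentences
feats = torch.rand(2, 7, 8)                          # e.g. POS/morphology indicators
print(model(words, feats).shape)                     # torch.Size([2, 7, 2])
```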

