Semantic Enhanced Distantly Supervised Relation Extraction via Graph Attention Network

Information ◽  
2020 ◽  
Vol 11 (11) ◽  
pp. 528
Author(s):  
Xiaoye Ouyang ◽  
Shudong Chen ◽  
Rong Wang

Distantly supervised relation extraction methods can automatically extract the relations between entity pairs, which is essential for the construction of a knowledge graph. However, the automatically constructed datasets contain large amounts of low-quality sentences and noisy words, and current distantly supervised methods ignore this noisy data, resulting in unacceptable accuracy. To mitigate this problem, we present SEGRE (Semantic Enhanced Graph attention networks Relation Extraction), a novel distantly supervised approach for improved relation extraction. Our model first uses word position and entity type information to provide abundant local features and background knowledge. It then builds dependency trees to remove noisy words that are irrelevant to relations and employs Graph Attention Networks (GATs) to encode syntactic information, which also captures the important semantic features of relational words in each instance. Furthermore, to make our model more robust against noisy words, an intra-bag attention module is used to weight the bag representation and mitigate noise in the bag. Through extensive experiments on the Riedel New York Times (NYT) and Google IISc Distantly Supervised (GIDS) datasets, we demonstrate SEGRE’s effectiveness.
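
The intra-bag attention step mentioned in the abstract can be illustrated with a minimal sketch. This is not the SEGRE implementation: the function name `intra_bag_attention`, the dot-product scoring, and the toy vectors are all assumptions chosen for illustration. It only shows the general idea of weighting the sentences in one bag so that sentences resembling a relation query contribute more to the bag representation.

```python
import numpy as np

def intra_bag_attention(sentence_vecs, relation_query):
    """Weight the sentences of one bag by similarity to a relation query.

    sentence_vecs: array of shape (n_sentences, dim), one encoding per sentence.
    relation_query: array of shape (dim,), a hypothetical relation embedding.
    Returns the weighted bag vector and the attention weights.
    """
    scores = sentence_vecs @ relation_query       # one score per sentence
    scores = scores - scores.max()                # stabilise the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    bag_vec = weights @ sentence_vecs             # weighted sum, shape (dim,)
    return bag_vec, weights

# Toy bag (assumed data): two sentences close to the query, one noisy sentence.
bag = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
query = np.array([2.0, 0.0])
bag_vec, weights = intra_bag_attention(bag, query)
```

In this toy bag the third sentence points away from the query, so it receives the smallest weight and contributes least to `bag_vec`.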

2020 ◽  
Vol 34 (05) ◽  
pp. 8528-8535
Author(s):  
Tapas Nayak ◽  
Hwee Tou Ng

A relation tuple consists of two entities and the relation between them, and often such tuples are found in unstructured text. There may be multiple relation tuples present in a text and they may share one or both entities among them. Extracting such relation tuples from a sentence is a difficult task and sharing of entities or overlapping entities among the tuples makes it more challenging. Most prior work adopted a pipeline approach where entities were identified first followed by finding the relations among them, thus missing the interaction among the relation tuples in a sentence. In this paper, we propose two approaches to use encoder-decoder architecture for jointly extracting entities and relations. In the first approach, we propose a representation scheme for relation tuples which enables the decoder to generate one word at a time like machine translation models and still finds all the tuples present in a sentence with full entity names of different length and with overlapping entities. Next, we propose a pointer network-based decoding approach where an entire tuple is generated at every time step. Experiments on the publicly available New York Times corpus show that our proposed approaches outperform previous work and achieve significantly higher F1 scores.
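
To make the first approach concrete, the sketch below shows one way relation tuples could be flattened into a single token sequence that a decoder emits one word at a time, and then parsed back. The separator tokens `;` and `|`, the function names, and the example tuples are assumptions for illustration, not details taken from the abstract.

```python
def linearize(tuples, field_sep=";", tuple_sep="|"):
    """Flatten (entity1, entity2, relation) tuples into one target string."""
    return f" {tuple_sep} ".join(f" {field_sep} ".join(t) for t in tuples)

def parse(sequence, field_sep=";", tuple_sep="|"):
    """Recover tuples from a decoded string, skipping malformed chunks."""
    tuples = []
    for chunk in sequence.split(tuple_sep):
        fields = [f.strip() for f in chunk.split(field_sep)]
        if len(fields) == 3 and all(fields):
            tuples.append(tuple(fields))
    return tuples

# Hypothetical tuples that share the first entity (overlapping entities).
gold = [
    ("Barack Obama", "Hawaii", "born_in"),
    ("Barack Obama", "United States", "president_of"),
]
target = linearize(gold)
recovered = parse(target)
```

Because entity names are written out in full between separators, multi-word entities and entities shared by several tuples need no special handling in this representation.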


2020 ◽  
Vol 10 (11) ◽  
pp. 3851
Author(s):  
Seongsik Park ◽  
Harksoo Kim

Relation extraction is a type of information extraction task that recognizes semantic relationships between entities in a sentence. Many previous studies have focused on extracting only one semantic relation between two entities in a single sentence. However, multiple entities in a sentence are associated through various relations. To address this issue, we proposed a relation extraction model based on a dual pointer network with a multi-head attention mechanism. The proposed model finds n-to-1 subject–object relations using a forward object decoder. Then, it finds 1-to-n subject–object relations using a backward subject decoder. Our experiments confirmed that the proposed model outperformed previous models, with an F1-score of 80.8% for the ACE (automatic content extraction) 2005 corpus and an F1-score of 78.3% for the NYT (New York Times) corpus.


2020 ◽  
Vol 76 (4) ◽  
pp. 67-75
Author(s):  
ELENA V. ILOVA ◽  
ELENA N. GALICHKINA ◽  
RUFINA ZH. IZMAILOVA ◽  
...  

The article describes the distinctive features of the film review as a speech genre, now one of the most popular genres of online mass media discourse. It demonstrates the genre's intertextual character and analyses its lexical and semantic features. The film reviews were taken from the following sources: sites for cinema-goers (www.imdb.com, www.empireonline.com, www.pluggedin.com); official sites of film critics (e.g. R. Ebert, https://www.rogerebert.com); official sites of newspapers (The New York Times, https://www.nytimes.com/reviews/movies; The Guardian, https://www.theguardian.com/film+tone/reviews); and other sites (https://www.imdb.com/search/keyword/?keywords=movie-review, https://www.pluggedin.com/movie-reviews/). The material for analysis comprises about 100 film reviews published in 2020, all openly accessible on the Internet, amounting to about 200 pages. The method used to achieve the main objective is interpretive analysis of film reviews. Analysis of the theoretical literature made it possible to specify the main directions of research on the genre and to study the key distinctive features of a film review. The relevance of the research is determined by the rising interest in the genres of online mass media discourse. The main objective is to identify and describe the lexical and semantic features of film reviews as a speech genre. The research demonstrated the interdiscursive and polydiscursive nature of the genre and systematized its lexical and semantic features. The analysis showed that the intertextuality of a film review is realized in the interaction of three types of discourse: that of the critic, that of the film and that of other people. The polydiscursive nature of a film review is expressed through a combination of features of the publicistic, literary and scientific styles. Another important characteristic is evaluativity, expressed through emotionally coloured vocabulary. Other lexical and semantic features include the use of non-specific terms, clichés, rhetorical questions with precedent names, intertextual insertions and various stylistic devices, among which epithets and metaphors are the most frequent. A film review was also observed to contain bookish vocabulary alongside stylistically low words and expressions.


2021 ◽  
Vol 11 (4) ◽  
pp. 1480
Author(s):  
Haiyang Zhang ◽  
Guanqun Zhang ◽  
Ricardo Ma

The current state-of-the-art framework for joint entity and relation extraction is based on span-level entity classification and relation identification between pairs of entity mentions. However, while it maintains an efficient exhaustive search over spans, it does not take the importance of syntactic features into consideration. As a result, two entities may be predicted as related on the basis of their entity types when in fact they are not related in the sentence. In addition, although previous work has shown that extracting local context is beneficial for the task, in-depth learning of contextual features in the local context is still lacking. In this paper, we propose to incorporate syntactic knowledge into multi-head self-attention by dedicating some of the heads to the syntactic parents of each token in pruned dependency trees, and we use this to model the global context and fuse syntactic and semantic features. In addition, to obtain richer contextual features from the local context, we apply a local focus mechanism to entity pairs and their corresponding context. Using these two strategies, we perform joint entity and relation extraction at the span level. Experimental results show that our model achieves significant improvements on both the CoNLL04 and SciERC datasets compared to strong competitors.
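
The idea of letting an attention head focus on syntactic parents can be sketched as an attention mask built from a dependency parse. The sketch below is a simplification under stated assumptions, not the authors' model: the function name `parent_attention` is hypothetical, the head is allowed to attend to each token's parent and to the token itself (an assumption made so the softmax is non-trivial), and the three-token parse is invented for illustration.

```python
import numpy as np

def parent_attention(scores, parents):
    """Softmax over attention scores restricted to each token and its parent.

    scores: array of shape (n, n), raw attention scores for one head.
    parents: list of length n; parents[i] is the index of token i's
             dependency parent, with the root pointing to itself.
    """
    n = len(parents)
    rows = np.arange(n)
    mask = np.full((n, n), -np.inf)
    mask[rows, rows] = 0.0                  # token may attend to itself
    mask[rows, parents] = 0.0               # token may attend to its parent
    masked = scores + mask
    masked = masked - masked.max(axis=1, keepdims=True)
    weights = np.exp(masked)                # exp(-inf) = 0 for blocked positions
    return weights / weights.sum(axis=1, keepdims=True)

# Hypothetical parse of "She founded Acme": the verb (index 1) is the root.
parents = [1, 1, 1]
scores = np.zeros((3, 3))
weights = parent_attention(scores, parents)
```

With uniform scores, the subject and object split their attention evenly between themselves and the verb, while the root attends only to itself.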


2021 ◽  
Vol 3 (1) ◽  
pp. 123-167
Author(s):  
Lars Hillebrand ◽  
David Biesner ◽  
Christian Bauckhage ◽  
Rafet Sifa

Unsupervised topic extraction is a vital step in automatically extracting concise content information from large text corpora. Existing topic extraction methods cannot capture the relations between these topics, which would further aid text understanding. We therefore propose using the Decomposition into Directional Components (DEDICOM) algorithm, which provides a uniquely interpretable matrix factorization for symmetric and asymmetric square matrices and tensors. We constrain DEDICOM to row-stochasticity and non-negativity in order to factorize pointwise mutual information matrices and tensors of text corpora. We identify latent topic clusters and their relations within the vocabulary and simultaneously learn interpretable word embeddings. Further, we introduce multiple methods based on alternating gradient descent to efficiently train constrained DEDICOM algorithms. We evaluate the qualitative topic modeling and word embedding performance of our proposed methods on several datasets, including a novel New York Times news dataset, and demonstrate how the DEDICOM algorithm provides deeper text analysis than competing matrix factorization approaches.
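
As a rough illustration of the quantities involved, the sketch below builds a pointwise mutual information (PMI) matrix from word co-occurrence counts and evaluates the standard DEDICOM reconstruction error for a square matrix S, which approximates S by A R Aᵀ. The function names, the smoothing constant `eps`, and the toy counts are assumptions; the row-stochasticity and non-negativity constraints and the alternating gradient descent training described in the abstract are not implemented here.

```python
import numpy as np

def pmi_matrix(cooc, eps=1e-12):
    """Pointwise mutual information from a square co-occurrence count matrix."""
    p_ij = cooc / cooc.sum()                    # joint probabilities
    p_i = p_ij.sum(axis=1, keepdims=True)       # row marginals
    p_j = p_ij.sum(axis=0, keepdims=True)       # column marginals
    return np.log((p_ij + eps) / (p_i * p_j + eps))

def dedicom_loss(S, A, R):
    """Squared Frobenius error of the DEDICOM reconstruction S ~ A R A^T."""
    return np.linalg.norm(S - A @ R @ A.T) ** 2

# Toy counts (assumed data) in which the two words co-occur independently,
# so every PMI entry should be close to zero.
cooc = np.outer([1.0, 2.0], [3.0, 4.0])
S = pmi_matrix(cooc)

# With A as the identity, R = S reproduces S exactly.
loss = dedicom_loss(S, np.eye(2), S)
```

In this factorization the rows of A assign words to latent topics, and R holds the (possibly asymmetric) relations between topics.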


2003 ◽  
Vol 15 (3) ◽  
pp. 98-105 ◽  
Author(s):  
Mark Galliker ◽  
Jan Herman
Keyword(s):  
New York ◽  

Abstract. Using the representation of men and women in The Times and The New York Times as an example, a content-analytic procedure is presented that is particularly suited to the study of electronically stored print media. Co-occurrence analysis is understood as the systematic examination of verbal combinations per counting unit. The problem of selecting the semantic units that are considered in the evaluation and presentation of the results is discussed.
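
A minimal sketch of co-occurrence counting per counting unit follows. It assumes that a counting unit is a single sentence, that tokens are split on whitespace, and that the analyst supplies the set of target words in advance; the function name and the example sentences are hypothetical and not taken from the study.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(units, targets):
    """Count how often pairs of target words appear in the same counting unit."""
    counts = Counter()
    for unit in units:
        # Each target word is counted at most once per unit.
        present = sorted({w for w in unit.lower().split() if w in targets})
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts

# Hypothetical counting units (one sentence each) and target words.
units = [
    "The man and the woman spoke",
    "A woman led the team",
    "The man met a woman",
]
counts = cooccurrence_counts(units, {"man", "woman", "team"})
```

The choice of `targets` corresponds to the selection problem raised in the abstract: which semantic units are included determines which combinations can appear in the results at all.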

