scholarly journals Relation Extraction Using Supervision from Topic Knowledge of Relation Labels

Author(s):  
Haiyun Jiang ◽  
Li Cui ◽  
Zhe Xu ◽  
Deqing Yang ◽  
Jindong Chen ◽  
...  

Explicitly exploring the semantics of a relation is significant for high-accuracy relation extraction, which is, however, not fully studied in previous work. In this paper, we mine the topic knowledge of a relation to explicitly represent the semantics of this relation, and model relation extraction as a matching problem. That is, the matching score between a sentence and a candidate relation is predicted for an entity pair. To this end, we propose a deep matching network to precisely model the semantic similarity between a sentence-relation pair. Besides, the topic knowledge also allows us to derive the importance information of samples as well as two knowledge-guided negative sampling strategies in the training process. We conduct extensive experiments to evaluate the proposed framework and observe improvements in AUC of 11.5% and max F1 of 5.4% over the baselines with state-of-the-art performance.

Author(s):  
Sen Su ◽  
Ningning Jia ◽  
Xiang Cheng ◽  
Shuguang Zhu ◽  
Ruiping Li

In this paper, we present an encoder-decoder model for distant supervised relation extraction. Given an entity pair and its sentence bag as input, in the encoder component, we employ the convolutional neural network to extract the features of the sentences in the sentence bag and merge them into a bag representation. In the decoder component, we utilize the long short-term memory network to model relation dependencies and predict the target relations in a sequential manner. In particular, to enable the sequential prediction of relations, we introduce a measure to quantify the amounts of information the relations take in their sentence bag, and use such information to determine the order of the relations of a sentence bag during model training. Moreover, we incorporate the attention mechanism into our model to dynamically adjust the bag representation to reduce the impact of sentences whose corresponding relations have been predicted. Extensive experiments on a popular dataset show that our model achieves significant improvement over state-of-the-art methods.


Nanomaterials ◽  
2021 ◽  
Vol 11 (3) ◽  
pp. 560
Author(s):  
Alexandra Carvalho ◽  
Mariana C. F. Costa ◽  
Valeria S. Marangoni ◽  
Pei Rou Ng ◽  
Thi Le Hang Nguyen ◽  
...  

We show that the degree of oxidation of graphene oxide (GO) can be obtained by using a combination of state-of-the-art ab initio computational modeling and X-ray photoemission spectroscopy (XPS). We show that the shift of the XPS C1s peak relative to pristine graphene, ΔEC1s, can be described with high accuracy by ΔEC1s=A(cO−cl)2+E0, where c0 is the oxygen concentration, A=52.3 eV, cl=0.122, and E0=1.22 eV. Our results demonstrate a precise determination of the oxygen content of GO samples.


Author(s):  
Jonas Austerjost ◽  
Robert Söldner ◽  
Christoffer Edlund ◽  
Johan Trygg ◽  
David Pollard ◽  
...  

Machine vision is a powerful technology that has become increasingly popular and accurate during the last decade due to rapid advances in the field of machine learning. The majority of machine vision applications are currently found in consumer electronics, automotive applications, and quality control, yet the potential for bioprocessing applications is tremendous. For instance, detecting and controlling foam emergence is important for all upstream bioprocesses, but the lack of robust foam sensing often leads to batch failures from foam-outs or overaddition of antifoam agents. Here, we report a new low-cost, flexible, and reliable foam sensor concept for bioreactor applications. The concept applies convolutional neural networks (CNNs), a state-of-the-art machine learning system for image processing. The implemented method shows high accuracy for both binary foam detection (foam/no foam) and fine-grained classification of foam levels.


2021 ◽  
Vol 54 (1) ◽  
pp. 1-39
Author(s):  
Zara Nasar ◽  
Syed Waqar Jaffry ◽  
Muhammad Kamran Malik

With the advent of Web 2.0, there exist many online platforms that result in massive textual-data production. With ever-increasing textual data at hand, it is of immense importance to extract information nuggets from this data. One approach towards effective harnessing of this unstructured textual data could be its transformation into structured text. Hence, this study aims to present an overview of approaches that can be applied to extract key insights from textual data in a structured way. For this, Named Entity Recognition and Relation Extraction are being majorly addressed in this review study. The former deals with identification of named entities, and the latter deals with problem of extracting relation between set of entities. This study covers early approaches as well as the developments made up till now using machine learning models. Survey findings conclude that deep-learning-based hybrid and joint models are currently governing the state-of-the-art. It is also observed that annotated benchmark datasets for various textual-data generators such as Twitter and other social forums are not available. This scarcity of dataset has resulted into relatively less progress in these domains. Additionally, the majority of the state-of-the-art techniques are offline and computationally expensive. Last, with increasing focus on deep-learning frameworks, there is need to understand and explain the under-going processes in deep architectures.


Database ◽  
2021 ◽  
Vol 2021 ◽  
Author(s):  
Yifan Shao ◽  
Haoru Li ◽  
Jinghang Gu ◽  
Longhua Qian ◽  
Guodong Zhou

Abstract Extraction of causal relations between biomedical entities in the form of Biological Expression Language (BEL) poses a new challenge to the community of biomedical text mining due to the complexity of BEL statements. We propose a simplified form of BEL statements [Simplified Biological Expression Language (SBEL)] to facilitate BEL extraction and employ BERT (Bidirectional Encoder Representation from Transformers) to improve the performance of causal relation extraction (RE). On the one hand, BEL statement extraction is transformed into the extraction of an intermediate form—SBEL statement, which is then further decomposed into two subtasks: entity RE and entity function detection. On the other hand, we use a powerful pretrained BERT model to both extract entity relations and detect entity functions, aiming to improve the performance of two subtasks. Entity relations and functions are then combined into SBEL statements and finally merged into BEL statements. Experimental results on the BioCreative-V Track 4 corpus demonstrate that our method achieves the state-of-the-art performance in BEL statement extraction with F1 scores of 54.8% in Stage 2 evaluation and of 30.1% in Stage 1 evaluation, respectively. Database URL: https://github.com/grapeff/SBEL_datasets


Author(s):  
Siva Reddy ◽  
Mirella Lapata ◽  
Mark Steedman

In this paper we introduce a novel semantic parsing approach to query Freebase in natural language without requiring manual annotations or question-answer pairs. Our key insight is to represent natural language via semantic graphs whose topology shares many commonalities with Freebase. Given this representation, we conceptualize semantic parsing as a graph matching problem. Our model converts sentences to semantic graphs using CCG and subsequently grounds them to Freebase guided by denotations as a form of weak supervision. Evaluation experiments on a subset of the Free917 and WebQuestions benchmark datasets show our semantic parser improves over the state of the art.


2021 ◽  
Vol 12 (5) ◽  
pp. 1-21
Author(s):  
Changsen Yuan ◽  
Heyan Huang ◽  
Chong Feng

The Graph Convolutional Network (GCN) is a universal relation extraction method that can predict relations of entity pairs by capturing sentences’ syntactic features. However, existing GCN methods often use dependency parsing to generate graph matrices and learn syntactic features. The quality of the dependency parsing will directly affect the accuracy of the graph matrix and change the whole GCN’s performance. Because of the influence of noisy words and sentence length in the distant supervised dataset, using dependency parsing on sentences causes errors and leads to unreliable information. Therefore, it is difficult to obtain credible graph matrices and relational features for some special sentences. In this article, we present a Multi-Graph Cooperative Learning model (MGCL), which focuses on extracting the reliable syntactic features of relations by different graphs and harnessing them to improve the representations of sentences. We conduct experiments on a widely used real-world dataset, and the experimental results show that our model achieves the state-of-the-art performance of relation extraction.


2021 ◽  
Author(s):  
ChunMing Yang

BACKGROUND Extracting relations between the entities from Chinese electronic medical records(EMRs) is the key to automatically constructing medical knowledge graphs. Due to the less available labeled corpus, most of the current researches are based on shallow networks, which cannot fully capture the complex semantic features in the text of Chinese EMRs. OBJECTIVE In this study, a hybrid deep learning method based on semi-supervised learning is proposed to extract the entity relations from small-scale complex Chinese EMRs. METHODS The semantic features of sentences are extracted by residual network (ResNet) and the long dependent information is captured by bidirectional GRU (Gated Recurrent Unit). Then the attention mechanism is used to assign weights to the extracted features respectively, and the output of the two attention mechanisms is integrated for relation prediction. We adjusted the training process with manually annotated small-scale relational corpus and bootstrapping semi-supervised learning algorithm, and continuously expanded the datasets during the training process. RESULTS The experimental results show that the best F1-score of the proposed method on the overall relation categories reaches 89.78%, which is 13.07% higher than the baseline CNN model. The F1-score on DAP, SAP, SNAP, TeRD, TeAP, TeCP, TeRS, TeAS, TrAD, TrRD and TrAP 11 relation categories reaches 80.95%, 93.91%, 92.96%, 88.43%, 86.54%, 85.58%, 87.96%, 94.74%, 93.01%, 87.58% and 95.48%, respectively. CONCLUSIONS The hybrid neural network method strengthens the feature transfer and reuse between different network layers and reduces the cost of manual tagging relations. The results demonstrate that our proposed method is effective for the relation extraction in Chinese EMRs.


2020 ◽  
Vol 14 (4) ◽  
pp. 653-667
Author(s):  
Laxman Dhulipala ◽  
Changwan Hong ◽  
Julian Shun

Connected components is a fundamental kernel in graph applications. The fastest existing multicore algorithms for solving graph connectivity are based on some form of edge sampling and/or linking and compressing trees. However, many combinations of these design choices have been left unexplored. In this paper, we design the ConnectIt framework, which provides different sampling strategies as well as various tree linking and compression schemes. ConnectIt enables us to obtain several hundred new variants of connectivity algorithms, most of which extend to computing spanning forest. In addition to static graphs, we also extend ConnectIt to support mixes of insertions and connectivity queries in the concurrent setting. We present an experimental evaluation of ConnectIt on a 72-core machine, which we believe is the most comprehensive evaluation of parallel connectivity algorithms to date. Compared to a collection of state-of-the-art static multicore algorithms, we obtain an average speedup of 12.4x (2.36x average speedup over the fastest existing implementation for each graph). Using ConnectIt, we are able to compute connectivity on the largest publicly-available graph (with over 3.5 billion vertices and 128 billion edges) in under 10 seconds using a 72-core machine, providing a 3.1x speedup over the fastest existing connectivity result for this graph, in any computational setting. For our incremental algorithms, we show that our algorithms can ingest graph updates at up to several billion edges per second. To guide the user in selecting the best variants in ConnectIt for different situations, we provide a detailed analysis of the different strategies. Finally, we show how the techniques in ConnectIt can be used to speed up two important graph applications: approximate minimum spanning forest and SCAN clustering.


Sign in / Sign up

Export Citation Format

Share Document