Relation Extraction Using Supervision from Topic Knowledge of Relation Labels

Explicitly modeling the semantics of a relation is important for high-accuracy relation extraction, yet previous work has not fully studied it. In this paper, we mine the topic knowledge of a relation to explicitly represent its semantics, and model relation extraction as a matching problem. That is, for an entity pair, we predict the matching score between a sentence and a candidate relation. To this end, we propose a deep matching network to precisely model the semantic similarity of a sentence-relation pair. In addition, the topic knowledge allows us to derive importance information for samples, as well as two knowledge-guided negative sampling strategies for the training process. We conduct extensive experiments to evaluate the proposed framework and observe improvements of 11.5% in AUC and 5.4% in max F1 over state-of-the-art baselines.

Exploring Encoder-Decoder Model for Distant Supervised Relation Extraction

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/610 ◽

2018 ◽

Author(s):

Sen Su ◽

Ningning Jia ◽

Xiang Cheng ◽

Shuguang Zhu ◽

Ruiping Li

Keyword(s):

Short Term Memory ◽

State Of The Art ◽

Relation Extraction ◽

Short Term ◽

Sequential Prediction ◽

Memory Network ◽

Long Short Term Memory ◽

Model Training ◽

The Impact ◽

Model Relation

In this paper, we present an encoder-decoder model for distant supervised relation extraction. Given an entity pair and its sentence bag as input, in the encoder component, we employ the convolutional neural network to extract the features of the sentences in the sentence bag and merge them into a bag representation. In the decoder component, we utilize the long short-term memory network to model relation dependencies and predict the target relations in a sequential manner. In particular, to enable the sequential prediction of relations, we introduce a measure to quantify the amounts of information the relations take in their sentence bag, and use such information to determine the order of the relations of a sentence bag during model training. Moreover, we incorporate the attention mechanism into our model to dynamically adjust the bag representation to reduce the impact of sentences whose corresponding relations have been predicted. Extensive experiments on a popular dataset show that our model achieves significant improvement over state-of-the-art methods.
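The attention step described in this abstract (re-weighting the sentences of a bag to form a bag representation) can be illustrated with a minimal sketch. Everything in it is an assumption for illustration: the abstract does not state the scoring function, so plain dot-product scoring is swapped in, and the names `softmax`, `bag_representation`, `sentence_vecs`, and `query` are hypothetical. The sentence vectors stand in for CNN features, and `query` stands in for whatever state the decoder uses to attend.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bag_representation(
    sentence_vecs: Sequence[Sequence[float]],
    query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Merge sentence vectors into one bag vector with attention weights.

    Assumption: dot-product scoring between each sentence vector and the
    query. The paper's actual scoring function is not given in the abstract.
    """
    # One relevance score per sentence in the bag.
    scores = [sum(a * b for a, b in zip(vec, query)) for vec in sentence_vecs]
    weights = softmax(scores)
    dim = len(query)
    # Weighted sum of the sentence vectors, computed per dimension.
    bag = [
        sum(w * vec[i] for w, vec in zip(weights, sentence_vecs))
        for i in range(dim)
    ]
    return bag, weights
```

With a zero query every sentence receives the same weight, so the bag vector is the plain average of the sentence vectors. A query aligned with one sentence shifts the weight toward that sentence.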

The Degree of Oxidation of Graphene Oxide

Nanomaterials ◽

10.3390/nano11030560 ◽

2021 ◽

Vol 11 (3) ◽

pp. 560

Author(s):

Alexandra Carvalho ◽

Mariana C. F. Costa ◽

Valeria S. Marangoni ◽

Pei Rou Ng ◽

Thi Le Hang Nguyen ◽

...

Keyword(s):

Graphene Oxide ◽

Ab Initio ◽

State Of The Art ◽

High Accuracy ◽

Precise Determination ◽

Photoemission Spectroscopy ◽

Pristine Graphene ◽

X Ray ◽

Degree Of Oxidation

We show that the degree of oxidation of graphene oxide (GO) can be obtained by using a combination of state-of-the-art ab initio computational modeling and X-ray photoemission spectroscopy (XPS). We show that the shift of the XPS C1s peak relative to pristine graphene, $\Delta E_{\mathrm{C1s}}$, can be described with high accuracy by $\Delta E_{\mathrm{C1s}} = A(c_O - c_l)^2 + E_0$, where $c_O$ is the oxygen concentration, $A = 52.3$ eV, $c_l = 0.122$, and $E_0 = 1.22$ eV. Our results demonstrate a precise determination of the oxygen content of GO samples.
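As a worked illustration of the quadratic relation quoted in this abstract (shift = A * (c_O - c_l)^2 + E0), the sketch below evaluates the shift for a given oxygen concentration and inverts it for a measured shift. The constants are the ones stated in the abstract; the function names are hypothetical. The abstract does not say over which range of c_O the fit is valid or which root of the quadratic is physically relevant, so the inverse returns both roots and leaves the choice to the caller.

```python
import math
from typing import Tuple

# Fit parameters as quoted in the abstract.
A_EV = 52.3    # curvature, in eV
C_L = 0.122    # concentration offset (dimensionless)
E0_EV = 1.22   # minimum shift, in eV


def c1s_shift(c_o: float) -> float:
    """C1s peak shift (eV) relative to pristine graphene for concentration c_o."""
    return A_EV * (c_o - C_L) ** 2 + E0_EV


def oxygen_concentration(shift_ev: float) -> Tuple[float, float]:
    """Invert the quadratic; return (lower root, upper root) for a given shift.

    Raises ValueError when the shift lies below the minimum of the parabola,
    where the relation has no real solution.
    """
    if shift_ev < E0_EV:
        raise ValueError("shift is below E0; the quadratic has no real root")
    delta = math.sqrt((shift_ev - E0_EV) / A_EV)
    return C_L - delta, C_L + delta
```

For example, `oxygen_concentration(c1s_shift(0.3))` recovers 0.3 as its upper root, up to floating-point error.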

A Machine Vision Approach for Bioreactor Foam Sensing

SLAS TECHNOLOGY Translating Life Sciences Innovation ◽

10.1177/24726303211008861 ◽

2021 ◽

pp. 247263032110088

Author(s):

Jonas Austerjost ◽

Robert Söldner ◽

Christoffer Edlund ◽

Johan Trygg ◽

David Pollard ◽

...

Keyword(s):

Machine Learning ◽

Machine Vision ◽

State Of The Art ◽

Low Cost ◽

High Accuracy ◽

Consumer Electronics ◽

Learning System ◽

Automotive Applications ◽

Fine Grained

Machine vision is a powerful technology that has become increasingly popular and accurate during the last decade due to rapid advances in the field of machine learning. The majority of machine vision applications are currently found in consumer electronics, automotive applications, and quality control, yet the potential for bioprocessing applications is tremendous. For instance, detecting and controlling foam emergence is important for all upstream bioprocesses, but the lack of robust foam sensing often leads to batch failures from foam-outs or overaddition of antifoam agents. Here, we report a new low-cost, flexible, and reliable foam sensor concept for bioreactor applications. The concept applies convolutional neural networks (CNNs), a state-of-the-art machine learning system for image processing. The implemented method shows high accuracy for both binary foam detection (foam/no foam) and fine-grained classification of foam levels.

Named Entity Recognition and Relation Extraction

ACM Computing Surveys ◽

10.1145/3445965 ◽

2021 ◽

Vol 54 (1) ◽

pp. 1-39

Author(s):

Zara Nasar ◽

Syed Waqar Jaffry ◽

Muhammad Kamran Malik

Keyword(s):

Deep Learning ◽

State Of The Art ◽

Named Entity Recognition ◽

Relation Extraction ◽

The State ◽

Entity Recognition ◽

Joint Models ◽

Named Entity ◽

Textual Data ◽

Benchmark Datasets

With the advent of Web 2.0, many online platforms now produce massive amounts of textual data. With ever-increasing textual data at hand, it is of immense importance to extract information nuggets from this data. One approach to harnessing this unstructured textual data effectively is to transform it into structured text. Hence, this study presents an overview of approaches that can be applied to extract key insights from textual data in a structured way. To this end, the review focuses mainly on Named Entity Recognition and Relation Extraction. The former deals with the identification of named entities, and the latter deals with the problem of extracting relations between sets of entities. This study covers early approaches as well as the developments made to date using machine learning models. The survey findings indicate that deep-learning-based hybrid and joint models currently dominate the state of the art. It is also observed that annotated benchmark datasets for various textual-data generators, such as Twitter and other social forums, are not available. This scarcity of datasets has resulted in relatively little progress in these domains. Additionally, the majority of state-of-the-art techniques are offline and computationally expensive. Last, with the increasing focus on deep-learning frameworks, there is a need to understand and explain the underlying processes in deep architectures.

Extraction of causal relations based on SBEL and BERT model

Database ◽

10.1093/database/baab005 ◽

2021 ◽

Vol 2021 ◽

Author(s):

Yifan Shao ◽

Haoru Li ◽

Jinghang Gu ◽

Longhua Qian ◽

Guodong Zhou

Keyword(s):

State Of The Art ◽

Causal Relation ◽

Relation Extraction ◽

The Other ◽

Biomedical Text ◽

Intermediate Form ◽

Biomedical Text Mining ◽

Causal Relations ◽

The One ◽

Stage 1

Extraction of causal relations between biomedical entities in the form of Biological Expression Language (BEL) poses a new challenge to the community of biomedical text mining due to the complexity of BEL statements. We propose a simplified form of BEL statements [Simplified Biological Expression Language (SBEL)] to facilitate BEL extraction and employ BERT (Bidirectional Encoder Representation from Transformers) to improve the performance of causal relation extraction (RE). On the one hand, BEL statement extraction is transformed into the extraction of an intermediate form, the SBEL statement, which is then further decomposed into two subtasks: entity RE and entity function detection. On the other hand, we use a powerful pretrained BERT model both to extract entity relations and to detect entity functions, aiming to improve the performance of both subtasks. Entity relations and functions are then combined into SBEL statements and finally merged into BEL statements. Experimental results on the BioCreative-V Track 4 corpus demonstrate that our method achieves state-of-the-art performance in BEL statement extraction, with F1 scores of 54.8% in Stage 2 evaluation and 30.1% in Stage 1 evaluation. Database URL: https://github.com/grapeff/SBEL_datasets

Large-scale Semantic Parsing without Question-Answer Pairs

Transactions of the Association for Computational Linguistics ◽

10.1162/tacl_a_00190 ◽

2014 ◽

Vol 2 ◽

pp. 377-392 ◽

Cited By ~ 40

Author(s):

Siva Reddy ◽

Mirella Lapata ◽

Mark Steedman

Keyword(s):

Natural Language ◽

Large Scale ◽

Graph Matching ◽

State Of The Art ◽

The State ◽

Semantic Parsing ◽

Matching Problem ◽

Weak Supervision ◽

Benchmark Datasets

In this paper we introduce a novel semantic parsing approach to query Freebase in natural language without requiring manual annotations or question-answer pairs. Our key insight is to represent natural language via semantic graphs whose topology shares many commonalities with Freebase. Given this representation, we conceptualize semantic parsing as a graph matching problem. Our model converts sentences to semantic graphs using CCG and subsequently grounds them to Freebase guided by denotations as a form of weak supervision. Evaluation experiments on a subset of the Free917 and WebQuestions benchmark datasets show our semantic parser improves over the state of the art.

Multi-Graph Cooperative Learning Towards Distant Supervised Relation Extraction

ACM Transactions on Intelligent Systems and Technology ◽

10.1145/3466560 ◽

2021 ◽

Vol 12 (5) ◽

pp. 1-21

Author(s):

Changsen Yuan ◽

Heyan Huang ◽

Chong Feng

Keyword(s):

Cooperative Learning ◽

State Of The Art ◽

Relation Extraction ◽

Sentence Length ◽

Universal Relation ◽

Dependency Parsing ◽

Convolutional Network ◽

Syntactic Features ◽

Use Dependency

The Graph Convolutional Network (GCN) is a universal relation extraction method that can predict relations of entity pairs by capturing sentences’ syntactic features. However, existing GCN methods often use dependency parsing to generate graph matrices and learn syntactic features. The quality of the dependency parsing directly affects the accuracy of the graph matrix and therefore the performance of the whole GCN. Because of noisy words and long sentences in distant supervised datasets, applying dependency parsing to sentences causes errors and yields unreliable information. Therefore, it is difficult to obtain credible graph matrices and relational features for some special sentences. In this article, we present a Multi-Graph Cooperative Learning model (MGCL), which focuses on extracting reliable syntactic features of relations from different graphs and harnessing them to improve the representations of sentences. We conduct experiments on a widely used real-world dataset, and the experimental results show that our model achieves state-of-the-art relation extraction performance.

A Hybrid Method Based on Semi-Supervised Learning for Relation Extraction in Chinese EMRs (Preprint)

10.2196/preprints.28220 ◽

2021 ◽

Author(s):

ChunMing Yang

Keyword(s):

Supervised Learning ◽

Learning Algorithm ◽

Medical Knowledge ◽

Relation Extraction ◽

Small Scale ◽

Semantic Features ◽

Training Process ◽

Network Layers ◽

Relation Prediction ◽

The Cost

BACKGROUND: Extracting relations between entities from Chinese electronic medical records (EMRs) is the key to automatically constructing medical knowledge graphs. Because little labeled corpus data is available, most current research is based on shallow networks, which cannot fully capture the complex semantic features in the text of Chinese EMRs. OBJECTIVE: In this study, a hybrid deep learning method based on semi-supervised learning is proposed to extract entity relations from small-scale complex Chinese EMRs. METHODS: The semantic features of sentences are extracted by a residual network (ResNet), and long-range dependency information is captured by a bidirectional GRU (Gated Recurrent Unit). An attention mechanism is then applied to each set of extracted features to assign weights, and the outputs of the two attention mechanisms are integrated for relation prediction. We adjusted the training process with a manually annotated small-scale relational corpus and a bootstrapping semi-supervised learning algorithm, and continuously expanded the datasets during training. RESULTS: The experimental results show that the best F1-score of the proposed method over all relation categories reaches 89.78%, which is 13.07% higher than the baseline CNN model. The F1-scores on the 11 relation categories DAP, SAP, SNAP, TeRD, TeAP, TeCP, TeRS, TeAS, TrAD, TrRD, and TrAP reach 80.95%, 93.91%, 92.96%, 88.43%, 86.54%, 85.58%, 87.96%, 94.74%, 93.01%, 87.58%, and 95.48%, respectively. CONCLUSIONS: The hybrid neural network method strengthens feature transfer and reuse between different network layers and reduces the cost of manually tagging relations. The results demonstrate that the proposed method is effective for relation extraction in Chinese EMRs.

Semantic Similarity Measurement Methods: The State-of-the-art

Research Journal of Applied Sciences Engineering and Technology ◽

10.19026/rjaset.8.1183 ◽

2014 ◽

Vol 8 (18) ◽

pp. 1923-1932 ◽

Cited By ~ 1

Author(s):

Fatmah Nazar Mahmood ◽

Amirah Ismail

Keyword(s):

Semantic Similarity ◽

State Of The Art ◽

Measurement Methods ◽

The State ◽

Similarity Measurement ◽

Semantic Similarity Measurement

ConnectIt

Proceedings of the VLDB Endowment ◽

10.14778/3436905.3436923 ◽

2020 ◽

Vol 14 (4) ◽

pp. 653-667

Author(s):

Laxman Dhulipala ◽

Changwan Hong ◽

Julian Shun

Keyword(s):

Experimental Evaluation ◽

Comprehensive Evaluation ◽

State Of The Art ◽

Graph Connectivity ◽

Connected Components ◽

Sampling Strategies ◽

Spanning Forest ◽

Speed Up ◽

Minimum Spanning Forest ◽

Edge Sampling

Connected components is a fundamental kernel in graph applications. The fastest existing multicore algorithms for solving graph connectivity are based on some form of edge sampling and/or linking and compressing trees. However, many combinations of these design choices have been left unexplored. In this paper, we design the ConnectIt framework, which provides different sampling strategies as well as various tree linking and compression schemes. ConnectIt enables us to obtain several hundred new variants of connectivity algorithms, most of which extend to computing spanning forest. In addition to static graphs, we also extend ConnectIt to support mixes of insertions and connectivity queries in the concurrent setting. We present an experimental evaluation of ConnectIt on a 72-core machine, which we believe is the most comprehensive evaluation of parallel connectivity algorithms to date. Compared to a collection of state-of-the-art static multicore algorithms, we obtain an average speedup of 12.4x (2.36x average speedup over the fastest existing implementation for each graph). Using ConnectIt, we are able to compute connectivity on the largest publicly-available graph (with over 3.5 billion vertices and 128 billion edges) in under 10 seconds using a 72-core machine, providing a 3.1x speedup over the fastest existing connectivity result for this graph, in any computational setting. For our incremental algorithms, we show that our algorithms can ingest graph updates at up to several billion edges per second. To guide the user in selecting the best variants in ConnectIt for different situations, we provide a detailed analysis of the different strategies. Finally, we show how the techniques in ConnectIt can be used to speed up two important graph applications: approximate minimum spanning forest and SCAN clustering.
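The "linking and compressing trees" idea mentioned in this abstract can be illustrated with a small sequential union-find sketch. This is not ConnectIt itself: it is single-threaded, uses no edge sampling, and picks one particular linking rule (link the larger root under the smaller) and one compression rule (path halving) purely as assumptions for illustration. The function name `connected_components` is hypothetical.

```python
from typing import Iterable, List, Tuple


def connected_components(
    num_vertices: int, edges: Iterable[Tuple[int, int]]
) -> List[int]:
    """Label each vertex with the smallest vertex id in its component."""
    parent = list(range(num_vertices))  # every vertex starts as its own root

    def find(v: int) -> int:
        # Path halving: point each visited vertex at its grandparent,
        # which shortens the tree while walking to the root.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, w in edges:
        ru, rw = find(u), find(w)
        if ru == rw:
            continue  # already in the same tree
        # Link the root with the larger id under the root with the smaller id,
        # so the final root of a tree is the minimum vertex id it contains.
        if ru < rw:
            parent[rw] = ru
        else:
            parent[ru] = rw

    return [find(v) for v in range(num_vertices)]
```

For example, with six vertices and the edges (0, 1), (1, 2), and (3, 4), the vertices 0 to 2 share one label, 3 and 4 share another, and vertex 5 stays in its own component.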
