A Co-Embedding Model with Variational Auto-Encoder for Knowledge Graphs

2022 ◽  
Vol 12 (2) ◽  
pp. 715
Author(s):  
Luodi Xie ◽  
Huimin Huang ◽  
Qing Du

Knowledge graph (KG) embedding has been widely studied to obtain low-dimensional representations for entities and relations. It serves as the basis for downstream tasks such as KG completion and relation extraction. Traditional KG embedding techniques usually represent entities/relations as vectors or tensors, mapping them into different semantic spaces and ignoring their uncertainties. The affinities between entities and relations are ambiguous when they are not embedded in the same latent space. In this paper, we introduce a co-embedding model for KG embedding, which learns low-dimensional representations of both entities and relations in the same semantic space. To address the neglect of uncertainty in KG components, we propose a variational auto-encoder that represents KG components as Gaussian distributions. In addition, compared with previous methods, our method has the advantages of high quality and interpretability. Our experimental results on several benchmark datasets demonstrate our model's superiority over the state-of-the-art baselines.
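A minimal sketch of the kind of Gaussian co-embedding the abstract describes, assuming a PyTorch setup with diagonal Gaussians in a shared space and a translational decoder; the class name, dimensions, and scoring function are illustrative and not the authors' implementation:

```python
# Sketch (not the authors' implementation): entities and relations are represented
# as diagonal Gaussians in one shared latent space, sampled with the usual VAE
# reparameterization trick. The full decoder and KL terms are omitted.
import torch
import torch.nn as nn

class GaussianCoEmbedding(nn.Module):
    def __init__(self, n_entities, n_relations, dim):
        super().__init__()
        # Mean and log-variance for every entity and relation (same space).
        self.ent_mu     = nn.Embedding(n_entities, dim)
        self.ent_logvar = nn.Embedding(n_entities, dim)
        self.rel_mu     = nn.Embedding(n_relations, dim)
        self.rel_logvar = nn.Embedding(n_relations, dim)

    @staticmethod
    def reparameterize(mu, logvar):
        # z = mu + sigma * eps, eps ~ N(0, I)
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def forward(self, heads, relations, tails):
        h = self.reparameterize(self.ent_mu(heads), self.ent_logvar(heads))
        r = self.reparameterize(self.rel_mu(relations), self.rel_logvar(relations))
        t = self.reparameterize(self.ent_mu(tails), self.ent_logvar(tails))
        # A simple translational plausibility score; the paper's decoder may differ.
        return -torch.norm(h + r - t, p=2, dim=-1)
```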

Author(s):  
Wanhua Cao ◽  
Yi Zhang ◽  
Juntao Liu ◽  
Ziyun Rao

Knowledge graph embedding improves the performance of relation extraction and knowledge reasoning by encoding entities and relations in a low-dimensional semantic space. During training, negative samples are usually constructed by replacing the head or tail entity, and different replacement strategies lead to different prediction accuracy. This paper develops a negative-triplet construction framework based on the frequency of the entities associated with each relation. The framework takes the numbers of relations and entities in the dataset into account to assign the proportion of relation versus entity replacement, and uses the frequency of the entities associated with each relation to set reasonable replacement proportions for different relations. To verify its validity, the proposed framework is integrated into state-of-the-art knowledge graph embedding models such as TransE, TransH, DistMult, ComplEx, and Analogy, and both relation prediction and entity prediction criteria are used to evaluate link prediction performance more comprehensively. Experimental results on two commonly used datasets, WN18 and FB15K, show that the proposed method improves entity link and triplet classification accuracy, and especially the accuracy of relation link prediction.
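For illustration, here is a frequency-aware negative-sampling sketch in the spirit of the description above (related to the Bernoulli corruption heuristic of TransH); the exact proportions used by the proposed framework may differ, and the helper names are hypothetical:

```python
# Frequency-aware head/tail corruption: relations with many tails per head are
# corrupted on the head side more often, and vice versa.
import random
from collections import defaultdict

def relation_stats(triples):
    """Average tails-per-head (tph) and heads-per-tail (hpt) for each relation."""
    tails, heads = defaultdict(set), defaultdict(set)
    for h, r, t in triples:
        tails[(r, h)].add(t)
        heads[(r, t)].add(h)
    tph_counts, hpt_counts = defaultdict(list), defaultdict(list)
    for (r, _), ts in tails.items():
        tph_counts[r].append(len(ts))
    for (r, _), hs in heads.items():
        hpt_counts[r].append(len(hs))
    tph = {r: sum(v) / len(v) for r, v in tph_counts.items()}
    hpt = {r: sum(v) / len(v) for r, v in hpt_counts.items()}
    return tph, hpt

def corrupt(triple, entities, tph, hpt):
    """Replace the head or tail with a probability tied to the relation's statistics."""
    h, r, t = triple
    p_head = tph[r] / (tph[r] + hpt[r])
    if random.random() < p_head:
        return (random.choice(entities), r, t)
    return (h, r, random.choice(entities))
```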


Electronics ◽  
2021 ◽  
Vol 10 (12) ◽  
pp. 1407
Author(s):  
Peng Wang ◽  
Jing Zhou ◽  
Yuzhang Liu ◽  
Xingchen Zhou

Knowledge graph embedding aims to embed entities and relations into low-dimensional vector spaces. Most existing methods focus only on the triple facts in knowledge graphs. In addition, models based on translation or distance measurement cannot fully represent complex relations. As well-constructed prior knowledge, entity types can be employed to learn the representations of entities and relations. In this paper, we propose a novel knowledge graph embedding model named TransET, which takes advantage of entity types to learn more semantic features. More specifically, circular convolution of the entity and entity-type embeddings is used to map the head and tail entities to type-specific representations, and a translation-based score function is then used to learn the representations of triples. We evaluated our model on real-world datasets with two benchmark tasks, link prediction and triple classification. Experimental results demonstrate that it outperforms state-of-the-art models in most cases.
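A short sketch of the type-aware scoring idea described above, assuming circular convolution computed via the FFT and a TransE-style translational score; this is an illustration under those assumptions, not the released TransET code:

```python
# Combine entity and type embeddings by circular convolution, then score the
# projected triple with a translation-based distance (higher = more plausible).
import numpy as np

def circular_convolution(a, b):
    # Circular convolution computed in the frequency domain.
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def score(h, h_type, r, t, t_type):
    h_proj = circular_convolution(h, h_type)   # type-specific head representation
    t_proj = circular_convolution(t, t_type)   # type-specific tail representation
    return -np.linalg.norm(h_proj + r - t_proj)
```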


Author(s):  
Anastasia Dimou

In this chapter, an overview of the state of the art on knowledge graph generation is provided, with a focus on the two prevalent mapping languages: the W3C-recommended R2RML and its generalisation RML. We look in detail at their differences and explain how knowledge graphs, in the form of RDF graphs, can be generated with each of the two mapping languages. We then assess whether the vocabulary terms were properly applied to the data and whether any violations occurred in their use, whether R2RML or RML is used to generate the desired knowledge graph.
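As a conceptual illustration of what an R2RML/RML mapping engine produces, the following Python sketch (using rdflib) turns rows of tabular source data into RDF triples; the table, namespace, and vocabulary terms are invented for the example and do not come from the chapter:

```python
# Rows of source data become RDF triples, mirroring what a subject map and
# predicate-object maps in an R2RML/RML mapping would generate.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF

EX = Namespace("http://example.org/")
rows = [{"id": "1", "name": "Alice"}, {"id": "2", "name": "Bob"}]

g = Graph()
for row in rows:
    subject = URIRef(EX["person/" + row["id"]])      # subject map: template on 'id'
    g.add((subject, RDF.type, EX.Person))            # class assigned by the mapping
    g.add((subject, EX.name, Literal(row["name"])))  # predicate-object map on 'name'

print(g.serialize(format="turtle"))
```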


Author(s):  
Zequn Sun ◽  
Wei Hu ◽  
Qingheng Zhang ◽  
Yuzhong Qu

Embedding-based entity alignment represents different knowledge graphs (KGs) as low-dimensional embeddings and finds entity alignment by measuring the similarities between entity embeddings. Existing approaches have achieved promising results; however, they are still challenged by the lack of enough prior alignments as labeled training data. In this paper, we propose a bootstrapping approach to embedding-based entity alignment. It iteratively labels likely entity alignments as training data for learning alignment-oriented KG embeddings. Furthermore, it employs an alignment editing method to reduce error accumulation during iterations. Our experiments on real-world datasets showed that the proposed approach significantly outperformed the state-of-the-art embedding-based methods for entity alignment. The proposed alignment-oriented KG embedding, bootstrapping process, and alignment editing method all contributed to the performance improvement.
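A high-level sketch of such a bootstrapping loop, assuming row-normalized embedding matrices and a placeholder train_embeddings function standing in for the alignment-oriented training step; the paper's labeling and editing heuristics are more involved than this simplification:

```python
# Iteratively label high-confidence, mutually-nearest entity pairs as new training
# data, and overwrite earlier, less confident labels ("alignment editing").
import numpy as np

def bootstrap_alignment(emb1, emb2, seed_pairs, train_embeddings,
                        iterations=5, threshold=0.9):
    labeled = dict(seed_pairs)                   # index in KG1 -> index in KG2
    for _ in range(iterations):
        # Placeholder for alignment-oriented KG embedding training (assumed API).
        emb1, emb2 = train_embeddings(emb1, emb2, labeled)
        sim = emb1 @ emb2.T                      # cosine similarity for unit rows
        for i in range(sim.shape[0]):
            j = int(np.argmax(sim[i]))
            if sim[i, j] >= threshold and int(np.argmax(sim[:, j])) == i:
                # Keep the more confident of the old and new label for entity i.
                if i not in labeled or sim[i, labeled[i]] < sim[i, j]:
                    labeled[i] = j
    return labeled
```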


Information ◽  
2021 ◽  
Vol 12 (4) ◽  
pp. 160
Author(s):  
John P. McCrae ◽  
Pranab Mohanty ◽  
Siddharth Narayanan ◽  
Bianca Pereira ◽  
Paul Buitelaar ◽  
...  

Knowledge graphs are proving to be an increasingly important part of modern enterprises, and new applications of such enterprise knowledge graphs are still being found. In this paper, we report on our experience using an automatic knowledge graph system called Saffron in the context of a large financial enterprise, and show how it has found applications within this enterprise as part of the "Conversation Concepts Artificial Intelligence" tool. In particular, we analyse the use cases for knowledge graphs within this enterprise, which led us to a new extension of the knowledge graph system. We present the results of these adaptations, including the introduction of a semi-supervised taxonomy extraction system that keeps analysts in the loop. Further, we extend the kinds of relations extracted by the system and show how the use of BERT and ELMo models can produce high-quality results. Thus, we show how this tool can help realize a smart enterprise and how requirements in the financial industry can be met by state-of-the-art natural language processing technologies.


2021 ◽  
Vol 4 ◽  
Author(s):  
Linmei Hu ◽  
Mengmei Zhang ◽  
Shaohua Li ◽  
Jinghan Shi ◽  
Chuan Shi ◽  
...  

Knowledge Graphs (KGs) such as Freebase and YAGO have been widely adopted in a variety of NLP tasks. Representation learning of KGs aims to map entities and relationships into a continuous low-dimensional vector space. Conventional KG embedding methods (such as TransE and ConvE) utilize only KG triplets and thus suffer from structure sparsity. Some recent works address this issue by incorporating auxiliary texts of entities, typically entity descriptions. However, these methods usually focus only on local consecutive word sequences and seldom explicitly use global word co-occurrence information in a corpus. In this paper, we propose to model the whole auxiliary text corpus with a graph and present an end-to-end text-graph-enhanced KG embedding model, named Teger. Specifically, we model the auxiliary texts with a heterogeneous entity-word graph (called text-graph), which captures both local and global semantic relationships among entities and words. We then apply graph convolutional networks to learn informative entity embeddings that aggregate high-order neighborhood information. These embeddings are further integrated with the KG triplet embeddings via a gating mechanism, thus enriching the KG representations and alleviating the inherent structure sparsity. Experiments on benchmark datasets show that our method significantly outperforms several state-of-the-art methods.
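A minimal sketch of a gating mechanism for fusing structural KG embeddings with text-graph-based embeddings, shown as an assumption of how such integration could look rather than the Teger implementation itself:

```python
# Per-dimension gate deciding how much of the structural versus text-graph
# embedding to keep for each entity.
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, struct_emb, text_graph_emb):
        g = torch.sigmoid(self.gate(torch.cat([struct_emb, text_graph_emb], dim=-1)))
        return g * struct_emb + (1 - g) * text_graph_emb
```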


2021 ◽  
Vol 11 (12) ◽  
pp. 5572
Author(s):  
Liming Gao ◽  
Huiling Zhu ◽  
Hankz Hankui Zhuo ◽  
Jin Xu 

Applications of knowledge graphs have received much attention in the field of artificial intelligence. The quality of knowledge graphs is, however, often impaired by missing facts. To predict the missing facts, various transformation-based models have been proposed that map knowledge graphs into low-dimensional spaces. However, most existing transformation-based approaches ignore the fact that multiple relations can hold between two entities, which is common in the real world. To address this challenge, we propose a novel approach called DualQuatE that maps entities and relations into a dual quaternion space. Specifically, entities are represented by pure quaternions, and relations are modeled as a combination of rotation and translation from head to tail entities. We then utilize the interactions of different translations and rotations to distinguish the various relations between head and tail entities. Experimental results show that DualQuatE performs competitively with existing state-of-the-art models.
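To illustrate the rotation-plus-translation idea with plain quaternions (the paper itself uses dual quaternions, which pack both operations into one object), here is a hedged sketch; the function names and the scoring form are illustrative only:

```python
# Rotate the head entity (a pure quaternion) by a unit relation quaternion,
# translate it, and score the result against the tail entity.
import numpy as np

def hamilton(q1, q2):
    w1, x1, y1, z1 = q1
    w2, x2, y2, z2 = q2
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 - x1*z2 + y1*w2 + z1*x2,
        w1*z2 + x1*y2 - y1*x2 + z1*w2,
    ])

def conjugate(q):
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def rotate_then_translate(head, rotation_q, translation):
    rotation_q = rotation_q / np.linalg.norm(rotation_q)   # unit quaternion
    rotated = hamilton(hamilton(rotation_q, head), conjugate(rotation_q))
    return rotated + translation

def score(head, rotation_q, translation, tail):
    # Higher score = more plausible triple.
    return -np.linalg.norm(rotate_then_translate(head, rotation_q, translation) - tail)
```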


Author(s):  
Yang Liu ◽  
Quanxue Gao ◽  
Jin Li ◽  
Jungong Han ◽  
Ling Shao

Zero-shot learning (ZSL) has been widely researched and has achieved considerable success in machine learning. Most existing ZSL methods aim to accurately recognize objects of unseen classes by learning a shared mapping from the feature space to a semantic space. However, such methods do not investigate in depth whether the mapping can precisely reconstruct the original visual features. Motivated by the fact that data often have low intrinsic dimensionality, e.g., they lie in a low-dimensional subspace, we formulate in this paper a novel framework named Low-rank Embedded Semantic AutoEncoder (LESAE) to jointly seek a low-rank mapping that links visual features with their semantic representations. Following the encoder-decoder paradigm, the encoder learns a low-rank mapping from the visual features to the semantic space, while the decoder reconstructs the original data with the learned mapping. In addition, a non-greedy iterative algorithm is adopted to solve our model. Extensive experiments on six benchmark datasets demonstrate its superiority over several state-of-the-art algorithms.
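A plausible objective consistent with this encoder-decoder description is sketched below, where $X$ denotes the visual features, $S$ their semantic representations, $W$ the shared low-rank mapping, $\lambda$ a trade-off weight, and $r$ a rank budget; the paper's exact formulation and constraints may differ:

$$\min_{W}\; \lVert X - W^{\top} S \rVert_F^{2} \;+\; \lambda\, \lVert W X - S \rVert_F^{2} \qquad \text{s.t.}\quad \operatorname{rank}(W) \le r$$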


2021 ◽  
Vol 11 (4) ◽  
pp. 1728
Author(s):  
Hua Zhong ◽  
Li Xu

The prediction interval (PI) is an important research topic in reliability analyses and decision support systems. Data size and computation costs are two issues that may hamper the construction of PIs. This paper proposes an all-batch (AB) loss function for constructing high-quality PIs. Taking full advantage of the likelihood principle, the proposed loss makes it possible to train PI-generation models using the gradient descent (GD) method for both small and large batches of samples. With a structure of dual feedforward neural networks (FNNs), a high-quality PI-generation framework is introduced, which can be adapted to a variety of problems including regression analysis. Numerical experiments were conducted on benchmark datasets; the results show that higher-quality PIs were achieved with the proposed scheme. Its reliability and stability were also verified in comparison with various state-of-the-art PI construction methods.
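For orientation, a generic differentiable interval loss from the coverage-plus-width family is sketched below, where one network head predicts the lower bound and the other the upper bound; it is not the AB loss proposed in the paper, and the hyperparameter names are illustrative:

```python
# Penalize wide intervals and coverage below the nominal (1 - alpha) level,
# using a smooth (sigmoid) indicator so the loss stays differentiable.
import torch

def interval_loss(lower, upper, target, alpha=0.05, softness=50.0, lam=1.0):
    width = (upper - lower).mean()
    inside = torch.sigmoid(softness * (target - lower)) * \
             torch.sigmoid(softness * (upper - target))
    coverage = inside.mean()
    coverage_penalty = torch.clamp((1.0 - alpha) - coverage, min=0.0) ** 2
    return width + lam * coverage_penalty
```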


2021 ◽  
Vol 54 (1) ◽  
pp. 1-39
Author(s):  
Zara Nasar ◽  
Syed Waqar Jaffry ◽  
Muhammad Kamran Malik

With the advent of Web 2.0, many online platforms now produce massive amounts of textual data. With ever-increasing textual data at hand, it is of immense importance to extract information nuggets from this data. One approach towards effectively harnessing this unstructured textual data is its transformation into structured text. Hence, this study presents an overview of approaches that can be applied to extract key insights from textual data in a structured way. To this end, Named Entity Recognition and Relation Extraction are the main topics addressed in this review. The former deals with the identification of named entities, and the latter deals with the problem of extracting relations between sets of entities. This study covers early approaches as well as the developments made up to now using machine learning models. The survey findings conclude that deep-learning-based hybrid and joint models currently govern the state of the art. It is also observed that annotated benchmark datasets for various textual-data generators, such as Twitter and other social forums, are not available, and this scarcity of datasets has resulted in relatively little progress in these domains. Additionally, the majority of state-of-the-art techniques are offline and computationally expensive. Last, with the increasing focus on deep-learning frameworks, there is a need to understand and explain the underlying processes in deep architectures.

