Negative Sampling in Knowledge Representation Learning: A Mini-Review

Mapping Intimacies ◽

10.5121/csit.2020.101519 ◽

2020 ◽

Author(s):

Jing Qian ◽

Gangmin Li ◽

Katie Atkinson ◽

Yong Yue

Keyword(s):

Knowledge Representation ◽

Link Prediction ◽

Representation Learning ◽

Cluster Sampling ◽

Continuous Space ◽

Knowledge Representations ◽

Space Efficiency ◽

Low Dimensional ◽

Fixed Distribution

Knowledge representation learning (KRL) aims at encoding components of a knowledge graph (KG) into a low-dimensional continuous space, which has brought considerable successes in applying deep learning to graph embedding. Most famous KGs contain only positive instances for space efficiency. Typical KRL techniques, especially translational distance-based models, are trained through discriminating positive and negative samples. Thus, negative sampling is unquestionably a non-trivial step in KG embedding. The quality of generated negative samples can directly influence the performance of final knowledge representations in downstream tasks, such as link prediction and triple classification. This review summarizes current negative sampling methods in KRL and we categorize them into three sorts, fixed distribution-based, generative adversarial net (GAN)-based and cluster sampling. Based on this categorization we discuss the most prevalent existing approaches and their characteristics.

Download Full-text

Understanding Negative Sampling in Knowledge Graph Embedding

International Journal of Artificial Intelligence & Applications ◽

10.5121/ijaia.2021.12105 ◽

2021 ◽

Vol 12 (1) ◽

pp. 71-81

Author(s):

Jing Qian ◽

Gangmin Li ◽

Katie Atkinson ◽

Yong Yue

Keyword(s):

Link Prediction ◽

Graph Embedding ◽

Knowledge Graph ◽

Direct Impact ◽

Dimensional Vector Space ◽

Dynamic Distribution ◽

Space Efficiency ◽

Node Classification ◽

Low Dimensional

Knowledge graph embedding (KGE) is to project entities and relations of a knowledge graph (KG) into a low-dimensional vector space, which has made steady progress in recent years. Conventional KGE methods, especially translational distance-based models, are trained through discriminating positive samples from negative ones. Most KGs store only positive samples for space efficiency. Negative sampling thus plays a crucial role in encoding triples of a KG. The quality of generated negative samples has a direct impact on the performance of learnt knowledge representation in a myriad of downstream tasks, such as recommendation, link prediction and node classification. We summarize current negative sampling approaches in KGE into three categories, static distribution-based, dynamic distribution-based and custom cluster-based respectively. Based on this categorization we discuss the most prevalent existing approaches and their characteristics. It is a hope that this review can provide some guidelines for new thoughts about negative sampling in KGE.

Download Full-text

Image-embodied Knowledge Representation Learning

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/438 ◽

2017 ◽

Cited By ~ 13

Author(s):

Ruobing Xie ◽

Zhiyuan Liu ◽

Huanbo Luan ◽

Maosong Sun

Keyword(s):

Knowledge Representation ◽

Visual Information ◽

Representation Learning ◽

Learning Model ◽

Experimental Results ◽

Knowledge Graph ◽

Embodied Knowledge ◽

Knowledge Representations ◽

Conventional Methods ◽

Image Representations

Entity images could provide significant visual information for knowledge representation learning. Most conventional methods learn knowledge representations merely from structured triples, ignoring rich visual information extracted from entity images. In this paper, we propose a novel Image-embodied Knowledge Representation Learning model (IKRL), where knowledge representations are learned with both triple facts and images. More specifically, we first construct representations for all images of an entity with a neural image encoder. These image representations are then integrated into an aggregated image-based representation via an attention-based method. We evaluate our IKRL models on knowledge graph completion and triple classification. Experimental results demonstrate that our models outperform all baselines on both tasks, which indicates the significance of visual information for knowledge representations and the capability of our models in learning knowledge representations with images.

Download Full-text

Learning Network Embedding with Community Structural Information

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/407 ◽

2019 ◽

Cited By ~ 1

Author(s):

Yu Li ◽

Ying Wang ◽

Tingting Zhang ◽

Jiawei Zhang ◽

Yi Chang

Keyword(s):

Community Structure ◽

Link Prediction ◽

Structural Information ◽

Representation Learning ◽

Network Embedding ◽

Learning Network ◽

Optimization Framework ◽

Vertex Representation ◽

Low Dimensional ◽

Embedding Methods

Network embedding is an effective approach to learn the low-dimensional representations of vertices in networks, aiming to capture and preserve the structure and inherent properties of networks. The vast majority of existing network embedding methods exclusively focus on vertex proximity of networks, while ignoring the network internal community structure. However, the homophily principle indicates that vertices within the same community are more similar to each other than those from different communities, thus vertices within the same community should have similar vertex representations. Motivated by this, we propose a novel network embedding framework NECS to learn the Network Embedding with Community Structural information, which preserves the high-order proximity and incorporates the community structure in vertex representation learning. We formulate the problem into a principled optimization framework and provide an effective alternating algorithm to solve it. Extensive experimental results on several benchmark network datasets demonstrate the effectiveness of the proposed framework in various network analysis tasks including network reconstruction, link prediction and vertex classification.

Download Full-text

Feature Hashing for Network Representation Learning

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/390 ◽

2018 ◽

Cited By ~ 2

Author(s):

Qixiang Wang ◽

Shanfeng Wang ◽

Maoguo Gong ◽

Yue Wu

Keyword(s):

Link Prediction ◽

Feature Space ◽

Representation Learning ◽

Learning Approaches ◽

Network Representation ◽

Proximity Matrix ◽

Low Dimensional ◽

Vector Representations ◽

Feature Hashing ◽

Node Embeddings

The goal of network representation learning is to embed nodes so as to encode the proximity structures of a graph into a continuous low-dimensional feature space. In this paper, we propose a novel algorithm called node2hash based on feature hashing for generating node embeddings. This approach follows the encoder-decoder framework. There are two main mapping functions in this framework. The first is an encoder to map each node into high-dimensional vectors. The second is a decoder to hash these vectors into a lower dimensional feature space. More specifically, we firstly derive a proximity measurement called expected distance as target which combines position distribution and co-occurrence statistics of nodes over random walks so as to build a proximity matrix, then introduce a set of T different hash functions into feature hashing to generate uniformly distributed vector representations of nodes from the proximity matrix. Compared with the existing state-of-the-art network representation learning approaches, node2hash shows a competitive performance on multi-class node classification and link prediction tasks on three real-world networks from various domains.

Download Full-text

An Optimized Network Representation Learning Algorithm Using Multi-Relational Data

Mathematics ◽

10.3390/math7050460 ◽

2019 ◽

Vol 7 (5) ◽

pp. 460

Author(s):

Zhonglin Ye ◽

Haixing Zhao ◽

Ke Zhang ◽

Yu Zhu ◽

Zhaoyang Wang

Keyword(s):

Knowledge Representation ◽

Learning Algorithm ◽

Representation Learning ◽

Knowledge Bases ◽

Relational Data ◽

New Approach ◽

Network Representation ◽

Structural Relationships ◽

Low Dimensional

Representation learning aims to encode the relationships of research objects into low-dimensional, compressible, and distributed representation vectors. The purpose of network representation learning is to learn the structural relationships between network vertices. Knowledge representation learning is oriented to model the entities and relationships in knowledge bases. In this paper, we first introduce the idea of knowledge representation learning into network representation learning, namely, we propose a new approach to model the vertex triplet relationships based on DeepWalk without TransE. Consequently, we propose an optimized network representation learning algorithm using multi-relational data, MRNR, which introduces the multi-relational data between vertices into the procedures of network representation learning. Importantly, we adopted a kind of higher order transformation strategy to optimize the learnt network representation vectors. The purpose of MRNR is that multi-relational data (triplets) can effectively guide and constrain the procedures of network representation learning. The experimental results demonstrate that the proposed MRNR can learn the discriminative network representations, which show better performance on network classification, visualization, and case study tasks compared to the proposed baseline algorithms in this paper.

Download Full-text

Variational Bayesian representation learning for grocery recommendation

Information Retrieval ◽

10.1007/s10791-021-09397-1 ◽

2021 ◽

Vol 24 (4-5) ◽

pp. 347-369

Author(s):

Zaiqiao Meng ◽

Richard McCreadie ◽

Craig Macdonald ◽

Iadh Ounis

Keyword(s):

Real World ◽

Side Information ◽

Representation Learning ◽

Continuous Space ◽

Variational Bayesian ◽

Current State ◽

Significant Performance ◽

Low Dimensional ◽

Performance Gains ◽

The Impact

AbstractRepresentation learning has been widely applied in real-world recommendation systems to capture the features of both users and items. Existing grocery recommendation methods only represent each user and item by single deterministic points in a low-dimensional continuous space, which limit the expressive ability of their embeddings, resulting in recommendation performance bottlenecks. In addition, existing representation learning methods for grocery recommendation only consider the items (products) as independent entities, neglecting their other valuable side information, such as the textual descriptions and the categorical data of items. In this paper, we propose the Variational Bayesian Context-Aware Representation (VBCAR) model for grocery recommendation. VBCAR is a novel variational Bayesian model that learns distributional representations of users and items by leveraging basket context information from historical interactions. Our VBCAR model is also extendable to leverage side information by encoding contextual features into representations based on the inference encoder. We conduct extensive experiments on three real-world grocery datasets to assess the effectiveness of our model as well as the impact of different construction strategies for item side information. Our results show that our VBCAR model outperforms the current state-of-the-art grocery recommendation models while integrating item side information (especially the categorical features with the textual information of items) results in further significant performance gains. Furthermore, we demonstrate through analysis that our model is able to effectively encode similarities between product types, which we argue is the primary reason for the observed effectiveness gains.

Download Full-text

TransRHS: A Representation Learning Method for Knowledge Graphs with Relation Hierarchical Structure

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/413 ◽

2020 ◽

Cited By ~ 1

Author(s):

Fuxiang Zhang ◽

Xin Wang ◽

Zhao Li ◽

Jianxin Li

Keyword(s):

Hierarchical Structure ◽

Link Prediction ◽

Dimensional Space ◽

Model Performance ◽

Representation Learning ◽

Negative Effect ◽

Novel Method ◽

Overall Performance ◽

Knowledge Graphs ◽

Low Dimensional

Representation learning of knowledge graphs aims to project both entities and relations as vectors in a continuous low-dimensional space. Relation Hierarchical Structure (RHS), which is constructed by a generalization relationship named subRelationOf between relations, can improve the overall performance of knowledge representation learning. However, most of the existing methods ignore this critical information, and a straightforward way of considering RHS may have a negative effect on the embeddings and thus reduce the model performance. In this paper, we propose a novel method named TransRHS, which is able to incorporate RHS seamlessly into the embeddings. More specifically, TransRHS encodes each relation as a vector together with a relation-specific sphere in the same space. Our TransRHS employs the relative positions among the vectors and spheres to model the subRelationOf, which embodies the inherent generalization relationships among relations. We evaluate our model on two typical tasks, i.e., link prediction and triple classification. The experimental results show that our TransRHS model significantly outperforms all baselines on both tasks, which verifies that the RHS information is significant to representation learning of knowledge graphs, and TransRHS can effectively and efficiently fuse RHS into knowledge graph embeddings.

Download Full-text

Rule-Guided Compositional Representation Learning on Knowledge Graphs with Hierarchical Types

Mathematics ◽

10.3390/math9161978 ◽

2021 ◽

Vol 9 (16) ◽

pp. 1978

Author(s):

Yanying Mao ◽

Honghui Chen

Keyword(s):

Knowledge Representation ◽

Representation Learning ◽

Knowledge Graph ◽

Path Information ◽

Different Types ◽

Rich Information ◽

The Rich ◽

Horn Rules ◽

Low Dimensional ◽

Type Information

The representation learning of the knowledge graph projects the entities and relationships in the triples into a low-dimensional continuous vector space. Early representation learning mostly focused on the information contained in the triplet itself but ignored other useful information. Since entities have different types of representations in different scenarios, the rich information in the types of entity levels is helpful for obtaining a more complete knowledge representation. In this paper, a new knowledge representation frame (TRKRL) combining rule path information and entity hierarchical type information is proposed to exploit interpretability of logical rules and the advantages of entity hierarchical types. Specifically, for entity hierarchical type information, we consider that entities have multiple representations of different types, as well as treat it as the projection matrix of entities, using the type encoder to model entity hierarchical types. For rule path information, we mine Horn rules from the knowledge graph to guide the synthesis of relations in paths. Experimental results show that TRKRL outperforms baselines on the knowledge graph completion task, which indicates that our model is capable of using entity hierarchical type information, relation paths information, and logic rules information for representation learning.

Download Full-text

Fusion of text and graph information for machine learning problems on networks

PeerJ Computer Science ◽

10.7717/peerj-cs.526 ◽

2021 ◽

Vol 7 ◽

pp. e526

Author(s):

Ilya Makarov ◽

Mikhail Makarov ◽

Dmitrii Kiselev

Keyword(s):

Machine Learning ◽

Link Prediction ◽

Citation Network ◽

Representation Learning ◽

Scientific Paper ◽

Learning Problems ◽

Network Embedding ◽

Network Properties ◽

Text Information ◽

Low Dimensional

Today, increased attention is drawn towards network representation learning, a technique that maps nodes of a network into vectors of a low-dimensional embedding space. A network embedding constructed this way aims to preserve nodes similarity and other specific network properties. Embedding vectors can later be used for downstream machine learning problems, such as node classification, link prediction and network visualization. Naturally, some networks have text information associated with them. For instance, in a citation network, each node is a scientific paper associated with its abstract or title; in a social network, all users may be viewed as nodes of a network and posts of each user as textual attributes. In this work, we explore how combining existing methods of text and network embeddings can increase accuracy for downstream tasks and propose modifications to popular architectures to better capture textual information in network embedding and fusion frameworks.

Download Full-text

ANRL: Attributed Network Representation Learning via Deep Neural Networks

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/438 ◽

2018 ◽

Cited By ~ 31

Author(s):

Zhen Zhang ◽

Hongxia Yang ◽

Jiajun Bu ◽

Sheng Zhou ◽

Pinggang Yu ◽

...

Keyword(s):

Network Structure ◽

Link Prediction ◽

Deep Neural Networks ◽

User Behavior ◽

Representation Learning ◽

Network Representation ◽

Node Attribute ◽

Attributed Network ◽

Attribute Information ◽

Low Dimensional

Network representation learning (RL) aims to transform the nodes in a network into low-dimensional vector spaces while preserving the inherent properties of the network. Though network RL has been intensively studied, most existing works focus on either network structure or node attribute information. In this paper, we propose a novel framework, named ANRL, to incorporate both the network structure and node attribute information in a principled way. Specifically, we propose a neighbor enhancement autoencoder to model the node attribute information, which reconstructs its target neighbors instead of itself. To capture the network structure, attribute-aware skip-gram model is designed based on the attribute encoder to formulate the correlations between each node and its direct or indirect neighbors. We conduct extensive experiments on six real-world networks, including two social networks, two citation networks and two user behavior networks. The results empirically show that ANRL can achieve relatively significant gains in node classification and link prediction tasks.

Download Full-text