Rule-Guided Compositional Representation Learning on Knowledge Graphs

2020 · Vol 34 (03) · pp. 2950-2958
Author(s): Guanglin Niu, Yongfei Zhang, Bo Li, Peng Cui, Si Liu, et al.

Representation learning on a knowledge graph (KG) aims to embed the entities and relations of a KG into low-dimensional continuous vector spaces. Early KG embedding methods attended only to the structural information encoded in triples, which limits their performance given the structural sparsity of KGs. Some recent attempts incorporate path information to enrich the structure of KGs, but lack explainability in how the path representations are obtained. In this paper, we propose a novel Rule and Path-based Joint Embedding (RPJE) scheme, which takes full advantage of the explainability and accuracy of logic rules, the generalization ability of KG embedding, and the supplementary semantic structure of paths. Specifically, logic rules of different lengths (the length being the number of relations in the rule body), in the form of Horn clauses, are first mined from the KG and elaborately encoded for representation learning. Rules of length 2 are then applied to compose paths accurately, while rules of length 1 are explicitly employed to create semantic associations among relations and to constrain relation embeddings. Moreover, the confidence level of each rule is taken into account during optimization to guarantee the validity of applying the rule to representation learning. Extensive experimental results show that RPJE outperforms state-of-the-art baselines on the KG completion task, demonstrating the benefit of utilizing logic rules together with paths for improving the accuracy and explainability of representation learning.
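To make the rule-guided composition concrete, here is a minimal NumPy sketch of the core idea: a mined length-2 Horn rule (with its confidence) replaces a two-relation path by the rule head before translation-based scoring. All entity names, relation names, rules, and embeddings below are toy stand-ins, not the paper's actual model or training procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 50

# Toy embeddings; entity and relation names are made up for illustration.
entities = {e: rng.normal(size=dim) for e in ["paris", "france", "europe"]}
relations = {r: rng.normal(size=dim) for r in
             ["capital_of", "located_in", "part_of"]}

# A length-2 Horn rule with its confidence level, e.g.
#   capital_of(x, y) AND part_of(y, z) => located_in(x, z)   [conf = 0.9]
rules_len2 = {("capital_of", "part_of"): ("located_in", 0.9)}

def compose_path(path):
    """Compose a relation path; rule-matched pairs are replaced by the
    rule head, and the rule confidences are accumulated."""
    rels, conf = list(path), 1.0
    i = 0
    while i < len(rels) - 1:
        pair = (rels[i], rels[i + 1])
        if pair in rules_len2:
            head, c = rules_len2[pair]
            rels[i:i + 2] = [head]   # rule-guided, explainable composition
            conf *= c
        else:
            i += 1
    # TransE-style additive composition for any remaining relations.
    return sum(relations[r] for r in rels), conf

def score(h, path, t):
    """Translation-based score (higher = more plausible); the accumulated
    rule confidence would weight this triple's term in the training loss."""
    p, conf = compose_path(path)
    return -np.linalg.norm(entities[h] + p - entities[t]), conf

print(score("paris", ["capital_of", "part_of"], "europe"))
```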

2021 · Vol 4
Author(s): Linmei Hu, Mengmei Zhang, Shaohua Li, Jinghan Shi, Chuan Shi, et al.

Knowledge Graphs (KGs) such as Freebase and YAGO have been widely adopted in a variety of NLP tasks. Representation learning of KGs aims to map entities and relations into a continuous low-dimensional vector space. Conventional KG embedding methods (such as TransE and ConvE) utilize only KG triplets and thus suffer from structure sparsity. Some recent works address this issue by incorporating auxiliary texts of entities, typically entity descriptions. However, these methods usually focus only on local consecutive word sequences and seldom explicitly exploit the global word co-occurrence information in a corpus. In this paper, we propose to model the whole auxiliary text corpus as a graph and present an end-to-end text-graph-enhanced KG embedding model, named Teger. Specifically, we model the auxiliary texts with a heterogeneous entity-word graph (called a text-graph), which captures both local and global semantic relationships among entities and words. We then apply graph convolutional networks to learn informative entity embeddings that aggregate high-order neighborhood information. These embeddings are further integrated with the KG triplet embeddings via a gating mechanism, thus enriching the KG representations and alleviating the inherent structure sparsity. Experiments on benchmark datasets show that our method significantly outperforms several state-of-the-art methods.
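The aggregation-then-gating pipeline can be illustrated with a short NumPy sketch: one symmetrically normalised GCN layer produces text-graph entity embeddings, and a sigmoid gate mixes them with triplet embeddings. The graph, features, and weights below are random stand-ins; Teger's actual architecture and training objective are more involved.

```python
import numpy as np

rng = np.random.default_rng(1)
n, dim = 5, 16  # 5 nodes in a toy entity-word graph

# Symmetric adjacency with self-loops (A_hat) and its GCN normalisation.
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.maximum(A, A.T)
A_hat = A + np.eye(n)
D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt

X = rng.normal(size=(n, dim))            # initial node features
W = rng.normal(size=(dim, dim)) * 0.1    # GCN layer weight (untrained toy)

def gcn_layer(A_norm, H, W):
    """One graph-convolution layer: aggregate neighbours, then transform."""
    return np.maximum(A_norm @ H @ W, 0.0)  # ReLU

text_emb = gcn_layer(A_norm, X, W)       # text-graph entity embeddings
kg_emb = rng.normal(size=(n, dim))       # stand-in KG triplet embeddings

# Gating mechanism (sketch): a sigmoid gate mixes the two views per node.
W_g = rng.normal(size=(2 * dim, dim)) * 0.1
gate = 1.0 / (1.0 + np.exp(-np.concatenate([text_emb, kg_emb], 1) @ W_g))
fused = gate * text_emb + (1.0 - gate) * kg_emb
print(fused.shape)  # (5, 16)
```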


Electronics · 2021 · Vol 10 (12) · pp. 1407
Author(s): Peng Wang, Jing Zhou, Yuzhang Liu, Xingchen Zhou

Knowledge graph embedding aims to embed entities and relations into low-dimensional vector spaces. Most existing methods focus only on the triple facts in knowledge graphs, and models based on translation or distance measurement cannot fully represent complex relations. As well-constructed prior knowledge, entity types can be employed to learn the representations of entities and relations. In this paper, we propose a novel knowledge graph embedding model named TransET, which takes advantage of entity types to learn more semantic features. More specifically, circular convolution over the embeddings of an entity and its types is used to map the head and tail entities to type-specific representations, and a translation-based score function is then used to learn the representations of triples. We evaluated our model on real-world datasets with two benchmark tasks, link prediction and triple classification. Experimental results demonstrate that it outperforms state-of-the-art models in most cases.
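Circular convolution has an efficient FFT form, which the following sketch uses to project head and tail entities into type-specific representations before a TransE-style score. The embeddings are random toys, a single type per entity is assumed, and training is omitted.

```python
import numpy as np

def circular_conv(a, b):
    """Circular convolution via FFT: (a * b)_k = sum_i a_i * b_{(k-i) mod d}."""
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

rng = np.random.default_rng(2)
dim = 64
h, t, r = (rng.normal(size=dim) for _ in range(3))          # head, tail, relation
type_h, type_t = (rng.normal(size=dim) for _ in range(2))   # entity-type embeddings

# Map entities to type-specific representations by circular convolution
# (a sketch of the projection step described in the abstract).
h_proj = circular_conv(h, type_h)
t_proj = circular_conv(t, type_t)

# Translation-based score, TransE-style: lower distance = more plausible.
score = -np.linalg.norm(h_proj + r - t_proj)
print(score)
```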


2021 · Vol 11 (5) · pp. 2371
Author(s): Junjian Zhan, Feng Li, Yang Wang, Daoyu Lin, Guangluan Xu

As most networks come with some content in each node, attributed network embedding has attracted much research interest. Most existing attributed network embedding methods aim to learn a fixed representation for each node that encodes its local proximity. However, these methods usually neglect both the global information between nodes distant from each other and the distribution of the latent codes. We propose the Structural Adversarial Variational Graph Auto-Encoder (SAVGAE), a novel framework that encodes the network structure and node content into low-dimensional embeddings. On one hand, our model captures the local proximity, and proximities at any distance, of a network by exploiting a high-order proximity indicator named Rooted PageRank. On the other hand, our method learns the data distribution of each node representation while circumventing, through adversarial training, the side effect that its sampling process has on learning a robust embedding. On benchmark datasets, we demonstrate that our method performs competitively with state-of-the-art models.
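Rooted PageRank, the high-order proximity indicator mentioned above, can be computed by power iteration on a personalised random walk. A minimal sketch follows, with a toy graph and hypothetical parameter choices.

```python
import numpy as np

def rooted_pagerank(A, root, alpha=0.85, iters=100):
    """Rooted (personalised) PageRank: a random walk that restarts at
    `root` with probability 1 - alpha; the stationary scores serve as a
    high-order proximity between `root` and every other node."""
    n = A.shape[0]
    P = A / np.maximum(A.sum(axis=1, keepdims=True), 1e-12)  # row-stochastic
    e = np.zeros(n)
    e[root] = 1.0
    pi = e.copy()
    for _ in range(iters):
        pi = alpha * (P.T @ pi) + (1 - alpha) * e
    return pi

# Toy 4-node path graph 0-1-2-3: node 3 is far from the root yet still
# receives nonzero proximity, unlike purely local first-order measures.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], float)
print(rooted_pagerank(A, root=0).round(3))
```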


Author(s): Yu Li, Ying Wang, Tingting Zhang, Jiawei Zhang, Yi Chang

Network embedding is an effective approach to learning low-dimensional representations of vertices in networks, aiming to capture and preserve the structure and inherent properties of networks. The vast majority of existing network embedding methods focus exclusively on vertex proximity and ignore the internal community structure of the network. However, the homophily principle indicates that vertices within the same community are more similar to each other than those from different communities, so vertices within the same community should have similar representations. Motivated by this, we propose NECS, a novel network embedding framework that learns Network Embedding with Community Structural information, preserving high-order proximity and incorporating the community structure into vertex representation learning. We formulate the problem as a principled optimization framework and provide an effective alternating algorithm to solve it. Extensive experimental results on several benchmark network datasets demonstrate the effectiveness of the proposed framework in various network analysis tasks, including network reconstruction, link prediction and vertex classification.
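A toy illustration of the two ingredients, high-order proximity preservation plus a homophily pull within communities, is sketched below with plain gradient descent. This is an illustrative objective only, not NECS's actual optimization framework or its alternating algorithm.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, d = 6, 2, 4  # 6 vertices, 2 communities, 4-dim embeddings

# High-order proximity matrix S for a toy 2-community graph, plus a
# known community assignment (both fabricated for illustration).
S = np.zeros((n, n))
S[:3, :3] = 0.8
S[3:, 3:] = 0.8
np.fill_diagonal(S, 1.0)
comm = np.array([0, 0, 0, 1, 1, 1])

U = rng.normal(scale=0.1, size=(n, d))   # vertex embeddings
lam, lr = 0.5, 0.02

for _ in range(1000):
    # Proximity term: make U @ U.T approximate S (high-order proximity).
    grad = 4 * (U @ U.T - S) @ U
    # Homophily term: pull vertices of the same community together.
    for c in range(k):
        idx = comm == c
        grad[idx] += lam * 2 * (U[idx] - U[idx].mean(axis=0))
    U -= lr * grad

# Same-community pairs end up closer than cross-community pairs.
print(np.linalg.norm(U[0] - U[1]), np.linalg.norm(U[0] - U[4]))
```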


2020 · Vol 34 (01) · pp. 164-172
Author(s): Sijie Mai, Haifeng Hu, Songlong Xing

Learning a joint embedding space for various modalities is of vital importance for multimodal fusion. Mainstream modality fusion approaches fail to achieve this goal, leaving a modality gap that heavily affects cross-modal fusion. In this paper, we propose a novel adversarial encoder-decoder-classifier framework to learn a modality-invariant embedding space. Since the distributions of the various modalities differ in nature, we reduce the modality gap by translating the distributions of the source modalities into that of the target modality via their respective encoders, using adversarial training. Furthermore, we impose additional constraints on the embedding space by introducing a reconstruction loss and a classification loss. We then fuse the encoded representations using a hierarchical graph neural network that explicitly explores unimodal, bimodal, and trimodal interactions in multiple stages. Our method achieves state-of-the-art performance on multiple datasets. Visualization of the learned embeddings suggests that the joint embedding space learned by our method is discriminative.
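The adversarial translation step can be miniaturised as follows: a linear discriminator tries to tell encoded source samples from target samples, while a linear encoder is updated to fool it, shrinking the modality gap. Real encoders, the reconstruction and classification losses, and the graph fusion stage are all omitted; every tensor here is synthetic.

```python
import numpy as np

rng = np.random.default_rng(4)
d = 8
# Toy features for a "source" and a "target" modality, drawn from
# deliberately different distributions (the modality gap).
src = rng.normal(loc=2.0, size=(256, d))
tgt = rng.normal(loc=0.0, size=(256, d))

E = np.eye(d)      # linear source encoder (stand-in for a deep encoder)
w = np.zeros(d)    # linear discriminator: p = P(sample came from source)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.1
for _ in range(500):
    enc = src @ E
    # Discriminator step: separate encoded source (label 1) from target (0).
    p_s, p_t = sigmoid(enc @ w), sigmoid(tgt @ w)
    grad_w = enc.T @ (p_s - 1) / len(src) + tgt.T @ p_t / len(tgt)
    w -= lr * grad_w
    # Encoder step: update E adversarially so encoded source samples are
    # classified as target, translating one distribution toward the other.
    p_s = sigmoid(src @ E @ w)
    grad_E = src.T @ (p_s[:, None] * w[None, :]) / len(src)
    E -= lr * grad_E

print("gap before:", np.abs(src.mean(0) - tgt.mean(0)).mean().round(2))
print("gap after: ", np.abs((src @ E).mean(0) - tgt.mean(0)).mean().round(2))
```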


Author(s): Xiang Yue, Zhen Wang, Jingong Huang, Srinivasan Parthasarathy, Soheil Moosavinasab, et al.

Motivation: Graph embedding learning, which aims to automatically learn low-dimensional node representations, has drawn increasing attention in recent years. To date, most recent graph embedding methods are evaluated on social and information networks and have not been comprehensively studied on biomedical networks under systematic experiments and analyses. On the other hand, for a variety of biomedical network analysis tasks, traditional techniques such as matrix factorization (which can be seen as a type of graph embedding method) have shown promising results, and hence there is a need to systematically evaluate the more recent graph embedding methods (e.g. random-walk-based and neural-network-based) in terms of their usability and potential to further the state of the art.

Results: We select 11 representative graph embedding methods and conduct a systematic comparison on 3 important biomedical link prediction tasks: drug-disease association (DDA) prediction, drug–drug interaction (DDI) prediction, and protein–protein interaction (PPI) prediction; and 2 node classification tasks: medical term semantic type classification and protein function prediction. Our experimental results demonstrate that the recent graph embedding methods achieve promising results and deserve more attention in future biomedical graph analysis. Compared with three state-of-the-art methods for DDA, DDI and protein function prediction, the recent graph embedding methods achieve competitive performance without using any biological features, and the learned embeddings can be treated as complementary representations for the biological features. By summarizing the experimental results, we provide general guidelines for properly selecting graph embedding methods and setting their hyper-parameters for different biomedical tasks.

Availability and implementation: As part of our contributions, we develop an easy-to-use Python package with detailed instructions, BioNEV, available at https://github.com/xiangyue9607/BioNEV, including all source code and datasets, to facilitate studying various graph embedding methods on biomedical tasks.

Supplementary information: Supplementary data are available at Bioinformatics online.
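As a hedged illustration of the link-prediction evaluation protocol (not BioNEV's actual API), the sketch below builds a toy interaction network, factorises the training adjacency by truncated SVD, which the paper counts among the traditional matrix-factorization embeddings, and scores held-out edges by the inner product of node embeddings.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(5)
n = 60
# Toy biomedical-style interaction network (a stand-in for, e.g., a PPI
# graph): two denser modules plus background noise.
A = (rng.random((n, n)) < 0.05).astype(float)
A[:30, :30] = np.maximum(A[:30, :30], rng.random((30, 30)) < 0.3)
A = np.triu(A, 1)
A = A + A.T

# Hold out 20% of edges as positive test pairs; sample non-edges as negatives.
edges = np.argwhere(np.triu(A, 1) > 0)
rng.shuffle(edges)
test_pos = edges[: len(edges) // 5]
A_train = A.copy()
for i, j in test_pos:
    A_train[i, j] = A_train[j, i] = 0.0
non_edges = np.argwhere(np.triu(A == 0, 1))
test_neg = non_edges[rng.choice(len(non_edges), len(test_pos), replace=False)]

# Matrix-factorisation embedding: rank-d truncated SVD of the training graph.
d = 16
U, s, Vt = np.linalg.svd(A_train)
emb = U[:, :d] * np.sqrt(s[:d])

pairs = np.vstack([test_pos, test_neg])
labels = np.r_[np.ones(len(test_pos)), np.zeros(len(test_neg))]
scores = [emb[i] @ emb[j] for i, j in pairs]  # inner-product edge score
print("link-prediction AUC:", round(roc_auc_score(labels, scores), 3))
```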


2021 · Vol 12 (4) · pp. 1-25
Author(s): Daheng Wang, Qingkai Zeng, Nitesh V. Chawla, Meng Jiang

People look for complementary contexts, such as team members with complementary skills for project team building, or reading materials with complementary knowledge for effective student learning, to make their behaviors more likely to succeed. Complementarity has been identified by the behavioral sciences as one of the most important factors in decision making. Existing computational models that learn low-dimensional context representations from behavior data have poor scalability, and recent network embedding methods focus only on preserving the similarity between contexts. In this work, we formulate a behavior entry as a set of context items and propose a novel representation learning method, Multi-type Itemset Embedding, to learn context representations that preserve the itemset structures. We propose a measure of complementarity between context items in the embedding space. Experiments demonstrate both the effectiveness and the efficiency of the proposed method over state-of-the-art methods on behavior prediction and context recommendation. We discover that complementary contexts and similar contexts are significantly different in human behaviors.
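One way to see how complementarity differs from similarity in an embedding space is the toy measure below: two items are complementary to the extent that their combination explains a behavior embedding better than either item alone. This definition and all names are illustrative assumptions; the paper's measure is defined over its learned itemset embeddings.

```python
import numpy as np

rng = np.random.default_rng(6)
dim = 8
# Hypothetical context items (e.g. skills on a project team).
items = {name: rng.normal(size=dim) for name in
         ["backend_dev", "frontend_dev", "designer"]}
# Stand-in embedding of a successful behavior entry (a shipped project),
# fabricated so that two of the items jointly explain it.
project = items["backend_dev"] + items["frontend_dev"]

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def complementarity(a, b, target):
    """Illustrative measure: how much better two items explain the target
    jointly than either does alone (not the paper's exact formula)."""
    return cosine(a + b, target) - max(cosine(a, target), cosine(b, target))

print(complementarity(items["backend_dev"], items["frontend_dev"], project))
print(complementarity(items["backend_dev"], items["designer"], project))
```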


2021 · Vol 14 (6) · pp. 1102-1110
Author(s): Anton Tsitsulin, Marina Munkhoeva, Davide Mottin, Panagiotis Karras, Ivan Oseledets, et al.

Low-dimensional representations, or embeddings, of a graph's nodes facilitate several practical data science and data engineering tasks. As such embeddings rely, explicitly or implicitly, on a similarity measure among nodes, they require the computation of a quadratic similarity matrix, inducing a tradeoff between space complexity and embedding quality. To date, no graph embedding work combines (i) linear space complexity, (ii) a nonlinear transform as its basis, and (iii) nontrivial quality guarantees. In this paper we introduce FREDE (FREquent Directions Embedding), a graph embedding based on matrix sketching that combines these three desiderata. Starting from the observation that embedding methods aim to preserve the covariance among the rows of a similarity matrix, FREDE iteratively improves embedding quality while individually processing rows of a nonlinearly transformed PPR similarity matrix derived from a state-of-the-art graph embedding method. At any iteration, it provides column-covariance approximation guarantees that become almost indistinguishable from those of the optimal SVD approximation. Our experimental evaluation on networks of varying size shows that FREDE performs almost as well as SVD, and competitively with state-of-the-art embedding methods, in diverse data science tasks, even when based on as little as 10% of node similarities.
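The sketching primitive underlying FREDE, Frequent Directions, is compact enough to show in full. The sketch below streams the rows of a random stand-in for the similarity matrix and checks the covariance guarantee; FREDE itself applies this to rows of a nonlinearly transformed PPR matrix, which is not reproduced here.

```python
import numpy as np

def frequent_directions(A, ell):
    """Frequent Directions sketching (Liberty, 2013): process the rows of
    A one at a time, keeping a 2*ell-row buffer; whenever it fills up,
    shrink all directions via SVD. The sketch B satisfies the covariance
    guarantee ||A^T A - B^T B||_2 <= ||A||_F^2 / ell, the kind of
    column-covariance bound that FREDE builds on."""
    n, d = A.shape
    B = np.zeros((2 * ell, d))
    for row in A:
        zero = np.where(~B.any(axis=1))[0]   # first empty buffer rows
        if len(zero) == 0:
            _, s, Vt = np.linalg.svd(B, full_matrices=False)
            # Shrink by the ell-th squared singular value; at least half
            # of the buffer rows become zero again.
            s2 = np.maximum(s**2 - s[ell - 1] ** 2, 0.0)
            B = np.sqrt(s2)[:, None] * Vt
            zero = np.where(~B.any(axis=1))[0]
        B[zero[0]] = row
    return B

rng = np.random.default_rng(7)
A = rng.normal(size=(500, 40))   # stand-in for a PPR-derived similarity matrix
B = frequent_directions(A, ell=10)
err = np.linalg.norm(A.T @ A - B.T @ B, 2)
print(err <= np.linalg.norm(A) ** 2 / 10)   # guarantee holds: True
```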


Author(s): Junliang Guo, Linli Xu, Jingchang Liu

Recent advances in network embedding have shown that low-dimensional network representations play a critical role in network analysis. Most existing network embedding methods encode the local proximity of a node, such as its first- and second-order proximities. While efficient, these methods fall short of leveraging the global structural information between nodes distant from each other. In addition, most existing methods learn embeddings on a single fixed network and thus cannot generalize to unseen nodes or networks without retraining. In this paper we present SPINE, a method that jointly captures local proximity and proximities at any distance, while being inductive so as to efficiently handle unseen nodes or networks. Extensive experimental results on benchmark datasets demonstrate the superiority of the proposed framework over the state of the art.
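The inductive aspect can be illustrated independently of SPINE's specific architecture: if a node's embedding is a function of its own features and an aggregate of its neighbours' features, an unseen node can be embedded without retraining. A minimal sketch with made-up, untrained weights:

```python
import numpy as np

rng = np.random.default_rng(8)
n, f, d = 7, 12, 6
X = rng.normal(size=(n, f))              # content features of existing nodes
W = rng.normal(size=(2 * f, d)) * 0.1    # stand-in for a trained weight matrix

def inductive_embed(x_self, neigh_feats, W):
    """Embed a node from its own features plus a mean-aggregate of its
    neighbours' features, so no retraining is needed for unseen nodes
    (the general idea behind inductive methods; not SPINE's exact model)."""
    agg = neigh_feats.mean(axis=0)
    return np.tanh(np.concatenate([x_self, agg]) @ W)

# A brand-new node with two neighbours from the existing graph:
x_new = rng.normal(size=f)
print(inductive_embed(x_new, X[[0, 3]], W).round(3))
```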


2020 · Vol 34 (04) · pp. 4091-4098
Author(s): Tao He, Lianli Gao, Jingkuan Song, Xin Wang, Kejie Huang, et al.

Learning accurate low-dimensional embeddings for a network is a crucial task, as it facilitates many network analytics tasks. Moreover, trained embeddings often require a significant amount of space to store, making storage and processing a challenge, especially as large-scale networks become more prevalent. In this paper, we present a novel semi-supervised network embedding and compression method, SNEQ, that is competitive with state-of-the-art embedding methods while being far more space- and time-efficient. SNEQ incorporates a novel quantisation method based on a self-attention layer that is trained end-to-end and can dramatically compress the size of the trained embeddings, thus reducing the storage footprint and accelerating retrieval. Our evaluation on four real-world networks of diverse characteristics shows that SNEQ outperforms a number of state-of-the-art embedding methods in link prediction, node classification and node recommendation. Moreover, the quantised embeddings show a great advantage over continuous embeddings, as well as hashing methods, in terms of storage and time.
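To see why quantisation saves so much space, here is a product-quantisation sketch using naive codebooks; SNEQ instead trains its quantiser end-to-end with a self-attention layer, so this shows only the storage arithmetic, not the paper's method.

```python
import numpy as np

rng = np.random.default_rng(9)
n, d, m, k = 1000, 64, 8, 16  # n embeddings, d dims, m subspaces, k codes each
emb = rng.normal(size=(n, d)).astype(np.float32)

# Split each vector into m sub-vectors; per sub-vector, store only the
# index of the nearest of k codewords (codebooks picked naively here).
sub = d // m
codes = np.empty((n, m), dtype=np.uint8)
books = np.empty((m, k, sub), dtype=np.float32)
for j in range(m):
    chunk = emb[:, j * sub:(j + 1) * sub]
    books[j] = chunk[rng.choice(n, k, replace=False)]   # naive codebook
    dists = ((chunk[:, None, :] - books[j][None, :, :]) ** 2).sum(-1)
    codes[:, j] = dists.argmin(1)

# Reconstruct from codes and compare storage: indices + codebooks vs floats.
recon = np.stack([books[j][codes[:, j]] for j in range(m)], 1).reshape(n, d)
ratio = emb.nbytes / (codes.nbytes + books.nbytes)
print(f"compression ~{ratio:.0f}x, reconstruction MSE {np.mean((emb - recon) ** 2):.3f}")
```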

