scholarly journals Knowledge Graph Representation with Jointly Structural and Textual Encoding

Author(s):  
Jiacheng Xu ◽  
Xipeng Qiu ◽  
Kan Chen ◽  
Xuanjing Huang

The objective of knowledge graph embedding is to encode both entities and relations of knowledge graphs into continuous low-dimensional vector spaces. Previously, most works focused on symbolic representation of knowledge graph with structure information, which can not handle new entities or entities with few facts well. In this paper, we propose a novel deep architecture to utilize both structural and textual information of entities. Specifically, we introduce three neural models to encode the valuable information from text description of entity, among which an attentive model can select related information as needed. Then, a gating mechanism is applied to integrate representations of structure and text into a unified architecture. Experiments show that our models outperform baseline and obtain state-of-the-art results on link prediction and triplet classification tasks.

Author(s):  
Yuhan Wang ◽  
Weidong Xiao ◽  
Zhen Tan ◽  
Xiang Zhao

AbstractKnowledge graphs are typical multi-relational structures, which is consisted of many entities and relations. Nonetheless, existing knowledge graphs are still sparse and far from being complete. To refine the knowledge graphs, representation learning is utilized to embed entities and relations into low-dimensional spaces. Many existing knowledge graphs embedding models focus on learning latent features in close-world assumption but omit the changeable of each knowledge graph.In this paper, we propose a knowledge graph representation learning model, called Caps-OWKG, which leverages the capsule network to capture the both known and unknown triplets features in open-world knowledge graph. It combines the descriptive text and knowledge graph to get descriptive embedding and structural embedding, simultaneously. Then, the both above embeddings are used to calculate the probability of triplet authenticity. We verify the performance of Caps-OWKG on link prediction task with two common datasets FB15k-237-OWE and DBPedia50k. The experimental results are better than other baselines, and achieve the state-of-the-art performance.


2019 ◽  
Vol 8 (6) ◽  
pp. 254 ◽  
Author(s):  
Peiyuan Qiu ◽  
Jialiang Gao ◽  
Li Yu ◽  
Feng Lu

A Geographic Knowledge Graph (GeoKG) links geographic relation triplets into a large-scale semantic network utilizing the semantic of geo-entities and geo-relations. Unfortunately, the sparsity of geo-related information distribution on the web leads to a situation where information extraction systems can hardly detect enough references of geographic information in the massive web resource to be able to build relatively complete GeoKGs. This incompleteness, due to missing geo-entities or geo-relations in GeoKG fact triplets, seriously impacts the performance of GeoKG applications. In this paper, a method with geospatial distance restriction is presented to optimize knowledge embedding for GeoKG completion. This method aims to encode both the semantic information and geospatial distance restriction of geo-entities and geo-relations into a continuous, low-dimensional vector space. Then, the missing facts of the GeoKG can be supplemented through vector operations. Specifically, the geospatial distance restriction is realized as the weights of the objective functions of current translation knowledge embedding models. These optimized models output the optimized representations of geo-entities and geo-relations for the GeoKG’s completion. The effects of the presented method are validated with a real GeoKG. Compared with the results of the original models, the presented method improves the metric Hits@10(Filter) by an average of 6.41% for geo-entity prediction, and the Hits@1(Filter) by an average of 31.92%, for geo-relation prediction. Furthermore, the capacity of the proposed method to predict the locations of unknown entities is validated. The results show the geospatial distance restriction reduced the average error distance of prediction by between 54.43% and 57.24%. All the results support the geospatial distance restriction hiding in the GeoKG contributing to refining the embedding representations of geo-entities and geo-relations, which plays a crucial role in improving the quality of GeoKG completion.


2018 ◽  
Vol 2018 ◽  
pp. 1-12
Author(s):  
Zhen Tan ◽  
Xiang Zhao ◽  
Yang Fang ◽  
Bin Ge ◽  
Weidong Xiao

Knowledge graph, a typical multi-relational structure, includes large-scale facts of the world, yet it is still far away from completeness. Knowledge graph embedding, as a representation method, constructs a low-dimensional and continuous space to describe the latent semantic information and predict the missing facts. Among various solutions, almost all embedding models have high time and memory-space complexities and, hence, are difficult to apply to large-scale knowledge graphs. Some other embedding models, such as TransE and DistMult, although with lower complexity, ignore inherent features and only use correlations between different entities to represent the features of each entity. To overcome these shortcomings, we present a novel low-complexity embedding model, namely, SimE-ER, to calculate the similarity of entities in independent and associated spaces. In SimE-ER, each entity (relation) is described as two parts. The entity (relation) features in independent space are represented by the features entity (relation) intrinsically owns and, in associated space, the entity (relation) features are expressed by the entity (relation) features they connect. And the similarity between the embeddings of the same entities in different representation spaces is high. In experiments, we evaluate our model with two typical tasks: entity prediction and relation prediction. Compared with the state-of-the-art models, our experimental results demonstrate that SimE-ER outperforms existing competitors and has low time and memory-space complexities.


Symmetry ◽  
2019 ◽  
Vol 11 (3) ◽  
pp. 392 ◽  
Author(s):  
Zhiying Cao ◽  
Xinghao Qiao ◽  
Shuo Jiang ◽  
Xiuguo Zhang

Using semantic information can help to accurately find suitable services from a variety of available (different semantics) services, and the semantic information of Web services can be described in detail in a Web service knowledge graph. In this paper, a Web service recommendation algorithm based on knowledge graph representation learning (kg-WSR) is proposed. The algorithm embeds the entities and relationships of the knowledge graph into the low-dimensional vector space. By calculating the distance between service entities in low-dimensional space, the relationship information of services which is not considered in recommendation approaches using a collaborative filtering algorithm is incorporated into the recommendation algorithm to enhance the accurateness of the result. The experimental results show that this algorithm can not only effectively improve the accuracy rate, recall rate, and coverage rate of recommendation but also solve the cold start problem to some extent.


Author(s):  
Pengyang Wang ◽  
Yanjie Fu ◽  
Yuanchun Zhou ◽  
Kunpeng Liu ◽  
Xiaolin Li ◽  
...  

In this paper, we design and evaluate a new substructure-aware Graph Representation Learning (GRL) approach. GRL aims to map graph structure information into low-dimensional representations. While extensive efforts have been made for modeling global and/or local structure information, GRL can be improved by substructure information. Some recent studies exploit adversarial learning to incorporate substructure awareness, but hindered by unstable convergence. This study will address the major research question: is there a better way to integrate substructure awareness into GRL? As subsets of the graph structure, interested substructures (i.e., subgraph) are unique and representative for differentiating graphs, leading to the high correlation between the representation of the graph-level structure and substructures. Since mutual information (MI) is to evaluate the mutual dependence between two variables, we develop a MI inducted substructure-aware GRL method. We decompose the GRL pipeline into two stages: (1) node-level, where we introduce to maximize MI between the original and learned representation by the intuition that the original and learned representation should be highly correlated; (2) graph-level, where we preserve substructures by maximizing MI between the graph-level structure and substructure representation. Finally, we present extensive experimental results to demonstrate the improved performances of our method with real-world data.


2021 ◽  
Vol 4 ◽  
Author(s):  
Linmei Hu ◽  
Mengmei Zhang ◽  
Shaohua Li ◽  
Jinghan Shi ◽  
Chuan Shi ◽  
...  

Knowledge Graphs (KGs) such as Freebase and YAGO have been widely adopted in a variety of NLP tasks. Representation learning of Knowledge Graphs (KGs) aims to map entities and relationships into a continuous low-dimensional vector space. Conventional KG embedding methods (such as TransE and ConvE) utilize only KG triplets and thus suffer from structure sparsity. Some recent works address this issue by incorporating auxiliary texts of entities, typically entity descriptions. However, these methods usually focus only on local consecutive word sequences, but seldom explicitly use global word co-occurrence information in a corpus. In this paper, we propose to model the whole auxiliary text corpus with a graph and present an end-to-end text-graph enhanced KG embedding model, named Teger. Specifically, we model the auxiliary texts with a heterogeneous entity-word graph (called text-graph), which entails both local and global semantic relationships among entities and words. We then apply graph convolutional networks to learn informative entity embeddings that aggregate high-order neighborhood information. These embeddings are further integrated with the KG triplet embeddings via a gating mechanism, thus enriching the KG representations and alleviating the inherent structure sparsity. Experiments on benchmark datasets show that our method significantly outperforms several state-of-the-art methods.


Genes ◽  
2021 ◽  
Vol 12 (7) ◽  
pp. 998
Author(s):  
Peng Zhang ◽  
Yi Bu ◽  
Peng Jiang ◽  
Xiaowen Shi ◽  
Bing Lun ◽  
...  

This study builds a coronavirus knowledge graph (KG) by merging two information sources. The first source is Analytical Graph (AG), which integrates more than 20 different public datasets related to drug discovery. The second source is CORD-19, a collection of published scientific articles related to COVID-19. We combined both chemo genomic entities in AG with entities extracted from CORD-19 to expand knowledge in the COVID-19 domain. Before populating KG with those entities, we perform entity disambiguation on CORD-19 collections using Wikidata. Our newly built KG contains at least 21,700 genes, 2500 diseases, 94,000 phenotypes, and other biological entities (e.g., compound, species, and cell lines). We define 27 relationship types and use them to label each edge in our KG. This research presents two cases to evaluate the KG’s usability: analyzing a subgraph (ego-centered network) from the angiotensin-converting enzyme (ACE) and revealing paths between biological entities (hydroxychloroquine and IL-6 receptor; chloroquine and STAT1). The ego-centered network captured information related to COVID-19. We also found significant COVID-19-related information in top-ranked paths with a depth of three based on our path evaluation.


Electronics ◽  
2021 ◽  
Vol 10 (12) ◽  
pp. 1407
Author(s):  
Peng Wang ◽  
Jing Zhou ◽  
Yuzhang Liu ◽  
Xingchen Zhou

Knowledge graph embedding aims to embed entities and relations into low-dimensional vector spaces. Most existing methods only focus on triple facts in knowledge graphs. In addition, models based on translation or distance measurement cannot fully represent complex relations. As well-constructed prior knowledge, entity types can be employed to learn the representations of entities and relations. In this paper, we propose a novel knowledge graph embedding model named TransET, which takes advantage of entity types to learn more semantic features. More specifically, circle convolution based on the embeddings of entity and entity types is utilized to map head entity and tail entity to type-specific representations, then translation-based score function is used to learn the presentation triples. We evaluated our model on real-world datasets with two benchmark tasks of link prediction and triple classification. Experimental results demonstrate that it outperforms state-of-the-art models in most cases.


2021 ◽  
Author(s):  
Erik Cambria ◽  
Shaoxiong Ji ◽  
Shirui Pan ◽  
Philip S. Yu

Sign in / Sign up

Export Citation Format

Share Document