Representation Learning of Knowledge Graphs with Embedding Subspaces

2020 ◽  
Vol 2020 ◽  
pp. 1-10 ◽  
Author(s):  
Chunhua Li ◽  
Xuefeng Xian ◽  
Xusheng Ai ◽  
Zhiming Cui

Most existing knowledge graph embedding models are supervised and rely heavily on the quality and quantity of available labelled training data. High-quality triples are costly to obtain, and the data sources suffer from severe sparsity, which may leave long-tail entities insufficiently trained. Unstructured text that encodes entities and relational knowledge, however, is available in large quantities. Word vectors of entity names, estimated from unlabelled raw text with a natural language model, encode syntactic and semantic properties of entities. Yet because these feature vectors are estimated by minimizing prediction error on unlabelled entity names, they may not be optimal for knowledge graphs. We propose a two-phase approach that adapts unsupervised entity name embeddings to a knowledge graph subspace and jointly learns the adaptive matrix and the knowledge representation. Experiments on Freebase show that our method relies less on labelled data and outperforms the baselines when labelled data is scarce. In particular, it is applicable to the zero-shot scenario.
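The adaptation idea in this abstract can be illustrated with a minimal sketch. It assumes a translation-style distance (head + relation close to tail), which the abstract does not state, and it uses made-up toy vectors; the names `project`, `triple_distance`, `M`, `paris`, `france` and `capital_of` are hypothetical. Joint training of the matrix is omitted. The sketch only shows how a pre-trained name vector might be mapped into a knowledge graph subspace and scored.

```python
import math

def project(matrix, vector):
    """Map a pre-trained name embedding into the KG subspace: e = M w."""
    return [sum(m * w for m, w in zip(row, vector)) for row in matrix]

def translation_distance(head, relation, tail):
    """L2 norm of (head + relation - tail); smaller means more plausible.

    The translation-style distance is an assumption, not taken from the abstract.
    """
    return math.sqrt(sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail)))

def triple_distance(matrix, head_name_vec, relation_vec, tail_name_vec):
    """Distance of a triple whose entities are given only by name embeddings."""
    head = project(matrix, head_name_vec)
    tail = project(matrix, tail_name_vec)
    return translation_distance(head, relation_vec, tail)

# Hypothetical toy values: 3-d name embeddings, 2-d KG subspace.
M = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
paris = [0.0, 1.0, 5.0]
france = [1.0, 1.0, -2.0]
capital_of = [1.0, 0.0]

print(triple_distance(M, paris, capital_of, france))   # correct direction
print(triple_distance(M, france, capital_of, paris))   # reversed direction
```

Because an entity enters the score only through its name vector and the shared matrix, an entity never seen in any training triple can still be scored, which is one way to read the zero-shot claim.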

Author(s):  
Jian Sun ◽  
Yu Zhou ◽  
Chengqing Zong

Relation learning between two entities is an essential task in knowledge graph (KG) completion and has received much attention recently. Previous work focused almost exclusively on relations that are widely seen in the original KGs, for which enough training data are available for modeling. However, long-tail relations that appear in only a few triples are actually much more common in practical KGs. Without sufficiently large training data, the performance of existing models on predicting long-tail relations drops sharply. This work aims to predict the relation under a challenging setting where only one instance is available for training. We propose a path-based one-shot relation prediction framework, which extracts neighborhood information of an entity using a relation query attention mechanism to learn knowledge that transfers among instances of the same relation. At the same time, to reduce the impact of long-tail entities on relation prediction, we selectively fuse path information between entity pairs as auxiliary information for relation features. Experiments on three one-shot relation learning datasets show that our proposed framework substantially outperforms existing models on one-shot link prediction and relation prediction.


Author(s):  
Minh Pham ◽  
Craig A. Knoblock ◽  
Muhao Chen ◽  
Binh Vu ◽  
Jay Pujara

Error detection is one of the most important steps in data cleaning and usually requires extensive human interaction to ensure quality. Existing supervised methods in error detection require a significant amount of training data while unsupervised methods rely on fixed inductive biases, which are usually hard to generalize, to solve the problem. In this paper, we present SPADE, a novel semi-supervised probabilistic approach for error detection. SPADE introduces a novel probabilistic active learning model, where the system suggests examples to be labeled based on the agreements between user labels and indicative signals, which are designed to capture potential errors. SPADE uses a two-phase data augmentation process to enrich a dataset before training a deep learning classifier to detect unlabeled errors. In our evaluation, SPADE achieves an average F1-score of 0.91 over five datasets and yields a 10% improvement compared with the state-of-the-art systems.


Author(s):  
Yuhan Wang ◽  
Weidong Xiao ◽  
Zhen Tan ◽  
Xiang Zhao

Knowledge graphs are typical multi-relational structures consisting of many entities and relations. Nonetheless, existing knowledge graphs are still sparse and far from complete. To refine knowledge graphs, representation learning is used to embed entities and relations into low-dimensional spaces. Many existing knowledge graph embedding models focus on learning latent features under the closed-world assumption but overlook the changing nature of each knowledge graph. In this paper, we propose a knowledge graph representation learning model, called Caps-OWKG, which leverages a capsule network to capture the features of both known and unknown triplets in open-world knowledge graphs. It combines descriptive text with the knowledge graph to obtain descriptive and structural embeddings simultaneously. Both embeddings are then used to calculate the probability that a triplet is true. We verify the performance of Caps-OWKG on the link prediction task with two common datasets, FB15k-237-OWE and DBPedia50k. The experimental results surpass the baselines and achieve state-of-the-art performance.


Author(s):  
Yuan Sun ◽  
Andong Chen ◽  
Chaofan Chen ◽  
Tianci Xia ◽  
Xiaobing Zhao

Learning the representation of a knowledge graph is critical to the field of natural language processing. There is much research on English knowledge graph representation. However, for low-resource languages such as Tibetan, how to represent sparse knowledge graphs is a key problem. In this article, to address the scarcity of Tibetan knowledge graphs, we extend the Tibetan knowledge graph using triples from high-resource language knowledge graphs and Point of Information map information. To improve representation learning of the Tibetan knowledge graph, we propose a joint model that merges structure and entity description information, based on the Translating Embeddings and Convolutional Neural Network models. In addition, to mitigate segmentation errors, we use character and word embeddings to learn more complex information in Tibetan. Finally, the experimental results show that our model represents the Tibetan knowledge graph better than the baseline.


Author(s):  
Zequn Sun ◽  
Wei Hu ◽  
Qingheng Zhang ◽  
Yuzhong Qu

Embedding-based entity alignment represents different knowledge graphs (KGs) as low-dimensional embeddings and finds entity alignment by measuring the similarities between entity embeddings. Existing approaches have achieved promising results; however, they are still challenged by the lack of sufficient prior alignment to serve as labeled training data. In this paper, we propose a bootstrapping approach to embedding-based entity alignment. It iteratively labels likely entity alignment as training data for learning alignment-oriented KG embeddings. Furthermore, it employs an alignment editing method to reduce error accumulation during iterations. Our experiments on real-world datasets showed that the proposed approach significantly outperformed the state-of-the-art embedding-based ones for entity alignment. The proposed alignment-oriented KG embedding, bootstrapping process and alignment editing method all contributed to the performance improvement.
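One labeling round of such a bootstrapping loop might look like the sketch below. The abstract does not give the labeling or editing rules, so cosine similarity, the fixed `threshold`, and the greedy one-to-one conflict resolution are all assumptions, and `bootstrap_step` is a hypothetical name. Retraining the embeddings between rounds is left out.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def bootstrap_step(emb1, emb2, labeled, threshold=0.9):
    """One labeling round: add likely alignments and edit conflicting ones.

    emb1, emb2: dict mapping entity name -> embedding, one dict per KG.
    labeled: dict mapping a KG1 entity to (KG2 entity, similarity).
    Returns a new dict; the input is not modified.
    """
    updated = dict(labeled)
    # Score all cross-KG pairs, most similar first.
    candidates = sorted(
        ((cosine(v1, v2), e1, e2)
         for e1, v1 in emb1.items() for e2, v2 in emb2.items()),
        reverse=True,
    )
    for sim, e1, e2 in candidates:
        if sim < threshold:
            break  # remaining pairs are even less similar
        current = updated.get(e1)
        if current is not None and current[1] >= sim:
            continue  # keep the existing, more confident label for e1
        owner = next((k for k, (t, _) in updated.items() if t == e2), None)
        if owner is not None and owner != e1:
            if updated[owner][1] >= sim:
                continue  # e2 is already claimed with higher confidence
            del updated[owner]  # editing: drop the weaker conflicting label
        updated[e1] = (e2, sim)
    return updated

# Hypothetical toy embeddings for two small KGs.
kg1 = {"a1": [1.0, 0.0], "b1": [0.0, 1.0]}
kg2 = {"a2": [1.0, 0.1], "b2": [0.1, 1.0], "c2": [-1.0, -1.0]}
print(bootstrap_step(kg1, kg2, labeled={}))
```

The conflict check is a simple stand-in for alignment editing: a label from an earlier round is replaced when a more similar pair claims the same entity, so an early mistake does not have to persist through later iterations.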


Author(s):  
Sebastijan Dumancic ◽  
Alberto Garcia-Duran ◽  
Mathias Niepert

Many real-world domains can be expressed as graphs and, more generally, as multi-relational knowledge graphs. Though reasoning and learning with knowledge graphs have traditionally been addressed by symbolic approaches such as statistical relational learning, recent methods in (deep) representation learning have shown promising results for specialised tasks such as knowledge base completion. These approaches, also known as distributional, abandon the traditional symbolic paradigm by replacing symbols with vectors in Euclidean space. With few exceptions, symbolic and distributional approaches are explored in different communities and little is known about their respective strengths and weaknesses. In this work, we compare distributional and symbolic relational learning approaches on various standard relational classification and knowledge base completion tasks. Furthermore, we analyse the properties of the datasets and relate them to the performance of the methods in the comparison. The results reveal possible indicators that could help in choosing one approach over the other for particular knowledge graphs.


2020 ◽  
Vol 2 (2) ◽  
Author(s):  
Suzanna Schmeelk ◽  
Lixin Tao

Many organizations, to save costs, are moving to the Bring Your Own Mobile Device (BYOD) model and adopting applications built by third parties at an unprecedented rate. Our research examines software assurance methodologies, focusing specifically on the security analysis coverage of program analysis for mobile malware detection, mitigation, and prevention. This research focuses on secure software development of Android applications by developing knowledge graphs for threats reported by the Open Web Application Security Project (OWASP). OWASP maintains lists of the top ten security threats to web and mobile applications. We develop knowledge graphs based on the two most recent top ten threat years and show how the knowledge graph relationships can be discovered in mobile application source code. We analyze 200+ healthcare applications from GitHub to understand their software assurance with respect to one of the OWASP top ten mobile threats, the threat of “Insecure Data Storage.” We find that many of the applications store personally identifying information (PII) in potentially vulnerable places, leaving users exposed to a higher risk of losing their sensitive data.


Author(s):  
Bo Wang ◽  
Tao Shen ◽  
Guodong Long ◽  
Tianyi Zhou ◽  
Ying Wang ◽  
...  

Electronics ◽  
2021 ◽  
Vol 10 (12) ◽  
pp. 1407
Author(s):  
Peng Wang ◽  
Jing Zhou ◽  
Yuzhang Liu ◽  
Xingchen Zhou

Knowledge graph embedding aims to embed entities and relations into low-dimensional vector spaces. Most existing methods focus only on triple facts in knowledge graphs. In addition, models based on translation or distance measurement cannot fully represent complex relations. As well-constructed prior knowledge, entity types can be employed to learn the representations of entities and relations. In this paper, we propose a novel knowledge graph embedding model named TransET, which takes advantage of entity types to learn more semantic features. More specifically, circle convolution based on the embeddings of entities and entity types is used to map the head entity and tail entity to type-specific representations, and a translation-based score function is then used to learn the representations of triples. We evaluated our model on real-world datasets with two benchmark tasks, link prediction and triple classification. Experimental results demonstrate that it outperforms state-of-the-art models in most cases.
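The two steps named in this abstract (a type-specific mapping followed by a translation-based score) can be sketched as follows. The sketch reads "circle convolution" as the standard one-dimensional circular convolution, which is an assumption; the paper may define the operation differently. The function names and toy vectors are hypothetical.

```python
import math

def circular_convolution(a, b):
    """[a * b]_k = sum_i a_i * b_{(k - i) mod d} for equal-length vectors."""
    d = len(a)
    return [sum(a[i] * b[(k - i) % d] for i in range(d)) for k in range(d)]

def score(head, head_type, relation, tail, tail_type):
    """Negative L2 distance between (typed head + relation) and typed tail.

    Higher (closer to zero) means the triple is judged more plausible.
    """
    h = circular_convolution(head, head_type)   # type-specific head
    t = circular_convolution(tail, tail_type)   # type-specific tail
    return -math.sqrt(sum((x + r - y) ** 2 for x, r, y in zip(h, relation, t)))

# Hypothetical toy vectors. A type embedding of [1, 0, 0] leaves the entity
# unchanged, and [0, 1, 0] rotates its components by one position.
identity_type = [1, 0, 0]
shift_type = [0, 1, 0]
print(circular_convolution([1, 2, 3], identity_type))
print(circular_convolution([1, 2, 3], shift_type))
print(score([1, 0, 0], identity_type, [0, 1, 0], [1, 1, 0], identity_type))
```

Under this reading, the same entity embedding yields different representations under different types, which is what allows the translation score to separate triples that a plain distance model would conflate.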

