Reinforcement Learning Based Meta-Path Discovery in Large-Scale Heterogeneous Information Networks

2020 ◽  
Vol 34 (04) ◽  
pp. 6094-6101
Author(s):  
Guojia Wan ◽  
Bo Du ◽  
Shirui Pan ◽  
Gholamreza Haffari

Meta-paths are important tools for a wide variety of data mining and network analysis tasks in Heterogeneous Information Networks (HINs), owing to their flexibility and interpretability in capturing the complex semantic relations among objects. To date, most HIN analysis still relies on hand-crafted meta-paths, which require rich domain knowledge that is extremely difficult to obtain in complex, large-scale, and schema-rich HINs. In this work, we present a novel framework, Meta-path Discovery with Reinforcement Learning (MPDRL), to identify informative meta-paths from complex and large-scale HINs. To capture different semantic information between objects, we propose a novel multi-hop reasoning strategy in a reinforcement learning framework, which aims to infer the next promising relation that links a source entity to a target entity. To improve efficiency, we also develop a type context representation embedding approach that scales the RL framework to million-scale HINs. As multi-hop reasoning generates rich meta-paths of various lengths, we further perform a meta-path induction step to summarize the important meta-paths using the Lowest Common Ancestor principle. Experimental results on two large-scale HINs, Yago and NELL, validate our approach and demonstrate that our algorithm not only achieves superior performance in the link prediction task, but also identifies useful meta-paths that would have been ignored by human experts.
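The abstract does not spell out how the induction step works, so the following is only a minimal sketch of one plausible reading of the Lowest Common Ancestor idea: type sequences of equal length are generalized position by position to the lowest type that covers all of them. The type hierarchy `PARENT` and the function names are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Optional

# Hypothetical toy type hierarchy: child type -> parent type (the root maps to None).
PARENT: Dict[str, Optional[str]] = {
    "Entity": None,
    "Person": "Entity",
    "Organization": "Entity",
    "Scientist": "Person",
    "Artist": "Person",
    "University": "Organization",
}


def ancestors(t: str) -> List[str]:
    """Return the chain from type t up to the root, starting with t itself."""
    chain: List[str] = []
    cur: Optional[str] = t
    while cur is not None:
        chain.append(cur)
        cur = PARENT[cur]
    return chain


def lca(a: str, b: str) -> str:
    """Lowest common ancestor of two types in the toy hierarchy."""
    seen = set(ancestors(a))
    for t in ancestors(b):  # walks upward from b, so the first hit is the lowest
        if t in seen:
            return t
    raise ValueError("types share no ancestor")


def induce_meta_path(type_paths: List[List[str]]) -> List[str]:
    """Generalize equal-length type paths position by position via LCA."""
    result = list(type_paths[0])
    for path in type_paths[1:]:
        if len(path) != len(result):
            raise ValueError("paths must have equal length")
        result = [lca(x, y) for x, y in zip(result, path)]
    return result
```

Under these assumptions, the two paths `["Scientist", "University"]` and `["Artist", "University"]` would be summarized as `["Person", "University"]`.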

Author(s):  
Phuc Do

The meta-path is an important concept in heterogeneous information networks (HINs). Meta-paths have been used in many tasks such as information retrieval, decision making, and product recommendation. Normally, meta-paths are proposed by human experts. Recent works on meta-path discovery have proposed in-memory solutions that fit on one computer. With large HINs, however, the whole HIN cannot be loaded into memory. In this chapter, the authors propose distributed algorithms to discover meta-paths of large HINs in the cloud. They develop distributed algorithms to discover the significant meta-path, the maximal significant meta-path, and the top-k meta-paths between two vertices of an HIN. Calculating the support of meta-paths or performing breadth-first search can be computationally costly in very large HINs. The distributed algorithms therefore use the GraphFrames library of Apache Spark in a cloud computing environment to query large HINs efficiently. The authors conduct experiments on a large DBLP dataset to demonstrate the performance of their algorithms in the cloud.
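To make the idea of counting meta-paths between two vertices concrete, here is a minimal single-machine sketch using breadth-first search. It is not the chapter's distributed GraphFrames algorithm; the toy graph, the use of a raw instance count as "support", and the function name are all assumptions for illustration.

```python
from collections import Counter, deque
from typing import Dict, List, Tuple

# Hypothetical toy HIN: node -> type, plus an undirected adjacency list.
NODE_TYPE: Dict[str, str] = {
    "a1": "Author", "a2": "Author",
    "p1": "Paper", "p2": "Paper",
    "v1": "Venue",
}
EDGES: Dict[str, List[str]] = {
    "a1": ["p1", "p2"],
    "a2": ["p1", "p2"],
    "p1": ["a1", "a2", "v1"],
    "p2": ["a1", "a2", "v1"],
    "v1": ["p1", "p2"],
}


def meta_paths_between(src: str, dst: str, max_len: int) -> Counter:
    """Count simple paths from src to dst (at most max_len edges) per type sequence."""
    counts: Counter = Counter()
    queue = deque([(src,)])
    while queue:
        path: Tuple[str, ...] = queue.popleft()
        node = path[-1]
        if node == dst and len(path) > 1:
            # Record the meta-path (sequence of node types) this instance follows.
            counts[tuple(NODE_TYPE[n] for n in path)] += 1
            continue
        if len(path) - 1 >= max_len:
            continue  # path already uses max_len edges; do not extend it
        for nxt in EDGES[node]:
            if nxt not in path:  # keep paths simple (no repeated nodes)
                queue.append(path + (nxt,))
    return counts
```

With this sketch, `meta_paths_between("a1", "a2", 4).most_common(k)` would give a naive top-k ranking by instance count. Enumerating paths this way grows quickly with graph size, which is the cost that motivates a distributed approach.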


Author(s):  
Yang Fang ◽  
Xiang Zhao ◽  
Zhen Tan

In this paper, we propose a novel network representation learning model, TransPath, to encode heterogeneous information networks (HINs). Traditional network representation learning models aim to learn the embeddings of a homogeneous network. TransPath is able to capture the rich semantic and structural information of an HIN via meta-paths. We take advantage of the translation mechanism used in knowledge graphs, regarding a meta-path, instead of an edge, as a translating operation from the first node to the last node. Moreover, we propose a user-guided meta-path sampling strategy that takes users' preferences as guidance; this explores the semantics of a path more precisely and improves model efficiency by avoiding noisy and meaningless meta-paths. We evaluate our model on two large-scale real-world datasets, DBLP and YELP, and on two benchmark tasks, similarity search and node classification. We observe that TransPath outperforms other state-of-the-art baselines consistently and significantly.
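The translation mechanism can be illustrated with a TransE-style distance in which a meta-path vector plays the role usually played by a relation vector. The abstract does not give TransPath's actual objective, so the Euclidean distance and the function name below are assumptions, offered as a sketch only.

```python
import math
from typing import Sequence


def translation_distance(
    head: Sequence[float], path_vec: Sequence[float], tail: Sequence[float]
) -> float:
    """Euclidean distance ||head + path_vec - tail||; smaller means a better fit."""
    if not (len(head) == len(path_vec) == len(tail)):
        raise ValueError("all vectors must have the same dimension")
    return math.sqrt(sum((h + p - t) ** 2 for h, p, t in zip(head, path_vec, tail)))


# Hypothetical 2-d embeddings: the path vector translates the first node
# exactly onto the matching last node, and away from the non-matching one.
first_node = [1.0, 0.0]
meta_path = [0.0, 1.0]
good_last_node = [1.0, 1.0]
bad_last_node = [0.0, 0.0]

good = translation_distance(first_node, meta_path, good_last_node)
bad = translation_distance(first_node, meta_path, bad_last_node)
```

In a trained model of this kind, node pairs actually connected by the meta-path would be expected to have a smaller distance than unconnected pairs.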


2021 ◽  
pp. 016555152110474
Author(s):  
Weiwei Deng ◽  
Wei Du ◽  
Cong Han

Communities of interest promote knowledge sharing and discovery on social network platforms. However, platform users have difficulty finding suitable communities, given their increasing number. Although recommendation methods have been proposed to help users find communities of interest, these methods ignore or exclude heterogeneous interactions between users and communities. In addition, widely used meta-paths help capture the complex semantic relations among entities but rely heavily on domain knowledge. In this study, we propose a novel recommendation model based on informative meta-path discovery in heterogeneous information networks and deep learning. Users, communities, relevant items and their relations are considered as entities in a heterogeneous information network, from which informative meta-paths are extracted on the basis of information theory to measure user-community similarities. Finally, the similarities are incorporated in a deep learning model to predict whether target users join candidate communities. The proposed recommendation model is evaluated and compared against baseline methods using two data sets. Results demonstrate the superior performance of the proposed model in terms of precision, recall and F score.


2016 ◽  
Vol 13 (10) ◽  
pp. 6747-6753
Author(s):  
Pingjian Ding ◽  
Xiangtao Chen ◽  
Zipin Guan

The goal of inductive classification approaches is to infer the correct mapping from test set to labels, while the goal of transductive inference is to predict the correct labels for the given unlabeled data. Hence, newly added unlabeled samples cannot be classified by transductive classification. In this paper, we focus on inductive classification problems in heterogeneous networks, which involve multiple types of objects interconnected by multiple types of links. Moreover, the objects and the links gradually increase over time. To accommodate the characteristics of heterogeneous networks, a meta-path-based heterogeneous inductive classification (Hic) method is proposed. First, different sub-networks are constructed according to the selected meta-path. Second, the characteristic paths of each sub-network are extracted via a specified minimum support and assigned appropriate weights. Then, the Hic model is built on the characteristic paths. Finally, the Hic scores of each classification label for each test sample are calculated via the links between test samples and sub-networks. Experiments on the DBLP dataset show that the proposed method significantly improves accuracy and stability over existing state-of-the-art methods for classification in dynamic heterogeneous networks.
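The second step, extracting characteristic paths by minimum support and weighting them, might be sketched as follows. The abstract does not define the weighting scheme, so the relative-frequency weights used here are an assumption, as are the function name and the example paths.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Path = Tuple[str, ...]


def characteristic_paths(paths: Iterable[Path], min_support: int) -> Dict[Path, float]:
    """Keep paths seen at least min_support times; weight by relative frequency.

    The weighting is an assumed placeholder, not the scheme used by Hic.
    """
    counts = Counter(paths)
    kept = {p: c for p, c in counts.items() if c >= min_support}
    total = sum(kept.values())
    # If nothing passes the threshold, kept is empty and no division occurs.
    return {p: c / total for p, c in kept.items()}


# Hypothetical path instances observed in one sub-network.
observed = [("Author", "Paper", "Author")] * 3 + [("Author", "Paper", "Venue")]
strict = characteristic_paths(observed, min_support=2)
loose = characteristic_paths(observed, min_support=1)
```

Raising `min_support` prunes rare paths, which trades coverage for more stable weights on the paths that remain.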

