Embedding Heterogeneous Information Network in Hyperbolic Spaces

2022 ◽  
Vol 16 (2) ◽  
pp. 1-23
Author(s):  
Yiding Zhang ◽  
Xiao Wang ◽  
Nian Liu ◽  
Chuan Shi

Heterogeneous information network (HIN) embedding, which aims to project an HIN into a low-dimensional space, has attracted considerable research attention. Most existing HIN embedding methods focus on preserving the inherent network structure and semantic correlations in Euclidean spaces. However, one fundamental question is whether Euclidean spaces are the intrinsic spaces of HINs. Recent research finds that complex networks with hyperbolic geometry can naturally reflect certain properties, e.g., hierarchical and power-law structure. In this article, we make an effort toward embedding HINs in hyperbolic spaces. We analyze the structures of three HINs and discover that some properties, e.g., the power-law distribution, also exist in HINs. Therefore, we propose a novel HIN embedding model, HHNE. Specifically, to capture the structure and semantic relations between nodes, HHNE employs meta-path guided random walks to sample sequences for each node. HHNE then uses the hyperbolic distance as the proximity measure. We also derive an effective optimization strategy to update the hyperbolic embeddings iteratively. Since HHNE optimizes different relations in a single space, we further propose an extended model, HHNE++. HHNE++ models different relations in different spaces, which enables it to learn complex interactions in HINs. An optimization strategy for HHNE++ is also derived to update its parameters in a principled manner. The experimental results demonstrate the effectiveness of our proposed models.
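The abstract names the hyperbolic distance as the proximity measure but does not give a formula. As a rough illustration only, the sketch below assumes the Poincaré ball model of hyperbolic space, where points lie strictly inside the unit ball; the function name `poincare_distance` and the `eps` guard are hypothetical choices for this example, not taken from the paper.

```python
import math

def poincare_distance(u, v, eps=1e-9):
    """Distance between two points of the Poincare ball (assumed model).

    u and v are equal-length sequences of floats with Euclidean norm < 1.
    """
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # Guard against division by a vanishing denominator near the boundary.
    denom = max((1.0 - sq_u) * (1.0 - sq_v), eps)
    return math.acosh(1.0 + 2.0 * sq_diff / denom)

# Distance from the origin to a point at Euclidean radius 0.5 is ln(3).
print(poincare_distance([0.0, 0.0], [0.5, 0.0]))
```

Under this model, two points at a fixed Euclidean separation grow farther apart in hyperbolic distance as they approach the boundary of the ball, which is one reason such spaces are considered a natural fit for hierarchical, tree-like structure.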

Author(s):  
Xiao Wang ◽  
Yiding Zhang ◽  
Chuan Shi

Heterogeneous information network (HIN) embedding, which aims to project an HIN into a low-dimensional space, has attracted considerable research attention. Most existing HIN embedding methods focus on preserving the inherent network structure and semantic correlations in Euclidean spaces. However, one fundamental question is whether Euclidean spaces are the appropriate or intrinsic isometric spaces of HINs. Recent research argues that complex networks may have an underlying hyperbolic geometry, because hyperbolic geometry can naturally reflect some properties of complex networks, e.g., hierarchical and power-law structure. In this paper, we make the first effort toward HIN embedding in hyperbolic spaces. We analyze the structures of two real-world HINs and discover that some properties, e.g., the power-law distribution, also exist in HINs. Therefore, we propose a novel hyperbolic heterogeneous information network embedding model. Specifically, to capture the structure and semantic relations between nodes, we employ meta-path guided random walks to sample sequences for each node. We then use the distance in hyperbolic spaces as the proximity measure. The hyperbolic distance satisfies the triangle inequality and preserves transitivity in HINs well. Our model encourages nodes and their neighborhoods to have small hyperbolic distances. We further derive an effective optimization strategy to update the hyperbolic embeddings iteratively. The experimental results, in comparison with the state of the art, demonstrate that our proposed model not only achieves superior performance on network reconstruction and link prediction tasks but also shows its ability to capture hierarchical structure in HINs via visualization.
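The sampling step mentioned above can be illustrated with a small sketch. This is not the authors' implementation: it assumes a symmetric meta-path (first and last node types equal, e.g. author-paper-author), uniform choice among type-matching neighbors, and a toy graph whose names (`ADJ`, `NODE_TYPE`, `metapath_walk`) are hypothetical.

```python
import random

# Toy HIN (hypothetical): two authors "a1", "a2" and two papers "p1", "p2".
NODE_TYPE = {"a1": "A", "a2": "A", "p1": "P", "p2": "P"}
ADJ = {
    "a1": ["p1"],
    "a2": ["p1", "p2"],
    "p1": ["a1", "a2"],
    "p2": ["a2"],
}

def metapath_walk(adj, node_type, start, metapath, walk_length, rng=None):
    """Sample one walk whose node types follow a symmetric meta-path.

    metapath is a list of node types such as ["A", "P", "A"]; the pattern
    is repeated until the walk reaches walk_length nodes or gets stuck.
    """
    if len(metapath) < 2 or metapath[0] != metapath[-1]:
        raise ValueError("meta-path must be symmetric, e.g. ['A', 'P', 'A']")
    if node_type[start] != metapath[0]:
        raise ValueError("start node type must match the meta-path head")
    rng = rng or random.Random()
    pattern = metapath[:-1]  # the last type coincides with the first
    walk = [start]
    while len(walk) < walk_length:
        wanted = pattern[len(walk) % len(pattern)]
        # Only neighbors of the type required by the meta-path are eligible.
        candidates = [n for n in adj[walk[-1]] if node_type[n] == wanted]
        if not candidates:
            break  # no neighbor of the required type; stop early
        walk.append(rng.choice(candidates))
    return walk

print(metapath_walk(ADJ, NODE_TYPE, "a1", ["A", "P", "A"], 7, random.Random(0)))
```

Walks sampled this way could then supply (node, neighbor) pairs to an embedding objective, with the meta-path restricting which semantic relations are treated as neighborhoods.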


2019 ◽  
Vol 17 (04) ◽  
pp. 1950020
Author(s):  
P. V. Sunil Kumar ◽  
G. Gopakumar

Recent findings from biological experiments demonstrate that long non-coding RNAs (lncRNAs) are actively involved in critical cellular processes and are associated with innumerable diseases. Computational prediction of lncRNA–disease associations currently draws tremendous research attention. This paper proposes a machine learning model that predicts lncRNA–disease associations using a Heterogeneous Information Network (HIN) of lncRNAs and diseases. A Support Vector Machine classifier is developed using the feature set extracted from a meta-path-based parameter, the Association Index, derived from the HIN. The performance of the model is validated using standard statistical metrics; it achieved an AUC value of 0.87, which is better than that of the existing methods in the literature. The results are further validated against recent literature, and many of the predicted lncRNA–disease associations are confirmed to exist. This paper also proposes an HIN-based methodology to associate lncRNAs with pathways in which they may have biological influence. A case study on the pathway associations of four well-known lncRNAs (HOTAIR, TUG1, NEAT1, and MALAT1) has been conducted. It has been observed that the same lncRNA is often associated with more than one biologically related pathway. Further exploration is needed to substantiate whether such lncRNAs have any role in determining the pathway interplay. The script and sample data for the model construction are freely available at http://bdbl.nitc.ac.in/LncDisPath/index.html .


2021 ◽  
Vol 25 (3) ◽  
pp. 711-738
Author(s):  
Phu Pham ◽  
Phuc Do

Link prediction on heterogeneous information networks (HINs) is considered a challenging problem due to the complexity and diversity of node and link types. Challenges remain in meta-path-based link prediction in HINs. Previous work on link prediction in HINs via network embedding has mainly focused on exploiting node features rather than the existing relations, in the form of meta-paths, between nodes. Predicting the existence of new links between non-linked nodes is hardly convincing when such relations are ignored. Moreover, recent HIN-based embedding models also lack thorough evaluation of the topic similarity between text-based nodes along given meta-paths. To tackle these challenges, in this paper we propose a novel topic-driven, multiple meta-path-based HIN representation learning framework, namely W-MMP2Vec. Our model improves the quality of node representations by combining multiple meta-paths and by calculating a topic similarity weight for each meta-path during network embedding learning in content-based HINs. To validate our approach, we apply the W-MMP2Vec model to several link prediction tasks in both content-based and non-content-based HINs (DBLP, IMDB and BlogCatalog). The experimental results demonstrate the effectiveness of the proposed model, which outperforms recent state-of-the-art HIN representation learning models.


Electronics ◽  
2021 ◽  
Vol 10 (14) ◽  
pp. 1671
Author(s):  
Jibing Gong ◽  
Cheng Wang ◽  
Zhiyong Zhao ◽  
Xinghao Zhang

In MOOCs, generally speaking, curriculum design, course selection, and knowledge concept recommendation are the three major steps that systematically guide users' learning. This paper focuses on knowledge concept recommendation in MOOCs, which recommends related topics to users to facilitate their online study. Existing approaches consider only the historical behaviors of users and ignore various kinds of auxiliary information that are also critical for user embedding. In addition, traditional recommendation models consider only the immediate user response to the recommended items and do not explicitly consider the long-term interests of users. To deal with these issues, this paper proposes AGMKRec, a novel reinforced concept recommendation model with a heterogeneous information network. We first formulate concept recommendation in MOOCs as a reinforcement learning problem to offer a personalized and dynamic knowledge concept label list to users. To incorporate more auxiliary information about users, we construct a heterogeneous information network among users, courses, and concepts, and use a meta-path-based method that can automatically identify useful meta-paths and multi-hop connections to learn a new graph structure for effective node representations. Comprehensive experiments and analyses on a real-world dataset collected from XuetangX show that our proposed model outperforms some state-of-the-art methods.

