Attributed Network Embedding with Micro-Meso Structure

2021 ◽  
Vol 15 (4) ◽  
pp. 1-26
Author(s):  
Juan-Hui Li ◽  
Ling Huang ◽  
Chang-Dong Wang ◽  
Dong Huang ◽  
Jian-Huang Lai ◽  
...  

Recently, network embedding has received a large amount of attention in network analysis. Although some network embedding methods have been developed from different perspectives, on one hand, most of the existing methods only focus on leveraging the plain network structure, ignoring the abundant attribute information of nodes. On the other hand, for some methods integrating the attribute information, only the lower-order proximities (e.g., microscopic proximity structure) are taken into account, which may suffer if there exists the sparsity issue and the attribute information is noisy. To overcome this problem, the attribute information and mesoscopic community structure are utilized. In this article, we propose a novel network embedding method termed Attributed Network Embedding with Micro-Meso structure, which is capable of preserving both the attribute information and the structural information including the microscopic proximity structure and mesoscopic community structure. In particular, both the microscopic proximity structure and node attributes are factorized by Nonnegative Matrix Factorization (NMF), from which the low-dimensional node representations can be obtained. For the mesoscopic community structure, a community membership strength matrix is inferred by a generative model (i.e., BigCLAM) or modularity from the linkage structure, which is then factorized by NMF to obtain the low-dimensional node representations. The three components are jointly correlated by the low-dimensional node representations, from which two objective functions (i.e., ANEM_B and ANEM_M) can be defined. Two efficient alternating optimization schemes are proposed to solve the optimization problems. Extensive experiments have been conducted to confirm the superior performance of the proposed models over the state-of-the-art network embedding methods.

Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Dong Liu ◽  
Yan Ru ◽  
Qinpeng Li ◽  
Shibin Wang ◽  
Jianwei Niu

Network embedding aims to learn the low-dimensional representations of nodes in networks. It preserves the structure and internal attributes of the networks while representing nodes as low-dimensional dense real-valued vectors. These vectors are used as inputs of machine learning algorithms for network analysis tasks such as node clustering, classification, link prediction, and network visualization. The network embedding algorithms, which considered the community structure, impose a higher level of constraint on the similarity of nodes, and they make the learned node embedding results more discriminative. However, the existing network representation learning algorithms are mostly unsupervised models; the pairwise constraint information, which represents community membership, is not effectively utilized to obtain node embedding results that are more consistent with prior knowledge. This paper proposes a semisupervised modularized nonnegative matrix factorization model, SMNMF, while preserving the community structure for network embedding; the pairwise constraints (must-link and cannot-link) information are effectively fused with the adjacency matrix and node similarity matrix of the network so that the node representations learned by the model are more interpretable. Experimental results on eight real network datasets show that, comparing with the representative network embedding methods, the node representations learned after incorporating the pairwise constraints can obtain higher accuracy in node clustering task and the results of link prediction, and network visualization tasks indicate that the semisupervised model SMNMF is more discriminative than unsupervised ones.


2021 ◽  
Vol 11 (5) ◽  
pp. 2371
Author(s):  
Junjian Zhan ◽  
Feng Li ◽  
Yang Wang ◽  
Daoyu Lin ◽  
Guangluan Xu

As most networks come with some content in each node, attributed network embedding has aroused much research interest. Most existing attributed network embedding methods aim at learning a fixed representation for each node encoding its local proximity. However, those methods usually neglect the global information between nodes distant from each other and distribution of the latent codes. We propose Structural Adversarial Variational Graph Auto-Encoder (SAVGAE), a novel framework which encodes the network structure and node content into low-dimensional embeddings. On one hand, our model captures the local proximity and proximities at any distance of a network by exploiting a high-order proximity indicator named Rooted Pagerank. On the other hand, our method learns the data distribution of each node representation while circumvents the side effect its sampling process causes on learning a robust embedding through adversarial training. On benchmark datasets, we demonstrate that our method performs competitively compared with state-of-the-art models.


Author(s):  
Yu Li ◽  
Ying Wang ◽  
Tingting Zhang ◽  
Jiawei Zhang ◽  
Yi Chang

Network embedding is an effective approach to learn the low-dimensional representations of vertices in networks, aiming to capture and preserve the structure and inherent properties of networks. The vast majority of existing network embedding methods exclusively focus on vertex proximity of networks, while ignoring the network internal community structure. However, the homophily principle indicates that vertices within the same community are more similar to each other than those from different communities, thus vertices within the same community should have similar vertex representations. Motivated by this, we propose a novel network embedding framework NECS to learn the Network Embedding with Community Structural information, which preserves the high-order proximity and incorporates the community structure in vertex representation learning. We formulate the problem into a principled optimization framework and provide an effective alternating algorithm to solve it. Extensive experimental results on several benchmark network datasets demonstrate the effectiveness of the proposed framework in various network analysis tasks including network reconstruction, link prediction and vertex classification.


Author(s):  
Wei Wu ◽  
Bin Li ◽  
Ling Chen ◽  
Chengqi Zhang

Attributed network embedding aims to learn a low-dimensional representation for each node of a network, considering both attributes and structure information of the node. However, the learning based methods usually involve substantial cost in time, which makes them impractical without the help of a powerful workhorse. In this paper, we propose a simple yet effective algorithm, named NetHash, to solve this problem only with moderate computing capacity. NetHash employs the randomized hashing technique to encode shallow trees, each of which is rooted at a node of the network. The main idea is to efficiently encode both attributes and structure information of each node by recursively sketching the corresponding rooted tree from bottom (i.e., the predefined highest-order neighboring nodes) to top (i.e., the root node), and particularly, to preserve as much information closer to the root node as possible. Our extensive experimental results show that the proposed algorithm, which does not need learning, runs significantly faster than the state-of-the-art learning-based network embedding methods while achieving competitive or even better performance in accuracy. 


Author(s):  
Zhenyu Qiu ◽  
Wenbin Hu ◽  
Jia Wu ◽  
ZhongZheng Tang ◽  
Xiaohua Jia

Network embedding assigns nodes in a network to low-dimensional representations and effectively preserves the structure and inherent properties of the network. Most existing network embedding methods didn't consider network noise. However, it is almost impossible to observe the actual structure of a real-world network without noise.  The noise in the network will affect the performance of network embedding dramatically. In this paper, we aim to exploit node similarity to address the problem of social network embedding with noise and propose a node similarity preserving (NSP) embedding method. NSP exploits a comprehensive similarity index to quantify the authenticity of the observed network structure. Then we propose an algorithm to construct a correction matrix to reduce the influence of noise. Finally, an objective function for accurate network embedding is proposed and an efficient algorithm to solve the optimization problem is provided. Extensive experimental results on a variety of applications of real-world networks with noise show the superior performance of the proposed method over the state-of-the-art methods. 


2020 ◽  
Vol 34 (04) ◽  
pp. 4091-4098 ◽  
Author(s):  
Tao He ◽  
Lianli Gao ◽  
Jingkuan Song ◽  
Xin Wang ◽  
Kejie Huang ◽  
...  

Learning accurate low-dimensional embeddings for a network is a crucial task as it facilitates many network analytics tasks. Moreover, the trained embeddings often require a significant amount of space to store, making storage and processing a challenge, especially as large-scale networks become more prevalent. In this paper, we present a novel semi-supervised network embedding and compression method, SNEQ, that is competitive with state-of-art embedding methods while being far more space- and time-efficient. SNEQ incorporates a novel quantisation method based on a self-attention layer that is trained in an end-to-end fashion, which is able to dramatically compress the size of the trained embeddings, thus reduces storage footprint and accelerates retrieval speed. Our evaluation on four real-world networks of diverse characteristics shows that SNEQ outperforms a number of state-of-the-art embedding methods in link prediction, node classification and node recommendation. Moreover, the quantised embedding shows a great advantage in terms of storage and time compared with continuous embeddings as well as hashing methods.


2021 ◽  
Vol 15 (4) ◽  
pp. 1-23
Author(s):  
Guojie Song ◽  
Yun Wang ◽  
Lun Du ◽  
Yi Li ◽  
Junshan Wang

Network embedding is a method of learning a low-dimensional vector representation of network vertices under the condition of preserving different types of network properties. Previous studies mainly focus on preserving structural information of vertices at a particular scale, like neighbor information or community information, but cannot preserve the hierarchical community structure, which would enable the network to be easily analyzed at various scales. Inspired by the hierarchical structure of galaxies, we propose the Galaxy Network Embedding (GNE) model, which formulates an optimization problem with spherical constraints to describe the hierarchical community structure preserving network embedding. More specifically, we present an approach of embedding communities into a low-dimensional spherical surface, the center of which represents the parent community they belong to. Our experiments reveal that the representations from GNE preserve the hierarchical community structure and show advantages in several applications such as vertex multi-class classification, network visualization, and link prediction. The source code of GNE is available online.


2021 ◽  
pp. 1-12
Author(s):  
JinFang Sheng ◽  
Huaiyu Zuo ◽  
Bin Wang ◽  
Qiong Li

 In a complex network system, the structure of the network is an extremely important element for the analysis of the system, and the study of community detection algorithms is key to exploring the structure of the complex network. Traditional community detection algorithms would represent the network using an adjacency matrix based on observations, which may contain redundant information or noise that interferes with the detection results. In this paper, we propose a community detection algorithm based on density clustering. In order to improve the performance of density clustering, we consider an algorithmic framework for learning the continuous representation of network nodes in a low-dimensional space. The network structure is effectively preserved through network embedding, and density clustering is applied in the embedded low-dimensional space to compute the similarity of nodes in the network, which in turn reveals the implied structure in a given network. Experiments show that the algorithm has superior performance compared to other advanced community detection algorithms for real-world networks in multiple domains as well as synthetic networks, especially when the network data chaos is high.


Author(s):  
Yuanfu Lu ◽  
Chuan Shi ◽  
Linmei Hu ◽  
Zhiyuan Liu

Heterogeneous information network (HIN) embedding aims to embed multiple types of nodes into a low-dimensional space. Although most existing HIN embedding methods consider heterogeneous relations in HINs, they usually employ one single model for all relations without distinction, which inevitably restricts the capability of network embedding. In this paper, we take the structural characteristics of heterogeneous relations into consideration and propose a novel Relation structure-aware Heterogeneous Information Network Embedding model (RHINE). By exploring the real-world networks with thorough mathematical analysis, we present two structure-related measures which can consistently distinguish heterogeneous relations into two categories: Affiliation Relations (ARs) and Interaction Relations (IRs). To respect the distinctive characteristics of relations, in our RHINE, we propose different models specifically tailored to handle ARs and IRs, which can better capture the structures and semantics of the networks. At last, we combine and optimize these models in a unified and elegant manner. Extensive experiments on three real-world datasets demonstrate that our model significantly outperforms the state-of-the-art methods in various tasks, including node clustering, link prediction, and node classification.


Molecules ◽  
2019 ◽  
Vol 24 (17) ◽  
pp. 3099 ◽  
Author(s):  
Xuan ◽  
Li ◽  
Zhang ◽  
Zhang ◽  
Song

Identifying disease-associated microRNAs (disease miRNAs) contributes to the understanding of disease pathogenesis. Most previous computational biology studies focused on multiple kinds of connecting edges of miRNAs and diseases, including miRNA–miRNA similarities, disease–disease similarities, and miRNA–disease associations. Few methods exploited the node attribute information related to miRNA family and cluster. The previous methods do not completely consider the sparsity of node attributes. Additionally, it is challenging to deeply integrate the node attributes of miRNAs and the similarities and associations related to miRNAs and diseases. In the present study, we propose a novel method, known as MDAPred, based on nonnegative matrix factorization to predict candidate disease miRNAs. MDAPred integrates the node attributes of miRNAs and the related similarities and associations of miRNAs and diseases. Since a miRNA is typically subordinate to a family or a cluster, the node attributes of miRNAs are sparse. Similarly, the data for miRNA and disease similarities are sparse. Projecting the miRNA and disease similarities and miRNA node attributes into a common low-dimensional space contributes to estimating miRNA-disease associations. Simultaneously, the possibility that a miRNA is associated with a disease depends on the miRNA’s neighbour information. Therefore, MDAPred deeply integrates projections of multiple kinds of connecting edges, projections of miRNAs node attributes, and neighbour information of miRNAs. The cross-validation results showed that MDAPred achieved superior performance compared to other state-of-the-art methods for predicting disease-miRNA associations. MDAPred can also retrieve more actual miRNA-disease associations at the top of prediction results, which is very important for biologists. Additionally, case studies of breast, lung, and pancreatic cancers further confirmed the ability of MDAPred to discover potential miRNA–disease associations.


Sign in / Sign up

Export Citation Format

Share Document