scholarly journals Principled approach to the selection of the embedding dimension of networks

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Weiwei Gu ◽  
Aditya Tandon ◽  
Yong-Yeol Ahn ◽  
Filippo Radicchi

AbstractNetwork embedding is a general-purpose machine learning technique that encodes network structure in vector spaces with tunable dimension. Choosing an appropriate embedding dimension – small enough to be efficient and large enough to be effective – is challenging but necessary to generate embeddings applicable to a multitude of tasks. Existing strategies for the selection of the embedding dimension rely on performance maximization in downstream tasks. Here, we propose a principled method such that all structural information of a network is parsimoniously encoded. The method is validated on various embedding algorithms and a large corpus of real-world networks. The embedding dimension selected by our method in real-world networks suggest that efficient encoding in low-dimensional spaces is usually possible.

2021 ◽  
Vol 15 (4) ◽  
pp. 1-23
Author(s):  
Guojie Song ◽  
Yun Wang ◽  
Lun Du ◽  
Yi Li ◽  
Junshan Wang

Network embedding is a method of learning a low-dimensional vector representation of network vertices under the condition of preserving different types of network properties. Previous studies mainly focus on preserving structural information of vertices at a particular scale, like neighbor information or community information, but cannot preserve the hierarchical community structure, which would enable the network to be easily analyzed at various scales. Inspired by the hierarchical structure of galaxies, we propose the Galaxy Network Embedding (GNE) model, which formulates an optimization problem with spherical constraints to describe the hierarchical community structure preserving network embedding. More specifically, we present an approach of embedding communities into a low-dimensional spherical surface, the center of which represents the parent community they belong to. Our experiments reveal that the representations from GNE preserve the hierarchical community structure and show advantages in several applications such as vertex multi-class classification, network visualization, and link prediction. The source code of GNE is available online.


Author(s):  
Yuanfu Lu ◽  
Chuan Shi ◽  
Linmei Hu ◽  
Zhiyuan Liu

Heterogeneous information network (HIN) embedding aims to embed multiple types of nodes into a low-dimensional space. Although most existing HIN embedding methods consider heterogeneous relations in HINs, they usually employ one single model for all relations without distinction, which inevitably restricts the capability of network embedding. In this paper, we take the structural characteristics of heterogeneous relations into consideration and propose a novel Relation structure-aware Heterogeneous Information Network Embedding model (RHINE). By exploring the real-world networks with thorough mathematical analysis, we present two structure-related measures which can consistently distinguish heterogeneous relations into two categories: Affiliation Relations (ARs) and Interaction Relations (IRs). To respect the distinctive characteristics of relations, in our RHINE, we propose different models specifically tailored to handle ARs and IRs, which can better capture the structures and semantics of the networks. At last, we combine and optimize these models in a unified and elegant manner. Extensive experiments on three real-world datasets demonstrate that our model significantly outperforms the state-of-the-art methods in various tasks, including node clustering, link prediction, and node classification.


2020 ◽  
Vol 14 (Supplement_1) ◽  
pp. S395-S396
Author(s):  
U N Shivaji ◽  
A Bazarova ◽  
T Critchlow ◽  
O M Nardone ◽  
S C Smith ◽  
...  

Abstract Background In real-world clinical practice, biologics may be discontinued due to variety of reasons, including discontinuation by gastroenterologists, in the UK. The aim of the study was to report on outcomes after discontinuation in IBD patients after a minimum follow-up of 24 months. Methods All IBD patients who discontinued their first-use biologics between January 2013 and Dec 2016 were identified from the EMR at a tertiary referral centre to ensure at least 24 months follow-up. Reasons for discontinuation and pre-defined adverse outcomes (steroid and other rescue therapies, hospitalisations, surgery including perianal) were recorded. The data were analysed using multivariable and univariable logistic regressions within a machine learning technique in order to predict adverse outcomes within the stated timeframe. We performed Kaplan-Meier survival analysis for those patients who had biologics electively discontinued. Results In total, 147 IBD patients who discontinued biologics (M = 74.median age 39 years; CD = 110) were identified. Follow-up ranged from 24 to 60 months (median 40 months). One hundred and forty-four patients (98%) discontinued anti-TNF (1% biosimilar anti-TNF and 1% vedolizumab) and 69 (47%) were on thiopurines at start of biologics. Fifty-nine patients (40%) had biologics electively discontinued by their gastroenterologists and they were analysed separately. Of the 59 elective discontinuations, 22(37%) continued thiopurines after biologic stoppage; 37/59 (63%) patients had at least one adverse outcome (AO) within 6 months of discontinuation and 42/59 (71%) patients restarted biologics. Conclusion The majority of patients who electively discontinued biologics had IBD-related AO rapidly after the stoppage and needed to restart biologics. Almost all patients had AO by the end of the study follow-up period. This should be discussed with patients when considering discontinuation of biologics.


Author(s):  
Yu Li ◽  
Ying Wang ◽  
Tingting Zhang ◽  
Jiawei Zhang ◽  
Yi Chang

Network embedding is an effective approach to learn the low-dimensional representations of vertices in networks, aiming to capture and preserve the structure and inherent properties of networks. The vast majority of existing network embedding methods exclusively focus on vertex proximity of networks, while ignoring the network internal community structure. However, the homophily principle indicates that vertices within the same community are more similar to each other than those from different communities, thus vertices within the same community should have similar vertex representations. Motivated by this, we propose a novel network embedding framework NECS to learn the Network Embedding with Community Structural information, which preserves the high-order proximity and incorporates the community structure in vertex representation learning. We formulate the problem into a principled optimization framework and provide an effective alternating algorithm to solve it. Extensive experimental results on several benchmark network datasets demonstrate the effectiveness of the proposed framework in various network analysis tasks including network reconstruction, link prediction and vertex classification.


Author(s):  
Lun Du ◽  
Zhicong Lu ◽  
Yun Wang ◽  
Guojie Song ◽  
Yiming Wang ◽  
...  

Network embedding is a method of learning a low-dimensional vector representation of network vertices under the condition of preserving different types of network properties. Previous studies mainly focus on preserving structural information of vertices at a particular scale, like neighbor information or community information, but cannot preserve the hierarchical community structure, which would enable the network to be easily analyzed at various scales. Inspired by the hierarchical structure of galaxies, we propose the Galaxy Network Embedding (GNE) model, which formulates an optimization problem with spherical constraints to describe the hierarchical community structure preserving network embedding. More specifically, we present an approach of embedding communities into a low dimensional spherical surface, the center of which represents the parent community they belong to. Our experiments reveal that the representations from GNE preserve the hierarchical community structure and show advantages in several applications such as vertex multi-class classification and network visualization. The source code of GNE is available online.


2021 ◽  
Vol 15 (3) ◽  
pp. 1-18
Author(s):  
Sezin Kircali Ata ◽  
Yuan Fang ◽  
Min Wu ◽  
Jiaqi Shi ◽  
Chee Keong Kwoh ◽  
...  

Real-world networks often exist with multiple views, where each view describes one type of interaction among a common set of nodes. For example, on a video-sharing network, while two user nodes are linked, if they have common favorite videos in one view, then they can also be linked in another view if they share common subscribers. Unlike traditional single-view networks, multiple views maintain different semantics to complement each other. In this article, we propose M ulti-view coll A borative N etwork E mbedding (MANE), a multi-view network embedding approach to learn low-dimensional representations. Similar to existing studies, MANE hinges on diversity and collaboration—while diversity enables views to maintain their individual semantics, collaboration enables views to work together. However, we also discover a novel form of second-order collaboration that has not been explored previously, and further unify it into our framework to attain superior node representations. Furthermore, as each view often has varying importance w.r.t. different nodes, we propose MANE , an attention -based extension of MANE, to model node-wise view importance. Finally, we conduct comprehensive experiments on three public, real-world multi-view networks, and the results demonstrate that our models consistently outperform state-of-the-art approaches.


2019 ◽  
Vol 2019 ◽  
pp. 1-15
Author(s):  
Wei Zhuo ◽  
Qianyi Zhan ◽  
Yuan Liu ◽  
Zhenping Xie ◽  
Jing Lu

Network embedding (NE), which maps nodes into a low-dimensional latent Euclidean space to represent effective features of each node in the network, has obtained considerable attention in recent years. Many popular NE methods, such as DeepWalk, Node2vec, and LINE, are capable of handling homogeneous networks. However, nodes are always fully accompanied by heterogeneous information (e.g., text descriptions, node properties, and hashtags) in the real-world network, which remains a great challenge to jointly project the topological structure and different types of information into the fixed-dimensional embedding space due to heterogeneity. Besides, in the unweighted network, how to quantify the strength of edges (tightness of connections between nodes) accurately is also a difficulty faced by existing methods. To bridge the gap, in this paper, we propose CAHNE (context attention heterogeneous network embedding), a novel network embedding method, to accurately determine the learning result. Specifically, we propose the concept of node importance to measure the strength of edges, which can better preserve the context relations of a node in unweighted networks. Moreover, text information is a widely ubiquitous feature in real-world networks, e.g., online social networks and citation networks. On account of the sophisticated interactions between the network structure and text features of nodes, CAHNE learns context embeddings for nodes by introducing the context node sequence, and the attention mechanism is also integrated into our model to better reflect the impact of context nodes on the current node. To corroborate the efficacy of CAHNE, we apply our method and various baseline methods on several real-world datasets. The experimental results show that CAHNE achieves higher quality compared to a number of state-of-the-art network embedding methods on the tasks of network reconstruction, link prediction, node classification, and visualization.


2020 ◽  
Vol 34 (04) ◽  
pp. 4772-4779 ◽  
Author(s):  
Yu Li ◽  
Yuan Tian ◽  
Jiawei Zhang ◽  
Yi Chang

Learning the low-dimensional representations of graphs (i.e., network embedding) plays a critical role in network analysis and facilitates many downstream tasks. Recently graph convolutional networks (GCNs) have revolutionized the field of network embedding, and led to state-of-the-art performance in network analysis tasks such as link prediction and node classification. Nevertheless, most of the existing GCN-based network embedding methods are proposed for unsigned networks. However, in the real world, some of the networks are signed, where the links are annotated with different polarities, e.g., positive vs. negative. Since negative links may have different properties from the positive ones and can also significantly affect the quality of network embedding. Thus in this paper, we propose a novel network embedding framework SNEA to learn Signed Network Embedding via graph Attention. In particular, we propose a masked self-attentional layer, which leverages self-attention mechanism to estimate the importance coefficient for pair of nodes connected by different type of links during the embedding aggregation process. Then SNEA utilizes the masked self-attentional layers to aggregate more important information from neighboring nodes to generate the node embeddings based on balance theory. Experimental results demonstrate the effectiveness of the proposed framework through signed link prediction task on several real-world signed network datasets.


Author(s):  
Zhenyu Qiu ◽  
Wenbin Hu ◽  
Jia Wu ◽  
ZhongZheng Tang ◽  
Xiaohua Jia

Network embedding assigns nodes in a network to low-dimensional representations and effectively preserves the structure and inherent properties of the network. Most existing network embedding methods didn't consider network noise. However, it is almost impossible to observe the actual structure of a real-world network without noise.  The noise in the network will affect the performance of network embedding dramatically. In this paper, we aim to exploit node similarity to address the problem of social network embedding with noise and propose a node similarity preserving (NSP) embedding method. NSP exploits a comprehensive similarity index to quantify the authenticity of the observed network structure. Then we propose an algorithm to construct a correction matrix to reduce the influence of noise. Finally, an objective function for accurate network embedding is proposed and an efficient algorithm to solve the optimization problem is provided. Extensive experimental results on a variety of applications of real-world networks with noise show the superior performance of the proposed method over the state-of-the-art methods. 


Sign in / Sign up

Export Citation Format

Share Document