Link Prediction through Deep Generative Model

Abstract: Inferring missing links or predicting future ones based on the currently observed network is known as link prediction, which has many real-world applications in biomedicine [1–3], e-commerce [4], social media [5] and criminal intelligence [6]. Numerous methods have been proposed to solve the link prediction problem [7–9]. Yet many of these existing methods are designed for undirected networks only. Moreover, most methods are based on domain-specific heuristics [10], and hence their performance differs greatly for networks from different domains. Here we develop a new link prediction method based on deep generative models [11] in machine learning. This method does not rely on any domain-specific heuristic and works for general undirected or directed complex networks. Our key idea is to represent the adjacency matrix of a network as an image and then learn hierarchical feature representations of the image by training a deep generative model. Those features correspond to structural patterns in the network at different scales, from small subgraphs to mesoscopic communities [12]. Conceptually, taking structural patterns at different scales into account together should outperform domain-specific heuristics, which typically focus on structural patterns at one particular scale. Indeed, when applied to various real-world networks from different domains [13–17], our method shows overall superior performance against existing methods. Moreover, it can be easily parallelized by splitting a large network into several small subnetworks and then performing link prediction for each subnetwork in parallel. Our results imply that deep learning techniques can be effectively applied to complex networks and solve the classical link prediction problem with robust and superior performance.

Summary: We propose a new link prediction method based on deep generative models.
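The two mechanical steps named in this abstract, treating the adjacency matrix as an image and splitting a large network into subnetworks, can be illustrated with a minimal sketch. The abstract does not say how nodes are grouped, so the consecutive-index grouping below is an assumption, as are the function names `adjacency_image` and `split_into_subnetworks`. Links between different groups are simply dropped here, and the generative model itself is not shown.

```python
def adjacency_image(edges, n, directed=True):
    """Return an n x n grid of 0/1 pixels; pixel [i][j] is 1 if there is a link i -> j."""
    img = [[0] * n for _ in range(n)]
    for i, j in edges:
        img[i][j] = 1
        if not directed:
            img[j][i] = 1  # undirected networks give a symmetric image
    return img


def split_into_subnetworks(img, block):
    """Cut the image into diagonal blocks of `block` consecutive nodes (assumed grouping)."""
    parts = []
    for start in range(0, len(img), block):
        stop = min(start + block, len(img))
        sub = [row[start:stop] for row in img[start:stop]]
        parts.append((start, sub))
    return parts


# Toy directed network with six nodes (illustrative data only).
edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5)]
img = adjacency_image(edges, n=6, directed=True)
blocks = split_into_subnetworks(img, block=3)
```

Each `(start, sub)` pair could then be handed to a separate worker; `start` maps local pixel indices back to the original node numbering.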


Link prediction based on nonequilibrium cooperation effect

International Journal of Modern Physics B ◽

10.1142/s021797921850128x ◽

2018 ◽

Vol 32 (11) ◽

pp. 1850128 ◽

Cited By ~ 1

Author(s):

LanXi Li ◽

XuZhen Zhu ◽

Hui Tian

Keyword(s):

Numerical Analysis ◽

Theoretical Analysis ◽

Complex Networks ◽

Real World ◽

Link Prediction ◽

Prediction Method ◽

Large Degree ◽

Node Degree ◽

Improve Accuracy

Link prediction in complex networks has become a common focus of many researchers. However, most existing methods concentrate on neighbors and rarely consider the degree heterogeneity of the two endpoints. Node degree represents the importance or status of an endpoint. We describe large degree heterogeneity as a nonequilibrium between nodes. This nonequilibrium facilitates stable cooperation between endpoints, so that two endpoints with large degree heterogeneity tend to connect stably. We call this phenomenon the nonequilibrium cooperation effect. This paper therefore proposes a link prediction method based on the nonequilibrium cooperation effect to improve accuracy. A theoretical analysis is presented first, followed by experiments on 12 real-world networks that compare mainstream methods with our indices through numerical analysis.


Accurate similarity index based on activity and connectivity of node for link prediction

International Journal of Modern Physics B ◽

10.1142/s0217979215501088 ◽

2015 ◽

Vol 29 (17) ◽

pp. 1550108 ◽

Cited By ~ 9

Author(s):

Longjie Li ◽

Lvjian Qian ◽

Xiaoping Wang ◽

Shishun Luo ◽

Xiaoyun Chen

Keyword(s):

Complex Networks ◽

Real World ◽

Link Prediction ◽

Similarity Index ◽

Experimental Results ◽

Network Data ◽

Prediction Problem ◽

Similarity Indices ◽

Average Activity ◽

Fundamental Requirement

Recent years have witnessed an increase in available network data; however, much of that data is incomplete. Link prediction, which can find the missing links of a network, plays an important role in the research and analysis of complex networks. Based on the assumption that two unconnected nodes which are highly similar are very likely to interact, most existing algorithms solve the link prediction problem by computing node similarities. The fundamental requirement of those algorithms is an accurate and effective similarity index. In this paper, we propose a new similarity index, namely similarity based on activity and connectivity (SAC), which performs link prediction more accurately. To compute the similarity between two nodes, this index employs the average activity of the two nodes in their common neighborhood and the connectivities between them and their common neighbors. The higher the average activity and the stronger the connectivities, the more similar the two nodes are. The proposed index not only distinguishes the contributions of paths effectively but also incorporates the influence of endpoints, and can therefore achieve better prediction results. To verify the performance of SAC, we conduct experiments on 10 real-world networks. Experimental results demonstrate that SAC outperforms the compared baselines.
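The general similarity-based scheme this abstract builds on (score every unconnected pair, then rank) can be sketched as follows. The abstract does not give the SAC formula, so this sketch substitutes the plain common-neighbors count as a stand-in similarity; it is not the SAC index, and the function name `common_neighbor_scores` is hypothetical.

```python
from itertools import combinations


def common_neighbor_scores(edges):
    """Rank unconnected node pairs of an undirected network by shared-neighbor count."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    scores = {}
    for u, v in combinations(sorted(neighbors), 2):
        if v not in neighbors[u]:  # only currently unconnected pairs are candidates
            scores[(u, v)] = len(neighbors[u] & neighbors[v])

    # Highest score first; ties are broken by node labels for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy undirected network (illustrative data only).
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
ranking = common_neighbor_scores(edges)
```

Any other similarity index could replace the set intersection inside the loop without changing the ranking step.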


Deep learning based network similarity for model selection

Data Science ◽

10.3233/ds-210033 ◽

2021 ◽

pp. 1-21

Author(s):

Kushal Veer Singh ◽

Ajay Kumar Verma ◽

Lovekesh Vig

Keyword(s):

Deep Learning ◽

Model Selection ◽

Complex Networks ◽

Real World ◽

Network Architecture ◽

Large Scale ◽

Generative Models ◽

Generative Model ◽

Data Set ◽

Feed Forward Network

Capturing data in the form of networks is becoming an increasingly popular approach for modeling, analyzing and visualising complex phenomena, in order to understand the important properties of the underlying complex processes. Access to many large-scale network datasets is restricted due to privacy and security concerns. Also, for several applications (such as functional connectivity networks), generating large-scale real data is expensive. For these reasons, there is a growing need for advanced mathematical and statistical models (also called generative models) that can account for the structure of these large-scale networks without having to materialize them in the real world. The objective is to provide a comprehensible description of the network properties and to be able to infer previously unobserved properties. Researchers have developed various models that generate synthetic networks adhering to the structural properties of real networks. However, the selection of the appropriate generative model for a given real-world network remains an important challenge. In this paper, we investigate this problem and provide a novel technique (named TripletFit) for model selection (or network classification) and estimation of structural similarities of complex networks. The goal of network model selection is to select a generative model that is able to generate a structurally similar synthetic network for a given real-world (target) network. We consider six outstanding generative models as the candidate models. Existing model selection methods mostly suffer from sensitivity to network perturbations, dependency on the size of the networks, and low accuracy. To overcome these limitations, we consider a broad array of network features, with the aim of representing different structural aspects of the network, and employ deep learning techniques such as a deep triplet network architecture and a simple feed-forward network for model selection and estimation of structural similarities of complex networks. Our proposed method outperforms existing methods with respect to accuracy, noise tolerance, and size independence on a number of gold-standard data sets used in previous studies.


Structural Patterns and Generative Models of Real-world Hypergraphs

Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining ◽

10.1145/3394486.3403060 ◽

2020 ◽

Author(s):

Manh Tuan Do ◽

Se-eun Yoon ◽

Bryan Hooi ◽

Kijung Shin

Keyword(s):

Real World ◽

Generative Models ◽

Structural Patterns


An information theoretic approach to link prediction in multiplex networks

Scientific Reports ◽

10.1038/s41598-021-92427-1 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Seyed Hossein Jafari ◽

Amir Mahdi Abdolhosseini-Qomi ◽

Masoud Asadpour ◽

Maseud Rahgozar ◽

Naser Yazdani

Keyword(s):

Real World ◽

Link Prediction ◽

Large Scale ◽

Similarity Measures ◽

Prediction Method ◽

General Purpose ◽

Fast Method ◽

Theoretic Approach ◽

Multiplex Networks ◽

Wide Range

Abstract: The entities of real-world networks are connected via different types of connections (i.e., layers). The task of link prediction in multiplex networks is to find missing connections based on both intra-layer and inter-layer correlations. Our observations confirm that in a wide range of real-world multiplex networks, from social to biological and technological, a positive correlation exists between connection probability in one layer and similarity in other layers. Accordingly, we devise a similarity-based, automatic, general-purpose multiplex link prediction method, SimBins, that quantifies the amount of connection uncertainty based on observed inter-layer correlations in a multiplex network. Moreover, SimBins enhances the prediction quality in the target layer by incorporating the effect of link overlap across layers. Applying SimBins to various datasets from diverse domains, we find that it outperforms the compared methods (both baseline and state-of-the-art methods) in most instances when predicting links. Furthermore, we discuss that SimBins adds only minor computational overhead to the base similarity measures, making it a potentially fast method suitable for large-scale multiplex networks.


Link prediction based on local weighted paths for complex networks

International Journal of Modern Physics C ◽

10.1142/s012918311750053x ◽

2017 ◽

Vol 28 (04) ◽

pp. 1750053

Author(s):

Yabing Yao ◽

Ruisheng Zhang ◽

Fan Yang ◽

Yongna Yuan ◽

Rongjing Hu ◽

...

Keyword(s):

Complex Networks ◽

Real World ◽

Link Prediction ◽

Structural Similarity ◽

Prediction Performance ◽

Topological Feature ◽

Topological Features ◽

Node Similarity ◽

Weighted Paths ◽

Path Dependent

As a significant problem in complex networks, link prediction aims to find missing and future links between unconnected nodes by estimating the existence likelihood of potential links. It plays an important role in understanding the evolution mechanism of networks and has broad applications in practice. To improve prediction performance, a variety of structural similarity-based methods that rely on different topological features have been put forward. As one topological feature, the path information between node pairs is used to calculate node similarity. However, many path-dependent methods neglect the different contributions of the paths between a pair of nodes. In this paper, a local weighted path (LWP) index is proposed to differentiate the contributions of paths. The LWP index considers the effect of the link degrees of intermediate links and the connectivity influence of intermediate nodes on paths to quantify the path weight in the prediction procedure. Experimental results on 12 real-world networks show that the LWP index outperforms seven other prediction baselines.


Finding Missing Links in Complex Networks: A Multiple-Attribute Decision-Making Method

Complexity ◽

10.1155/2018/3579758 ◽

2018 ◽

Vol 2018 ◽

pp. 1-16 ◽

Cited By ~ 3

Author(s):

Longjie Li ◽

Shenshen Bai ◽

Mingwei Leng ◽

Lu Wang ◽

Xiaoyun Chen

Keyword(s):

Decision Making ◽

Complex Networks ◽

Similarity Measure ◽

Real World ◽

Link Prediction ◽

State Of The Art ◽

Multiple Attribute Decision Making ◽

Ideal Solution ◽

Multiple Attribute ◽

Novel Method

Link prediction, which aims to forecast potential or missing links in a complex network based on currently observed information, has drawn growing attention from researchers. To date, a host of similarity-based methods have been put forward. Usually, a method assumes that a single similarity measure is applicable to various networks, and its performance therefore fluctuates across different networks. In this paper, we propose a novel method to solve this issue by regarding link prediction as a multiple-attribute decision-making (MADM) problem. In the proposed method, we treat the RA, LP, and CAR indices as the multiple attributes of node pairs. The technique for order preference by similarity to ideal solution (TOPSIS) is adopted to aggregate the attributes and rank node pairs. The proposed method is not limited to one similarity measure but takes separate measures into account, since different networks may have different topological structures. Experimental results on 10 real-world networks show that the proposed method is superior to state-of-the-art methods.
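The aggregation step described here can be sketched with a generic TOPSIS routine. This is a textbook version under stated assumptions: all attributes are benefit-type (larger is better), weights are equal unless given, and the three attribute columns stand in for the RA, LP, and CAR scores with made-up values. The weighting actually used in the paper is not stated in the abstract, and the function name `topsis` is hypothetical.

```python
import math


def topsis(rows, weights=None):
    """Return a closeness score in [0, 1] for each row of benefit-type attributes."""
    n_attr = len(rows[0])
    weights = weights or [1.0 / n_attr] * n_attr  # assumption: equal weights by default

    # Vector-normalise each attribute column, then apply the weights.
    norms = [math.sqrt(sum(r[k] ** 2 for r in rows)) or 1.0 for k in range(n_attr)]
    scaled = [[weights[k] * r[k] / norms[k] for k in range(n_attr)] for r in rows]

    # Ideal and anti-ideal solutions: best and worst value of every attribute.
    best = [max(col) for col in zip(*scaled)]
    worst = [min(col) for col in zip(*scaled)]

    scores = []
    for r in scaled:
        d_best = math.dist(r, best)
        d_worst = math.dist(r, worst)
        total = d_best + d_worst
        scores.append(d_worst / total if total else 0.0)
    return scores


# Hypothetical (RA, LP, CAR) values for three candidate node pairs.
pair_attributes = [[0.9, 0.8, 0.7], [0.1, 0.2, 0.1], [0.5, 0.5, 0.4]]
closeness = topsis(pair_attributes)
```

Node pairs would then be ranked by descending closeness, so a pair that is strong on all three indices ranks above one that is strong on only one of them.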


Relative Assortativity Index: A Quantitative Metric to Assess the Impact of Link Prediction Techniques on Assortativity of Complex Networks

The Computer Journal ◽

10.1093/comjnl/bxz089 ◽

2019 ◽

Vol 63 (9) ◽

pp. 1417-1437

Author(s):

Natarajan Meghanathan

Keyword(s):

Complex Networks ◽

Real World ◽

Biological Networks ◽

Link Prediction ◽

Preferential Attachment ◽

Prediction Technique ◽

Prediction Techniques ◽

The Impact

Abstract: We propose a quantitative metric (called relative assortativity index, RAI) to assess the extent to which a real-world network would become relatively more assortative due to link addition(s) using a link prediction technique. Our methodology is as follows: for a link prediction technique applied to a particular real-world network, we keep track of the assortativity index values incurred during the sequence of link additions until there is negligible change in the assortativity index values for successive link additions. We count the number of network instances for which the assortativity index after a link addition is greater or lower than the assortativity index prior to the link addition and refer to these counts as the relative assortativity count and the relative dissortativity count, respectively. RAI is computed as (relative assortativity count − relative dissortativity count) / (relative assortativity count + relative dissortativity count). We analyzed a suite of 80 real-world networks across different domains using 3 representative neighborhood-based link prediction techniques (preferential attachment, Adamic-Adar and Jaccard coefficients [JACs]). We observe the RAI values for the JAC technique to be positive and larger for several real-world networks, while most of the biological networks exhibited positive RAI values for all three techniques.
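The RAI formula stated in this abstract can be written as a short function. The sketch assumes the assortativity index values have already been recorded, with the first value belonging to the original network and one value per subsequent link addition. Returning 0.0 when no addition changes the index is an assumption made here to avoid division by zero; the abstract does not cover that case. The function name and sample values are illustrative.

```python
def relative_assortativity_index(assortativity_values):
    """Compute RAI from the assortativity index recorded after each link addition."""
    up = down = 0
    # Compare every value with the one recorded just before the link addition.
    for before, after in zip(assortativity_values, assortativity_values[1:]):
        if after > before:
            up += 1      # relative assortativity count
        elif after < before:
            down += 1    # relative dissortativity count
    total = up + down
    return (up - down) / total if total else 0.0  # assumption: 0.0 if nothing changed


# Made-up sequence: three increases, one decrease, one unchanged step.
values = [0.10, 0.12, 0.11, 0.15, 0.15, 0.18]
rai = relative_assortativity_index(values)
```

With three increases and one decrease, the sample sequence gives (3 − 1) / (3 + 1) = 0.5; the unchanged step contributes to neither count.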


Evolving Fisher Kernels for Biological Sequence Classification

Evolutionary Computation ◽

10.1162/evco_a_00065 ◽

2013 ◽

Vol 21 (1) ◽

pp. 83-105 ◽

Cited By ~ 2

Author(s):

K.-J. Won ◽

C. Saunders ◽

A. Prügel-Bennett

Keyword(s):

Sequence Similarity ◽

Generative Models ◽

Complex Model ◽

Generative Model ◽

Support Vector ◽

Homologous Sequence ◽

Sequence Information ◽

Biological Sequence ◽

Domain Specific ◽

Fisher Kernel

Fisher kernels have been successfully applied to many problems in bioinformatics. However, their success depends on the quality of the generative model upon which they are built. For Fisher kernel techniques to be used on novel problems, a mechanism for creating accurate generative models is required. A novel framework is presented for automatically creating domain-specific generative models that can be used to produce Fisher kernels for support vector machines (SVMs) and other kernel methods. The framework enables the capture of prior knowledge and addresses the issue of domain-specific kernels, both of which are currently lacking in many kernel-based methods. To obtain the generative model, genetic algorithms are used to evolve the structure of hidden Markov models (HMMs). A Fisher kernel is subsequently created from the HMM and used in conjunction with an SVM to improve the discriminative power. This paper investigates the effectiveness of the proposed method, named GA-SVM. We show that its performance is comparable to, if not better than, that of other state-of-the-art methods in classifying secretory protein sequences of malaria. More interestingly, it showed better results than the sequence-similarity-based approach in protein enzyme family classification, without the need for additional homologous sequence information. The experiments clearly demonstrate that GA-SVM is a novel way to find features with good performance from biological sequences, and that it does not require extensive tuning of a complex model.


Research on Threat Information Network Based on Link Prediction

International Journal of Digital Crime and Forensics ◽

10.4018/ijdcf.2021030106 ◽

2021 ◽

Vol 13 (2) ◽

pp. 94-102

Author(s):

Jin Du ◽

Feng Yuan ◽

Liping Ding ◽

Guangxuan Chen ◽

Xuehua Liu

Keyword(s):

Complex Networks ◽

Link Prediction ◽

Situational Awareness ◽

Prediction Method ◽

Interdisciplinary Approach ◽

Network Evolution ◽

Information Network ◽

Research Perspective ◽

Intrinsic Factors ◽

Node Similarity

The study of complex networks aims to discover the characteristics of the connections in a system and the nature of the system they form. Link prediction is a classic problem in the study of complex networks. It not only reflects the similarity between nodes but also, through the estimated edges, reveals the intrinsic factors of network evolution, namely the network evolution mechanism. A threat information network is a product of network evolution and development. Introducing the interdisciplinary approach of complex networks offers an innovative research perspective for observing how threat intelligence arises. The characteristics of the network describe its current state and can also be used to predict what will happen. Network evolution thus provides a new approach for research on network security situational awareness.
