Supervised temporal link prediction in large-scale real-world networks

Gerrit Jan de Bruin; Cor J. Veenman; H. Jaap van den Herik; Frank W. Takes

doi:10.1007/s13278-021-00787-3

Supervised temporal link prediction in large-scale real-world networks

Social Network Analysis and Mining ◽

10.1007/s13278-021-00787-3 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Gerrit Jan de Bruin ◽

Cor J. Veenman ◽

H. Jaap van den Herik ◽

Frank W. Takes

Keyword(s):

Real World ◽

Link Prediction ◽

Large Scale ◽

Systematic Investigation ◽

Prediction Performance ◽

Temporal Information ◽

Temporal Networks ◽

Discrete Events ◽

New Approach ◽

Topological Features

Link prediction is a well-studied technique for inferring the missing edges between two nodes in some static representation of a network. In modern-day social networks, the timestamps associated with each link can be used to predict future links between so-far unconnected nodes. In these so-called temporal networks, we speak of temporal link prediction. This paper presents a systematic investigation of supervised temporal link prediction on 26 temporal, structurally diverse, real-world networks ranging from thousands to a million nodes and links. We analyse the relation between global structural properties of each network and the obtained temporal link prediction performance, employing a set of well-established topological features commonly used in the link prediction literature. We report on four contributions. First, using temporal information, an improvement of prediction performance is observed. Second, our experiments show that degree disassortative networks perform better in temporal link prediction than assortative networks. Third, we present a new approach to investigate the distinction between networks modelling discrete events and networks modelling persistent relations. Unlike earlier work, our approach utilises information on all past events in a systematic way, resulting in substantially higher link prediction performance. Fourth, we report on the influence of the temporal activity of the node or the edge on the link prediction performance, and show that the performance differs depending on the considered network type. In the studied information networks, temporal information on the node appears most important. The findings in this paper demonstrate how link prediction can effectively be improved in temporal networks, explicitly taking into account the type of connectivity modelled by the temporal edge. More generally, the findings contribute to a better understanding of the mechanisms behind the evolution of networks.
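
As a rough illustration of what "topological features" with temporal information can look like, the sketch below computes two classic static scores (common neighbours and Adamic-Adar) for a node pair, plus a time-weighted variant of common neighbours. It is not the authors' code: the function names, the exponential decay, and the `decay` parameter are assumptions made for this example, and the abstract does not specify which weighting the paper uses.

```python
import math
from collections import defaultdict

def build_neighbors(edges, t_split):
    """Map node -> {neighbor: latest timestamp}, using only edges with t <= t_split."""
    nbrs = defaultdict(dict)
    for u, v, t in edges:
        if t <= t_split:
            nbrs[u][v] = max(t, nbrs[u].get(v, t))
            nbrs[v][u] = max(t, nbrs[v].get(u, t))
    return nbrs

def static_features(nbrs, u, v):
    """Common-neighbours count and Adamic-Adar score for the pair (u, v)."""
    common = set(nbrs.get(u, {})) & set(nbrs.get(v, {}))
    cn = len(common)
    # A common neighbour always has degree >= 2, so log(degree) > 0.
    aa = sum(1.0 / math.log(len(nbrs[z])) for z in common)
    return cn, aa

def temporal_common_neighbors(nbrs, u, v, t_split, decay=0.5):
    """Hypothetical time-weighted variant: recent edges count more (assumed weighting)."""
    common = set(nbrs.get(u, {})) & set(nbrs.get(v, {}))
    weight = lambda t: math.exp(-decay * (t_split - t))
    return sum(min(weight(nbrs[u][z]), weight(nbrs[v][z])) for z in common)

# Toy temporal edge list (u, v, timestamp); the edge at t=9 is the "future" link.
edges = [("a", "c", 1), ("b", "c", 2), ("a", "d", 3), ("b", "d", 4), ("a", "b", 9)]
nbrs = build_neighbors(edges, t_split=5)
cn, aa = static_features(nbrs, "a", "b")
tcn = temporal_common_neighbors(nbrs, "a", "b", t_split=5)
```

In a supervised setting, scores like these would be computed for candidate node pairs from the earlier part of the network and fed to a classifier whose labels come from links that appear later.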

Link prediction based on local weighted paths for complex networks

International Journal of Modern Physics C ◽

10.1142/s012918311750053x ◽

2017 ◽

Vol 28 (04) ◽

pp. 1750053

Author(s):

Yabing Yao ◽

Ruisheng Zhang ◽

Fan Yang ◽

Yongna Yuan ◽

Rongjing Hu ◽

...

Keyword(s):

Complex Networks ◽

Real World ◽

Link Prediction ◽

Structural Similarity ◽

Prediction Performance ◽

Topological Feature ◽

Topological Features ◽

Node Similarity ◽

Weighted Paths ◽

Path Dependent

As a significant problem in complex networks, link prediction aims to find the missing and future links between two unconnected nodes by estimating the existence likelihood of potential links. It plays an important role in understanding the evolution mechanism of networks and has broad applications in practice. In order to improve prediction performance, a variety of structural similarity-based methods that rely on different topological features have been put forward. As one topological feature, the path information between node pairs is utilized to calculate the node similarity. However, many path-dependent methods neglect the different contributions of paths for a pair of nodes. In this paper, a local weighted path (LWP) index is proposed to differentiate the contributions between paths. The LWP index considers the effect of the link degrees of intermediate links and the connectivity influence of intermediate nodes on paths to quantify the path weight in the prediction procedure. The experimental results on 12 real-world networks show that the LWP index outperforms seven other prediction baselines.

Higher-order temporal network effects through triplet evolution

Scientific Reports ◽

10.1038/s41598-021-94389-w ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Qing Yao ◽

Bingsheng Chen ◽

Tim S. Evans ◽

Kim Christensen

Keyword(s):

Real World ◽

Link Prediction ◽

Higher Order ◽

Prediction Algorithm ◽

Interaction Patterns ◽

Temporal Networks ◽

World Systems ◽

Order Interaction ◽

Space And Time ◽

Pairwise Interactions

We study the evolution of networks through ‘triplets’ (three-node graphlets). We develop a method to compute a transition matrix to describe the evolution of triplets in temporal networks. To identify the importance of higher-order interactions in the evolution of networks, we compare both artificial and real-world data to a model based on pairwise interactions only. The significant differences between the computed matrix and the calculated matrix from the fitted parameters demonstrate that non-pairwise interactions exist for various real-world systems in space and time, such as our data sets. Furthermore, this also reveals that different patterns of higher-order interaction are involved in different real-world situations. To test our approach, we then use these transition matrices as the basis of a link prediction algorithm. We investigate our algorithm’s performance on four temporal networks, comparing our approach against ten other link prediction methods. Our results show that higher-order interactions in both space and time play a crucial role in the evolution of networks, as we find that our method, along with two other methods based on non-local interactions, gives the best overall performance. The results also confirm the concept that the higher-order interaction patterns, i.e., triplet dynamics, can help us understand and predict the evolution of different real-world systems.

An information theoretic approach to link prediction in multiplex networks

Scientific Reports ◽

10.1038/s41598-021-92427-1 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Seyed Hossein Jafari ◽

Amir Mahdi Abdolhosseini-Qomi ◽

Masoud Asadpour ◽

Maseud Rahgozar ◽

Naser Yazdani

Keyword(s):

Real World ◽

Link Prediction ◽

Large Scale ◽

Similarity Measures ◽

Prediction Method ◽

General Purpose ◽

Fast Method ◽

Theoretic Approach ◽

Multiplex Networks ◽

Wide Range

The entities of real-world networks are connected via different types of connections (i.e., layers). The task of link prediction in multiplex networks is about finding missing connections based on both intra-layer and inter-layer correlations. Our observations confirm that in a wide range of real-world multiplex networks, from social to biological and technological, a positive correlation exists between connection probability in one layer and similarity in other layers. Accordingly, a similarity-based automatic general-purpose multiplex link prediction method, SimBins, is devised that quantifies the amount of connection uncertainty based on observed inter-layer correlations in a multiplex network. Moreover, SimBins enhances the prediction quality in the target layer by incorporating the effect of link overlap across layers. Applying SimBins to various datasets from diverse domains, our findings indicate that SimBins outperforms the compared methods (both baseline and state-of-the-art methods) in most instances when predicting links. Furthermore, it is discussed that SimBins imposes minor computational overhead on the base similarity measures, making it a potentially fast method, suitable for large-scale multiplex networks.

Susceptible-infected-spreading-based network embedding in static and temporal networks

EPJ Data Science ◽

10.1140/epjds/s13688-020-00248-5 ◽

2020 ◽

Vol 9 (1) ◽

Author(s):

Xiu-Xiu Zhan ◽

Ziyu Li ◽

Naoki Masuda ◽

Petter Holme ◽

Huijuan Wang

Keyword(s):

Random Walk ◽

Link Prediction ◽

Large Scale ◽

Language Model ◽

Network Evolution ◽

Temporal Networks ◽

Network Embedding ◽

Comparison Task ◽

Missing Link ◽

Sample Paths

Link prediction can be used to extract missing information, identify spurious interactions as well as forecast network evolution. Network embedding is a methodology to assign coordinates to nodes in a low-dimensional vector space. By embedding nodes into vectors, the link prediction problem can be converted into a similarity comparison task. Nodes with similar embedding vectors are more likely to be connected. Classic network embedding algorithms are random-walk-based. They sample trajectory paths via random walks and generate node pairs from the trajectory paths. The node pair set is further used as the input for a Skip-Gram model, a representative language model that embeds nodes (which are regarded as words) into vectors. In the present study, we propose to replace random walk processes by a spreading process, namely the susceptible-infected (SI) model, to sample paths. Specifically, we propose two susceptible-infected-spreading-based algorithms, i.e., Susceptible-Infected Network Embedding (SINE) on static networks and Temporal Susceptible-Infected Network Embedding (TSINE) on temporal networks. The performance of our algorithms is evaluated by the missing link prediction task in comparison with state-of-the-art static and temporal network embedding algorithms. Results show that SINE and TSINE outperform the baselines across all six empirical datasets. We further find that the performance of SINE is mostly better than TSINE, suggesting that temporal information does not necessarily improve the embedding for missing link prediction. Moreover, we study the effect of the sampling size, quantified as the total length of the trajectory paths, on the performance of the embedding algorithms. The better performance of SINE and TSINE requires a smaller sampling size in comparison with the baseline algorithms. Hence, SI-spreading-based embedding tends to be more applicable to large-scale networks.
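
The idea of sampling node pairs with a spreading process instead of a random walk can be sketched as follows. This is a minimal discrete-time SI simulation written for illustration only: the function name `si_sample_pairs`, the per-step infection probability `beta`, and the choice to record (infector, infected) pairs are assumptions, not the SINE or TSINE procedure described in the paper. The resulting pairs would play the role of the node-pair set that a Skip-Gram model takes as input.

```python
import random

def si_sample_pairs(adj, seed_node, beta=0.5, max_steps=10, rng=None):
    """Run a discrete-time SI process and return (infector, infected) pairs."""
    rng = rng or random.Random()
    infected = {seed_node}
    pairs = []
    for _ in range(max_steps):
        newly = {}  # node infected in this step -> node that infected it
        for u in sorted(infected):
            for v in adj[u]:
                # Each infected node tries to infect each susceptible neighbour.
                if v not in infected and v not in newly and rng.random() < beta:
                    newly[v] = u
        for v, u in newly.items():
            pairs.append((u, v))
        infected.update(newly)
        if len(infected) == len(adj):
            break  # every node is infected; nothing left to spread to
    return pairs

# Toy example: a path graph 0 - 1 - 2 - 3 with certain infection (beta = 1.0).
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
pairs = si_sample_pairs(path_graph, seed_node=0, beta=1.0)
no_spread = si_sample_pairs(path_graph, seed_node=0, beta=0.0)
```

With `beta` below 1 the sampled pairs vary between runs; passing a seeded `random.Random` instance as `rng` makes a run reproducible.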

Link prediction in real-world multiplex networks via layer reconstruction method

Royal Society Open Science ◽

10.1098/rsos.191928 ◽

2020 ◽

Vol 7 (7) ◽

pp. 191928

Author(s):

Amir Mahdi Abdolhosseini-Qomi ◽

Seyed Hossein Jafari ◽

Amirheckmat Taghizadeh ◽

Naser Yazdani ◽

Masoud Asadpour ◽

...

Keyword(s):

Real World ◽

Link Prediction ◽

Large Body ◽

Single Layer ◽

Structural Features ◽

Prediction Performance ◽

Reconstruction Method ◽

Multiplex Networks ◽

Technological Complex ◽

Different Types

Networks are invaluable tools to study real biological, social and technological complex systems in which connected elements form a purposeful phenomenon. A higher resolution image of these systems shows that the connection types are not confined to one type but span a variety of types. Multiplex networks encode this complexity with a set of nodes which are connected in different layers via different types of links. A large body of research on the link prediction problem is devoted to finding missing links in single-layer (simplex) networks. In recent years, the problem of link prediction in multiplex networks has gained the attention of researchers from different scientific communities. Although most of these studies suggest that prediction performance can be enhanced by using the information contained in different layers of the network, the exact source of this enhancement remains obscure. Here, it is shown that similarity w.r.t. structural features (eigenvectors) is a major source of enhancements for the link prediction task in multiplex networks, using the proposed layer reconstruction method and experiments on real-world multiplex networks from different disciplines. Moreover, we characterize how low values of similarity w.r.t. structural features result in cases where improving prediction performance is substantially hard.

TensorCast: Forecasting Time-Evolving Networks with Contextual Information

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/721 ◽

2018 ◽

Cited By ~ 2

Author(s):

Miguel Araújo ◽

Pedro Ribeiro ◽

Christos Faloutsos

Keyword(s):

Social Network ◽

Real World ◽

Large Scale ◽

Information Sources ◽

Contextual Information ◽

Temporal Networks ◽

Evolving Networks ◽

Award Winning ◽

Data Source

Can we forecast future connections in a social network? Can we predict who will start using a given hashtag on Twitter, leveraging contextual information such as who follows or retweets whom to improve our predictions? In this paper we present an abridged report of TensorCast, an award-winning method for forecasting time-evolving networks that uses coupled tensors to incorporate multiple information sources. TensorCast is scalable (linearithmic in the number of connections), effective (more precise than competing methods) and general (applicable to any data source representable by a tensor). We also showcase our method when applied to forecast two large-scale heterogeneous real-world temporal networks, namely Twitter and DBLP.

Analyzing the Bills-Voting Dynamics and Predicting Corruption-Convictions Among Brazilian Congressmen Through Temporal Networks

Scientific Reports ◽

10.1038/s41598-019-53252-9 ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 4

Author(s):

Tiago Colliri ◽

Liang Zhao

Keyword(s):

Link Prediction ◽

Large Scale ◽

Prediction Method ◽

High Accuracy ◽

Financial Information ◽

Prediction Methods ◽

Temporal Networks ◽

Public Data ◽

Financial Crimes ◽

Prediction Techniques

In this paper, we propose a network-based technique to analyze bills-voting data comprising the votes of Brazilian congressmen for a period of 28 years. The voting sessions are initially mapped into static networks, where each node represents a congressman and each edge stands for the similarity of votes between a pair of congressmen. Afterwards, the constructed static networks are converted to temporal networks. Our analyses on the temporal networks capture some of the main political changes that happened in Brazil during the period of time under consideration. Moreover, we find that the bills-voting networks can be used to identify convicted politicians, who committed corruption or other financial crimes. Therefore, we propose two conviction prediction methods: one is based on the highest weighted convicted neighbor and the other is based on link prediction techniques. Surprisingly, high accuracy (up to 90% by the link prediction method) in predicting convictions is achieved using only bills-voting data, without taking into account any financial information beforehand. Such a feature makes it possible to monitor congressmen just by considering their legal public activities. In this way, our work contributes to the large-scale study of public data using complex networks.

Keywords-Driven and Weight-aware Paper Recommendation via Paper Correlation Pattern Mining

10.21203/rs.3.rs-144551/v1 ◽

2021 ◽

Author(s):

Hanwen Liu ◽

Jun Hou ◽

Qianmu Li ◽

Jian Jiang

Keyword(s):

Real World ◽

Link Prediction ◽

Large Scale ◽

Pattern Mining ◽

Academic Research ◽

Correlation Pattern ◽

Keyword Query ◽

Correlation Graph ◽

Paper Citation ◽

Citation Graph

Currently, readers often prefer to search for papers of interest based on a set of typed query keywords. As the keywords of a paper are often limited, paper recommender systems often need to recommend a set of papers which collectively satisfy the readers’ keyword query. However, the topics of recommended papers are probably not correlated with each other, which fails to meet the readers’ requirements on in-depth and continuous academic research. Furthermore, although existing paper citation graphs can model the papers’ correlations, they often face the data sparsity problem, which blocks accurate paper recommendations. To address these issues, we propose a keywords-driven and weight-aware paper recommendation approach, named LP-PRk+w (link prediction-paper recommendation), based on a weighted paper correlation graph. Concretely, we first optimize the existing paper citation graph modes by introducing a weighted similarity, after which we obtain a weighted paper correlation graph. Then we recommend a set of correlated papers based on the weighted paper correlation graph and the query keywords from readers. Finally, we conduct large-scale experiments on a real-world Hep-Th dataset. Experimental results demonstrate that our proposal can improve the paper recommendation performance considerably, compared to other related solutions.

Improved Bounded Matrix Completion for Large-Scale Recommender Systems

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/229 ◽

2017 ◽

Cited By ~ 1

Author(s):

Huang Fang ◽

Zhang Zhen ◽

Yiqun Shao ◽

Cho-Jui Hsieh

Keyword(s):

Real World ◽

Large Scale ◽

Matrix Completion ◽

Stationary Points ◽

Original Matrix ◽

New Approach ◽

Completion Problem ◽

Objective Value ◽

Real World Datasets ◽

Personalized Recommender System

Matrix completion is a widely used technique for personalized recommender systems. In this paper, we focus on the idea of Bounded Matrix Completion (BMC), which imposes bound constraints on the original matrix completion problem. It has been shown that BMC works well for several real-world datasets, and an efficient coordinate descent solver called BMA has been proposed in earlier work. However, we observe that the BMA algorithm sometimes fails to converge to a stationary point, resulting in relatively poor accuracy in those cases. To overcome this issue, we propose a new approach for solving BMC under the ADMM framework. The proposed algorithm is guaranteed to converge to stationary points. Experimental results on real-world datasets show that our algorithm can reach a lower objective value, obtain higher prediction accuracy and scale better than BMA. We also show that our method outperforms state-of-the-art standard matrix factorization in most cases.
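
For orientation, one common way to write a bounded matrix completion problem is given below. The abstract does not state the objective, so the notation is an assumption: $A$ is the partially observed rating matrix, $\Omega$ the set of observed entries, $W$ and $H$ the low-rank factors, $\lambda$ a regularization weight, and $[r_{\min}, r_{\max}]$ the allowed rating range. The second line shows a typical ADMM-style splitting with an auxiliary variable $X$; it is a generic sketch, not necessarily the splitting used by the authors.

```latex
\begin{aligned}
\min_{W,\,H}\quad & \sum_{(i,j)\in\Omega} \bigl(A_{ij} - (WH^{\top})_{ij}\bigr)^{2}
  + \lambda\bigl(\lVert W\rVert_F^{2} + \lVert H\rVert_F^{2}\bigr)
  \quad \text{s.t.}\quad r_{\min} \le (WH^{\top})_{ij} \le r_{\max}\ \ \forall\, i,j \\[4pt]
\min_{W,\,H,\,X}\quad & \sum_{(i,j)\in\Omega} \bigl(A_{ij} - X_{ij}\bigr)^{2}
  + \lambda\bigl(\lVert W\rVert_F^{2} + \lVert H\rVert_F^{2}\bigr)
  \quad \text{s.t.}\quad X = WH^{\top},\ \ r_{\min} \le X_{ij} \le r_{\max}\ \ \forall\, i,j
\end{aligned}
```

The splitting moves the bound constraints onto $X$, where they reduce to element-wise clipping, while the coupling $X = WH^{\top}$ is handled through the augmented Lagrangian.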

1588-P: Therapy Trends in Initial 6 Months of the First Large-Scale Longitudinal Nationwide Study on Management and Real-World Outcomes of Diabetes in India (LANDMARC)

Diabetes ◽

10.2337/db20-1588-p ◽

2020 ◽

Vol 69 (Supplement 1) ◽

pp. 1588-P ◽

Cited By ~ 1

Author(s):

ROMIK GHOSH ◽

ASHOK K. DAS ◽

AMBRISH MITHAL ◽

SHASHANK JOSHI ◽

K.M. PRASANNA KUMAR ◽

...

Keyword(s):

Real World ◽

Large Scale ◽

Nationwide Study
