Large Graphs: Recently Published Documents

Total documents: 598 (last five years: 146)
H-index: 41 (last five years: 5)

2022, Vol. 40(4), pp. 1-45
Author(s): Weiren Yu, Julie McCann, Chengyuan Zhang, Hakan Ferhatosmanoglu

SimRank is an attractive link-based similarity measure used in fields such as Web search and sociometry. However, the existing deterministic method by Kusumoto et al. [24] for retrieving SimRank does not always produce high-quality similarity results, as it fails to accurately obtain the diagonal correction matrix D. Moreover, SimRank has a “connectivity trait” problem: increasing the number of paths between a pair of nodes decreases its similarity score. The best-known remedy, SimRank++ [1], cannot completely fix this problem, since its score is still zero whenever two nodes have no common in-neighbors. In this article, we study fast, high-quality link-based similarity search on billion-scale graphs. (1) We first devise a “varied-D” method to accurately compute SimRank in linear memory. We also aggregate duplicate computations, which reduces the time of [24] from quadratic to linear in the number of iterations. (2) We propose a novel “cosine-based” SimRank model to circumvent the “connectivity trait” problem. (3) To substantially speed up partial-pairs “cosine-based” SimRank search on large graphs, we devise an efficient dimensionality reduction algorithm, PSR#, with guaranteed accuracy. (4) We give mathematical insights into the semantic difference between SimRank and its variant, and correct an argument in [24] that “if D is replaced by a scaled identity matrix (1-γ)I, their top-K rankings will not be affected much”. (5) We propose a novel method that can accurately convert from Li et al.'s SimRank S̃ to Jeh and Widom's SimRank S. (6) We propose GSR#, a generalisation of our “cosine-based” SimRank model, to quantify pairwise similarities across two distinct graphs, unlike SimRank, which would assess nodes across two graphs as completely dissimilar. Extensive experiments on various datasets demonstrate the superiority of our proposed approaches in terms of search quality, computational efficiency, accuracy, and scalability on billion-edge graphs.
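For orientation, the two formulations contrasted in point (5) are the standard ones from the SimRank literature (notation ours, not from this listing). Jeh and Widom's model satisfies

$$S \,=\, c\,Q^{\top} S\,Q + D,$$

where $Q$ is the column-normalized adjacency matrix, $c \in (0,1)$ is the decay factor (the γ above), and $D$ is the diagonal correction matrix that is hard to obtain exactly. Li et al.'s variant replaces $D$ with a scaled identity:

$$\tilde{S} \,=\, c\,Q^{\top} \tilde{S}\,Q + (1-c)\,I.$$

Computing $D$ accurately is precisely what the “varied-D” method in (1) targets, and (5) concerns converting between $\tilde{S}$ and $S$.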


2022, Vol. 16(4), pp. 1-43
Author(s): Aida Sheshbolouki, M. Tamer Özsu

We study the fundamental problem of counting butterflies (i.e., (2,2)-bicliques) in bipartite streaming graphs. Similar to triangles in unipartite graphs, enumerating butterflies is crucial to understanding the structure of bipartite graphs. This benefits many applications where studying the cohesion of graph-shaped data is of particular interest. Examples include investigating the structure of computational graphs or input graphs to algorithms, as well as dynamic phenomena and analytic tasks over complex real graphs. Butterfly counting is computationally expensive, and known techniques do not scale to large graphs; the problem is even harder in streaming graphs. In this article, following a data-driven methodology, we first conduct an empirical analysis to uncover the temporal organizing principles of butterflies in real streaming graphs, and then we introduce an approximate adaptive window-based algorithm, sGrapp, for counting butterflies, as well as its optimized version, sGrapp-x. sGrapp is designed to operate efficiently and effectively over any graph stream with any temporal behavior. Experimental studies of sGrapp and sGrapp-x show superior performance in terms of both accuracy and efficiency.
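For reference, and purely as an illustrative baseline rather than the sGrapp algorithm itself (all names below are ours), a minimal exact counter shows why butterfly counting is expensive: it keeps a wedge count for every pair of co-neighboring left nodes, which grows quickly on large bipartite graphs.

```python
from collections import defaultdict
from itertools import combinations

def count_butterflies(edges):
    # map each right node to its set of left neighbors
    radj = defaultdict(set)
    for u, v in edges:
        radj[v].add(u)
    # wedge count per unordered pair of left nodes sharing a right node
    wedges = defaultdict(int)
    for lefts in radj.values():
        for a, b in combinations(sorted(lefts), 2):
            wedges[(a, b)] += 1
    # a pair with w common right neighbors forms C(w, 2) butterflies
    return sum(w * (w - 1) // 2 for w in wedges.values())

# K_{2,2} contains exactly one butterfly
print(count_butterflies([(0, "x"), (0, "y"), (1, "x"), (1, "y")]))  # -> 1
```

The quadratic blow-up of the wedge table is what approximate, window-based schemes like sGrapp avoid.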


2021, Vol. 151, pp. 393-416
Author(s): Ewan Davies, Matthew Jenssen, Will Perkins

Mathematics, 2021, Vol. 9(21), pp. 2764
Author(s): Rasul Kochkarov

NP-complete problems in graphs, such as enumeration and the selection of subgraphs with given characteristics, become especially relevant for large graphs and networks. Herein, particular problem statements with constraints are proposed for solving such problems, and subclasses of graphs are distinguished. We propose the class of prefractal graphs and review particular statements of NP-complete problems. As examples, algorithms for finding spanning trees and for packing bipartite graphs are proposed. The developed algorithms are polynomial, build on well-known algorithms, and are used in the form of procedures. We propose using the class of prefractal graphs as a tool for studying NP-complete problems and for identifying conditions for their solvability. Using prefractal graphs to model large graphs and networks, it is possible to obtain approximate solutions, and some exact solutions, for problems on natural objects such as social networks and transport networks.
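As a minimal sketch of the kind of well-known polynomial procedure the abstract refers to (generic BFS spanning-tree search; this is not the paper's prefractal-specific algorithm, and the names are ours):

```python
from collections import deque

def bfs_spanning_tree(adj, root):
    # adj: dict mapping node -> iterable of neighbors (connected graph)
    tree, seen, queue = [], {root}, deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)          # first visit: (u, v) is a tree edge
                tree.append((u, v))
                queue.append(v)
    return tree                      # |V| - 1 edges for a connected graph

print(bfs_spanning_tree({0: [1, 2], 1: [0, 2], 2: [0, 1]}, 0))  # [(0, 1), (0, 2)]
```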


2021
Author(s): Xin Sun, Xin Huang, Zitan Sun, Di Jin

Author(s): Daniel Lacker, Agathe Soret

We study a class of linear-quadratic stochastic differential games in which each player interacts directly only with its nearest neighbors in a given graph. We find a semiexplicit Markovian equilibrium for any transitive graph, in terms of the empirical eigenvalue distribution of the graph’s normalized Laplacian matrix. This facilitates large-population asymptotics for various graph sequences, with several sparse and dense examples discussed in detail. In particular, the mean field game is the correct limit only in the dense graph case, that is, when the degrees diverge in a suitable sense. Although equilibrium strategies are nonlocal, depending on the behavior of all players, we use a correlation decay estimate to prove a propagation of chaos result in both the dense and sparse regimes, with the sparse case owing to the large distances between typical vertices. Without assuming the graphs are transitive, we show also that the mean field game solution can be used to construct decentralized approximate equilibria on any sufficiently dense graph sequence.
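The equilibrium above is expressed through the empirical eigenvalue distribution of the graph's normalized Laplacian, $L = I - D^{-1/2} A D^{-1/2}$. A short sketch of computing that spectrum (function name ours; standard numpy calls):

```python
import numpy as np

def normalized_laplacian_spectrum(A):
    # A: symmetric adjacency matrix with no isolated vertices
    d = A.sum(axis=1)
    Dinv = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(A)) - Dinv @ A @ Dinv
    return np.sort(np.linalg.eigvalsh(L))  # real eigenvalues in [0, 2]

# cycle graph C_4: spectrum {0, 1, 1, 2}
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
print(normalized_laplacian_spectrum(A))
```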


2021
Author(s): Trevor Steil, Geoffrey Sanders, Roger Pearce

Computers, 2021, Vol. 10(9), pp. 115
Author(s): Péter Marjai, Bence Szabari, Attila Kiss

Graphs can be found in almost every part of modern life: social networks, road networks, biology, and so on. Finding the most important nodes is a vital issue. To date, numerous centrality measures have been proposed to address this problem; however, each has its drawbacks, for example, not scaling well to large graphs. In this paper, we investigate the ranking accuracy and the execution time of a method that uses graph clustering to reduce the time needed to identify the vital nodes. With graph clustering, neighboring nodes are grouped into communities. These communities are then used to create subgraphs of the original graph, which are smaller and easier to measure. To assess the method, we investigate different aspects of accuracy. First, we compare the top 10 nodes produced by the original closeness and betweenness measures with the nodes produced by this method. Then, we examine what percentage of the first n nodes are shared between the original and the clustered rankings. Centrality measures also assign a value to each node, so lastly we investigate the sum of the centrality values of the top n nodes. We also evaluate the runtime of the investigated method and of the original measures, both in a plain implementation and with the use of a graph database. Based on our experiments, our method greatly reduces the time consumption of the investigated centrality measures, especially in the case of the Louvain algorithm. The first accuracy experiment showed that examining only the top 10 nodes is not enough to properly evaluate precision. The second experiment showed that the investigated algorithm, on par with the Paris algorithm, has around 45–60% accuracy in the case of betweenness centrality. On the other hand, the last experiment showed that the investigated method has high accuracy in the case of closeness centrality, especially with the Louvain clustering algorithm.
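A minimal sketch of the pipeline as we read this abstract: cluster with Louvain, compute closeness inside each community subgraph, and rank nodes by their within-community score. The networkx calls are real, but the composition and names are ours, not the paper's code.

```python
import networkx as nx

def clustered_top_k(G, k=10):
    scores = {}
    # each community induces a smaller subgraph that is cheap to measure
    for comm in nx.community.louvain_communities(G, seed=42):
        sub = G.subgraph(comm)
        scores.update(nx.closeness_centrality(sub))
    # rank all nodes by their within-community centrality
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(clustered_top_k(nx.karate_club_graph(), k=10))
```

The savings come from closeness (and betweenness) being superlinear in graph size, so measuring several small subgraphs is much cheaper than measuring one large graph.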


2021, Vol. 72, pp. 39-67
Author(s): Shaowei Cai, Jinkun Lin, Yiyuan Wang, Darren Strash

This paper explores techniques to quickly solve the maximum weight clique problem (MWCP) in very large sparse graphs. Due to their size, and the hardness of MWCP, it is infeasible to solve many of these graphs with exact algorithms. Although recent heuristic algorithms make progress in solving MWCP in large graphs, they still need considerable time to reach a high-quality solution. In this work, we focus on solving MWCP for large sparse graphs within a short time limit. We propose a new method for MWCP which interleaves clique finding with data reduction rules. We propose novel ideas to make this process efficient, and develop an algorithm called FastWClq. Experiments on a broad range of large sparse graphs show that FastWClq finds better solutions than state-of-the-art algorithms, while its running time is much shorter than the competitors' for most instances. Further, FastWClq proves the optimality of its solutions for roughly half of the graphs, all with at least 10^5 vertices, with an average time of 21 seconds.
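To make “interleaving clique finding with data reduction rules” concrete, here is a hedged sketch of one weight-based reduction rule in the spirit of FastWClq (our paraphrase, not the published implementation): a vertex whose own weight plus the total weight of its neighbors cannot exceed the best clique weight found so far can never improve the solution, so it is deleted, which may in turn trigger further deletions.

```python
def reduce_graph(adj, w, best):
    # adj: dict node -> set of neighbors; w: node weights;
    # best: weight of the heaviest clique found so far
    changed = True
    while changed:
        changed = False
        for v in list(adj):
            # upper bound on the weight of any clique containing v
            if w[v] + sum(w[u] for u in adj[v]) <= best:
                for u in adj[v]:
                    adj[u].discard(v)
                del adj[v]
                changed = True
    return adj

w = {0: 3, 1: 3, 2: 3, 3: 1}
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(reduce_graph(adj, w, best=5))  # vertex 3 is pruned
```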


2021, Vol. 7, pp. e699
Author(s): Martin Mirakyan

Betweenness centrality is a popular measure in network analysis that aims to describe the importance of nodes in a graph. It accounts for the fraction of shortest paths passing through a given node and is a key measure in many applications, including community detection and network dismantling. Computing betweenness centrality for each node in a graph requires an excessive amount of computing power, especially for large graphs. On the other hand, in many applications the main interest lies in finding the top-k most important nodes in the graph. Therefore, several approximation algorithms have been proposed to solve the problem faster. Some recent approaches propose using shallow graph convolutional networks to approximate the top-k nodes with the highest betweenness-centrality scores. This work presents a deep graph convolutional neural network that outputs a rank score for each node in a given graph. With careful optimization and regularization tricks, including an extended version of DropEdge named Progressive-DropEdge, the system achieves better results than current approaches. Experiments on both real-world and synthetic datasets show that the presented algorithm is an order of magnitude faster at inference and requires several times fewer resources and less time to train.
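For context, the exact baseline such approximators are compared against is Brandes-style betweenness followed by a top-k selection. A minimal version with networkx (names ours) looks like the sketch below; its O(|V||E|) cost on unweighted graphs is exactly why approximation is attractive.

```python
import networkx as nx

def top_k_betweenness(G, k=10):
    # exact betweenness for every node (Brandes' algorithm inside networkx)
    bc = nx.betweenness_centrality(G)
    # keep only the k highest-ranked nodes
    return sorted(bc, key=bc.get, reverse=True)[:k]

print(top_k_betweenness(nx.karate_club_graph(), k=5))
```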

