Large Graphs: Recently Published Documents

Total documents: 598 (last five years: 146)
H-index: 41 (last five years: 5)

2022, Vol. 40(4), pp. 1-45
Author(s): Weiren Yu, Julie McCann, Chengyuan Zhang, Hakan Ferhatosmanoglu

SimRank is an attractive link-based similarity measure used in fields such as Web search and sociometry. However, the existing deterministic method by Kusumoto et al. [24] for retrieving SimRank does not always produce high-quality similarity results, as it fails to accurately obtain the diagonal correction matrix D. Moreover, SimRank has a “connectivity trait” problem: increasing the number of paths between a pair of nodes decreases its similarity score. The best-known remedy, SimRank++ [1], cannot completely fix this problem, since its score is still zero whenever two nodes have no common in-neighbors. In this article, we study fast, high-quality link-based similarity search on billion-scale graphs. (1) We first devise a “varied-D” method to accurately compute SimRank in linear memory. We also aggregate duplicate computations, which reduces the time of [24] from quadratic to linear in the number of iterations. (2) We propose a novel “cosine-based” SimRank model to circumvent the “connectivity trait” problem. (3) To substantially speed up partial-pairs “cosine-based” SimRank search on large graphs, we devise an efficient dimensionality reduction algorithm, PSR#, with guaranteed accuracy. (4) We give mathematical insights into the semantic difference between SimRank and its variant, and correct an argument in [24] that “if D is replaced by a scaled identity matrix (1-γ)I, their top-K rankings will not be affected much”. (5) We propose a novel method that can accurately convert from Li et al.'s SimRank S̃ to Jeh and Widom's SimRank S. (6) We propose GSR#, a generalisation of our “cosine-based” SimRank model, to quantify pairwise similarities across two distinct graphs, unlike SimRank, which would assess nodes across two graphs as completely dissimilar. Extensive experiments on various datasets demonstrate the superiority of our proposed approaches in terms of search quality, computational efficiency, accuracy, and scalability on billion-edge graphs.
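For orientation, the two formulations contrasted in point (5) are the standard ones from the SimRank literature (notation ours, not from this listing). Jeh and Widom's model satisfies

$$S \,=\, c\,Q^{\top} S\,Q + D,$$

where $Q$ is the column-normalized adjacency matrix, $c \in (0,1)$ is the decay factor (the γ above), and $D$ is the diagonal correction matrix that is hard to obtain exactly. Li et al.'s variant replaces $D$ with a scaled identity:

$$\tilde{S} \,=\, c\,Q^{\top} \tilde{S}\,Q + (1-c)\,I.$$

Computing $D$ accurately is precisely what the “varied-D” method in (1) targets, and (5) concerns converting between $\tilde{S}$ and $S$.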


2022, Vol. 16(4), pp. 1-43
Author(s): Aida Sheshbolouki, M. Tamer Özsu

We study the fundamental problem of counting butterflies (i.e., (2,2)-bicliques) in bipartite streaming graphs. Similar to triangles in unipartite graphs, enumerating butterflies is crucial to understanding the structure of bipartite graphs. This benefits many applications where studying the cohesion of graph-shaped data is of particular interest. Examples include investigating the structure of computational graphs or input graphs to algorithms, as well as dynamic phenomena and analytic tasks over complex real graphs. Butterfly counting is computationally expensive, and known techniques do not scale to large graphs; the problem is even harder in streaming graphs. In this article, following a data-driven methodology, we first conduct an empirical analysis to uncover the temporal organizing principles of butterflies in real streaming graphs, and then we introduce an approximate adaptive window-based algorithm, sGrapp, for counting butterflies, as well as its optimized version, sGrapp-x. sGrapp is designed to operate efficiently and effectively over any graph stream with any temporal behavior. Experimental studies of sGrapp and sGrapp-x show superior performance in terms of both accuracy and efficiency.
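For reference, and purely as an illustrative baseline rather than the sGrapp algorithm itself (all names below are ours), a minimal exact counter shows why butterfly counting is expensive: it keeps a wedge count for every pair of co-neighboring left nodes, which grows quickly on large bipartite graphs.

```python
from collections import defaultdict
from itertools import combinations

def count_butterflies(edges):
    # map each right node to its set of left neighbors
    radj = defaultdict(set)
    for u, v in edges:
        radj[v].add(u)
    # wedge count per unordered pair of left nodes sharing a right node
    wedges = defaultdict(int)
    for lefts in radj.values():
        for a, b in combinations(sorted(lefts), 2):
            wedges[(a, b)] += 1
    # a pair with w common right neighbors forms C(w, 2) butterflies
    return sum(w * (w - 1) // 2 for w in wedges.values())

# K_{2,2} contains exactly one butterfly
print(count_butterflies([(0, "x"), (0, "y"), (1, "x"), (1, "y")]))  # -> 1
```

The quadratic blow-up of the wedge table is what approximate, window-based schemes like sGrapp avoid.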


2021, Vol. 151, pp. 393-416
Author(s): Ewan Davies, Matthew Jenssen, Will Perkins

Mathematics, 2021, Vol. 9(21), pp. 2764
Author(s): Rasul Kochkarov

NP-complete problems in graphs, such as enumeration and the selection of subgraphs with given characteristics, become especially relevant for large graphs and networks. Herein, particular problem statements with constraints are proposed for solving such problems, and subclasses of graphs are distinguished. We propose the class of prefractal graphs and review particular statements of NP-complete problems. As examples, algorithms for finding spanning trees and for packing bipartite graphs are proposed. The developed algorithms are polynomial, build on well-known algorithms, and are used in the form of procedures. We propose using the class of prefractal graphs as a tool for studying NP-complete problems and for identifying conditions for their solvability. Using prefractal graphs to model large graphs and networks, it is possible to obtain approximate solutions, and some exact solutions, for problems on natural objects such as social networks and transport networks.
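As a minimal sketch of the kind of well-known polynomial procedure the abstract refers to (generic BFS spanning-tree search; this is not the paper's prefractal-specific algorithm, and the names are ours):

```python
from collections import deque

def bfs_spanning_tree(adj, root):
    # adj: dict mapping node -> iterable of neighbors (connected graph)
    tree, seen, queue = [], {root}, deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)          # first visit: (u, v) is a tree edge
                tree.append((u, v))
                queue.append(v)
    return tree                      # |V| - 1 edges for a connected graph

print(bfs_spanning_tree({0: [1, 2], 1: [0, 2], 2: [0, 1]}, 0))  # [(0, 1), (0, 2)]
```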


2021
Author(s): Xin Sun, Xin Huang, Zitan Sun, Di Jin

Author(s): Daniel Lacker, Agathe Soret

We study a class of linear-quadratic stochastic differential games in which each player interacts directly only with its nearest neighbors in a given graph. We find a semiexplicit Markovian equilibrium for any transitive graph, in terms of the empirical eigenvalue distribution of the graph’s normalized Laplacian matrix. This facilitates large-population asymptotics for various graph sequences, with several sparse and dense examples discussed in detail. In particular, the mean field game is the correct limit only in the dense graph case, that is, when the degrees diverge in a suitable sense. Although equilibrium strategies are nonlocal, depending on the behavior of all players, we use a correlation decay estimate to prove a propagation of chaos result in both the dense and sparse regimes, with the sparse case owing to the large distances between typical vertices. Without assuming the graphs are transitive, we show also that the mean field game solution can be used to construct decentralized approximate equilibria on any sufficiently dense graph sequence.
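The equilibrium above is expressed through the empirical eigenvalue distribution of the graph's normalized Laplacian, $L = I - D^{-1/2} A D^{-1/2}$. A short sketch of computing that spectrum (function name ours; standard numpy calls):

```python
import numpy as np

def normalized_laplacian_spectrum(A):
    # A: symmetric adjacency matrix with no isolated vertices
    d = A.sum(axis=1)
    Dinv = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(A)) - Dinv @ A @ Dinv
    return np.sort(np.linalg.eigvalsh(L))  # real eigenvalues in [0, 2]

# cycle graph C_4: spectrum {0, 1, 1, 2}
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], dtype=float)
print(normalized_laplacian_spectrum(A))
```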


2021
Author(s): Trevor Steil, Geoffrey Sanders, Roger Pearce

Computers, 2021, Vol. 10(9), pp. 115
Author(s): Péter Marjai, Bence Szabari, Attila Kiss

Graphs can be found in almost every part of modern life: social networks, road networks, biology, and so on. Finding the most important nodes is a vital issue. To date, numerous centrality measures have been proposed to address this problem; however, each has its drawbacks, for example, not scaling well to large graphs. In this paper, we investigate the ranking accuracy and the execution time of a method that uses graph clustering to reduce the time needed to identify the vital nodes. With graph clustering, neighboring nodes are grouped into communities. These communities are then used to create subgraphs of the original graph, which are smaller and easier to measure. To assess the method, we investigate different aspects of accuracy. First, we compare the top 10 nodes produced by the original closeness and betweenness measures with the nodes produced by this method. Then, we examine what percentage of the first n nodes are shared between the original and the clustered rankings. Centrality measures also assign a value to each node, so lastly we investigate the sum of the centrality values of the top n nodes. We also evaluate the runtime of the investigated method and of the original measures, both in a plain implementation and with the use of a graph database. Based on our experiments, our method greatly reduces the time consumption of the investigated centrality measures, especially in the case of the Louvain algorithm. The first accuracy experiment showed that examining only the top 10 nodes is not enough to properly evaluate precision. The second experiment showed that the investigated algorithm, on par with the Paris algorithm, has around 45–60% accuracy in the case of betweenness centrality. On the other hand, the last experiment showed that the investigated method has high accuracy in the case of closeness centrality, especially with the Louvain clustering algorithm.
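A minimal sketch of the pipeline as we read this abstract: cluster with Louvain, compute closeness inside each community subgraph, and rank nodes by their within-community score. The networkx calls are real, but the composition and names are ours, not the paper's code.

```python
import networkx as nx

def clustered_top_k(G, k=10):
    scores = {}
    # each community induces a smaller subgraph that is cheap to measure
    for comm in nx.community.louvain_communities(G, seed=42):
        sub = G.subgraph(comm)
        scores.update(nx.closeness_centrality(sub))
    # rank all nodes by their within-community centrality
    return sorted(scores, key=scores.get, reverse=True)[:k]

print(clustered_top_k(nx.karate_club_graph(), k=10))
```

The savings come from closeness (and betweenness) being superlinear in graph size, so measuring several small subgraphs is much cheaper than measuring one large graph.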


2021, Vol. 72, pp. 39-67
Author(s): Shaowei Cai, Jinkun Lin, Yiyuan Wang, Darren Strash

This paper explores techniques to quickly solve the maximum weight clique problem (MWCP) in very large sparse graphs. Due to their size, and the hardness of MWCP, it is infeasible to solve many of these graphs with exact algorithms. Although recent heuristic algorithms make progress in solving MWCP in large graphs, they still need considerable time to reach a high-quality solution. In this work, we focus on solving MWCP for large sparse graphs within a short time limit. We propose a new method for MWCP which interleaves clique finding with data reduction rules. We propose novel ideas to make this process efficient, and develop an algorithm called FastWClq. Experiments on a broad range of large sparse graphs show that FastWClq finds better solutions than state-of-the-art algorithms, while its running time is much shorter than the competitors' for most instances. Further, FastWClq proves the optimality of its solutions for roughly half of the graphs, all with at least 10^5 vertices, with an average time of 21 seconds.
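To make “interleaving clique finding with data reduction rules” concrete, here is a hedged sketch of one weight-based reduction rule in the spirit of FastWClq (our paraphrase, not the published implementation): a vertex whose own weight plus the total weight of its neighbors cannot exceed the best clique weight found so far can never improve the solution, so it is deleted, which may in turn trigger further deletions.

```python
def reduce_graph(adj, w, best):
    # adj: dict node -> set of neighbors; w: node weights;
    # best: weight of the heaviest clique found so far
    changed = True
    while changed:
        changed = False
        for v in list(adj):
            # upper bound on the weight of any clique containing v
            if w[v] + sum(w[u] for u in adj[v]) <= best:
                for u in adj[v]:
                    adj[u].discard(v)
                del adj[v]
                changed = True
    return adj

w = {0: 3, 1: 3, 2: 3, 3: 1}
adj = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
print(reduce_graph(adj, w, best=5))  # vertex 3 is pruned
```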


2021, Vol. 7, pp. e699
Author(s): Martin Mirakyan

Betweenness centrality is a popular measure in network analysis that aims to describe the importance of nodes in a graph. It accounts for the fraction of shortest paths passing through a given node and is a key measure in many applications, including community detection and network dismantling. Computing betweenness centrality for each node in a graph requires an excessive amount of computing power, especially for large graphs. On the other hand, in many applications the main interest lies in finding the top-k most important nodes in the graph. Therefore, several approximation algorithms have been proposed to solve the problem faster. Some recent approaches propose using shallow graph convolutional networks to approximate the top-k nodes with the highest betweenness-centrality scores. This work presents a deep graph convolutional neural network that outputs a rank score for each node in a given graph. With careful optimization and regularization tricks, including an extended version of DropEdge named Progressive-DropEdge, the system achieves better results than current approaches. Experiments on both real-world and synthetic datasets show that the presented algorithm is an order of magnitude faster at inference and requires several times fewer resources and less time to train.
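For context, the exact baseline such approximators are compared against is Brandes-style betweenness followed by a top-k selection. A minimal version with networkx (names ours) looks like the sketch below; its O(|V||E|) cost on unweighted graphs is exactly why approximation is attractive.

```python
import networkx as nx

def top_k_betweenness(G, k=10):
    # exact betweenness for every node (Brandes' algorithm inside networkx)
    bc = nx.betweenness_centrality(G)
    # keep only the k highest-ranked nodes
    return sorted(bc, key=bc.get, reverse=True)[:k]

print(top_k_betweenness(nx.karate_club_graph(), k=5))
```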

