massive graphs
Recently Published Documents


TOTAL DOCUMENTS: 100 (FIVE YEARS: 34)

H-INDEX: 15 (FIVE YEARS: 2)

2021 ◽  
Vol 11 (19) ◽  
pp. 9051
Author(s):  
Xijuan Liu ◽  
Xiaoyang Wang

Cohesive subgraph identification is a fundamental problem in bipartite graph analysis. In real applications, to better represent the co-relationship between entities, edges are usually associated with weights or frequencies, which are neglected by most existing research. To fill the gap, we propose a new cohesive subgraph model, (k,ω)-core, by considering both subgraph cohesiveness and frequency for weighted bipartite graphs. Specifically, (k,ω)-core requires each node on the left layer to have at least k neighbors (cohesiveness) and each node on the right layer to have a weight of at least ω (frequency). In real scenarios, different users may have different parameter requirements. To handle massive graphs and queries, index-based strategies are developed. In addition, effective optimization techniques are proposed to improve the index construction phase. Extensive experiments on six datasets validate the superiority of our proposed methods over the baseline.
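
The (k,ω)-core condition described above can be illustrated with a simple peeling loop. This is only a minimal sketch, not the index-based method of the paper: the function name `k_omega_core` is hypothetical, and it assumes that the weight of a right-layer node is the sum of the weights of its remaining incident edges, which the abstract does not spell out.

```python
def k_omega_core(edges, k, omega):
    """Peel a weighted bipartite graph given as (left, right, weight) triples.

    Assumption (not stated in the abstract): the weight of a right node is
    the sum of the weights of its remaining incident edges.
    Returns the surviving (left_nodes, right_nodes) as two sets.
    """
    left_adj, right_adj = {}, {}
    for u, v, w in edges:
        left_adj.setdefault(u, {})[v] = w
        right_adj.setdefault(v, {})[u] = w

    changed = True
    while changed:
        changed = False
        # Remove left nodes with fewer than k remaining neighbors.
        for u in [u for u, nbrs in left_adj.items() if len(nbrs) < k]:
            for v in left_adj.pop(u):
                right_adj[v].pop(u, None)
            changed = True
        # Remove right nodes whose remaining weight is below omega.
        for v in [v for v, nbrs in right_adj.items()
                  if sum(nbrs.values()) < omega]:
            for u in right_adj.pop(v):
                left_adj[u].pop(v, None)
            changed = True
    return set(left_adj), set(right_adj)
```

Each removal can invalidate nodes on the other layer, so the loop repeats until no node violates its constraint; the result is empty when no such subgraph exists.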


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Zhenqi Lu ◽  
Johan Wahlström ◽  
Arye Nehorai

Graph clustering, a fundamental technique in network science for understanding structures in complex systems, presents inherent problems. Though studied extensively in the literature, graph clustering in large systems remains particularly challenging because massive graphs incur a prohibitively large computational load. The heat kernel PageRank provides a quantitative ranking of nodes, and a local cluster can be efficiently found by performing a sweep over the heat kernel PageRank vector. But computing an exact heat kernel PageRank vector may be expensive, and approximate algorithms are often used instead. Most approximate algorithms compute the heat kernel PageRank vector on the whole graph, and thus are dependent on global structures. In this paper, we present an algorithm for approximating the heat kernel PageRank on a local subgraph. Moreover, we show that the number of computations required by the proposed algorithm is sublinear in terms of the expected size of the local cluster of interest, and that it provides a good approximation of the heat kernel PageRank, with approximation errors bounded by a probabilistic guarantee. Numerical experiments verify that the local clustering algorithm using our approximate heat kernel PageRank achieves state-of-the-art performance.
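
To make the "sweep over the heat kernel PageRank vector" concrete, here is a minimal sketch. It is not the local approximation algorithm proposed in the paper: it evaluates a truncated series for the usual definition e^{-t} Σ_k (t^k / k!) s P^k on the whole graph, then picks the sweep prefix with the lowest conductance. The function names, the temperature `t`, the truncation length `terms`, and the adjacency format (a dict of neighbor sets, no self-loops or isolated nodes) are all assumptions made for illustration.

```python
import math


def heat_kernel_pagerank(adj, seed, t=2.0, terms=30):
    """Truncated series e^{-t} * sum_k (t^k / k!) * s P^k for a single seed."""
    walk = {seed: 1.0}      # the distribution s P^k, starting at k = 0
    rho = {}
    coeff = math.exp(-t)    # e^{-t} * t^k / k!
    for k in range(terms):
        for v, p in walk.items():
            rho[v] = rho.get(v, 0.0) + coeff * p
        nxt = {}
        for v, p in walk.items():
            share = p / len(adj[v])      # one uniform random-walk step
            for u in adj[v]:
                nxt[u] = nxt.get(u, 0.0) + share
        walk = nxt
        coeff *= t / (k + 1)
    return rho


def sweep_cut(adj, rho):
    """Return the prefix of the degree-normalized ranking with least conductance."""
    order = sorted(rho, key=lambda v: rho[v] / len(adj[v]), reverse=True)
    total_vol = sum(len(nbrs) for nbrs in adj.values())
    best, best_phi = None, float("inf")
    inside, vol, cut = set(), 0, 0
    for v in order:
        inside.add(v)
        vol += len(adj[v])
        internal = sum(1 for u in adj[v] if u in inside)
        # Edges to earlier members stop being cut; the rest become cut edges.
        cut += len(adj[v]) - 2 * internal
        denom = min(vol, total_vol - vol)
        if denom == 0:
            break
        phi = cut / denom
        if phi < best_phi:
            best_phi, best = phi, set(inside)
    return best, best_phi
```

On two triangles joined by a single edge, seeding in one triangle recovers that triangle as the cluster, since cutting the bridge gives the lowest conductance along the sweep.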


Author(s):  
Hua Jiang ◽  
Dongming Zhu ◽  
Zhichao Xie ◽  
Shaowen Yao ◽  
Zhang-Hua Fu

Given an undirected graph, the Maximum k-plex Problem (MKP) is to find a largest induced subgraph in which each vertex has at most k−1 non-adjacent vertices. The problem arises in social network analysis and has found applications in many important areas employing graph-based data mining. Existing exact algorithms usually implement a branch-and-bound approach that requires a tight upper bound to reduce the search space. In this paper, we propose a new upper bound for MKP, obtained by partitioning the candidate vertex set with respect to the solution under construction. We implement a new branch-and-bound algorithm that employs the upper bound to reduce the number of branches. Experimental results show that the upper bound is very effective in reducing the search space. The new algorithm significantly outperforms the state-of-the-art algorithms on real-world massive graphs, DIMACS graphs and random graphs.
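
The k-plex definition above translates directly into a membership check. The sketch below only verifies the definition and does not reproduce the paper's upper bound or branch-and-bound search; the function name `is_k_plex` and the adjacency format (a dict mapping each vertex to a set of neighbors, without self-loops) are assumptions.

```python
def is_k_plex(adj, subset, k):
    """Check whether `subset` induces a k-plex in the graph `adj`.

    Each vertex may miss at most k - 1 of the other members, which is the
    same as having at least len(subset) - k neighbors inside the subset.
    """
    members = set(subset)
    need = len(members) - k
    return all(len(adj[v] & members) >= need for v in members)
```

With k = 1 the check reduces to testing for a clique, so a three-vertex path is a 2-plex but not a 1-plex.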


2021 ◽  
Vol 15 (5) ◽  
pp. 1-52
Author(s):  
Lorenzo De Stefani ◽  
Erisa Terolli ◽  
Eli Upfal

We introduce Tiered Sampling, a novel technique for estimating the count of sparse motifs in massive graphs whose edges are observed in a stream. Our technique requires only a single pass over the data and uses a memory of fixed size M, which can be orders of magnitude smaller than the number of edges. Our methods address the challenging task of counting sparse motifs (sub-graph patterns) that have a low probability of appearing in a sample of M edges in the graph, which is the maximum amount of data available to the algorithms in each step. To obtain an unbiased and low-variance estimate of the count, we partition the available memory into tiers (layers) of reservoir samples. While the base layer is a standard reservoir sample of edges, other layers are reservoir samples of sub-structures of the desired motif. By storing more frequent sub-structures of the motif, we increase the probability of detecting an occurrence of the sparse motif we are counting, thus decreasing the variance and error of the estimate. While we focus on the design and analysis of algorithms for counting 4-cliques, we present a method that generalizes Tiered Sampling to obtain high-quality estimates for the number of occurrences of any sub-graph of interest, while reducing the analysis effort due to specific properties of the pattern of interest. We present a complete analysis and an extensive experimental evaluation of our proposed method using both synthetic and real-world data. Our results demonstrate the advantage of our method in obtaining high-quality approximations for the number of 4- and 5-cliques in large graphs using a very limited amount of memory, significantly outperforming the single edge sample approach for counting sparse motifs in large-scale graphs.
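
The base layer mentioned above is a standard reservoir sample of edges. As a point of reference, the following sketch shows classic single-pass reservoir sampling with a fixed memory of m items; it does not implement the additional tiers or the motif-count estimators of Tiered Sampling, and the function name `reservoir_sample` and its parameters are assumptions.

```python
import random


def reservoir_sample(stream, m, rng=None):
    """Keep a uniform random sample of at most m items from a one-pass stream."""
    rng = rng or random.Random()
    sample = []
    for t, edge in enumerate(stream, start=1):
        if len(sample) < m:
            sample.append(edge)      # fill the reservoir first
        else:
            j = rng.randrange(t)     # uniform index in [0, t)
            if j < m:                # happens with probability m / t
                sample[j] = edge     # evict a uniformly chosen slot
    return sample
```

After t edges have been seen, every edge is in the reservoir with probability m / t, which is what lets an estimator reweight the patterns it finds among the sampled edges.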


2021 ◽  
Author(s):  
Yuli Jiang ◽  
Xin Huang ◽  
Hong Cheng

Author(s):  
Xin Jin ◽  
Zhengyi Yang ◽  
Xuemin Lin ◽  
Shiyu Yang ◽  
Lu Qin ◽  
...  

2021 ◽  
Vol 127 ◽  
pp. 105131
Author(s):  
Xiaoyu Chen ◽  
Yi Zhou ◽  
Jin-Kao Hao ◽  
Mingyu Xiao
