Triangle Counting by Adaptively Resampling over Evolving Graph Streams

2021 ◽  
Author(s):  
Wei Xuan
Algorithmica ◽  
2015 ◽  
Vol 76 (1) ◽  
pp. 259-278 ◽  
Author(s):  
Laurent Bulteau ◽  
Vincent Froese ◽  
Konstantin Kutzkov ◽  
Rasmus Pagh

2020 ◽  
Vol 29 (6) ◽  
pp. 1501-1525
Author(s):  
Dongjin Lee ◽  
Kijung Shin ◽  
Christos Faloutsos

2020 ◽  
Vol 14 (2) ◽  
pp. 1-39 ◽  
Author(s):  
Kijung Shin ◽  
Sejoon Oh ◽  
Jisu Kim ◽  
Bryan Hooi ◽  
Christos Faloutsos

Algorithms ◽  
2021 ◽  
Vol 14 (8) ◽  
pp. 221
Author(s):  
Zhihui Du ◽  
Oliver Alvarado Rodriguez ◽  
Joseph Patchett ◽  
David A. Bader

Data from emerging applications, such as cybersecurity and social networking, can be abstracted as graphs whose edges are updated sequentially in the form of a stream. The challenging problem of interactive graph stream analytics is the quick response of the queries on terabyte and beyond graph stream data from end users. In this paper, a succinct and efficient double index data structure is designed to build the sketch of a graph stream to meet general queries. A single pass stream model, which includes general sketch building, distributed sketch based analysis algorithms and regression based approximation solution generation, is developed, and a typical graph algorithm—triangle counting—is implemented to evaluate the proposed method. Experimental results on power law and normal distribution graph streams show that our method can generate accurate results (mean relative error less than 4%) with a high performance. All our methods and code have been implemented in an open source framework, Arkouda, and are available from our GitHub repository, Bader-Research. This work provides the large and rapidly growing Python community with a powerful way to handle terabyte and beyond graph stream data using their laptops.


2021 ◽  
Vol 15 (3) ◽  
pp. 1-30
Author(s):  
Kijung Shin ◽  
Euiwoong Lee ◽  
Jinoh Oh ◽  
Mohammad Hammoud ◽  
Christos Faloutsos

Given a graph stream, how can we estimate the number of triangles in it using multiple machines with limited storage? Specifically, how should edges be processed and sampled across the machines for rapid and accurate estimation? The count of triangles (i.e., cliques of size three) has proven useful in numerous applications, including anomaly detection, community detection, and link recommendation. For triangle counting in large and dynamic graphs, recent work has focused largely on streaming algorithms and distributed algorithms but little on their combinations for “the best of both worlds.” In this work, we propose CoCoS, a fast and accurate distributed streaming algorithm for estimating the counts of global triangles (i.e., all triangles) and local triangles incident to each node. Making one pass over the input stream, CoCoS carefully processes and stores the edges across multiple machines so that the redundant use of computational and storage resources is minimized. Compared to baselines, CoCoS is: (a) accurate: giving up to smaller estimation error; (b) fast: up to faster, scaling linearly with the size of the input stream; and (c) theoretically sound: yielding unbiased estimates.

