Many disjoint dense subgraphs versus large k-connected subgraphs in large graphs with given edge density

Faster Motif Counting via Succinct Color Coding and Adaptive Sampling

We address the problem of computing the distribution of induced connected subgraphs, aka graphlets or motifs, in large graphs. The current state-of-the-art algorithms estimate the motif counts via uniform sampling by leveraging the color coding technique by Alon, Yuster, and Zwick. In this work, we extend the applicability of this approach by introducing a set of algorithmic optimizations and techniques that reduce the running time and space usage of color coding and improve the accuracy of the counts. To this end, we first show how to optimize color coding to efficiently build a compact table of a representative subsample of all graphlets in the input graph. For 8-node motifs, we can build such a table in one hour for a graph with 65M nodes and 1.8B edges, which is times larger than the state of the art. We then introduce a novel adaptive sampling scheme that breaks the “additive error barrier” of uniform sampling, guaranteeing multiplicative approximations instead of just additive ones. This allows us to count not only the most frequent motifs, but also extremely rare ones. For instance, on one graph we accurately count nearly 10,000 distinct 8-node motifs whose relative frequency is so small that uniform sampling would literally take centuries to find them. Our results show that color coding is still the most promising approach to scalable motif counting.
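To make the color coding idea mentioned in the abstract concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it counts simple (non-induced) paths on k nodes rather than induced graphlets, and it samples colorings uniformly rather than adaptively. The function names `count_colorful_paths` and `estimate_k_paths`, and the adjacency-dict input format, are assumptions made for illustration only.

```python
import math
import random
from collections import defaultdict


def count_colorful_paths(adj, k, coloring):
    """Count simple paths on k >= 2 nodes whose nodes all have distinct colors.

    adj: dict mapping each node to an iterable of its neighbors (undirected).
    coloring: dict mapping each node to a color in range(k).
    """
    # table[v][mask] = number of colorful paths that end at v and use exactly
    # the colors in the bitmask `mask`.
    table = {v: defaultdict(int) for v in adj}
    for v in adj:
        table[v][1 << coloring[v]] = 1
    for _ in range(k - 1):  # grow every path by one node per round
        grown = {v: defaultdict(int) for v in adj}
        for v in adj:
            bit = 1 << coloring[v]
            for u in adj[v]:
                for mask, cnt in table[u].items():
                    if not mask & bit:  # v's color is still unused on the path
                        grown[v][mask | bit] += cnt
        table = grown
    total = sum(cnt for v in adj for cnt in table[v].values())
    return total // 2  # each undirected path is found once from each endpoint


def estimate_k_paths(adj, k, trials=100, seed=0):
    """Average the rescaled colorful counts over random colorings."""
    rng = random.Random(seed)
    # A fixed k-node path is colorful with probability k! / k^k, so the
    # colorful count is rescaled by the inverse of that probability.
    scale = k ** k / math.factorial(k)
    acc = 0
    for _ in range(trials):
        coloring = {v: rng.randrange(k) for v in adj}
        acc += count_colorful_paths(adj, k, coloring)
    return scale * acc / trials
```

The dynamic program only tracks which colors a path has used, not which nodes, which is what keeps the table small; distinct colors already guarantee distinct nodes.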

THE EXACT MINIMUM NUMBER OF TRIANGLES IN GRAPHS WITH GIVEN ORDER AND SIZE

Forum of Mathematics Pi ◽ 10.1017/fmp.2020.7 ◽ 2020 ◽ Vol 8

Author(s): Hong Liu ◽ Oleg Pikhurko ◽ Katherine Staden

Keyword(s): Exact Solution ◽ Edge Density ◽ Extremal Graphs ◽ Large Graphs ◽ Minimum Number

What is the minimum number of triangles in a graph of given order and size? Motivated by earlier results of Mantel and Turán, Rademacher solved the first nontrivial case of this problem in 1941. The problem was revived by Erdős in 1955; it is now known as the Erdős–Rademacher problem. After attracting much attention, it was solved asymptotically in a major breakthrough by Razborov in 2008. In this paper, we provide an exact solution for all large graphs whose edge density is bounded away from $1$, which in this range confirms a conjecture of Lovász and Simonovits from 1975. Furthermore, we give a description of the extremal graphs.

Efficient size-bounded community search over large networks

Proceedings of the VLDB Endowment ◽ 10.14778/3457390.3457407 ◽ 2021 ◽ Vol 14 (8) ◽ pp. 1441-1453

Author(s): Kai Yao ◽ Lijun Chang

Keyword(s): Heuristic Algorithm ◽ State Of The Art ◽ Minimum Degree ◽ Exact Algorithms ◽ Edge Density ◽ Search Problem ◽ Community Search ◽ Connected Subgraphs ◽ General Size ◽ Baseline Approach

The problem of community search, which aims to find a cohesive subgraph containing user-given query vertices, has been extensively studied recently. Most of the existing studies mainly focus on the cohesiveness of the returned community, while ignoring the size of the community, and may yield communities of very large sizes. However, many applications naturally require that the number of vertices/members in a community should fall within a certain range. In this paper, we design exact algorithms for the general size-bounded community search problem that aims to find a subgraph with the largest min-degree among all connected subgraphs that contain the query vertex q and have at least l and at most h vertices, where q, l, h are specified by the query. As the problem is NP-hard, we propose a branch-reduce-and-bound algorithm SC-BRB by developing nontrivial reducing techniques, upper bounding techniques, and branching techniques. Experiments on large real graphs show that SC-BRB on average increases the minimum degree of the community returned by the state-of-the-art heuristic algorithm GreedyF by a factor of 2.41 and increases the edge density by a factor of 2.2. In addition, SC-BRB is several orders of magnitude faster than a baseline approach, and all of our proposed techniques contribute to the efficiency of SC-BRB.
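The problem statement in this abstract can be illustrated with a small brute-force sketch in Python. This is not SC-BRB and uses none of its reducing, bounding, or branching techniques; it simply enumerates every candidate vertex set, so it is only usable on tiny graphs. The helper names are hypothetical, the graph is assumed to be a dict mapping each vertex to a set of neighbors, and edge density is assumed to mean 2m / (n(n-1)), which the abstract does not spell out.

```python
from itertools import combinations


def induced_min_degree(adj, nodes):
    """Minimum degree inside the subgraph induced by `nodes`."""
    nodes = set(nodes)
    return min(len(adj[v] & nodes) for v in nodes)


def edge_density(adj, nodes):
    """Assumed definition: edges present divided by edges possible."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    m = sum(len(adj[v] & nodes) for v in nodes) // 2
    return 2 * m / (n * (n - 1))


def is_connected(adj, nodes):
    """Depth-first search restricted to the induced subgraph."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for u in adj[v] & nodes:
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return seen == nodes


def brute_force_size_bounded(adj, q, l, h):
    """Best connected vertex set containing q with l <= size <= h.

    Exponential-time enumeration, for illustration on tiny graphs only.
    """
    others = sorted(v for v in adj if v != q)
    best, best_deg = None, -1
    for size in range(max(l, 1), h + 1):
        for rest in combinations(others, size - 1):
            cand = {q, *rest}
            if is_connected(adj, cand):
                deg = induced_min_degree(adj, cand)
                if deg > best_deg:  # keep the first set reaching the best value
                    best, best_deg = cand, deg
    return best, best_deg
```

On a triangle 0-1-2 with a pendant vertex 3 attached to 2, querying q = 0 with l = 3 and h = 4 selects the triangle, since adding the pendant vertex would lower the minimum degree from 2 to 1.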

Detecting dense subgraphs in complex networks based on edge density coefficient

2010 IEEE Fifth International Conference on Bio-Inspired Computing: Theories and Applications (BIC-TA) ◽ 10.1109/bicta.2010.5645354 ◽ 2010 ◽ Cited By ~ 1

Author(s): Hang Zhang ◽ Xiangzhen Zan ◽ Changcheng Huang ◽ Xiangou Zhu ◽ Chengwen Wu ◽ ...

Keyword(s): Complex Networks ◽ Edge Density ◽ Dense Subgraphs ◽ Density Coefficient

Querying Minimal Steiner Maximum-Connected Subgraphs in Large Graphs

Proceedings of the 25th ACM International on Conference on Information and Knowledge Management - CIKM '16 ◽ 10.1145/2983323.2983748 ◽ 2016 ◽ Cited By ~ 15

Author(s): Jiafeng Hu ◽ Xiaowei Wu ◽ Reynold Cheng ◽ Siqiang Luo ◽ Yixiang Fang

Keyword(s): Large Graphs ◽ Connected Subgraphs

The Laplacian Spectrum of Large Graphs Sampled from Graphons

IEEE Transactions on Network Science and Engineering ◽ 10.1109/tnse.2021.3069675 ◽ 2021 ◽ pp. 1-1

Author(s): Renato Vizuete ◽ Federica Garin ◽ Paolo Frasca

Keyword(s): Laplacian Spectrum ◽ Large Graphs

VColor*: a practical approach for coloring large graphs

Frontiers of Computer Science ◽ 10.1007/s11704-020-9205-y ◽ 2021 ◽ Vol 15 (4)

Author(s): Yun Peng ◽ Xin Lin ◽ Byron Choi ◽ Bingsheng He

Keyword(s): Practical Approach ◽ Large Graphs

Tiered Sampling

ACM Transactions on Knowledge Discovery from Data ◽ 10.1145/3441299 ◽ 2021 ◽ Vol 15 (5) ◽ pp. 1-52

Author(s): Lorenzo De Stefani ◽ Erisa Terolli ◽ Eli Upfal

Keyword(s): Large Scale ◽ Analysis Of Algorithms ◽ Base Layer ◽ Single Edge ◽ Real World Data ◽ High Quality ◽ Large Graphs ◽ Massive Graphs ◽ Variance Estimate ◽ Low Probability

We introduce Tiered Sampling, a novel technique for estimating the count of sparse motifs in massive graphs whose edges are observed in a stream. Our technique requires only a single pass on the data and uses a memory of fixed size M, which can be orders of magnitude smaller than the number of edges. Our methods address the challenging task of counting sparse motifs (sub-graph patterns) that have a low probability of appearing in a sample of M edges in the graph, which is the maximum amount of data available to the algorithms in each step. To obtain an unbiased and low-variance estimate of the count, we partition the available memory into tiers (layers) of reservoir samples. While the base layer is a standard reservoir sample of edges, other layers are reservoir samples of sub-structures of the desired motif. By storing more frequent sub-structures of the motif, we increase the probability of detecting an occurrence of the sparse motif we are counting, thus decreasing the variance and error of the estimate. While we focus on the design and analysis of algorithms for counting 4-cliques, we present a method which allows generalizing Tiered Sampling to obtain high-quality estimates for the number of occurrences of any sub-graph of interest, while reducing the analysis effort due to specific properties of the pattern of interest. We present a complete analysis and extensive experimental evaluation of our proposed method using both synthetic and real-world data. Our results demonstrate the advantage of our method in obtaining high-quality approximations for the number of 4- and 5-cliques in large graphs using a very limited amount of memory, significantly outperforming the single edge sample approach for counting sparse motifs in large-scale graphs.
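The base layer described in this abstract is a standard reservoir sample of edges. A minimal Python sketch of that building block follows, using the classic single-pass scheme with a fixed memory of M items. It covers only the base layer: the additional tiers of sub-structure samples and the count estimators of Tiered Sampling are not reproduced here, and the function name `reservoir_sample` is an assumption for illustration.

```python
import random


def reservoir_sample(stream, M, rng=None):
    """Keep a uniform random sample of at most M items from a stream.

    One pass, O(M) memory: after t items have been seen, every item is in
    the sample with probability min(1, M / t).
    """
    rng = rng or random.Random()
    sample = []
    for t, item in enumerate(stream, start=1):
        if t <= M:
            sample.append(item)  # the first M items fill the reservoir
        else:
            j = rng.randrange(t)  # uniform integer in [0, t)
            if j < M:  # happens with probability M / t
                sample[j] = item  # evict a uniformly chosen stored item
    return sample
```

For an edge stream, each item would be a pair such as `(u, v)`; a motif that needs several of its edges to be in the sample at the same time is then rarely observed, which is the low-probability problem the abstract refers to.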

On the number of connected subgraphs of graphs

Indian Journal of Pure and Applied Mathematics ◽ 10.1007/s13226-021-00061-4 ◽ 2021

Author(s): Dinesh Pandey ◽ Kamal Lochan Patra

Keyword(s): Connected Subgraphs

Landscape structure affects the sunflower visiting frequency of insect pollinators

Scientific Reports ◽ 10.1038/s41598-021-87650-9 ◽ 2021 ◽ Vol 11 (1)

Author(s): Károly Lajos ◽ Ferenc Samu ◽ Áron Domonkos Bihaly ◽ Dávid Fülöp ◽ Miklós Sárospataki

Keyword(s): Landscape Structure ◽ Honey Bees ◽ Agricultural Landscape ◽ Edge Density ◽ Wild Bees ◽ Natural Habitats ◽ Mass Flowering ◽ Positive Effects ◽ Pollinator Community ◽ Foraging Ranges

Mass-flowering crop monocultures, like sunflower, cannot harbour a permanent pollinator community. Their pollination is best secured if both managed honey bees and wild pollinators are present in the agricultural landscape. Semi-natural habitats are known to be the main foraging and nesting areas of wild pollinators, thus benefiting their populations, whereas crops flowering simultaneously may competitively dilute pollinator densities. In our study we asked how landscape structure affects major pollinator groups’ visiting frequency on 36 focal sunflower fields, hypothesising that herbaceous semi-natural (hSNH) and sunflower patches in the landscape neighbourhood will have a scale-dependent effect. We found that an increasing area and/or dispersion of hSNH areas enhanced the visitation of all pollinator groups. These positive effects were scale-dependent and corresponded well with the foraging ranges of the observed bee pollinators. In contrast, an increasing edge density of neighbouring sunflower fields resulted in considerably lower visiting frequencies of wild bees. Our results clearly indicate that the pollination of sunflower is dependent on the composition and configuration of the agricultural landscape. We conclude that an optimization of the pollination can be achieved if a sufficient amount of hSNH areas with good dispersion is provided and mass-flowering crops do not over-dominate the agricultural landscape.
