2PGraph: Accelerating GNN Training over Large Graphs on GPU Clusters

Author(s):  
Lizhi Zhang ◽  
Zhiquan Lai ◽  
Shengwei Li ◽  
Yu Tang ◽  
Feng Liu ◽  
...  
Keyword(s):  
2021 ◽  
Vol 15 (4) ◽  
Author(s):  
Yun Peng ◽  
Xin Lin ◽  
Byron Choi ◽  
Bingsheng He

2021 ◽  
Vol 22 (10) ◽  
pp. 5212
Author(s):  
Andrzej Bak

A key question confronting computational chemists concerns the preferable ligand geometry that fits complementarily into the receptor pocket. Typically, the postulated ‘bioactive’ 3D ligand conformation is constructed as a ‘sophisticated guess’ (unnecessarily geometry-optimized) mirroring the pharmacophore hypothesis—sometimes based on an erroneous prerequisite. Hence, 4D-QSAR scheme and its ‘dialects’ have been practically implemented as higher level of model abstraction that allows the examination of the multiple molecular conformation, orientation and protonation representation, respectively. Nearly a quarter of a century has passed since the eminent work of Hopfinger appeared on the stage; therefore the natural question occurs whether 4D-QSAR approach is still appealing to the scientific community? With no intention to be comprehensive, a review of the current state of art in the field of receptor-independent (RI) and receptor-dependent (RD) 4D-QSAR methodology is provided with a brief examination of the ‘mainstream’ algorithms. In fact, a myriad of 4D-QSAR methods have been implemented and applied practically for a diverse range of molecules. It seems that, 4D-QSAR approach has been experiencing a promising renaissance of interests that might be fuelled by the rising power of the graphics processing unit (GPU) clusters applied to full-atom MD-based simulations of the protein-ligand complexes.


2021 ◽  
Vol 15 (6) ◽  
pp. 1-27
Author(s):  
Marco Bressan ◽  
Stefano Leucci ◽  
Alessandro Panconesi

We address the problem of computing the distribution of induced connected subgraphs, aka graphlets or motifs , in large graphs. The current state-of-the-art algorithms estimate the motif counts via uniform sampling by leveraging the color coding technique by Alon, Yuster, and Zwick. In this work, we extend the applicability of this approach by introducing a set of algorithmic optimizations and techniques that reduce the running time and space usage of color coding and improve the accuracy of the counts. To this end, we first show how to optimize color coding to efficiently build a compact table of a representative subsample of all graphlets in the input graph. For 8-node motifs, we can build such a table in one hour for a graph with 65M nodes and 1.8B edges, which is times larger than the state of the art. We then introduce a novel adaptive sampling scheme that breaks the “additive error barrier” of uniform sampling, guaranteeing multiplicative approximations instead of just additive ones. This allows us to count not only the most frequent motifs, but also extremely rare ones. For instance, on one graph we accurately count nearly 10.000 distinct 8-node motifs whose relative frequency is so small that uniform sampling would literally take centuries to find them. Our results show that color coding is still the most promising approach to scalable motif counting.


2021 ◽  
Vol 15 (5) ◽  
pp. 1-52
Author(s):  
Lorenzo De Stefani ◽  
Erisa Terolli ◽  
Eli Upfal

We introduce Tiered Sampling , a novel technique for estimating the count of sparse motifs in massive graphs whose edges are observed in a stream. Our technique requires only a single pass on the data and uses a memory of fixed size M , which can be magnitudes smaller than the number of edges. Our methods address the challenging task of counting sparse motifs—sub-graph patterns—that have a low probability of appearing in a sample of M edges in the graph, which is the maximum amount of data available to the algorithms in each step. To obtain an unbiased and low variance estimate of the count, we partition the available memory into tiers (layers) of reservoir samples. While the base layer is a standard reservoir sample of edges, other layers are reservoir samples of sub-structures of the desired motif. By storing more frequent sub-structures of the motif, we increase the probability of detecting an occurrence of the sparse motif we are counting, thus decreasing the variance and error of the estimate. While we focus on the designing and analysis of algorithms for counting 4-cliques, we present a method which allows generalizing Tiered Sampling to obtain high-quality estimates for the number of occurrence of any sub-graph of interest, while reducing the analysis effort due to specific properties of the pattern of interest. We present a complete analytical analysis and extensive experimental evaluation of our proposed method using both synthetic and real-world data. Our results demonstrate the advantage of our method in obtaining high-quality approximations for the number of 4 and 5-cliques for large graphs using a very limited amount of memory, significantly outperforming the single edge sample approach for counting sparse motifs in large scale graphs.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Michał Ławniczak ◽  
Adam Sawicki ◽  
Małgorzata Białous ◽  
Leszek Sirko

AbstractWe identify and investigate isoscattering strings of concatenating quantum graphs possessing n units and 2n infinite external leads. We give an insight into the principles of designing large graphs and networks for which the isoscattering properties are preserved for $$n \rightarrow \infty $$ n → ∞ . The theoretical predictions are confirmed experimentally using $$n=2$$ n = 2 units, four-leads microwave networks. In an experimental and mathematical approach our work goes beyond prior results by demonstrating that using a trace function one can address the unsettled until now problem of whether scattering properties of open complex graphs and networks with many external leads are uniquely connected to their shapes. The application of the trace function reduces the number of required entries to the $$2n \times 2n $$ 2 n × 2 n scattering matrices $${\hat{S}}$$ S ^ of the systems to 2n diagonal elements, while the old measures of isoscattering require all $$(2n)^2$$ ( 2 n ) 2 entries. The studied problem generalizes a famous question of Mark Kac “Can one hear the shape of a drum?”, originally posed in the case of isospectral dissipationless systems, to the case of infinite strings of open graphs and networks.


Sign in / Sign up

Export Citation Format

Share Document