Fast and accurate estimation of shortest paths in large graphs

Author(s):  
Andrey Gubichev ◽  
Srikanta Bedathur ◽  
Stephan Seufert ◽  
Gerhard Weikum
2021 ◽  
Vol 15 (5) ◽  
pp. 1-32
Author(s):  
Sunil Kumar Maurya ◽  
Xin Liu ◽  
Tsuyoshi Murata

Graphs arise naturally in numerous situations, including social graphs, transportation graphs, web graphs, protein graphs, etc. One of the important problems in these settings is to identify which nodes are important in the graph and how they affect the graph structure as a whole. Betweenness centrality and closeness centrality are two commonly used node ranking measures for finding influential nodes in graphs in terms of information spread and connectivity. Both are considered shortest-path-based measures, since their calculation assumes that information flows between nodes via shortest paths. However, exact calculation of these centrality measures is computationally expensive and becomes prohibitive for large graphs. Although researchers have proposed approximation methods, they are either less efficient or suboptimal or both. We propose the first graph neural network (GNN) based model to approximate betweenness and closeness centrality. In a GNN, each node aggregates features of the nodes in its multihop neighborhood. We use this feature aggregation scheme to model paths and learn how many nodes are reachable from a specific node. Through extensive experiments on a series of synthetic and real-world datasets, we demonstrate that our approach significantly outperforms current techniques while taking less time. A benefit of our approach is that the model is inductive, which means it can be trained on one set of graphs and evaluated on another set of graphs with varying structures. Thus, the model is useful for both static graphs and dynamic graphs. Source code is available at https://github.com/sunilkmaurya/GNN_Ranking
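For orientation, the exact closeness computation that such a model approximates can be sketched with one breadth-first search per node. This is a minimal illustration for unweighted graphs, not the authors' GNN; the function names and the toy path graph are assumptions made for the example.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def closeness_centrality(adj):
    """Closeness per node: (number of reachable nodes) / (sum of distances to them)."""
    scores = {}
    for node in adj:
        dist = bfs_distances(adj, node)
        total = sum(dist.values())
        scores[node] = (len(dist) - 1) / total if total > 0 else 0.0
    return scores


# Hypothetical toy input: the path graph a - b - c - d.
toy_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
closeness = closeness_centrality(toy_graph)
```

Running one search per node costs O(V(V + E)) time overall, which is the expense that motivates approximation on large graphs.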


2021 ◽  
Vol 7 ◽  
pp. e699
Author(s):  
Martin Mirakyan

Betweenness-centrality is a popular measure in network analysis that aims to describe the importance of nodes in a graph. It accounts for the fraction of shortest paths passing through that node and is a key measure in many applications including community detection and network dismantling. The computation of betweenness-centrality for each node in a graph requires an excessive amount of computing power, especially for large graphs. On the other hand, in many applications, the main interest lies in finding the top-k most important nodes in the graph. Therefore, several approximation algorithms were proposed to solve the problem faster. Some recent approaches propose to use shallow graph convolutional networks to approximate the top-k nodes with the highest betweenness-centrality scores. This work presents a deep graph convolutional neural network that outputs a rank score for each node in a given graph. With careful optimization and regularization tricks, including an extended version of DropEdge which is named Progressive-DropEdge, the system achieves better results than the current approaches. Experiments on both real-world and synthetic datasets show that the presented algorithm is an order of magnitude faster in inference and requires several times fewer resources and time to train.
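As a point of reference, exact betweenness scores and a top-k ranking can be sketched with Brandes' algorithm, a standard exact method for this measure. This is not the graph convolutional network described above; the function names and the star-shaped toy graph are assumptions for illustration.

```python
from collections import deque


def betweenness_centrality(adj):
    """Brandes' algorithm for an unweighted, undirected graph (unnormalized scores)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                    # nodes in non-decreasing distance from s
        preds = {v: [] for v in adj}  # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for w in adj[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        # Accumulate pair dependencies from the farthest nodes back toward s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair was counted once from each endpoint.
    return {v: c / 2.0 for v, c in score.items()}


def top_k(scores, k):
    """The k node labels with the highest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy input: a star with one hub and four leaves.
star = {"hub": ["a", "b", "c", "d"], "a": ["hub"], "b": ["hub"],
        "c": ["hub"], "d": ["hub"]}
scores = betweenness_centrality(star)
best = top_k(scores, 1)
```

In the star, every shortest path between two leaves passes through the hub, so the hub collects all six leaf pairs and the leaves score zero.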


Author(s):  
W. R. Schucany ◽  
G. H. Kelsoe ◽  
V. F. Allison

Accurate estimation of the size of spheroid organelles from thin-sectioned material is often necessary, as uniquely homogeneous populations of organelles such as vesicles, granules, or nuclei often are critically important in the morphological identification of similar cell types. However, the difficulty in obtaining accurate diameter measurements of thin-sectioned organelles is well known. This difficulty is due to the extreme tenuity of the sectioned material as compared to the size of the intact organelle. In populations where low variance is suspected, the traditional method of diameter estimation has been to measure literally hundreds of profiles and to describe the “largest” as representative of the “approximate maximal diameter”.


Author(s):  
Virginie Crollen ◽  
Julie Castronovo ◽  
Xavier Seron

Over the last 30 years, numerical estimation has been widely studied. Recently, Castronovo and Seron (2007) proposed the bi-directional mapping hypothesis in order to account for the finding that, depending on the type of estimation task (perception vs. production of numerosities), reverse patterns of performance are found (i.e., under- and over-estimation, respectively). Here, we further investigated this hypothesis by submitting adult participants to three types of numerical estimation task: (1) a perception task, in which participants had to estimate the numerosity of a non-symbolic collection; (2) a production task, in which participants had to approximately produce the numerosity of a symbolic numerical input; and (3) a reproduction task, in which participants had to reproduce the numerosity of a non-symbolic numerical input. Our results gave further support to the finding that different patterns of performance are found according to the type of estimation task: (1) under-estimation in the perception task; (2) over-estimation in the production task; and (3) accurate estimation in the reproduction task. Moreover, correlation analyses revealed that the more a participant under-estimated in the perception task, the more he/she over-estimated in the production task. We discussed these empirical data by showing how they can be accounted for by the bi-directional mapping hypothesis (Castronovo & Seron, 2007).


1969 ◽  
Vol 62 (4_Suppla) ◽  
pp. S23-S35
Author(s):  
B.-A. Lamberg ◽  
O. P. Heinonen ◽  
K. Liewendahl ◽  
G. Kvist ◽  
M. Viherkoski ◽  
...  

ABSTRACT The distributions of 13 variables based on 10 laboratory tests measuring thyroid function were studied in euthyroid controls and in patients with toxic diffuse or toxic multinodular goitre. Density functions were fitted to the empirical data and the goodness of fit was evaluated by the use of the χ²-test. In a few instances there was a significant difference, but the material available was in some respects too small to allow a very accurate estimation. The normal limits for each variable were defined by the 2.5 and 97.5 percentiles. It appears that in some instances these limits are too rigorous from the practical point of view. It is emphasized that the crossing point of the functions for euthyroid controls and hyperthyroid patients may be a better limit to use. In a preliminary analysis of the diagnostic efficiency, the variables of total or free hormone concentration in the blood proved clearly superior to all other variables.
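The percentile-based normal limits mentioned above can be illustrated with a short sketch. It takes empirical percentiles directly from a sample using linear interpolation between order statistics, whereas the study derived its limits from fitted density functions; the function names and the data are hypothetical and carry no clinical meaning.

```python
def percentile(values, p):
    """p-th percentile (0 to 100), linearly interpolated between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    rank = (len(ordered) - 1) * p / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def reference_limits(values):
    """Lower and upper limits at the 2.5 and 97.5 percentiles."""
    return percentile(values, 2.5), percentile(values, 97.5)


# Hypothetical measurements: the integers 0..200, chosen only so the result is easy to check.
made_up_values = list(range(201))
low, high = reference_limits(made_up_values)
```

By construction, roughly 95% of the reference sample falls between the two limits, which is why a separate criterion such as the crossing point of the two fitted functions may separate the groups better in practice.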


2019 ◽  
Author(s):  
Ruslan N. Tazhigulov ◽  
James R. Gayvert ◽  
Melissa Wei ◽  
Ksenia B. Bravaya

eMap is a web-based platform for identifying and visualizing electron or hole transfer pathways in proteins based on their crystal structures. The underlying model can be viewed as a coarse-grained version of the Pathways model, where each tunneling step between hopping sites represented by electron transfer active (ETA) moieties is described with one effective decay parameter that describes protein-mediated tunneling. ETA moieties include aromatic amino acid residue side chains and aromatic fragments of cofactors that are automatically detected, and, in addition, electron/hole residing sites that can be specified by the users. The software searches for the shortest paths connecting the user-specified electron/hole source to either all surface-exposed ETA residues or to the user-specified target. The identified pathways are ranked based on their length. The pathways are visualized in 2D as a graph, in which each node represents an ETA site, and in 3D using available protein visualization tools. Here, we present the capability and user interface of eMap 1.0, which is available at https://emap.bu.edu.
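The search-and-rank step described above can be sketched generically with Dijkstra's algorithm on a weighted graph. This is not eMap's implementation: the abstract does not specify how edge weights are derived, so the weights, site labels, and function names below are placeholders chosen for illustration.

```python
import heapq


def dijkstra(adj, source):
    """Shortest path lengths and predecessors from source; adj[u] maps v to edge weight."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


def ranked_paths(adj, source, targets):
    """Shortest path from source to each reachable target, sorted by total length."""
    dist, prev = dijkstra(adj, source)
    results = []
    for t in targets:
        if t not in dist:
            continue  # target not reachable from source
        path = [t]
        while path[-1] != source:
            path.append(prev[path[-1]])
        results.append((dist[t], path[::-1]))
    return sorted(results)


# Hypothetical sites and placeholder weights: S is the source, T1 and T2 are targets.
sites = {
    "S": {"A": 1.0, "B": 2.0},
    "A": {"S": 1.0, "B": 0.5, "T1": 1.0},
    "B": {"S": 2.0, "A": 0.5, "T2": 2.0},
    "T1": {"A": 1.0},
    "T2": {"B": 2.0},
}
ranking = ranked_paths(sites, "S", ["T1", "T2"])
```

Running the search once from the source yields the shortest path to every target, so ranking several candidate targets needs no additional searches.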

