ABCDE: Approximating Betweenness-Centrality ranking with progressive-DropEdge

2021 ◽  
Vol 7 ◽  
pp. e699
Author(s):  
Martin Mirakyan

Betweenness-centrality is a popular measure in network analysis that aims to describe the importance of nodes in a graph. It is defined as the fraction of shortest paths passing through a given node and is a key measure in many applications, including community detection and network dismantling. Computing betweenness-centrality for every node in a graph requires an excessive amount of computing power, especially for large graphs. On the other hand, in many applications the main interest lies in finding the top-k most important nodes in the graph, so several approximation algorithms have been proposed to solve the problem faster. Some recent approaches use shallow graph convolutional networks to approximate the top-k nodes with the highest betweenness-centrality scores. This work presents a deep graph convolutional neural network that outputs a rank score for each node in a given graph. With careful optimization and regularization tricks, including an extended version of DropEdge named Progressive-DropEdge, the system achieves better results than current approaches. Experiments on both real-world and synthetic datasets show that the presented algorithm is an order of magnitude faster at inference and requires several times fewer resources and less time to train.
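
The abstract does not spell out the schedule behind Progressive-DropEdge, so the following is only a minimal sketch of the general idea: plain DropEdge removes a random fraction of edges before each training step, and a "progressive" variant varies that fraction over training. The linear schedule, the rate bounds and the edge-list format are assumptions for illustration, not the paper's specification.

```python
# Speculative sketch of a progressive edge-dropout schedule for GCN training.
# Plain DropEdge resamples a random subset of edges at every step; here the
# drop rate is annealed across epochs ("progressive"). The linear schedule
# and the 0.1-0.5 bounds are illustrative assumptions, not the ABCDE setup.
import numpy as np

def progressive_drop_rate(epoch, total_epochs, start=0.1, end=0.5):
    """Linearly anneal the edge-drop probability over training."""
    t = epoch / max(total_epochs - 1, 1)
    return start + t * (end - start)

def drop_edges(edge_index, drop_rate, rng):
    """Keep each edge independently with probability (1 - drop_rate).

    edge_index: (2, E) integer array of directed edges, the usual input
    format for graph convolutional networks.
    """
    keep = rng.random(edge_index.shape[1]) >= drop_rate
    return edge_index[:, keep]

# Usage: resample the retained edges every epoch before the forward pass.
rng = np.random.default_rng(0)
edge_index = np.array([[0, 1, 2, 3], [1, 2, 3, 0]])
for epoch in range(5):
    rate = progressive_drop_rate(epoch, total_epochs=5)
    sparse_edges = drop_edges(edge_index, rate, rng)
    # ... feed sparse_edges to the GCN for this epoch
```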

2017 ◽  
Vol 5 (5) ◽  
pp. 776-794
Author(s):  
Benjamin Fish ◽  
Rahul Kushwaha ◽  
György Turán

Abstract. Betweenness centrality of a vertex in a graph measures the fraction of shortest paths going through the vertex. This is a basic notion for determining the importance of a vertex in a network. The $k$-betweenness centrality of a vertex is defined similarly, but only considers shortest paths of length at most $k$. The sequence of $k$-betweenness centralities for all possible values of $k$ forms the betweenness centrality profile of a vertex. We study properties of betweenness centrality profiles in trees. We show that for scale-free random trees, for fixed $k$, the expectation of $k$-betweenness centrality strictly decreases as the index of the vertex increases. We also analyse worst-case properties of profiles in terms of the distance of profiles from being monotone, and the number of times pairs of profiles can cross. This is related to whether $k$-betweenness centrality, for small values of $k$, may be used instead of having to consider all shortest paths. Bounds are given that are optimal in order of magnitude. We also present some experimental results for scale-free random trees.
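
As a concrete reading of the definition above, the sketch below computes $k$-betweenness centrality on a tree with networkx. Since a tree contains exactly one shortest path between any two vertices, the $k$-betweenness of $v$ reduces to counting the pairs $(s,t)$, both different from $v$, whose unique path has length at most $k$ and passes through $v$. The brute-force enumeration and the unnormalised counts are simplifications for illustration.

```python
# Brute-force k-betweenness centrality for trees: count the pairs (s, t)
# whose unique shortest path has length <= k and passes through v.
# Normalisation conventions vary; this sketch reports raw pair counts.
import itertools
import networkx as nx

def k_betweenness_tree(T, k):
    scores = {v: 0 for v in T}
    for s, t in itertools.combinations(T.nodes(), 2):
        path = nx.shortest_path(T, s, t)      # unique in a tree
        if len(path) - 1 <= k:
            for v in path[1:-1]:              # interior vertices only
                scores[v] += 1
    return scores

# Example profile on a small tree; as k grows, the k-betweenness values
# approach the ordinary (unnormalised) betweenness centrality.
T = nx.balanced_tree(2, 4)
profile = {k: k_betweenness_tree(T, k) for k in (2, 3, 4)}
```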


Author(s):  
Jose-Maria Carazo ◽  
I. Benavides ◽  
S. Marco ◽  
J.L. Carrascosa ◽  
E.L. Zapata

Obtaining the three-dimensional (3D) structure of negatively stained biological specimens at a resolution of, typically, 2-4 nm is becoming a relatively common practice in an increasing number of laboratories. A combination of new conceptual approaches, new software tools, and faster computers has made this situation possible. However, all these 3D reconstruction processes are quite computer intensive, and the medium-term future is full of proposals entailing an even greater need for computing power. Up to now all published 3D reconstructions in this field have been performed on conventional (sequential) computers, but it is a fact that new parallel computer architectures offer the potential for order-of-magnitude increases in computing power and should therefore be considered for their possible application to the most computing-intensive tasks. We have studied both shared-memory-based computer architectures, like the BBN Butterfly, and local-memory-based architectures, mainly hypercubes implemented on transputers, where we have used the algorithmic mapping method proposed by Zapata et al. In this work we have developed the basic software tools needed to obtain a 3D reconstruction from non-crystalline specimens (“single particles”) using the so-called Random Conical Tilt Series Method. We start from a pair of images presenting the same field, first tilted (by ≃55°) and then untilted. It is then assumed that we can supply the system with the image of the particle we are looking for (ideally, a 2D average from a previous study) and with a matrix describing the geometrical relationships between the tilted and untilted fields (this step is now accomplished by interactively marking a few pairs of corresponding features in the two fields). From here on, the 3D reconstruction process may run automatically.
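
The step of interactively marking a few pairs of corresponding features to relate the tilted and untilted fields amounts to fitting a geometric transform to point correspondences. The sketch below is only an illustration of that idea, fitting a plain 2D affine model by least squares with numpy; the original system may well use a different parameterisation of the tilt geometry, and the coordinates shown are hypothetical.

```python
# Illustration only: estimate a 2D affine transform relating untilted and
# tilted image coordinates from a few manually marked feature pairs.
# The affine model is an assumption; the original work may parameterise
# the tilt geometry differently.
import numpy as np

def fit_affine(untilted_pts, tilted_pts):
    """Least-squares fit of tilted ~= A @ untilted + b.

    Both inputs are (N, 2) arrays of matched coordinates with N >= 3.
    Returns the 2x3 matrix [A | b].
    """
    u = np.asarray(untilted_pts, dtype=float)
    t = np.asarray(tilted_pts, dtype=float)
    design = np.hstack([u, np.ones((u.shape[0], 1))])   # (N, 3)
    params, *_ = np.linalg.lstsq(design, t, rcond=None)
    return params.T                                     # maps (x, y, 1) -> (x', y')

# Hypothetical marked correspondences:
untilted = [(10.0, 12.0), (85.0, 20.0), (40.0, 90.0)]
tilted = [(11.5, 9.0), (60.2, 15.1), (30.8, 70.4)]
M = fit_affine(untilted, tilted)
```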


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Silvia Zaoli ◽  
Piero Mazzarisi ◽  
Fabrizio Lillo

Abstract. Betweenness centrality quantifies the importance of a vertex for the information flow in a network. The standard betweenness centrality applies to static single-layer networks, but many real-world networks are both dynamic and made of several layers. We propose a definition of betweenness centrality for temporal multiplexes. This definition accounts for the topological and temporal structure and for the duration of paths in the determination of the shortest paths. We propose an algorithm to compute the new metric using a mapping to a static graph. We apply the metric to a dataset of ∼20k European flights and compare the results with those obtained with static or single-layer metrics. The differences in the airport rankings highlight the importance of considering the temporal multiplex structure and an appropriate distance metric.
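
The abstract says the metric is computed via a mapping to a static graph but does not give the construction, so the sketch below is a guess at the general shape of such a mapping: expand each airport into time-stamped copies, turn flights into directed edges weighted by duration, connect consecutive copies of an airport by waiting edges, and aggregate the static betweenness over the copies of each airport. The layer handling, the weights and the toy flight list are assumptions.

```python
# Hedged sketch: temporal betweenness via a time-expanded static digraph.
# Nodes are (airport, time) copies; flights and waiting periods become
# weighted directed edges; betweenness is then aggregated per airport.
# The construction, layer handling and toy data are illustrative only.
import collections
import networkx as nx

# (origin, departure_time, destination, arrival_time, layer/airline)
flights = [
    ("AMS", 0, "CDG", 2, "carrier_A"),
    ("CDG", 3, "FCO", 5, "carrier_A"),
    ("AMS", 1, "FCO", 6, "carrier_B"),
]

G = nx.DiGraph()
events = collections.defaultdict(set)            # airport -> event times
for origin, dep, dest, arr, _layer in flights:
    events[origin].add(dep)
    events[dest].add(arr)

for airport, times in events.items():            # waiting edges
    ordered = sorted(times)
    for t0, t1 in zip(ordered, ordered[1:]):
        G.add_edge((airport, t0), (airport, t1), weight=t1 - t0)

for origin, dep, dest, arr, _layer in flights:   # flight edges
    G.add_edge((origin, dep), (dest, arr), weight=arr - dep)

bc = nx.betweenness_centrality(G, weight="weight")
airport_bc = collections.defaultdict(float)
for (airport, _t), value in bc.items():          # aggregate over time copies
    airport_bc[airport] += value
```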


2017 ◽  
Vol 4 (3) ◽  
pp. 187-200
Author(s):  
Dianne S. V. de Medeiros ◽  
Miguel Elias M. Campista ◽  
Nathalie Mitton ◽  
Marcelo Dias de Amorim ◽  
Guy Pujolle

2019 ◽  
Vol 12 (6) ◽  
pp. 3067-3079
Author(s):  
Sebastian J. O'Shea ◽  
Jonathan Crosier ◽  
James Dorsey ◽  
Waldemar Schledewitz ◽  
Ian Crawford ◽  
...  

Abstract. In situ observations from research aircraft and instrumented ground sites are important contributions to developing our collective understanding of clouds and are used to inform and validate numerical weather and climate models. Unfortunately, biases in these datasets may be present, which can limit their value. In this paper, we discuss artefacts which may bias data from a widely used family of instrumentation in the field of cloud physics, optical array probes (OAPs). Using laboratory and synthetic datasets, we demonstrate how greyscale analysis can be used to filter data, constraining the sample volume of the OAP and improving data quality, particularly at small sizes where OAP data are considered unreliable. We apply the new methodology to ambient data from two contrasting case studies: one warm cloud and one cirrus cloud. In both cases the new methodology reduces the concentration of small particles (<60 µm) by approximately an order of magnitude. This significantly improves agreement with a Mie-scattering spectrometer for the liquid case and with a holographic imaging probe for the cirrus case. Based on these results, we make specific recommendations to instrument manufacturers, instrument operators and data processors about the optimal use of greyscale OAPs. The data from monoscale OAPs are unreliable and should not be used for particle diameters below approximately 100 µm.
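
The acceptance criteria behind the greyscale filtering are not given in this abstract; the sketch below only illustrates the general shape such a filter can take: a particle image is accepted when a sufficient fraction of its shadowed pixels reach the deepest grey level, which loosely constrains how far out of focus the particle can be. The grey levels, the ratio used and the 0.5 threshold are placeholders, not the authors' recommended settings.

```python
# Illustrative greyscale filter for optical array probe (OAP) images.
# Greyscale probes record several shadow depths per pixel; badly out-of-focus
# particles show proportionally few deeply shadowed pixels. The ratio and
# the 0.5 threshold are placeholder assumptions, not the paper's criteria.
import numpy as np

def accept_particle(image, deep_level=2, min_deep_fraction=0.5):
    """image: 2D integer array of grey levels (0 = no shadow, larger = darker)."""
    shadowed = image > 0
    if not shadowed.any():
        return False
    deep = image >= deep_level
    return deep.sum() / shadowed.sum() >= min_deep_fraction

# Example: a mostly in-focus particle (many deepest-level pixels) is accepted.
particle = np.array([
    [0, 2, 2, 2, 0],
    [1, 2, 2, 2, 1],
    [0, 2, 2, 2, 0],
])
print(accept_particle(particle))   # True
```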


2005 ◽  
Vol 44 (02) ◽  
pp. 149-153 ◽  
Author(s):  
F. Estrella ◽  
C. del Frate ◽  
T. Hauer ◽  
M. Odeh ◽  
D. Rogulin ◽  
...  

Summary. Objectives: The past decade has witnessed order-of-magnitude increases in computing power, data storage capacity and network speed, giving birth to applications which may handle large data volumes of increased complexity, distributed over the internet. Methods: Medical image analysis is one of the areas for which this unique opportunity is likely to bring revolutionary advances, both for the scientist's research and the clinician's everyday work. Grid computing [1] promises to resolve many of the difficulties in facilitating medical image analysis, allowing radiologists to collaborate without having to co-locate. Results: The EU-funded MammoGrid project [2] aims to investigate the feasibility of developing a Grid-enabled European database of mammograms and to provide an information infrastructure which federates multiple mammogram databases. This will enable clinicians to develop new common, collaborative and co-operative approaches to the analysis of mammographic data. Conclusion: This paper focuses on one of the key requirements for large-scale distributed mammogram analysis: resolving queries across a grid-connected federation of images.
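
The key requirement named here, resolving a query across a federation of image databases, can be pictured very loosely as fanning the same query out to every grid-connected site and merging the partial results. The sketch below is purely schematic: the site list, the query_site helper and the record format are hypothetical stand-ins, not MammoGrid interfaces.

```python
# Purely schematic sketch of federated query resolution: the same query is
# sent to every participating site and the partial results are merged.
# The endpoints, query_site() helper and record format are hypothetical
# placeholders, not part of the MammoGrid middleware.
from concurrent.futures import ThreadPoolExecutor

SITES = ["site-a.example.org", "site-b.example.org", "site-c.example.org"]

def query_site(site, query):
    """Placeholder for a per-site query call (e.g. a remote grid service)."""
    return [{"site": site, "query": query, "image_id": 0}]

def federated_query(query):
    with ThreadPoolExecutor(max_workers=len(SITES)) as pool:
        partial = pool.map(lambda s: query_site(s, query), SITES)
    return [record for chunk in partial for record in chunk]

results = federated_query("age > 50 AND finding = 'mass'")
```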


2016 ◽  
Vol 44 (2) ◽  
pp. 256-271 ◽  
Author(s):  
Marc Barthelemy

The street network is an important aspect of cities and contains crucial information about their organization and evolution. Characterizing and comparing various street networks could then be helpful for a better understanding of the mechanisms governing the formation and evolution of these systems. Their characterization is however not easy: there are no simple tools to classify planar networks, and most of the measures developed for complex networks are not useful when space is relevant. Here, we describe recent efforts in this direction and new methods adapted to spatial networks. We will first discuss measures based on the structure of shortest paths, among which is the betweenness centrality. In particular for time-evolving road networks, we will show that the spatial distribution of the betweenness centrality is able to reveal the impact of important structural transformations. Shortest paths are however not the only relevant ones. In particular, they can be very different from those with the smallest number of turns—the simplest paths. The statistical comparison of the lengths of the shortest and simplest paths provides nontrivial and nonlocal information about the spatial organization of planar graphs. We define the simplicity index as the average ratio of these lengths, and the simplicity profile characterizes the simplicity at different scales. Measuring these quantities on artificial networks (roads, highways, railways) and natural ones (leaves, insect wings) shows that there are fundamental differences—probably related to their different function—in the organization of urban and biological systems: there is a clear hierarchy of the lengths of straight lines in biological cases, but they are randomly distributed in urban systems. The paths are however not enough to fully characterize the spatial pattern of planar networks such as streets and roads. Another promising direction is to analyze the statistics of blocks of the planar network. More precisely, we can use the conditional probability distribution of the shape factor of blocks with a given area, and define what could constitute the fingerprint of a city. These fingerprints can then serve as a basis for a classification of cities based on their street patterns. This method, applied to more than 130 cities around the world, leads to four broad families of cities characterized by different abundances of blocks of a certain area and shape. This classification will be helpful for identifying dominant mechanisms governing the formation and evolution of street patterns.
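
One concrete ingredient of the "fingerprint" described above is the shape factor of a block: its area divided by the area of an enclosing circle, collected into a conditional distribution over blocks of a given area. The sketch below computes a crude version with numpy; the enclosing circle is centred on the vertex centroid rather than being the true circumscribed circle, and the toy blocks and histogram bins are illustrative choices, not the paper's.

```python
# Crude sketch of a street-pattern "fingerprint": for each block (polygon),
# compute its area (shoelace formula) and a shape factor = area / area of an
# enclosing circle. The centroid-centred circle is an approximation of the
# circumscribed circle used in the original work; blocks and bins are toys.
import numpy as np

def block_area(vertices):
    x, y = np.asarray(vertices, dtype=float).T
    return 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

def shape_factor(vertices):
    pts = np.asarray(vertices, dtype=float)
    radius = np.linalg.norm(pts - pts.mean(axis=0), axis=1).max()
    return block_area(pts) / (np.pi * radius**2)

# Two toy blocks: a compact square and an elongated rectangle.
blocks = [
    [(0, 0), (1, 0), (1, 1), (0, 1)],
    [(0, 0), (4, 0), (4, 0.5), (0, 0.5)],
]
areas = np.array([block_area(b) for b in blocks])
shapes = np.array([shape_factor(b) for b in blocks])
# Conditional histogram of shape factor given area class ~ the fingerprint.
fingerprint, _, _ = np.histogram2d(areas, shapes, bins=(3, 5))
```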


2020 ◽  
Author(s):  
Chung-Hsien Chou ◽  
Shaoting Wang ◽  
Hsiang-Shun Shih ◽  
Phillip C-Y. Sheu

Abstract. Background: Graph theory has been widely applied in biomedical studies, for example through structural measures such as betweenness centrality. However, when the network is very large, betweenness centrality becomes difficult to compute in a reasonable amount of time. Results: In this paper, we describe an approach, a 1+ɛ lossy graph reduction algorithm, for computing betweenness centrality on large graphs. The approach is able to guarantee a bounded approximation result. We use GSE48216, a breast cancer cell line co-expression network, to show that our algorithm achieves a higher reduction rate at the cost of bounded errors in query results. Furthermore, comparing the betweenness centrality of the original graph and the reduced graph shows that a higher reduction rate does not sacrifice the accuracy of betweenness centrality while providing faster execution times. Conclusions: The experimental results validate the proposed 1+ɛ lossy graph reduction algorithm, showing that it achieves faster execution within a bounded error rate.
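
The reduction algorithm itself is not described in this abstract, so the sketch below only illustrates the evaluation harness implied by it: compute betweenness centrality on the original and reduced graphs and measure how far the values deviate on the surviving vertices. The reduction shown (iteratively pruning degree-1 vertices) is a simple stand-in, not the authors' 1+ɛ lossy reduction, and it carries no error bound.

```python
# Illustrative harness comparing betweenness centrality before and after a
# graph reduction. The reduction here (iteratively pruning degree-1 vertices)
# is a simple stand-in, NOT the paper's 1+epsilon lossy reduction, and it
# comes with no approximation guarantee.
import networkx as nx

def prune_degree_one(G):
    H = G.copy()
    leaves = [v for v in H if H.degree(v) == 1]
    while leaves:
        H.remove_nodes_from(leaves)
        leaves = [v for v in H if H.degree(v) == 1]
    return H

G = nx.gnp_random_graph(300, 0.01, seed=0)
H = prune_degree_one(G)

bc_full = nx.betweenness_centrality(G, normalized=False)
bc_reduced = nx.betweenness_centrality(H, normalized=False)

# Maximum absolute deviation over the vertices that survive the reduction.
max_error = max((abs(bc_full[v] - bc_reduced[v]) for v in H), default=0.0)
print(len(G), len(H), max_error)
```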

