Rank correlation between centrality metrics in complex networks: an empirical study
Abstract
Centrality is widely used to measure the importance of nodes in a network. In recent decades, numerous centrality metrics have been proposed with varying computational complexity. To test whether a high-complexity metric can be approximated by a low-complexity one, researchers have studied the correlation between them. However, these studies rely on the Pearson correlation, which is sensitive to the data distribution. Intuitively, a centrality metric is a ranking of nodes (or edges), so rank correlation is a more reasonable measure. In this paper, we use degree, a low-complexity metric, as the base to approximate three other metrics: closeness, betweenness, and eigenvector centrality. We first demonstrate that rank correlation performs better than Pearson correlation in scale-free networks. We then study the correlation between centrality metrics in real networks and find that betweenness has the highest correlation coefficient with degree, closeness is at a middle level, and eigenvector fluctuates dramatically. Finally, we evaluate how well the top-degree nodes approximate the top nodes of the three other metrics in the real networks. We find that the intersection ratio is highest for betweenness, followed by closeness and eigenvector; in most cases, the largest-degree nodes can approximate the largest-betweenness and largest-closeness nodes, but not the largest-eigenvector nodes.
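To make the two measurements named in the abstract concrete, the following is a minimal sketch of a rank correlation between degree and another centrality metric, together with the intersection ratio of the top-k nodes. Everything here is an assumption made for illustration: the abstract does not say which rank coefficient is used, so Spearman's coefficient (Pearson correlation applied to ranks, with tied values sharing their average rank) is taken as one possible choice; the function names are hypothetical; closeness stands in for the three target metrics; and the six-node toy graph is not one of the networks studied in the paper.

```python
from collections import deque
from math import sqrt


def average_ranks(values):
    """Rank values (1 = smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; assumes neither sequence is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def degree_centrality(adj):
    """Unnormalized degree of every node."""
    return {v: len(neighbors) for v, neighbors in adj.items()}


def closeness_centrality(adj):
    """(n - 1) / sum of shortest-path distances; assumes a connected graph."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        result[source] = (len(adj) - 1) / sum(dist.values())
    return result


def top_k_intersection_ratio(score_a, score_b, k):
    """Fraction of the top-k nodes under score_a that are also top-k under score_b.

    Ties are broken by node insertion order, which is an arbitrary choice.
    """
    top_a = set(sorted(score_a, key=score_a.get, reverse=True)[:k])
    top_b = set(sorted(score_b, key=score_b.get, reverse=True)[:k])
    return len(top_a & top_b) / k


# Hypothetical toy graph (undirected, connected), not taken from the paper.
edges = [(0, 1), (0, 2), (0, 3), (3, 4), (4, 5)]
adj = {v: set() for v in range(6)}
for u, v in edges:
    adj[u].add(v)
    adj[v].add(u)

degree = degree_centrality(adj)
closeness = closeness_centrality(adj)
nodes = list(adj)
deg_values = [degree[v] for v in nodes]
clo_values = [closeness[v] for v in nodes]

print("Pearson :", pearson(deg_values, clo_values))
print("Spearman:", spearman(deg_values, clo_values))
print("Top-2 intersection ratio:", top_k_intersection_ratio(degree, closeness, 2))
```

The sketch uses only the standard library so that the ranking and tie handling stay visible. For networks of realistic size, a graph library and a statistics package would be the more practical choice, and a different rank coefficient such as Kendall's tau could be substituted for `spearman` without changing the rest.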