Lightweight Label Propagation for Large-Scale Network Data

Author(s):  
De-Ming Liang ◽  
Yu-Feng Li

Label propagation spreads soft labels from a small set of labeled data to a large amount of unlabeled data according to the intrinsic graph structure. Nonetheless, most label propagation solutions work only on relatively small-scale data and fail to cope with many real applications, such as social network analysis, where graphs usually have millions of nodes. In this paper, we propose a novel algorithm named \algo to deal with large-scale data. A lightweight iterative process derived from the well-known stochastic gradient descent strategy is used to reduce memory overhead and accelerate the solving process. We also give a theoretical analysis of the necessity of the warm-start technique for label propagation. Experiments show that our algorithm can handle million-scale graphs in a few seconds while achieving performance highly competitive with existing algorithms.
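To make the idea of spreading soft labels along graph edges concrete, the following is a minimal sketch of plain asynchronous label propagation. It is not the algorithm proposed in the paper: the function name `propagate_labels`, its parameters, and the random single-node update rule are assumptions chosen for illustration, and the SGD-derived process and warm start described in the abstract are not reproduced here.

```python
import random


def propagate_labels(adj, seeds, num_classes, steps=2000, rng_seed=0):
    """Illustrative asynchronous label propagation (hypothetical helper).

    adj:   dict mapping each node to a list of its neighbours
    seeds: dict mapping labeled nodes to their class index
    Returns a dict mapping every node to its predicted class index.
    """
    rng = random.Random(rng_seed)
    # Every node starts with a uniform soft label over the classes.
    soft = {node: [1.0 / num_classes] * num_classes for node in adj}
    # Labeled nodes are clamped to a one-hot vector and never updated.
    for node, cls in seeds.items():
        soft[node] = [1.0 if c == cls else 0.0 for c in range(num_classes)]
    unlabeled = [node for node in adj if node not in seeds]
    for _ in range(steps):
        # Update one randomly chosen unlabeled node at a time, so no
        # dense matrix over all nodes is ever materialised.
        node = rng.choice(unlabeled)
        nbrs = adj[node]
        if not nbrs:
            continue
        soft[node] = [
            sum(soft[m][c] for m in nbrs) / len(nbrs)
            for c in range(num_classes)
        ]
    # Harden the soft labels by taking the most probable class.
    return {n: max(range(num_classes), key=lambda c: soft[n][c]) for n in adj}


if __name__ == "__main__":
    # Two triangles joined by the edge 2-3, with one seed in each triangle.
    graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
             3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
    print(propagate_labels(graph, seeds={0: 0, 5: 1}, num_classes=2))
```

Updating a single node per step keeps the working memory proportional to the number of nodes and their neighbour lists, which is the general motivation for lightweight iterative schemes on large graphs.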

2021 ◽  
Author(s):  
XiaoWei Wu ◽  
FanLiang Bu ◽  
ZhiWen Hou

To address event prediction in large-scale event networks, a collapse subgraph convolution (CSGCN) algorithm is proposed, which uses event subgraphs to predict the subsequent events of an event group. The CSGCN algorithm collapses the edge-induced event subgraph in a large-scale event network, removes irrelevant event nodes from the subgraph, and forms a new event subgraph. The GCN algorithm is used to learn the graph embedding representation of the event subgraph, and the subsequent events of the event group are predicted by comparing the similarity between the graph embedding representation of the event group and that of the subsequent events. Because only some related nodes are processed each time, applying the model to large-scale graph data is feasible. Through experiments, we explore and verify the effectiveness of extracting features from subgraphs of a large-scale graph by using graph convolution training to obtain graph embedding representations. We find that GCN predicts events better than Euclidean distance and cosine similarity, which further shows that the graph convolution algorithm performs well in graph feature extraction.


2019 ◽  
Vol 33 (30) ◽  
pp. 1950363
Author(s):  
Chen Song ◽  
Guoyan Huang ◽  
Bo Yin ◽  
Bing Zhang ◽  
Xinqian Liu

The label propagation algorithm (LPA) attracts wide attention in the community detection field for its near-linear time complexity on large-scale networks. However, the algorithm adopts a random selection scheme in its label updating strategy, which results in unstable divisions and poor accuracy. In this paper, five different indicators of node similarity based on local network information are introduced to distinguish nodes, and a new label updating method is proposed. When there are multiple maximum neighbor labels in the propagation process, the maximum label corresponding to the most similar node is selected for updating instead of a random one. Five different forms of improved LPA are proposed, named SAL-LPA, SOR-LPA, JAC-LPA, HDI-LPA and HPI-LPA. The experimental results on real-world and artificial benchmark networks show that the improved LPA greatly improves on the performance of the original algorithm, with HPI-LPA performing best.
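The tie-breaking rule described above can be sketched as follows. This is an illustrative reading of the abstract rather than the authors' implementation: the helper names `jaccard` and `choose_label` are hypothetical, and the Jaccard index is used only as one example of a local similarity indicator.

```python
from collections import Counter


def jaccard(adj, u, v):
    """Jaccard similarity of the neighbour sets of u and v."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def choose_label(adj, labels, node):
    """Pick the next label for `node` (hypothetical helper).

    adj:    dict mapping each node to a set of neighbours
    labels: dict mapping each node to its current label
    """
    if not adj[node]:
        return labels[node]  # isolated node keeps its label
    counts = Counter(labels[n] for n in adj[node])
    best = max(counts.values())
    tied = {lab for lab, c in counts.items() if c == best}
    if len(tied) == 1:
        return next(iter(tied))
    # Several labels are equally frequent: instead of choosing at random,
    # take the label carried by the most similar neighbour.
    candidates = sorted(n for n in adj[node] if labels[n] in tied)
    most_similar = max(candidates, key=lambda n: jaccard(adj, node, n))
    return labels[most_similar]


if __name__ == "__main__":
    graph = {0: {1, 2, 3, 4}, 1: {0}, 2: {0, 3}, 3: {0, 2}, 4: {0}}
    current = {0: "c", 1: "a", 2: "b", 3: "b", 4: "a"}
    print(choose_label(graph, current, 0))
```

Sorting the candidates before taking the maximum makes the choice deterministic even when two neighbours are equally similar, which is the property the random scheme lacks.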


2019 ◽  
Vol 9 (11) ◽  
pp. 2343 ◽  
Author(s):  
Swagatika Sahoo ◽  
Akshay M. Fajge ◽  
Raju Halder ◽  
Agostino Cortesi

In the nine years since its launch, amid intense research, scalability has remained a serious concern in blockchain, especially in large-scale networks generating a huge number of transaction-records. In this paper, we propose a hierarchical blockchain model characterized as follows: (1) each level maintains multiple local blockchain networks, (2) each local blockchain records local transactional activities, and (3) partial views (tunable w.r.t. precision) of different subsets of local blockchain-records are maintained in the blockchains at the next level of the hierarchy. To meet this objective, we apply abstractions to a set of transaction-records at regular time intervals by following the Abstract Interpretation framework, which provides tunable precision in various abstract domains and guarantees the soundness of the system. While this model fits real-world organizational structures well, the proposal is powerful enough to scale when a large number of nodes participate in a network, resulting in enormous growth of the network size and the number of transaction-records. We discuss experimental results on a small-scale network with three sub-networks at the lower level, abstracting the transaction-records in the abstract domain of intervals. The results are encouraging and clearly indicate the effectiveness of this approach in controlling the exponential growth of blockchain size w.r.t. the total number of participants in the network.
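As a rough illustration of abstracting transaction-records in the domain of intervals, the sketch below summarises the records of each time window by a single interval of amounts. The record layout `(timestamp, amount)`, the `Interval` class, and the `summarize` function are all assumptions made for this example; the abstract does not specify the authors' data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lo, hi] in the abstract domain of intervals."""
    lo: float
    hi: float

    def join(self, other):
        # Least upper bound: the smallest interval covering both operands.
        return Interval(min(self.lo, other.lo), max(self.hi, other.hi))

    def contains(self, value):
        return self.lo <= value <= self.hi


def summarize(records, window):
    """Abstract (timestamp, amount) records into one interval per window.

    Hypothetical helper: a higher-level chain would store only these
    intervals instead of every lower-level record.
    """
    summary = {}
    for timestamp, amount in records:
        key = timestamp // window
        point = Interval(amount, amount)
        summary[key] = summary[key].join(point) if key in summary else point
    return summary


if __name__ == "__main__":
    local_records = [(0, 5.0), (3, 12.0), (7, 1.0), (12, 8.0)]
    for window_id, interval in sorted(summarize(local_records, 10).items()):
        print(window_id, interval)
```

The summary is sound in the sense that every concrete amount in a window lies inside that window's interval, while a larger window trades precision for a smaller summary.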


2013 ◽  
Vol 2013 ◽  
pp. 1-7
Author(s):  
Hui He ◽  
Guotao Fan ◽  
Jianwei Ye ◽  
Weizhe Zhang

Research on early warning systems for large-scale network security incidents is of great significance. Such a system can improve the network's emergency response capabilities, alleviate the damage of cyber attacks, and strengthen the system's counterattack ability. A comprehensive early warning system is presented in this paper, which combines active measurement and anomaly detection. The key visualization algorithm and technology of the system are the main focus. The plane visualization of the large-scale network system is realized based on a divide-and-conquer approach. First, the topology of the large-scale network is divided into several small-scale networks by the MLkP/CR algorithm. Second, the subgraph plane visualization algorithm is applied to each small-scale network. Finally, the small-scale networks' topologies are combined into one topology based on the automatic distribution algorithm of force analysis. As the algorithm transforms the large-scale network topology plane visualization problem into a series of small-scale network topology plane visualization and distribution problems, it has higher parallelism and is able to handle the display of ultra-large-scale network topologies.


2014 ◽  
Vol 687-691 ◽  
pp. 1342-1345 ◽  
Author(s):  
Jie Ding ◽  
Li Peng Zhu ◽  
Bin Hu ◽  
Ren Long Hang ◽  
Yu Bao Sun

With the rapid advance of data collection and storage techniques, it is easy to acquire data sets with tens of millions or even billions of instances. How to explore and exploit the useful or interesting information in these data sets has become an urgent issue. The traditional k-means clustering algorithm has been widely used in the data mining community. First, k clustering centres are randomly initialized. Then, all instances are classified into k different classes according to their distances to the clustering centres. Lastly, each clustering centre is updated to the mean of its constituent instances. This whole process is iterated until convergence. At each iteration, the distance matrix from all instances to the k clustering centres must be calculated, which costs a great deal of time on large-scale data sets. To address this issue, in this paper we propose a fast optimization algorithm based on stochastic gradient descent (SGD). At each iteration, an instance is chosen at random, its corresponding clustering centre is found, and that centre is updated immediately. Experimental results show that our proposed method achieves competitive clustering results at a lower time cost.
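The per-instance update described above can be sketched as an online k-means loop. This is a minimal illustration, not the authors' code: the function name `sgd_kmeans`, the initialisation from randomly sampled instances, and the per-centre learning rate of 1/count are assumptions (the abstract does not state the step-size schedule).

```python
import random


def sgd_kmeans(points, k, iterations=5000, seed=0):
    """Online (SGD-style) k-means on a list of equal-length tuples.

    Hypothetical helper: each step touches one instance and one centre,
    so no full instance-to-centre distance matrix is ever computed.
    """
    rng = random.Random(seed)
    # Initialise the centres from k distinct randomly chosen instances.
    centres = [list(p) for p in rng.sample(points, k)]
    counts = [0] * k
    for _ in range(iterations):
        x = rng.choice(points)
        # Find the nearest centre by squared Euclidean distance.
        j = min(
            range(k),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centres[c])),
        )
        # Assumed step size: 1 / (number of instances seen by this centre),
        # which makes the centre the running mean of its assigned instances.
        counts[j] += 1
        rate = 1.0 / counts[j]
        centres[j] = [b + rate * (a - b) for a, b in zip(x, centres[j])]
    return centres


if __name__ == "__main__":
    near_origin = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
    far_away = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
    print(sorted(sgd_kmeans(near_origin + far_away, k=2)))
```

Each iteration costs k distance evaluations regardless of the data set size, in contrast to the full pass over all instances required by a batch k-means iteration.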

