Missing link prediction and spurious link detection based on attractive force and community

2021 ◽  
Vol 104 (2) ◽  
pp. 003685042110185
Author(s):  
Hui Qu ◽  
Wei Chen ◽  
Kuo Chi

With the rapid development of the Internet and information technology, networks have become an important medium for information diffusion worldwide. Given the increasing scale of network data, ensuring the completeness and accuracy of the links obtained from networks has become an urgent problem. Unlike most traditional link prediction methods, which focus only on missing links, this paper proposes a novel link prediction approach that handles both missing links and spurious links in networks. First, we define the attractive force for any pair of nodes to denote the strength of the relation between them. Then, all the nodes are divided into communities according to their degrees and the attractive force on them. Next, we define the connection probability for each pair of unconnected nodes to measure how likely they are to be connected; missing links are predicted by calculating and comparing the connection probabilities of all pairs of unconnected nodes. Moreover, we define the break probability for each pair of connected nodes to measure how likely their link is to be broken; spurious links are detected by calculating and comparing the break probabilities of all pairs of connected nodes. To verify the validity of the proposed approach, we conduct experiments on some real-world networks. The results show that the proposed approach achieves higher prediction accuracy and more stable performance than some existing methods.

2019 ◽  
Vol 33 (31) ◽  
pp. 1950382
Author(s):  
Shenshen Bai ◽  
Shiyu Fang ◽  
Longjie Li ◽  
Rui Liu ◽  
Xiaoyun Chen

With the proliferation of available network data, link prediction has become increasingly important and has captured growing attention from various disciplines. To enhance prediction accuracy by making full use of community structure information, this paper proposes a new link prediction model, namely CMS, in which the different community memberships of nodes are investigated. CMS assumes that different memberships can influence link formation differently. To estimate the connection likelihood between two nodes, the CMS model weights the contribution of each shared neighbor according to the corresponding community membership. Three CMS-based methods are derived by introducing three forms of contribution that neighbors make. Extensive experiments on 12 networks are conducted to evaluate the performance of the CMS-based methods. The results show that the CMS-based methods are more effective and robust than the baselines.


2021 ◽  
Vol 11 (11) ◽  
pp. 5043
Author(s):  
Xi Chen ◽  
Bo Kang ◽  
Jefrey Lijffijt ◽  
Tijl De Bie

Many real-world problems can be formalized as predicting links in a partially observed network. Examples include Facebook friendship suggestions, the prediction of protein–protein interactions, and the identification of hidden relationships in a crime network. Several link prediction algorithms, notably those recently introduced using network embedding, are capable of doing this by just relying on the observed part of the network. Often, whether two nodes are linked can be queried, albeit at a substantial cost (e.g., by questionnaires, wet lab experiments, or undercover work). Such additional information can improve the link prediction accuracy, but owing to the cost, the queries must be made with due consideration. Thus, we argue that an active learning approach is of great potential interest and developed ALPINE (Active Link Prediction usIng Network Embedding), a framework that identifies the most useful link status by estimating the improvement in link prediction accuracy to be gained by querying it. We proposed several query strategies for use in combination with ALPINE, inspired by the optimal experimental design and active learning literature. Experimental results on real data not only showed that ALPINE was scalable and boosted link prediction accuracy with far fewer queries, but also shed light on the relative merits of the strategies, providing actionable guidance for practitioners.


2020 ◽  
pp. 1-14
Author(s):  
Longjie Li ◽  
Lu Wang ◽  
Hongsheng Luo ◽  
Xiaoyun Chen

Link prediction is an important research direction in complex network analysis and has drawn increasing attention from researchers in various fields. So far, a plethora of structural similarity-based methods have been proposed to solve the link prediction problem. To achieve stable performance on different networks, this paper proposes a hybrid similarity model for link prediction. In the proposed model, the Grey Relation Analysis (GRA) approach is employed to integrate four carefully selected similarity indexes, which are designed according to different structural features. In addition, to adaptively estimate the weight of each index based on the observed network structure, a new weight calculation method is presented that considers the distribution of similarity scores. Because it takes several separate similarity indexes into account, the proposed method is applicable to different types of networks. Experimental results show that the proposed method outperforms other prediction methods in terms of accuracy and stability on 10 benchmark networks.


Author(s):  
Didier A. Vega-Oliveros ◽  
Liang Zhao ◽  
Anderson Rocha ◽  
Lilian Berton

2017 ◽  
Vol 2017 ◽  
pp. 1-8 ◽  
Author(s):  
Sen Zhang ◽  
Qiang Fu ◽  
Wendong Xiao

Accurate click-through rate (CTR) prediction can not only improve the advertisement company’s reputation and revenue, but also help advertisers optimize advertising performance. There are two main unsolved problems in CTR prediction: low prediction accuracy due to the imbalanced distribution of advertising data, and the lack of a real-time advertisement bidding implementation. In this paper, we develop a novel online CTR prediction approach that incorporates real-time bidding (RTB) advertising through the following strategies. A user profile system is constructed from the historical data of RTB advertising to describe the user features, the historical CTR features, the ID features, and the other numerical features. A novel CTR prediction approach is then presented to address the imbalanced distribution of learning samples by integrating the Weighted-ELM (WELM) and the Adaboost algorithm. Compared to the commonly used algorithms, the proposed approach can improve the CTR significantly.


2018 ◽  
Vol 2018 ◽  
pp. 1-11 ◽  
Author(s):  
Zhangjie Fu ◽  
Jingnan Yu ◽  
Guowu Xie ◽  
Yiming Chen ◽  
Yuanhang Mao

With the rapid development of the network and the informatization of society, how to improve the accuracy of information is an urgent problem to be solved. The existing method is to use an intelligent robot that carries sensors to collect data and transmit the data to a server in real time. Many intelligent robots have emerged in daily life; the UAV (unmanned aerial vehicle) is one of them. With the popularization of UAV applications, security problems of UAVs have also been exposed. In addition to some human factors, a major factor is the UAV’s endurance: UAVs face the problem of short battery life when performing flying missions. The existing way to address this problem is to plan the path of the UAV flight. To find the optimal path for a UAV flight, we propose three cost functions: path security cost, length cost, and smoothness cost. The path security cost is used to determine whether the path is feasible; the length cost and smoothness cost of the path directly affect the energy consumption of the UAV flight. We propose a heuristic evolutionary algorithm with several evolutionary operations: substitution operations, crossover operations, mutation operations, length operations, and smoothness operations. These operations improve the quality of the paths we build. Analysis of the experimental results shows that our solution is feasible.
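The abstract names the three costs but does not give their formulas. The following is only a rough sketch under assumed definitions: a path is a list of 2D waypoints, obstacles are circles, length cost is the summed segment length, smoothness cost is the summed absolute turning angle, and the security check rejects any path whose segments enter an obstacle. All function names and the obstacle model are hypothetical and are not taken from the paper.

```python
import math


def length_cost(path):
    """Total Euclidean length of a waypoint path (assumed definition)."""
    return sum(math.dist(p, q) for p, q in zip(path, path[1:]))


def smoothness_cost(path):
    """Sum of absolute turning angles, in radians, at interior waypoints."""
    total = 0.0
    for a, b, c in zip(path, path[1:], path[2:]):
        heading_in = math.atan2(b[1] - a[1], b[0] - a[0])
        heading_out = math.atan2(c[1] - b[1], c[0] - b[0])
        turn = abs(heading_out - heading_in)
        total += min(turn, 2 * math.pi - turn)  # wrap into [0, pi]
    return total


def _point_segment_distance(c, p, q):
    """Shortest distance from point c to the segment p-q."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    seg_sq = dx * dx + dy * dy
    if seg_sq == 0:
        return math.dist(c, p)
    t = ((c[0] - p[0]) * dx + (c[1] - p[1]) * dy) / seg_sq
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.dist(c, (p[0] + t * dx, p[1] + t * dy))


def is_secure(path, obstacles):
    """True if no segment comes within an obstacle's radius.

    obstacles: list of (cx, cy, radius) circles (hypothetical model).
    """
    for p, q in zip(path, path[1:]):
        for cx, cy, radius in obstacles:
            if _point_segment_distance((cx, cy), p, q) < radius:
                return False
    return True


straight = [(0, 0), (3, 4), (6, 8)]
corner = [(0, 0), (1, 0), (1, 1)]
print(length_cost(straight), smoothness_cost(straight))
print(length_cost(corner), smoothness_cost(corner))
print(is_secure(corner, [(0.5, 0.0, 0.2)]))
```

In an evolutionary search of the kind the abstract describes, a fitness function could discard paths for which `is_secure` is false and rank the rest by some combination of the two remaining costs; how the paper combines them is not stated here.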


2021 ◽  
Vol 12 ◽  
Author(s):  
Yuran Jia ◽  
Shan Huang ◽  
Tianjiao Zhang

DNA-binding protein (DBP) is a protein with a special DNA binding domain that is associated with many important molecular biological mechanisms. Rapid development of computational methods has made it possible to predict DBP on a large scale; however, existing methods do not fully integrate DBP-related features, resulting in rough prediction results. In this article, we develop a DNA-binding protein identification method called KK-DBP. To improve prediction accuracy, we propose a feature extraction method that fuses multiple PSSM features. The experimental results show a prediction accuracy on the independent test dataset PDB186 of 81.22%, which is the highest of all existing methods.


2022 ◽  
Vol 2022 ◽  
pp. 1-9
Author(s):  
Huazhang Liu

With the rapid development of the Internet, social networks have shown an unprecedented development trend among college students. Closer social activities among college students have led to the emergence of student groups with new social characteristics. The traditional method of classifying college student groups can no longer meet current demand. Therefore, this paper proposes a social network link prediction method, a combination algorithm that combines neighbor information and a random block. By mining the social networks of college students’ group relationships, the classification of college student groups can be realized. Firstly, on the basis of complex network theory, the essential relationship of college student groups under a complex network is analyzed. Secondly, a new combination algorithm is proposed that uses the simplest linear combination method to combine the proximity link prediction based on neighbor information and the likelihood analysis link prediction based on a random block. Finally, the proposed combination algorithm is verified using the social data of college students’ networks. Experimental results show that, compared with the traditional link prediction algorithm, the proposed combination algorithm can effectively uncover the group characteristics of social networks and improve the accuracy of college students’ association classification.
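The linear combination step can be illustrated with a small sketch. This is not the paper's algorithm: the neighbor-based score is assumed to be a plain common-neighbor count, the block-based score is simplified to the observed edge density between the two nodes' groups under one fixed, hand-picked partition, and the mixing weight `alpha`, the min-max normalisation, and all names are hypothetical.

```python
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbour_scores(adj, pairs):
    """Neighbor-information score: number of shared neighbors."""
    return {(u, v): float(len(adj[u] & adj[v])) for u, v in pairs}


def block_density_scores(adj, group, pairs):
    """Simplified block score: edge density between the groups of u and v."""
    members = {}
    for node, g in group.items():
        members.setdefault(g, []).append(node)
    scores = {}
    for u, v in pairs:
        gu, gv = group[u], group[v]
        if gu == gv:
            candidates = list(combinations(members[gu], 2))
        else:
            candidates = [(a, b) for a in members[gu] for b in members[gv]]
        links = sum(1 for a, b in candidates if b in adj[a])
        scores[(u, v)] = links / len(candidates) if candidates else 0.0
    return scores


def min_max(scores):
    """Rescale scores to [0, 1] so the two methods are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {pair: 0.0 for pair in scores}
    return {pair: (s - lo) / (hi - lo) for pair, s in scores.items()}


def linear_combination(first, second, alpha=0.5):
    """alpha * first + (1 - alpha) * second, after normalisation."""
    first, second = min_max(first), min_max(second)
    return {p: alpha * first[p] + (1 - alpha) * second[p] for p in first}


# Toy network with a hand-picked two-group partition (illustrative only).
edges = [("a", "b"), ("a", "c"), ("d", "e"), ("d", "f"), ("e", "f"), ("c", "d")]
group = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
adj = build_adjacency(edges)
unlinked = [(u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]]

combined = linear_combination(
    common_neighbour_scores(adj, unlinked),
    block_density_scores(adj, group, unlinked),
)
best_pair = max(combined, key=combined.get)
print(best_pair, combined[best_pair])
```

The pair with the highest combined score is the most likely missing link under these assumptions; changing `alpha` shifts the balance between local neighborhood evidence and group-level evidence.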


2016 ◽  
Vol 43 (5) ◽  
pp. 683-695 ◽  
Author(s):  
Chuanming Yu ◽  
Xiaoli Zhao ◽  
Lu An ◽  
Xia Lin

With the rapid development of the Internet, the computational analysis of social networks has grown to be a salient issue. Various studies analyse social network topics, and a considerable amount of attention has been devoted to the issue of link prediction. Link prediction aims to predict the interactions that might occur between two entities in the network. To this end, this study proposed a novel path and node combined approach and constructed a methodology for measuring node similarities. The method was illustrated with five real datasets obtained from different types of social networks. An extensive comparison of the proposed method against existing link prediction algorithms was performed to demonstrate that the path and node combined approach achieved much higher mean average precision (MAP) and area under the curve (AUC) values than those that only consider common nodes (e.g. Common Neighbours and Adamic/Adar) or paths (e.g. Random Walk with Restart and FriendLink). The results imply that two nodes are more likely to establish a link if they have more common neighbours of lower degrees. The weight of the path connecting two nodes is inversely proportional to the product of the degrees of the nodes on the pathway. The combination of node and topological features can substantially improve the performance of similarity-based link prediction, compared with node-dependent and path-dependent approaches. The experiments also demonstrate that the path-dependent approaches outperform the node-dependent approaches. This indicates that topological features of networks may contribute more to improving performance than node features.
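The two node-based baselines named above, and the degree-based path weighting, can be illustrated on a toy graph. This is a minimal sketch, not the paper's combined method: Common Neighbours and Adamic/Adar follow their standard definitions, while `path_weight` assumes that only the intermediate nodes of a path enter the degree product, which the abstract does not specify. The graph is made up for illustration.

```python
import math


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def common_neighbours(adj, u, v):
    """Common Neighbours index: number of shared neighbours."""
    return len(adj[u] & adj[v])


def adamic_adar(adj, u, v):
    """Adamic/Adar index: shared neighbours weighted by 1 / log(degree).

    A shared neighbour of two distinct nodes has degree >= 2, so the
    logarithm is always positive.
    """
    return sum(1.0 / math.log(len(adj[z])) for z in adj[u] & adj[v])


def path_weight(adj, path):
    """1 / product of degrees of the intermediate nodes (assumed reading)."""
    return 1.0 / math.prod(len(adj[z]) for z in path[1:-1])


# Toy graph: a and b share neighbour c (degree 2) and neighbour d (degree 4).
edges = [("a", "c"), ("b", "c"), ("a", "d"), ("b", "d"), ("d", "e"), ("d", "f")]
adj = build_adjacency(edges)

print(common_neighbours(adj, "a", "b"))
print(adamic_adar(adj, "a", "b"))
print(path_weight(adj, ["a", "c", "b"]), path_weight(adj, ["a", "d", "b"]))
```

The low-degree neighbour `c` contributes more to the Adamic/Adar score than the hub `d`, and the path through `c` gets a larger weight than the path through `d`, which matches the abstract's observation that common neighbours of lower degree are stronger evidence for a link.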

