Study on a New Method of Link-Based Link Prediction in the Context of Big Data

2021 ◽  
Vol 2021 ◽  
pp. 1-7
Author(s):  
Chen Jicheng ◽  
Chen Hongchang ◽  
Li Hanchao

Link prediction is a concept from network theory that aims to find a link between two separate network entities. In the present world of social media, this concept has taken root, and its application is seen across numerous social networks. A typical example is “TheFacebook,” launched on 4 February 2004 and now known simply as Facebook. It uses this concept to recommend friends by checking their links using various algorithms. The same goes for shopping and e-commerce sites. Notwithstanding the merits of link prediction, they are enjoyed only by large networks. For sparse networks, there is a wide disparity between the links that are likely to form and the ones that include. A large body of literature addresses this problem; however, most of it approaches the problem from the angle of unsupervised learning (UL). While this may seem appropriate given a dataset’s nature, it does not provide accurate information for sparse networks. Supervised learning could seem reasonable in such cases. This research is aimed at finding the most appropriate link-based link prediction methods in the context of big data based on supervised learning. Much has been written on the subject; nonetheless, there are core issues that these studies do not always address and that are critical to understanding link prediction. This research explicitly looks at these new problems and uses the supervised approach to analyze them in order to devise a full-fledged, holistic link-based link prediction method. Specifically, the network issues we examine are the lack of specificity in existing techniques, observational periods, variance reduction, sampling approaches, and topological causes of imbalance. In the subsequent sections of the paper, we explain the theory of prediction algorithms, specifically the flow-based process. We specifically address the problems on sparse networks that are never discussed with other prediction methods. The solutions obtained by addressing the issues above place our framework ahead of the unsupervised approaches in the previous literature.
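
The supervised framing mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the function name `build_training_pairs`, the two features (common neighbors and the product of degrees), and the toy time-stamped edges are all assumptions made for illustration. Features are computed on an observation period, and labels come from the links that appear afterwards.

```python
from itertools import combinations


def build_training_pairs(timed_edges, split_time):
    """Turn time-stamped edges into labelled rows for a supervised learner.

    Edges with timestamp <= split_time form the observation period;
    later edges supply the positive labels. (Hypothetical helper.)
    """
    observed = {frozenset(e[:2]) for e in timed_edges if e[2] <= split_time}
    future = {frozenset(e[:2]) for e in timed_edges if e[2] > split_time}

    # Adjacency sets built from the observation period only.
    neighbors = {}
    for edge in observed:
        u, v = tuple(edge)
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    rows = []
    for u, v in combinations(sorted(neighbors), 2):
        pair = frozenset((u, v))
        if pair in observed:
            continue  # only pairs not yet linked are prediction candidates
        common = len(neighbors[u] & neighbors[v])
        degree_product = len(neighbors[u]) * len(neighbors[v])
        label = 1 if pair in future else 0
        rows.append(((u, v), (common, degree_product), label))
    return rows


# Toy data (made up): (node, node, timestamp)
edges = [
    ("a", "b", 1), ("a", "c", 1), ("b", "d", 2), ("c", "d", 2),
    ("d", "e", 2), ("a", "d", 3), ("b", "c", 4),
]
rows = build_training_pairs(edges, split_time=2)
positives = sum(label for _, _, label in rows)
```

Even in this toy graph the negative class outnumbers the positive one, which is the kind of imbalance that becomes severe in sparse networks. The resulting rows could be passed to any classifier; the choice of classifier is left open here.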

Sensors ◽  
2020 ◽  
Vol 20 (22) ◽  
pp. 6560
Author(s):  
Hui Wang ◽  
Zichun Le

Link prediction is the most basic and essential problem in complex networks. This study analyzes the observed topological, time, attributive, label, weight, directional, and symbolic features and auxiliary information to find missing connections and predict possible future connections. The network model is of great significance for discussing and analyzing the evolution of the network. In the past two decades, link prediction has attracted extensive attention from experts in various fields, who have published numerous high-level papers, but few combine interdisciplinary characteristics. This survey analyzes and discusses the existing link prediction methods. The idea of stratification is introduced into the classification system of link prediction for the first time, and a seven-layer model is proposed, consisting of the network, metadata, feature classification, selection input, processing, selection, and output layers. Among them, the processing layer divides link prediction methods into similarity-based, probabilistic, likelihood, supervised learning, semi-supervised learning, unsupervised learning, and reinforcement learning methods. The input features, evaluation metrics, complex analysis, experimental comparisons, relative merits, common datasets, and open-source implementations for each link prediction method are then discussed in detail. Through analysis and comparison, we found that the link prediction method based on graph structure features has better prediction performance. Finally, the future development direction of link prediction in complex networks is discussed.
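
As an illustration of the similarity-based family named in this abstract, the sketch below computes three widely used structural indices (common neighbors, the Jaccard coefficient, and the Adamic-Adar index) for one node pair. The function name `similarity_scores` and the toy graph are assumptions for illustration and are not taken from the survey.

```python
import math


def similarity_scores(adjacency, u, v):
    """Score a candidate link (u, v) with three neighborhood-based indices."""
    nu, nv = adjacency[u], adjacency[v]
    common = nu & nv
    union = nu | nv
    return {
        # Number of shared neighbors.
        "common_neighbors": len(common),
        # Shared neighbors relative to all neighbors of either node.
        "jaccard": len(common) / len(union) if union else 0.0,
        # Shared neighbors weighted down by the log of their degree.
        "adamic_adar": sum(
            1.0 / math.log(len(adjacency[z]))
            for z in common
            if len(adjacency[z]) > 1
        ),
    }


# Toy undirected graph (made up), stored as adjacency sets.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c"},
}
scores = similarity_scores(adjacency, "a", "d")
```

Ranking all unlinked pairs by one of these scores gives a simple unsupervised predictor. The same scores can also serve as input features for the supervised methods the survey lists.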


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Bin Deng ◽  
Jun Xu ◽  
Xin Wei

Current methods for predicting tourism destination selection preference do not consider the important characteristics of that preference, which results in low prediction accuracy, low comprehensive accuracy, and long prediction times. To address this, a tourism destination selection preference prediction method based on edge computing is proposed. This paper uses edge computing to construct the characteristics of tourism destination selection preference and uses a random forest algorithm to select important features and carry out preliminary estimation and ranking. Using the multiple logit selection model, the tourists’ preference sequence for tourism destination selection is obtained and sorted, yielding the tourism destination selection preference model. The weight set of tourism destination selection preference is determined by calculating the preference weight values, and the preference is then determined according to the link prediction method, which realizes the preference prediction. The experimental results show that the proposed method achieves good comprehensive accuracy, effectively improves the prediction accuracy of tourism destination selection preference, and shortens the prediction time.


2015 ◽  
Vol 2015 ◽  
pp. 1-13 ◽  
Author(s):  
Fei Gao ◽  
Katarzyna Musial ◽  
Colin Cooper ◽  
Sophia Tsoka

Currently, we are experiencing a rapid growth in the number of social-based online systems. The availability of the vast amounts of data gathered in those systems brings new challenges that we face when trying to analyse it. One of the intensively researched topics is the prediction of social connections between users. Although a lot of effort has been made to develop new prediction approaches, the existing methods have not been comprehensively analysed. In this paper we investigate the correlation between network metrics and the accuracy of different prediction methods. We selected six time-stamped real-world social networks and the ten most widely used link prediction methods. The results of the experiments show that the performance of some methods has a strong correlation with certain network metrics. We managed to distinguish “prediction friendly” networks, for which most of the prediction methods give good performance, as well as “prediction unfriendly” networks, for which most of the methods result in high prediction error. Correlation analysis between network metrics and the prediction accuracy of prediction methods may form the basis of a metalearning system that, based on network characteristics, will be able to recommend the right prediction method for a given network.


Improving the performance of link prediction plays a significant role in the evaluation of social networks. Link prediction is known as one of the primary tasks for recommender systems, bioinformatics, and the web. Most machine learning methods that depend on the metrics of social network analysis (SNA) models use supervised learning to develop link prediction models. Supervised learning needs a huge amount of data to train a link prediction model to an optimal level of performance. In recent years, deep reinforcement learning (DRL) has achieved excellent success in various domains, including SNA. In this paper, we present the use of DRL to improve the performance and accuracy of the model for the applied dataset. The experiment shows that the dataset created by the DRL model through self-play or auto-simulation can be utilized to improve the link prediction model. We used three different datasets: JUNANES, MAMBO, and JAKE. Experimental results show that the proposed DRL method provides an accuracy of 85% for JUNANES, 87% for MAMBO, and 78% for JAKE. This outperforms the next-highest accuracy, obtained by the gradient boosting machine (GBM), of 75% for JUNANES, 79% for MAMBO, and 71% for JAKE, each trained with 2500 iterations, and the DRL method also performs better in terms of AUC. The DRL model shows better efficiency than traditional machine learning strategies such as random forest and the GBM.


2021 ◽  
Vol 267 ◽  
pp. 01005
Author(s):  
Yongli Wang ◽  
Hekun Shen ◽  
Jialin Yang ◽  
Nan Wang ◽  
Yuze Ma ◽  
...  

The planning of an integrated energy system is a very complex mixed combinatorial optimization problem: multi-objective, multi-constraint, nonlinear, and subject to random uncertainty. Its planning and design process should consider not only the interdependence between system capacity, energy conversion, energy storage, energy use, and other links, but also the interaction and integration of cooling, heating, electricity, and other energy flows, which makes it essentially an NP-hard problem. As China’s energy sector continues to develop rapidly, the data produced by sensors and intelligent equipment keeps growing. The volume of data on factors related to energy load prediction, such as temperature, weather, and wind speed, has increased dramatically, the data dimension is also increasing, and the scale of data has grown from GB to TB or even higher. Traditional and intelligent prediction methods fall far short of the accuracy and speed required for load forecasting. Therefore, the use of big data technology to predict energy demand is an important future direction.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Tiago Colliri ◽  
Liang Zhao

In this paper, we propose a network-based technique to analyze bills-voting data comprising the votes of Brazilian congressmen over a period of 28 years. The voting sessions are initially mapped into static networks, where each node represents a congressman and each edge stands for the similarity of votes between a pair of congressmen. Afterwards, the constructed static networks are converted to temporal networks. Our analyses of the temporal networks capture some of the main political changes that happened in Brazil during the period under consideration. Moreover, we find that the bills-voting networks can be used to identify convicted politicians who committed corruption or other financial crimes. Therefore, we propose two conviction prediction methods: one is based on the highest weighted convicted neighbor, and the other is based on link prediction techniques. It is a surprise to us that high accuracy (up to 90% by the link prediction method) in predicting convictions is achieved through bills-voting data alone, without taking into account any financial information beforehand. Such a feature makes it possible to monitor congressmen just by considering their legal public activities. In this way, our work contributes to the large-scale study of public data using complex networks.
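
The first of the two conviction prediction methods can be sketched under one possible reading of the abstract, which does not spell out the rule: a congressman is flagged if his strongest vote-similarity neighbor is already labelled as convicted. The function name `predict_by_strongest_neighbor`, the data layout, and the toy weights are assumptions for illustration only.

```python
def predict_by_strongest_neighbor(weights, convicted, node):
    """Flag `node` if its highest-weight neighbor is labelled convicted.

    `weights` maps each node to a dict {neighbor: vote similarity}.
    This decision rule is an assumed interpretation, not the paper's code.
    """
    neighbors = weights.get(node, {})
    if not neighbors:
        return False  # isolated node: no evidence either way
    strongest = max(neighbors, key=neighbors.get)
    return strongest in convicted


# Toy vote-similarity network (made up).
weights = {
    "p1": {"p2": 0.9, "p3": 0.4},
    "p2": {"p1": 0.9, "p3": 0.2},
    "p3": {"p1": 0.4, "p2": 0.2},
}
convicted = {"p2"}

flag_p1 = predict_by_strongest_neighbor(weights, convicted, "p1")
flag_p3 = predict_by_strongest_neighbor(weights, convicted, "p3")
```

Here `p1` is flagged because its most similar voter, `p2`, is in the convicted set, while `p3` is not flagged because its most similar voter is `p1`.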


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Weiwei Gu ◽  
Fei Gao ◽  
Xiaodan Lou ◽  
Jiang Zhang

In this paper, we propose graph attention based network representation (GANR), which utilizes the graph attention architecture and takes graph structure as the supervised learning information. Compared with node classification based representations, GANR can be used to learn a representation for any given graph. GANR is not only capable of learning high-quality node representations that achieve competitive performance on link prediction, network visualization, and node classification, but it can also extract meaningful attention weights that can be applied to the task of measuring node centrality. GANR can identify the leading venture capital investors, discover highly cited papers, and find the most influential nodes in the Susceptible-Infected-Recovered (SIR) model. We conclude that link structures in graphs are not limited to predicting linkage itself; they are capable of revealing latent node information in an unsupervised way once an appropriate learning algorithm, like GANR, is provided.

