novel method
Recently Published Documents

2022 ◽  
Vol 13 (1) ◽  
pp. 1-17
Ankit Kumar ◽  
Abhishek Kumar ◽  
Ali Kashif Bashir ◽  
Mamoon Rashid ◽  
V. D. Ambeth Kumar ◽  

Detecting outliers, or anomalies, is one of the vital problems in pattern-driven data mining. Outlier detection identifies objects whose behavior is inconsistent with the rest of the data. It is an important area of data mining with applications such as detecting credit card fraud, discovering hacking attempts, and uncovering criminal activity. Tools are needed to uncover the critical information hidden in large datasets. This paper investigates a novel method for detecting cluster outliers in a multidimensional dataset; the method can identify both the clusters and the outliers in datasets containing noise. The proposed method can detect the groups and outliers left by the clustering process, such as irregular sets of clusters (C) and outliers (O), to improve the results. Applying the algorithm to the dataset improved the results on several parameters. For the comparative analysis, the average accuracy and the average recall are computed. The average accuracy is 74.05% for the existing COID algorithm and 77.21% for the proposed algorithm. The average recall is 81.19% for the existing algorithm and 89.51% for the proposed algorithm, which shows that the proposed method performs better than the existing COID algorithm.
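
The comparison above rests on two evaluation metrics. As a minimal sketch of how such figures could be computed for any outlier detector, the Python snippet below derives accuracy and recall from sets of true and predicted outlier indices. It assumes that the abstract's average-value figure refers to accuracy; the function name `outlier_metrics` and the toy indices are hypothetical and are not taken from the paper.

```python
def outlier_metrics(true_outliers, predicted_outliers, n_points):
    """Return (accuracy, recall) for an outlier detector.

    true_outliers and predicted_outliers are collections of point indices;
    n_points is the total number of points in the dataset.
    """
    true_outliers = set(true_outliers)
    predicted_outliers = set(predicted_outliers)

    tp = len(true_outliers & predicted_outliers)  # outliers correctly flagged
    fp = len(predicted_outliers - true_outliers)  # inliers wrongly flagged
    fn = len(true_outliers - predicted_outliers)  # outliers that were missed
    tn = n_points - tp - fp - fn                  # inliers correctly kept

    accuracy = (tp + tn) / n_points
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall


# Toy example: 10 points, 4 true outliers, 3 of them found plus 1 false alarm.
accuracy, recall = outlier_metrics({0, 1, 2, 3}, {0, 1, 2, 9}, n_points=10)
print(f"accuracy={accuracy:.2f}, recall={recall:.2f}")
```

Averaging these values over several datasets or runs would give figures comparable in kind to the averages quoted in the abstract.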

2022 ◽  
Vol 12 (4) ◽  
pp. 807-812
Yan Li ◽  
Yu-Ren Zhang ◽  
Ping Zhang ◽  
Dong-Xu Li ◽  
Tian-Long Xiao

Protein–protein interactions (PPIs) have a critical impact on the biological processes of cells. Traditional experimental methods for identifying PPIs consume considerable labor, material, and time. There is therefore a great need for computational methods to predict PPIs. Most existing computational methods are based on either the sequence characteristics or the internal structural characteristics of proteins, and most rely on a single type of feature. We therefore propose a novel method to predict PPIs based on multiple-information fusion through graph representation learning. First, the attributes of each protein are computed from the known protein sequences using k-mers. Then, the known protein relationship pairs are assembled into an adjacency graph, and a graph representation learning method, the graph convolutional network, is used to fuse the attributes of each protein with the graph structure information, yielding features that contain several kinds of information. Finally, the multi-information features are fed into a random forest classifier for prediction and classification. Experimental results indicate that our method achieves an accuracy of 78.83% and an AUC of 86.10%. In conclusion, our method has excellent application prospects for predicting unknown PPIs.
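
The first two stages of the pipeline described above can be sketched in plain Python. The snippet below is an illustration under stated assumptions, not the authors' implementation: `kmer_features` builds a normalized k-mer frequency vector (k = 2 and the 20-letter amino-acid alphabet are assumed defaults), and `gcn_propagate` applies a single graph-convolution-style step, H' = D^{-1/2}(A + I)D^{-1/2}H, without the learned weight matrix or nonlinearity that a trained graph convolutional network would include. Both function names are hypothetical.

```python
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed alphabet of 20 standard residues


def kmer_features(sequence, k=2, alphabet=AMINO_ACIDS):
    """Normalized k-mer frequency vector for one protein sequence."""
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0.0] * len(kmers)
    total = 0
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        if kmer in index:  # skip k-mers containing non-standard residues
            counts[index[kmer]] += 1.0
            total += 1
    return [c / total for c in counts] if total else counts


def gcn_propagate(features, edges):
    """One propagation step H' = D^{-1/2} (A + I) D^{-1/2} H (no learned weights)."""
    n = len(features)
    neighbors = [{i} for i in range(n)]  # self-loops implement A + I
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    degree = [len(nb) for nb in neighbors]
    dim = len(features[0])
    propagated = []
    for i in range(n):
        row = [0.0] * dim
        for j in neighbors[i]:
            weight = 1.0 / sqrt(degree[i] * degree[j])
            for d in range(dim):
                row[d] += weight * features[j][d]
        propagated.append(row)
    return propagated


# Toy example: three made-up sequences, with known interactions 0-1 and 1-2.
sequences = ["ACDAC", "ACAC", "DDDA"]
node_features = [kmer_features(s) for s in sequences]
fused = gcn_propagate(node_features, edges=[(0, 1), (1, 2)])
print(len(fused), len(fused[0]))
```

In a full pipeline, the fused vectors of a candidate protein pair would be combined (for example by concatenation, which is an assumption here) and passed to a random forest classifier such as scikit-learn's `RandomForestClassifier`.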

2022 ◽  
Vol 82 ◽  
pp. 103148
Fanxu Zeng ◽  
Ningchuan Zhang ◽  
Guoxing Huang ◽  
Qian Gu ◽  
Wenbo Pan

2022 ◽  
Vol 40 (3) ◽  
pp. 1-21
Lili Wang ◽  
Chenghan Huang ◽  
Ying Lu ◽  
Weicheng Ma ◽  
Ruibo Liu ◽  

Complex user behavior, especially in settings such as social media, can be organized as time-evolving networks. Through network embedding, we can extract general-purpose vector representations of these dynamic networks which allow us to analyze them without extensive feature engineering. Prior work has shown how to generate network embeddings while preserving the structural role proximity of nodes. These methods, however, cannot capture the temporal evolution of the structural identity of the nodes in dynamic networks. Other works, on the other hand, have focused on learning microscopic dynamic embeddings. Though these methods can learn node representations over dynamic networks, these representations capture the local context of nodes and do not learn the structural roles of nodes. In this article, we propose a novel method for learning structural node embeddings in discrete-time dynamic networks. Our method, called HR2vec, tracks historical topology information in dynamic networks to learn dynamic structural role embeddings. Through experiments on synthetic and real-world temporal datasets, we show that our method outperforms other well-known methods in tasks where structural equivalence and historical information both play important roles. HR2vec can be used to model dynamic user behavior in any networked setting where users can be represented as nodes. Additionally, we propose a novel method (called network fingerprinting) that uses HR2vec embeddings for modeling whole (or partial) time-evolving networks. We showcase our network fingerprinting method on synthetic and real-world networks. Specifically, we demonstrate how our method can be used for detecting foreign-backed information operations on Twitter.
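
To make the idea of a history-aware structural role concrete, the toy Python sketch below summarizes each node in each snapshot by two structural features (its degree and the mean degree of its neighbors) and accumulates them over time with an exponential decay. This is not HR2vec: the feature choice, the decay rule, and the names `structural_features` and `historical_role_vectors` are all assumptions made purely for illustration.

```python
def structural_features(edges, nodes):
    """Per-node features for one snapshot: (degree, mean neighbor degree)."""
    adjacency = {n: set() for n in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    features = {}
    for n in nodes:
        degree = len(adjacency[n])
        if degree:
            mean_nb = sum(len(adjacency[m]) for m in adjacency[n]) / degree
        else:
            mean_nb = 0.0
        features[n] = (float(degree), mean_nb)
    return features


def historical_role_vectors(snapshots, nodes, decay=0.5):
    """Exponentially decayed sum of snapshot features, oldest snapshot first."""
    vectors = {n: (0.0, 0.0) for n in nodes}
    for edges in snapshots:
        current = structural_features(edges, nodes)
        vectors = {
            n: tuple(decay * old + new for old, new in zip(vectors[n], current[n]))
            for n in nodes
        }
    return vectors


# Two snapshots of a four-node network; all edge endpoints must be in `nodes`.
nodes = ["a", "b", "c", "d"]
snapshots = [
    [("a", "b"), ("c", "d")],
    [("a", "b"), ("b", "c"), ("c", "d")],
]
roles = historical_role_vectors(snapshots, nodes)
print(roles["a"], roles["d"])  # the two end nodes share the same history
```

Nodes `a` and `d` are never adjacent, yet they receive identical vectors because they play the same structural role in every snapshot. A proximity-based embedding would not be expected to show this property.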

2022 ◽  
Vol 40 (4) ◽  
pp. 1-45
Weiren Yu ◽  
Julie McCann ◽  
Chengyuan Zhang ◽  
Hakan Ferhatosmanoglu

SimRank is an attractive link-based similarity measure used in fertile fields of Web search and sociometry. However, the existing deterministic method by Kusumoto et al. [24] for retrieving SimRank does not always produce high-quality similarity results, as it fails to accurately obtain the diagonal correction matrix $D$. Moreover, SimRank has a “connectivity trait” problem: increasing the number of paths between a pair of nodes would decrease its similarity score. The best-known remedy, SimRank++ [1], cannot completely fix this problem, since its score would still be zero if there are no common in-neighbors between two nodes. In this article, we study fast high-quality link-based similarity search on billion-scale graphs. (1) We first devise a “varied-$D$” method to accurately compute SimRank in linear memory. We also aggregate duplicate computations, which reduces the time of [24] from quadratic to linear in the number of iterations. (2) We propose a novel “cosine-based” SimRank model to circumvent the “connectivity trait” problem. (3) To substantially speed up the partial-pairs “cosine-based” SimRank search on large graphs, we devise an efficient dimensionality reduction algorithm, PSR#, with guaranteed accuracy. (4) We give mathematical insights into the semantic difference between SimRank and its variant, and correct an argument in [24] that “if $D$ is replaced by a scaled identity matrix $(1-\gamma)I$, their top-K rankings will not be affected much”. (5) We propose a novel method that can accurately convert from Li et al.'s SimRank $\tilde{S}$ to Jeh and Widom’s SimRank $S$. (6) We propose GSR#, a generalisation of our “cosine-based” SimRank model, to quantify pairwise similarities across two distinct graphs, unlike SimRank, which would assess nodes across two graphs as completely dissimilar. Extensive experiments on various datasets demonstrate the superiority of our proposed approaches in terms of high search quality, computational efficiency, accuracy, and scalability on billion-edge graphs.
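
For readers unfamiliar with the baseline measure, the sketch below implements the naive fixed-point iteration for Jeh and Widom's SimRank, in which two distinct nodes are similar to the extent that their in-neighbors are similar. It is a textbook-style illustration for tiny graphs only, not the “varied-D”, cosine-based, PSR#, or GSR# methods of the article. The decay value 0.6, the iteration count, and the function name `simrank` are assumptions.

```python
def simrank(in_neighbors, decay=0.6, iterations=10):
    """Naive all-pairs SimRank (Jeh and Widom) on a small directed graph.

    in_neighbors maps each node to the list of nodes that point to it;
    every node that appears in a list must also be a key of the dict.
    """
    nodes = list(in_neighbors)
    # Iteration 0: a node is fully similar to itself and to nothing else.
    sim = {a: {b: 1.0 if a == b else 0.0 for b in nodes} for a in nodes}

    for _ in range(iterations):
        updated = {a: {} for a in nodes}
        for a in nodes:
            for b in nodes:
                if a == b:
                    updated[a][b] = 1.0
                    continue
                in_a, in_b = in_neighbors[a], in_neighbors[b]
                if not in_a or not in_b:
                    updated[a][b] = 0.0  # no in-neighbors: similarity is zero
                    continue
                total = sum(sim[i][j] for i in in_a for j in in_b)
                updated[a][b] = decay * total / (len(in_a) * len(in_b))
        sim = updated
    return sim


# Toy graph: u -> a and u -> b, so a and b share their only in-neighbor.
scores = simrank({"u": [], "a": ["u"], "b": ["u"]})
print(scores["a"]["b"], scores["a"]["u"])
```

Each iteration of this naive form visits every node pair and every pair of their in-neighbors, which is why work on large graphs, including the article summarized above, focuses on more memory- and time-efficient formulations.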

2022 ◽  
Vol 210 ◽  
pp. 114432
Chengshang Zhou ◽  
Fangrui Lin ◽  
Pei Sun ◽  
Zoujun Chen ◽  
Zhongyuan Duan ◽  
