LDPCD: A Novel Method for Locally Differentially Private Community Detection

2022 ◽  
Vol 2022 ◽  
pp. 1-18
Author(s):  
Zhejian Zhang

As one of the cores of data analysis in large social networks, community detection has become a hot research topic in recent years. However, because the server is only semitrusted, users' real social relationships may be at risk of privacy leakage and threatened by inference attacks. As a result, community detection in social graphs under local differential privacy has gradually aroused the interest of industry and academia. On the one hand, the distortion of users' real data caused by existing privacy-preserving mechanisms can seriously affect the mining of densely connected local graph structures, resulting in low utility of the final community division. On the other hand, private community detection requires using the results of multiple user-server interactions to adjust each user's partition, which inevitably leads to excessive allocation of the privacy budget and large errors in the perturbed data. For these reasons, a new community detection method based on the local differential privacy model (named LDPCD) is proposed in this paper. Introducing a truncated Laplace mechanism improves the accuracy of users' perturbed data. In addition, the divisive community detection algorithm based on extremal optimization (EO) is refined to reduce the number of interactions between users and the server. Thus, the total privacy overhead is reduced and strong privacy protection is guaranteed. Finally, LDPCD is applied to two commonly used real-world datasets, and its advantage is experimentally validated against two state-of-the-art methods.
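To make the idea of a truncated Laplace perturbation concrete, here is a minimal sketch. It assumes one common reading of "truncated": noisy values are resampled until they fall inside a fixed range. The function names, the range, and the scale are illustrative assumptions, not taken from the paper, and the sketch makes no claim about which scale achieves a given privacy budget.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def truncated_laplace(value, scale, lower, upper, rng):
    """Add Laplace noise to `value`, resampling until the result lies in [lower, upper].

    Hypothetical helper: the paper's mechanism and its calibration may differ.
    """
    while True:
        noisy = value + laplace_noise(scale, rng)
        if lower <= noisy <= upper:
            return noisy

# Example: a user reports a perturbed degree that must stay within [0, 10].
rng = random.Random(0)
reported = truncated_laplace(5.0, 2.0, 0.0, 10.0, rng)
```

Keeping the report inside a plausible range removes the extreme outliers that unbounded Laplace noise can produce, which is one intuition for why truncation can help accuracy.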

2019 ◽  
Vol 2019 (1) ◽  
pp. 26-46 ◽  
Author(s):  
Thee Chanyaswad ◽  
Changchang Liu ◽  
Prateek Mittal

A key challenge facing the design of differential privacy in the non-interactive setting is to maintain the utility of the released data. To overcome this challenge, we utilize the Diaconis-Freedman-Meckes (DFM) effect, which states that most projections of high-dimensional data are nearly Gaussian. Hence, we propose the RON-Gauss model that leverages the novel combination of dimensionality reduction via random orthonormal (RON) projection and the Gaussian generative model for synthesizing differentially-private data. We analyze how RON-Gauss benefits from the DFM effect, and present multiple algorithms for a range of machine learning applications, including both unsupervised and supervised learning. Furthermore, we rigorously prove that (a) our algorithms satisfy the strong ε-differential privacy guarantee, and (b) RON projection can lower the level of perturbation required for differential privacy. Finally, we illustrate the effectiveness of RON-Gauss under three common machine learning applications (clustering, classification, and regression) on three large real-world datasets. Our empirical results show that (a) RON-Gauss outperforms previous approaches by up to an order of magnitude, and (b) loss in utility compared to the non-private real data is small. Thus, RON-Gauss can serve as a key enabler for real-world deployment of privacy-preserving data release.


Entropy ◽  
2020 ◽  
Vol 22 (12) ◽  
pp. 1383
Author(s):  
Jinfang Sheng ◽  
Cheng Liu ◽  
Long Chen ◽  
Bin Wang ◽  
Junkai Zhang

With the rapid development of computer technology, research on complex networks has attracted more and more attention. Currently popular research directions such as cloud computing, big data, internet of vehicles, and distributed systems are all based on complex networks. Community structure detection is an important research hotspot in complex networks. Dividing the community structure quickly and accurately, and doing so on large-scale networks, is a difficult task. In this paper, we put forward a new community detection approach based on internode attraction, named IACD. The algorithm starts from the important nodes of the complex network, uses the gravitational relationship between two objects in physics to represent the forces between nodes in the network dataset, and then performs community detection. Experiments on a large number of real-world datasets and synthetic networks show that the IACD algorithm can quickly and accurately divide the community structure, and that it is superior to some classic algorithms and recently proposed algorithms.
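As a rough illustration of a gravity-style attraction between nodes, the sketch below treats node degree as "mass" and shortest-path hop count as "distance". Both choices, and the function names, are assumptions made for illustration; the abstract does not specify how IACD defines these quantities.

```python
from collections import deque

def hop_distance(adj, source, target):
    """Breadth-first search hop count between two nodes; None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nbr in adj[node]:
            if nbr == target:
                return dist + 1
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, dist + 1))
    return None

def attraction(adj, u, v):
    """Gravity-like score: degree(u) * degree(v) / distance(u, v)^2 (assumed form)."""
    d = hop_distance(adj, u, v)
    if not d:  # same node (0) or unreachable (None)
        return 0.0
    return len(adj[u]) * len(adj[v]) / d ** 2

# Small undirected graph given as adjacency lists.
adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
score_ac = attraction(adj, "a", "c")
score_ad = attraction(adj, "a", "d")
```

Under this assumed form, well-connected nodes that are close together attract each other strongly, which matches the abstract's emphasis on starting from important nodes.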


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
João Lobo ◽  
Rui Henriques ◽  
Sara C. Madeira

Background: Three-way data started to gain popularity due to their increasing capacity to describe inherently multivariate and temporal events, such as biological responses, social interactions over time, urban dynamics, or complex geophysical phenomena. Triclustering, subspace clustering of three-way data, enables the discovery of patterns corresponding to data subspaces (triclusters) with values correlated across the three dimensions (observations × features × contexts). With an increasing number of algorithms being proposed, effectively comparing them with state-of-the-art algorithms is paramount. These comparisons are usually performed using real data, without a known ground truth, thus limiting the assessments. In this context, we propose a synthetic data generator, G-Tric, allowing the creation of synthetic datasets with configurable properties and the possibility to plant triclusters. The generator is prepared to create datasets resembling real three-way data from biomedical and social data domains, with the additional advantage of further providing the ground truth (triclustering solution) as output. Results: G-Tric can replicate real-world datasets and create new ones that match researchers' needs across several properties, including data type (numeric or symbolic), dimensions, and background distribution. Users can tune the patterns and structure that characterize the planted triclusters (subspaces) and how they interact (overlapping). Data quality can also be controlled by defining the amount of missing values, noise, or errors. Furthermore, a benchmark of datasets resembling real data is made available, together with the corresponding triclustering solutions (planted triclusters) and generating parameters. Conclusions: Triclustering evaluation using G-Tric provides the possibility to combine both intrinsic and extrinsic metrics to compare solutions that produce more reliable analyses. A set of predefined datasets, mimicking widely used three-way data and exploring crucial properties, was generated and made available, highlighting G-Tric’s potential to advance the triclustering state of the art by easing the process of evaluating the quality of new triclustering approaches.


2021 ◽  
Vol 17 (4) ◽  
pp. 1-30
Author(s):  
Qiben Yan ◽  
Jianzhi Lou ◽  
Mehmet C. Vuran ◽  
Suat Irmak

Precision agriculture has become a promising paradigm to transform modern agriculture. The recent revolution in big data and Internet-of-Things (IoT) provides unprecedented benefits including optimizing yield, minimizing environmental impact, and reducing cost. However, the mass collection of farm data in IoT applications raises serious concerns about potential privacy leakage that may harm the farmers’ welfare. In this work, we propose a novel scalable and private geo-distance evaluation system, called SPRIDE, to allow application servers to provide geographic-based services by computing the distances among sensors and farms privately. The servers determine the distances without learning any additional information about their locations. The key idea of SPRIDE is to perform efficient distance measurement and distance comparison on encrypted locations over a sphere by leveraging a homomorphic cryptosystem. To serve a large user base, we further propose SPRIDE+ with novel and practical performance enhancements based on pre-computation of cryptographic elements. Through extensive experiments using real-world datasets, we show SPRIDE+ achieves private distance evaluation on a large network of farms, attaining 3+ times runtime performance improvement over existing techniques. We further show SPRIDE+ can run on resource-constrained mobile devices, which offers a practical solution for privacy-preserving precision agriculture IoT applications.
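For orientation, the following sketch computes a great-circle distance between two locations on a sphere using the haversine formula, in plaintext. This is only the underlying geometry: it assumes the haversine form and a mean Earth radius, and it includes none of the homomorphic encryption that SPRIDE uses to keep locations private. The abstract does not state which spherical distance formula the system adopts.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    # Clamp to guard against tiny floating-point overshoot above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))

def within_range(lat1, lon1, lat2, lon2, threshold_km):
    """Distance comparison: is the second point within threshold_km of the first?"""
    return haversine_km(lat1, lon1, lat2, lon2) <= threshold_km
```

A private system would have to evaluate an expression of this kind over encrypted coordinates, which is where the cost that SPRIDE+ targets with pre-computation would arise.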


Entropy ◽  
2021 ◽  
Vol 23 (6) ◽  
pp. 680
Author(s):  
Hanyang Lin ◽  
Yongzhao Zhan ◽  
Zizheng Zhao ◽  
Yuzhong Chen ◽  
Chen Dong

There is a wealth of information in real-world social networks. In addition to the topology information, the vertices or edges of a social network often have attributes, with many of the overlapping vertices belonging to several communities simultaneously. It is challenging to fully utilize the additional attribute information to detect overlapping communities. In this paper, we first propose an overlapping community detection algorithm based on an augmented attribute graph. An improved weight adjustment strategy for attributes is embedded in the algorithm to help detect overlapping communities more accurately. Second, we enhance the algorithm to automatically determine the number of communities by a node-density-based fuzzy k-medoids process. Extensive experiments on both synthetic and real-world datasets demonstrate that the proposed algorithms can effectively detect overlapping communities with fewer parameters compared to the baseline methods.


2021 ◽  
pp. 1-17
Author(s):  
Mohammed Al-Andoli ◽  
Wooi Ping Cheah ◽  
Shing Chiang Tan

Detecting communities is an important multidisciplinary research discipline and is considered vital to understand the structure of complex networks. Deep autoencoders have been successfully proposed to solve the problem of community detection. However, existing models in the literature are trained based on gradient descent optimization with the backpropagation algorithm, which is known to converge to local minima and prove inefficient, especially in big data scenarios. To tackle these drawbacks, this work proposed a novel deep autoencoder with Particle Swarm Optimization (PSO) and continuation algorithms to reveal community structures in complex networks. The PSO and continuation algorithms were utilized to avoid the local minimum and premature convergence, and to reduce overall training execution time. Two objective functions were also employed in the proposed model: minimizing the cost function of the autoencoder, and maximizing the modularity function, which refers to the quality of the detected communities. This work also proposed other methods to work in the absence of continuation, and to enable premature convergence. Extensive empirical experiments on 11 publicly available real-world datasets demonstrated that the proposed method is effective and promising for deriving communities in complex networks, as well as outperforming state-of-the-art deep learning community detection algorithms.
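The modularity objective mentioned above can be sketched as follows, assuming the standard Newman-Girvan definition for a simple undirected, unweighted graph without self-loops. The function name and input format are illustrative; the paper may use a different formulation inside its training loop.

```python
def modularity(edges, community):
    """Newman-Girvan modularity Q = sum over communities of (L_c / m - (d_c / (2m))^2).

    edges:     list of (u, v) pairs, each undirected edge listed once
    community: dict mapping node -> community label
    """
    m = len(edges)
    intra = {}       # L_c: number of edges with both endpoints in community c
    degree_sum = {}  # d_c: total degree of nodes in community c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            intra[cu] = intra.get(cu, 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in degree_sum.items())

# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
q_split = modularity(edges, split)
```

Higher values indicate that more edges fall inside communities than would be expected at random given the node degrees, which is why it serves as a quality score to maximize.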


2014 ◽  
Vol 17 (01) ◽  
pp. 1450001 ◽  
Author(s):  
Michel Crampes ◽  
Michel Plantié

With the spread of social networks on the Internet, community detection in social graphs has recently become an important research domain. Interest was initially limited to unipartite graph inputs and partitioned community outputs. More recently, bipartite graphs, directed graphs and overlapping communities have all been investigated. Few contributions, however, have encompassed all three types of graphs simultaneously. In this paper, we present a method that unifies community detection for these three types of graphs while also merging partitioned and overlapping communities. Moreover, the results are visualized in a way that allows for analysis and semantic interpretation. For validation purposes, this method is tested on some well-known simple benchmarks and then applied to real data: photos and tags in Facebook and Human Brain Tractography data. This last application leads to the possibility of applying community detection methods to other fields such as data analysis with original enhanced performances.


2021 ◽  
Vol 12 (4) ◽  
pp. 118-131
Author(s):  
Jaya Krishna Raguru ◽  
Devi Prasad Sharma

The problem of identifying a seed set of K nodes that maximizes influence spread over a social network is known as influence maximization (IM). Past work showed this problem to be NP-hard, and greedy algorithms are guaranteed to reach only about 63% of the optimal spread. This approach is also expensive, suffering from high computational cost. Furthermore, in a network with communities, IM spread is not always certain. In this paper, a heterogeneous influence maximization through community detection (HIMCD) algorithm is proposed. This approach selects initial seed nodes within communities using various centrality measures, and these seed nodes act as sources for influence spread. Parallel influence maximization is then applied using the seed node set contained in each group: the graph is partitioned and IM computations are done in a distributed manner. Extensive experiments with two real-world datasets reveal that HIMCD achieves substantial performance improvement over state-of-the-art techniques.
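The 63% figure cited above corresponds to the classical approximation guarantee of greedy selection for monotone submodular objectives. As a hedged restatement, with σ denoting expected influence spread, S* an optimal seed set of size K, and S_greedy the greedy seed set (the notation is an assumption, not the paper's):

```latex
\sigma(S_{\text{greedy}}) \;\ge\; \left(1 - \frac{1}{e}\right)\sigma(S^{*}) \;\approx\; 0.632\,\sigma(S^{*}),
\qquad |S_{\text{greedy}}| = |S^{*}| = K
```

The bound is a worst-case guarantee relative to the optimum, not a statement that greedy selection reaches only 63% of the network.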

