Differentially Private Attributed Network Releasing Based on Early Fusion

2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Yuye Wang ◽  
Jing Yang ◽  
Jianpei Zhang

Vertex attributes have a strong impact on the analysis of social networks. Since these attributes are often sensitive, effective ways are needed to protect the privacy of graphs with correlated attributes. Prior work has mostly treated the graph topological structure and the attributes separately, combining them by defining a relevancy measure between the two. However, such methods must add noise to each component separately, which inflates the total required noise and reduces data utility. In this paper, we introduce an approach to releasing graphs with correlated attributes under differential privacy based on early fusion. We combine the graph topological structure and the attributes with a private probability model and generate a synthetic network satisfying differential privacy. Extensive experiments demonstrate that our approach meets the requirements of attributed networks and achieves high data utility.
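The abstract does not spell out its mechanism, but the standard building block for differentially private counts over graph data is the Laplace mechanism. A minimal sketch, with illustrative function names, might look like:

```python
import math
import random

def laplace_noise(scale: float) -> float:
    # Inverse-CDF sampling from the Laplace(0, scale) distribution.
    u = random.random() - 0.5
    return -scale * math.copysign(math.log(1 - 2 * abs(u)), u)

def noisy_attribute_histogram(counts, epsilon):
    # A histogram over vertex attributes has L1 sensitivity 1 when one
    # vertex's attribute changes, so adding Laplace(1/epsilon) noise to
    # each bin satisfies epsilon-differential privacy.
    return [c + laplace_noise(1.0 / epsilon) for c in counts]
```

The early-fusion model itself would build on such noisy counts rather than perturbing topology and attributes independently.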

2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Jing Yang ◽  
Yuye Wang ◽  
Jianpei Zhang

Releasing evolving networks that contain sensitive information can compromise individual privacy. In this paper, we study the problem of releasing evolving networks under differential privacy and explore the design of a differentially private release algorithm for them. We found that most traditional methods provide a snapshot of the network under differential privacy over a brief period of time. Because the network structure changes only locally, the total amount of required noise is large, leading to poor utility. To this end, we propose GHRG-DP, a novel differentially private evolving-network releasing algorithm that reduces the noise scale and achieves high data utility. In the GHRG-DP algorithm, we learn the online connection probabilities between vertices in the evolving network with a generalized hierarchical random graph (GHRG) model. To fit the dynamic environment, a method for adjusting the dendrogram structure in local areas is proposed to reduce the noise scale over the whole period. Moreover, to avoid unhelpful outcomes for the connection probabilities, a Bayesian method for calculating noisy probabilities is proposed. Through formal privacy analysis, we show that the GHRG-DP algorithm is ε-differentially private. Experiments on real evolving-network datasets illustrate that GHRG-DP can privately release evolving networks with high accuracy.
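The Bayesian correction is the paper's own contribution and is not reproduced here; the naive noisy estimate it improves upon, perturbing a group-pair edge count and clamping the resulting probability, can be sketched as follows (names are assumptions for illustration):

```python
import numpy as np

def clamped_connection_prob(edge_count, pair_count, epsilon, rng):
    # An edge count between two vertex groups has sensitivity 1 under
    # edge-level differential privacy; perturb it with Laplace noise,
    # normalize by the number of vertex pairs, and clamp to [0, 1].
    noisy = edge_count + rng.laplace(0.0, 1.0 / epsilon)
    return float(np.clip(noisy / pair_count, 0.0, 1.0))
```

The clamping step is exactly where naive estimates become unhelpful at small counts, which motivates a Bayesian treatment.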


Author(s):  
Hao Wang ◽  
Xiao Peng ◽  
Yihang Xiao ◽  
Zhengquan Xu ◽  
Xian Chen

Abstract Privacy-preserving methods that support data aggregation have attracted the attention of researchers in multidisciplinary fields. Among these methods, differential privacy (DP) has become an influential privacy mechanism owing to its rigorous privacy guarantee and high data utility. But DP places no bound on the noise, which can lead to low utility. Recently, researchers have investigated how to preserve a rigorous privacy guarantee while limiting the relative error to a fixed bound. However, these schemes destroy statistical properties, including the mean, variance, and MSE, which are foundational for data aggregation and analysis. In this paper, we explore an optimal privacy-preserving solution, including novel definitions and implementing mechanisms, that maintains these statistical properties while satisfying DP with a fixed relative-error bound. Experimental evaluation demonstrates that our mechanism outperforms current schemes in terms of security and utility for large quantities of queries.
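To see why unbounded zero-mean noise is friendly to the statistical properties mentioned above, one can check the plain Laplace mechanism directly: it is unbiased and inflates the variance by exactly 2b², a known, correctable constant. This is an illustration of the baseline, not the paper's bounded mechanism:

```python
import numpy as np

def perturb(data, epsilon, sensitivity=1.0, seed=0):
    # Laplace(b) noise with b = sensitivity / epsilon has mean 0 and
    # variance 2 * b**2, so the sample mean stays unbiased and the
    # sample variance grows by a constant an analyst can subtract off.
    b = sensitivity / epsilon
    rng = np.random.default_rng(seed)
    return data + rng.laplace(0.0, b, size=len(data))
```

Truncating or rescaling the noise to meet a relative-error bound breaks both properties, which is the tension the paper addresses.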


2016 ◽  
Vol E99.D (8) ◽  
pp. 2069-2078 ◽  
Author(s):  
Mohammad Rasool SARRAFI AGHDAM ◽  
Noboru SONEHARA

2022 ◽  
Vol 40 (3) ◽  
pp. 1-36
Author(s):  
Jinyuan Fang ◽  
Shangsong Liang ◽  
Zaiqiao Meng ◽  
Maarten De Rijke

Network-based information has been widely explored and exploited in the information retrieval literature. Attributed networks, consisting of nodes, edges, and attributes describing properties of nodes, are a basic type of network-based data, and are especially useful for many applications. Examples include user profiling in social networks and item recommendation in user-item purchase networks. Learning useful and expressive representations of entities in attributed networks can provide more effective building blocks for downstream network-based tasks such as link prediction and attribute inference. In practice, the input features of attributed networks are normalized to unit directional vectors. However, most network embedding techniques ignore the spherical nature of the inputs and focus on learning representations in a Gaussian or Euclidean space, which, we hypothesize, might lead to less effective representations. To obtain more effective representations of attributed networks, we investigate the problem of mapping an attributed network with unit-normalized directional features into a non-Gaussian and non-Euclidean space. Specifically, we propose a hyperspherical variational co-embedding for attributed networks (HCAN), which is based on generalized variational auto-encoders for heterogeneous data with multiple types of entities. HCAN jointly learns latent embeddings for both nodes and attributes in a unified hyperspherical space such that the affinities between nodes and attributes can be captured effectively. We argue that this is a crucial feature in many real-world applications of attributed networks. Previous Gaussian network embedding algorithms break the assumption of an uninformative prior, which leads to unstable results and poor performance. In contrast, HCAN embeds nodes and attributes as von Mises-Fisher distributions, and allows one to capture the uncertainty of the inferred representations. Experimental results on eight datasets show that HCAN yields better performance in a number of applications compared with nine state-of-the-art baselines.
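The unit normalization of input features described above is straightforward to reproduce; a small sketch:

```python
import numpy as np

def to_unit_directions(features, eps=1e-12):
    # Row-normalize a node-by-attribute feature matrix so that every
    # node's feature vector lies on the unit hypersphere; eps guards
    # against division by zero for all-zero rows.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    return features / np.maximum(norms, eps)
```

Vectors prepared this way carry only directional information, which is why a hyperspherical latent space is a natural fit.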


2015 ◽  
Vol 31 (4) ◽  
pp. 737-761 ◽  
Author(s):  
Matthias Templ

Abstract Scientific- or public-use files are typically produced by applying anonymisation methods to the original data. Anonymised data should have both low disclosure risk and high data utility. Data utility is often measured by comparing well-known estimates from the original and anonymised data, such as their means, covariances, or eigenvalues. However, not every estimate can be preserved. The aim is therefore to preserve the most important estimates; that is, instead of calculating generally defined utility measures, we propose evaluation on context- and data-dependent indicators. In this article we define such indicators and utility measures for the Structure of Earnings Survey (SES) microdata, and give guidelines for selecting indicators and models and for evaluating the resulting estimates. For this purpose, hundreds of publications in journals and from national statistical agencies were reviewed to gain insight into how the SES data are used for research and which indicators are relevant for policy making. Besides the mathematical description of the indicators and a brief description of the most common models applied to SES, four different anonymisation procedures are applied, and the resulting indicators and models are compared to those obtained from the unmodified data. The disclosure risk is reported and the data utility is evaluated for each anonymised data set, based on the most important indicators and a model often used in practice.
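A context-dependent utility indicator of the kind the article advocates can be as simple as the relative deviation of a key estimate; the indicator name below is illustrative, not taken from the article:

```python
def relative_utility_loss(original_estimate, anonymised_estimate):
    # Relative deviation of an estimate (e.g., a mean wage from SES)
    # between the original and the anonymised microdata; 0.0 means the
    # anonymisation preserved this indicator exactly.
    return abs(anonymised_estimate - original_estimate) / abs(original_estimate)
```

In practice one would compute this for the small set of estimates actually used in research and policy making, rather than for every summary statistic.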


2018 ◽  
Vol 8 (1) ◽  
Author(s):  
Yue Wang ◽  
Daniel Kifer ◽  
Jaewoo Lee ◽  
Vishesh Karwa

Statistics computed from data are viewed as random variables. When they are used for tasks like hypothesis testing and confidence intervals, their true finite-sample distributions are often replaced by approximating distributions that are easier to work with (for example, the Gaussian, which results from using approximations justified by the Central Limit Theorem). When data are perturbed by differential privacy, the approximating distributions also need to be modified. Prior work provided various competing methods for creating such approximating distributions, with little formal justification beyond the fact that they worked well empirically. In this paper, we study the question of how to generate statistical approximating distributions for differentially private statistics, and provide finite-sample guarantees for the quality of the approximations.
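As a concrete instance of the naive plug-in style of approximation the paper scrutinizes, the sampling standard deviation of a Laplace-perturbed sample mean is often approximated by adding a CLT term and a noise term. This sketch assumes the noise is added to the sum before dividing by n:

```python
import math

def noisy_mean_std(sigma, n, epsilon, sensitivity=1.0):
    # Approximate std of (sum(x) + Laplace(b)) / n with b = sensitivity/epsilon:
    # the CLT contributes sigma**2 / n and the independent Laplace noise
    # contributes its variance 2 * b**2 scaled by 1 / n**2.
    b = sensitivity / epsilon
    return math.sqrt(sigma**2 / n + 2.0 * b**2 / n**2)
```

The paper's point is that such plug-in formulas need finite-sample justification, which prior work largely lacked.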


2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Xiang Liu ◽  
Yuchun Guo ◽  
Xiaoying Tan ◽  
Yishuai Chen

Nowadays, many data mining applications, such as web traffic analysis and content popularity prediction, leverage users’ web browsing trajectories to improve their performance. However, the disclosure of these trajectories is a prominent privacy issue. Differential privacy is a privacy model used to rigorously protect users’ privacy, and some works have applied it to spatial-temporal streams. However, these works either protect users’ activities at different places separately or protect their activities at all places jointly. The former cannot protect trajectories that traverse multiple places, while the latter ignores the differences among places and suffers degraded data utility (i.e., data accuracy). In this paper, we propose (w, n)-differential privacy to protect any spatial-temporal sequence occurring in w successive timestamps and n-range places. To achieve better data utility, we propose two implementation algorithms, named Spatial-Temporal Budget Distribution (STBD) and Spatial-Temporal RescueDP (STR). Theoretical analysis and experimental results show that these two algorithms can balance data utility and trajectory privacy.
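STBD's adaptive allocation is the paper's contribution; the even-split baseline it improves on follows directly from sequential composition of differential privacy, sketched here under that assumption:

```python
def even_budget_split(total_epsilon, w, n):
    # Sequential composition: if each of the w * n (timestamp, place)
    # releases in a protected window is eps-DP, the window as a whole is
    # (w * n * eps)-DP, so an even split uses eps = total / (w * n).
    return total_epsilon / (w * n)
```

Adaptive schemes instead give more budget to timestamps and places where the stream actually changes, which is where the utility gain comes from.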


Author(s):  
Carl Yang ◽  
Haonan Wang ◽  
Ke Zhang ◽  
Liang Chen ◽  
Lichao Sun

Many data mining and analytical tasks rely on the abstraction of networks (graphs) to summarize relational structures among individuals (nodes). Since relational data are often sensitive, we aim to develop effective approaches to generating utility-preserving yet privacy-protected structured data. In this paper, we leverage the differential privacy (DP) framework to formulate and enforce rigorous privacy constraints on deep graph generation models, focusing on edge-DP to guarantee individual link privacy. In particular, we enforce edge-DP by injecting designated noise into the gradients of a link-reconstruction-based graph generation model, while ensuring data utility by improving structure learning with structure-oriented graph discrimination. Extensive experiments on two real-world network datasets show that our proposed DPGGAN model is able to generate graphs with effectively preserved global structure and rigorously protected individual link privacy.
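The noise-injection step the abstract describes resembles standard DP gradient perturbation: clip each gradient to bound its sensitivity, then add calibrated Gaussian noise. A minimal sketch with assumed parameter names, not DPGGAN's exact procedure:

```python
import numpy as np

def privatize_gradient(grad, clip_norm, noise_multiplier, rng):
    # Scale the gradient down so its L2 norm is at most clip_norm
    # (bounding per-example sensitivity), then add Gaussian noise whose
    # std is noise_multiplier * clip_norm, as in DP-SGD-style training.
    norm = np.linalg.norm(grad)
    clipped = grad * min(1.0, clip_norm / max(norm, 1e-12))
    return clipped + rng.normal(0.0, noise_multiplier * clip_norm, size=grad.shape)
```

The privacy accounting over many training steps is then handled by a composition theorem or a moments accountant.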

