Improving Frequency Estimation under Local Differential Privacy

Author(s):  
Milan Lopuhaä-Zwakenberg ◽  
Zitao Li ◽  
Boris Škorić ◽  
Ninghui Li
2021 ◽  
Vol 14 (11) ◽  
pp. 2046-2058
Author(s):  
Graham Cormode ◽  
Samuel Maddock ◽  
Carsten Maple

Private collection of statistics from a large distributed population is an important problem, and it has led to large-scale deployments by several leading technology companies. The dominant approach requires each user to randomly perturb their input, which yields guarantees in the local differential privacy model. In this paper, we place the various approaches that have been suggested into a common framework, and perform an extensive series of experiments to understand the tradeoffs between different implementation choices. Our conclusion is that, for the core problems of frequency estimation and heavy hitter identification, a careful choice of algorithms can lead to very effective solutions that scale to millions of users.
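As an illustration of the "randomly perturb their input" approach for frequency estimation, here is a minimal sketch of k-ary (generalized) randomized response, one standard LDP frequency oracle. It is not taken from the paper above; the function names `grr_perturb` and `grr_estimate` and the integer encoding of items as 0..k-1 are assumptions made for this example.

```python
import math
import random
from collections import Counter


def grr_perturb(value, k, epsilon, rng):
    """Client side: keep the true value with probability p, else report another value.

    Hypothetical helper; items are assumed to be integers in range(k).
    """
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p:
        return value
    # Pick uniformly among the k - 1 values different from the true one.
    other = rng.randrange(k - 1)
    return other if other < value else other + 1


def grr_estimate(reports, k, epsilon):
    """Server side: unbiased frequency estimates from the perturbed reports."""
    n = len(reports)
    e = math.exp(epsilon)
    p = e / (e + k - 1)  # probability of reporting the true value
    q = 1 / (e + k - 1)  # probability of reporting one specific other value
    counts = Counter(reports)
    # Invert the expected mixing: E[count_v / n] = f_v * p + (1 - f_v) * q.
    return [(counts.get(v, 0) / n - q) / (p - q) for v in range(k)]


if __name__ == "__main__":
    rng = random.Random(0)
    true_values = [0] * 10000 + [1] * 6000 + [2] * 3000 + [3] * 1000
    reports = [grr_perturb(v, 4, 1.0, rng) for v in true_values]
    print(grr_estimate(reports, 4, 1.0))
```

The estimates always sum to 1 but can be individually negative for rare items; post-processing such as clipping is a separate design choice not shown here.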


Sensors ◽  
2020 ◽  
Vol 20 (24) ◽  
pp. 7030
Author(s):  
Teng Wang ◽  
Xuefeng Zhang ◽  
Jingyu Feng ◽  
Xinyu Yang

Collecting and analyzing the massive data generated by smart devices has become increasingly pervasive in crowdsensing, where such data are the building blocks for data-driven decision-making. However, extensive statistics and analysis of such data seriously threaten the privacy of participating users. Local differential privacy (LDP) has been proposed as a prevalent privacy model with a distributed architecture that provides strong privacy guarantees for each user while data are collected and analyzed. LDP ensures that each user’s data is perturbed locally on the client side before being sent to the server side, thereby protecting the data from privacy leaks on both the client side and the server side. This survey presents a comprehensive and systematic overview of LDP with respect to privacy models, research tasks, enabling mechanisms, and various applications. Specifically, we first provide a theoretical summary of LDP, including the LDP model, the variants of LDP, and the basic framework of LDP algorithms. Then, we investigate and compare the diverse LDP mechanisms for various data statistics and analysis tasks from the perspectives of frequency estimation, mean estimation, and machine learning. Furthermore, we summarize practical LDP-based application scenarios. Finally, we outline several future research directions under LDP.
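For reference, the standard ε-LDP guarantee that such locally perturbing mechanisms are designed to meet can be stated as follows. The symbols M (the randomized mechanism run on the client), v and v' (any two possible inputs), and y (any possible output) are notation chosen for this note, not taken from the survey.

```latex
% A randomized mechanism M satisfies epsilon-local differential privacy if,
% for every pair of inputs v, v' and every output y,
\[
  \Pr[\,M(v) = y\,] \;\le\; e^{\epsilon} \cdot \Pr[\,M(v') = y\,].
\]
```

A smaller ε forces the output distributions for any two inputs closer together, which gives stronger privacy at the cost of noisier estimates.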


Author(s):  
Dan Zhao ◽  
Hong Chen ◽  
Suyun Zhao ◽  
Xiaoying Zhang ◽  
Cuiping Li ◽  
...  

2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Zixuan Shen ◽  
Zhihua Xia ◽  
Peipeng Yu

The collection of multidimensional crowdsourced data has raised public concern because of privacy issues. To address this, local differential privacy (LDP) has been proposed to protect crowdsourced data without much loss of utility, and it is widely used in practice. However, existing LDP protocols ignore users’ personal privacy requirements, despite offering good utility for multidimensional crowdsourced data. In this paper, we account for the individual privacy requirements of data owners in the protection and utilization of their multidimensional data by introducing the notion of personalized LDP (PLDP). Specifically, we design personalized multiple optimized unary encoding (PMOUE) to perturb data owners’ data, which satisfies ϵ_total-PLDP. Then, an aggregation algorithm for frequency estimation on multidimensional data under PLDP is developed and described for two situations. Experiments are conducted on four real datasets, and the results show that the proposed aggregation algorithm yields high utility. Moreover, case studies with four real datasets demonstrate the efficiency and superiority of the proposed scheme.
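To give a sense of the building block that PMOUE extends, here is a minimal sketch of plain optimized unary encoding (OUE) for a single attribute with a single privacy budget. It is not the personalized, multidimensional PMOUE scheme of the paper; the names `oue_perturb` and `oue_estimate` and the integer encoding of items as 0..k-1 are assumptions made for this example.

```python
import math
import random


def oue_perturb(value, k, epsilon, rng):
    """Client side: one-hot encode the value, then perturb each bit independently.

    The true bit stays 1 with probability 1/2; every other bit becomes 1
    with probability q = 1 / (e^epsilon + 1). Hypothetical helper.
    """
    q = 1.0 / (math.exp(epsilon) + 1.0)
    return [
        1 if rng.random() < (0.5 if i == value else q) else 0
        for i in range(k)
    ]


def oue_estimate(reports, k, epsilon):
    """Server side: unbiased frequency estimates from the perturbed bit vectors."""
    n = len(reports)
    p = 0.5
    q = 1.0 / (math.exp(epsilon) + 1.0)
    # Count how many reports have bit i set, then invert the expected mixing.
    ones = [sum(r[i] for r in reports) for i in range(k)]
    return [(c / n - q) / (p - q) for c in ones]


if __name__ == "__main__":
    rng = random.Random(0)
    true_values = [0] * 10000 + [1] * 6000 + [2] * 3000 + [3] * 1000
    reports = [oue_perturb(v, 4, 1.0, rng) for v in true_values]
    print(oue_estimate(reports, 4, 1.0))
```

Unlike randomized response over the whole domain, the noise of each OUE estimate does not grow with the domain size k, at the cost of sending k bits per user.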


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Xu Zheng ◽  
Ke Yan ◽  
Jingyuan Duan ◽  
Wenyi Tang ◽  
Ling Tian

Local differential privacy has been considered the standard measure for privacy preservation in distributed data collection. Corresponding mechanisms have been designed for multiple types of tasks, such as frequency estimation for categorical values and mean estimation for numerical values. However, histogram publication for numerical values, which contains abundant and crucial clues about the whole dataset, has not been thoroughly studied under this measure. Simply encoding data into different intervals for each query soon exhausts the bandwidth and the privacy budget, which is infeasible in real scenarios. Therefore, this paper proposes a highly efficient framework for differentially private histogram publication of numerical values in a distributed environment. The proposed algorithms efficiently exploit the correlations among multiple queries and achieve optimal resource consumption. We also conduct extensive experiments on real-world data traces, and the results validate the improvement of the proposed algorithms.

