2021, Vol 2021, pp. 1-17
Author(s): Yunlu Bai, Geng Yang, Yang Xiang, Xuan Wang

For data analysis with differential privacy, an analysis task usually requires multiple queries to complete, and the total privacy budget must be divided and allocated across those queries. At present, however, budget allocation in differential privacy lacks efficient and general strategies, and most research adopts an averaged or exclusive allocation method. In this paper, we propose two series-based strategies for budget allocation: the geometric series and the Taylor series. We show the different characteristics of the two series and provide a calculation method for selecting their key parameters. To better reflect a user's noise preferences during allocation, we explore the relationship between sensitivity and noise in detail and, based on this, propose an optimization of the series strategies. Finally, to prevent collusion attacks and improve security, we provide three ideas for protecting the budget sequence. Both theoretical analysis and experimental results show that our methods support more queries and achieve higher utility. This demonstrates that our series allocation strategies are highly flexible, can meet users' needs, and can be applied to differentially private algorithms to achieve high performance while maintaining security.
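To make the geometric strategy concrete, here is a minimal sketch of series-based budget allocation. The function name and the parameterization are ours, not the authors': the i-th query receives a geometrically decreasing share of the total budget, so the allocated budgets never exceed the total no matter how many queries are answered.

```python
# A minimal sketch of geometric-series budget allocation (our
# illustration, not the authors' exact method or parameter selection).
def geometric_budgets(total_epsilon, ratio=0.5, n_queries=10):
    """Allocate total_epsilon over n_queries as a geometric series.

    Query i receives total_epsilon * (1 - ratio) * ratio**i, so the
    sum over all i >= 0 converges to total_epsilon from below.
    """
    assert 0 < ratio < 1
    return [total_epsilon * (1 - ratio) * ratio**i for i in range(n_queries)]

budgets = geometric_budgets(1.0, ratio=0.5, n_queries=8)
print(budgets)       # decreasing per-query budgets
print(sum(budgets))  # stays below the total budget of 1.0
```

A smaller ratio front-loads the budget onto early queries (less noise at first, faster decay); a ratio near 1 spreads it more evenly, which is the flexibility the series strategies trade on.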


Author(s): Anis Bkakria, Aimilia Tasidou, Nora Cuppens-Boulahia, Frédéric Cuppens, Fatma Bouattour, ...

2020, Vol 34 (01), pp. 784-791
Author(s): Qinbin Li, Zhaomin Wu, Zeyi Wen, Bingsheng He

The Gradient Boosting Decision Tree (GBDT) has been a popular machine learning model for a variety of tasks in recent years. In this paper, we study how to improve the model accuracy of GBDT while preserving the strong guarantee of differential privacy. Sensitivity and privacy budget are two key design aspects for the effectiveness of differentially private models. Existing solutions for GBDT with differential privacy suffer significant accuracy loss due to overly loose sensitivity bounds and ineffective privacy budget allocations (especially across the different trees in the GBDT model). Loose sensitivity bounds force more noise to be added to obtain a fixed privacy level; ineffective privacy budget allocations worsen the accuracy loss, especially when the number of trees is large. We therefore propose a new GBDT training algorithm that achieves tighter sensitivity bounds and more effective noise allocations. Specifically, by investigating the properties of gradients and the contribution of each tree in a GBDT, we propose adaptively controlling the gradients of the training data in each iteration and clipping leaf nodes in order to tighten the sensitivity bounds. Furthermore, we design a novel boosting framework that allocates the privacy budget between trees so that the accuracy loss is further reduced. Our experiments show that our approach achieves much better model accuracy than other baselines.
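The core sensitivity idea can be illustrated in a few lines. This is a simplified sketch, not the paper's full training algorithm: gradients are clipped to a bound, which caps how much any single instance can change a leaf value, and Laplace noise calibrated to that bound is added to the leaf. The function names and the simplified sensitivity expression are our assumptions.

```python
import numpy as np

# A simplified sketch of gradient clipping for DP-GBDT leaf values
# (our illustration; the paper's bounds and allocation are tighter).
def clip_gradients(gradients, clip_bound):
    """Clip per-instance gradients so leaf-value sensitivity is bounded."""
    return np.clip(gradients, -clip_bound, clip_bound)

def noisy_leaf_value(gradients, clip_bound, epsilon, reg_lambda=1.0):
    """Compute a GBDT leaf value from clipped gradients, plus Laplace noise.

    With every gradient in [-clip_bound, clip_bound], adding or removing
    one instance changes the leaf value by at most `sensitivity` below
    (a simplified worst-case bound).
    """
    g = clip_gradients(np.asarray(gradients, dtype=float), clip_bound)
    leaf = -g.sum() / (len(g) + reg_lambda)        # standard GBDT leaf value
    sensitivity = clip_bound / (1.0 + reg_lambda)  # simplified worst case
    return leaf + np.random.laplace(scale=sensitivity / epsilon)
```

Tightening `clip_bound` directly shrinks the noise scale, which is why the adaptive control of gradients matters for accuracy.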


2019, Vol 17 (4), pp. 450-460
Author(s): Hai Liu, Zhenqiang Wu, Changgen Peng, Feng Tian, Laifeng Lu

Given an untrusted server, differential privacy and local differential privacy have been used for privacy preservation in data aggregation. Our analysis shows that neither can achieve a Nash equilibrium between privacy and utility for mobile-service-based multiuser collaboration, in which multiple users collaboratively negotiate a desired privacy budget for privacy preservation. To this end, we propose a Privacy-Preserving Data Aggregation Framework (PPDAF) that reaches a Nash equilibrium between privacy and utility. First, we present an adaptive Gaussian mechanism that satisfies this equilibrium by multiplying an expected utility factor with conditional filtering noise under an expected privacy budget. Second, we construct PPDAF from the adaptive Gaussian mechanism, based on a negotiated privacy budget with heuristic obfuscation. Finally, our theoretical analysis and experimental evaluation show that PPDAF achieves a Nash equilibrium between privacy and utility. Furthermore, the framework can be extended to engineering instances in a data aggregation setting.
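As a hypothetical illustration of the mechanism's shape (the `utility_factor` name and its role are our assumptions, not the paper's definitions): standard Gaussian noise is calibrated to an expected budget, then scaled by a factor reflecting the negotiated utility preference.

```python
import numpy as np

# A hedged sketch of an "adaptive" Gaussian mechanism: classic
# (epsilon, delta) calibration, with the noise rescaled by a utility
# factor in (0, 1]. This is our simplification of the idea, not PPDAF.
def adaptive_gaussian(value, sensitivity, epsilon, delta, utility_factor=1.0):
    """Return value plus Gaussian noise scaled by a utility factor."""
    # Classic calibration for the Gaussian mechanism (valid for epsilon < 1).
    sigma = sensitivity * np.sqrt(2 * np.log(1.25 / delta)) / epsilon
    return value + utility_factor * np.random.normal(0.0, sigma)
```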


2021, Vol 14 (10), pp. 1805-1817
Author(s): David Pujol, Yikai Wu, Brandon Fain, Ashwin Machanavajjhala

Large organizations that collect data about populations (like the US Census Bureau) release summary statistics that are used by multiple stakeholders for resource allocation and policy making. These organizations are also legally required to protect the privacy of the individuals from whom they collect data. Differential Privacy (DP) provides a solution for releasing useful summary data while preserving privacy. Most DP mechanisms are designed to answer a single set of queries; in reality, multiple stakeholders often use a given data release and have overlapping but not identical queries. This introduces a novel joint optimization problem in DP where the privacy budget must be shared among different analysts. We initiate the study of DP query answering across multiple analysts. To capture the competing goals and priorities of multiple analysts, we formulate three desiderata that any mechanism should satisfy in this setting (the Sharing Incentive, Non-Interference, and Adaptivity) while still optimizing for overall error. We demonstrate how existing DP query-answering mechanisms in the multi-analyst setting fail to satisfy at least one of these desiderata. We present novel DP algorithms that provably satisfy all our desiderata and empirically show that they incur low error on realistic tasks.
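For reference, here is the naive multi-analyst baseline that the desiderata are measured against. This is our construction, not the paper's mechanism: split the budget evenly among analysts, then answer each analyst's queries independently via Laplace noise and sequential composition.

```python
import numpy as np

# A minimal multi-analyst baseline (our illustration): even budget split
# across analysts, Laplace mechanism per query, sequential composition.
def split_and_answer(analyst_queries, data, total_epsilon, sensitivity=1.0):
    """analyst_queries: dict mapping analyst name -> list of query functions."""
    per_analyst = total_epsilon / len(analyst_queries)
    answers = {}
    for analyst, queries in analyst_queries.items():
        per_query = per_analyst / len(queries)  # sequential composition
        answers[analyst] = [
            q(data) + np.random.laplace(scale=sensitivity / per_query)
            for q in queries
        ]
    return answers
```

When analysts share overlapping queries, this split answers the same query several times with independent noise, wasting budget; that inefficiency is precisely what mechanisms satisfying the Sharing Incentive are meant to avoid.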


2020, Vol 2020 (1), pp. 103-125
Author(s): Parameswaran Kamalaruban, Victor Perrier, Hassan Jameel Asghar, Mohamed Ali Kaafar

Differential privacy provides strong privacy guarantees while simultaneously enabling useful insights from sensitive datasets. However, it provides the same level of protection for all elements (individuals and attributes) in the data. There are practical scenarios where some data attributes need more or less protection than others. In this paper, we consider dX-privacy, an instantiation of the privacy notion introduced in [6], which allows this flexibility by specifying a separate privacy budget for each pair of elements in the data domain. We describe a systematic procedure to tailor any existing differentially private mechanism that takes a query set and a sensitivity vector as input into its dX-private variant, focusing specifically on linear queries. Our proposed meta-procedure has broad applications, as linear queries form the basis of a range of data analysis and machine learning algorithms, and the ability to define a more flexible privacy budget across the data domain improves the privacy/utility tradeoff in these applications. We propose several dX-private mechanisms and provide theoretical guarantees on the trade-off between utility and privacy. We also experimentally demonstrate the effectiveness of our procedure by evaluating our proposed dX-private Laplace mechanism on both synthetic and real datasets using a set of randomly generated linear queries.
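The general recipe can be sketched for linear queries. This is a hedged simplification of the idea (per-element budgets reduced here to one budget per query; names are ours): instead of a single global budget, each query's Laplace noise is scaled by its own sensitivity-to-budget ratio.

```python
import numpy as np

# A hedged sketch of per-query budget flexibility for linear queries
# (our simplification of the dX-privacy idea, not the paper's mechanism).
def per_budget_linear_queries(data, queries, sensitivities, budgets):
    """Answer linear queries with per-query Laplace noise scales.

    data: 1-D numpy array; queries: list of 1-D weight vectors;
    sensitivities, budgets: per-query values of the same length.
    """
    answers = []
    for q, s, eps in zip(queries, sensitivities, budgets):
        noise = np.random.laplace(scale=s / eps)
        answers.append(q @ data + noise)  # linear query as a dot product
    return answers
```

Queries over attributes needing stronger protection get smaller budgets (more noise); less sensitive attributes keep more utility, which is the improved tradeoff the paper targets.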


2019, Vol 9 (2)
Author(s): Brendan Avent, Aleksandra Korolova, David Zeber, Torgeir Hovden, Benjamin Livshits

We propose a hybrid model of differential privacy that considers a combination of regular and opt-in users who desire the differential privacy guarantees of the local privacy model and the trusted curator model, respectively. We demonstrate that within this model, it is possible to design a new type of blended algorithm that improves the utility of obtained data, while providing users with their desired privacy guarantees. We apply this algorithm to the task of privately computing the head of the search log and show that the blended approach provides significant improvements in the utility of the data compared to related work. Specifically, on two large search click data sets, comprising 1.75 and 16 GB, respectively, our approach attains NDCG values exceeding 95% across a range of privacy budget values.
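A simple version of blending can be sketched for estimating the frequency of an item. This is our simplification of the hybrid model, not the paper's algorithm: opt-in users' bits go to a trusted curator (central DP via Laplace noise), the remaining users respond via local randomized response, and the two per-group estimates are blended by group size.

```python
import numpy as np

# An illustrative sketch of blending central and local DP estimates
# (our construction; the paper's blended algorithm is more refined).
def randomized_response_count(bits, epsilon):
    """Unbiased count of 1-bits from binary randomized response."""
    bits = np.asarray(bits)
    p = np.exp(epsilon) / (np.exp(epsilon) + 1)  # keep-truth probability
    flipped = np.where(np.random.rand(len(bits)) < p, bits, 1 - bits)
    return (flipped.sum() - (1 - p) * len(bits)) / (2 * p - 1)

def blended_fraction(optin_bits, local_bits, epsilon):
    optin_bits = np.asarray(optin_bits)
    n_opt, n_loc = len(optin_bits), len(local_bits)
    # Trusted-curator estimate for opt-in users (central DP, Laplace noise).
    central = (optin_bits.sum() + np.random.laplace(scale=1.0 / epsilon)) / n_opt
    # Local estimate for the remaining users (randomized response).
    local = randomized_response_count(local_bits, epsilon) / n_loc
    # Blend by group size (a simple, not necessarily optimal, weighting).
    return (n_opt * central + n_loc * local) / (n_opt + n_loc)
```

The central estimate is far less noisy per user, so even a small opt-in group can sharply improve the blended estimate, which is the utility gain the hybrid model exploits.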

