Disparate Vulnerability to Membership Inference Attacks

2021 ◽  
Vol 2022 (1) ◽  
pp. 460-480
Author(s):  
Bogdan Kulynych ◽  
Mohammad Yaghini ◽  
Giovanni Cherubin ◽  
Michael Veale ◽  
Carmela Troncoso

Abstract: A membership inference attack (MIA) against a machine-learning model enables an attacker to determine whether a given data record was part of the model’s training data or not. In this paper, we provide an in-depth study of the phenomenon of disparate vulnerability against MIAs: unequal success rates of MIAs against different population subgroups. We first establish necessary and sufficient conditions for MIAs to be prevented, both on average and for population subgroups, using a notion of distributional generalization. Second, we derive connections of disparate vulnerability to algorithmic fairness and to differential privacy. We show that fairness can only prevent disparate vulnerability against limited classes of adversaries. Differential privacy bounds disparate vulnerability but can significantly reduce the accuracy of the model. We show that estimating disparate vulnerability by naïvely applying existing attacks can lead to overestimation. We then establish which attacks are suitable for estimating disparate vulnerability, and provide a statistical framework for doing so reliably. We conduct experiments on synthetic and real-world data, finding significant evidence of disparate vulnerability in realistic settings.
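
To make the notion concrete, here is a minimal sketch (not the paper's statistical framework) of how one might measure per-subgroup success of a simple loss-threshold membership inference attack; the losses, subgroup labels, and threshold below are hypothetical.

```python
import numpy as np

def loss_threshold_mia(member_losses, nonmember_losses, threshold):
    """Predict 'member' when the per-example loss falls below the threshold."""
    tpr = np.mean(member_losses < threshold)      # members correctly flagged
    tnr = np.mean(nonmember_losses >= threshold)  # non-members correctly rejected
    return 0.5 * (tpr + tnr)                      # balanced attack accuracy

def disparate_vulnerability(member_losses, nonmember_losses,
                            groups_m, groups_n, threshold):
    """Gap between the most- and least-vulnerable subgroups under one attack."""
    accuracies = {}
    for g in np.unique(np.concatenate([groups_m, groups_n])):
        accuracies[g] = loss_threshold_mia(member_losses[groups_m == g],
                                           nonmember_losses[groups_n == g],
                                           threshold)
    return accuracies, max(accuracies.values()) - min(accuracies.values())

# Hypothetical example: losses and binary subgroup labels drawn at random.
rng = np.random.default_rng(0)
member_losses = rng.exponential(0.5, 1000)    # members tend to have lower loss
nonmember_losses = rng.exponential(1.0, 1000)
groups_m = rng.integers(0, 2, 1000)
groups_n = rng.integers(0, 2, 1000)
per_group, gap = disparate_vulnerability(member_losses, nonmember_losses,
                                         groups_m, groups_n, threshold=0.7)
print(per_group, gap)
```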

Author(s):  
George Leal Jamil ◽  
Alexis Rocha da Silva

Users' personal, highly sensitive data, such as photos and voice recordings, are kept indefinitely by the companies that collect them. Users can neither delete this data nor restrict the purposes for which it is used. By learning how to do machine learning in a way that protects privacy, we can make a significant difference in solving many social issues, such as curing disease. Deep neural networks are susceptible to various inference attacks because they remember information about their training data. In this chapter, the authors introduce differential privacy, which ensures that different kinds of statistical analysis do not compromise privacy, and federated learning, which trains a machine learning model on data to which we do not have access.
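
As a minimal illustration of the differential privacy mechanism the chapter introduces, the sketch below adds Laplace noise to a counting query; the dataset size, sensitivity, and ϵ value are illustrative assumptions, not the chapter's example.

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Release a noisy statistic satisfying epsilon-differential privacy."""
    if rng is None:
        rng = np.random.default_rng()
    return true_value + rng.laplace(0.0, sensitivity / epsilon)

# Hypothetical example: privately release a count over a dataset.
# Adding or removing one record changes a count by at most 1, so sensitivity = 1.
print(laplace_mechanism(true_value=10_000, sensitivity=1.0, epsilon=0.5))
```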


2020 ◽  
Vol 34 (01) ◽  
pp. 784-791 ◽  
Author(s):  
Qinbin Li ◽  
Zhaomin Wu ◽  
Zeyi Wen ◽  
Bingsheng He

Gradient Boosting Decision Trees (GBDTs) have become a popular machine learning model for various tasks in recent years. In this paper, we study how to improve the model accuracy of GBDT while preserving the strong guarantee of differential privacy. Sensitivity and privacy budget are two key design aspects for the effectiveness of differentially private models. Existing solutions for GBDT with differential privacy suffer from significant accuracy loss due to overly loose sensitivity bounds and ineffective privacy budget allocations (especially across different trees in the GBDT model). Loose sensitivity bounds require more noise to be added to reach a fixed privacy level. Ineffective privacy budget allocations worsen the accuracy loss, especially when the number of trees is large. Therefore, we propose a new GBDT training algorithm that achieves tighter sensitivity bounds and more effective noise allocations. Specifically, by investigating the properties of the gradients and the contribution of each tree in a GBDT, we propose to adaptively control the gradients of the training data in each iteration and to clip leaf node values in order to tighten the sensitivity bounds. Furthermore, we design a novel boosting framework to allocate the privacy budget between trees so that the accuracy loss can be further reduced. Our experiments show that our approach can achieve much better model accuracy than other baselines.
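
As a rough sketch of the general idea (not the authors' algorithm), the snippet below clips per-example gradients before computing a leaf value and adds Laplace noise calibrated to the resulting sensitivity; the leaf-value formula, clip bound, and ϵ are illustrative assumptions.

```python
import numpy as np

def dp_leaf_value(gradients, clip_bound, epsilon, lam=1.0, rng=None):
    """Compute one leaf value from clipped gradients plus Laplace noise.

    Clipping each per-example gradient to [-clip_bound, clip_bound] bounds the
    sensitivity of the gradient sum by clip_bound, so Laplace noise with scale
    clip_bound / epsilon suffices for epsilon-DP of this leaf value.
    """
    if rng is None:
        rng = np.random.default_rng()
    clipped = np.clip(gradients, -clip_bound, clip_bound)
    noisy_sum = clipped.sum() + rng.laplace(0.0, clip_bound / epsilon)
    return -noisy_sum / (len(gradients) + lam)  # regularized leaf value

# Hypothetical example: gradients of the examples routed to one leaf.
g = np.array([0.8, -1.7, 0.3, 2.4, -0.6])
print(dp_leaf_value(g, clip_bound=1.0, epsilon=0.1))
```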


2021 ◽  
Vol 14 (13) ◽  
pp. 3335-3347
Author(s):  
Daniel Bernau ◽  
Günther Eibl ◽  
Philip W. Grassal ◽  
Hannah Keller ◽  
Florian Kerschbaum

Differential privacy allows bounding the influence that training data records have on a machine learning model. To use differential privacy in machine learning, data scientists must choose privacy parameters (ϵ, δ). Choosing meaningful privacy parameters is key, since models trained with weak privacy parameters might result in excessive privacy leakage, while strong privacy parameters might overly degrade model utility. However, privacy parameter values are difficult to choose for two main reasons. First, the theoretical upper bound on privacy loss (ϵ, δ) might be loose, depending on the chosen sensitivity and data distribution of practical datasets. Second, legal requirements and societal norms for anonymization often refer to individual identifiability, to which (ϵ, δ) are only indirectly related. We transform (ϵ, δ) into a bound on the Bayesian posterior belief of the adversary assumed by differential privacy concerning the presence of any record in the training dataset. The bound holds for multidimensional queries under composition, and we show that it can be tight in practice. Furthermore, we derive an identifiability bound, which relates the adversary assumed in differential privacy to previous work on membership inference adversaries. We formulate an implementation of this differential privacy adversary that allows data scientists to audit model training and compute empirical identifiability scores and empirical (ϵ, δ).
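
For intuition (the paper's bound additionally handles δ, multidimensional queries, and composition), the following sketch computes the standard Bayesian posterior bound implied by pure ϵ-differential privacy under a given prior; the prior and ϵ values are illustrative.

```python
import math

def posterior_belief_bound(epsilon, prior=0.5):
    """Upper bound on the adversary's posterior belief that a record is in the
    training data, under pure epsilon-DP: Bayes' rule with the likelihood
    ratio bounded by exp(epsilon)."""
    odds = (prior / (1 - prior)) * math.exp(epsilon)
    return odds / (1 + odds)

for eps in [0.1, 1.0, 5.0]:
    print(eps, round(posterior_belief_bound(eps), 3))
# eps = 0.1 -> ~0.525, eps = 1.0 -> ~0.731, eps = 5.0 -> ~0.993
```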


2020 ◽  
Vol 34 (01) ◽  
pp. 622-629
Author(s):  
Jiahao Ding ◽  
Xinyue Zhang ◽  
Xiaohuan Li ◽  
Junyi Wang ◽  
Rong Yu ◽  
...  

Machine learning is increasingly becoming a powerful tool for making decisions in a wide variety of applications, such as medical diagnosis and autonomous driving. Privacy concerns related to the training data and unfair behavior of some decisions with regard to certain attributes (e.g., sex, race) are becoming more critical. Thus, constructing a fair machine learning model while simultaneously providing privacy protection becomes a challenging problem. In this paper, we focus on the design of a classification model with fairness and differential privacy guarantees by jointly combining the functional mechanism and decision boundary fairness. In order to enforce ϵ-differential privacy and fairness, we leverage the functional mechanism to add different amounts of Laplace noise, depending on the attribute, to the polynomial coefficients of the objective function while taking the fairness constraint into account. We further propose a utility-enhancement scheme, called the relaxed functional mechanism, which adds Gaussian noise instead of Laplace noise and hence achieves (ϵ, δ)-differential privacy. Based on the relaxed functional mechanism, we design an (ϵ, δ)-differentially private and fair classification model. Moreover, our theoretical analysis and empirical results demonstrate that our two approaches achieve both fairness and differential privacy while preserving good utility, and outperform the state-of-the-art algorithms.
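
A minimal sketch of the functional-mechanism idea, assuming row-normalized features and a loose illustrative sensitivity bound (not the paper's exact analysis or its fairness constraint): Laplace noise is added to the aggregated coefficients of a second-order Taylor approximation of the logistic loss, and a model would then be fit by minimizing the noisy objective.

```python
import numpy as np

def perturbed_coefficients(X, y, epsilon, rng=None):
    """Add Laplace noise to the aggregated first- and second-order coefficients
    of a quadratic approximation of the logistic loss (functional-mechanism style).

    Assumes each row of X has unit L2 norm so that each record's contribution
    to the coefficients is bounded, giving the (illustrative) sensitivity below.
    """
    if rng is None:
        rng = np.random.default_rng()
    d = X.shape[1]
    lin = (0.5 - y) @ X            # first-order Taylor coefficients
    quad = 0.125 * X.T @ X         # second-order Taylor coefficients
    sensitivity = 0.25 * d**2 + d  # loose illustrative bound for normalized rows
    scale = sensitivity / epsilon
    lin_noisy = lin + rng.laplace(0.0, scale, size=lin.shape)
    quad_noisy = quad + rng.laplace(0.0, scale, size=quad.shape)
    return lin_noisy, quad_noisy   # minimize lin·θ + θᵀ·quad·θ to fit the model

# Hypothetical example with row-normalized features and binary labels.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3)); X /= np.linalg.norm(X, axis=1, keepdims=True)
y = rng.integers(0, 2, 100)
lin, quad = perturbed_coefficients(X, y, epsilon=1.0, rng=rng)
```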


Author(s):  
Lichao Sun ◽  
Lingjuan Lyu

Conventional federated learning directly averages model weights, which is only possible for collaboration between models with homogeneous architectures. Sharing predictions instead of weights removes this obstacle and eliminates the risk of white-box inference attacks in conventional federated learning. However, the predictions from local models are sensitive and would leak training data privacy to the public. To address this issue, one naive approach is to add differentially private random noise to the predictions, which however introduces a substantial trade-off between the privacy budget and model performance. In this paper, we propose a novel framework called FEDMD-NFDP, which applies a Noise-Free Differential Privacy (NFDP) mechanism to a federated model distillation framework. Our extensive experimental results on various datasets validate that FEDMD-NFDP can deliver not only comparable utility and communication efficiency but also a noise-free differential privacy guarantee. We also demonstrate the feasibility of FEDMD-NFDP by considering both IID and non-IID settings, heterogeneous model architectures, and unlabelled public datasets from a different distribution.
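
A minimal sketch of the model-distillation side of such a framework (the noise-free differential privacy step is omitted here): clients share predictions on a shared public set, and the server averages and softens them into distillation targets for the next local update. All shapes, values, and the temperature are hypothetical.

```python
import numpy as np

def aggregate_predictions(client_logits):
    """Server step: average the clients' predictions on the shared public set."""
    return np.mean(np.stack(client_logits), axis=0)

def distillation_targets(aggregated_logits, temperature=3.0):
    """Soften aggregated logits into distillation targets (tempered softmax)."""
    z = aggregated_logits / temperature
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical example: 3 clients, 5 public examples, 4 classes.
rng = np.random.default_rng(2)
client_logits = [rng.normal(size=(5, 4)) for _ in range(3)]
targets = distillation_targets(aggregate_predictions(client_logits))
print(targets.round(2))
```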


1986 ◽  
Vol 23 (04) ◽  
pp. 851-858 ◽  
Author(s):  
P. J. Brockwell

The Laplace transform of the extinction time is determined for a general birth and death process with arbitrary catastrophe rate and catastrophe size distribution. It is assumed only that the birth rates satisfy λ0 = 0 and λj > 0 for each j > 0. Necessary and sufficient conditions for certain extinction of the population are derived. The results are applied to the linear birth and death process (λj = jλ, μj = jμ) with catastrophes of several different types.
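
For a numerical illustration of the model (not the paper's analytical results), the sketch below runs a Gillespie-style simulation of a linear birth and death process with a constant catastrophe rate and binomial catastrophe sizes, estimating the mean extinction time; all rates and the catastrophe size distribution are hypothetical choices.

```python
import numpy as np

def extinction_time(n0, birth, death, cat_rate, rng, t_max=1e4):
    """Simulate a linear birth-death process (rates j*birth, j*death) with
    catastrophes at rate cat_rate killing a Binomial(j, 1/2) number of
    individuals; return the extinction time (or t_max if not extinct)."""
    t, j = 0.0, n0
    while j > 0 and t < t_max:
        total = j * (birth + death) + cat_rate
        t += rng.exponential(1.0 / total)
        u = rng.uniform(0.0, total)
        if u < j * birth:
            j += 1                          # birth
        elif u < j * (birth + death):
            j -= 1                          # death
        else:
            j -= rng.binomial(j, 0.5)       # catastrophe: each dies w.p. 1/2
    return t

rng = np.random.default_rng(3)
times = [extinction_time(10, birth=1.0, death=1.2, cat_rate=0.1, rng=rng)
         for _ in range(200)]
print(np.mean(times))   # Monte Carlo estimate of the mean extinction time
```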


2008 ◽  
pp. 134-151
Author(s):  
A. Shastitko ◽  
M. Ovchinnikov

The article proposes an approach to the analysis of social change and contributes to the clarification of concepts of economic policy. It deals in particular with the notion of a "change of system". The authors consider positive and normative aspects of the analysis of capitalist and socialist systems. Necessary and sufficient conditions for the system to be changed are introduced, and their fulfillment is discussed drawing on historical and statistical data. The article describes both economic and political peculiarities of the transitional period in different countries, especially in Eastern Europe.


2020 ◽  
pp. 77-90
Author(s):  
V.D. Gerami ◽  
I.G. Shidlovskii

The article presents a special modification of the EOQ formula and applies it to account for the cargo capacity factor in procedures for optimizing deliveries when renting storage facilities. This development allows managers to take the following process specifics into account when managing inventory in a simulated supply chain. First of all, it allows the important factor of cargo capacity to be considered when optimizing stocks. Moreover, the formula makes it possible to find an optimal supply strategy when the combined effect of several practically relevant factors, which undoubtedly affect decision-making procedures, must also be taken into account. These are the following essential attributes of the simulated cash flow of the supply chain: 1) the time value of money; 2) deferral of payment of the cost of the order; 3) pre-agreed allowable delays in the receipt of revenue from goods sold. The developed analysis and optimization procedures have been implemented for models of this type that are interesting and important for business: inventory management systems whose format is related to the special concept of efficient supply. These are models in which the specified delays in outgoing cash flows allow the order and the corresponding supply chain costs to be paid from the corresponding revenue within the re-order interval. Accordingly, necessary and sufficient conditions are established on the basis of which managers can identify models of the specified type. The purpose of the article is to draw managers' attention to real opportunities to improve the efficiency of inventory management systems by taking these factors into account in a simulated supply chain.
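
The article's modified formula is not reproduced here; for reference, below is a sketch of the classical EOQ formula that it extends, with hypothetical demand, ordering, and holding-cost figures.

```python
import math

def eoq(annual_demand, order_cost, holding_cost_per_unit):
    """Classical economic order quantity: Q* = sqrt(2 * D * S / H)."""
    return math.sqrt(2.0 * annual_demand * order_cost / holding_cost_per_unit)

# Hypothetical example: 12,000 units/year demand, $50 per order, $4/unit/year holding.
q_star = eoq(12_000, 50.0, 4.0)
print(round(q_star))   # ~548 units per order
```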

