Disparate Vulnerability to Membership Inference Attacks

2021 ◽  
Vol 2022 (1) ◽  
pp. 460-480
Author(s):  
Bogdan Kulynych ◽  
Mohammad Yaghini ◽  
Giovanni Cherubin ◽  
Michael Veale ◽  
Carmela Troncoso

Abstract: A membership inference attack (MIA) against a machine-learning model enables an attacker to determine whether a given data record was part of the model’s training data or not. In this paper, we provide an in-depth study of the phenomenon of disparate vulnerability against MIAs: unequal success rates of MIAs against different population subgroups. We first establish necessary and sufficient conditions for MIAs to be prevented, both on average and for population subgroups, using a notion of distributional generalization. Second, we derive connections of disparate vulnerability to algorithmic fairness and to differential privacy. We show that fairness can only prevent disparate vulnerability against limited classes of adversaries. Differential privacy bounds disparate vulnerability but can significantly reduce the accuracy of the model. We show that estimating disparate vulnerability by naïvely applying existing attacks can lead to overestimation. We then establish which attacks are suitable for estimating disparate vulnerability, and provide a statistical framework for doing so reliably. We conduct experiments on synthetic and real-world data, finding significant evidence of disparate vulnerability in realistic settings.
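
To make the notion concrete, here is a minimal sketch (not the paper's statistical framework) of how one might measure per-subgroup success of a simple loss-threshold membership inference attack; the losses, subgroup labels, and threshold below are hypothetical.

```python
import numpy as np

def loss_threshold_mia(member_losses, nonmember_losses, threshold):
    """Predict 'member' when the per-example loss falls below the threshold."""
    tpr = np.mean(member_losses < threshold)      # members correctly flagged
    tnr = np.mean(nonmember_losses >= threshold)  # non-members correctly rejected
    return 0.5 * (tpr + tnr)                      # balanced attack accuracy

def disparate_vulnerability(member_losses, nonmember_losses,
                            groups_m, groups_n, threshold):
    """Gap between the most- and least-vulnerable subgroups under one attack."""
    accuracies = {}
    for g in np.unique(np.concatenate([groups_m, groups_n])):
        accuracies[g] = loss_threshold_mia(member_losses[groups_m == g],
                                           nonmember_losses[groups_n == g],
                                           threshold)
    return accuracies, max(accuracies.values()) - min(accuracies.values())

# Hypothetical example: losses and binary subgroup labels drawn at random.
rng = np.random.default_rng(0)
member_losses = rng.exponential(0.5, 1000)    # members tend to have lower loss
nonmember_losses = rng.exponential(1.0, 1000)
groups_m = rng.integers(0, 2, 1000)
groups_n = rng.integers(0, 2, 1000)
per_group, gap = disparate_vulnerability(member_losses, nonmember_losses,
                                         groups_m, groups_n, threshold=0.7)
print(per_group, gap)
```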

Author(s):  
George Leal Jamil ◽  
Alexis Rocha da Silva

Users' personal, highly sensitive data, such as photos and voice recordings, are kept indefinitely by the companies that collect them. Users can neither delete this data nor restrict the purposes for which it is used. By learning how to do machine learning in a way that protects privacy, we can make a significant difference in solving many social issues, such as curing disease. Deep neural networks are susceptible to various inference attacks because they remember information about their training data. In this chapter, the authors introduce differential privacy, which ensures that different kinds of statistical analysis do not compromise privacy, and federated learning, which trains a machine learning model on data to which we do not have access.
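
As a minimal illustration of the differential privacy mechanism the chapter introduces, the sketch below adds Laplace noise to a counting query; the dataset size, sensitivity, and ϵ value are illustrative assumptions, not the chapter's example.

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Release a noisy statistic satisfying epsilon-differential privacy."""
    if rng is None:
        rng = np.random.default_rng()
    return true_value + rng.laplace(0.0, sensitivity / epsilon)

# Hypothetical example: privately release a count over a dataset.
# Adding or removing one record changes a count by at most 1, so sensitivity = 1.
print(laplace_mechanism(true_value=10_000, sensitivity=1.0, epsilon=0.5))
```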


2020 ◽  
Vol 34 (01) ◽  
pp. 784-791 ◽  
Author(s):  
Qinbin Li ◽  
Zhaomin Wu ◽  
Zeyi Wen ◽  
Bingsheng He

Gradient Boosting Decision Trees (GBDTs) have become a popular machine learning model for various tasks in recent years. In this paper, we study how to improve the model accuracy of GBDT while preserving the strong guarantee of differential privacy. Sensitivity and privacy budget are two key design aspects for the effectiveness of differentially private models. Existing solutions for GBDT with differential privacy suffer from significant accuracy loss due to overly loose sensitivity bounds and ineffective privacy budget allocations (especially across different trees in the GBDT model). Loose sensitivity bounds require more noise to be added to reach a fixed privacy level. Ineffective privacy budget allocations worsen the accuracy loss, especially when the number of trees is large. Therefore, we propose a new GBDT training algorithm that achieves tighter sensitivity bounds and more effective noise allocations. Specifically, by investigating the properties of the gradients and the contribution of each tree in a GBDT, we propose to adaptively control the gradients of the training data in each iteration and to clip leaf node values in order to tighten the sensitivity bounds. Furthermore, we design a novel boosting framework to allocate the privacy budget between trees so that the accuracy loss can be further reduced. Our experiments show that our approach can achieve much better model accuracy than other baselines.
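
As a rough sketch of the general idea (not the authors' algorithm), the snippet below clips per-example gradients before computing a leaf value and adds Laplace noise calibrated to the resulting sensitivity; the leaf-value formula, clip bound, and ϵ are illustrative assumptions.

```python
import numpy as np

def dp_leaf_value(gradients, clip_bound, epsilon, lam=1.0, rng=None):
    """Compute one leaf value from clipped gradients plus Laplace noise.

    Clipping each per-example gradient to [-clip_bound, clip_bound] bounds the
    sensitivity of the gradient sum by clip_bound, so Laplace noise with scale
    clip_bound / epsilon suffices for epsilon-DP of this leaf value.
    """
    if rng is None:
        rng = np.random.default_rng()
    clipped = np.clip(gradients, -clip_bound, clip_bound)
    noisy_sum = clipped.sum() + rng.laplace(0.0, clip_bound / epsilon)
    return -noisy_sum / (len(gradients) + lam)  # regularized leaf value

# Hypothetical example: gradients of the examples routed to one leaf.
g = np.array([0.8, -1.7, 0.3, 2.4, -0.6])
print(dp_leaf_value(g, clip_bound=1.0, epsilon=0.1))
```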


2021 ◽  
Vol 14 (13) ◽  
pp. 3335-3347
Author(s):  
Daniel Bernau ◽  
Günther Eibl ◽  
Philip W. Grassal ◽  
Hannah Keller ◽  
Florian Kerschbaum

Differential privacy allows bounding the influence that training data records have on a machine learning model. To use differential privacy in machine learning, data scientists must choose privacy parameters (ϵ, δ). Choosing meaningful privacy parameters is key, since models trained with weak privacy parameters might result in excessive privacy leakage, while strong privacy parameters might overly degrade model utility. However, privacy parameter values are difficult to choose for two main reasons. First, the theoretical upper bound on privacy loss (ϵ, δ) might be loose, depending on the chosen sensitivity and data distribution of practical datasets. Second, legal requirements and societal norms for anonymization often refer to individual identifiability, to which (ϵ, δ) are only indirectly related. We transform (ϵ, δ) into a bound on the Bayesian posterior belief of the adversary assumed by differential privacy concerning the presence of any record in the training dataset. The bound holds for multidimensional queries under composition, and we show that it can be tight in practice. Furthermore, we derive an identifiability bound, which relates the adversary assumed in differential privacy to previous work on membership inference adversaries. We formulate an implementation of this differential privacy adversary that allows data scientists to audit model training and compute empirical identifiability scores and empirical (ϵ, δ).
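
For intuition (the paper's bound additionally handles δ, multidimensional queries, and composition), the following sketch computes the standard Bayesian posterior bound implied by pure ϵ-differential privacy under a given prior; the prior and ϵ values are illustrative.

```python
import math

def posterior_belief_bound(epsilon, prior=0.5):
    """Upper bound on the adversary's posterior belief that a record is in the
    training data, under pure epsilon-DP: Bayes' rule with the likelihood
    ratio bounded by exp(epsilon)."""
    odds = (prior / (1 - prior)) * math.exp(epsilon)
    return odds / (1 + odds)

for eps in [0.1, 1.0, 5.0]:
    print(eps, round(posterior_belief_bound(eps), 3))
# eps = 0.1 -> ~0.525, eps = 1.0 -> ~0.731, eps = 5.0 -> ~0.993
```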


2020 ◽  
Vol 34 (01) ◽  
pp. 622-629
Author(s):  
Jiahao Ding ◽  
Xinyue Zhang ◽  
Xiaohuan Li ◽  
Junyi Wang ◽  
Rong Yu ◽  
...  

Machine learning is increasingly becoming a powerful tool for making decisions in a wide variety of applications, such as medical diagnosis and autonomous driving. Privacy concerns related to the training data and unfair behavior of some decisions with regard to certain attributes (e.g., sex, race) are becoming more critical. Thus, constructing a fair machine learning model while simultaneously providing privacy protection becomes a challenging problem. In this paper, we focus on the design of a classification model with fairness and differential privacy guarantees by jointly combining the functional mechanism and decision boundary fairness. In order to enforce ϵ-differential privacy and fairness, we leverage the functional mechanism to add different amounts of Laplace noise, depending on the attribute, to the polynomial coefficients of the objective function while taking the fairness constraint into account. We further propose a utility-enhancement scheme, called the relaxed functional mechanism, which adds Gaussian noise instead of Laplace noise and hence achieves (ϵ, δ)-differential privacy. Based on the relaxed functional mechanism, we design an (ϵ, δ)-differentially private and fair classification model. Moreover, our theoretical analysis and empirical results demonstrate that our two approaches achieve both fairness and differential privacy while preserving good utility, and outperform the state-of-the-art algorithms.
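
A minimal sketch of the functional-mechanism idea, assuming row-normalized features and a loose illustrative sensitivity bound (not the paper's exact analysis or its fairness constraint): Laplace noise is added to the aggregated coefficients of a second-order Taylor approximation of the logistic loss, and a model would then be fit by minimizing the noisy objective.

```python
import numpy as np

def perturbed_coefficients(X, y, epsilon, rng=None):
    """Add Laplace noise to the aggregated first- and second-order coefficients
    of a quadratic approximation of the logistic loss (functional-mechanism style).

    Assumes each row of X has unit L2 norm so that each record's contribution
    to the coefficients is bounded, giving the (illustrative) sensitivity below.
    """
    if rng is None:
        rng = np.random.default_rng()
    d = X.shape[1]
    lin = (0.5 - y) @ X            # first-order Taylor coefficients
    quad = 0.125 * X.T @ X         # second-order Taylor coefficients
    sensitivity = 0.25 * d**2 + d  # loose illustrative bound for normalized rows
    scale = sensitivity / epsilon
    lin_noisy = lin + rng.laplace(0.0, scale, size=lin.shape)
    quad_noisy = quad + rng.laplace(0.0, scale, size=quad.shape)
    return lin_noisy, quad_noisy   # minimize lin·θ + θᵀ·quad·θ to fit the model

# Hypothetical example with row-normalized features and binary labels.
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3)); X /= np.linalg.norm(X, axis=1, keepdims=True)
y = rng.integers(0, 2, 100)
lin, quad = perturbed_coefficients(X, y, epsilon=1.0, rng=rng)
```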


Author(s):  
Lichao Sun ◽  
Lingjuan Lyu

Conventional federated learning directly averages model weights, which is only possible for collaboration between models with homogeneous architectures. Sharing predictions instead of weights removes this obstacle and eliminates the risk of white-box inference attacks in conventional federated learning. However, the predictions from local models are sensitive and would leak training data privacy to the public. To address this issue, one naive approach is to add differentially private random noise to the predictions, which however introduces a substantial trade-off between the privacy budget and model performance. In this paper, we propose a novel framework called FEDMD-NFDP, which applies a Noise-Free Differential Privacy (NFDP) mechanism to a federated model distillation framework. Our extensive experimental results on various datasets validate that FEDMD-NFDP can deliver not only comparable utility and communication efficiency but also a noise-free differential privacy guarantee. We also demonstrate the feasibility of FEDMD-NFDP by considering both IID and non-IID settings, heterogeneous model architectures, and unlabelled public datasets from a different distribution.
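
A minimal sketch of the model-distillation side of such a framework (the noise-free differential privacy step is omitted here): clients share predictions on a shared public set, and the server averages and softens them into distillation targets for the next local update. All shapes, values, and the temperature are hypothetical.

```python
import numpy as np

def aggregate_predictions(client_logits):
    """Server step: average the clients' predictions on the shared public set."""
    return np.mean(np.stack(client_logits), axis=0)

def distillation_targets(aggregated_logits, temperature=3.0):
    """Soften aggregated logits into distillation targets (tempered softmax)."""
    z = aggregated_logits / temperature
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical example: 3 clients, 5 public examples, 4 classes.
rng = np.random.default_rng(2)
client_logits = [rng.normal(size=(5, 4)) for _ in range(3)]
targets = distillation_targets(aggregate_predictions(client_logits))
print(targets.round(2))
```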


1986 ◽  
Vol 23 (04) ◽  
pp. 851-858 ◽  
Author(s):  
P. J. Brockwell

The Laplace transform of the extinction time is determined for a general birth and death process with arbitrary catastrophe rate and catastrophe size distribution. It is assumed only that the birth rates satisfy λ0 = 0 and λj > 0 for each j > 0. Necessary and sufficient conditions for certain extinction of the population are derived. The results are applied to the linear birth and death process (λj = jλ, μj = jμ) with catastrophes of several different types.
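
For a numerical illustration of the model (not the paper's analytical results), the sketch below runs a Gillespie-style simulation of a linear birth and death process with a constant catastrophe rate and binomial catastrophe sizes, estimating the mean extinction time; all rates and the catastrophe size distribution are hypothetical choices.

```python
import numpy as np

def extinction_time(n0, birth, death, cat_rate, rng, t_max=1e4):
    """Simulate a linear birth-death process (rates j*birth, j*death) with
    catastrophes at rate cat_rate killing a Binomial(j, 1/2) number of
    individuals; return the extinction time (or t_max if not extinct)."""
    t, j = 0.0, n0
    while j > 0 and t < t_max:
        total = j * (birth + death) + cat_rate
        t += rng.exponential(1.0 / total)
        u = rng.uniform(0.0, total)
        if u < j * birth:
            j += 1                          # birth
        elif u < j * (birth + death):
            j -= 1                          # death
        else:
            j -= rng.binomial(j, 0.5)       # catastrophe: each dies w.p. 1/2
    return t

rng = np.random.default_rng(3)
times = [extinction_time(10, birth=1.0, death=1.2, cat_rate=0.1, rng=rng)
         for _ in range(200)]
print(np.mean(times))   # Monte Carlo estimate of the mean extinction time
```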


2008 ◽  
pp. 134-151
Author(s):  
A. Shastitko ◽  
M. Ovchinnikov

The article proposes an approach to the analysis of social change and contributes to the clarification of concepts of economic policy. It deals in particular with the notion of a "change of system". The authors consider positive and normative aspects of the analysis of capitalist and socialist systems. Necessary and sufficient conditions for the system to be changed are introduced, and their fulfillment is discussed drawing on historical and statistical data. The article describes both economic and political peculiarities of the transitional period in different countries, especially in Eastern Europe.


2020 ◽  
pp. 77-90
Author(s):  
V.D. Gerami ◽  
I.G. Shidlovskii

The article presents a special modification of the EOQ formula and applies it to account for the cargo capacity factor in procedures for optimizing deliveries when renting storage facilities. This development allows managers to take the following process specifics into account when managing inventory in a simulated supply chain. First of all, it allows the important factor of cargo capacity to be considered when optimizing stocks. Moreover, the formula makes it possible to find an optimal supply strategy when the combined effect of several practically relevant factors, which undoubtedly affect decision-making procedures, must also be taken into account. These are the following essential attributes of the simulated cash flow of the supply chain: 1) the time value of money; 2) deferral of payment of the cost of the order; 3) pre-agreed allowable delays in the receipt of revenue from goods sold. The developed analysis and optimization procedures have been implemented for models of this type that are interesting and important for business: inventory management systems whose format is related to the special concept of efficient supply. These are models in which the specified delays in outgoing cash flows allow the order and the corresponding supply chain costs to be paid from the corresponding revenue within the re-order interval. Accordingly, necessary and sufficient conditions are established on the basis of which managers can identify models of the specified type. The purpose of the article is to draw managers' attention to real opportunities to improve the efficiency of inventory management systems by taking these factors into account in a simulated supply chain.
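
The article's modified formula is not reproduced here; for reference, below is a sketch of the classical EOQ formula that it extends, with hypothetical demand, ordering, and holding-cost figures.

```python
import math

def eoq(annual_demand, order_cost, holding_cost_per_unit):
    """Classical economic order quantity: Q* = sqrt(2 * D * S / H)."""
    return math.sqrt(2.0 * annual_demand * order_cost / holding_cost_per_unit)

# Hypothetical example: 12,000 units/year demand, $50 per order, $4/unit/year holding.
q_star = eoq(12_000, 50.0, 4.0)
print(round(q_star))   # ~548 units per order
```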

