Pairwise Learning with Differential Privacy Guarantees

2020, Vol 34 (01), pp. 694-701
Author(s):  
Mengdi Huai ◽  
Di Wang ◽  
Chenglin Miao ◽  
Jinhui Xu ◽  
Aidong Zhang

Pairwise learning has received much attention recently as it is more capable of modeling the relative relationships between pairs of samples. Many machine learning tasks, such as AUC maximization and metric learning, can be categorized as pairwise learning. Existing techniques for pairwise learning all fail to take into consideration a critical issue in their design, i.e., the protection of sensitive information in the training set. Models learned by such algorithms can implicitly memorize the details of sensitive information, which gives malicious parties an opportunity to infer it from the learned models. To address this challenging issue, in this paper, we propose several differentially private pairwise learning algorithms for both online and offline settings. Specifically, for the online setting, we first introduce a differentially private algorithm (called OnPairStrC) for strongly convex loss functions. Then, we extend this algorithm to general convex loss functions and give another differentially private algorithm (called OnPairC). For the offline setting, we also present two differentially private algorithms (called OffPairStrC and OffPairC) for strongly convex and general convex loss functions, respectively. These proposed algorithms can not only learn the model effectively from the data but also provide strong privacy guarantees for sensitive information in the training set. Extensive experiments on real-world datasets are conducted to evaluate the proposed algorithms, and the experimental results support our theoretical analysis.
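
As a concrete illustration of the gradient-perturbation recipe that such algorithms build on (a minimal sketch only; the paper's OnPairStrC, OnPairC, OffPairStrC, and OffPairC use their own noise calibration and analysis), a differentially private SGD step on a pairwise hinge loss could look as follows. The loss, clipping threshold, and noise scale here are illustrative assumptions.

```python
import numpy as np

def dp_pairwise_sgd(X, y, epochs=5, lr=0.1, clip=1.0, sigma=2.0, seed=0):
    """Illustrative gradient-perturbation SGD for pairwise learning.

    Each step draws a random pair (i, j) with different labels, clips the
    per-pair gradient to norm `clip`, and adds Gaussian noise with standard
    deviation `sigma * clip` (Gaussian mechanism) before the update.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs * n):
        i, j = rng.integers(0, n, size=2)
        if y[i] == y[j]:
            continue
        # pairwise hinge loss: max(0, 1 - (y_i - y_j)/2 * w^T (x_i - x_j))
        diff = X[i] - X[j]
        margin = 0.5 * (y[i] - y[j]) * (w @ diff)
        grad = -0.5 * (y[i] - y[j]) * diff if margin < 1 else np.zeros(d)
        # clip, then privatize the per-pair gradient
        grad *= min(1.0, clip / (np.linalg.norm(grad) + 1e-12))
        noise = rng.normal(0.0, sigma * clip, size=d)
        w -= lr * (grad + noise)
    return w
```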

Author(s):  
Zhiyu Xue ◽  
Shaoyang Yang ◽  
Mengdi Huai ◽  
Di Wang

Instead of learning with pointwise loss functions, learning with pairwise loss functions (pairwise learning) has received much attention recently, as it is more capable of modeling the relative relationship between pairs of samples. However, most existing algorithms for pairwise learning fail to take the privacy issue into consideration in their design. To address this issue, previous work studied pairwise learning in the Differential Privacy (DP) model. However, their utilities (population errors) are far from optimal. To address this sub-optimal utility issue, in this paper we propose new pure or approximate DP algorithms for pairwise learning. Specifically, under the assumption that the loss functions are Lipschitz, our algorithms achieve the optimal expected population risk for both the strongly convex and the general convex cases. We also conduct extensive experiments on real-world datasets to evaluate the proposed algorithms; the experimental results support our theoretical analysis and show the superiority of our algorithms.
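
For reference, the utility notion discussed here is the excess population risk of the private output; a standard way to write it for pairwise learning (our recap of the usual definitions, not necessarily the paper's exact notation) is:

```latex
L_P(w) = \mathbb{E}_{z, z' \sim P}\big[\ell(w; z, z')\big],
\qquad
\hat{L}_S(w) = \frac{1}{n(n-1)} \sum_{i \neq j} \ell(w; z_i, z_j),
\qquad
\mathrm{Err}(w_{\mathrm{priv}}) = \mathbb{E}\big[L_P(w_{\mathrm{priv}})\big] - \min_{w \in \mathcal{W}} L_P(w).
```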


Author(s):  
Bin Gu ◽  
Zhouyuan Huo ◽  
Heng Huang

Pairwise learning is an important learning topic in the machine learning community, where the loss function involves pairs of samples (e.g., AUC maximization and metric learning). Existing pairwise learning algorithms do not achieve generality, scalability, and efficiency simultaneously. To address these challenging problems, in this paper we first analyze the relationship between the statistical accuracy and the regularized empirical risk for pairwise losses. Based on this relationship, we propose a scalable and efficient adaptive doubly stochastic gradient algorithm (AdaDSG) for generalized regularized pairwise learning problems. More importantly, we prove that the overall computational cost of AdaDSG is O(n) to achieve statistical accuracy on the full training set of size n, which, to the best of our knowledge, is the best theoretical result for pairwise learning. The experimental results on a variety of real-world datasets not only confirm the effectiveness of our AdaDSG algorithm, but also show that AdaDSG has significantly better scalability and efficiency than existing pairwise learning algorithms.
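
One common reading of "doubly stochastic" in this setting (an assumption on our part, following the doubly stochastic functional gradient literature, not a description of AdaDSG itself) is that each update draws both a random pair of examples and a random feature of an approximated kernel. A rough numpy sketch of one such update, with an illustrative squared pairwise loss and random Fourier features:

```python
import numpy as np

def doubly_stochastic_pairwise_step(X, y, W, b, alpha, lr, rng):
    """One update with two sources of randomness: the sampled pair and the
    sampled random Fourier feature coordinate.

    W (D x d), b (D,): previously drawn random-feature parameters.
    alpha (D,): current weights on the random features.
    """
    n = X.shape[0]
    D = W.shape[0]
    i, j = rng.integers(0, n, size=2)          # randomness 1: the pair
    k = rng.integers(0, D)                     # randomness 2: the feature
    phi = lambda x: np.sqrt(2.0 / D) * np.cos(W @ x + b)   # RFF map
    score = alpha @ (phi(X[i]) - phi(X[j]))                # pairwise score
    residual = score - 0.5 * (y[i] - y[j])     # illustrative squared loss
    alpha[k] -= lr * residual * (phi(X[i])[k] - phi(X[j])[k])
    return alpha
```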


Author(s):  
Di Wang ◽  
Jinhui Xu

In this paper, we study the Differentially Private Empirical Risk Minimization (DP-ERM) problem with non-convex loss functions and give several upper bounds for the utility in different settings. We first consider the problem in low-dimensional space. For DP-ERM with a non-smooth regularizer, we generalize an existing work by measuring the utility using the ℓ2 norm of the projected gradient. Also, we extend the error bound measurement, for the first time, from the empirical risk to the population risk by using the expected ℓ2 norm of the gradient. We then investigate the problem in high-dimensional space and show that, by measuring the utility with the Frank-Wolfe gap, it is possible to bound the utility by the Gaussian width of the constraint set instead of the dimensionality p of the underlying space. We further demonstrate that the advantages of this result can also be achieved when the utility is measured by the ℓ2 norm of the projected gradient. A somewhat surprising discovery is that although the two kinds of measurements are quite different, their induced utility upper bounds are asymptotically the same under some assumptions. We also show that the utility of some special non-convex loss functions can be reduced to a level (i.e., depending only on log p) similar to that of convex loss functions. Finally, we test our proposed algorithms on both synthetic and real-world datasets, and the experimental results confirm our theoretical analysis.
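
The two utility measures mentioned above are standard stationarity measures for constrained non-convex problems; writing F for the risk and C for the constraint set, they can be stated as (a recap of the usual definitions):

```latex
% Projected gradient for step size 1/\beta; its expected \ell_2 norm measures utility
G_{\beta}(w) = \beta\Big(w - \Pi_{\mathcal{C}}\big(w - \tfrac{1}{\beta}\nabla F(w)\big)\Big),
\qquad
\text{utility} = \mathbb{E}\,\big\|G_{\beta}(w_{\mathrm{priv}})\big\|_2 ,

% Frank--Wolfe (duality) gap over the constraint set \mathcal{C}
\mathcal{G}_{\mathrm{FW}}(w) = \max_{u \in \mathcal{C}} \langle w - u,\, \nabla F(w) \rangle .
```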


2017, Vol 406-407, pp. 57-70
Author(s):  
Junhong Lin ◽  
Yunwen Lei ◽  
Bo Zhang ◽  
Ding-Xuan Zhou

2020, Vol 68, pp. 109-157
Author(s):  
Mijung Park ◽  
James Foulds ◽  
Kamalika Chaudhuri ◽  
Max Welling

Many applications of Bayesian data analysis involve sensitive information such as personal documents or medical records, motivating methods which ensure that privacy is protected. We introduce a general privacy-preserving framework for Variational Bayes (VB), a widely used optimization-based Bayesian inference method. Our framework respects differential privacy, the gold-standard privacy criterion, and encompasses a large class of probabilistic models, called the Conjugate Exponential (CE) family. We observe that we can straightforwardly privatise VB’s approximate posterior distributions for models in the CE family, by perturbing the expected sufficient statistics of the complete-data likelihood. For a broadly used class of non-CE models, those with binomial likelihoods, we show how to bring such models into the CE family, such that inferences in the modified model resemble the private variational Bayes algorithm as closely as possible, using the Pólya-Gamma data augmentation scheme. The iterative nature of variational Bayes presents a further challenge since iterations increase the amount of noise needed. We overcome this by combining: (1) an improved composition method for differential privacy, called the moments accountant, which provides a tight bound on the privacy cost of multiple VB iterations and thus significantly decreases the amount of additive noise; and (2) the privacy amplification effect of subsampling mini-batches from large-scale data in stochastic learning. We empirically demonstrate the effectiveness of our method in CE and non-CE models including latent Dirichlet allocation, Bayesian logistic regression, and sigmoid belief networks, evaluated on real-world datasets.
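
The core privatisation step described above, perturbing the expected sufficient statistics of the complete-data likelihood before the conjugate-exponential VB update, can be sketched with the Gaussian mechanism roughly as follows. The per-example clipping norm and noise calibration are illustrative assumptions, and the moments-accountant bookkeeping across iterations is omitted.

```python
import numpy as np

def private_expected_suff_stats(per_example_stats, clip=1.0, sigma=1.5, rng=None):
    """Privatise a mini-batch's expected sufficient statistics.

    per_example_stats: array of shape (batch, stat_dim); row i holds the
    expected complete-data sufficient statistics for example i under the
    current variational posterior.
    """
    rng = np.random.default_rng() if rng is None else rng
    stats = np.asarray(per_example_stats, dtype=float)
    # bound each example's contribution so the sum has bounded sensitivity
    norms = np.linalg.norm(stats, axis=1, keepdims=True)
    stats = stats * np.minimum(1.0, clip / (norms + 1e-12))
    # Gaussian mechanism on the summed statistics
    noisy_sum = stats.sum(axis=0) + rng.normal(0.0, sigma * clip, size=stats.shape[1])
    return noisy_sum / stats.shape[0]   # noisy mean fed into the VB natural-parameter update
```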


Author(s):  
Shuo Chen ◽  
Chen Gong ◽  
Jian Yang ◽  
Xiang Li ◽  
Yang Wei ◽  
...  

In the past decades, intensive efforts have been devoted to designing various loss functions and metric forms for the metric learning problem. These improvements have shown promising results when the test data are similar to the training data. However, the trained models often fail to produce reliable distances on ambiguous test pairs due to sampling differences between the training set and the test set. To address this problem, Adversarial Metric Learning (AML) is proposed in this paper, which automatically generates adversarial pairs to remedy the sampling bias and facilitate robust metric learning. Specifically, AML consists of two adversarial stages, i.e., confusion and distinguishment. In the confusion stage, ambiguous but critical adversarial data pairs are adaptively generated to mislead the learned metric. In the distinguishment stage, a metric is exhaustively learned to distinguish both the adversarial pairs and the original training pairs as well as possible. Thanks to the challenges posed by the confusion stage in such a competing process, the AML model is able to grasp plentiful difficult knowledge that is not contained in the original training pairs, so the discriminability of AML can be significantly improved. The entire model is formulated as an optimization framework, whose global convergence is theoretically proved. The experimental results on toy data and practical datasets clearly demonstrate the superiority of AML over representative state-of-the-art metric learning models.
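
To make the two-stage idea concrete, here is a minimal sketch of one confusion/distinguishment round under a Mahalanobis metric d_M(x, x') = (x - x')^T M (x - x'); the specific losses, step sizes, and projection are illustrative assumptions rather than the AML formulation itself.

```python
import numpy as np

def confusion_step(x1, x2, M, step=0.1, iters=10):
    """Confusion: perturb a similar pair to increase its metric distance,
    producing an ambiguous adversarial pair that misleads the current metric."""
    a, b = x1.copy(), x2.copy()
    for _ in range(iters):
        grad = 2.0 * M @ (a - b)     # gradient of (a-b)^T M (a-b) w.r.t. a
        a += step * grad
        b -= step * grad
    return a, b

def distinguishment_step(pairs, labels, M, lr=0.05):
    """Distinguishment: update M to pull similar pairs (label +1) together and
    push dissimilar pairs (label -1) beyond a unit margin, using both the
    original and the adversarial pairs."""
    for (a, b), s in zip(pairs, labels):
        outer = np.outer(a - b, a - b)
        dist = np.trace(M @ outer)
        grad = outer if s > 0 else (-outer if dist < 1.0 else 0.0 * outer)
        M = M - lr * grad
    # project back onto the PSD cone so M remains a valid metric
    vals, vecs = np.linalg.eigh(M)
    return vecs @ np.diag(np.clip(vals, 0.0, None)) @ vecs.T
```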


Author(s):  
Yunwen Lei ◽  
Shao-Bo Lin ◽  
Ke Tang

Pairwise learning refers to learning tasks with the associated loss functions depending on pairs of examples. Recently, pairwise learning has received increasing attention since it covers many machine learning schemes, e.g., metric learning, ranking and AUC maximization, in a unified framework. In this paper, we establish a unified generalization error bound for regularized pairwise learning without either Bernstein conditions or capacity assumptions. We apply this general result to typical learning tasks including distance metric learning and ranking, for each of which our discussion is able to improve the state-of-the-art results. 
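
The regularized pairwise learning scheme analysed here can be written, in standard notation (our recap, not necessarily the paper's), as the minimizer of the empirical pairwise risk plus a regularizer, with the generalization error being the gap between population and empirical risks:

```latex
f_S = \operatorname*{arg\,min}_{f \in \mathcal{H}} \;
\frac{1}{n(n-1)} \sum_{i \neq j} \ell\big(f; z_i, z_j\big) + \lambda \|f\|_{\mathcal{H}}^{2},
\qquad
\mathrm{gen}(f_S) = \mathbb{E}_{z, z'}\big[\ell(f_S; z, z')\big]
- \frac{1}{n(n-1)} \sum_{i \neq j} \ell\big(f_S; z_i, z_j\big).
```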


2019, Vol 18 (05), pp. 887-927
Author(s):  
Wei Shen ◽  
Zhenhuan Yang ◽  
Yiming Ying ◽  
Xiaoming Yuan

In this paper, we study the stability of stochastic gradient descent (SGD) algorithms and its trade-off with optimization error in the pairwise learning setting. Pairwise learning refers to a learning task which involves a loss function depending on pairs of instances, notable examples of which are bipartite ranking, metric learning, area under the ROC curve (AUC) maximization, and the minimum error entropy (MEE) principle. Our contribution is twofold. Firstly, we establish stability results for SGD for pairwise learning in the convex, strongly convex, and non-convex settings, from which generalization errors can be naturally derived. Secondly, we establish the trade-off between stability and optimization error of SGD algorithms for pairwise learning. This is achieved by lower-bounding the sum of stability and optimization error by the minimax statistical error over a prescribed class of pairwise loss functions. From this fundamental trade-off, we obtain lower bounds for the optimization error of SGD algorithms and the excess expected risk over a class of pairwise losses. In addition, we illustrate our stability results with specific examples of AUC maximization, metric learning, and MEE.
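
In its simplest form (a recap of the standard update, written in our notation), the pairwise SGD iteration whose stability is analysed draws a random pair at each step and performs

```latex
w_{t+1} = w_t - \eta_t \,\nabla_w \ell\big(w_t;\, z_{i_t}, z_{j_t}\big),
\qquad (i_t, j_t) \ \text{drawn from} \ \{1,\dots,n\},\ i_t \neq j_t,
```

and (uniform) stability then measures how much the final iterate can change when a single training instance is replaced.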


2020, Vol 10 (1), pp. 137-152
Author(s):  
Tosin A. Adesuyi ◽  
Byeong Man Kim

Data is the key to information mining that unveils hidden knowledge. The ability to reveal knowledge relies on the extractable features of a dataset and likewise on the depth of the mining model. However, several of these datasets embed sensitive information that can engender privacy violations, and they are subsequently used to build deep neural network (DNN) models. Recent approaches to enact privacy and protect data sensitivity in DNN models do degrade accuracy, thus giving rise to a significant accuracy disparity between a non-private DNN and a privacy-preserving DNN model. This accuracy gap is due to enormous uncalibrated noise flooding and the inability to quantify the right level of noise required to perturb distinct neurons in the DNN model, hence the dent in accuracy. Consequently, this has hindered the use of privacy-protected DNN models in real-life applications. In this paper, we present a neuron noise-injection technique based on layer-wise buffered contribution ratio forwarding and the ϵ-differential privacy technique to preserve privacy in a DNN model. We adapt a layer-wise relevance propagation technique to compute a contribution ratio for each neuron in our network at the pre-training phase. Based on the proportion of each neuron's contribution ratio, we generate a noise tuple via the Laplace mechanism, and this helps to eliminate unwanted noise flooding. The noise tuple is subsequently injected into the training network through its neurons to preserve the privacy of the training dataset in a differentially private manner. Hence, each neuron receives the right proportion of noise as estimated via its contribution ratio, and as a result, the unquantifiable noise that drops the accuracy of privacy-preserving DNN models is avoided. Extensive experiments were conducted on three real-world datasets, and the results show that our approach is able to narrow the existing accuracy gap considerably, as well as outperform the state-of-the-art approaches in this context.
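
A minimal sketch of the noise-allocation idea follows, assuming per-neuron contribution ratios are already available (e.g. from layer-wise relevance propagation); the exact rule for splitting the privacy budget by relevance is an illustrative assumption rather than the paper's calibration.

```python
import numpy as np

def laplace_noise_tuple(contribution_ratios, epsilon, sensitivity=1.0, rng=None):
    """Generate a per-neuron Laplace noise tuple for one layer.

    The layer's privacy budget `epsilon` is split across neurons in proportion
    to their contribution ratios, so highly relevant neurons get a larger share
    of the budget and therefore proportionally less noise.
    """
    rng = np.random.default_rng() if rng is None else rng
    r = np.asarray(contribution_ratios, dtype=float)
    r = r / r.sum()
    eps_per_neuron = np.maximum(epsilon * r, 1e-8)    # budget split by relevance
    scales = sensitivity / eps_per_neuron             # Laplace scale b = sensitivity / eps_k
    return rng.laplace(loc=0.0, scale=scales)

# usage sketch: inject the noise tuple into a layer's activations during training
# noisy_hidden = hidden + laplace_noise_tuple(ratios, epsilon=0.5)
```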

