Accuracy First: Selecting a Differential Privacy Level for Accuracy-Constrained ERM

Traditional approaches to differential privacy assume a fixed privacy requirement ε for a computation, and attempt to maximize the accuracy of the computation subject to the privacy constraint. As differential privacy is increasingly deployed in practical settings, it may often be that there is instead a fixed accuracy requirement for a given computation and the data analyst would like to maximize the privacy of the computation subject to the accuracy constraint. This raises the question of how to find and run a maximally private empirical risk minimizer subject to a given accuracy requirement. We propose a general “noise reduction” framework that can apply to a variety of private empirical risk minimization (ERM) algorithms, using them to “search” the space of privacy levels to find the empirically strongest one that meets the accuracy constraint, and incurring only logarithmic overhead in the number of privacy levels searched. The privacy analysis of our algorithm leads naturally to a version of differential privacy where the privacy parameters are dependent on the data, which we term ex-post privacy, and which is related to the recently introduced notion of privacy odometers. We also give an ex-post privacy analysis of the classical AboveThreshold privacy tool, modifying it to allow for queries chosen depending on the database. Finally, we apply our approach to two common objective functions, regularized linear and logistic regression, and empirically compare our noise reduction methods to (i) inverting the theoretical utility guarantees of standard private ERM algorithms and (ii) a stronger, empirical baseline based on binary search.

Download Full-text

An Effective Hard Thresholding Method Based on Stochastic Variance Reduction for Nonconvex Sparse Learning

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i02.5519 ◽

2020 ◽

Vol 34 (02) ◽

pp. 1585-1592

Author(s):

Guannan Liang ◽

Qianqian Tong ◽

Chunjiang Zhu ◽

Jinbo Bi

Keyword(s):

Variance Reduction ◽

Linear Convergence ◽

Stochastic Gradient Descent ◽

Batch Size ◽

Sparse Learning ◽

Risk Minimization ◽

Hard Thresholding ◽

Thresholding Method ◽

Empirical Risk ◽

Reduction Methods

We propose a hard thresholding method based on stochastically controlled stochastic gradients (SCSG-HT) to solve a family of sparsity-constrained empirical risk minimization problems. The SCSG-HT uses batch gradients where batch size is pre-determined by the desirable precision tolerance rather than full gradients to reduce the variance in stochastic gradients. It also employs the geometric distribution to determine the number of loops per epoch. We prove that, similar to the latest methods based on stochastic gradient descent or stochastic variance reduction methods, SCSG-HT enjoys a linear convergence rate. However, SCSG-HT now has a strong guarantee to recover the optimal sparse estimator. The computational complexity of SCSG-HT is independent of sample size n when n is larger than 1/ε, which enhances the scalability to massive-scale problems. Empirical results demonstrate that SCSG-HT outperforms several competitors and decreases the objective value the most with the same computational costs.

Download Full-text

Between Pure and Approximate Differential Privacy

Journal of Privacy and Confidentiality ◽

10.29012/jpc.v7i2.648 ◽

2017 ◽

Vol 7 (2) ◽

Cited By ~ 6

Author(s):

Thomas Steinke ◽

Jonathan Ullman

Keyword(s):

Lower Bound ◽

Differential Privacy ◽

High Dimensional ◽

Empirical Risk Minimization ◽

Sample Complexity ◽

Risk Minimization ◽

Worst Case ◽

Logarithmic Factor ◽

Empirical Risk ◽

Statistical Queries

We show a new lower bound on the sample complexity of (ε,δ)-differentially private algorithms that accurately answer statistical queries on high-dimensional databases. The novelty of our bound is that it depends optimally on the parameter δ, which loosely corresponds to the probability that the algorithm fails to be private, and is the first to smoothly interpolate between approximate differential privacy (δ >0) and pure differential privacy (δ= 0). Specifically, we consider a database D ∈{±1}n×d and its one-way marginals, which are the d queries of the form “What fraction of individual records have the i-th bit set to +1?” We show that in order to answer all of these queries to within error ±α (on average) while satisfying (ε,δ)-differential privacy for some function δ such that δ≥2−o(n) and δ≤1/n1+Ω(1), it is necessary that \[n≥Ω (\frac{√dlog(1/δ)}{αε}).\] This bound is optimal up to constant factors. This lower bound implies similar new bounds for problems like private empirical risk minimization and private PCA. To prove our lower bound, we build on the connection between fingerprinting codes and lower bounds in differential privacy (Bun, Ullman, and Vadhan, STOC’14). In addition to our lower bound, we give new purely and approximately differentially private algorithms for answering arbitrary statistical queries that improve on the sample complexity of the standard Laplace and Gaussian mechanisms for achieving worst-case accuracy guarantees by a logarithmic factor.

Download Full-text

Differentially Private Confidence Intervals for Empirical Risk Minimization

Journal of Privacy and Confidentiality ◽

10.29012/jpc.660 ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 2

Author(s):

Yue Wang ◽

Daniel Kifer ◽

Jaewoo Lee

Keyword(s):

Machine Learning ◽

Data Mining ◽

Confidence Intervals ◽

Differential Privacy ◽

Sensitive Information ◽

Empirical Risk Minimization ◽

Risk Minimization ◽

Learning Models ◽

Empirical Risk ◽

Machine Learning Models

The process of data mining with differential privacy produces results that are affected by two types of noise: sampling noise due to data collection and privacy noise that is designed to prevent the reconstruction of sensitive information. In this paper, we consider the problem of designing confidence intervals for the parameters of a variety of differentially private machine learning models. The algorithms can provide confidence intervals that satisfy differential privacy (as well as the more recently proposed concentrated differential privacy) and can be used with existing differentially private mechanisms that train models using objective perturbation and output perturbation.

Download Full-text

Differentially Private Learning with Small Public Data

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6088 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6219-6226

Author(s):

Jun Wang ◽

Zhi-Hua Zhou

Keyword(s):

Gradient Descent ◽

Differential Privacy ◽

Public Information ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Risk Minimization ◽

Empirical Risk ◽

Private Data ◽

Public Data ◽

Common Situation

Differentially private learning tackles tasks where the data are private and the learning process is subject to differential privacy requirements. In real applications, however, some public data are generally available in addition to private data, and it is interesting to consider how to exploit them. In this paper, we study a common situation where a small amount of public data can be used when solving the Empirical Risk Minimization problem over a private database. Specifically, we propose Private-Public Stochastic Gradient Descent, which utilizes such public information to adjust parameters in differentially private stochastic gradient descent and fine-tunes the final result with model reuse. Our method keeps differential privacy for the private database, and empirical study validates its superiority compared with existing approaches.

Download Full-text

Statistical Learning: Stability is Sufficient for Generalization and Necessary and Sufficient for Consistency of Empirical Risk Minimization

10.21236/ada459857 ◽

2004 ◽

Cited By ~ 4

Author(s):

Sayan Mukherjee ◽

Partha Niyogi ◽

Tomaso Poggio ◽

Ryan Rifkin

Keyword(s):

Statistical Learning ◽

Empirical Risk Minimization ◽

Risk Minimization ◽

Empirical Risk ◽

Necessary And Sufficient

Download Full-text

Distributed empirical risk minimization over directed graphs

2019 53rd Asilomar Conference on Signals, Systems, and Computers ◽

10.1109/ieeeconf44664.2019.9049065 ◽

2019 ◽

Cited By ~ 1

Author(s):

Ran Xin ◽

Anit Kumar Sahu ◽

Soummya Kar ◽

Usman A. Khan

Keyword(s):

Directed Graphs ◽

Empirical Risk Minimization ◽

Risk Minimization ◽

Empirical Risk

Download Full-text

Learning Bounds of ERM Principle for Sequences of Time-Dependent Samples

Discrete Dynamics in Nature and Society ◽

10.1155/2015/826812 ◽

2015 ◽

Vol 2015 ◽

pp. 1-8

Author(s):

Mingchen Yao ◽

Chao Zhang ◽

Wei Wu

Keyword(s):

Time Dependent ◽

Dependent Data ◽

Empirical Risk Minimization ◽

Risk Minimization ◽

Practical Applications ◽

Empirical Risk ◽

Learning Tasks ◽

Generalization Bound ◽

Erm Principle ◽

Dependent Samples

Many generalization results in learning theory are established under the assumption that samples are independent and identically distributed (i.i.d.). However, numerous learning tasks in practical applications involve the time-dependent data. In this paper, we propose a theoretical framework to analyze the generalization performance of the empirical risk minimization (ERM) principle for sequences of time-dependent samples (TDS). In particular, we first present the generalization bound of ERM principle for TDS. By introducing some auxiliary quantities, we also give a further analysis of the generalization properties and the asymptotical behaviors of ERM principle for TDS.

Download Full-text

Differentially Private Empirical Risk Minimization for AUC Maximization

Neurocomputing ◽

10.1016/j.neucom.2021.07.001 ◽

2021 ◽

Author(s):

Puyu Wang ◽

Zhenhuan Yang ◽

Yunwen Lei ◽

Yiming Ying ◽

Hai Zhang

Keyword(s):

Empirical Risk Minimization ◽

Risk Minimization ◽

Empirical Risk ◽

Auc Maximization

Download Full-text

On the Running Noise of Toothed Belt Drive : 2nd Report, Influence of Running Condition and Some Noise Reduction Methods

Transactions of the Japan Society of Mechanical Engineers ◽

10.1299/kikai1938.37.203 ◽

1971 ◽

Vol 37 (293) ◽

pp. 203-211

Author(s):

Aizoh KUBO ◽

Toshiaki ANDO ◽

Susumu SATO ◽

Toshio AIDA ◽

Takeshi HOSHIRO

Keyword(s):

Noise Reduction ◽

Belt Drive ◽

Toothed Belt ◽

Reduction Methods

Download Full-text

Asymptotic Properties of Stationary Solutions of Coupled Nonconvex Nonsmooth Empirical Risk Minimization

Mathematics of Operations Research ◽

10.1287/moor.2021.1198 ◽

2021 ◽

Author(s):

Zhengling Qi ◽

Ying Cui ◽

Yufeng Liu ◽

Jong-Shi Pang

Keyword(s):

Phase Retrieval ◽

Convergence Rates ◽

Stationary Solutions ◽

Statistical Properties ◽

Global Minimizer ◽

General Distribution ◽

Empirical Risk Minimization ◽

Risk Minimization ◽

Minimization Problems ◽

Empirical Risk

This paper has two main goals: (a) establish several statistical properties—consistency, asymptotic distributions, and convergence rates—of stationary solutions and values of a class of coupled nonconvex and nonsmooth empirical risk-minimization problems and (b) validate these properties by a noisy amplitude-based phase-retrieval problem, the latter being of much topical interest. Derived from available data via sampling, these empirical risk-minimization problems are the computational workhorse of a population risk model that involves the minimization of an expected value of a random functional. When these minimization problems are nonconvex, the computation of their globally optimal solutions is elusive. Together with the fact that the expectation operator cannot be evaluated for general probability distributions, it becomes necessary to justify whether the stationary solutions of the empirical problems are practical approximations of the stationary solution of the population problem. When these two features, general distribution and nonconvexity, are coupled with nondifferentiability that often renders the problems “non-Clarke regular,” the task of the justification becomes challenging. Our work aims to address such a challenge within an algorithm-free setting. The resulting analysis is, therefore, different from much of the analysis in the recent literature that is based on local search algorithms. Furthermore, supplementing the classical global minimizer-centric analysis, our results offer a promising step to close the gap between computational optimization and asymptotic analysis of coupled, nonconvex, nonsmooth statistical estimation problems, expanding the former with statistical properties of the practically obtained solution and providing the latter with a more practical focus pertaining to computational tractability.

Download Full-text