Private FL-GAN: Differential Privacy Synthetic Data Generation Based on Federated Learning

Author(s):  
Bangzhou Xin ◽  
Wei Yang ◽  
Yangyang Geng ◽  
Sheng Chen ◽  
Shaowei Wang ◽  
...


2021 ◽  
Vol 11 (3) ◽  
Author(s):  
Ryan McKenna ◽  
Gerome Miklau ◽  
Daniel Sheldon

We propose a general approach for differentially private synthetic data generation that consists of three steps: (1) select a collection of low-dimensional marginals, (2) measure those marginals with a noise-addition mechanism, and (3) generate synthetic data that preserves the measured marginals well. Central to this approach is Private-PGM, a post-processing method used to estimate a high-dimensional data distribution from noisy measurements of its marginals. We present two mechanisms, NIST-MST and MST, that are instances of this general approach. NIST-MST was the winning mechanism in the 2018 NIST differential privacy synthetic data competition, and MST is a new mechanism that works in more general settings while still performing comparably to NIST-MST. We believe our general approach should be of broad interest and can be adopted in future mechanisms for synthetic data generation.
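The three-step recipe in this abstract is concrete enough to sketch in code. The following is a minimal Python illustration of the select-measure-generate paradigm, not the authors' Private-PGM implementation: the choice of marginal, the Laplace noise scale, and the direct-sampling generation step are all simplifying assumptions (Private-PGM instead fits a graphical model consistent with all measured marginals).

```python
import numpy as np

rng = np.random.default_rng(0)

def measure_marginal(data, cols, sizes, epsilon):
    """Step (2): measure one low-dimensional marginal with Laplace noise."""
    # Contingency table (histogram) of the selected columns.
    counts, _ = np.histogramdd(data[:, cols],
                               bins=[np.arange(s + 1) for s in sizes])
    # Laplace mechanism: adding/removing one record changes each count by <= 1.
    return counts + rng.laplace(scale=1.0 / epsilon, size=counts.shape)

def generate(noisy_marginal, n):
    """Step (3), heavily simplified: sample records in proportion to the
    clipped noisy counts. Private-PGM instead estimates a distribution
    consistent with *all* measured marginals before sampling."""
    probs = np.clip(noisy_marginal, 0, None).ravel()
    probs /= probs.sum()
    flat = rng.choice(probs.size, size=n, p=probs)
    return np.stack(np.unravel_index(flat, noisy_marginal.shape), axis=1)

# Toy categorical data: 1000 records, attributes with 2, 3, and 4 levels.
data = rng.integers(0, [2, 3, 4], size=(1000, 3))
# Step (1), by assumption: select the (attr0, attr1) marginal.
noisy = measure_marginal(data, cols=[0, 1], sizes=[2, 3], epsilon=1.0)
synthetic = generate(noisy, n=1000)
```

Because noise is added only to the aggregated marginal counts, the privacy cost is paid once per measured marginal rather than per record, which is what makes the paradigm scale.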


2021 ◽  
Author(s):  
Bangzhou Xin ◽  
Yangyang Geng ◽  
Teng Hu ◽  
Sheng Chen ◽  
Wei Yang ◽  
...  

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Claire McKay Bowen ◽  
Joshua Snoke

Differentially private synthetic data generation offers a recent solution for releasing analytically useful data while preserving the privacy of individuals in the data. In order to utilize these algorithms for public policy decisions, policymakers need an accurate understanding of the algorithms' comparative performance. Correspondingly, data practitioners require standard metrics for evaluating the analytic quality of the synthetic data. In this paper, we present an in-depth evaluation of several differentially private synthetic data algorithms using the actual differentially private synthetic data sets created by contestants in the recent National Institute of Standards and Technology Public Safety Communications Research (NIST PSCR) Division's "Differential Privacy Synthetic Data Challenge." We offer analyses of these algorithms based on both the accuracy of the data they create and their usability by potential data providers. We frame the methods used in the NIST PSCR data challenge within the broader differentially private synthetic data literature. We implement additional utility metrics, including two of our own, on the differentially private synthetic data and compare mechanism utility across three categories. Our comparative assessment of the differentially private data synthesis methods and the quality metrics shows their relative usefulness, general strengths and weaknesses, and preferred choices of algorithms and metrics. Finally, we describe the implications of our evaluation for policymakers seeking to implement differentially private synthetic data algorithms on future data products.
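The abstract does not specify the paper's own metrics, so the sketch below shows only one common family of utility metrics used in evaluations like this: the total variation distance (TVD) between the one-way marginals of the confidential and synthetic data. The function names and integer-coded categorical representation are assumptions for illustration.

```python
import numpy as np

def marginal_tvd(real, synth, col, n_levels):
    """Total variation distance between one attribute's empirical marginals:
    0 means identical distributions, 1 means disjoint support."""
    p = np.bincount(real[:, col], minlength=n_levels) / len(real)
    q = np.bincount(synth[:, col], minlength=n_levels) / len(synth)
    return 0.5 * np.abs(p - q).sum()

def mean_marginal_tvd(real, synth, levels):
    """Average TVD over all attributes; lower indicates higher marginal utility."""
    return float(np.mean([marginal_tvd(real, synth, c, k)
                          for c, k in enumerate(levels)]))
```

Marginal-based scores like this capture only univariate fidelity; challenge evaluations typically pair them with higher-order or task-specific measures.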


Author(s):  
Anne-Sophie Charest

Synthetic datasets generated within the multiple imputation framework are now commonly used by statistical agencies to protect the confidentiality of their respondents. More recently, researchers have also proposed techniques to generate synthetic datasets that offer the formal guarantee of differential privacy. While combining rules were derived for the first type of synthetic dataset, little has been said about the analysis of differentially private synthetic datasets generated with multiple imputations. In this paper, we show that the usual combining rules cannot be used to analyze synthetic datasets which have been generated to achieve differential privacy. We consider specifically the case of generating synthetic count data with the beta-binomial synthesizer and illustrate our discussion with simulation results. As a simple alternative, we also propose a Bayesian model that explicitly models the mechanism for synthetic data generation.
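As a reading aid for the discussion above, here is a minimal sketch of a beta-binomial synthesizer for count data: draw a rate from the Beta posterior of each observed count, then draw a synthetic count from the corresponding binomial. The prior parameters alpha and beta, and the number of imputations m, are assumptions; the paper's exact parameterization, and how it is calibrated to satisfy differential privacy, should be taken from the paper itself.

```python
import numpy as np

rng = np.random.default_rng(1)

def beta_binomial_synthesize(x, n, alpha=1.0, beta=1.0, m=5):
    """Draw m synthetic versions of counts x (out of totals n).
    For each cell: p ~ Beta(x + alpha, n - x + beta), then x* ~ Binomial(n, p)."""
    x, n = np.asarray(x), np.asarray(n)
    return [rng.binomial(n, rng.beta(x + alpha, n - x + beta)) for _ in range(m)]

# Example: five cells, each a count out of 100 trials, synthesized m = 5 times.
synthetic_sets = beta_binomial_synthesize(x=[12, 40, 7, 55, 23], n=[100] * 5)
```

The paper's point is precisely that treating these m draws with the standard multiple-imputation combining rules gives invalid inferences once the synthesizer is tuned for differential privacy, which motivates modeling the generation mechanism explicitly instead.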


2021 ◽  
Vol 11 (3) ◽  
Author(s):  
Ergute Bao ◽  
Xiaokui Xiao ◽  
Jun Zhao ◽  
Dongping Zhang ◽  
Bolin Ding

This paper describes PrivBayes, a differentially private method for generating synthetic datasets, which was used in the 2018 Differential Privacy Synthetic Data Challenge organized by NIST.
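Since the abstract only names the method, the sketch below illustrates the PrivBayes-style pipeline as it is generally described in the literature: fit a low-degree Bayesian network over the attributes, perturb its conditional distributions with Laplace noise, and sample synthetic records by ancestral sampling. Everything here is an illustrative assumption; in particular, the chain-shaped network is fixed rather than selected privately (PrivBayes chooses the structure with the exponential mechanism), and no claim is made about the authors' actual implementation.

```python
import numpy as np

rng = np.random.default_rng(2)

def noisy_conditional(data, child, parent, sizes, epsilon):
    """Noisy P(child | parent) from Laplace-perturbed joint counts."""
    counts, _ = np.histogramdd(
        data[:, [parent, child]],
        bins=[np.arange(sizes[parent] + 1), np.arange(sizes[child] + 1)],
    )
    noisy = counts + rng.laplace(scale=1.0 / epsilon, size=counts.shape)
    noisy = np.clip(noisy, 0, None) + 1e-9  # tiny pseudocount avoids all-zero rows
    return noisy / noisy.sum(axis=1, keepdims=True)  # rows indexed by parent value

def sample_chain(data, sizes, epsilon, n):
    """Sample n records from the assumed chain network attr0 -> attr1 -> attr2."""
    root = np.bincount(data[:, 0], minlength=sizes[0]).astype(float)
    root = np.clip(root + rng.laplace(scale=1.0 / epsilon, size=sizes[0]), 0, None) + 1e-9
    root /= root.sum()
    cond01 = noisy_conditional(data, child=1, parent=0, sizes=sizes, epsilon=epsilon)
    cond12 = noisy_conditional(data, child=2, parent=1, sizes=sizes, epsilon=epsilon)
    out = np.empty((n, 3), dtype=int)
    out[:, 0] = rng.choice(sizes[0], size=n, p=root)
    for i in range(n):  # ancestral sampling along the chain
        out[i, 1] = rng.choice(sizes[1], p=cond01[out[i, 0]])
        out[i, 2] = rng.choice(sizes[2], p=cond12[out[i, 1]])
    return out

data = rng.integers(0, [2, 3, 4], size=(500, 3))
synthetic = sample_chain(data, sizes=[2, 3, 4], epsilon=1.0, n=500)
```

Restricting each node to few parents keeps every noised table low-dimensional, which is the same intuition behind the marginal-based approach of McKenna et al. above.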


2007 ◽  
Author(s):  
Marek K. Jakubowski ◽  
David Pogorzala ◽  
Timothy J. Hattenberger ◽  
Scott D. Brown ◽  
John R. Schott

2004 ◽  
pp. 211-234 ◽  
Author(s):  
Lewis Girod ◽  
Ramesh Govindan ◽  
Deepak Ganesan ◽  
Deborah Estrin ◽  
Yan Yu

2021 ◽  
Author(s):  
Maria Lyssenko ◽  
Christoph Gladisch ◽  
Christian Heinzemann ◽  
Matthias Woehrle ◽  
Rudolph Triebel

Author(s):  
Daniel Jeske ◽  
Pengyue Lin ◽  
Carlos Rendon ◽  
Rui Xiao ◽  
Behrokh Samadi
