Building a RAPPOR with the Unknown: Privacy-Preserving Learning of Associations and Data Dictionaries

<div>We address privacy and latency issues in the edge/cloud computing environment while training a centralized AI model. In our particular case, the edge devices are the only data source for the model to train on the central server. Current privacy-preserving and reducing network latency solutions rely on a pre-trained feature extractor deployed on the devices to help extract only important features from the sensitive dataset. However, finding a pre-trained model or pubic dataset to build a feature extractor for certain tasks may turn out to be very challenging. With the large amount of data generated by edge devices, the edge environment does not really lack data, but its improper access may lead to privacy concerns. In this paper, we present DeepGuess , a new privacy-preserving, and latency aware deeplearning framework. DeepGuess uses a new learning mechanism enabled by the AutoEncoder(AE) architecture called Inductive Learning, which makes it possible to train a central neural network using the data produced by end-devices while preserving their privacy. With inductive learning, sensitive data remains on devices and is not explicitly involved in any backpropagation process. The AE’s Encoder is deployed on devices to extracts and transfers important features to the server. To enhance privacy, we propose a new local deferentially private algorithm that allows the Edge devices to apply random noise to features extracted from their sensitive data before transferred to an untrusted server. The experimental evaluation of DeepGuess demonstrates its effectiveness and ability to converge on a series of experiments.</div>

Download Full-text

Differentially Private Image Classification Using Support Vector Machine and Differential Privacy

Machine Learning and Knowledge Extraction ◽

10.3390/make1010029 ◽

2019 ◽

Vol 1 (1) ◽

pp. 483-491 ◽

Cited By ~ 6

Author(s):

Makhamisa Senekane

Keyword(s):

Support Vector Machine ◽

Data Analysis ◽

Image Classification ◽

Differential Privacy ◽

Privacy Preserving ◽

Global Optimum ◽

Support Vector ◽

Sensitive Data ◽

Radiological Images ◽

Golden Standard

The ubiquity of data, including multi-media data such as images, enables easy mining and analysis of such data. However, such an analysis might involve the use of sensitive data such as medical records (including radiological images) and financial records. Privacy-preserving machine learning is an approach that is aimed at the analysis of such data in such a way that privacy is not compromised. There are various privacy-preserving data analysis approaches such as k-anonymity, l-diversity, t-closeness and Differential Privacy (DP). Currently, DP is a golden standard of privacy-preserving data analysis due to its robustness against background knowledge attacks. In this paper, we report a scheme for privacy-preserving image classification using Support Vector Machine (SVM) and DP. SVM is chosen as a classification algorithm because unlike variants of artificial neural networks, it converges to a global optimum. SVM kernels used are linear and Radial Basis Function (RBF), while ϵ -differential privacy was the DP framework used. The proposed scheme achieved an accuracy of up to 98%. The results obtained underline the utility of using SVM and DP for privacy-preserving image classification.

Download Full-text

Privacy-Preserving Monotonicity of Differential Privacy Mechanisms

Applied Sciences ◽

10.3390/app8112081 ◽

2018 ◽

Vol 8 (11) ◽

pp. 2081 ◽

Cited By ~ 1

Author(s):

Hai Liu ◽

Zhenqiang Wu ◽

Yihui Zhou ◽

Changgen Peng ◽

Feng Tian ◽

...

Keyword(s):

Differential Privacy ◽

Estimation Error ◽

Privacy Preserving ◽

Randomized Response ◽

Response Mechanism ◽

Rational Model ◽

Trade Off ◽

Definition Of ◽

Utility Metrics ◽

Monotonicity Results

Differential privacy mechanisms can offer a trade-off between privacy and utility by using privacy metrics and utility metrics. The trade-off of differential privacy shows that one thing increases and another decreases in terms of privacy metrics and utility metrics. However, there is no unified trade-off measurement of differential privacy mechanisms. To this end, we proposed the definition of privacy-preserving monotonicity of differential privacy, which measured the trade-off between privacy and utility. First, to formulate the trade-off, we presented the definition of privacy-preserving monotonicity based on computational indistinguishability. Second, building on privacy metrics of the expected estimation error and entropy, we theoretically and numerically showed privacy-preserving monotonicity of Laplace mechanism, Gaussian mechanism, exponential mechanism, and randomized response mechanism. In addition, we also theoretically and numerically analyzed the utility monotonicity of these several differential privacy mechanisms based on utility metrics of modulus of characteristic function and variant of normalized entropy. Third, according to the privacy-preserving monotonicity of differential privacy, we presented a method to seek trade-off under a semi-honest model and analyzed a unilateral trade-off under a rational model. Therefore, privacy-preserving monotonicity can be used as a criterion to evaluate the trade-off between privacy and utility in differential privacy mechanisms under the semi-honest model. However, privacy-preserving monotonicity results in a unilateral trade-off of the rational model, which can lead to severe consequences.

Download Full-text

A Defense Framework for Privacy Risks in Remote Machine Learning Service

Security and Communication Networks ◽

10.1155/2021/9924684 ◽

2021 ◽

Vol 2021 ◽

pp. 1-13

Author(s):

Yang Bai ◽

Yu Li ◽

Mingchuang Xie ◽

Mingyu Fan

Keyword(s):

Machine Learning ◽

Differential Privacy ◽

Original Data ◽

Privacy Preserving ◽

Training Data ◽

Sensitive Information ◽

Learning Approaches ◽

Local Data ◽

Sensitive Data ◽

Privacy Risks

In recent years, machine learning approaches have been widely adopted for many applications, including classification. Machine learning models deal with collective sensitive data usually trained in a remote public cloud server, for instance, machine learning as a service (MLaaS) system. In this scene, users upload their local data and utilize the computation capability to train models, or users directly access models trained by MLaaS. Unfortunately, recent works reveal that the curious server (that trains the model with users’ sensitive local data and is curious to know the information about individuals) and the malicious MLaaS user (who abused to query from the MLaaS system) will cause privacy risks. The adversarial method as one of typical mitigation has been studied by several recent works. However, most of them focus on the privacy-preserving against the malicious user; in other words, they commonly consider the data owner and the model provider as one role. Under this assumption, the privacy leakage risks from the curious server are neglected. Differential privacy methods can defend against privacy threats from both the curious sever and the malicious MLaaS user by directly adding noise to the training data. Nonetheless, the differential privacy method will decrease the classification accuracy of the target model heavily. In this work, we propose a generic privacy-preserving framework based on the adversarial method to defend both the curious server and the malicious MLaaS user. The framework can adapt with several adversarial algorithms to generate adversarial examples directly with data owners’ original data. By doing so, sensitive information about the original data is hidden. Then, we explore the constraint conditions of this framework which help us to find the balance between privacy protection and the model utility. The experiments’ results show that our defense framework with the AdvGAN method is effective against MIA and our defense framework with the FGSM method can protect the sensitive data from direct content exposed attacks. In addition, our method can achieve better privacy and utility balance compared to the existing method.

Download Full-text

Inductive learning and local differential privacy for privacy-preserving offloading in mobile edge intelligent systems

10.36227/techrxiv.13698883.v1 ◽

2021 ◽

Author(s):

Jude TCHAYE-KONDI ◽

Yanlong Zhai ◽

Liehuang Zhu

Keyword(s):

Intelligent Systems ◽

Differential Privacy ◽

Inductive Learning ◽

Random Noise ◽

Privacy Preserving ◽

Sensitive Data ◽

Privacy Concerns ◽

Feature Extractor ◽

Data Source ◽

Series Of Experiments

<div>We address privacy and latency issues in the edge/cloud computing environment while training a centralized AI model. In our particular case, the edge devices are the only data source for the model to train on the central server. Current privacy-preserving and reducing network latency solutions rely on a pre-trained feature extractor deployed on the devices to help extract only important features from the sensitive dataset. However, finding a pre-trained model or pubic dataset to build a feature extractor for certain tasks may turn out to be very challenging. With the large amount of data generated by edge devices, the edge environment does not really lack data, but its improper access may lead to privacy concerns. In this paper, we present DeepGuess , a new privacy-preserving, and latency aware deeplearning framework. DeepGuess uses a new learning mechanism enabled by the AutoEncoder(AE) architecture called Inductive Learning, which makes it possible to train a central neural network using the data produced by end-devices while preserving their privacy. With inductive learning, sensitive data remains on devices and is not explicitly involved in any backpropagation process. The AE’s Encoder is deployed on devices to extracts and transfers important features to the server. To enhance privacy, we propose a new local deferentially private algorithm that allows the Edge devices to apply random noise to features extracted from their sensitive data before transferred to an untrusted server. The experimental evaluation of DeepGuess demonstrates its effectiveness and ability to converge on a series of experiments.</div>

Download Full-text

Inductive learning and local differential privacy for privacy-preserving offloading in mobile edge intelligent systems

10.36227/techrxiv.13698883.v2 ◽

2021 ◽

Author(s):

Jude TCHAYE-KONDI ◽

Yanlong Zhai ◽

Liehuang Zhu

Keyword(s):

Intelligent Systems ◽

Differential Privacy ◽

Inductive Learning ◽

Random Noise ◽

Privacy Preserving ◽

Sensitive Data ◽

Privacy Concerns ◽

Feature Extractor ◽

Data Source ◽

Series Of Experiments

<div>We address privacy and latency issues in the edge/cloud computing environment while training a centralized AI model. In our particular case, the edge devices are the only data source for the model to train on the central server. Current privacy-preserving and reducing network latency solutions rely on a pre-trained feature extractor deployed on the devices to help extract only important features from the sensitive dataset. However, finding a pre-trained model or pubic dataset to build a feature extractor for certain tasks may turn out to be very challenging. With the large amount of data generated by edge devices, the edge environment does not really lack data, but its improper access may lead to privacy concerns. In this paper, we present DeepGuess , a new privacy-preserving, and latency aware deeplearning framework. DeepGuess uses a new learning mechanism enabled by the AutoEncoder(AE) architecture called Inductive Learning, which makes it possible to train a central neural network using the data produced by end-devices while preserving their privacy. With inductive learning, sensitive data remains on devices and is not explicitly involved in any backpropagation process. The AE’s Encoder is deployed on devices to extracts and transfers important features to the server. To enhance privacy, we propose a new local deferentially private algorithm that allows the Edge devices to apply random noise to features extracted from their sensitive data before transferred to an untrusted server. The experimental evaluation of DeepGuess demonstrates its effectiveness and ability to converge on a series of experiments.</div>

Download Full-text

Detection of Sensitive Data Leakage for Privacy Preserving

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v6i9.765769 ◽

2018 ◽

Vol 6 (9) ◽

pp. 765-769

Author(s):

R.J. Patil ◽

Y.S. Borse

Keyword(s):

Privacy Preserving ◽

Sensitive Data

Download Full-text

Federated deep learning for detecting COVID-19 lung abnormalities in CT: a privacy-preserving multinational validation study

npj Digital Medicine ◽

10.1038/s41746-021-00431-6 ◽

2021 ◽

Vol 4 (1) ◽

Author(s):

Qi Dou ◽

Tiffany Y. So ◽

Meirui Jiang ◽

Quande Liu ◽

Varut Vardhanabhuti ◽

...

Keyword(s):

Data Privacy ◽

Medical Training ◽

External Validation ◽

Mainland China ◽

Privacy Preserving ◽

Generalization Capability ◽

Major Focus ◽

Sensitive Data ◽

Lung Abnormalities ◽

Medical Image Diagnosis

AbstractData privacy mechanisms are essential for rapidly scaling medical training databases to capture the heterogeneity of patient data distributions toward robust and generalizable machine learning systems. In the current COVID-19 pandemic, a major focus of artificial intelligence (AI) is interpreting chest CT, which can be readily used in the assessment and management of the disease. This paper demonstrates the feasibility of a federated learning method for detecting COVID-19 related CT abnormalities with external validation on patients from a multinational study. We recruited 132 patients from seven multinational different centers, with three internal hospitals from Hong Kong for training and testing, and four external, independent datasets from Mainland China and Germany, for validating model generalizability. We also conducted case studies on longitudinal scans for automated estimation of lesion burden for hospitalized COVID-19 patients. We explore the federated learning algorithms to develop a privacy-preserving AI model for COVID-19 medical image diagnosis with good generalization capability on unseen multinational datasets. Federated learning could provide an effective mechanism during pandemics to rapidly develop clinically useful AI across institutions and countries overcoming the burden of central aggregation of large amounts of sensitive data.

Download Full-text

Privacy-preserving Streaming Truth Discovery in Crowdsourcing with Differential Privacy

IEEE Transactions on Mobile Computing ◽

10.1109/tmc.2021.3062775 ◽

2021 ◽

pp. 1-1

Author(s):

Dan Wang ◽

Ju Ren ◽

Zhibo Wang ◽

Xiaoyi Pang ◽

Yaoxue Zhang ◽

...

Keyword(s):

Differential Privacy ◽

Privacy Preserving ◽

Truth Discovery

Download Full-text

Privacy-Preserving Deep Neural Network Methods: Computational and Perceptual Methods—An Overview

Electronics ◽

10.3390/electronics10111367 ◽

2021 ◽

Vol 10 (11) ◽

pp. 1367

Author(s):

Raghida El El Saj ◽

Ehsan Sedgh Sedgh Gooya ◽

Ayman Alfalou ◽

Mohamad Khalil

Keyword(s):

Neural Networks ◽

Classification Accuracy ◽

Deep Neural Networks ◽

Privacy Preserving ◽

Sensitive Data ◽

Encrypted Data ◽

High Classification Accuracy ◽

Hide Information ◽

Network Methods ◽

Cloud Environments

Privacy-preserving deep neural networks have become essential and have attracted the attention of many researchers due to the need to maintain the privacy and the confidentiality of personal and sensitive data. The importance of privacy-preserving networks has increased with the widespread use of neural networks as a service in unsecured cloud environments. Different methods have been proposed and developed to solve the privacy-preserving problem using deep neural networks on encrypted data. In this article, we reviewed some of the most relevant and well-known computational and perceptual image encryption methods. These methods as well as their results have been presented, compared, and the conditions of their use, the durability and robustness of some of them against attacks, have been discussed. Some of the mentioned methods have demonstrated an ability to hide information and make it difficult for adversaries to retrieve it while maintaining high classification accuracy. Based on the obtained results, it was suggested to develop and use some of the cited privacy-preserving methods in applications other than classification.

Download Full-text