Differential Privacy under Dependent Tuples - The Case of Genomic Privacy

Bioinformatics ◽

10.1093/bioinformatics/btz837 ◽

2019 ◽

Author(s):

Nour Almadhoun ◽

Erman Ayday ◽

Özgür Ulusoy

Keyword(s):

Differential Privacy ◽

Genomic Data ◽

Privacy Preserving ◽

Supplementary Information ◽

Sensitive Information ◽

Genomic Databases ◽

Privacy Concerns ◽

Rigorous Approach ◽

Genomic Studies ◽

Inference Attack

Abstract. Motivation: The rapid progress in genome sequencing has led to high availability of genomic data. However, due to growing privacy concerns about participants’ sensitive information, access to the results and data of genomic studies is restricted to trusted individuals. On the other hand, paving the way to biomedical discoveries requires granting open access to genomic databases. Privacy-preserving mechanisms can be a solution for granting wider access to such data while protecting their owners. In particular, there has been growing interest in applying the concept of differential privacy (DP) when sharing summary statistics about genomic data. DP provides a mathematically rigorous approach, but it does not consider the dependence between tuples in a database, which may degrade the privacy guarantees it offers. Results: In this work, focusing on genomic databases, we show this drawback of DP and propose techniques to mitigate it. First, using a real-world genomic dataset, we demonstrate the feasibility of an inference attack on differentially private query results by utilizing the correlations between the tuples in the dataset. The results show that the adversary can infer sensitive genomic data about a user from the differentially private query results by exploiting correlations between genomes of family members. Second, we propose a mechanism for privacy-preserving sharing of statistics from genomic datasets that attains privacy guarantees while taking into consideration the dependence between tuples. By evaluating our mechanism on different genomic datasets, we empirically demonstrate that it can achieve up to 50% better privacy than traditional DP-based solutions. Availability: https://github.com/nourmadhoun/Differential-privacy-genomic-inference-attack. Supplementary information: Supplementary data are available at Bioinformatics online.
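
To make the dependence problem in the abstract above concrete, the following is a minimal sketch of the standard Laplace mechanism applied to a counting query. It is not the mechanism proposed in the paper, which the abstract does not specify. The names `laplace_noise`, `private_count` and the `dependent_tuples` parameter are hypothetical; scaling the sensitivity by the number of tuples that may change together is the conservative group-privacy bound, used here only as an illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale) as a difference of two exponentials."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def private_count(records, predicate, epsilon, rng, dependent_tuples=1):
    """Return a noisy count of the records that satisfy `predicate`.

    A counting query has sensitivity 1 when tuples are independent.
    `dependent_tuples` is a hypothetical knob (an assumption of this sketch):
    it multiplies the sensitivity by the number of tuples, such as relatives,
    that may change together.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    scale = dependent_tuples / epsilon
    return true_count + laplace_noise(scale, rng)


rng = random.Random(0)
genotypes = [0, 1, 2, 1, 0, 0, 2, 1]  # toy minor-allele counts per participant


def carries_minor(g):
    return g > 0


# Standard DP: assumes every participant is independent of the others.
standard = private_count(genotypes, carries_minor, epsilon=1.0, rng=rng)
# Dependence-aware variant: assumes up to 3 family members change together.
family_aware = private_count(
    genotypes, carries_minor, epsilon=1.0, rng=rng, dependent_tuples=3
)
```

The dependence-aware call adds three times as much noise for the same epsilon, which is the cost of not assuming independent tuples in this simple bound.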

Citizen-Centered, Auditable, and Privacy-Preserving Population Genomics

10.1101/799999 ◽

2019 ◽

Author(s):

Dennis Grishin ◽

Jean Louis Raisaro ◽

Juan Ramón Troncoso-Pastoriza ◽

Kamal Obbad ◽

Kevin Quinn ◽

...

Keyword(s):

Open Source ◽

Population Genomics ◽

Homomorphic Encryption ◽

Genomic Data ◽

Data Access ◽

Privacy Preserving ◽

Genomic Databases ◽

Privacy Concerns ◽

Novel Approach ◽

Security Guarantees

Abstract. The growing number of health-data breaches, the use of genomic databases for law enforcement purposes and the lack of transparency of personal-genomics companies are raising unprecedented privacy concerns. To enable a secure exploration of genomic datasets with controlled and transparent data access, we propose a novel approach that combines cryptographic privacy-preserving technologies, such as homomorphic encryption and secure multi-party computation, with the auditability of blockchains. This approach provides strong security guarantees against realistic threat models by empowering individual citizens to decide who can query and access their genomic data and by ensuring end-to-end data confidentiality. Our open-source implementation supports queries on the encrypted genomic data of hundreds of thousands of individuals, with minimal overhead. Our work opens a path towards multi-functional, privacy-preserving genomic-data analysis. One Sentence Summary: A citizen-centered open-source response to the privacy concerns that hinder population genomics, based on modern cryptography.

Identifying disease-causing mutations with privacy protection

Bioinformatics ◽

10.1093/bioinformatics/btaa641 ◽

2020 ◽

Author(s):

Mete Akgün ◽

Ali Burak Ünal ◽

Bekir Ergüner ◽

Nico Pfeifer ◽

Oliver Kohlbacher

Keyword(s):

Genomic Data ◽

Privacy Preserving ◽

Supplementary Information ◽

Compound Heterozygous ◽

Sensitive Information ◽

Patient Privacy ◽

Privacy And Security ◽

Genome Data ◽

Number Of Patients ◽

Or Genes

Abstract. Motivation: The use of genome data for diagnosis and treatment is becoming increasingly common. Researchers need access to as many genomes as possible to interpret the patient genome, to obtain some statistical patterns and to reveal disease–gene relationships. The sensitive information contained in the genome data and the high risk of re-identification increase the privacy and security concerns associated with sharing such data. In this article, we present an approach to identify disease-associated variants and genes while ensuring patient privacy. The proposed method uses secure multi-party computation to find disease-causing mutations under specific inheritance models without sacrificing the privacy of individuals. It discloses only variants or genes obtained as a result of the analysis. Thus, the vast majority of patient data can be kept private. Results: Our prototype implementation performs analyses on thousands of genomic data in milliseconds, and the runtime scales logarithmically with the number of patients. We present the first inheritance model (recessive, dominant and compound heterozygous) based privacy-preserving analyses of genomic data to find disease-causing mutations. Furthermore, we re-implement the privacy-preserving methods (MAX, SETDIFF and INTERSECTION) proposed in a previous study. Our MAX, SETDIFF and INTERSECTION implementations are 2.5, 1122 and 341 times faster than the corresponding operations of the state-of-the-art protocol, respectively. Availability and implementation: https://gitlab.com/DIFUTURE/privacy-preserving-genomic-diagnosis. Supplementary information: Supplementary data are available at Bioinformatics online.
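
The abstract above does not say which secure multi-party computation protocol is used. As a rough illustration of the general idea that only the final result is disclosed, here is a minimal sketch of additive secret sharing, a common building block of such protocols. The function names, the three-party setup and the choice of modulus are all assumptions made for this example, not details from the paper.

```python
import secrets

PRIME = 2**61 - 1  # modulus for the shares (an illustrative choice)


def share(value, n_parties):
    """Split `value` into n additive shares that sum to `value` mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    """Recombine additive shares into the shared value."""
    return sum(shares) % PRIME


def secure_carrier_count(carrier_flags, n_parties=3):
    """Count carriers of a variant without any party seeing a single flag.

    Each patient's 0/1 flag is split into shares, each party adds up only
    the shares it receives, and only the per-party totals are combined.
    """
    per_party_totals = [0] * n_parties
    for flag in carrier_flags:
        for party, s in enumerate(share(flag, n_parties)):
            per_party_totals[party] = (per_party_totals[party] + s) % PRIME
    return reconstruct(per_party_totals)


flags = [1, 0, 1, 1, 0]  # toy data: 1 means the patient carries the variant
count = secure_carrier_count(flags)
```

Each individual share is uniformly random on its own, so a single party learns nothing about any patient's flag; only the combined totals reveal the count.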

Effective Privacy-Preserving Collection of Health Data from a User’s Wearable Device

Applied Sciences ◽

10.3390/app10186396 ◽

2020 ◽

Vol 10 (18) ◽

pp. 6396

Author(s):

Jong Wook Kim ◽

Su-Mee Moon ◽

Sang-ug Kang ◽

Beakcheol Jang

Keyword(s):

Differential Privacy ◽

Service Providers ◽

Healthcare Services ◽

Wearable Devices ◽

Privacy Preserving ◽

Health Data ◽

Experimental Results ◽

Sensitive Information ◽

Privacy Concerns ◽

Primary Means

The popularity of wearable devices equipped with a variety of sensors that can measure users’ health status and monitor their lifestyle has been increasing. In fact, healthcare service providers have been utilizing these devices as a primary means to collect considerable health data from users. Although the health data collected via wearable devices are useful for providing healthcare services, the indiscriminate collection of an individual’s health data raises serious privacy concerns. This is because the health data measured and monitored by wearable devices contain sensitive information related to the wearer’s personal health and lifestyle. Therefore, we propose a method to aggregate health data obtained from users’ wearable devices in a privacy-preserving manner. The proposed method leverages local differential privacy, which is a de facto standard for privacy-preserving data processing and aggregation, to collect sensitive health data. In particular, to mitigate the error incurred by the perturbation mechanism of local differential privacy, the proposed scheme first samples a small number of salient data points that best represent the original health data, and then collects the sampled salient data instead of the entire set of health data. Our experimental results show that the proposed sampling-based collection scheme significantly improves estimation accuracy compared with straightforward solutions. Furthermore, the experimental results verify that an effective tradeoff between the level of privacy protection and the accuracy of aggregate statistics can be achieved with the proposed approach.
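
The reasoning in the abstract above, that reporting fewer points reduces perturbation error, can be sketched as follows. This is only an illustration under stated assumptions: the privacy budget is split evenly across the reported points (sequential composition), each bounded reading is perturbed with Laplace noise on the device, and evenly spaced sampling stands in for the paper's salient-point selection, which the abstract does not describe. All function names are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale) as a difference of two exponentials."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def sample_evenly(series, k):
    """Stand-in for salient-point selection: pick k evenly spaced readings."""
    if k >= len(series):
        return list(series)
    if k == 1:
        return [series[0]]
    step = (len(series) - 1) / (k - 1)
    return [series[round(i * step)] for i in range(k)]


def perturb_on_device(series, epsilon, lo, hi, k, rng):
    """Report k noisy readings; return them with the noise scale that was used."""
    points = sample_evenly(series, k)
    per_point_eps = epsilon / len(points)  # even budget split (assumption)
    scale = (hi - lo) / per_point_eps      # sensitivity of one bounded reading
    noisy = [min(max(x, lo), hi) + laplace_noise(scale, rng) for x in points]
    return noisy, scale


rng = random.Random(0)
heart_rate = [60 + (i % 10) for i in range(60)]  # toy readings in beats per minute
few, scale_few = perturb_on_device(heart_rate, 1.0, 40, 180, k=5, rng=rng)
full, scale_all = perturb_on_device(heart_rate, 1.0, 40, 180, k=60, rng=rng)
```

With the same total epsilon, reporting 5 points uses a noise scale 12 times smaller than reporting all 60, at the cost of the server having to reconstruct the remaining readings from the samples.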

Inductive learning and local differential privacy for privacy-preserving offloading in mobile edge intelligent systems

10.36227/techrxiv.13698883 ◽

2021 ◽

Author(s):

Jude TCHAYE-KONDI ◽

Yanlong Zhai ◽

Liehuang Zhu

Keyword(s):

Intelligent Systems ◽

Differential Privacy ◽

Inductive Learning ◽

Random Noise ◽

Privacy Preserving ◽

Sensitive Data ◽

Privacy Concerns ◽

Feature Extractor ◽

Data Source ◽

Series Of Experiments

We address privacy and latency issues in the edge/cloud computing environment while training a centralized AI model. In our particular case, the edge devices are the only data source for the model to train on the central server. Current solutions for preserving privacy and reducing network latency rely on a pre-trained feature extractor deployed on the devices to help extract only important features from the sensitive dataset. However, finding a pre-trained model or public dataset to build a feature extractor for certain tasks may turn out to be very challenging. With the large amount of data generated by edge devices, the edge environment does not really lack data, but improper access to it may lead to privacy concerns. In this paper, we present DeepGuess, a new privacy-preserving and latency-aware deep learning framework. DeepGuess uses a new learning mechanism enabled by the AutoEncoder (AE) architecture called Inductive Learning, which makes it possible to train a central neural network using the data produced by end devices while preserving their privacy. With inductive learning, sensitive data remain on devices and are not explicitly involved in any backpropagation process. The AE’s encoder is deployed on devices to extract important features and transfer them to the server. To enhance privacy, we propose a new locally differentially private algorithm that allows the edge devices to apply random noise to features extracted from their sensitive data before they are transferred to an untrusted server. The experimental evaluation of DeepGuess demonstrates its effectiveness and ability to converge on a series of experiments.

Privacy in Control and Dynamical Systems

Annual Review of Control Robotics and Autonomous Systems ◽

10.1146/annurev-control-060117-105018 ◽

2018 ◽

Vol 1 (1) ◽

pp. 309-332 ◽

Cited By ~ 10

Author(s):

Shuo Han ◽

George J. Pappas

Keyword(s):

Dynamical Systems ◽

Smart Grids ◽

Differential Privacy ◽

Side Information ◽

Sensitive Information ◽

Efficient Operation ◽

The Public ◽

Rigorous Approach ◽

Trade Offs ◽

User Data

Many modern dynamical systems, such as smart grids and traffic networks, rely on user data for efficient operation. These data often contain sensitive information that the participating users do not wish to reveal to the public. One major challenge is to protect the privacy of participating users when utilizing user data. Over the past decade, differential privacy has emerged as a mathematically rigorous approach that provides strong privacy guarantees. In particular, differential privacy has several useful properties, including resistance to both postprocessing and the use of side information by adversaries. Although differential privacy was first proposed for static-database applications, this review focuses on its use in the context of control systems, in which the data under processing often take the form of data streams. Through two major applications—filtering and optimization algorithms—we illustrate the use of mathematical tools from control and optimization to convert a nonprivate algorithm to its private counterpart. These tools also enable us to quantify the trade-offs between privacy and system performance.

Differential Privacy Preserving Genomic Data Releasing via Factor Graph

Bioinformatics Research and Applications - Lecture Notes in Computer Science ◽

10.1007/978-3-319-59575-7_33 ◽

2017 ◽

pp. 350-355 ◽

Cited By ~ 2

Author(s):

Zaobo He ◽

Yingshu Li ◽

Jinbao Wang

Keyword(s):

Differential Privacy ◽

Genomic Data ◽

Privacy Preserving ◽

Factor Graph

Differential Privacy for Stackelberg Games

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/481 ◽

2020 ◽

Cited By ~ 2

Author(s):

Ferdinando Fioretto ◽

Lesia Mitridati ◽

Pascal Van Hentenryck

Keyword(s):

Electricity Market ◽

Original Problem ◽

Differential Privacy ◽

Stackelberg Game ◽

Privacy Preserving ◽

Coordination Mechanism ◽

Stackelberg Games ◽

Sensitive Information ◽

Near Optimality

This paper introduces a differentially private (DP) mechanism to protect the information exchanged during the coordination of sequential and interdependent markets. This coordination represents a classic Stackelberg game and relies on the exchange of sensitive information between the system agents. The paper is motivated by the observation that the perturbation introduced by traditional DP mechanisms fundamentally changes the underlying optimization problem and even leads to unsatisfiable instances. To remedy such limitation, the paper introduces the Privacy-Preserving Stackelberg Mechanism (PPSM), a framework that enforces the notions of feasibility and fidelity (i.e. near-optimality) of the privacy-preserving information to the original problem objective. PPSM complies with the notion of differential privacy and ensures that the outcomes of the privacy-preserving coordination mechanism are close-to-optimality for each agent. Experimental results on several gas and electricity market benchmarks based on a real case study demonstrate the effectiveness of the proposed approach. A full version of this paper [Fioretto et al., 2020b] contains complete proofs and additional discussion on the motivating application.

Inference attacks against differentially private query results from genomic datasets including dependent tuples

Bioinformatics ◽

10.1093/bioinformatics/btaa475 ◽

2020 ◽

Vol 36 (Supplement_1) ◽

pp. i136-i145

Author(s):

Nour Almadhoun ◽

Erman Ayday ◽

Özgür Ulusoy

Keyword(s):

Complex Traits ◽

Differential Privacy ◽

Clinical Care ◽

Real Life ◽

Ratio Test ◽

Sensitive Information ◽

Chi Square ◽

Significant Information ◽

Inference Attacks ◽

Inference Attack

Abstract. Motivation: The rapid decrease in sequencing technology costs has led to a revolution in medical research and clinical care. Today, researchers have access to large genomic datasets to study associations between variants and complex traits. However, the availability of such genomic datasets also raises new privacy concerns about the personal information of the participants in genomic studies. Differential privacy (DP) is a rigorous privacy concept that has received widespread interest for sharing summary statistics from genomic datasets while protecting the privacy of participants against inference attacks. However, DP has a known drawback: it does not consider the correlation between dataset tuples. Therefore, the privacy guarantees of DP-based mechanisms may degrade if the dataset includes dependent tuples, which is a common situation for genomic datasets due to the inherent correlations between genomes of family members. Results: In this article, using two real-life genomic datasets, we show that exploiting the correlation between the dataset participants results in a significant information leak from differentially private results of complex queries. We formulate this as an attribute inference attack and show the privacy loss in minor allele frequency (MAF) and chi-square queries. Our results show that using the results of differentially private MAF queries and utilizing the dependency between tuples, an adversary can reveal up to 50% more sensitive information about the genome of a target (compared to the original privacy guarantees of standard DP-based mechanisms), while differentially private chi-square queries can reveal up to 40% more sensitive information. Furthermore, we show that the adversary can use the inferred genomic data obtained from the attribute inference attack to infer the membership of a target in another genomic dataset (e.g. associated with a sensitive trait). Using a log-likelihood-ratio test, our results also show that the inference power of the adversary can be significantly high in such an attack even using inferred (and hence partially incorrect) genomes. Availability and implementation: https://github.com/nourmadhoun/Inference-Attacks-Differential-Privacy
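
The abstract above mentions a log-likelihood-ratio test for membership inference but does not give its form. The sketch below shows one common formulation, assuming each genotype is a count of minor alleles (0, 1 or 2) drawn from a binomial model with the SNP's minor allele frequency. The function names and the clipping constant are assumptions for illustration, and this is not necessarily the statistic used by the authors.

```python
import math


def clip(p, eps=1e-6):
    """Keep a frequency strictly inside (0, 1) so the logarithms are defined."""
    return min(max(p, eps), 1.0 - eps)


def membership_llr(genotype, pool_maf, reference_maf):
    """Log-likelihood ratio of 'target is in the pool' vs 'target is not'.

    genotype:      minor-allele counts per SNP (0, 1 or 2), possibly inferred
    pool_maf:      released (possibly noisy) MAFs of the studied dataset
    reference_maf: MAFs of a reference population
    Positive values favour membership under the binomial model assumed here.
    """
    total = 0.0
    for g, p_pool, p_ref in zip(genotype, pool_maf, reference_maf):
        p_pool, p_ref = clip(p_pool), clip(p_ref)
        total += g * math.log(p_pool / p_ref)
        total += (2 - g) * math.log((1 - p_pool) / (1 - p_ref))
    return total


# Toy example: the target carries the allele that is enriched in the pool.
score = membership_llr([2, 0], pool_maf=[0.8, 0.1], reference_maf=[0.5, 0.5])
```

In practice the score would be compared with a threshold calibrated on individuals known to be outside the dataset; that calibration step is omitted here.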

A Secure and Privacy-Preserving Approach to Protect User Data across Cloud based Online Social Networks

Research Anthology on Artificial Intelligence Applications in Security ◽

10.4018/978-1-7998-7705-9.ch027 ◽

2021 ◽

pp. 560-585

Author(s):

Neelu Khare ◽

Kumaran U.

Keyword(s):

Social Networks ◽

Online Social Networks ◽

Data Access ◽

Privacy Preserving ◽

Security And Privacy ◽

Sensitive Information ◽

Privacy Concerns ◽

Attribute Based Encryption ◽

User Data ◽

Trapdoor Function

The tremendous growth of social networking systems enables the active participation of a wide variety of users. This has increased the likelihood of security and privacy problems. To address the issue, the article defines a secure and privacy-preserving approach to protect user data across Cloud-based online social networks. The proposed approach models social networks as a directed graph, such that a user can share sensitive information with another user only if there exists a directed edge from the former to the latter. Data are shared efficiently between connected users using attribute-based encryption (ABE) with different data access levels. The proposed ABE technique makes use of a trapdoor function to re-encrypt the data without the use of proxy re-encryption techniques. Experimental evaluation shows that the proposed approach provides comparatively better results than the existing techniques.

Preserving Differential Privacy for Similarity Measurement in Smart Environments

The Scientific World JOURNAL ◽

10.1155/2014/581426 ◽

2014 ◽

Vol 2014 ◽

pp. 1-9 ◽

Cited By ~ 2

Author(s):

Kok-Seng Wong ◽

Myung Ho Kim

Keyword(s):

Privacy Protection ◽

Differential Privacy ◽

Smart Environments ◽

Coefficient Function ◽

Sensitive Information ◽

Smart Environment ◽

Privacy Concerns ◽

Privacy Model ◽

The Subject ◽

Measurement Metric

Advances in both sensor technologies and network infrastructures have encouraged the development of smart environments to enhance people’s lives and living styles. However, collecting and storing users’ data in smart environments poses severe privacy concerns because these data may contain sensitive information about the subject. Hence, privacy protection is now an emerging issue that we need to consider, especially when data sharing is essential for analysis purposes. In this paper, we consider the case where two agents in the smart environment want to measure the similarity of their collected or stored data. We use the similarity coefficient function FSC as the measurement metric for the comparison under the differential privacy model. Unlike the existing solutions, our protocol can facilitate more than one request to compute FSC without modifying the protocol. Our solution ensures privacy protection for both the inputs and the computed FSC results.
