Efficient and secure protocols for privacy-preserving set operations

Privacy-Preserving Search of Similar Patients in Genomic Data

Proceedings on Privacy Enhancing Technologies ◽

10.1515/popets-2018-0034 ◽

2018 ◽

Vol 2018 (4) ◽

pp. 104-124 ◽

Cited By ~ 7

Author(s):

Gilad Asharov ◽

Shai Halevi ◽

Yehuda Lindell ◽

Tal Rabin

Keyword(s):

Approximation Method ◽

Genomic Data ◽

Privacy Preserving ◽

Secure Computation ◽

Great Promise ◽

Full Potential ◽

Genome Data ◽

Similar Patient ◽

Secure Protocols ◽

Private Computation

Abstract: The growing availability of genomic data holds great promise for advancing medicine and research, but unlocking its full potential requires adequate methods for protecting the privacy of individuals whose genome data we use. One example of this tension is running a Similar Patient Query on remote genomic data: in this setting, a doctor who holds the genome of a patient may try to find other individuals with “close” genomic data, and use the data of these individuals to help diagnose and find effective treatment for that patient’s conditions. This is clearly a desirable mode of operation. However, the privacy exposure implications are considerable, and so we would like to carry out the above “closeness” computation in a privacy-preserving manner. In this work we put forward a new approach for highly efficient secure computation of an approximation of the Similar Patient Query problem. We present contributions on two fronts: first, an approximation method that is designed with the goal of achieving efficient private computation; second, further optimizations of the two-party protocol. Our tests indicate that the approximation method works well: it returns the exact closest records in 98% of the queries and a very good approximation otherwise. As for speed, our protocol implementation takes just a few seconds to run on databases with thousands of records, each thousands of alleles long, and it scales almost linearly with both the database size and the length of the sequences in it. As an example, on the datasets of the recent iDASH competition, after a one-time preprocessing of around 12 seconds, it takes around a second to find the nearest five records to a query in a size-500 dataset of length-3500 sequences. This is 2-3 orders of magnitude faster than using state-of-the-art secure protocols with existing edit distance algorithms.
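For orientation, the functionality that a Similar Patient Query computes can be sketched in the clear as a nearest-record search under edit distance. This is a plaintext baseline only, assuming records are plain strings. It is neither the approximation method nor the secure two-party protocol described in the abstract, and the names `edit_distance` and `nearest_records` are hypothetical.

```python
import heapq


def edit_distance(a, b):
    """Levenshtein distance between sequences a and b (row-by-row DP)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def nearest_records(query, database, k=5):
    """Return the k records closest to the query (no privacy protection)."""
    return heapq.nsmallest(k, database, key=lambda rec: edit_distance(query, rec))


if __name__ == "__main__":
    records = ["ACGT", "ACGA", "TTTT", "AC"]
    print(nearest_records("ACGT", records, k=2))
```

Each comparison here costs time proportional to the product of the two sequence lengths, which gives a sense of why running exact edit distance inside a secure protocol is expensive.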

Privacy-Preserving Set Operations

10.21236/ada457144 ◽

2005 ◽

Cited By ~ 53

Author(s):

Lea Kissner ◽

Dawn Song

Keyword(s):

Privacy Preserving ◽

Set Operations

Accurate Classification Models for Distributed Mining of Privately Preserved Data

International Journal of Information Security and Privacy ◽

10.4018/ijisp.2016100104 ◽

2016 ◽

Vol 10 (4) ◽

pp. 58-73 ◽

Cited By ~ 1

Author(s):

Sumana M. ◽

Hareesha K.S.

Keyword(s):

Privacy Preservation ◽

Computation Time ◽

Privacy Preserving ◽

Data Reconstruction ◽

Sensitive Data ◽

Intermediate Data ◽

Secure Protocols ◽

Mean And Variance ◽

Mining Work ◽

Probabilistic Property

Data maintained across various sectors needs to be mined to derive useful inferences. A large part of this data is sensitive and must not be revealed during mining. Current methods perform privacy-preserving classification by randomizing, perturbing, or anonymizing the data during mining. These forms of privacy-preserving mining work well for data centralized at a single site. Moreover, the amount of information hidden during mining is not sufficient. When perturbation approaches are used, data reconstruction is a major challenge. This paper aims at modeling classifiers for data distributed across various sites with respect to the same instances. The homomorphic and probabilistic properties of Paillier encryption are used to perform secure product, mean, and variance calculations. The secure computations are performed without revealing any intermediate data or the sensitive data held at the multiple sites. It is observed that the accuracy of the resulting classifiers is almost equivalent to that of non-privacy-preserving classifiers. The secure protocols require reduced computation time and communication cost.
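The additive homomorphism of Paillier encryption that the abstract relies on can be illustrated with a toy sketch: multiplying ciphertexts adds the underlying plaintexts, so an aggregator can combine encrypted per-site values without seeing them. This is not the authors' protocol. The function names, the single key holder, and the tiny primes are all assumptions made for illustration, and the parameters are far too small to be secure.

```python
import math
import random


def keygen(p=10007, q=10009):
    """Toy Paillier key pair with generator g = n + 1 (insecure key size)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; needs Python 3.8+
    return n, (lam, mu)


def encrypt(n, m):
    """Probabilistic encryption: fresh randomness r for every ciphertext."""
    n2 = n * n
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2


def decrypt(n, priv, c):
    lam, mu = priv
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n


def add_encrypted(n, c1, c2):
    """Ciphertext product corresponds to plaintext sum (mod n)."""
    return c1 * c2 % (n * n)


def mean_and_variance(site_values):
    """Aggregate encrypted x and x^2, then decrypt only the two totals."""
    n, priv = keygen()
    enc_sum, enc_sq = encrypt(n, 0), encrypt(n, 0)
    for x in site_values:  # each site would encrypt its own value locally
        enc_sum = add_encrypted(n, enc_sum, encrypt(n, x))
        enc_sq = add_encrypted(n, enc_sq, encrypt(n, x * x))
    count = len(site_values)
    mean = decrypt(n, priv, enc_sum) / count
    variance = decrypt(n, priv, enc_sq) / count - mean ** 2
    return mean, variance


if __name__ == "__main__":
    print(mean_and_variance([12, 30, 45]))
```

Only the two aggregate totals are ever decrypted in this sketch. Because encryption is randomized, two encryptions of the same value look unrelated, which is the probabilistic property the abstract refers to.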

Semantically Secure Classifiers for Privacy Preserving Data Mining

Security and Privacy Management, Techniques, and Protocols - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-5225-5583-4.ch003 ◽

2018 ◽

pp. 66-95

Author(s):

Sumana M. ◽

Hareesha K. S. ◽

Sampath Kumar

Keyword(s):

Data Mining ◽

Model Building ◽

Computation Time ◽

Privacy Preserving ◽

Sensitive Data ◽

Privacy Preserving Data Mining ◽

Intermediate Data ◽

Secure Protocols ◽

Mean And Variance ◽

Probabilistic Property

Essential predictions are to be made by parties distributed at multiple locations. However, in the process of building a model, sensitive data must not be revealed. Maintaining the privacy of such data is a foremost concern. Earlier approaches developed for classification and prediction have been shown to be insufficiently secure, and their performance suffers. This chapter focuses on the secure construction of commonly used classifiers. The computations performed during model building are proved to be semantically secure. The homomorphic and probabilistic properties of Paillier encryption are used to perform secure product, mean, and variance calculations. The secure computations are performed without revealing any intermediate data or the sensitive data held at the multiple sites. It is observed that the accuracy of the resulting classifiers is almost equivalent to that of non-privacy-preserving classifiers. The secure protocols require reduced computation time and communication cost. It is also proved that the proposed privacy-preserving classifiers perform significantly better than the base classifiers.

Maximized Privacy-Preserving Outsourcing on Support Vector Clustering

Electronics ◽

10.3390/electronics9010178 ◽

2020 ◽

Vol 9 (1) ◽

pp. 178

Author(s):

Yuan Ping ◽

Bin Hao ◽

Xiali Hei ◽

Jie Wu ◽

Baocang Wang

Keyword(s):

Homomorphic Encryption ◽

Privacy Preserving ◽

Support Vector ◽

Convex Decomposition ◽

Support Vector Clustering ◽

Cluster Labeling ◽

Coefficient Vector ◽

Dual Coordinate Descent ◽

Secure Protocols ◽

Vector Clustering

Despite its remarkable capability in handling arbitrary cluster shapes, support vector clustering (SVC) suffers from costly storage of the kernel matrix and expensive computations. Outsourcing data or functions on demand is a natural remedy, yet it raises serious privacy concerns. We propose maximized privacy-preserving outsourcing on SVC (MPPSVC), which, to the best of our knowledge, is the first all-phase outsourceable solution. For privacy preservation, we exploit the properties of homomorphic encryption and secure two-party computation. To break through the operation limitation, we propose a reformative SVC with elementary operations (RSVC-EO, the core of MPPSVC), in which a series of designs make a selective outsourcing phase possible. In the training phase, we develop a dual coordinate descent solver, which avoids interactions before getting the encrypted coefficient vector. In the labeling phase, we design a fresh convex decomposition cluster labeling, by which no iteration is required by convex decomposition and no sampling checks exist in connectivity analysis. Afterward, we customize secure protocols to match these operations for essential interactions in the encrypted domain. Considering the privacy-preserving property and efficiency in a semi-honest environment, we prove MPPSVC’s robustness against adversarial attacks. Our experimental results confirm that MPPSVC achieves accuracies comparable to those of RSVC-EO, which outperforms the state-of-the-art variants of SVC.

Privacy Preserving Using Dummy Data for Set Operations in Itemset Mining Implemented with ZDDs

IEICE Transactions on Information and Systems ◽

10.1587/transinf.e95.d.3017 ◽

2012 ◽

Vol E95.D (12) ◽

pp. 3017-3025

Author(s):

Keisuke OTAKI ◽

Mahito SUGIYAMA ◽

Akihiro YAMAMOTO

Keyword(s):

Privacy Preserving ◽

Itemset Mining ◽

Set Operations

Accurate Classification Models for Distributed Mining of Privately Preserved Data

Cyber Law, Privacy, and Security ◽

10.4018/978-1-5225-8897-9.ch022 ◽

2019 ◽

pp. 462-478

Author(s):

Sumana M. ◽

Hareesha K. S.

Keyword(s):

Privacy Preservation ◽

Computation Time ◽

Privacy Preserving ◽

Classification Models ◽

Sensitive Data ◽

Intermediate Data ◽

Secure Protocols ◽

Mean And Variance ◽

Mining Work ◽

Probabilistic Property

Data maintained across various sectors needs to be mined to derive useful inferences. A large part of this data is sensitive and must not be revealed during mining. Current methods perform privacy-preserving classification by randomizing, perturbing, or anonymizing the data during mining. These forms of privacy-preserving mining work well for data centralized at a single site. Moreover, the amount of information hidden during mining is not sufficient. When perturbation approaches are used, data reconstruction is a major challenge. This paper aims at modeling classifiers for data distributed across various sites with respect to the same instances. The homomorphic and probabilistic properties of Paillier encryption are used to perform secure product, mean, and variance calculations. The secure computations are performed without revealing any intermediate data or the sensitive data held at the multiple sites. It is observed that the accuracy of the resulting classifiers is almost equivalent to that of non-privacy-preserving classifiers. The secure protocols require reduced computation time and communication cost.

UnLynx: A Decentralized System for Privacy-Conscious Data Sharing

Proceedings on Privacy Enhancing Technologies ◽

10.1515/popets-2017-0047 ◽

2017 ◽

Vol 2017 (4) ◽

pp. 232-250 ◽

Cited By ~ 9

Author(s):

David Froelicher ◽

Patricia Egger ◽

João Sá Sousa ◽

Jean Louis Raisaro ◽

Zhicong Huang ◽

...

Keyword(s):

Data Sharing ◽

Data Privacy ◽

Personal Data ◽

Privacy Preserving ◽

Zero Knowledge ◽

Decentralized System ◽

Weakest Link ◽

Cryptographic Keys ◽

Secure Protocols

Abstract: Current solutions for privacy-preserving data sharing among multiple parties either depend on a centralized authority that must be trusted and provides only weakest-link security (e.g., the entity that manages private/secret cryptographic keys), or rely on decentralized but impractical approaches (e.g., secure multi-party computation). When the data to be shared are of a sensitive nature and the number of data providers is high, these solutions are not appropriate. Therefore, we present UnLynx, a new decentralized system for efficient privacy-preserving data sharing. We consider m servers that constitute a collective authority whose goal is to verifiably compute on data sent from n data providers. UnLynx guarantees confidentiality, unlinkability between data providers and their data, privacy of the end result, and the correctness of computations by the servers. Furthermore, to support differentially private queries, UnLynx can collectively add noise under encryption. All of this is achieved through a combination of new distributed and secure protocols that are based on homomorphic cryptography, verifiable shuffling, and zero-knowledge proofs. UnLynx is highly parallelizable and modular by design, as it enables multiple security/privacy vs. runtime tradeoffs. Our evaluation shows that UnLynx can execute a secure survey on 400,000 personal data records containing 5 encrypted attributes, distributed over 20 independent databases, for a total of 2,000,000 ciphertexts, in 24 minutes.
