scholarly journals Privacy Preserving Decomposable Mining Association Rules on Distributed Data

2018 ◽  
Vol 7 (3.13) ◽  
pp. 157
Author(s):  
Ahmed M. Khedr ◽  
Zaher AL Aghbari ◽  
Ibrahim Kamel

In distributed computing, data sharing is inevitable, however, moving local databases from one site to another should be avoided because of the computational overhead and privacy consideration. Most of the data mining algorithms are designed assuming that data repository is stored locally. This paper presents a scheme and algorithms for mining association rules in geographically distributed data. The proposed scheme preserves data privacy of the different geographical site by passing secure messages between them. The algorithms minimize the communication cost by exchanging statistical summaries of the local databases. We provide a privacy and security analysis that shows the privacy preserving aspects of the proposed algorithms. Moreover, the paper presents extensive simulation experiments to evaluate the efficiency of the proposed scheme.  

2018 ◽  
Vol 2018 ◽  
pp. 1-10
Author(s):  
Hua Dai ◽  
Hui Ren ◽  
Zhiye Chen ◽  
Geng Yang ◽  
Xun Yi

Outsourcing data in clouds is adopted by more and more companies and individuals due to the profits from data sharing and parallel, elastic, and on-demand computing. However, it forces data owners to lose control of their own data, which causes privacy-preserving problems on sensitive data. Sorting is a common operation in many areas, such as machine learning, service recommendation, and data query. It is a challenge to implement privacy-preserving sorting over encrypted data without leaking privacy of sensitive data. In this paper, we propose privacy-preserving sorting algorithms which are on the basis of the logistic map. Secure comparable codes are constructed by logistic map functions, which can be utilized to compare the corresponding encrypted data items even without knowing their plaintext values. Data owners firstly encrypt their data and generate the corresponding comparable codes and then outsource them to clouds. Cloud servers are capable of sorting the outsourced encrypted data in accordance with their corresponding comparable codes by the proposed privacy-preserving sorting algorithms. Security analysis and experimental results show that the proposed algorithms can protect data privacy, while providing efficient sorting on encrypted data.


2014 ◽  
Vol 9 (1) ◽  
pp. 59-72
Author(s):  
Alaa Khalil Jumaa ◽  
Sufyan T. F. Al-Janabi ◽  
Nazar Abedlqader Ali

Author(s):  
Boudheb Tarik ◽  
Elberrichi Zakaria

Classifying data is to automatically assign predefined classes to data. It is one of the main applications of data mining. Having complete access to all data is critical for building accurate models. Data can be highly sensitive, such as biomedical data, which cannot be disclosed or shared with third party, because it can harm individuals and organizations. The challenge is how to preserve privacy and usefulness of data. Privacy preserving classification addresses this problem. Collaborative models are constructed over networks without violating the data owners' privacy. In this article, the authors address two problems: privacy records deduplication of the same records and privacy-preserving classification. They propose a randomized hash technic for deduplication and an enhanced privacy preserving classification of biomedical data over horizontally distributed data based on two homomorphic encryptions. No private, intermediate or final results are disclosed. Experimentations show that their solution is efficient and secure without loss of accuracy.


2014 ◽  
Vol 11 (2) ◽  
pp. 163-170
Author(s):  
Binli Wang ◽  
Yanguang Shen

Recently, with the rapid development of network, communications and computer technology, privacy preserving data mining (PPDM) has become an increasingly important research in the field of data mining. In distributed environment, how to protect data privacy while doing data mining jobs from a large number of distributed data is more far-researching. This paper describes current research of PPDM at home and abroad. Then it puts emphasis on classifying the typical uses and algorithms of PPDM in distributed environment, and summarizing their advantages and disadvantages. Furthermore, it points out the future research directions in the field.


Electronics ◽  
2021 ◽  
Vol 10 (13) ◽  
pp. 1546
Author(s):  
Munan Yuan ◽  
Xiaofeng Li ◽  
Xiru Li ◽  
Haibo Tan ◽  
Jinlin Xu

Three-dimensional (3D) data are easily collected in an unconscious way and are sensitive to lead biological characteristics exposure. Privacy and ownership have become important disputed issues for the 3D data application field. In this paper, we design a privacy-preserving computation system (SPPCS) for sensitive data protection, based on distributed storage, trusted execution environment (TEE) and blockchain technology. The SPPCS separates a storage and analysis calculation from consensus to build a hierarchical computation architecture. Based on a similarity computation of graph structures, the SPPCS finds data requirement matching lists to avoid invalid transactions. With TEE technology, the SPPCS implements a dual hybrid isolation model to restrict access to raw data and obscure the connections among transaction parties. To validate confidential performance, we implement a prototype of SPPCS with Ethereum and Intel Software Guard Extensions (SGX). The evaluation results derived from test datasets show that (1) the enhanced security and increased time consumption (490 ms in this paper) of multiple SGX nodes need to be balanced; (2) for a single SGX node to enhance data security and preserve privacy, an increased time consumption of about 260 ms is acceptable; (3) the transaction relationship cannot be inferred from records on-chain. The proposed SPPCS implements data privacy and security protection with high performance.


2014 ◽  
Vol 2014 ◽  
pp. 1-7
Author(s):  
Yu Li ◽  
Yuan Zhang ◽  
Yue Ji

With the arrival of the big data era, it is predicted that distributed data mining will lead to an information technology revolution. To motivate different institutes to collaborate with each other, the crucial issue is to eliminate their concerns regarding data privacy. In this paper, we propose a privacy-preserving method for training a restricted boltzmann machine (RBM). The RBM can be got without revealing their private data to each other when using our privacy-preserving method. We provide a correctness and efficiency analysis of our algorithms. The comparative experiment shows that the accuracy is very close to the original RBM model.


Author(s):  
M. Lincy ◽  
A. Meena Kowshalya

Data privacy and security are incredibly important in the healthcare industry. Federated learning is a new way of training a machine learning algorithm using distributed data which is not hosted in a centralized server. Numerous centralized machine learning models exists in literature but none offers privacy to users’ data. This paper proposes a federated learning approach for early detection of Type-2 Diabetes among patients. A simple federated architecture is exploited for early detection of Type-2 diabetes. We compare the proposed federated learning model against our centralised approach. Experimental results prove that the federated learning model ensures significant privacy over centralised learning model whereas compromising accuracy for a subtle extend.


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Fan Yin ◽  
Rongxing Lu ◽  
Yandong Zheng ◽  
Xiaohu Tang

The cloud computing technique, which was initially used to mitigate the explosive growth of data, has been required to take both data privacy and users’ query functionality into consideration. Searchable symmetric encryption (SSE) is a popular solution that can support efficient attribute queries over encrypted datasets in the cloud. In particular, some SSE schemes focus on the substring query, which deals with the situation that the user only remembers the substring of the queried attribute. However, all of them just consider substring queries on a single attribute, which cannot be used to achieve compound substring queries on multiple attributes. This paper aims to address this issue by proposing an efficient and privacy-preserving SSE scheme supporting compound substring queries. In specific, we first employ the position heap technique to design a novel tree-based index to support substring queries on a single attribute and employ pseudorandom function (PRF) and fully homomorphic encryption (FHE) techniques to protect its privacy. Then, based on the homomorphism of FHE, we design a filter algorithm to calculate the intersection of search results for different attributes, which can be used to support compound substring queries on multiple attributes. Detailed security analysis shows that our proposed scheme is privacy-preserving. In addition, extensive performance evaluations are also conducted, and the results demonstrate the efficiency of our proposed scheme.


Author(s):  
Xiaodong Lin ◽  
Alan F. Karr

Recent technological advances enable the collection of huge amounts of data. Commonly, these data are generated, stored, and owned by multiple entities that are unwilling to cede control of their data. This distributed environment requires statistical tools that can produce correct results while preserving data privacy. Privacy-preserving protocols have been proposed to solve specific statistical analysis such as linear regression, clustering, and classification. In this paper, we present methods and protocols for privacy-preserving maximum likelihood estimation in general settings. We discuss both horizontally and vertically partitioned data, and propose procedures that allow participating parties to withdraw from the joint computation. Logistic regression is used to demonstrate our method.


Sign in / Sign up

Export Citation Format

Share Document