Study on distributed privacy preserving data mining

2014 ◽  
Vol 11 (2) ◽  
pp. 163-170
Author(s):  
Binli Wang ◽  
Yanguang Shen

Recently, with the rapid development of network, communications and computer technology, privacy preserving data mining (PPDM) has become an increasingly important research in the field of data mining. In distributed environment, how to protect data privacy while doing data mining jobs from a large number of distributed data is more far-researching. This paper describes current research of PPDM at home and abroad. Then it puts emphasis on classifying the typical uses and algorithms of PPDM in distributed environment, and summarizing their advantages and disadvantages. Furthermore, it points out the future research directions in the field.

2014 ◽  
Vol 556-562 ◽  
pp. 3532-3535
Author(s):  
Heng Li ◽  
Xue Fang Wu

With the rapid development of computer technology and the popularity of the network, database scale, scope and depth of the constantly expanding, which has accumulated vast amounts of different forms of stored data. The use of data mining technology can access valuable information from a lot of data. Privacy preserving has been one of the greater concerns in data mining. Privacy preserving data mining has a rapid development in a short year. But it still faces many challenges in the future. A number of methods and techniques have been developed for privacy preserving data mining. This paper analyzed the representative techniques for privacy preservation. Finally the present problems and directions for future research are discussed.


2010 ◽  
Vol 6 (4) ◽  
pp. 30-45 ◽  
Author(s):  
M. Rajalakshmi ◽  
T. Purusothaman ◽  
S. Pratheeba

Distributed association rule mining is an integral part of data mining that extracts useful information hidden in distributed data sources. As local frequent itemsets are globalized from data sources, sensitive information about individual data sources needs high protection. Different privacy preserving data mining approaches for distributed environment have been proposed but in the existing approaches, collusion among the participating sites reveal sensitive information about the other sites. In this paper, the authors propose a collusion-free algorithm for mining global frequent itemsets in a distributed environment with minimal communication among sites. This algorithm uses the techniques of splitting and sanitizing the itemsets and communicates to random sites in two different phases, thus making it difficult for the colluders to retrieve sensitive information. Results show that the consequence of collusion is reduced to a greater extent without affecting mining performance and confirms optimal communication among sites.


Author(s):  
T. Purusothaman ◽  
M. Rajalakshmi ◽  
S. Pratheeba

Distributed association rule mining is an integral part of data mining that extracts useful information hidden in distributed data sources. As local frequent itemsets are globalized from data sources, sensitive information about individual data sources needs high protection. Different privacy preserving data mining approaches for distributed environment have been proposed but in the existing approaches, collusion among the participating sites reveal sensitive information about the other sites. In this paper, the authors propose a collusion-free algorithm for mining global frequent itemsets in a distributed environment with minimal communication among sites. This algorithm uses the techniques of splitting and sanitizing the itemsets and communicates to random sites in two different phases, thus making it difficult for the colluders to retrieve sensitive information. Results show that the consequence of collusion is reduced to a greater extent without affecting mining performance and confirms optimal communication among sites.


2018 ◽  
Vol 7 (2.7) ◽  
pp. 515
Author(s):  
Aaluri Seenu ◽  
M Kameswara Rao

In distributed data mining environment maintaining individual data or patterns is a major issue due to high dimensionality and data size. Distributed Data mining framework can help to find the essential decision making patterns from distributed data. Privacy preserving data mining (PPDM) has emerged as a main research area for data confidentiality and knowledge sharing in between the communicating parties. As the distributed data of the individuals are stored by the third party, it leads to the misuse of distributed information in digital networks. Most of the decision patterns generated using the machine learning models for business organizations, industries and individuals has to be encoded before it is publicly shared or published. As the amount of data collected from different sources are increasing exponentially, the time taken to preserve the patterns using the  traditional privacy preserving data mining models also increasing due to high computational attribute selection measures and noise in the distributed data. Also, filling sparse values using the conventional models are inefficient and infeasible for privacy preserving models. In this paper, a novel privacy preserving based classification model was designed and implemented on large datasets. In this model, a filter-based privacy preserving model using improved decision tree classifier is implemented to preserve the decision patterns using IPPDM-KPABE model. Experimental results proved that the proposed model has high computational efficiency compared to the traditional privacy preserving model on high dimensional datasets. 


Author(s):  
Aris Gkoulalas-Divanis ◽  
Vassilios S. Verykios

Since its inception in 2000, privacy preserving data mining has gained increasing popularity in the data mining research community. This line of research can be primarily attributed to the growing concern of individuals, organizations and the government regarding the violation of privacy in the mining of their data by the existing data mining technology. As a result, a whole new body of research was introduced to allow for the mining of data, while at the same time prohibiting the leakage of any private and sensitive information. In this chapter, the authors introduce the readers to the field of privacy preserving data mining; they discuss the reasons that led to its inception, the most prominent research directions, as well as some important methodologies per direction. Following that, the authors focus their attention on very recently investigated methodologies for the offering of privacy during the mining of user mobility data. In the end of the chapter, they provide a roadmap along with potential future research directions both with respect to the field of privacy-aware mobility data mining and to privacy preserving data mining at large.


2021 ◽  
Vol 54 (6) ◽  
pp. 1-36
Author(s):  
Xuefei Yin ◽  
Yanming Zhu ◽  
Jiankun Hu

The past four years have witnessed the rapid development of federated learning (FL). However, new privacy concerns have also emerged during the aggregation of the distributed intermediate results. The emerging privacy-preserving FL (PPFL) has been heralded as a solution to generic privacy-preserving machine learning. However, the challenge of protecting data privacy while maintaining the data utility through machine learning still remains. In this article, we present a comprehensive and systematic survey on the PPFL based on our proposed 5W-scenario-based taxonomy. We analyze the privacy leakage risks in the FL from five aspects, summarize existing methods, and identify future research directions.


2008 ◽  
pp. 693-704
Author(s):  
Bhavani Thuraisingham

This article first describes the privacy concerns that arise due to data mining, especially for national security applications. Then we discuss privacy-preserving data mining. In particular, we view the privacy problem as a form of inference problem and introduce the notion of privacy constraints. We also describe an approach for privacy constraint processing and discuss its relationship to privacy-preserving data mining. Then we give an overview of the developments on privacy-preserving data mining that attempt to maintain privacy and at the same time extract useful information from data mining. Finally, some directions for future research on privacy as related to data mining are given.


Author(s):  
Stanley R.M. Oliveira ◽  
Osmar R. Zaïane

Privacy-preserving data mining (PPDM) is one of the newest trends in privacy and security research. It is driven by one of the major policy issues of the information era—the right to privacy. This chapter describes the foundations for further research in PPDM on the Web. In particular, we describe the problems we face in defining what information is private in data mining. We then describe the basis of PPDM including the historical roots, a discussion on how privacy can be violated in data mining, and the definition of privacy preservation in data mining based on users’ personal information and information concerning their collective activities. Subsequently, we introduce a taxonomy of the existing PPDM techniques and a discussion on how these techniques are applicable to Web-based applications. Finally, we suggest some privacy requirements that are related to industrial initiatives and point to some technical challenges as future research trends in PPDM on the Web.


Sign in / Sign up

Export Citation Format

Share Document