A SURVEY OF PRIVACY PRESERVING DATA MINING ALGORITHMS

Recent concerns about privacy have motivated data mining researchers to develop methods for performing data mining while preserving the privacy of individuals. One approach to developing privacy-preserving data mining algorithms is secure multiparty computation, which allows for algorithms that do not trade accuracy for privacy. However, earlier methods suffer from very high communication and computational costs, making them infeasible in real-world scenarios. Moreover, these algorithms make strict assumptions about the involved parties, namely that they will not collude with each other. In this paper, the authors propose a new secure multiparty computation-based k-means clustering algorithm that is both secure and efficient enough to be used in a real-world scenario. Experiments based on realistic scenarios reveal that this protocol has lower communication costs and significantly lower computational costs.

Privacy Preserving Data Mining

10.5772/intechopen.99224, 2021

Author(s): Esma Ergüner Özkoç

Keyword(s): Data Mining, Data Privacy, Personal Data, Privacy Preserving, Privacy Preserving Data Mining, Data Mining Techniques, Data Mining Algorithms, Data Output, The Individual, Mining Algorithms

Data mining techniques provide benefits in many areas, such as medicine, sports, marketing, and signal processing, as well as data and network security. However, although data mining techniques are used in security applications such as intrusion detection, biometric authentication, and fraud and malware classification, “privacy” has become a serious problem, especially in data mining applications that involve the collection and sharing of personal data. For these reasons, the problem of protecting privacy in the context of data mining differs from traditional data privacy protection, as data mining can act as both a friend and a foe. This chapter covers previously developed privacy-preserving data mining techniques in two parts: (i) techniques proposed for input data that will be subject to data mining and (ii) techniques suggested for processed data (the output of the data mining algorithms). It also presents attacks against the privacy of data mining applications. The chapter concludes with a discussion of next-generation privacy-preserving data mining applications at both the individual and organizational levels.

Distributed Privacy Preserving Clustering via Homomorphic Secret Sharing and its Application to (Vertically) Partitioned Spatio-Temporal Data

Developments in Data Extraction, Management, and Analysis, 10.4018/978-1-4666-2148-0.ch003, 2013, pp. 45-65

Author(s): Can Brochmann Yildizli, Thomas Pedersen, Yucel Saygin, Erkay Savas, Albert Levi

Keyword(s): Data Mining, Real World, Privacy Preserving, Secure Multiparty Computation, Multiparty Computation, Privacy Preserving Data Mining, Computational Costs, Data Mining Algorithms, Spatio Temporal, Mining Algorithms

Recent concerns about privacy have motivated data mining researchers to develop methods for performing data mining while preserving the privacy of individuals. One approach to developing privacy-preserving data mining algorithms is secure multiparty computation, which allows for algorithms that do not trade accuracy for privacy. However, earlier methods suffer from very high communication and computational costs, making them infeasible in real-world scenarios. Moreover, these algorithms make strict assumptions about the involved parties, namely that they will not collude with each other. In this paper, the authors propose a new secure multiparty computation-based k-means clustering algorithm that is both secure and efficient enough to be used in a real-world scenario. Experiments based on realistic scenarios reveal that this protocol has lower communication costs and significantly lower computational costs.
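
The secret-sharing idea behind this kind of protocol can be illustrated with a minimal sketch. The code below is not the authors' protocol; it only shows plain additive secret sharing over a prime field, which is additively homomorphic: parties can add the shares they hold and only the total is ever reconstructed. The modulus `PRIME` and the function names `share`, `reconstruct`, and `secure_sum` are assumptions made for illustration, and the parties are assumed to be honest and to hold non-negative integers smaller than `PRIME`.

```python
import secrets

# Field modulus; an illustrative choice, not taken from the paper.
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split an integer into n additive shares that sum to value mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def reconstruct(shares):
    """Recombine additive shares into the shared value."""
    return sum(shares) % PRIME


def secure_sum(private_values):
    """Simulate n parties computing the sum of their private inputs.

    Party i splits its input into n shares and sends share j to party j.
    Each party adds the shares it received, so no single party sees
    another party's input; only the final total is reconstructed.
    """
    n = len(private_values)
    all_shares = [share(v, n) for v in private_values]
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    return reconstruct(partial_sums)


if __name__ == "__main__":
    # Hypothetical example: three parties hold different attributes of the
    # same record and each knows only its partial squared distance to a
    # centroid; the total squared distance is the sum of the parts.
    partial_squared_distances = [12, 30, 7]
    print(secure_sum(partial_squared_distances))
```

In a vertically partitioned setting, a sum like this is one of the building blocks a clustering protocol would need; a real protocol must also handle comparisons, collusion, and fixed-point encoding, none of which this sketch attempts.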

Privacy Preserving Data Mining Algorithms by Data Distortion

2006 International Conference on Management Science and Engineering, 10.1109/icmse.2006.313871, 2006, Cited By ~ 2

Author(s): Wu Xiao-dan, Yue Dian-min, Liu Feng-li, Wang Yun-feng, Chu Chao-Hsien

Keyword(s): Data Mining, Privacy Preserving, Privacy Preserving Data Mining, Data Mining Algorithms, Data Distortion, Mining Algorithms

Sensitive Items in Privacy Preserving — Association Rule Mining

Journal of Information & Knowledge Management, 10.1142/s0219649208001932, 2008, Vol 07 (01), pp. 31-35

Author(s): K. Duraiswamy, N. Maheswari

Keyword(s): Data Mining, Private Information, Association Rule, Association Rule Mining, Privacy Preserving, Data Input, Rule Mining, Privacy Preserving Data Mining, Data Mining Algorithms, Mining Algorithms

Privacy-preserving data mining has recently been proposed in response to concerns about protecting personal or sensitive information derived from data mining algorithms. For example, through data mining, sensitive information such as private information or patterns may be inferred from non-sensitive information or unclassified data. As large repositories of data contain confidential rules that must be protected before being published, association rule hiding has become one of the important privacy-preserving data mining problems. There are two types of privacy in data mining. Output privacy tries to hide the mining results by minimally altering the data. Input privacy tries to manipulate the data so that the mining result is not affected or is minimally affected. For some applications, certain sensitive predictive rules that contain given sensitive items are hidden. To identify the sensitive items, an algorithm called SENSITEM is proposed. The results of the work are presented.
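
As a rough illustration of association rule hiding in general, the sketch below lowers the confidence of one rule by deleting a sensitive item from transactions that support it. This is not the SENSITEM algorithm, whose details the abstract does not give; the functions `count`, `confidence`, and `hide_rule`, the sample transactions, and the threshold are all hypothetical. The sketch assumes the sensitive item appears in the rule's consequent, so each deletion can only reduce the confidence.

```python
def count(transactions, itemset):
    """Number of transactions that contain every item in itemset."""
    items = set(itemset)
    return sum(items <= t for t in transactions)


def confidence(transactions, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    base = count(transactions, antecedent)
    if base == 0:
        return 0.0
    return count(transactions, set(antecedent) | set(consequent)) / base


def hide_rule(transactions, antecedent, consequent, sensitive_item, min_conf):
    """Return a sanitized copy in which the rule's confidence is below min_conf.

    Assumes sensitive_item is part of the consequent. The item is removed
    from supporting transactions one at a time, stopping as soon as the
    rule would no longer be reported at the min_conf threshold.
    """
    sanitized = [set(t) for t in transactions]
    rule_items = set(antecedent) | set(consequent)
    for t in sanitized:
        if confidence(sanitized, antecedent, consequent) < min_conf:
            break
        if rule_items <= t:
            t.discard(sensitive_item)
    return sanitized


if __name__ == "__main__":
    data = [
        {"bread", "milk"},
        {"bread", "milk"},
        {"bread", "milk"},
        {"bread"},
        {"milk"},
    ]
    cleaned = hide_rule(data, {"bread"}, {"milk"}, "milk", min_conf=0.5)
    print(confidence(data, {"bread"}, {"milk"}))
    print(confidence(cleaned, {"bread"}, {"milk"}))
```

Deleting items greedily like this trades data quality for privacy; a practical method would also try to limit side effects on non-sensitive rules.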

On the design and quantification of privacy preserving data mining algorithms

Proceedings of the twentieth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems - PODS '01, 10.1145/375551.375602, 2001, Cited By ~ 440

Author(s): Dakshi Agrawal, Charu C. Aggarwal

Keyword(s): Data Mining, Privacy Preserving, Privacy Preserving Data Mining, Data Mining Algorithms, Mining Algorithms

Distributed Privacy Preserving Clustering via Homomorphic Secret Sharing and Its Application to (Vertically) Partitioned Spatio-Temporal Data

International Journal of Data Warehousing and Mining, 10.4018/jdwm.2011010103, 2011, Vol 7 (1), pp. 46-66, Cited By ~ 10

Author(s): Can Brochmann Yildizli, Thomas Pedersen, Yucel Saygin, Erkay Savas, Albert Levi

Keyword(s): Data Mining, Real World, Clustering Algorithm, Privacy Preserving, Secure Multiparty Computation, Multiparty Computation, Privacy Preserving Data Mining, Computational Costs, Data Mining Algorithms, Mining Algorithms

Recent concerns about privacy have motivated data mining researchers to develop methods for performing data mining while preserving the privacy of individuals. One approach to developing privacy-preserving data mining algorithms is secure multiparty computation, which allows for algorithms that do not trade accuracy for privacy. However, earlier methods suffer from very high communication and computational costs, making them infeasible in real-world scenarios. Moreover, these algorithms make strict assumptions about the involved parties, namely that they will not collude with each other. In this paper, the authors propose a new secure multiparty computation-based k-means clustering algorithm that is both secure and efficient enough to be used in a real-world scenario. Experiments based on realistic scenarios reveal that this protocol has lower communication costs and significantly lower computational costs.

Data Mining and Privacy

Encyclopedia of Data Warehousing and Mining, Second Edition, 10.4018/978-1-60566-010-3.ch061, 2011, pp. 388-393

Author(s): Esma Aïmeur

Keyword(s): Data Mining, Data Privacy, Reconstruction Algorithm, Learning Task, Privacy Preserving, Sources Of Information, Privacy Preserving Data Mining, Useful Knowledge, Data Mining Algorithms, Mining Algorithms

With the emergence of the Internet, it is now possible to connect to and access sources of information and databases throughout the world. At the same time, this raises many questions regarding the privacy and security of the data, in particular how to mine useful information while preserving the privacy of sensitive and confidential data. Privacy-preserving data mining is a relatively new but rapidly growing field that studies how data mining algorithms affect the privacy of data and tries to find and analyze new algorithms that preserve this privacy. At first glance, it may seem that data mining and privacy have orthogonal goals, the first being concerned with the discovery of useful knowledge from data whereas the second is concerned with the protection of the data’s privacy. Historically, the interactions between privacy and data mining have been questioned and studied for more than a decade, but the name of the domain itself was coined more recently by two seminal papers attacking the subject from two very different perspectives (Agrawal & Srikant, 2000; Lindell & Pinkas, 2000). The first paper (Agrawal & Srikant, 2000) takes the approach of randomizing the data through the injection of noise, and then recovers from it by applying a reconstruction algorithm before a learning task (the induction of a decision tree) is carried out on the reconstructed dataset. The second paper (Lindell & Pinkas, 2000) adopts a cryptographic view of the problem and rephrases it within the general framework of secure multiparty computation. The outline of this chapter is as follows. First, the area of privacy-preserving data mining is illustrated through three scenarios, before a classification of privacy-preserving algorithms is described and the three main approaches currently used are detailed. Finally, the future trends and challenges that await the domain are discussed before concluding.
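
The randomization idea described above can be sketched in a few lines. The code below is a deliberately simplified stand-in, not the reconstruction algorithm of Agrawal & Srikant: instead of reconstructing the full distribution, it only recovers the mean and variance of the original values from the noisy release, using the known variance of the noise. The function names `perturb` and `estimate_moments`, the synthetic age data, and the noise width are assumptions chosen for illustration.

```python
import random
import statistics


def perturb(values, noise_width, rng):
    """Add independent uniform noise in [-noise_width, noise_width] to each value."""
    return [v + rng.uniform(-noise_width, noise_width) for v in values]


def estimate_moments(perturbed, noise_width):
    """Estimate the mean and variance of the original values.

    The noise has zero mean, so the mean carries over unchanged. The noise
    is independent of the data, so its known variance (width**2 / 3 for a
    uniform distribution on [-width, width]) is subtracted from the
    variance of the perturbed values.
    """
    noise_variance = noise_width**2 / 3
    mean = statistics.fmean(perturbed)
    variance = max(statistics.pvariance(perturbed) - noise_variance, 0.0)
    return mean, variance


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic "true" ages that would stay private in a real deployment.
    ages = [rng.gauss(40, 10) for _ in range(20000)]
    released = perturb(ages, 25.0, rng)
    est_mean, est_var = estimate_moments(released, 25.0)
    print(round(est_mean, 1), round(est_var, 1))
```

Individual released values are far from the true ones, yet aggregate statistics remain recoverable, which is the property the randomization approach relies on before a learning task is run.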
