Matrix Decomposition Techniques for Data Privacy

Data mining technologies are now used in commercial, industrial, and governmental settings for purposes ranging from increasing profitability to enhancing national security. Their widespread application has raised concerns about the trade secrets of corporations and the privacy of innocent people whose records appear in the datasets collected and mined. Data mining technologies designed for knowledge discovery across corporations, or for security purposes directed at the general population, therefore need sufficient privacy awareness to protect corporate trade secrets and individuals' private information. Unfortunately, most standard data mining algorithms offer little privacy protection, because they were originally developed mainly for commercial applications in which each organization collects, owns, and mines its own private databases for specific commercial purposes.

In inter-corporation and security applications, data mining algorithms may be applied to datasets containing sensitive or private information. Data warehouse owners and government agencies may have access to many databases collected from different sources and may extract any information from them. This potentially unlimited access to data and information raises fears of abuse and prompts calls for privacy protection and due process of law. Privacy-preserving data mining techniques have been developed to address these concerns (Fung et al., 2007; Zhang & Zhang, 2007). Their general goal is to hide sensitive individual data values from the outside world or from unauthorized persons while preserving the underlying data patterns and semantics, so that a valid and efficient decision model can be constructed from the distorted data. In the best case, this new decision model is as accurate as, or even more accurate than, the model built from the original data.

There are currently at least two broad classes of approaches to achieving this goal. The first distorts the original data values so that data miners (analysts) have no means, or a greatly reduced ability, to derive the original values. The second modifies the data mining algorithms so that they can operate on distributed datasets without knowing the exact values of the data or directly accessing the original datasets. This article discusses only the first class of approaches. Interested readers may consult Clifton et al. (2003) and the references therein for discussions of distributed data mining approaches.
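The idea behind the first class of approaches, releasing distorted values that still carry the aggregate pattern, can be illustrated with a minimal sketch. It uses additive zero-mean Gaussian noise as the simplest stand-in for a distortion method; this is an assumption made for illustration and is not the matrix-decomposition technique the article itself discusses. The function name `distort`, the noise level, and the synthetic data are all hypothetical.

```python
import random
import statistics


def distort(values, noise_std, seed=0):
    """Return a copy of `values` with zero-mean Gaussian noise added to each entry."""
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, noise_std) for v in values]


# Hypothetical sensitive attribute (for example, ages) for 1,000 records.
data_rng = random.Random(42)
original = [data_rng.uniform(20.0, 70.0) for _ in range(1000)]
released = distort(original, noise_std=5.0)

# Individual values are changed by the noise...
max_shift = max(abs(o - r) for o, r in zip(original, released))
# ...while an aggregate pattern (here, the mean) is approximately preserved.
mean_gap = abs(statistics.mean(original) - statistics.mean(released))

print(f"largest individual change: {max_shift:.2f}")
print(f"difference in means:       {mean_gap:.3f}")
```

A larger `noise_std` hides individual values better but degrades the aggregate statistics, which is the privacy and accuracy trade-off the paragraph above describes.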

Privacy Preserving Data Mining

10.5772/intechopen.99224 ◽ 2021

Author(s): Esma Ergüner Özkoç

Keyword(s): Data Mining ◽ Data Privacy ◽ Personal Data ◽ Privacy Preserving ◽ Privacy Preserving Data Mining ◽ Data Mining Techniques ◽ Data Mining Algorithms ◽ Data Output ◽ The Individual ◽ Mining Algorithms

Data mining techniques provide benefits in many areas, such as medicine, sports, marketing, and signal processing, as well as data and network security. However, although data mining techniques are used in security areas such as intrusion detection, biometric authentication, and fraud and malware classification, “privacy” has become a serious problem, especially in data mining applications that involve the collection and sharing of personal data. For these reasons, the problem of protecting privacy in the context of data mining differs from traditional data privacy protection, as data mining can act as both friend and foe. This chapter covers previously developed privacy-preserving data mining techniques in two parts: (i) techniques proposed for input data that will be subject to data mining and (ii) techniques suggested for processed data (the output of data mining algorithms). It also presents attacks against the privacy of data mining applications. The chapter concludes with a discussion of next-generation privacy-preserving data mining applications at both the individual and organizational levels.

Sensitive Items in Privacy Preserving — Association Rule Mining

Journal of Information & Knowledge Management ◽ 10.1142/s0219649208001932 ◽ 2008 ◽ Vol 07 (01) ◽ pp. 31-35

Author(s): K. Duraiswamy ◽ N. Maheswari

Keyword(s): Data Mining ◽ Private Information ◽ Association Rule ◽ Association Rule Mining ◽ Privacy Preserving ◽ Data Input ◽ Rule Mining ◽ Privacy Preserving Data Mining ◽ Data Mining Algorithms ◽ Mining Algorithms

Privacy-preserving data mining has recently been proposed in response to concerns about protecting personal or sensitive information derived from data-mining algorithms. For example, through data mining, sensitive information such as private information or patterns may be inferred from non-sensitive information or unclassified data. Because large repositories of data contain confidential rules that must be protected before publication, association rule hiding has become one of the important privacy preserving data-mining problems. There are two types of privacy in data mining. Output privacy tries to hide the mining results by minimally altering the data. Input privacy tries to manipulate the data so that the mining result is not affected or only minimally affected. In some applications, the sensitive predictive rules to be hidden are those that contain given sensitive items. An algorithm, SENSITEM, is proposed to identify the sensitive items, and the results of the work are reported.
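Hiding mining results by minimally altering the data can be sketched as follows. This is a generic, greedy association rule hiding illustration, not the SENSITEM algorithm, whose details the abstract does not give. The helper names `support` and `hide_itemset`, the choice of which item to remove, and the market-basket data are all hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def hide_itemset(transactions, itemset, victim, min_support):
    """Remove `victim` from supporting transactions until support < min_support."""
    sanitized = [set(t) for t in transactions]  # work on a copy
    itemset = set(itemset)
    for t in sanitized:
        if support(sanitized, itemset) < min_support:
            break  # the itemset is no longer frequent, so stop altering data
        if itemset <= t:
            t.discard(victim)
    return sanitized


# Hypothetical market-basket data; {"beer", "diapers"} is the sensitive itemset.
db = [
    {"beer", "diapers", "milk"},
    {"beer", "diapers"},
    {"beer", "diapers", "bread"},
    {"milk", "bread"},
    {"beer", "milk"},
]
sensitive = {"beer", "diapers"}

clean = hide_itemset(db, sensitive, victim="diapers", min_support=0.4)
before = support(db, sensitive)
after = support(clean, sensitive)
changed = sum(a != b for a, b in zip(db, clean))

print(f"support before: {before}, after: {after}, transactions altered: {changed}")
```

Once the support of the sensitive itemset drops below the mining threshold, a miner using that threshold no longer finds it frequent, so no rule built from it is reported. Stopping as soon as that happens keeps the number of altered transactions small.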

Classification of Privacy Preserving Data Mining Algorithms: A Review

Jurnal Elektronika dan Telekomunikasi ◽ 10.14203/jet.v20.36-46 ◽ 2020 ◽ Vol 20 (2) ◽ pp. 36

Author(s): Dedi Gunawan

Keyword(s): Data Mining ◽ Extraction Process ◽ Privacy Preserving ◽ Sensitive Information ◽ Time Data ◽ Privacy Preserving Data Mining ◽ Data Mining Techniques ◽ Data Mining Algorithms ◽ Using Data ◽ Mining Algorithms

Nowadays, data from various sources are gathered and stored in databases. Collecting the data has little impact unless the database owner analyzes it, for example by applying data mining techniques to the databases. The development of data mining techniques and algorithms now provides significant benefits for the information extraction process in terms of the quality, accuracy, and precision of the results. Because performing data mining tasks with available algorithms may disclose sensitive information about the data subjects in the databases, the data owner should take action to protect privacy. Privacy preserving data mining (PPDM) has therefore become an emerging field of study in the data mining research community. The main purpose of PPDM is to investigate the side effects of data mining methods that originate from the penetration into the privacy of individuals and organizations. In addition, it aims to guarantee that data miners cannot reveal any personal sensitive information contained in a database, while the data utility of the sanitized database does not differ significantly from that of the original. In this paper, we present a broad view of current PPDM techniques, classifying them in a taxonomy that differentiates the characteristics of each approach. The PPDM methods are reviewed comprehensively to give researchers and practitioners a thorough understanding of the methods, along with their advantages, challenges, and future development.

A Survey of Quantification of Privacy Preserving Data Mining Algorithms

Privacy-Preserving Data Mining - Advances in Database Systems ◽ 10.1007/978-0-387-70992-5_8 ◽ 2008 ◽ pp. 183-205 ◽ Cited By ~ 64

Author(s): Elisa Bertino ◽ Dan Lin ◽ Wei Jiang

Keyword(s): Data Mining ◽ Privacy Preserving ◽ Privacy Preserving Data Mining ◽ Data Mining Algorithms ◽ Mining Algorithms

Distributed Privacy Preserving Clustering via Homomorphic Secret Sharing and Its Application to (Vertically) Partitioned Spatio-Temporal Data

Cyber Crime ◽ 10.4018/978-1-61350-323-2.ch212 ◽ 2013 ◽ pp. 395-415 ◽ Cited By ~ 1

Author(s): Can Brochmann Yildizli ◽ Thomas Pedersen ◽ Yucel Saygin ◽ Erkay Savas ◽ Albert Levi

Keyword(s): Data Mining ◽ Real World ◽ Privacy Preserving ◽ Secure Multiparty Computation ◽ Multiparty Computation ◽ Privacy Preserving Data Mining ◽ Computational Costs ◽ Data Mining Algorithms ◽ Spatio Temporal ◽ Mining Algorithms

Recent concerns about privacy have motivated data mining researchers to develop methods for performing data mining while preserving the privacy of individuals. One approach to developing privacy preserving data mining algorithms is secure multiparty computation, which allows for algorithms that do not trade accuracy for privacy. However, earlier methods suffer from very high communication and computational costs, making them infeasible in any real-world scenario. Moreover, these algorithms make strict assumptions about the involved parties, namely that the parties will not collude with each other. In this paper, the authors propose a new secure multiparty computation based k-means clustering algorithm that is both secure and efficient enough to be used in a real-world scenario. Experiments based on realistic scenarios reveal that this protocol has lower communication costs and significantly lower computational costs.
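A building block often used in secret-sharing-based clustering is a secure sum, which lets several parties compute a total (such as the coordinate sum needed for a k-means centroid) without revealing their individual contributions. The sketch below illustrates plain additive secret sharing under stated assumptions; it is not the authors' protocol, it simulates all parties in a single process, and the names `share`, `secure_sum`, and `PRIME` are hypothetical.

```python
import secrets

PRIME = 2**61 - 1  # public modulus; all arithmetic is done modulo this prime


def share(value, n_parties):
    """Split `value` into n additive shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % PRIME)
    return shares


def secure_sum(private_values):
    """Each party shares its value; only sums of shares are ever combined."""
    n = len(private_values)
    # distributed[i][j] is the share that party i sends to party j.
    distributed = [share(v, n) for v in private_values]
    # Party j adds up the shares it received and publishes only that partial sum.
    partials = [sum(distributed[i][j] for i in range(n)) % PRIME for j in range(n)]
    return sum(partials) % PRIME


# Hypothetical: three parties each hold the local sum of one coordinate
# for the points they assigned to the same cluster.
local_sums = [120, 75, 240]
total = secure_sum(local_sums)
print(total)  # equals sum(local_sums) as long as the true total is below PRIME
```

Because every share is uniformly random on its own, a single party's view reveals nothing about another party's value, and the result is exact rather than approximate. That matches the abstract's point that secure multiparty computation does not trade accuracy for privacy.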
