Haphazard, enhanced haphazard and personalised anonymisation for privacy preserving data mining on sensitive data sources

Privacy-preserving data mining (PPDM) refers to data mining techniques developed to protect sensitive data while allowing useful information to be discovered from the data. In this article, we review PPDM and present a broad survey of related issues, techniques, measures, applications, and regulation guidelines. We observe that the rapid pace of change in the information technologies available to sustain PPDM has created a gap between theory and practice. We posit that without a clear understanding of the practice, this gap will widen, which will ultimately be detrimental to the field. We conclude by proposing a comprehensive research agenda intended to bridge this gap and to serve as a reference basis for future related legislative activities.

Semantically Secure Classifiers for Privacy Preserving Data Mining

Research Anthology on Privatizing and Securing Data, 2021, pp. 1066-1095

DOI: 10.4018/978-1-7998-8954-0.ch049

Author(s): Sumana M., Hareesha K. S., Sampath Kumar

Keyword(s): Data Mining, Model Building, Computation Time, Privacy Preserving, Sensitive Data, Privacy Preserving Data Mining, Intermediate Data, Secure Protocols, Mean And Variance, Probabilistic Property

Parties distributed across multiple locations often need to make predictions jointly. However, sensitive data must not be revealed while the model is being built, and maintaining the privacy of such data is a foremost concern. Earlier approaches to classification and prediction have been shown to be insufficiently secure, and their performance suffers. This chapter focuses on the secure construction of commonly used classifiers. The computations performed during model building are proved to be semantically secure. The homomorphic and probabilistic properties of the Paillier cryptosystem are used to perform secure product, mean, and variance calculations. These secure computations are performed without revealing any intermediate data or the sensitive data held at the multiple sites. The accuracy of the resulting classifiers is observed to be almost equivalent to that of non-privacy-preserving classifiers. The secure protocols require reduced computation time and communication cost. It is also proved that the proposed privacy-preserving classifiers perform significantly better than the base classifiers.
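
As a rough illustration of the kind of computation this abstract describes, the following is a minimal sketch of Paillier's additive homomorphism used to obtain a mean and variance across sites. It is not the chapter's protocol: the function names (`keygen`, `encrypt`, `decrypt`, `secure_mean_variance`), the tiny demo primes, the single trusted key holder, and the restriction to non-negative integers are all assumptions made for illustration, and the parameters are far too small to be secure.

```python
import math
import secrets


def keygen(p=104729, q=1299709):
    """Toy Paillier keys from two small demo primes (assumed values, insecure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt a non-negative integer m < n under public key n."""
    n2 = n * n
    r = secrets.randbelow(n - 1) + 1
    while math.gcd(r, n) != 1:
        r = secrets.randbelow(n - 1) + 1
    # Fresh randomness r makes the scheme probabilistic: equal plaintexts
    # almost never produce equal ciphertexts.
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2


def decrypt(n, priv, c):
    """Recover the plaintext with private key (lam, mu)."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return (x - 1) // n * mu % n


def secure_mean_variance(site_values):
    """Each site contributes only encrypted local statistics.

    Multiplying ciphertexts adds the underlying plaintexts, so the
    aggregator never sees an individual site's count, sum, or sum of squares.
    """
    n, priv = keygen()
    n2 = n * n
    enc_totals = [encrypt(n, 0) for _ in range(3)]
    for values in site_values:
        local = (len(values), sum(values), sum(v * v for v in values))
        for i, stat in enumerate(local):
            enc_totals[i] = enc_totals[i] * encrypt(n, stat) % n2
    count, total, total_sq = (decrypt(n, priv, c) for c in enc_totals)
    mean = total / count
    return mean, total_sq / count - mean * mean


if __name__ == "__main__":
    print(secure_mean_variance([[4, 8], [6], [10, 12]]))
```

Only the three aggregated totals are ever decrypted in this sketch. A real deployment would need large primes, a vetted cryptographic library, and a decision about who holds the private key.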

Data Transformation and Data Transitive Techniques for Protecting Sensitive Data in Privacy Preserving Data Mining

Lecture Notes in Electrical Engineering - Emerging Trends in Computing, Informatics, Systems Sciences, and Engineering, 2012, pp. 345-355

DOI: 10.1007/978-1-4614-3558-7_28

Author(s): S. Vijayarani, A. Tamilarasi

Keyword(s): Data Mining, Data Transformation, Privacy Preserving, Sensitive Data, Privacy Preserving Data Mining

Collusion-Free Privacy Preserving Data Mining

International Journal of Intelligent Information Technologies, 2010, Vol 6 (4), pp. 30-45

DOI: 10.4018/jiit.2010100103

Cited By ~ 7

Author(s): M. Rajalakshmi, T. Purusothaman, S. Pratheeba

Keyword(s): Data Mining, Association Rule, Privacy Preserving, Frequent Itemsets, Data Sources, Sensitive Information, Distributed Data, Distributed Environment, Rule Mining, Privacy Preserving Data Mining

Distributed association rule mining is an integral part of data mining that extracts useful information hidden in distributed data sources. As local frequent itemsets are aggregated into global ones, sensitive information about individual data sources needs strong protection. Various privacy-preserving data mining approaches have been proposed for distributed environments, but in the existing approaches, collusion among participating sites reveals sensitive information about the other sites. In this paper, the authors propose a collusion-free algorithm for mining global frequent itemsets in a distributed environment with minimal communication among sites. The algorithm splits and sanitizes the itemsets and communicates with random sites in two different phases, making it difficult for colluders to retrieve sensitive information. Results show that the consequences of collusion are greatly reduced without affecting mining performance, and they confirm optimal communication among sites.
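
The abstract does not spell out the algorithm, so the sketch below is not the authors' method. It shows a generic additive secret-sharing secure sum, one common way to combine local support counts so that no single site's count is exposed. The names `split_into_shares`, `global_support`, and `is_globally_frequent`, the modulus, and the full exchange of shares among all sites are assumptions chosen for illustration.

```python
import secrets

MODULUS = 2**61 - 1  # assumed to be larger than any possible global count


def split_into_shares(value, num_shares, modulus=MODULUS):
    """Split value into random shares that sum to value modulo modulus."""
    shares = [secrets.randbelow(modulus) for _ in range(num_shares - 1)]
    shares.append((value - sum(shares)) % modulus)
    return shares


def global_support(local_counts, modulus=MODULUS):
    """Combine per-site support counts without revealing any single count.

    share_matrix[i][j] is the share that site i sends to site j. Each site
    publishes only the sum of the shares it received, which looks random
    on its own.
    """
    k = len(local_counts)
    share_matrix = [split_into_shares(c, k, modulus) for c in local_counts]
    partial_sums = [
        sum(share_matrix[i][j] for i in range(k)) % modulus for j in range(k)
    ]
    return sum(partial_sums) % modulus


def is_globally_frequent(local_counts, total_transactions, min_support):
    """Check an itemset against a global minimum-support threshold."""
    return global_support(local_counts) >= min_support * total_transactions


if __name__ == "__main__":
    counts = [120, 45, 310]  # hypothetical local support counts at three sites
    print(global_support(counts))
    print(is_globally_frequent(counts, 1000, 0.4))
```

In this sketch, every share of an honest site's count except one is uniformly random, so sites that pool only some of the shares they received cannot reconstruct that count. The paper's two-phase communication with random sites is a different design that this example does not attempt to reproduce.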

Collusion-Free Privacy Preserving Data Mining

Insights into Advancements in Intelligent Information Technologies, 2012, pp. 269-284

DOI: 10.4018/978-1-4666-0158-1.ch015

Author(s): T. Purusothaman, M. Rajalakshmi, S. Pratheeba

Keyword(s): Data Mining, Privacy Preserving, Frequent Itemsets, Data Sources, Sensitive Information, Distributed Data, Distributed Environment, Rule Mining, Privacy Preserving Data Mining, Distributed Association

Distributed association rule mining is an integral part of data mining that extracts useful information hidden in distributed data sources. As local frequent itemsets are aggregated into global ones, sensitive information about individual data sources needs strong protection. Various privacy-preserving data mining approaches have been proposed for distributed environments, but in the existing approaches, collusion among participating sites reveals sensitive information about the other sites. In this paper, the authors propose a collusion-free algorithm for mining global frequent itemsets in a distributed environment with minimal communication among sites. The algorithm splits and sanitizes the itemsets and communicates with random sites in two different phases, making it difficult for colluders to retrieve sensitive information. Results show that the consequences of collusion are greatly reduced without affecting mining performance, and they confirm optimal communication among sites.

Semantically Secure Classifiers for Privacy Preserving Data Mining

Security and Privacy Management, Techniques, and Protocols - Advances in Information Security, Privacy, and Ethics, 2018, pp. 66-95

DOI: 10.4018/978-1-5225-5583-4.ch003

Author(s): Sumana M., Hareesha K. S., Sampath Kumar

Keyword(s): Data Mining, Model Building, Computation Time, Privacy Preserving, Sensitive Data, Privacy Preserving Data Mining, Intermediate Data, Secure Protocols, Mean And Variance, Probabilistic Property

Parties distributed across multiple locations often need to make predictions jointly. However, sensitive data must not be revealed while the model is being built, and maintaining the privacy of such data is a foremost concern. Earlier approaches to classification and prediction have been shown to be insufficiently secure, and their performance suffers. This chapter focuses on the secure construction of commonly used classifiers. The computations performed during model building are proved to be semantically secure. The homomorphic and probabilistic properties of the Paillier cryptosystem are used to perform secure product, mean, and variance calculations. These secure computations are performed without revealing any intermediate data or the sensitive data held at the multiple sites. The accuracy of the resulting classifiers is observed to be almost equivalent to that of non-privacy-preserving classifiers. The secure protocols require reduced computation time and communication cost. It is also proved that the proposed privacy-preserving classifiers perform significantly better than the base classifiers.

Privacy-Preserving Data Mining and the Need for Confluence of Research and Practice

Techniques and Applications for Advanced Information Privacy and Security, 2011, pp. 60-78

DOI: 10.4018/978-1-60566-210-7.ch005

Author(s): Lixin Fu, Hamid Nemati, Fereidoon Sadri

Keyword(s): Data Mining, Research Agenda, Information Technologies, Privacy Preserving, Theory And Practice, Clear Understanding, Sensitive Data, Privacy Preserving Data Mining, Research And Practice, Rapid Pace

Privacy-Preserving Data Mining (PPDM) refers to data mining techniques developed to protect sensitive data while allowing useful information to be discovered from the data. In this chapter, the authors review PPDM and present a broad survey of related issues, techniques, measures, applications, and regulation guidelines. The authors observe that the rapid pace of change in the information technologies available to sustain PPDM has created a gap between theory and practice. They posit that without a clear understanding of the practice, this gap will widen, which will ultimately be detrimental to the field. They conclude by proposing a comprehensive research agenda intended to bridge this gap and to serve as a reference basis for future related legislative activities.

Web based Privacy Disclosure Threats and Control Techniques

Advances in Web Technologies and Engineering - Design Solutions for Improving Website Quality and Effectiveness, 2016, pp. 334-341

DOI: 10.4018/978-1-4666-9764-5.ch014

Author(s): Tithi Hunka, Sital Dash, Prasant Kumar Pattnaik

Keyword(s): Data Mining, Privacy Preserving, Extensive Study, Security And Privacy, Sensitive Data, Privacy Preserving Data Mining, Web Based, Data Mining Techniques, The Public, And Control

With the advancement of internet technologies, web-based applications are gaining popularity day by day. Many organizations maintain large volumes of website-based data about individuals, which may carry information that cannot be revealed to the public or to researchers. While web-based applications are becoming increasingly pervasive, they also present new security and privacy challenges. Privacy threats negatively affect sensitive data and can lead to the leakage of confidential information. Privacy-preserving data mining (PPDM) techniques make it possible to protect sensitive data before it is published, by changing the format and contents of the original micro-data. This chapter undertakes an extensive study of several disclosure threats to privacy and of PPDM techniques as a unified solution for protecting against those threats.

Privacy Preserving Mining in Code Profiling Data

International Journal of Engineering and Management Research, 2018, Vol 8 (5)

DOI: 10.31033/ijemr.8.5.5

Author(s): Meenakshi Kathayat

Keyword(s): Data Mining, Software Engineering, Future Development, Data Privacy, Software Metrics, Privacy Preserving, Sensitive Data, Privacy Preserving Data Mining, Encrypted Data

Privacy-preserving data mining is an important issue in data mining today. Organizations and individuals generate sensitive data that they do not want to share, even though that data can be useful for data mining. Privacy-preserving mining allows such data to be mined usefully without harming its privacy. Privacy can be preserved by encrypting the database that is to be mined, so that the data remains protected during mining. Code profiling is a field of software engineering in which data mining can be applied to discover knowledge that is useful for future software development. In this work, we apply privacy-preserving mining to code profiling data, such as the software metrics of various codes. The results of data mining on the actual and the encrypted data are compared for accuracy. We also analyze the results of privacy-preserving mining on code profiling data and report interesting findings.

Privacy-Preserving Data Mining and the Need for Confluence of Research and Practice

Information Security and Ethics, 2008, pp. 3451-3469

DOI: 10.4018/978-1-59904-937-3.ch232

Author(s): Lixin Fu, Hamid Nemati, Fereidoon Sadri

Keyword(s): Data Mining, Research Agenda, Information Technologies, Privacy Preserving, Theory And Practice, Clear Understanding, Sensitive Data, Privacy Preserving Data Mining, Research And Practice, Rapid Pace

Privacy-preserving data mining (PPDM) refers to data mining techniques developed to protect sensitive data while allowing useful information to be discovered from the data. In this article, we review PPDM and present a broad survey of related issues, techniques, measures, applications, and regulation guidelines. We observe that the rapid pace of change in the information technologies available to sustain PPDM has created a gap between theory and practice. We posit that without a clear understanding of the practice, this gap will widen, which will ultimately be detrimental to the field. We conclude by proposing a comprehensive research agenda intended to bridge this gap and to serve as a reference basis for future related legislative activities.
