Haphazard, enhanced haphazard and personalised anonymisation for privacy preserving data mining on sensitive data sources

Author(s):  
M. Prakash ◽  
G. Singaravel


2008 ◽  
pp. 2402-2420
Author(s):  
Lixin Fu ◽  
Hamid Nemati ◽  
Fereidoon Sadri

Privacy-preserving data mining (PPDM) refers to data mining techniques developed to protect sensitive data while still allowing useful information to be discovered from the data. In this article, we review PPDM and present a broad survey of related issues, techniques, measures, applications, and regulation guidelines. We observe that the rapid pace of change in the information technologies available to sustain PPDM has created a gap between theory and practice. We posit that without a clear understanding of the practice, this gap will widen, which will ultimately be detrimental to the field. We conclude by proposing a comprehensive research agenda intended to bridge this gap with practice and to serve as a reference basis for future related legislative activities.


Author(s):  
Sumana M. ◽  
Hareesha K. S. ◽  
Sampath Kumar

Parties distributed at multiple locations often need to make essential predictions jointly. However, sensitive data must not be revealed in the process of building a model, and maintaining the privacy of such data is a foremost concern. Earlier approaches developed for classification and prediction have been shown to be insufficiently secure, and their performance suffers. This chapter focuses on the secure construction of commonly used classifiers. The computations performed during model building are proved to be semantically secure. The homomorphic and probabilistic properties of the Paillier cryptosystem are used to perform secure product, mean, and variance calculations. The secure computations are performed without revealing any intermediate data or the sensitive data held at the multiple sites. The accuracy of the resulting classifiers is observed to be almost equivalent to that of their non-privacy-preserving counterparts. The secure protocols require reduced computation time and communication cost. It is also proved that the proposed privacy-preserving classifiers perform significantly better than the base classifiers.
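
As a rough illustration of how an additively homomorphic scheme can support mean and variance calculations over data held at several sites, the sketch below implements a toy Paillier cryptosystem in Python. It is not the protocol from the chapter: the function names, the tiny fixed primes, the generator choice g = n + 1, and the single key holder who decrypts only the aggregated totals are all assumptions made for illustration, and the parameters are far too small to be secure.

```python
import math
import secrets


def keygen(p=1789, q=1861):
    """Toy key generation. p and q are tiny illustrative primes (assumption)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid shortcut because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Probabilistic encryption: a fresh random r gives a fresh ciphertext."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # r in 1..n-1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)


def secure_mean_variance(site_data):
    """Each site encrypts its local count, sum, and sum of squares.
    Only the combined totals are ever decrypted. All totals must stay below n."""
    n, priv = keygen()
    enc_count = encrypt(n, 0)
    enc_sum = encrypt(n, 0)
    enc_sumsq = encrypt(n, 0)
    for values in site_data:
        enc_count = add_encrypted(n, enc_count, encrypt(n, len(values)))
        enc_sum = add_encrypted(n, enc_sum, encrypt(n, sum(values)))
        enc_sumsq = add_encrypted(
            n, enc_sumsq, encrypt(n, sum(v * v for v in values))
        )
    count = decrypt(n, priv, enc_count)
    total = decrypt(n, priv, enc_sum)
    total_sq = decrypt(n, priv, enc_sumsq)
    mean = total / count
    variance = total_sq / count - mean ** 2  # population variance
    return mean, variance
```

Calling `secure_mean_variance([[4, 8], [15, 16], [23, 42]])` combines three sites' encrypted statistics and decrypts only the totals. A real deployment would use primes of cryptographic size and would have to decide who holds the private key.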


2010 ◽  
Vol 6 (4) ◽  
pp. 30-45
Author(s):  
M. Rajalakshmi ◽  
T. Purusothaman ◽  
S. Pratheeba

Distributed association rule mining is an integral part of data mining that extracts useful information hidden in distributed data sources. As local frequent itemsets from the data sources are combined into global ones, sensitive information about individual data sources needs strong protection. Different privacy-preserving data mining approaches for distributed environments have been proposed, but in the existing approaches, collusion among the participating sites reveals sensitive information about the other sites. In this paper, the authors propose a collusion-free algorithm for mining global frequent itemsets in a distributed environment with minimal communication among sites. The algorithm splits and sanitizes the itemsets and communicates them to random sites in two different phases, making it difficult for colluders to retrieve sensitive information. Results show that the consequences of collusion are greatly reduced without affecting mining performance, and confirm optimal communication among sites.
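
The general idea of splitting a local value so that no single recipient sees it whole can be sketched with additive secret sharing. The Python example below is a generic secure-sum illustration under stated assumptions, not the authors' algorithm: the names `split_into_shares` and `global_support`, the modulus, and the single-phase exchange are hypothetical, and the sanitizing step and two-phase communication described in the abstract are not modelled.

```python
import random
import secrets

MODULUS = 2**61 - 1  # assumed public modulus, larger than any possible total


def split_into_shares(value, n_shares, modulus=MODULUS):
    """Split value into n_shares random pieces that sum to value mod modulus."""
    shares = [secrets.randbelow(modulus) for _ in range(n_shares - 1)]
    shares.append((value - sum(shares)) % modulus)
    return shares


def global_support(local_counts, modulus=MODULUS):
    """Simulate every site splitting its local support count for one itemset
    and handing one share to each site in a random order. Each site only ever
    sees shares that look uniformly random on their own."""
    n_sites = len(local_counts)
    rng = random.SystemRandom()
    inbox = [0] * n_sites  # running sum of shares received by each site
    for count in local_counts:
        shares = split_into_shares(count, n_sites, modulus)
        rng.shuffle(shares)  # random assignment of shares to recipient sites
        for recipient, share in enumerate(shares):
            inbox[recipient] = (inbox[recipient] + share) % modulus
    # Sites publish only their partial sums; adding them gives the global count.
    return sum(inbox) % modulus
```

For example, `global_support([12, 7, 30])` recovers the combined support count of an itemset held by three sites. In this simple scheme a site's count is exposed only if all the other sites pool the shares they received, which is the kind of collusion exposure that splitting schemes aim to limit.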


Author(s):  
Lixin Fu ◽  
Hamid Nemati ◽  
Fereidoon Sadri

Privacy-Preserving Data Mining (PPDM) refers to data mining techniques developed to protect sensitive data while still allowing useful information to be discovered from the data. In this chapter, the authors review PPDM and present a broad survey of related issues, techniques, measures, applications, and regulation guidelines. The authors observe that the rapid pace of change in the information technologies available to sustain PPDM has created a gap between theory and practice. They posit that without a clear understanding of the practice, this gap will widen, which will ultimately be detrimental to the field. They conclude by proposing a comprehensive research agenda intended to bridge this gap with practice and to serve as a reference basis for future related legislative activities.


Author(s):  
Tithi Hunka ◽  
Sital Dash ◽  
Prasant Kumar Pattnaik

With the advancement of internet technologies, web-based applications are gaining popularity day by day. Many organizations maintain large volumes of website-based data about individuals, which may carry information that cannot be revealed to the public or to researchers. As web-based applications become increasingly pervasive, they also present new security and privacy challenges. Privacy threats negatively affect sensitive data and can lead to the leakage of confidential information. Privacy-preserving data mining techniques, however, allow sensitive data to be protected before it is published by changing the format and contents of the original micro-data. This chapter undertakes an extensive study of several ramified disclosure threats to privacy and of PPDM (privacy-preserving data mining) techniques as a unified solution for protecting against those threats.


Author(s):  
Meenakshi Kathayat

Privacy-preserving data mining is an important issue in data mining today, since various organizations and people now generate sensitive data or information. They do not want to share this sensitive data, yet it can be useful for data mining. Privacy-preserving mining allows such data to be mined usefully without harming its privacy. Privacy can be preserved by encrypting the database that is to be mined, so that the data remains secure. Code profiling is a field of software engineering in which data mining can be applied to discover knowledge that is useful for future software development. In this work, we apply privacy-preserving mining to code profiling data, such as software metrics of various programs. Results of data mining on the actual and the encrypted data are compared for accuracy. We also analyze the results of privacy-preserving mining on code profiling data and report interesting findings.


2008 ◽  
pp. 3451-3469
Author(s):  
Lixin Fu ◽  
Hamid Nemati ◽  
Fereidoon Sadri

Privacy-preserving data mining (PPDM) refers to data mining techniques developed to protect sensitive data while still allowing useful information to be discovered from the data. In this article, we review PPDM and present a broad survey of related issues, techniques, measures, applications, and regulation guidelines. We observe that the rapid pace of change in the information technologies available to sustain PPDM has created a gap between theory and practice. We posit that without a clear understanding of the practice, this gap will widen, which will ultimately be detrimental to the field. We conclude by proposing a comprehensive research agenda intended to bridge this gap with practice and to serve as a reference basis for future related legislative activities.

