Efficient Association Rules Hiding Using Genetic Algorithms

In today’s world, millions of transactions are connected to online businesses, and the main challenge is ensuring the privacy of sensitive information. Sensitive association rules hiding (SARH) is an important goal of privacy protection algorithms. Various approaches and algorithms have been developed for sensitive association rules hiding, differentiated by their hiding performance in terms of utility preservation, prevention of ghost rules, and computational complexity. A meta-heuristic algorithm is a good candidate for solving the SARH problem because of its selective and parallel search behavior and its ability to avoid local minima. This paper proposes a simple genetic encoding for SARH. The proposed algorithm formulates an objective function that estimates the effect on nonsensitive rules and offers recursive computation to reduce that effect. Three benchmark datasets were used for evaluation. The results show an improvement of 81% in execution time, 23% in utility, and 5% in accuracy.
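As an illustration only, the kind of objective described in this abstract can be sketched as follows. The encoding is an assumption, not the paper's: a chromosome is taken to be a binary mask over transactions, a set bit removes one hypothetical `victim` item from that transaction, and the fitness counts sensitive rules that remain minable plus nonsensitive rules that are lost. The names `support`, `confidence`, `holds`, `apply_mask`, and `fitness` are invented for this sketch, and ghost rules are not counted.

```python
from typing import FrozenSet, List, Sequence, Tuple

Transaction = FrozenSet[str]
Rule = Tuple[FrozenSet[str], FrozenSet[str]]  # (antecedent, consequent)


def support(db: Sequence[Transaction], items: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item in `items`."""
    if not db:
        return 0.0
    return sum(1 for t in db if items <= t) / len(db)


def confidence(db: Sequence[Transaction], rule: Rule) -> float:
    """support(antecedent and consequent) / support(antecedent)."""
    antecedent, consequent = rule
    base = support(db, antecedent)
    return support(db, antecedent | consequent) / base if base else 0.0


def holds(db: Sequence[Transaction], rule: Rule,
          min_sup: float, min_conf: float) -> bool:
    """True if the rule would be mined at the given thresholds."""
    antecedent, consequent = rule
    return (support(db, antecedent | consequent) >= min_sup
            and confidence(db, rule) >= min_conf)


def apply_mask(db: Sequence[Transaction], mask: Sequence[int],
               victim: str) -> List[Transaction]:
    """Assumed encoding: a set bit removes `victim` from that transaction."""
    return [t - {victim} if bit else t for t, bit in zip(db, mask)]


def fitness(db: Sequence[Transaction], mask: Sequence[int], victim: str,
            sensitive: Sequence[Rule], nonsensitive: Sequence[Rule],
            min_sup: float, min_conf: float) -> int:
    """Lower is better: visible sensitive rules plus lost nonsensitive rules."""
    sanitized = apply_mask(db, mask, victim)
    visible = sum(1 for r in sensitive
                  if holds(sanitized, r, min_sup, min_conf))
    lost = sum(1 for r in nonsensitive
               if holds(db, r, min_sup, min_conf)
               and not holds(sanitized, r, min_sup, min_conf))
    return visible + lost
```

A genetic algorithm would evolve masks to minimize this value; the selection, crossover, and mutation operators are left out because the abstract does not describe them.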


Finding the Number of Clusters in Data and Better Initial Centers for K-means Algorithm

International Journal of Intelligent Systems and Applications ◽

10.5815/ijisa.2020.06.01 ◽

2020 ◽

Vol 12 (6) ◽

pp. 1-20

Author(s):

Ahmed Fahim ◽

Keyword(s):

Data Clustering ◽

Linear Time ◽

Original Data ◽

Local Minima ◽

Expected Number ◽

Open Problems ◽

Number Of Clusters ◽

Benchmark Datasets ◽

Selection Of

The k-means algorithm is the best-known algorithm for data clustering in data mining. Its most important advantages are its simplicity and its speed of convergence to local minima, in addition to its linear time complexity. The most important open problems for this algorithm are the selection of initial centers and the determination of the exact number of clusters in advance. This paper proposes a solution to both problems together by adding a preprocessing step that finds the expected number of clusters in the data and better initial centers. Many studies address each of these problems separately, but none addresses both together. The preprocessing step requires O(n log n) time, where n is the size of the dataset. It aims to obtain an initial partitioning of the data without determining the number of clusters in advance, and then computes the means of the initial clusters. After that, k-means is applied to the original data using the information from the preprocessing step to obtain the final clusters. We use many benchmark datasets to test the proposed method. The experimental results show the efficiency of the proposed method.
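The final stage described here, running k-means from centers supplied by an earlier step, can be sketched as below. This is a plain Lloyd-style iteration written for illustration. The preprocessing step itself is not reproduced because the abstract does not specify it, so `centers` is simply assumed to come from that step. The function names are invented for this sketch.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points: Sequence[Point], centers: Sequence[Point],
           max_iter: int = 100) -> Tuple[List[List[float]], List[int]]:
    """Lloyd iteration started from externally supplied initial centers."""
    current = [list(c) for c in centers]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(current)),
                key=lambda j: squared_distance(p, current[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        updated = []
        for j, old in enumerate(current):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append([
                    sum(m[d] for m in members) / len(members)
                    for d in range(len(old))
                ])
            else:
                updated.append(old)  # keep an empty cluster's center in place
        if updated == current:
            break  # centers stopped moving, so the assignment is stable
        current = updated
    return current, labels
```

The number of clusters is implied by the length of `centers`, which matches the abstract's idea that the preprocessing step fixes both the count and the starting positions.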


Learning With Differential Privacy

Handbook of Research on Cyber Crime and Information Privacy - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-7998-5728-0.ch019 ◽

2021 ◽

pp. 372-395

Author(s):

Poushali Sengupta ◽

Sudipta Paul ◽

Subhankar Mishra

Keyword(s):

Privacy Protection ◽

Differential Privacy ◽

Intrusion Detection Systems ◽

Sensitive Information ◽

Detection Systems ◽

Trade Offs ◽

Personal Level ◽

Encryption Decryption ◽

Individual Trees ◽

Prevention Methods

Data leakage can have an extreme effect at the personal level if the data contain sensitive information. Common prevention methods such as encryption-decryption, endpoint protection, and intrusion detection systems are prone to leakage. Differential privacy comes to the rescue with a promise of protection against leakage, as it uses a randomized response technique at the time of data collection, which promises strong privacy with better utility. Differential privacy allows one to access the forest of data by describing the patterns of groups without disclosing any individual trees. The current adoption of differential privacy by leading tech companies and academia encourages the authors to explore the topic in detail. The chapter discusses the different aspects of differential privacy, its application to privacy protection and information leakage, a comparison of current research approaches in this field, its utility in the real world, and the trade-offs.
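The randomized response technique mentioned here can be illustrated with a minimal sketch for a single yes/no attribute. The variant shown is the common one in which a respondent answers truthfully with probability `p_truth` and otherwise reports a fair coin flip. The chapter's own mechanism may differ, and the function names are invented for this sketch.

```python
import math
import random
from typing import Sequence


def randomized_response(truth: bool, p_truth: float,
                        rng: random.Random) -> bool:
    """Report the true bit with probability p_truth, else a fair coin flip."""
    if rng.random() < p_truth:
        return truth
    return rng.random() < 0.5


def estimate_proportion(reports: Sequence[bool], p_truth: float) -> float:
    """Unbiased estimate of the true 'yes' rate from the noisy reports.

    P(report yes) = p_truth * pi + (1 - p_truth) / 2, solved for pi.
    """
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) / 2) / p_truth


def epsilon(p_truth: float) -> float:
    """Privacy parameter of this mechanism: ln((1 + p) / (1 - p))."""
    return math.log((1 + p_truth) / (1 - p_truth))
```

No single report reveals a respondent's true answer with certainty, yet the aggregate proportion can still be recovered, which is the forest-versus-trees idea in the abstract. With `p_truth = 0.5` the mechanism satisfies differential privacy with epsilon equal to ln 3.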


Modifying Transactional Databases to Hide Sensitive Association Rules

Information Systems Research ◽

10.1287/isre.2021.1033 ◽

2021 ◽

Author(s):

Syam Menon ◽

Abhijeet Ghoshal ◽

Sumit Sarkar

Keyword(s):

Supply Chain ◽

Association Rules ◽

Association Rule ◽

Sensitive Information ◽

Data Set ◽

Problem Reduction ◽

Transactional Databases ◽

Potential Risks ◽

Integer Formulations ◽

Number Of Firms

Although firms recognize the value in sharing data with supply chain partners, many remain reluctant to share for fear of sensitive information potentially making its way to competitors. Approaches that can help hide sensitive information could alleviate such concerns and increase the number of firms that are willing to share. Sensitive information in transactional databases often manifests itself in the form of association rules. The sensitive association rules can be concealed by altering transactions so that they remain hidden when the data are mined by the partner. The problem of hiding these rules in the data is computationally difficult (NP-hard), and extant approaches are all heuristic in nature. To our knowledge, this is the first paper that introduces the problem as a nonlinear integer formulation to hide the sensitive association rules while minimizing the alterations needed in the data set. We apply transformations that linearize the constraints and derive various results that help reduce the size of the problem to be solved. Our results show that although the nonlinear integer formulations are not practical, the linearizations and problem-reduction steps make a significant impact on solvability and solution time. This approach mitigates potential risks associated with sharing and should increase data sharing among supply chain partners.


Privacy Protection of Class Association Rules produced by medical datasets

2019 IEEE 5th International Conference for Convergence in Technology (I2CT) ◽

10.1109/i2ct45611.2019.9033692 ◽

2019 ◽

Cited By ~ 1

Author(s):

Priyanka Garach ◽

Darshana Patel

Keyword(s):

Association Rules ◽

Privacy Protection


Privacy protection in mobile crowd sensing: a survey

World Wide Web ◽

10.1007/s11280-019-00745-2 ◽

2019 ◽

Vol 23 (1) ◽

pp. 421-452 ◽

Cited By ~ 1

Author(s):

Yongfeng Wang ◽

Zheng Yan ◽

Wei Feng ◽

Shushu Liu

Keyword(s):

Privacy Protection ◽

Data Privacy ◽

Large Scale ◽

System Structure ◽

Smart Devices ◽

Future Research ◽

Sensitive Information ◽

Mobile Crowd Sensing ◽

Crowd Sensing ◽

Mobile Crowd

The unprecedented proliferation of mobile smart devices has propelled a promising computing paradigm, Mobile Crowd Sensing (MCS), where people share surrounding insight or personal data with others. As a fast, easy, and cost-effective way to address large-scale societal problems, MCS is widely applied in many fields, e.g., environment monitoring, map construction, and public safety. Despite its popularity, the risk of sensitive information disclosure in MCS poses a serious threat to the participants and limits its further development in privacy-sensitive fields. Thus, research on privacy protection in MCS has become important and urgent. This paper targets the privacy issues of MCS and provides a thorough survey of the literature. We first introduce a typical system structure of MCS, summarize its characteristics, and propose essential privacy requirements on the basis of a threat model. Then, we survey existing solutions for privacy protection and evaluate their performance against the proposed requirements. In essence, we classify the privacy protection schemes into four categories with regard to identity privacy, data privacy, attribute privacy, and task privacy. Besides, we review the achievements on privacy-preserving incentives in MCS from the viewpoint of four incentive measures: credit incentive, auction incentive, currency incentive, and reputation incentive. Finally, we point out some open issues and propose future research directions based on the findings from our survey.


Mining Undominated Association Rules Through Interestingness Measures

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213014600112 ◽

2014 ◽

Vol 23 (04) ◽

pp. 1460011 ◽

Cited By ~ 16

Author(s):

Slim Bouker ◽

Rabie Saidi ◽

Sadok Ben Yahia ◽

Engelbert Mephu Nguifo

Keyword(s):

Association Rules ◽

Threshold Value ◽

Interestingness Measures ◽

Novel Approach ◽

Benchmark Datasets ◽

Efficient Selection ◽

Selection Of

The increasing growth of databases raises an urgent need for more accurate methods to better understand the stored data. In this scope, association rules have been used extensively for the analysis and comprehension of huge amounts of data. However, the number of generated rules is too large to be efficiently analyzed and explored in any further process. To overcome this obstacle, an efficient selection of rules has to be performed. Since selection is necessarily based on evaluation, many interestingness measures have been proposed. However, the abundance of these measures gave rise to a new problem, namely the heterogeneity of the evaluation results, which makes the final decision confusing. In this respect, we propose a novel approach to discover interesting association rules without favoring or excluding any measure by adopting the notion of dominance between association rules. Our approach bypasses the problem of measure heterogeneity and unveils a compromise between their evaluations. Interestingly enough, the proposed approach also avoids another non-trivial problem, which is the specification of a threshold value. Extensive experiments carried out on benchmark datasets show the benefits of the introduced approach.
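The notion of dominance can be illustrated with a small sketch. It assumes the usual Pareto-style definition: one rule dominates another if it scores at least as well on every measure and strictly better on at least one, with higher scores taken to be better. The paper's exact definition may differ, and the rule names and scores in the usage note are made up.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if score vector `a` Pareto-dominates `b` (higher is better)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def undominated(rules: Dict[str, Sequence[float]]) -> List[str]:
    """Names of rules that no other rule dominates, in input order."""
    return [
        name for name, scores in rules.items()
        if not any(dominates(other_scores, scores)
                   for other, other_scores in rules.items()
                   if other != name)
    ]
```

For example, with hypothetical (support, confidence) pairs `{"r1": (0.5, 0.9), "r2": (0.4, 0.8), "r3": (0.6, 0.7)}`, rule `r2` is dominated by `r1`, while `r1` and `r3` are both kept because neither is better on every measure. No threshold on any single measure is needed, which is the point the abstract makes.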


Privacy Preserving Spatio-Temporal Databases Based on k-Anonymity

Science & Technology Development Journal - Engineering and Technology ◽

10.32508/stdjet.v3isi1.517 ◽

2020 ◽

Vol 3 (SI1) ◽

pp. SI82-SI94

Author(s):

Anh Tuan Truong

Keyword(s):

Data Mining ◽

Privacy Protection ◽

Location Privacy ◽

Location Based Services ◽

Sensitive Information ◽

User Privacy ◽

Location Data ◽

Related Information ◽

Spatio Temporal ◽

Location Privacy Protection

The development of location-based services and mobile devices has led to an increase in location data. Through the data mining process, valuable information can be discovered from location data. However, an attacker may also extract private (sensitive) information about the user, which threatens user privacy. Therefore, location privacy protection has become an important requirement for the successful development of location-based services. In this paper, we propose a grid-based approach as well as an algorithm to guarantee k-anonymity, a well-known privacy protection approach, in a location database. The proposed approach considers only the information that is significant for the data mining process while ignoring unrelated information. The experimental results show the effectiveness of the proposed approach in comparison with existing approaches from the literature.
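A minimal sketch of grid-based cloaking for k-anonymity follows. It is not the paper's algorithm, which the abstract does not detail. The sketch assumes a uniform square grid, assumes the user's own point is included in `locations`, and doubles the cell size until the user's cell holds at least `k` points. The names `cell_of` and `cloak` are invented for this sketch.

```python
from typing import Optional, Sequence, Tuple

Location = Tuple[float, float]
Region = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def cell_of(x: float, y: float, size: float) -> Tuple[int, int]:
    """Index of the square grid cell of side `size` that contains (x, y)."""
    return (int(x // size), int(y // size))


def cloak(user: Location, locations: Sequence[Location], k: int,
          base_size: float, max_levels: int = 10) -> Optional[Region]:
    """Return a grid cell covering `user` and at least k recorded points.

    The cell side doubles at each level; None means no level was coarse
    enough within max_levels.
    """
    size = base_size
    for _ in range(max_levels):
        target = cell_of(user[0], user[1], size)
        count = sum(1 for p in locations
                    if cell_of(p[0], p[1], size) == target)
        if count >= k:
            cx, cy = target
            return (cx * size, cy * size, (cx + 1) * size, (cy + 1) * size)
        size *= 2  # coarsen the grid and try again
    return None
```

Publishing the returned region instead of the exact coordinates means the user cannot be distinguished from at least k - 1 others in the same cell, at the cost of coarser location data.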


Survey on Association Rule Hiding Techniques

International Journal of Scientific Research in Science Engineering and Technology ◽

10.32628/ijsrset196368 ◽

2019 ◽

pp. 300-305

Author(s):

G. Bhavani ◽

S. Sivakumari

Keyword(s):

Data Mining ◽

Comparative Analysis ◽

Association Rules ◽

Association Rule ◽

Privacy Preservation ◽

Sensitive Information ◽

Privacy Preserving Data Mining ◽

Interesting Part ◽

Future Improvement ◽

Sensitive Knowledge

The data mining process extracts useful information from a large amount of data. The most interesting part of data mining is discovering unseen patterns without revealing sensitive knowledge. Privacy Preserving Data Mining (PPDM) deals with the issue of maintaining the privacy of information. This methodology protects sensitive information from disclosure. PPDM techniques are designed to keep sensitive information hidden even after data mining is performed. One practice for hiding sensitive association rules is termed association rule hiding. The main objective of an association rule hiding algorithm is to slightly adjust the original database so that no sensitive association rule can be derived from it. This article presents a detailed survey of various association rule hiding techniques for preserving privacy in data mining. First, the techniques developed by previous researchers are studied in detail. Then, a comparative analysis is carried out to identify the limitations of each technique, and suggestions are given for future improvement in association rule hiding for privacy preservation.
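The idea of slightly adjusting the database so that a sensitive rule can no longer be derived can be sketched with a simple greedy heuristic. This is an illustrative assumption, not a technique taken from the survey. It assumes the antecedent and consequent are disjoint, and it removes one consequent item from supporting transactions, one at a time, until the rule's confidence falls below `min_conf`. The function name `hide_rule` is invented for this sketch.

```python
from typing import AbstractSet, List, Sequence, Set


def hide_rule(db: Sequence[AbstractSet[str]], antecedent: AbstractSet[str],
              consequent: AbstractSet[str], min_conf: float) -> List[Set[str]]:
    """Return a sanitized copy of `db` in which the rule's confidence is
    below `min_conf`, altering as few supporting transactions as this
    greedy pass needs."""
    sanitized = [set(t) for t in db]
    both = set(antecedent) | set(consequent)
    victim = sorted(consequent)[0]  # deterministic choice of item to drop
    for t in sanitized:
        n_ante = sum(1 for u in sanitized if antecedent <= u)
        n_both = sum(1 for u in sanitized if both <= u)
        if n_ante == 0 or n_both / n_ante < min_conf:
            break  # rule is already hidden, so stop altering data
        if both <= t:
            t.discard(victim)  # this transaction no longer supports the rule
    return sanitized
```

Recounting on every step keeps the sketch short but makes it quadratic in the number of transactions. The techniques covered by the survey also weigh side effects on nonsensitive rules, which this sketch ignores.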


Privacy preserving association rule hiding using border based approach

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v23.i2.pp1137-1145 ◽

2021 ◽

Vol 23 (2) ◽

pp. 1137-1145

Author(s):

Suma B. ◽

Shobha G.

Keyword(s):

Data Mining ◽

Association Rules ◽

Association Rule ◽

Association Rule Mining ◽

Sensitive Information ◽

Rule Mining ◽

Data Mining Technique ◽

Large Databases ◽

Hidden Correlations ◽

Rule Set

Association rule mining is a well-known data mining technique used for extracting hidden correlations between data items in large databases. In the majority of the situations, data mining results contain sensitive information about individuals and publishing such data will violate individual secrecy. The challenge of association rule mining is to preserve the confidentiality of sensitive rules when releasing the database to external parties. The association rule hiding technique conceals the knowledge extracted by the sensitive association rules by modifying the database. In this paper, we introduce a border-based algorithm for hiding sensitive association rules. The main purpose of this approach is to conceal the sensitive rule set while maintaining the utility of the database and association rule mining results at the highest level. The performance of the algorithm in terms of the side effects is demonstrated using experiments conducted on two real datasets. The results show that the information loss is minimized without sacrificing the accuracy.


Leveraging Access Control for Privacy Protection

Privacy Protection Measures and Technologies in Business Organizations ◽

10.4018/978-1-61350-501-4.ch003 ◽

2012 ◽

pp. 65-94 ◽

Cited By ~ 11

Author(s):

Anna Antonakopoulou ◽

Georgios V. Lioudakis ◽

Fotios Gogoulos ◽

Dimitra I. Kaklamani ◽

Iakovos S. Venieris

Keyword(s):

Access Control ◽

Privacy Protection ◽

Personal Information ◽

Fundamental Aspect ◽

Sensitive Information ◽

Business Environments ◽

Security Models ◽

Modern Business ◽

Privacy Violation ◽

Definition Of

Modern business environments amass and exchange a great deal of sensitive information about their employees, customers, products, et cetera, acknowledging privacy to be not only a business but also an ethical and legal requirement. Any privacy violation certainly includes some access to personal information and, intuitively, access control constitutes a fundamental aspect of privacy protection. In that respect, many organizations use security policies to control access to sensitive resources and the employed security models must provide means to handle flexible and dynamic requirements. Consequently, the definition of an expressive privacy-aware access control model constitutes a crucial issue. Among the technologies proposed, there are various access control models incorporating features designed to enforce privacy protection policies, taking mainly into account the purpose of the access, privacy obligations, as well as other contextual constraints, aiming at the accomplishment of the privacy protection requirements. This chapter studies these models, along with the aforementioned features.
