A State Decision Tree based Backtracking Algorithm for Multi-Sensitive Attribute Privacy Preserving

Author(s):  
Yanchao Zhang ◽  
Qing Liu ◽  
JunJun Cheng ◽  
JiJia Yang

Building on the l-diversity model, this paper proposes an algorithm (l-BDT) based on a state decision tree that aims to protect multiple sensitive attributes from attack. The algorithm first considers all possible ways of partitioning the records into equivalence classes, then prunes the decision tree according to a set of conditions, and finally selects the partitioning with the least information loss. Analysis and experiments show that the l-BDT algorithm performs best at controlling information loss, ensuring that the published data remains as close to the original data as possible and therefore as useful as possible.
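The abstract describes the search only at a high level. The following minimal Python sketch illustrates the general idea of backtracking over fixed-size equivalence classes, with l-diversity used as the pruning condition and a simple range-based loss as the selection criterion; the group size, loss measure and pruning rule are illustrative assumptions, not the authors' exact design.

```python
from itertools import combinations

def is_l_diverse(group, sensitive_cols, l):
    """A group is acceptable if every sensitive attribute has >= l distinct values."""
    return all(len({rec[c] for rec in group}) >= l for c in sensitive_cols)

def info_loss(groups, qi_cols):
    """Toy loss: total spread of numeric quasi-identifiers inside each group."""
    loss = 0.0
    for g in groups:
        for c in qi_cols:
            vals = [rec[c] for rec in g]
            loss += max(vals) - min(vals)
    return loss

def l_bdt(records, qi_cols, sensitive_cols, l, group_size):
    """Backtracking search over partitions into fixed-size equivalence classes."""
    best = {"loss": float("inf"), "partition": None}

    def search(remaining, groups):
        if not remaining:
            part = [[records[i] for i in g] for g in groups]
            loss = info_loss(part, qi_cols)
            if loss < best["loss"]:
                best["loss"], best["partition"] = loss, part
            return
        first, rest = remaining[0], remaining[1:]
        for combo in combinations(rest, group_size - 1):
            group = [records[first]] + [records[i] for i in combo]
            if not is_l_diverse(group, sensitive_cols, l):
                continue  # prune: this branch of the state tree cannot satisfy l-diversity
            left = [i for i in rest if i not in combo]
            search(left, groups + [[first, *combo]])

    search(list(range(len(records))), [])
    return best["partition"], best["loss"]

# Hypothetical usage on a toy table
rows = [{"age": a, "disease": d} for a, d in
        [(25, "flu"), (30, "HIV"), (28, "flu"), (35, "cancer"), (40, "HIV"), (45, "cancer")]]
partition, loss = l_bdt(rows, ["age"], ["disease"], l=2, group_size=3)
```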

Information ◽  
2020 ◽  
Vol 11 (3) ◽  
pp. 166
Author(s):  
Yuelei Xiao ◽  
Haiqi Li

Privacy preserving data publishing has received considerable attention for publishing useful information while preserving data privacy. Existing privacy preserving data publishing methods for multiple sensitive attributes do not consider the situation in which different values of a sensitive attribute may have different sensitivity requirements. To solve this problem, we defined three security levels for sensitive attribute values with different sensitivity requirements and gave an Lsl-diversity model for multiple sensitive attributes. Following this, we proposed three specific greedy algorithms based on the maximal-bucket first (MBF), maximal single-dimension-capacity first (MSDCF) and maximal multi-dimension-capacity first (MMDCF) algorithms and the maximal security-level first (MSLF) greedy policy, named MBF based on MSLF (MBF-MSLF), MSDCF based on MSLF (MSDCF-MSLF) and MMDCF based on MSLF (MMDCF-MSLF), to implement the Lsl-diversity model for multiple sensitive attributes. The experimental results show that the three algorithms greatly reduce the information loss of the published microdata with only a small increase in runtime, and that their information loss tends to stabilise as the data volume grows. They also overcome the problem that the information loss of MBF, MSDCF and MMDCF increases sharply with the number of sensitive attributes.
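As a rough illustration of the kind of greedy policy the abstract names, the sketch below combines maximal-bucket-first selection with a "highest security level first" tie-break in the spirit of MBF-MSLF; the bucket structure, the security-level map and the group size l are assumptions rather than the published algorithm.

```python
from collections import defaultdict

def mbf_mslf(records, sensitive_col, security_level, l):
    """Greedily build groups of l records drawn from l distinct sensitive values,
    taking from the largest buckets and, on ties, the highest security level."""
    buckets = defaultdict(list)
    for rec in records:
        buckets[rec[sensitive_col]].append(rec)

    groups = []
    while len(buckets) >= l:
        # order buckets by (size, security level), both descending (MSLF tie-break)
        order = sorted(buckets,
                       key=lambda v: (len(buckets[v]), security_level.get(v, 0)),
                       reverse=True)
        chosen = order[:l]
        groups.append([buckets[v].pop() for v in chosen])
        for v in chosen:
            if not buckets[v]:
                del buckets[v]
    residue = [rec for b in buckets.values() for rec in b]  # records left unassigned
    return groups, residue

# Hypothetical usage with assumed security levels per sensitive value
data = [{"zip": "570%02d" % i, "disease": d} for i, d in
        enumerate(["flu", "flu", "HIV", "cancer", "HIV", "flu"])]
levels = {"HIV": 3, "cancer": 2, "flu": 1}
groups, leftover = mbf_mslf(data, "disease", levels, l=2)
```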


2013 ◽  
Vol 4 (3) ◽  
pp. 813-820
Author(s):  
Kiran P ◽  
Kavya N. P.

The core objective of privacy preserving data mining is to preserve the confidentiality of individuals even after mining. The basic advantage of personalized privacy preservation is that its information loss is much lower than that of other privacy preservation algorithms. These algorithms, however, have not been designed for specific mining algorithms. SW-SDF personalized privacy preservation uses two flags, SW and SDF: SW assigns a weight to the sensitive attribute, and SDF records the sensitive disclosure flag accepted from the individual. In this paper we design an algorithm that uses SW-SDF personalized privacy preservation for data classification. This method ensures both privacy and classification of the data.
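A minimal, purely illustrative sketch of how an SW weight and an SDF flag might gate a sensitive value before classification; the field names, threshold and suppression rule are hypothetical and not taken from the paper.

```python
def sanitize_for_classification(record, sw, sdf, weight_threshold=0.5):
    """Return a copy of the record with the sensitive value suppressed when the
    individual disallows disclosure (sdf is False) or the weight is high."""
    out = dict(record)
    if not sdf or sw >= weight_threshold:
        out["sensitive"] = "*"  # suppress before training/classification
    return out

# Hypothetical per-individual preferences: (SW weight, SDF disclosure flag)
records = [
    {"age": 34, "zip": "57000", "sensitive": "HIV"},
    {"age": 29, "zip": "57001", "sensitive": "flu"},
]
prefs = [(0.9, False), (0.2, True)]
sanitized = [sanitize_for_classification(r, w, f) for r, (w, f) in zip(records, prefs)]
```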


Author(s):  
SUSANA LADRA ◽  
VICENÇ TORRA

Synthetic data generators are one of the methods used in privacy preserving data mining to ensure the privacy of individuals when their data are published. Synthetic data generators construct artificial data from models obtained from the original data. Such models are mainly based on statistics and, typically, do not take into account other aspects of interest in artificial intelligence. In this paper we study whether one family of such synthetic data generators (the IPSO family) preserves the properties of the data that are of interest when users plan to apply clustering techniques. In particular, we study the effect of such synthetic data generators on fuzzy clustering. That is, we study the information loss the data suffer when the original data are replaced by the synthetic ones.
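To make the comparison concrete, the following numpy sketch runs a small fuzzy c-means on the original data and on a synthetic stand-in and measures how far the cluster centres move. The Gaussian sampler is only a placeholder for an IPSO-style generator, and the alignment of centres by sorting is deliberately crude; everything here is an illustrative assumption.

```python
import numpy as np

def fuzzy_cmeans(X, c, m=2.0, iters=100, seed=0):
    """Plain fuzzy c-means: alternate centre and membership updates."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=len(X))            # membership matrix (n x c)
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        U = 1.0 / (d ** (2 / (m - 1)))
        U /= U.sum(axis=1, keepdims=True)
    return centers, U

rng = np.random.default_rng(1)
original = rng.normal(size=(200, 3))
# Placeholder for an IPSO-style generator: sample from the fitted Gaussian model
synthetic = rng.multivariate_normal(original.mean(0), np.cov(original.T), size=200)

c_orig, _ = fuzzy_cmeans(original, c=3)
c_syn, _ = fuzzy_cmeans(synthetic, c=3)
# Crude alignment: sort centre coordinates before comparing
print("centre shift:", np.linalg.norm(np.sort(c_orig, 0) - np.sort(c_syn, 0)))
```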


2021 ◽  
Author(s):  
Vikas Thammanna Gowda

Although k-Anonymity is a good way to publish microdata for research purposes, it still suffers from various attacks. Hence, many refinements of k-Anonymity have been proposed, such as l-Diversity and t-Closeness, with t-Closeness being one of the strictest privacy models. Satisfying t-Closeness for a lower value of t may yield equivalence classes with a high number of records, which results in greater information loss. For a higher value of t, equivalence classes are still prone to homogeneity, skewness and similarity attacks, because equivalence classes can be formed with fewer distinct sensitive attribute values and still satisfy the constraint t. In this paper, we introduce a new algorithm that overcomes the limitations of k-Anonymity and l-Diversity and yields equivalence classes of size k with greater diversity, in which the frequencies of a sensitive attribute (SA) value across all equivalence classes differ by at most one.
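The frequency condition suggests a round-robin style construction. The sketch below is one way to realise it (an assumption about the construction, not the paper's algorithm): deal each SA value's records out across ceil(n/k) classes in a single round-robin pass, so that value's per-class count differs by at most one.

```python
from collections import defaultdict

def balanced_classes(records, sa_col, k):
    """Partition records into ceil(n/k) equivalence classes so that each sensitive
    value's count per class differs by at most one."""
    by_value = defaultdict(list)
    for rec in records:
        by_value[rec[sa_col]].append(rec)

    n_classes = -(-len(records) // k)            # ceil(n / k)
    classes = [[] for _ in range(n_classes)]
    i = 0
    # deal each SA value's records consecutively in round-robin order
    for bucket in sorted(by_value.values(), key=len, reverse=True):
        for rec in bucket:
            classes[i % n_classes].append(rec)
            i += 1
    return classes

# Hypothetical usage: classes of size 3 over a toy table
table = [{"age": 20 + i, "disease": d} for i, d in
         enumerate(["flu", "flu", "flu", "HIV", "HIV", "cancer"])]
classes = balanced_classes(table, "disease", k=3)
```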


2021 ◽  
Vol 11 (12) ◽  
pp. 3164-3173
Author(s):  
R. Indhumathi ◽  
S. Sathiya Devi

Data sharing is essential in present-day biomedical research. A large quantity of medical information is gathered for different objectives of analysis and study, and because of the scale of these collections anonymity is essential: it is important to preserve privacy and prevent leakage of patients' sensitive information. Most anonymization methods, such as generalisation, suppression and perturbation, prevent information leakage at the cost of degrading the utility of the collected data; during data sanitization, utility is inevitably diminished. Privacy Preserving Data Publishing thus faces the main challenge of maintaining a trade-off between privacy and data utility. To address this issue, an efficient algorithm called Anonymization based on Improved Bucketization (AIB) is proposed, which increases the utility of published data while maintaining privacy. The Bucketization technique is used in this paper together with a clustering method. The proposed work is divided into four stages: (i) vertical and horizontal partitioning, (ii) assigning a sensitive index to the attributes in each cluster, (iii) verifying each cluster against a privacy threshold, and (iv) examining the Quasi Identifier (QI) for privacy breaches. To increase the utility of the published data, the threshold value is determined from the distribution of elements in each attribute, and the anonymization method is applied only to the specific QI element; as a result, data utility is improved. Finally, the evaluation results validate the proposed design and demonstrate that it is effective in improving data utility.
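A hedged sketch of the staged flow described above: derive a threshold from the distribution of sensitive values, check each bucket against it, and generalise a QI only where the check fails. The threshold rule, column names and generalisation step are illustrative choices, not the authors' exact design.

```python
from collections import Counter

def privacy_threshold(sensitive_values):
    """Illustrative rule: allow a single sensitive value to dominate a bucket no
    more than it dominates the whole data set."""
    freq = Counter(sensitive_values)
    return max(freq.values()) / len(sensitive_values)

def anonymise_buckets(buckets, qi_col, sa_col):
    """Generalise the chosen QI only in buckets that breach the threshold."""
    threshold = privacy_threshold([r[sa_col] for b in buckets for r in b])
    out = []
    for bucket in buckets:
        counts = Counter(r[sa_col] for r in bucket)
        breach = max(counts.values()) / len(bucket) > threshold
        rows = []
        for r in bucket:
            r = dict(r)
            if breach:
                r[qi_col] = "*"  # generalise only where the privacy check fails
            rows.append(r)
        out.append(rows)
    return out

# Hypothetical usage on two pre-clustered buckets
buckets = [
    [{"zip": "57001", "disease": "flu"}, {"zip": "57002", "disease": "flu"}],
    [{"zip": "57003", "disease": "flu"}, {"zip": "57004", "disease": "HIV"}],
]
published = anonymise_buckets(buckets, "zip", "disease")
```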


2019 ◽  
pp. 1518-1538
Author(s):  
Sowmyarani C. N. ◽  
Dayananda P.

Privacy attacks on individual records are a major concern in privacy preserving data publishing. An intruder who wants to know the private information of a particular person will first acquire background knowledge about that person. This background knowledge may be gained through publicly available information such as voter ID records or through social networks. By combining this background knowledge with the published data, the intruder may obtain the private information, causing a privacy attack on that person. There are many privacy attack models; the most popular ones are discussed in this chapter. The study of these attack models plays a significant role in the design of robust privacy preserving models.


Cybersecurity ◽  
2019 ◽  
Vol 2 (1) ◽  
Author(s):  
Qingfeng Chen ◽  
Xu Zhang ◽  
Ruchang Zhang
