A Novel Approach for Personalized Privacy Preserving Data Publishing with Multiple Sensitive Attributes

The Personalized Privacy has drawn a lot of attention from diverse magnitudes of the public and various functional units like bureau of statistics, and hospitals. A large number of data publishing models and methods have been proposed and most of them focused on single sensitive attribute. A few research papers marked the need for preserving privacy of data consisting of multiple sensitive attributes. Applying the existing methods such as k-anonymity, l-diversity directly for publishing multiple sensitive attributes would minimize the utility of the data. Moreover, personalization has not been studied in this dimension. In this paper, we present a publishing model that manages personalization for publishing data with multiple sensitive attributes. The model uses slicing technique supported by deterministic anonymization for quasi identifiers; generalization for categorical sensitive attributes; and fuzzy approach for numerical sensitive attributes based on diversity. We cap the belief of an adversary inferring a sensitive value in a published data set to as high as that of an inference based on public knowledge. The experiments were carried out on census dataset and synthetic datasets. The results ensure that the privacy is being safeguarded without any compromise on the utility of the data.

Download Full-text

Anonymization Based on Improved Bucketization (AIB): A Privacy-Preserving Data Publishing Technique for Improving Data Utility in Healthcare Data

Journal of Medical Imaging and Health Informatics ◽

10.1166/jmihi.2021.3901 ◽

2021 ◽

Vol 11 (12) ◽

pp. 3164-3173

Author(s):

R. Indhumathi ◽

S. Sathiya Devi

Keyword(s):

Medical Information ◽

Threshold Value ◽

Privacy Preserving ◽

Data Publishing ◽

Published Data ◽

Sensitive Information ◽

Data Utility ◽

Healthcare Data ◽

Privacy Preserving Data Publishing ◽

Horizontal Partitioning

Data sharing is essential in present biomedical research. A large quantity of medical information is gathered and for different objectives of analysis and study. Because of its large collection, anonymity is essential. Thus, it is quite important to preserve privacy and prevent leakage of sensitive information of patients. Most of the Anonymization methods such as generalisation, suppression and perturbation are proposed to overcome the information leak which degrades the utility of the collected data. During data sanitization, the utility is automatically diminished. Privacy Preserving Data Publishing faces the main drawback of maintaining tradeoff between privacy and data utility. To address this issue, an efficient algorithm called Anonymization based on Improved Bucketization (AIB) is proposed, which increases the utility of published data while maintaining privacy. The Bucketization technique is used in this paper with the intervention of the clustering method. The proposed work is divided into three stages: (i) Vertical and Horizontal partitioning (ii) Assigning Sensitive index to attributes in the cluster (iii) Verifying each cluster against privacy threshold (iv) Examining for privacy breach in Quasi Identifier (QI). To increase the utility of published data, the threshold value is determined based on the distribution of elements in each attribute, and the anonymization method is applied only to the specific QI element. As a result, the data utility has been improved. Finally, the evaluation results validated the design of paper and demonstrated that our design is effective in improving data utility.

Download Full-text

Analytical Study on Privacy Attack Models in Privacy Preserving Data Publishing

Cyber Law, Privacy, and Security ◽

10.4018/978-1-5225-8897-9.ch076 ◽

2019 ◽

pp. 1518-1538

Author(s):

Sowmyarani C. N. ◽

Dayananda P.

Keyword(s):

Private Information ◽

Background Knowledge ◽

Privacy Preserving ◽

Data Publishing ◽

Background Information ◽

Published Data ◽

Privacy Attack ◽

Privacy Preserving Data Publishing ◽

Available Information ◽

Attack Models

Privacy attack on individual records has great concern in privacy preserving data publishing. When an intruder who is interested to know the private information of particular person of his interest, will acquire background knowledge about the person. This background knowledge may be gained though publicly available information such as Voter's id or through social networks. Combining this background information with published data; intruder may get the private information causing a privacy attack of that person. There are many privacy attack models. Most popular attack models are discussed in this chapter. The study of these attack models plays a significant role towards the invention of robust Privacy preserving models.

Download Full-text

Privacy Preserving Data Publishing for Multiple Sensitive Attributes Based on Security Level

Information ◽

10.3390/info11030166 ◽

2020 ◽

Vol 11 (3) ◽

pp. 166

Author(s):

Yuelei Xiao ◽

Haiqi Li

Keyword(s):

Data Privacy ◽

Privacy Preserving ◽

Information Loss ◽

Experimental Results ◽

Data Publishing ◽

Security Level ◽

Sensitive Attribute ◽

Data Volume ◽

Security Levels ◽

Privacy Preserving Data Publishing

Privacy preserving data publishing has received considerable attention for publishing useful information while preserving data privacy. The existing privacy preserving data publishing methods for multiple sensitive attributes do not consider the situation that different values of a sensitive attribute may have different sensitivity requirements. To solve this problem, we defined three security levels for different sensitive attribute values that have different sensitivity requirements, and given an L s l -diversity model for multiple sensitive attributes. Following this, we proposed three specific greed algorithms based on the maximal-bucket first (MBF), maximal single-dimension-capacity first (MSDCF) and maximal multi-dimension-capacity first (MMDCF) algorithms and the maximal security-level first (MSLF) greed policy, named as MBF based on MSLF (MBF-MSLF), MSDCF based on MSLF (MSDCF-MSLF) and MMDCF based on MSLF (MMDCF-MSLF), to implement the L s l -diversity model for multiple sensitive attributes. The experimental results show that the three algorithms can greatly reduce the information loss of the published microdata, but their runtime is only a small increase, and their information loss tends to be stable with the increasing of data volume. And they can solve the problem that the information loss of MBF, MSDCF and MMDCF increases greatly with the increasing of sensitive attribute number.

Download Full-text

Duplication with Trapdoor Sensitive Attribute Values: A New Approach for Privacy Preserving Data Publishing

Procedia Technology ◽

10.1016/j.protcy.2012.10.118 ◽

2012 ◽

Vol 6 ◽

pp. 970-977

Author(s):

B.R. Purushothama ◽

B.B. Amberker

Keyword(s):

Privacy Preserving ◽

Data Publishing ◽

New Approach ◽

Sensitive Attribute ◽

Privacy Preserving Data Publishing

Download Full-text

f-Slip: An Efficient Privacy-Preserving Data Publishing Framework for 1: M Microdata with Multiple Sensitive Attributes.

10.21203/rs.3.rs-660451/v1 ◽

2021 ◽

Author(s):

Jayapradha J ◽

Prakash M

Keyword(s):

Privacy Preserving ◽

Vital Role ◽

Data Publishing ◽

Slip Model ◽

Correlation Attack ◽

Sensitive Attribute ◽

Utility Loss ◽

Privacy Preserving Data Publishing ◽

Loss Efficiency ◽

Attribute Correlation

Abstract Privacy of the individuals plays a vital role when a dataset is disclosed in public. Privacy-preserving data publishing is a process of releasing the anonymized dataset for various purposes of analysis and research. The data to be published contain several sensitive attributes such as diseases, salary, symptoms, etc. Earlier, researchers have dealt with datasets considering it would contain only one record for an individual [1:1 dataset], which is uncompromising in various applications. Later, many researchers concentrate on the dataset, where an individual has multiple records [1:M dataset]. In the paper, a model f-slip was proposed that can address the various attacks such as Background Knowledge (bk) attack, Multiple Sensitive attribute correlation attack (MSAcorr), Quasi-identifier correlation attack(QIcorr), Non-membership correlation attack(NMcorr) and Membership correlation attack(Mcorr) in 1:M dataset and the solutions for the attacks. In f -slip, the anatomization was performed to divide the table into two subtables consisting of i) quasi-identifier and ii) sensitive attributes. The correlation of sensitive attributes is computed to anonymize the sensitive attributes without breaking the linking relationship. Further, the quasi-identifier table was divided and k-anonymity was implemented on it. An efficient anonymization technique, frequency-slicing (f-slicing), was also developed to anonymize the sensitive attributes. The f -slip model is consistent as the number of records increases. Extensive experiments were performed on a real-world dataset Informs and proved that the f -slip model outstrips the state-of-the-art techniques in terms of utility loss, efficiency and also acquires an optimal balance between privacy and utility.

Download Full-text

Uncertain Data Privacy Protection Based on K-Anonymity via Anatomy

Advanced Engineering Forum ◽

10.4028/www.scientific.net/aef.6-7.64 ◽

2012 ◽

Vol 6-7 ◽

pp. 64-69 ◽

Cited By ~ 3

Author(s):

Xiang Min Ren ◽

Jing Yang ◽

Jian Pei Zhang ◽

Zong Fu Jia

Keyword(s):

Privacy Protection ◽

Data Privacy ◽

Uncertain Data ◽

Information Leakage ◽

Data Publishing ◽

Data Set ◽

Influence Matrix ◽

Sensitive Attribute ◽

Influence Degree ◽

Equivalent Class

In traditional database domain, k-anonymity is a hotspot in data publishing for privacy protection. In this paper, we study how to use k-anonymity in uncertain data set, use influence matrix of background knowledge to describe the influence degree of sensitive attribute produced by QI attributes and sensitive attribute itself, use BK(L,K)-clustering to present equivalent class with diversity, and a novel UDAK-anonymity model via anatomy is proposed for relational uncertain data. We will extend our ideas for handling how to solve privacy information leakage problem by using UDAK-anonymity algorithms in another paper.

Download Full-text

Privacy-Preserving Trajectory Data Publishing by Dynamic Anonymization with Bounded Distortion

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10020078 ◽

2021 ◽

Vol 10 (2) ◽

pp. 78

Author(s):

Songyuan Li ◽

Hui Tian ◽

Hong Shen ◽

Yingpeng Sang

Keyword(s):

Privacy Preservation ◽

Privacy Preserving ◽

Data Publishing ◽

Published Data ◽

Trajectory Data ◽

Data Utilization ◽

Data Set ◽

Guangzhou City ◽

Bounded Distortion ◽

Privacy Breaches

Publication of trajectory data that contain rich information of vehicles in the dimensions of time and space (location) enables online monitoring and supervision of vehicles in motion and offline traffic analysis for various management tasks. However, it also provides security holes for privacy breaches as exposing individual’s privacy information to public may results in attacks threatening individual’s safety. Therefore, increased attention has been made recently on the privacy protection of trajectory data publishing. However, existing methods, such as generalization via anonymization and suppression via randomization, achieve protection by modifying the original trajectory to form a publishable trajectory, which results in significant data distortion and hence a low data utility. In this work, we propose a trajectory privacy-preserving method called dynamic anonymization with bounded distortion. In our method, individual trajectories in the original trajectory set are mixed in a localized manner to form synthetic trajectory data set with a bounded distortion for publishing, which can protect the privacy of location information associated with individuals in the trajectory data set and ensure a guaranteed utility of the published data both individually and collectively. Through experiments conducted on real trajectory data of Guangzhou City Taxi statistics, we evaluate the performance of our proposed method and compare it with the existing mainstream methods in terms of privacy preservation against attacks and trajectory data utilization. The results show that our proposed method achieves better performance on data utilization than the existing methods using globally static anonymization, without trading off the data security against attacks.

Download Full-text

Anonymization on refining partition: Same privacy, more utility

Computer Science and Information Systems ◽

10.2298/csis141212052z ◽

2015 ◽

Vol 12 (4) ◽

pp. 1193-1216 ◽

Cited By ~ 1

Author(s):

Hong Zhu ◽

Shengli Tian ◽

Genyuan Du ◽

Meiyi Xie

Keyword(s):

Experimental Evaluation ◽

Privacy Preserving ◽

Data Publishing ◽

Data Utility ◽

Sensitive Attribute ◽

The Optimizing Model ◽

Privacy Preserving Data Publishing ◽

Initial Partition ◽

Optimizing Model

In privacy preserving data publishing, to reduce the correlation loss between sensitive attribute (SA) and non-sensitive attributes(NSAs) caused by anonymization methods (such as generalization, anatomy, slicing and randomization, etc.), the records with same NSAs values should be divided into same blocks to meet the anonymizing demands of ?-diversity. However, there are often many blocks (of the initial partition), in which there are more than ? records with different SA values, and the frequencies of different SA values are uneven. Therefore, anonymization on the initial partition causes more correlation loss. To reduce the correlation loss as far as possible, in this paper, an optimizing model is first proposed. Then according to the optimizing model, the refining partition of the initial partition is generated, and anonymization is applied on the refining partition. Although anonymization on refining partition can be used on top of any existing partitioning method to reduce the correlation loss, we demonstrate that a new partitioning method tailored for refining partition could further improve data utility. An experimental evaluation shows that our approach could efficiently reduce correlation loss.

Download Full-text

Analytical Study on Privacy Attack Models in Privacy Preserving Data Publishing

Security Solutions and Applied Cryptography in Smart Grid Communications - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-5225-1829-7.ch006 ◽

2016 ◽

pp. 98-116 ◽

Cited By ~ 1

Author(s):

Sowmyarani C. N. ◽

Dayananda P.

Keyword(s):

Private Information ◽

Background Knowledge ◽

Privacy Preserving ◽

Data Publishing ◽

Background Information ◽

Published Data ◽

Privacy Attack ◽

Privacy Preserving Data Publishing ◽

Available Information ◽

Attack Models

Download Full-text

Evaluation of proposed amalgamated anonymization approach

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v16.i3.pp1439-1446 ◽

2019 ◽

Vol 16 (3) ◽

pp. 1439

Author(s):

Deepak Narula ◽

Pardeep Kumar ◽

Shuchita Upadhyaya

Keyword(s):

Equivalence Class ◽

Class Size ◽

Information Loss ◽

Data Publishing ◽

Published Data ◽

Sensitive Information ◽

Electronic Data ◽

Current Scenario ◽

Privacy Preserving Data Publishing ◽

Modern Era

<p>In the current scenario of modern era, providing security to an individual is always a matter of concern when a huge volume of electronic data is gathering daily. Now providing security to the gathered data is not only a matter of concern but also remains a notable topic of research. The concept of Privacy Preserving Data Publishing (PPDP) defines accessing the published data without disclosing the non required information about an individual. Hence PPDP faces the problem of publishing useful data while keeping the privacy about sensitive information about an individual. A variety of techniques for anonymization has been found in literature, but suffers from different kind of problems in terms of data information loss, discernibility and average equivalence class size. This paper proposes amalgamated approach along with its verification with respect to information loss, value of discernibility and the value of average equivalence class size metric. The result have been found encouraging as compared to existing <em>k-</em>anonymity based algorithms such as Datafly, Mondrian and Incognito on various publically available datasets.</p>

Download Full-text