Anonymization Based on Improved Bucketization (AIB): A Privacy-Preserving Data Publishing Technique for Improving Data Utility in Healthcare Data

2021 ◽  
Vol 11 (12) ◽  
pp. 3164-3173
Author(s):  
R. Indhumathi ◽  
S. Sathiya Devi

Data sharing is essential in present-day biomedical research. Large quantities of medical information are gathered for different objectives of analysis and study, and because these collections are so large, anonymity is essential. It is therefore important to preserve privacy and prevent the leakage of patients' sensitive information. Most anonymization methods, such as generalization, suppression, and perturbation, are proposed to overcome information leaks but degrade the utility of the collected data: during data sanitization, utility is automatically diminished. The main drawback Privacy Preserving Data Publishing faces is maintaining the tradeoff between privacy and data utility. To address this issue, an efficient algorithm called Anonymization based on Improved Bucketization (AIB) is proposed, which increases the utility of published data while maintaining privacy. The Bucketization technique is used in this paper together with a clustering method. The proposed work is divided into four stages: (i) vertical and horizontal partitioning, (ii) assigning a sensitive index to attributes in each cluster, (iii) verifying each cluster against a privacy threshold, and (iv) examining each Quasi-Identifier (QI) for privacy breaches. To increase the utility of published data, the threshold value is determined from the distribution of elements in each attribute, and the anonymization method is applied only to the specific QI elements. As a result, data utility is improved. Finally, the evaluation results validate the design and demonstrate that it is effective in improving data utility.
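The AIB algorithm itself is not specified in this abstract; as a rough illustration of the bucketization idea, the following minimal Python sketch (the function names and the rarity-based scoring rule are hypothetical, not taken from the paper) assigns a frequency-based sensitive index within a cluster and checks the cluster against a privacy threshold:

```python
from collections import Counter

def sensitive_index(values):
    """Assign each distinct sensitive value an index based on its
    frequency in the cluster (rarer values are more identifying)."""
    counts = Counter(values)
    total = len(values)
    # Hypothetical scoring: rarity of each value within the cluster.
    return {v: 1 - c / total for v, c in counts.items()}

def violates_threshold(cluster_sa, threshold):
    """Check a cluster of sensitive-attribute values against a privacy
    threshold: flag it if any single value dominates the cluster."""
    counts = Counter(cluster_sa)
    return max(counts.values()) / len(cluster_sa) > threshold

# Example: a cluster of diagnoses from a synthetic health table.
cluster = ["flu", "flu", "flu", "cancer"]
print(violates_threshold(cluster, 0.5))  # True: "flu" is 75% of the cluster
```

A cluster that fails such a check would then be the one selected for anonymization, which is what limits the utility loss to specific QI elements.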

2019 ◽  
pp. 1518-1538
Author(s):  
Sowmyarani C. N. ◽  
Dayananda P.

Privacy attacks on individual records are a major concern in privacy preserving data publishing. An intruder who wants to learn the private information of a particular person will first acquire background knowledge about that person. This background knowledge may be gained through publicly available information, such as a voter ID, or through social networks. By combining this background information with published data, the intruder may obtain the private information, causing a privacy attack on that person. There are many privacy attack models; the most popular are discussed in this chapter. The study of these attack models plays a significant role in the invention of robust privacy preserving models.


Author(s):  
Wei Chang ◽  
Jie Wu

Many smartphone-based applications need microdata, but publishing a microdata table may leak respondents' privacy. Conventional research on privacy-preserving data publishing focuses on providing identical privacy protection to all data requesters. However, rather than staying within a small coterie, information usually propagates from friend to friend. The authors study the privacy-preserving data publishing problem on a mobile social network. Along a propagation path, a series of tables is locally created at each participant, and the tables' privacy levels should be gradually enhanced. However, the tradeoff between these tables' overall utility and their individual privacy requirements is not trivial: any inappropriate sanitization operation under a lower privacy requirement may cause dramatic utility loss on the subsequent tables. To solve the problem, the authors propose an approximation algorithm that previews future privacy requirements. Extensive results show that this approach successfully increases the overall data utility and meets the strengthening privacy requirements.


2010 ◽  
Vol 45 (1) ◽  
pp. 151-159 ◽  
Author(s):  
Michal Sramka

Many databases contain data about individuals that are valuable for research, marketing, and decision making. Sharing or publishing data about individuals is, however, prone to privacy attacks, breaches, and disclosures. The concern here is about individuals' privacy: keeping the sensitive information about individuals private to them. Data mining in this setting has been shown to be a powerful tool to breach privacy and make disclosures. In contrast, data mining can also be used in practice to aid data owners in deciding how to share and publish their databases. We present and discuss the role and uses of data mining in these scenarios and also briefly discuss other approaches to private data analysis.


2015 ◽  
Vol 12 (4) ◽  
pp. 1193-1216 ◽  
Author(s):  
Hong Zhu ◽  
Shengli Tian ◽  
Genyuan Du ◽  
Meiyi Xie

In privacy preserving data publishing, to reduce the correlation loss between the sensitive attribute (SA) and non-sensitive attributes (NSAs) caused by anonymization methods (such as generalization, anatomy, slicing, and randomization), records with the same NSA values should be placed in the same blocks to meet the anonymization demands of ℓ-diversity. However, there are often many blocks (of the initial partition) in which there are more than ℓ records with different SA values, and the frequencies of the different SA values are uneven. Therefore, anonymization on the initial partition causes more correlation loss. To reduce the correlation loss as far as possible, an optimizing model is first proposed in this paper. Then, according to the optimizing model, a refined partition of the initial partition is generated, and anonymization is applied to the refined partition. Although anonymization on the refined partition can be used on top of any existing partitioning method to reduce correlation loss, we demonstrate that a new partitioning method tailored for refined partitioning can further improve data utility. An experimental evaluation shows that our approach efficiently reduces correlation loss.
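The abstract does not give the paper's partitioning details; the following sketch (hypothetical names, distinct ℓ-diversity only) illustrates the kind of block-level checks involved: whether a block is ℓ-diverse, and how uneven its SA frequencies are, which is what drives the extra correlation loss described above:

```python
from collections import Counter

def is_l_diverse(block, l):
    """Distinct ℓ-diversity: the block's sensitive attribute must take
    at least ℓ distinct values."""
    return len(set(block)) >= l

def sa_skew(block):
    """Frequency skew of sensitive values in a block: ratio of the most
    frequent value's count to the least frequent one's. A skew near 1
    means even frequencies, i.e. less correlation loss after anonymization."""
    counts = Counter(block).values()
    return max(counts) / min(counts)

blocks = [["flu", "hiv", "flu", "cold"], ["flu", "flu", "flu"]]
print([is_l_diverse(b, 3) for b in blocks])  # [True, False]
print(sa_skew(blocks[0]))                    # 2.0: "flu" twice vs. others once
```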




Author(s):  
Deepak Narula ◽  
Pardeep Kumar ◽  
Shuchita Upadhyaya

In the modern era, providing security to an individual is always a matter of concern when a huge volume of electronic data is gathered daily. Securing the gathered data is not only a matter of concern but also remains a notable topic of research. The concept of Privacy Preserving Data Publishing (PPDP) is to allow access to the published data without disclosing unrequired information about an individual. Hence, PPDP faces the problem of publishing useful data while keeping sensitive information about individuals private. A variety of anonymization techniques can be found in the literature, but they suffer from different kinds of problems in terms of data information loss, discernibility, and average equivalence class size. This paper proposes an amalgamated approach and verifies it with respect to information loss, the discernibility value, and the average equivalence class size metric. The results are encouraging compared to existing *k*-anonymity based algorithms such as Datafly, Mondrian, and Incognito on various publicly available datasets.
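The discernibility and average equivalence class size metrics named here are standard in the k-anonymity literature; a minimal Python sketch of the two (function names are ours, the formulas are the usual ones), computed over generalized quasi-identifier tuples:

```python
from collections import Counter

def avg_equiv_class_size(qi_tuples, k):
    """C_avg metric: |T| / (number of equivalence classes * k).
    Values close to 1 mean classes are near the minimum size k."""
    classes = Counter(qi_tuples)
    return len(qi_tuples) / (len(classes) * k)

def discernibility(qi_tuples):
    """Discernibility metric: sum of squared equivalence-class sizes.
    Larger classes hide records better but cost more utility."""
    return sum(c * c for c in Counter(qi_tuples).values())

# Synthetic 2-anonymous table keyed by generalized quasi-identifiers.
table = [("3*", "M"), ("3*", "M"), ("4*", "F"), ("4*", "F"), ("4*", "F")]
print(avg_equiv_class_size(table, k=2))  # 5 / (2 * 2) = 1.25
print(discernibility(table))             # 2**2 + 3**2 = 13
```

Lower values of both metrics mean the anonymized table retains more utility, which is the axis on which Datafly, Mondrian, and Incognito are typically compared.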


2021 ◽  
Author(s):  
Wen-Yang Lin ◽  
Jie-Teng Wang

BACKGROUND Increasingly, spontaneous reporting systems (SRS) have been established to collect adverse drug events and foster research on ADR detection and analysis. SRS data contain personal information, so their publication requires data anonymization to prevent the disclosure of individual privacy. We previously proposed a privacy model called MS(k, θ*)-bounding and the associated MS-Anonymization algorithm to anonymize SRS data. In the real world, SRS data are usually released periodically, e.g., FAERS, to accommodate newly collected adverse drug events. Different anonymized releases of SRS data available to an attacker may thwart our single-release method, MS(k, θ*)-bounding. OBJECTIVE We investigate the privacy threat caused by periodical releases of SRS data and propose anonymization methods that prevent the disclosure of personal private information while maintaining the utility of the published data. METHODS We identify some potential attacks on periodical releases of SRS data, namely BFL-attacks, which are mainly caused by follow-up cases. We present a new privacy model called PPMS(k, θ*)-bounding and propose the associated PPMS-Anonymization algorithm along with two improvements, PPMS+-Anonymization and PPMS++-Anonymization. Empirical evaluations were performed using 32 selected FAERS quarterly datasets, from 2004Q1 to 2011Q4. The performance of the three proposed versions of PPMS-Anonymization was compared against MS-Anonymization in several respects: data distortion, measured by Normalized Information Loss (NIS); privacy risk of the anonymized data, measured by Dangerous Identity Ratio (DIR) and Dangerous Sensitivity Ratio (DSR); and data utility, measured by the bias of signal counting and strength (PRR). RESULTS The results show that our new method can prevent privacy disclosure for periodical releases of SRS data with a reasonable sacrifice of data utility and acceptable deviation in the strength of ADR signals. The best version, PPMS++-Anonymization, achieves nearly the same quality as MS-Anonymization in both privacy protection and data utility. CONCLUSIONS The proposed PPMS(k, θ*)-bounding model and PPMS-Anonymization algorithm effectively anonymize SRS datasets in the periodical data publishing scenario, preventing the series of releases from disclosing personal sensitive information through BFL-attacks while maintaining data utility for ADR signal detection.
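The PRR (proportional reporting ratio) used above to measure ADR signal strength has a standard 2×2 contingency-table form; a minimal sketch with toy counts (not taken from FAERS or the paper):

```python
def prr(a, b, c, d):
    """Proportional Reporting Ratio for a (drug, adverse event) pair:
    a = reports of the drug with the event, b = the drug without the event,
    c = other drugs with the event, d = other drugs without it.
    PRR > 2 (with enough cases) is a common signal-detection heuristic."""
    return (a / (a + b)) / (c / (c + d))

# Toy counts from a hypothetical SRS quarter.
print(round(prr(20, 80, 40, 860), 2))  # (20/100) / (40/900) = 4.5
```

Anonymization distorts the counts a, b, c, d, so the deviation of PRR before and after anonymization is a natural utility measure for SRS data.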


2021 ◽  
Vol 11 (22) ◽  
pp. 10740
Author(s):  
Jong Kim

There has recently been an increasing need for the collection and sharing of microdata containing information regarding an individual entity. Because microdata typically contain sensitive information about an individual, releasing them directly for public use may violate existing privacy requirements. Thus, extensive studies have been conducted on privacy-preserving data publishing (PPDP), which ensures that any released microdata satisfy the privacy policy requirements. Most existing privacy-preserving data publishing algorithms consider a scenario in which a data publisher, receiving a request for the release of data containing personal information, anonymizes the data prior to publishing, a process that is usually conducted offline. However, with the increasing demand for data sharing among various parties, it is more desirable to integrate the data anonymization functionality into existing systems that support online query processing. Thus, we developed a novel scheme that efficiently anonymizes query results on the fly and thereby supports efficient online privacy-preserving data publishing. In particular, given a user's query, the proposed approach effectively estimates the generalization level of each quasi-identifier attribute, achieving the k-anonymity property in the query result datasets based on statistical information, without applying k-anonymity to all actual datasets, which is a costly procedure. The experimental results show that significant gains in processing time can be achieved through the proposed method.
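The paper's statistical estimation procedure is not detailed in this abstract; as a simplified illustration of choosing a generalization level that yields k-anonymity, the following sketch (the numeric range hierarchy and function names are hypothetical) picks the smallest level at which every generalized bin holds at least k records:

```python
from collections import Counter

def generalize(value, level):
    """Hypothetical hierarchy for a numeric quasi-identifier:
    level 0 keeps the value, each further level widens the range 10x."""
    if level == 0:
        return str(value)
    width = 10 ** level
    low = (value // width) * width
    return f"{low}-{low + width - 1}"

def min_level_for_k(values, k, max_level=3):
    """Pick the smallest generalization level at which every
    generalized bin contains at least k records."""
    for level in range(max_level + 1):
        bins = Counter(generalize(v, level) for v in values)
        if min(bins.values()) >= k:
            return level
    return max_level

ages = [23, 27, 29, 31, 34, 38, 41, 45]
level = min_level_for_k(ages, k=4)
print(level, sorted({generalize(a, level) for a in ages}))  # 2 ['0-99']
```

The scheme described above would estimate such a level from precomputed statistics (e.g., attribute histograms) rather than scanning the actual records, which is where the processing-time gain comes from.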



