Mutual Correlation-Based Anonymization for Privacy Preserving Medical Data Publishing

Author(s):  
Ashoka Kukkuvada ◽  
Poornima Basavaraju

Industry is currently focused on managing, retrieving, and securing massive amounts of data. Privacy preservation is therefore a significant concern for organizations that publish or share personal data for analysis. In this chapter, the authors present an approach that anonymizes the data using the information gain of the quasi attributes with respect to the sensitive attributes. Information gain indicates how useful an attribute is in classifying the data elements and is treated here as a two-way correlation among attributes. The authors show that the proposed approach preserves data utility better and has lower complexity than earlier methods.
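
The information gain of a quasi attribute with respect to a sensitive attribute can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the record layout (a list of dictionaries), and the example attribute names `zip`, `gender`, and `disease` are assumptions made for illustration only.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in Counter(values).values())


def information_gain(records, quasi_attr, sensitive_attr):
    """Reduction in entropy of the sensitive attribute once the
    quasi attribute is known: H(S) - H(S | Q)."""
    sensitive = [r[sensitive_attr] for r in records]
    # Group sensitive values by the value of the quasi attribute.
    groups = {}
    for r in records:
        groups.setdefault(r[quasi_attr], []).append(r[sensitive_attr])
    conditional = sum(
        (len(g) / len(records)) * entropy(g) for g in groups.values()
    )
    return entropy(sensitive) - conditional


# Hypothetical toy table: "zip" fully determines "disease", "gender" does not.
records = [
    {"zip": "A", "gender": "M", "disease": "flu"},
    {"zip": "A", "gender": "F", "disease": "flu"},
    {"zip": "B", "gender": "M", "disease": "cold"},
    {"zip": "B", "gender": "F", "disease": "cold"},
]
gains = {q: information_gain(records, q, "disease") for q in ("zip", "gender")}
```

In this toy table the quasi attribute with the higher gain (`zip`) reveals more about the sensitive attribute, so a scheme built on this measure would presumably treat it as the stronger candidate for anonymization.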


Author(s):  
Nancy Victor ◽  
Daphne Lopez

Data privacy plays an important part in today's digital world, where information is gathered at exceptional rates from different sources. Privacy preserving data publishing (PPDP) refers to the process of publishing personal data without compromising the privacy of individuals in any manner. A variety of approaches have been devised to protect consumer privacy by applying traditional anonymization mechanisms. These mechanisms are not well suited to Big Data, however, because the data generated today is not purely structured. Data generated at very high velocities from various sources includes unstructured and semi-structured information and is therefore very difficult to process using traditional mechanisms. This chapter focuses on the challenges of Big Data, on PPDM and PPDP techniques for Big Data, and on how well these techniques can be scaled to process historical and real-time data together using the Lambda architecture. A distributed framework for privacy preservation in Big Data that incorporates natural language processing techniques is also proposed in this chapter.


2021 ◽  
Vol 11 (12) ◽  
pp. 3164-3173
Author(s):  
R. Indhumathi ◽  
S. Sathiya Devi

Data sharing is essential in present-day biomedical research. A large quantity of medical information is gathered for different objectives of analysis and study. Because the collection is so large, anonymity is essential, and it is important to preserve privacy and prevent leakage of patients' sensitive information. Anonymization methods such as generalisation, suppression and perturbation have been proposed to prevent information leaks, but they degrade the utility of the collected data, because sanitization inevitably diminishes utility. The main difficulty of Privacy Preserving Data Publishing is maintaining the tradeoff between privacy and data utility. To address this issue, an efficient algorithm called Anonymization based on Improved Bucketization (AIB) is proposed, which increases the utility of published data while maintaining privacy. The Bucketization technique is used in this paper in combination with a clustering method. The proposed work is divided into four stages: (i) vertical and horizontal partitioning, (ii) assigning a sensitive index to attributes in the cluster, (iii) verifying each cluster against a privacy threshold, and (iv) examining the Quasi Identifier (QI) for privacy breaches. To increase the utility of published data, the threshold value is determined from the distribution of elements in each attribute, and the anonymization method is applied only to the specific QI element. As a result, data utility is improved. Finally, the evaluation results validate the design and demonstrate that it is effective in improving data utility.
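
The step of verifying each cluster against a privacy threshold can be sketched as follows. This is not the AIB algorithm, whose details the abstract does not give. The sketch assumes, purely for illustration, that a bucket breaches privacy when a single sensitive value accounts for more than a chosen share of it; the function name `violating_buckets` and the fixed threshold of 0.5 are hypothetical.

```python
from collections import Counter


def violating_buckets(buckets, threshold):
    """Return the indices of buckets in which the most frequent
    sensitive value exceeds the given share of the bucket."""
    flagged = []
    for index, sensitive_values in enumerate(buckets):
        if not sensitive_values:
            continue  # an empty bucket discloses nothing
        top_count = Counter(sensitive_values).most_common(1)[0][1]
        if top_count / len(sensitive_values) > threshold:
            flagged.append(index)
    return flagged


# Hypothetical buckets holding only the sensitive column of each cluster.
buckets = [
    ["flu", "flu", "flu", "cold"],       # "flu" makes up 75% of the bucket
    ["flu", "cold", "asthma", "ulcer"],  # every value makes up 25%
]
flagged = violating_buckets(buckets, threshold=0.5)
```

Only the flagged buckets would then need further anonymization, which mirrors the abstract's point that limiting anonymization to specific elements helps retain utility.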


Author(s):  
Wei Chang ◽  
Jie Wu

Many smartphone-based applications need microdata, but publishing a microdata table may leak respondents' private information. Conventional research on privacy-preserving data publishing focuses on providing identical privacy protection to all data requesters. Information, however, is rarely confined to a small circle and usually propagates from friend to friend, so the authors study the privacy-preserving data publishing problem on a mobile social network. Along a propagation path, a series of tables is created locally at each participant, and the tables' privacy levels should be gradually enhanced. However, the tradeoff between these tables' overall utility and their individual privacy requirements is not trivial: any inappropriate sanitization operation under a lower privacy requirement may cause dramatic utility loss in the subsequent tables. To solve the problem, the authors propose an approximation algorithm that previews the future privacy requirements. Extensive results show that this approach successfully increases the overall data utility and meets the strengthening privacy requirements.


Author(s):  
Yu Niu ◽  
Ji-Jiang Yang ◽  
Qing Wang

With the pervasive use of Electronic Medical Records (EMR) and telemedicine technologies, more and more digital healthcare data is accumulated from multiple sources. As healthcare data is valuable for both commercial and scientific research, the demand for sharing it has been growing rapidly. Nevertheless, healthcare data normally contains a large amount of personal information, and sharing it directly would pose a serious threat to patient privacy. This paper proposes a privacy preserving framework for medical data sharing with a view to practical application. The framework focuses on three key issues of privacy protection during data sharing: privacy definition and detection, privacy policy management, and privacy preserving data publishing. A case study of privacy preserving publishing of Chinese Electronic Medical Records (EMR) is implemented based on the proposed framework. Algorithms for Chinese free-text EMR segmentation, Protected Health Information (PHI) extraction, and K-anonymity-based PHI anonymization are proposed for the respective components. Real-life data from hospitals is used to evaluate the performance of the proposed framework and system.
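
The K-anonymity property mentioned above can be checked with a short sketch: a table is k-anonymous when every combination of quasi-identifier values is shared by at least k records. This is a generic check, not the paper's PHI anonymization algorithm; the function name `is_k_anonymous` and the example columns `age_range` and `zip_prefix` are assumptions for illustration.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs in at least k records."""
    counts = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return all(count >= k for count in counts.values())


# Hypothetical table whose quasi-identifiers have already been generalized.
table = [
    {"age_range": "30-39", "zip_prefix": "100**", "diagnosis": "flu"},
    {"age_range": "30-39", "zip_prefix": "100**", "diagnosis": "asthma"},
    {"age_range": "40-49", "zip_prefix": "200**", "diagnosis": "flu"},
    {"age_range": "40-49", "zip_prefix": "200**", "diagnosis": "ulcer"},
]
two_anonymous = is_k_anonymous(table, ["age_range", "zip_prefix"], k=2)
three_anonymous = is_k_anonymous(table, ["age_range", "zip_prefix"], k=3)
```

Each quasi-identifier group in this toy table holds two records, so the table satisfies the check for k = 2 but not for k = 3.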


2020 ◽  
Vol 17 (9) ◽  
pp. 4623-4626
Author(s):  
Nisha Nehra ◽  
Suneet Kumar

Nowadays, due to the sheer amount of data, its complexity, and the rate at which it is generated, the traditional algorithms for privacy preservation in relational data publishing cannot ensure privacy as efficiently for transactional data. Over the last two decades, interest in better privacy preserving schemes for data publishing has also increased. The literature contains a number of algorithms, schemes, models and techniques that ensure privacy against identity disclosure and attribute disclosure attacks. This paper is a comprehensive survey of past work on anonymization for privacy preserving transactional data publishing.


Author(s):  
Salheddine Kabou ◽  
Sidi mohamed Benslimane ◽  
Mhammed Mosteghanemi

Many organizations, especially small and medium business (SMB) enterprises, need to collect and share data containing personal information. The privacy of this data must be preserved before it is outsourced to the commercial public. Privacy preserving data publishing (PPDP) refers to the process of publishing useful information while preserving data privacy. A variety of approaches have been proposed to ensure privacy by applying traditional anonymization models, which focus only on a single publication of a dataset. In practical applications, data publishing is more complicated: organizations publish multiple times for different recipients, or republish after modifications to provide up-to-date data. Privacy preserving dynamic data publication (PPDDP) is a new process in privacy preservation that addresses the anonymization of data for different purposes. In this survey, the author systematically evaluates and summarizes different studies of PPDDP, clarifies the differences and requirements between the scenarios that can exist, and proposes future research directions.

