Privacy Preserving Big Data Publishing

2018 ◽

pp. 47-70 ◽

Cited By ~ 3

Author(s):

Nancy Victor ◽

Daphne Lopez

Keyword(s):

Big Data ◽

Language Processing ◽

Data Privacy ◽

Privacy Preservation ◽

Personal Data ◽

Privacy Preserving ◽

Data Publishing ◽

Time Data ◽

Digital World ◽

Distributed Framework

Data privacy plays a significant part in today's digital world, where information is gathered at exceptional rates from different sources. Privacy preserving data publishing (PPDP) refers to the process of publishing personal data without compromising the privacy of individuals in any manner. A variety of approaches have been devised to protect consumer privacy by applying traditional anonymization mechanisms. But these mechanisms are not well suited for Big Data, as the data generated nowadays is not only structured. Data generated at very high velocities from various sources includes unstructured and semi-structured information, and thus becomes very difficult to process using traditional mechanisms. This chapter focuses on the various challenges with Big Data, on privacy preserving data mining (PPDM) and PPDP techniques for Big Data, and on how well they can be scaled for processing both historical and real-time data together using the Lambda architecture. A distributed framework for privacy preservation in Big Data that incorporates natural language processing techniques is also proposed in this chapter.


Mutual Correlation-Based Anonymization for Privacy Preserving Medical Data Publishing

Handbook of Research on Information Security in Biomedical Signal Processing - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-5225-5152-2.ch016 ◽

2018 ◽

pp. 304-319

Author(s):

Ashoka Kukkuvada ◽

Poornima Basavaraju

Keyword(s):

Privacy Preservation ◽

Information Gain ◽

Personal Data ◽

Privacy Preserving ◽

Medical Data ◽

Innovative Approach ◽

Data Publishing ◽

Mutual Correlation ◽

Data Utility ◽

Data Elements

Currently the industry is focused on managing, retrieving, and securing massive amounts of data. Hence, privacy preservation is a significant concern for organizations that publish or share personal data for vernacular analysis. In this chapter, the authors present an innovative approach that anonymizes the data using the information gain of the quasi-identifier attributes with respect to the sensitive attributes. Information gain indicates how useful an attribute is in classifying the data elements and reflects a two-way correlation among attributes. The authors show that the proposed approach preserves better data utility and has lower complexity than earlier methods.
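To make the central quantity in this abstract concrete, the following is a minimal sketch of information gain of a quasi-identifier attribute with respect to a sensitive attribute. It is not the authors' algorithm: the record layout, the attribute names (`age_band`, `zip`, `diagnosis`), and the toy values are all assumptions chosen for illustration.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def information_gain(records, quasi_attr, sensitive_attr):
    """Entropy of the sensitive attribute minus its entropy conditioned on quasi_attr."""
    sensitive = [r[sensitive_attr] for r in records]
    groups = {}
    for r in records:
        groups.setdefault(r[quasi_attr], []).append(r[sensitive_attr])
    conditional = sum(len(g) / len(records) * entropy(g) for g in groups.values())
    return entropy(sensitive) - conditional


# Hypothetical toy table; attribute names and values are illustrative only.
records = [
    {"age_band": "20-29", "zip": "476**", "diagnosis": "flu"},
    {"age_band": "20-29", "zip": "479**", "diagnosis": "flu"},
    {"age_band": "30-39", "zip": "476**", "diagnosis": "asthma"},
    {"age_band": "30-39", "zip": "479**", "diagnosis": "asthma"},
]

gain_age = information_gain(records, "age_band", "diagnosis")
gain_zip = information_gain(records, "zip", "diagnosis")
```

In this toy table `age_band` fully determines the diagnosis while `zip` tells nothing about it, so an approach driven by information gain would treat the two attributes very differently when deciding what to generalize.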


D2D Big Data Privacy-Preserving Framework Based on (a, k)-Anonymity Model

Mathematical Problems in Engineering ◽

10.1155/2019/2076542 ◽

2019 ◽

Vol 2019 ◽

pp. 1-11 ◽

Cited By ~ 1

Author(s):

Jie Wang ◽

Hongtao Li ◽

Feng Guo ◽

Wenyin Zhang ◽

Yifeng Cui

Keyword(s):

Big Data ◽

Private Information ◽

Data Privacy ◽

Privacy Preservation ◽

Computing Time ◽

Privacy Preserving ◽

D2d Communication ◽

Group Data ◽

Big Data Privacy ◽

Daunting Challenge

As a novel and promising technology for 5G networks, device-to-device (D2D) communication has garnered a significant amount of research interest because of its rapid sharing and high delivery accuracy as well as its variety of applications and services. Big data technology offers unprecedented opportunities and poses a daunting challenge to D2D communication and sharing, where the data often contain private information concerning users or organizations and thus are at risk of being leaked. Privacy preservation is necessary for D2D services but has not been extensively studied. In this paper, we propose an (a, k)-anonymity privacy-preserving framework for D2D big data deployed on MapReduce. First, we provide a framework for D2D big data sharing and analyze the threat model. Then, in our privacy-preserving framework, we adopt (a, k)-anonymity as the privacy-preserving model for D2D big data and use distributed MapReduce to classify and group data for massive datasets. The results of experiments and theoretical analysis show that our privacy-preserving algorithm deployed on MapReduce is effective for D2D big data privacy protection, with less information loss and computing time.
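The abstract does not spell out the (a, k)-anonymity condition, so the sketch below assumes the commonly used definition: every group of records sharing the same quasi-identifier values has at least k members, and no sensitive value accounts for more than a fraction a of any group. It is a single-machine check only, not the paper's MapReduce algorithm, and the function name, attribute names, and table are hypothetical.

```python
from collections import Counter, defaultdict


def satisfies_alpha_k(records, quasi_ids, sensitive_attr, alpha, k):
    """Return True if the table meets the assumed (alpha, k)-anonymity condition."""
    groups = defaultdict(list)
    for r in records:
        # Records with identical quasi-identifier values form one equivalence class.
        groups[tuple(r[q] for q in quasi_ids)].append(r[sensitive_attr])
    for values in groups.values():
        if len(values) < k:
            return False  # k-anonymity part violated
        top_count = Counter(values).most_common(1)[0][1]
        if top_count / len(values) > alpha:
            return False  # a sensitive value is too dominant in this class
    return True


# Hypothetical, already generalized table.
table = [
    {"age": "20-29", "zip": "476**", "disease": "flu"},
    {"age": "20-29", "zip": "476**", "disease": "asthma"},
    {"age": "30-39", "zip": "479**", "disease": "flu"},
    {"age": "30-39", "zip": "479**", "disease": "flu"},
]

strict = satisfies_alpha_k(table, ["age", "zip"], "disease", alpha=0.5, k=2)
loose = satisfies_alpha_k(table, ["age", "zip"], "disease", alpha=1.0, k=2)
```

Here the table is 2-anonymous, yet the second group reveals that everyone in it has the flu, so the check fails at alpha = 0.5. That gap is the reason for pairing the frequency bound with k.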


A Survey on Privacy Preserving Dynamic Data Publishing

Research Anthology on Privatizing and Securing Data ◽

10.4018/978-1-7998-8954-0.ch079 ◽

2021 ◽

pp. 1635-1657

Author(s):

Salheddine Kabou ◽

Sidi mohamed Benslimane ◽

Mhammed Mosteghanemi

Keyword(s):

Data Privacy ◽

Privacy Preservation ◽

Personal Information ◽

Privacy Preserving ◽

Data Publishing ◽

Future Research ◽

Dynamic Data ◽

Practical Applications ◽

New Process ◽

Future Research Directions

Many organizations, especially small and medium business (SMB) enterprises, need to collect and share data containing personal information. The privacy of this data must be preserved before it is outsourced to the commercial public. Privacy preserving data publishing (PPDP) refers to the process of publishing useful information while preserving data privacy. A variety of approaches have been proposed to ensure privacy by applying traditional anonymization models, which focus only on the single publication of datasets. In practical applications, data publishing is more complicated: organizations publish multiple times for different recipients, or republish after modifications to provide up-to-date data. Privacy preserving dynamic data publication (PPDDP) is a new process in privacy preservation that addresses the anonymization of data for different purposes. In this survey, the authors systematically evaluate and summarize different studies on PPDDP, clarify the differences and requirements among the scenarios that can exist, and propose future research directions.


A Practical Privacy-Preserving Publishing Mechanism Based on Personalized k-Anonymity and Temporal Differential Privacy for Wearable IoT Applications

Symmetry ◽

10.3390/sym13061043 ◽

2021 ◽

Vol 13 (6) ◽

pp. 1043

Author(s):

Junqi Guo ◽

Minghui Yang ◽

Boxin Wan

Keyword(s):

Data Privacy ◽

Differential Privacy ◽

Rapid Development ◽

Wearable Devices ◽

Privacy Preserving ◽

Physiological Signals ◽

Data Publishing ◽

Time Data ◽

Daily Lives ◽

Whole Process

With the rapid development of the Internet of Things (IoT), wearable devices have become ubiquitous and interconnected in daily lives. Because wearable devices collect, transmit, and monitor humans’ physiological signals, data privacy is a concern and should be fully protected throughout the whole process. However, the existing privacy protection methods are insufficient. In this paper, we propose a practical privacy-preserving mechanism for physiological signals collected by intelligent wearable devices. In the data acquisition and transmission stage, we employ existing asymmetric encryption-based methods. In the data publishing stage, we propose a new model based on the combination and optimization of k-anonymity and differential privacy. An entropy-based personalized k-anonymity algorithm is proposed to improve performance when processing static and long-term data. Moreover, we use the symmetry of differential privacy and propose a temporal differential privacy mechanism for real-time data to suppress privacy leakage while data is updated. It is proved theoretically that the combination of the two algorithms is reasonable. Finally, we use smart bracelets as an example to verify the performance of our mechanism. The experimental results show that personalized k-anonymity improves the security index by up to 6.25% compared with traditional k-anonymity, and the grouping results are more centralized. Moreover, temporal differential privacy effectively reduces the amount of information exposed, which protects the privacy of IoT-based users.
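As background for the differential privacy component, here is a minimal sketch of the standard Laplace mechanism applied to the mean of a batch of readings. It is not the paper's temporal differential privacy mechanism; the function names, the clipping bounds of 40 to 200 beats per minute, and the sample readings are assumptions made for this example.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_mean(readings, epsilon, lower=40.0, upper=200.0, rng=None):
    """Release the mean of clipped readings with epsilon-differential privacy.

    The bounds are assumed, not taken from the paper. With a fixed number of
    readings, changing one reading moves the mean by at most (upper - lower) / n.
    """
    rng = rng or random.Random()
    clipped = [min(max(x, lower), upper) for x in readings]
    sensitivity = (upper - lower) / len(clipped)
    return sum(clipped) / len(clipped) + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical heart-rate readings from a smart bracelet.
readings = [72, 75, 78, 75]
noisy_mean = private_mean(readings, epsilon=1.0, rng=random.Random(0))
```

A smaller epsilon gives a larger noise scale and therefore stronger privacy at the cost of accuracy. With only four readings the noise dominates, which is why such releases are usually made over larger batches.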


A Survey on Privacy Preserving Dynamic Data Publishing

International Journal of Organizational and Collective Intelligence ◽

10.4018/ijoci.2018100101 ◽

2018 ◽

Vol 8 (4) ◽

pp. 1-20 ◽

Cited By ~ 2

Author(s):

Salheddine Kabou ◽

Sidi mohamed Benslimane ◽

Mhammed Mosteghanemi

Keyword(s):

Data Privacy ◽

Privacy Preservation ◽

Personal Information ◽

Privacy Preserving ◽

Data Publishing ◽

Future Research ◽

Research Directions ◽

Dynamic Data ◽

Practical Applications ◽

Future Research Directions

Many organizations, especially small and medium business (SMB) enterprises, need to collect and share data containing personal information. The privacy of this data must be preserved before it is outsourced to the commercial public. Privacy preserving data publishing (PPDP) refers to the process of publishing useful information while preserving data privacy. A variety of approaches have been proposed to ensure privacy by applying traditional anonymization models, which focus only on the single publication of datasets. In practical applications, data publishing is more complicated: organizations publish multiple times for different recipients, or republish after modifications to provide up-to-date data. Privacy preserving dynamic data publication (PPDDP) is a new process in privacy preservation that addresses the anonymization of data for different purposes. In this survey, the authors systematically evaluate and summarize different studies on PPDDP, clarify the differences and requirements among the scenarios that can exist, and propose future research directions.


Mutual Correlation-Based Anonymization for Privacy Preserving Medical Data Publishing

Censorship, Surveillance, and Privacy ◽

10.4018/978-1-5225-7113-1.ch034 ◽

2019 ◽

pp. 644-659

Author(s):

Ashoka Kukkuvada ◽

Poornima Basavaraju

Keyword(s):

Privacy Preservation ◽

Information Gain ◽

Personal Data ◽

Privacy Preserving ◽

Medical Data ◽

Innovative Approach ◽

Data Publishing ◽

Mutual Correlation ◽

Data Utility ◽

Data Elements

Currently the industry is focused on managing, retrieving, and securing massive amounts of data. Hence, privacy preservation is a significant concern for organizations that publish or share personal data for vernacular analysis. In this chapter, the authors present an innovative approach that anonymizes the data using the information gain of the quasi-identifier attributes with respect to the sensitive attributes. Information gain indicates how useful an attribute is in classifying the data elements and reflects a two-way correlation among attributes. The authors show that the proposed approach preserves better data utility and has lower complexity than earlier methods.


Misusability Measure Based Sanitization of Big Data for Privacy Preserving MapReduce Programming

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v8i6.pp4524-4532 ◽

2018 ◽

Vol 8 (6) ◽

pp. 4524

Author(s):

D. Radhika ◽

D. Aruna Kumari

Keyword(s):

Data Mining ◽

Big Data ◽

Data Privacy ◽

Hybrid Approach ◽

Privacy Preserving ◽

Data Publishing ◽

Distributed Data Mining ◽

Distributed Data ◽

Public Cloud ◽

Sensitive Data

Leakage and misuse of sensitive data is a challenging problem for enterprises. It has become a more serious problem with the advent of cloud and big data, because more data is outsourced to the public cloud and published for wider visibility. Therefore Privacy Preserving Data Publishing (PPDP), Privacy Preserving Data Mining (PPDM), and Privacy Preserving Distributed Data Mining (PPDDM) are crucial in the contemporary era. PPDP and PPDM can protect privacy at the data and process levels respectively. With big data, data privacy has become indispensable because data is stored and processed in semi-trusted environments. In this paper we propose a comprehensive methodology for effective sanitization of data, based on a misusability measure, to preserve privacy and prevent data leakage and misuse. We follow a hybrid approach that caters to the needs of privacy preserving MapReduce programming. We propose an algorithm known as the Misusability Measure-Based Privacy Preserving Algorithm (MMPP), which considers the level of misusability before choosing and applying appropriate sanitization to big data. Our empirical study with Amazon EC2 and EMR revealed that the proposed methodology is useful in realizing privacy preserving MapReduce programming.


Privacy-Preserving and Lightweight Selective Aggregation with Fault-Tolerance for Edge Computing-Enhanced IoT

Sensors ◽

10.3390/s21165369 ◽

2021 ◽

Vol 21 (16) ◽

pp. 5369

Author(s):

Qiannan Wang ◽

Haibing Mu

Keyword(s):

Fault Tolerance ◽

Data Privacy ◽

Privacy Preservation ◽

Fault Tolerant ◽

Homomorphic Encryption ◽

Privacy Preserving ◽

Edge Computing ◽

Communication Overhead ◽

Time Data ◽

Selective Aggregation

Edge computing has been introduced to the Internet of Things (IoT) to meet the requirements of IoT applications. At the same time, data aggregation is widely used in data processing to reduce the communication overhead and energy consumption in IoT. Most existing schemes aggregate the overall data without filtering. In addition, aggregation schemes also face huge challenges, such as the privacy of the individual IoT device’s data or the fault-tolerant and lightweight requirements of the schemes. In this paper, we present a privacy-preserving and lightweight selective aggregation scheme with fault tolerance (PLSA-FT) for edge computing-enhanced IoT. In PLSA-FT, selective aggregation can be achieved by constructing Boolean responses and numerical responses according to specific query conditions of the cloud center. Furthermore, we modified the basic Paillier homomorphic encryption to guarantee data privacy and support fault tolerance of IoT devices’ malfunctions. An online/offline signature mechanism is utilized to reduce computation costs. The system characteristic analyses prove that the PLSA-FT scheme achieves confidentiality, privacy preservation, source authentication, integrity verification, fault tolerance, and dynamic membership management. Moreover, performance evaluation results show that PLSA-FT is lightweight with low computation costs and communication overheads.
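The aggregation idea rests on the additive homomorphism of Paillier encryption, which the sketch below demonstrates with the basic textbook scheme. It is not the modified Paillier variant or the PLSA-FT protocol from the paper; the primes are tiny and insecure, and the function names and sample readings are assumptions for illustration only.

```python
import math
import random


def keygen(p, q):
    """Textbook Paillier key generation with g = n + 1 (toy primes, not secure)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1
    # mu is the inverse of L(g^lam mod n^2) modulo n, where L(x) = (x - 1) // n.
    # pow(x, -1, n) requires Python 3.8 or later.
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)


def encrypt(pub, m, rng):
    n, g = pub
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:  # r must be invertible modulo n
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


rng = random.Random(42)
pub, priv = keygen(61, 53)  # hypothetical toy primes
n = pub[0]

# Two devices encrypt their readings; the aggregator multiplies ciphertexts
# without ever seeing the plaintexts.
c1 = encrypt(pub, 21, rng)
c2 = encrypt(pub, 34, rng)
aggregate = (c1 * c2) % (n * n)
total = decrypt(pub, priv, aggregate)
```

Multiplying ciphertexts corresponds to adding plaintexts modulo n, so only the holder of the private key learns the sum and nobody on the path sees an individual reading.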


Big Data Privacy Preservation Using Two Phase Top-Down Specialization Algorithm with Multidimensional Map Reduce Framework on Hadoop

International Journal of Distributed and Cloud Computing ◽

10.21863/ijdcc/2015.3.2.009 ◽

2015 ◽

Vol 3 (2) ◽

Author(s):

Shalin Eliabeth S. ◽

Sarju S.

Keyword(s):

Big Data ◽

Data Privacy ◽

Privacy Preservation ◽

Experimental Result ◽

Map Reduce ◽

Distributed Environment ◽

Top Down ◽

Two Phase ◽

Data Anonymization ◽

Big Data Privacy

Big data privacy preservation is one of the most troubling issues in current industry. Data privacy problems sometimes go unidentified when input data is published in a cloud environment. Data privacy preservation in Hadoop deals with hiding and publishing the input dataset in the distributed environment. This paper investigates the problem of big data anonymization for privacy preservation from the perspectives of scalability, time, and related factors. At present, many cloud applications that use big data anonymization face the same kinds of problems. To address them, the paper introduces a data anonymization algorithm called Two-Phase Top-Down Specialization (TPTDS), implemented in Hadoop. For the anonymization, 45,222 records of adult information with 15 attribute values were taken as the input big data. The proposed TPTDS algorithm is implemented in Hadoop with multidimensional anonymization in the MapReduce framework, which increases the efficiency of the big data processing system. Experiments with the TPTDS algorithm on Hadoop in both one-dimensional and multidimensional MapReduce frameworks show better results for multidimensional anonymization on the input adult dataset. Datasets are generalized in a top-down manner, with anonymization performed through specialization operations on a taxonomy tree, and the multidimensional MapReduce framework yields better IGPL values. The experiments show that the solution improves the IGPL values and the anonymity parameter and decreases the execution time of big data privacy preservation compared with the existing algorithm. These experimental results should prove widely applicable in distributed environments.
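The core loop of top-down specialization on a taxonomy tree can be sketched as follows. This is a heavily simplified, single-attribute, single-machine illustration rather than the two-phase MapReduce algorithm from the paper: it accepts the first specialization that keeps k-anonymity instead of ranking candidates by IGPL, and the taxonomy, values, and function names are hypothetical.

```python
from collections import Counter

# Hypothetical taxonomy tree: parent -> children; nodes without an entry are raw values.
TAXONOMY = {
    "Any": ["Secondary", "University"],
    "Secondary": ["HS-grad", "11th"],
    "University": ["Bachelors", "Masters"],
}


def leaves_under(node):
    """Set of raw values covered by a taxonomy node."""
    children = TAXONOMY.get(node)
    if not children:
        return {node}
    return set().union(*(leaves_under(c) for c in children))


def generalize(value, cut):
    """Map a raw value to the node of the current cut that covers it."""
    return next(node for node in cut if value in leaves_under(node))


def top_down_specialize(values, k):
    """Start fully generalized and specialize while every group keeps at least k records."""
    cut = ["Any"]
    changed = True
    while changed:
        changed = False
        for node in cut:
            children = TAXONOMY.get(node)
            if not children:
                continue  # already a raw value, nothing to specialize
            candidate = [n for n in cut if n != node] + children
            counts = Counter(generalize(v, candidate) for v in values)
            if all(c >= k for c in counts.values()):
                cut = candidate  # specialization keeps k-anonymity, so accept it
                changed = True
                break
    return cut


values = ["HS-grad"] * 3 + ["11th"] + ["Bachelors"] * 2 + ["Masters"] * 2
cut = top_down_specialize(values, k=2)
published = [generalize(v, cut) for v in values]
```

With k = 2 the university branch can be specialized down to individual degrees, while the secondary branch stays generalized because "11th" occurs only once. This illustrates how the top-down direction releases as much detail as the anonymity requirement allows.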
