Privacy Preserving Big Data Publishing

Author(s):  
Nancy Victor ◽  
Daphne Lopez

Data privacy plays an important part in today's digital world, where information is gathered at exceptional rates from different sources. Privacy preserving data publishing (PPDP) refers to the process of publishing personal data without compromising the privacy of individuals in any manner. A variety of approaches have been devised to protect consumer privacy by applying traditional anonymization mechanisms. But these mechanisms are not well suited to Big Data, because the data generated nowadays is not only structured. Data generated at very high velocities from various sources includes unstructured and semi-structured information, and thus becomes very difficult to process using traditional mechanisms. This chapter focuses on the various challenges with Big Data, on privacy preserving data mining (PPDM) and PPDP techniques for Big Data, and on how well they can be scaled to process both historical and real-time data together using the Lambda architecture. A distributed framework for privacy preservation in Big Data that incorporates natural language processing techniques is also proposed in this chapter.


Author(s):  
Ashoka Kukkuvada ◽  
Poornima Basavaraju

Currently the industry is focused on managing, retrieving, and securing massive amounts of data. Hence, privacy preservation is a significant concern for organizations that publish or share personal data for analysis. In this chapter, the authors present an approach that anonymizes the data using the information gain of the quasi attributes with respect to the sensitive attributes. Information gain indicates how useful an attribute is in classifying the data elements, and it reflects a two-way correlation among attributes. The authors show that the proposed approach preserves better data utility and has lower complexity than earlier methods.
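To make the central quantity concrete, the following is a minimal sketch of how the information gain of a quasi attribute with respect to a sensitive attribute can be computed. It uses the standard entropy-based definition and is not the authors' anonymization algorithm; the records, the attribute names (`age`, `zip`, `disease`) and the function names are assumptions made for illustration.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in Counter(values).values())


def information_gain(records, quasi_attr, sensitive_attr):
    """Entropy reduction of the sensitive attribute after splitting on quasi_attr."""
    sensitive = [r[sensitive_attr] for r in records]
    groups = {}
    for r in records:
        groups.setdefault(r[quasi_attr], []).append(r[sensitive_attr])
    conditional = sum(len(g) / len(records) * entropy(g) for g in groups.values())
    return entropy(sensitive) - conditional


# Hypothetical toy table: "age" fully determines "disease", "zip" tells nothing.
records = [
    {"age": "30-39", "zip": "130**", "disease": "flu"},
    {"age": "30-39", "zip": "148**", "disease": "flu"},
    {"age": "40-49", "zip": "130**", "disease": "cancer"},
    {"age": "40-49", "zip": "148**", "disease": "cancer"},
]

gain_age = information_gain(records, "age", "disease")
gain_zip = information_gain(records, "zip", "disease")
```

In this toy table `age` has the higher gain, so it reveals more about the sensitive attribute than `zip` does; how such a ranking is then used to drive anonymization is specific to the chapter and is not reproduced here.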


2019 ◽  
Vol 2019 ◽  
pp. 1-11 ◽  
Author(s):  
Jie Wang ◽  
Hongtao Li ◽  
Feng Guo ◽  
Wenyin Zhang ◽  
Yifeng Cui

As a novel and promising technology for 5G networks, device-to-device (D2D) communication has garnered a significant amount of research interest because of the advantages of rapid sharing and high accuracy on deliveries as well as its variety of applications and services. Big data technology offers unprecedented opportunities and poses a daunting challenge to D2D communication and sharing, where the data often contain private information concerning users or organizations and thus are at risk of being leaked. Privacy preservation is necessary for D2D services but has not been extensively studied. In this paper, we propose an (a, k)-anonymity privacy-preserving framework for D2D big data deployed on MapReduce. First, we provide a framework for D2D big data sharing and analyze the threat model. Then, within the privacy-preserving framework, we adopt (a, k)-anonymity as the privacy-preserving model for D2D big data and use distributed MapReduce to classify and group massive datasets. The results of experiments and theoretical analysis show that our privacy-preserving algorithm deployed on MapReduce is effective for D2D big data privacy protection, with less information loss and computing time.
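As an illustration of the privacy model named above, here is a minimal single-machine sketch of an (a, k)-anonymity check. It assumes the commonly used definition: every group of records sharing the same quasi-identifier values has at least k records, and no sensitive value accounts for more than a fraction a of any group. The grouping step plays the role of the map phase and the per-group check that of the reduce phase; the record layout and function name are hypothetical, and this is not the paper's MapReduce implementation.

```python
from collections import Counter, defaultdict


def is_a_k_anonymous(records, quasi_ids, sensitive, a, k):
    """Check the assumed (a, k)-anonymity definition on a list of dict records."""
    # "Map" step: key each record by its quasi-identifier tuple.
    groups = defaultdict(list)
    for r in records:
        groups[tuple(r[q] for q in quasi_ids)].append(r[sensitive])

    # "Reduce" step: test each equivalence class independently.
    for values in groups.values():
        if len(values) < k:
            return False  # group too small: violates k
        top_count = Counter(values).most_common(1)[0][1]
        if top_count / len(values) > a:
            return False  # one sensitive value is too frequent: violates a
    return True


# Hypothetical generalized table with two equivalence classes.
table = [
    {"age": "20-29", "zip": "476**", "disease": "flu"},
    {"age": "20-29", "zip": "476**", "disease": "asthma"},
    {"age": "30-39", "zip": "479**", "disease": "flu"},
    {"age": "30-39", "zip": "479**", "disease": "cancer"},
]

ok = is_a_k_anonymous(table, ["age", "zip"], "disease", a=0.5, k=2)
```

Because each equivalence class is checked on its own, the per-group test can be distributed across reducers without coordination, which is one reason this kind of model fits a MapReduce deployment.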


Author(s):  
Salheddine Kabou ◽  
Sidi mohamed Benslimane ◽  
Mhammed Mosteghanemi

Many organizations, especially small and medium business (SMB) enterprises, need to collect and share data containing personal information. The privacy of this data must be preserved before it is released to the public. Privacy preserving data publishing (PPDP) refers to the process of publishing useful information while preserving data privacy. A variety of approaches have been proposed to ensure privacy by applying traditional anonymization models, which focus only on a single publication of a dataset. In practical applications, data publishing is more complicated: organizations publish multiple times for different recipients, or republish after modifications to provide up-to-date data. Privacy preserving dynamic data publication (PPDDP) is a new process in privacy preservation which addresses the anonymization of data for these different purposes. In this survey, the authors systematically evaluate and summarize different studies on PPDDP, clarify the differences and requirements between the scenarios that can exist, and propose future research directions.


Symmetry ◽  
2021 ◽  
Vol 13 (6) ◽  
pp. 1043
Author(s):  
Junqi Guo ◽  
Minghui Yang ◽  
Boxin Wan

With the rapid development of the Internet of Things (IoT), wearable devices have become ubiquitous and interconnected in daily life. Because wearable devices collect, transmit, and monitor humans’ physiological signals, data privacy should be a concern and should be fully protected throughout the whole process. However, the existing privacy protection methods are insufficient. In this paper, we propose a practical privacy-preserving mechanism for physiological signals collected by intelligent wearable devices. In the data acquisition and transmission stage, we employ existing asymmetric encryption-based methods. In the data publishing stage, we propose a new model based on the combination and optimization of k-anonymity and differential privacy. An entropy-based personalized k-anonymity algorithm is proposed to improve performance when processing static and long-term data. Moreover, we use the symmetry of differential privacy and propose a temporal differential privacy mechanism for real-time data to suppress privacy leakage while data is updated. It is proved theoretically that the combination of the two algorithms is reasonable. Finally, we use smart bracelets as an example to verify the performance of our mechanism. The experimental results show that personalized k-anonymity improves the security index by up to 6.25% compared with traditional k-anonymity, and the grouping results are more centralized. Moreover, temporal differential privacy effectively reduces the amount of information exposed, which protects the privacy of IoT-based users.
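For readers unfamiliar with differential privacy, the sketch below shows the standard Laplace mechanism applied to a counting query over hypothetical heart-rate readings. It is not the temporal differential privacy mechanism proposed in the paper; the readings, the threshold, the function names and the choice of epsilon are all assumptions for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def private_count(readings, threshold, epsilon, rng):
    """Release how many readings exceed threshold with epsilon-DP Laplace noise.

    A counting query changes by at most 1 when one reading is added or
    removed, so its sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for x in readings if x > threshold)
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical smart-bracelet heart-rate samples (beats per minute).
heart_rates = [72, 88, 104, 110, 95, 121]
rng = random.Random(0)  # seeded only to make the sketch reproducible
noisy = private_count(heart_rates, threshold=100, epsilon=0.5, rng=rng)
```

A smaller epsilon gives a larger noise scale and therefore stronger privacy at the cost of accuracy. Repeated releases over time consume privacy budget, which is the kind of issue a mechanism for continuously updated data has to manage.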


Author(s):  
D. Radhika ◽  
D. Aruna Kumari

Leakage and misuse of sensitive data is a challenging problem for enterprises. It has become a more serious problem with the advent of cloud and big data, because more data is being outsourced to the public cloud and published for wider visibility. Therefore Privacy Preserving Data Publishing (PPDP), Privacy Preserving Data Mining (PPDM) and Privacy Preserving Distributed Data Mining (PPDDM) are crucial in the contemporary era. PPDP and PPDM can protect privacy at the data and process levels respectively. With big data, data privacy has become indispensable because data is stored and processed in semi-trusted environments. In this paper we propose a comprehensive methodology for effective sanitization of data based on a misusability measure, to preserve privacy and prevent data leakage and misuse. We follow a hybrid approach that caters to the needs of privacy preserving MapReduce programming. We propose an algorithm known as the Misusability Measure-Based Privacy Preserving Algorithm (MMPP), which considers the level of misusability before choosing and applying appropriate sanitization to big data. Our empirical study with Amazon EC2 and EMR revealed that the proposed methodology is useful in realizing privacy preserving MapReduce programming.


Sensors ◽  
2021 ◽  
Vol 21 (16) ◽  
pp. 5369
Author(s):  
Qiannan Wang ◽  
Haibing Mu

Edge computing has been introduced to the Internet of Things (IoT) to meet the requirements of IoT applications. At the same time, data aggregation is widely used in data processing to reduce the communication overhead and energy consumption in IoT. Most existing schemes aggregate the overall data without filtering. In addition, aggregation schemes also face huge challenges, such as the privacy of the individual IoT device’s data or the fault-tolerant and lightweight requirements of the schemes. In this paper, we present a privacy-preserving and lightweight selective aggregation scheme with fault tolerance (PLSA-FT) for edge computing-enhanced IoT. In PLSA-FT, selective aggregation can be achieved by constructing Boolean responses and numerical responses according to specific query conditions of the cloud center. Furthermore, we modified the basic Paillier homomorphic encryption to guarantee data privacy and to tolerate malfunctions of IoT devices. An online/offline signature mechanism is utilized to reduce computation costs. The system characteristic analyses prove that the PLSA-FT scheme achieves confidentiality, privacy preservation, source authentication, integrity verification, fault tolerance, and dynamic membership management. Moreover, performance evaluation results show that PLSA-FT is lightweight with low computation costs and communication overheads.
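The aggregation idea rests on the additive homomorphism of Paillier encryption: multiplying ciphertexts yields an encryption of the sum of the plaintexts. The sketch below implements the basic textbook Paillier scheme with toy primes to show that property. It is not the modified scheme used in PLSA-FT, it omits fault tolerance and signatures, and the tiny key size, device readings and function names are assumptions that make it unsuitable for any real use.

```python
import random
from math import gcd

# Toy primes for illustration only; real deployments use primes of 1024+ bits.
P, Q = 1009, 1013
N = P * Q
N_SQ = N * N
G = N + 1  # common simple choice of generator
LAM = (P - 1) * (Q - 1) // gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
MU = pow(LAM, -1, N)  # modular inverse (Python 3.8+), valid because G = N + 1


def encrypt(m, rng):
    """Encrypt an integer 0 <= m < N under the toy public key (N, G)."""
    r = rng.randrange(1, N)
    while gcd(r, N) != 1:
        r = rng.randrange(1, N)
    return (pow(G, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c):
    """Recover the plaintext using the private values LAM and MU."""
    x = pow(c, LAM, N_SQ)
    return ((x - 1) // N * MU) % N


def aggregate(ciphertexts):
    """Multiply ciphertexts; the result decrypts to the sum of the plaintexts."""
    total = 1
    for c in ciphertexts:
        total = (total * c) % N_SQ
    return total


rng = random.Random(42)
readings = [21, 19, 25]  # hypothetical per-device sensor values
encrypted = [encrypt(m, rng) for m in readings]
recovered_sum = decrypt(aggregate(encrypted))
```

An edge node running `aggregate` never sees individual readings, only ciphertexts, while the key holder learns just the total. The sum must stay below `N` for decryption to be correct.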


Author(s):  
Shalin Eliabeth S. ◽  
Sarju S.

Big data privacy preservation is one of the most pressing issues in industry today. Data privacy problems often go unidentified when input data is published in a cloud environment. Data privacy preservation in Hadoop deals with hiding and publishing the input dataset to the distributed environment. This paper investigates the problem of big data anonymization for privacy preservation from the perspectives of scalability and execution time. At present, many cloud applications that anonymize big data face the same kinds of problems. To address them, the paper introduces a data anonymization algorithm called Two-Phase Top-Down Specialization (TPTDS), implemented in Hadoop. The input consisted of 45,222 records of adult information with 15 attribute values. The proposed TPTDS algorithm was implemented with multidimensional anonymization in the MapReduce framework, which increases the efficiency of the big data processing system. Experiments were run with TPTDS on Hadoop in both one-dimensional and multidimensional MapReduce frameworks, and multidimensional anonymization gave the better result on the input adult dataset. Datasets are generalized in a top-down manner, with anonymization performed by specialization operations on a taxonomy tree, and the better result of the multidimensional MapReduce framework is reflected in the IGPL values generated by the algorithm. The experiments show that the solution improves the IGPL values and the anonymity parameter and decreases the execution time of big data privacy preservation compared with the existing algorithm. These results suggest good applicability to distributed environments.
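The specialization operation on a taxonomy tree can be sketched as follows. This is a heavily simplified, single-attribute, single-machine illustration: it starts from the most general value and greedily specializes a node into its children whenever the result is still k-anonymous. It does not use IGPL scoring, two phases, or MapReduce, so it is not the TPTDS algorithm itself; the taxonomy, the attribute values and the function names are assumptions.

```python
from collections import Counter

# Hypothetical taxonomy for an "education" attribute: parent -> children.
TAXONOMY = {
    "Any": ["Secondary", "University"],
    "Secondary": ["9th", "12th"],
    "University": ["Bachelors", "Masters"],
}
PARENT = {child: parent for parent, children in TAXONOMY.items() for child in children}


def generalize(value, cut):
    """Walk up the taxonomy until reaching a node in the current cut."""
    while value not in cut:
        value = PARENT[value]
    return value


def is_k_anonymous(values, cut, k):
    """True if every generalized value that occurs covers at least k records."""
    counts = Counter(generalize(v, cut) for v in values)
    return all(c >= k for c in counts.values())


def top_down_specialize(values, k):
    """Greedily replace cut nodes by their children while k-anonymity holds."""
    cut = {"Any"}
    changed = True
    while changed:
        changed = False
        for node in sorted(cut):
            if node not in TAXONOMY:
                continue  # leaf value, cannot be specialized further
            candidate = (cut - {node}) | set(TAXONOMY[node])
            if is_k_anonymous(values, candidate, k):
                cut = candidate
                changed = True
                break
    return cut


education = ["9th", "9th", "12th", "Bachelors", "Bachelors", "Masters", "Masters"]
final_cut = top_down_specialize(education, k=2)
```

With `k=2` the university branch can be fully specialized, while the secondary branch stays generalized because only one record has the value `12th`. A real top-down specialization would choose among valid specializations by a score such as IGPL rather than taking the first one that passes.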

