A Survey on Privacy Preserving Dynamic Data Publishing

Author(s):  
Salheddine Kabou ◽  
Sidi mohamed Benslimane ◽  
Mhammed Mosteghanemi

Many organizations, especially small and medium business (SMB) enterprises, need to collect and share data containing personal information. The privacy of this data must be preserved before it is outsourced to the commercial public. Privacy preserving data publishing (PPDP) refers to the process of publishing useful information while preserving data privacy. A variety of approaches have been proposed to ensure privacy by applying traditional anonymization models, which focus only on a single publication of a dataset. In practical applications, data publishing is more complicated: organizations publish multiple times, for different recipients or after modifications, to provide up-to-date data. Privacy preserving dynamic data publication (PPDDP) is a new process in privacy preservation that addresses the anonymization of data for different purposes. In this survey, the authors systematically evaluate and summarize different studies on PPDDP, clarify the differences and requirements between the scenarios that can exist, and propose future research directions.


Author(s):  
Nancy Victor ◽  
Daphne Lopez

Data privacy plays a noteworthy part in today's digital world, where information is gathered at exceptional rates from different sources. Privacy preserving data publishing refers to the process of publishing personal data without compromising the privacy of individuals in any manner. A variety of approaches have been devised to protect consumer privacy by applying traditional anonymization mechanisms. But these mechanisms are not well suited for Big Data, because the data generated nowadays is not only structured. Data generated at very high velocities from various sources includes unstructured and semi-structured information, and thus becomes very difficult to process using traditional mechanisms. This chapter focuses on the various challenges with Big Data, on PPDM and PPDP techniques for Big Data, and on how well they can be scaled to process both historical and real-time data together using the Lambda architecture. A distributed framework for privacy preservation in Big Data that incorporates natural language processing techniques is also proposed in this chapter.


2023 ◽  
Vol 55 (1) ◽  
pp. 1-39
Author(s):  
Kinza Sarwar ◽  
Sira Yongchareon ◽  
Jian Yu ◽  
Saeed Ur Rehman

Despite the rapid growth and advancement in the Internet of Things (IoT), there are critical challenges that need to be addressed before the full adoption of the IoT. Data privacy is one of the hurdles towards the adoption of IoT as there might be potential misuse of users’ data and their identity in IoT applications. Several researchers have proposed different approaches to reduce privacy risks. However, most of the existing solutions still suffer from various drawbacks, such as huge bandwidth utilization and network latency, heavyweight cryptosystems, and policies that are applied on sensor devices and in the cloud. To address these issues, fog computing has been introduced for IoT network edges providing low latency, computation, and storage services. In this survey, we comprehensively review and classify privacy requirements for an in-depth understanding of privacy implications in IoT applications. Based on the classification, we highlight ongoing research efforts and limitations of the existing privacy-preservation techniques and map the existing IoT schemes with Fog-enabled IoT schemes to elaborate on the benefits and improvements that Fog-enabled IoT can bring to preserve data privacy in IoT applications. Lastly, we enumerate key research challenges and point out future research directions.


2014 ◽  
Vol 11 (2) ◽  
pp. 163-170
Author(s):  
Binli Wang ◽  
Yanguang Shen

Recently, with the rapid development of network, communication, and computer technology, privacy preserving data mining (PPDM) has become an increasingly important research topic in the field of data mining. In distributed environments, how to protect data privacy while mining a large volume of distributed data is an even more far-reaching question. This paper describes current research on PPDM at home and abroad. It then focuses on classifying the typical uses and algorithms of PPDM in distributed environments and summarizing their advantages and disadvantages. Finally, it points out future research directions in the field.


2014 ◽  
Vol 556-562 ◽  
pp. 3532-3535
Author(s):  
Heng Li ◽  
Xue Fang Wu

With the rapid development of computer technology and the popularity of the network, the scale, scope, and depth of databases are constantly expanding, and vast amounts of data stored in different forms have accumulated. Data mining technology can extract valuable information from large amounts of data. Privacy preservation has been one of the greater concerns in data mining. Privacy preserving data mining has developed rapidly in a short time, but it still faces many challenges. A number of methods and techniques have been developed for privacy preserving data mining. This paper analyzes the representative techniques for privacy preservation. Finally, the present problems and directions for future research are discussed.


2019 ◽  
Author(s):  
Shamila Mohammed

BACKGROUND In the past decade, the importance of genomic data in medical research has increased. Because the cost of genomic sequencing is falling steadily, genomic data can be included in routine medical care. This data is being used to detect and prevent inherited diseases. But using this data for research purposes may increase the chance of leaking private or genetic information (sensitive information about individuals) to unidentified users. Currently, many issues and challenges exist in preserving the privacy of genomic data. In general, identity tracing attacks, completion attacks, and attribute disclosure attacks are the three attacks currently mitigated on genomic data. Also, genomic data is difficult to access, integrate, and analyze in order to make useful decisions for the future. This paper discusses the available sequencing methods for genomic data, and where and how genomic data will be useful in prediction (i.e., in various applications). It also provides a picture of the future use of genomic analytics for extracting useful patterns from this data. Note that many attempts have been made on this topic, but all existing works are strictly rule based, i.e., they have no quantitative measurement of the risk of privacy breaches (genotype and phenotype information). Here, privacy-preserving linkage of genotype and phenotype information (across different locations) refers to genotypes stored in a sequencing facility and phenotypes stored in an electronic health record. This article discusses several aspects of genomic privacy, with a focus on the security vulnerabilities identified and their (possible) suggested solutions. In this article, we aim to accelerate discoveries using the best prediction tools while explaining a clear-cut approach, i.e., whether we need to protect genomic data or whether it is just a myth. Lastly, we list several genomic data protection techniques against re-identification attacks and give a systematic comparison of existing genomic privacy preserving methodologies (attempts made by several researchers in the previous decade) in Appendix A.
OBJECTIVE Importance of genomic data; existing genomic data privacy preservation methods; comparison.
METHODS Re-identification; cryptographic.
RESULTS Comparison of different methods.
CONCLUSIONS Privacy is a sensitive issue and needs to be protected from the outside world and from malicious (unidentified) users. In response to this serious concern, this article shares several useful suggestions and opinions with respect to genomic data (and other types of data). The paper starts with an introduction to genomic data and its characteristics, and to its scope and importance in medical care. We highlight related work in this area. We also explain the evolution of genomic sequencing and various metrics to measure performance. We then explain the importance of genomic data in terms of where and why it is useful, with the help of one use case. Next, we describe how genomic data differs from other types of big data. We then discuss several serious concerns, challenges, and research gaps, and identify opportunities for future researchers in genomic privacy. We also briefly compare genomic privacy with other types of privacy. Hence, we find that privacy, especially genomic privacy, needs to be protected and requires attention from research communities. We ask the computer science community to develop techniques for data privacy and confidentiality protection that work on, and have been tested against, real-world problems.


2021 ◽  
Vol 54 (6) ◽  
pp. 1-36
Author(s):  
Xuefei Yin ◽  
Yanming Zhu ◽  
Jiankun Hu

The past four years have witnessed the rapid development of federated learning (FL). However, new privacy concerns have also emerged during the aggregation of the distributed intermediate results. The emerging privacy-preserving FL (PPFL) has been heralded as a solution to generic privacy-preserving machine learning. However, the challenge of protecting data privacy while maintaining the data utility through machine learning still remains. In this article, we present a comprehensive and systematic survey on the PPFL based on our proposed 5W-scenario-based taxonomy. We analyze the privacy leakage risks in the FL from five aspects, summarize existing methods, and identify future research directions.


2021 ◽  
Vol 13 (8) ◽  
pp. 4206
Author(s):  
Jamilya Nurgazina ◽  
Udsanee Pakdeetrakulwong ◽  
Thomas Moser ◽  
Gerald Reiner

The lack of transparency and traceability in food supply chains (FSCs) is raising concerns among consumers and stakeholders about food information credibility, food quality, and safety. Insufficient records, a lack of digitalization and standardization of processes, and information exchange are some of the most critical challenges, which can be tackled with disruptive technologies, such as the Internet of Things (IoT), blockchain, and distributed ledger technologies (DLTs). Studies provide evidence that novel technological and sustainable practices in FSCs are necessary. This paper aims to describe current practical applications of DLTs and IoT in FSCs, investigating the challenges of implementation, and potentials for future research directions, thus contributing to achievement of the United Nations’ Sustainable Development Goals (SDGs). Within a systematic literature review, the content of 69 academic publications was analyzed, describing aspects of implementation and measures to address the challenges of scalability, security, and privacy of DLT, and IoT solutions. The challenges of high costs, standardization, regulation, interoperability, and energy consumption of DLT solutions were also classified as highly relevant, but were not widely addressed in literature. The application of DLTs in FSCs can potentially contribute to 6 strategic SDGs, providing synergies and possibilities for more sustainable, traceable, and transparent FSCs.


Author(s):  
Zheng Wang ◽  
Zhixiang Wang ◽  
Yinqiang Zheng ◽  
Yang Wu ◽  
Wenjun Zeng ◽  
...  

An efficient and effective person re-identification (ReID) system relieves users of painful and boring video watching and accelerates the process of video analysis. Recently, with the explosive demands of practical applications, a lot of research effort has been dedicated to heterogeneous person re-identification (Hetero-ReID). In this paper, we provide a comprehensive review of state-of-the-art Hetero-ReID methods that address the challenge of inter-modality discrepancies. According to the application scenario, we classify the methods into four categories: low-resolution, infrared, sketch, and text. We begin with an introduction to ReID and compare the Homogeneous ReID (Homo-ReID) and Hetero-ReID tasks. Then, we describe and compare existing datasets for performing evaluations, and survey the models that have been widely employed in Hetero-ReID. We also summarize and compare the representative approaches from two perspectives, i.e., the application scenario and the learning pipeline. We conclude with a discussion of some future research directions. Follow-up updates are available at https://github.com/lightChaserX/Awesome-Hetero-reID

