A Secure Protocol for High-Dimensional Big Data Providing Data Privacy

Author(s):  
Anitha J. ◽  
Prasad S. P.

Recent technological developments have produced huge amounts of data from social networks, sensor networks, the internet, and other sources, which makes data storage and processing more challenging. During privacy-preserving data publishing (PPDP), the collected data may contain sensitive information about the data owner. Releasing it directly for further processing may violate the owner's privacy, so the data must be modified so that it does not disclose any personal information. Existing data anonymization techniques use a fixed scheme and handle only a small number of dimensions. Privacy attacks come in several forms, including linkage, homogeneity, and background knowledge attacks. To maintain data privacy in big data and prevent linkage attacks, this paper proposes UNION, a privacy-preserving protocol for multi-party data providers. Experiments show that, compared with existing anonymization techniques, it offers better data utility on high-dimensional data and better scalability with respect to data size.
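The linkage attack mentioned in the abstract relies on records that are unique in their quasi-identifier values. The following is a minimal Python sketch of a k-anonymity check, a standard way to quantify that risk. It is not the UNION protocol, whose details the abstract does not give; the function names, attribute names, and sample records are all hypothetical.

```python
from collections import Counter


def smallest_group(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same quasi-identifier values."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(counts.values()) if counts else 0


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs at least k times."""
    return smallest_group(records, quasi_identifiers) >= k


# Hypothetical, already-generalized records (illustration only).
records = [
    {"age": "30-39", "zip": "560**", "disease": "flu"},
    {"age": "30-39", "zip": "560**", "disease": "asthma"},
    {"age": "40-49", "zip": "561**", "disease": "flu"},
]

if __name__ == "__main__":
    # The third record is unique on (age, zip), so it could be linked
    # to an external dataset that carries the same two attributes.
    print(smallest_group(records, ["age", "zip"]))
    print(is_k_anonymous(records, ["age", "zip"], 2))
```

A release that fails this check for the chosen k still contains records that an attacker could single out by joining on the quasi-identifiers.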


Author(s):  
Shalin Eliabeth S. ◽  
Sarju S.

Big data privacy preservation is one of the most pressing issues in industry today. Data privacy problems sometimes go unnoticed when input data is published to a cloud environment. Data privacy preservation in Hadoop deals with hiding sensitive information in an input dataset before publishing it to the distributed environment. This paper investigates the problem of big data anonymization for privacy preservation from the perspectives of scalability and execution time. Many cloud applications that anonymize big data currently face the same kinds of problems. To address them, the paper introduces a data anonymization algorithm called Two-Phase Top-Down Specialization (TPTDS), implemented in Hadoop. The input big data consisted of 45,222 records of adult information with 15 attribute values. TPTDS was implemented with multidimensional anonymization in the MapReduce framework, which increases the efficiency of the big data processing system. Experiments with both one-dimensional and multidimensional MapReduce frameworks showed better results for multidimensional anonymization on the input adult dataset. Datasets are generalized in a top-down manner, with anonymization performed through specialization operations on a taxonomy tree, and the multidimensional MapReduce framework produced better IGPL values. The experiments show that the solution improves the IGPL values and the anonymity parameter and reduces the execution time of big data privacy preservation compared with the existing algorithm. These results are promising for applications in distributed environments.
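The specialization operation on a taxonomy tree can be sketched as follows. This is a single-machine Python illustration of one top-down specialization step, not the TPTDS algorithm itself: the MapReduce partitioning and the IGPL-based choice of which node to specialize are omitted, and the taxonomy and function names are hypothetical.

```python
# Hypothetical taxonomy for an "education" attribute, stored as child -> parent.
PARENT = {
    "Bachelors": "University",
    "Masters": "University",
    "11th": "Secondary",
    "12th": "Secondary",
    "University": "Any",
    "Secondary": "Any",
}


def path_to_root(value):
    """Return the chain of nodes from a value up to the taxonomy root."""
    path = [value]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def generalize(value, cut):
    """Map a value to the node on its root path that belongs to the current cut."""
    for node in path_to_root(value):
        if node in cut:
            return node
    raise ValueError(f"{value!r} is not covered by the cut {cut!r}")


def specialize(cut, node):
    """One top-down step: replace `node` in the cut with its children."""
    children = {child for child, parent in PARENT.items() if parent == node}
    if node not in cut or not children:
        return set(cut)  # nothing to specialize
    return (set(cut) - {node}) | children


if __name__ == "__main__":
    cut = {"Any"}                      # start fully generalized
    print(generalize("Bachelors", cut))
    cut = specialize(cut, "Any")       # move one level down the tree
    print(generalize("Bachelors", cut))
```

A full top-down specialization would repeat this step, choosing at each round the specialization with the best score while the anonymity requirement still holds.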


Author(s):  
Martha Davis

Big data and analytics have not only changed how businesses interact with consumers, but also how consumers interact with the larger world. Smart cities, IoT, cloud, and edge computing technologies are all enabled by data and can provide significant societal benefits via efficiencies and reduction of waste. However, data breaches have also caused serious harm to customers by exposing personal information. Consumers often are unable to make informed decisions about their digital privacy because they are in a position of asymmetric information. There are an increasing number of privacy regulations to give consumers more control over their data. This chapter provides an overview of data privacy regulations, including GDPR. In today's globalized economy, the patchwork of international privacy regulations is difficult to navigate, and, in many instances, fails to provide adequate business certainty or consumer protection. This chapter also discusses current research and implications for costs, data-driven innovation, and consumer trust.


2017 ◽  
Vol 261 ◽  
pp. 184-192 ◽  
Author(s):  
Linlin Ding ◽  
Yu Liu ◽  
Baishuo Han ◽  
Shiwen Zhang ◽  
Baoyan Song

Sensors ◽  
2018 ◽  
Vol 18 (7) ◽  
pp. 2307 ◽  
Author(s):  
Yancheng Shi ◽  
Zhenjiang Zhang ◽  
Han-Chieh Chao ◽  
Bo Shen

With the rapid development of information technology, large-scale personal data, including data collected by sensors or IoT devices, is stored in the cloud or in data centers. In some cases, the owners of the cloud or data centers need to publish the data. How to make the best use of the data while limiting the risk of personal information leakage has therefore become a popular research topic. The most common method of data privacy protection is data anonymization, which has two main problems: (1) the availability of information is reduced after clustering and cannot be flexibly adjusted; (2) most methods are static, so releasing the data multiple times leaks personal privacy. To solve these problems, this article makes two contributions. The first is a new clustering method based on micro-aggregation, in which data availability and privacy protection can be adjusted flexibly by considering the concepts of distance and information entropy. The second is a dynamic update mechanism that guarantees that individual privacy is not compromised after the data has been released multiple times, while minimizing the loss of information. At the end of the article, the algorithm is simulated with real data sets. The availability and advantages of the method are demonstrated by calculating the time, the average information loss, and the number of forged data.
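The general idea of micro-aggregation can be illustrated with a short Python sketch. It shows the simplest univariate, fixed-size variant (sort, group into clusters of at least k values, replace each value by its cluster mean). It is not the article's method, which also uses distance and information entropy and supports dynamic updates; the function name and sample values are hypothetical.

```python
def microaggregate(values, k):
    """Replace each value by the mean of its cluster of at least k sorted neighbours."""
    if k < 1:
        raise ValueError("k must be at least 1")
    # Indices of the values in ascending order, so results keep the input order.
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    n = len(order)
    start = 0
    while start < n:
        end = start + k
        if n - end < k:
            # Fewer than k values would remain: absorb them into this cluster.
            end = n
        group = order[start:end]
        mean = sum(values[i] for i in group) / len(group)
        for i in group:
            result[i] = mean
        start = end
    return result


if __name__ == "__main__":
    ages = [21, 35, 22, 34, 50]  # hypothetical attribute values
    print(microaggregate(ages, 2))
```

A larger k gives each published value more indistinguishable neighbours at the cost of higher information loss, which is the trade-off the abstract describes as adjustable.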


2021 ◽  
Vol 2083 (4) ◽  
pp. 042077
Author(s):  
Tongtong Xu ◽  
Lei Shi

Cloud computing is a new way of computing and storage. Users do not need to master professional skills; they can enjoy convenient network services as long as they pay according to their own needs. Using cloud services requires uploading data to cloud servers. Because the cloud is an open environment, attackers can easily use cloud computing to conduct excessive computational analysis on big data, which is bound to infringe on others’ privacy. Data security is therefore an unavoidable challenge, and ensuring data privacy and security in the cloud environment has become an urgent problem. This paper studies big data security and privacy protection on cloud computing platforms. It approaches the problem from two aspects, the implicit security mechanism and the explicit security mechanism (encryption mechanism), to protect the security and privacy of the cloud big data platform during data storage and data processing.


2018 ◽  
pp. 73-81
Author(s):  
Heena Makhija ◽  
Bhavesh Bharad

India is an agricultural country, and big data has found its way into the agriculture industry. Problems such as inflation, wastage, low productivity, soil fertility, financing for farmers, and the lack of institutional farmers can be addressed through data. However, while big data offers many opportunities, it also brings a number of challenges. The study focuses on challenges such as the use of collected data by farmers and by companies, which collect and store data on everything from fertilizer rates to yield to soil conditions. It also covers issues such as data security, data privacy, and data analysis. The paper further highlights challenges in the agricultural data revolution, such as companies selling the data to others or building new products based on sensitive information.


2017 ◽  
Vol 28 (06) ◽  
pp. 683-703 ◽  
Author(s):  
Youwen Zhu ◽  
Xingxin Li ◽  
Jian Wang ◽  
Yining Liu ◽  
Zhiguo Qu

The cloud can provide much convenience for big data storage and analysis. To enjoy the advantages of cloud services while preserving privacy, huge amounts of data are increasingly outsourced to the cloud in encrypted form. Unfortunately, encryption may impede analysis and computation over the outsourced dataset. Naïve Bayesian classification is an effective algorithm for predicting the class label of unlabeled samples. In this paper, we investigate naïve Bayesian classification on an encrypted large-scale dataset in the cloud and propose a practical and secure scheme for this challenging problem. In our scheme, all the computation tasks of naïve Bayesian classification are completed by the cloud, which can dramatically reduce the burden on the data owner and users. We give a formal security proof for our scheme. Based on the theoretical proof, we can strictly guarantee the privacy of both the input dataset and the output classification results, i.e., the cloud can learn nothing useful about the training data of the data owner or the test samples of users throughout the computation. Additionally, we not only theoretically analyze the computation complexity and communication overhead, but also evaluate the implementation cost through extensive experiments over a real dataset, which show that our scheme can achieve practical efficiency.


2014 ◽  
Vol 39 (4) ◽  
Author(s):  
Julie Frizzo-Barker ◽  
Peter Chow-White

Genomic big data is an emerging information technology, which presents new opportunities for medical innovation as well as new challenges to our current ethical, social and legal infrastructure. Rapid, affordable whole genomic sequencing translates patients’ most sensitive personal information into petabytes of digital health data. While a biomedical approach traditionally focuses on risks and benefits to the human body, the fields of communication and science and technology studies (STS) can provide some of the critical and theoretical tools necessary to navigate the newly emerging terrain of the human body as digital code. Core areas of expertise from these fields, including the Internet, the network society and the social constructions of technology, ground our discussion of the social implications of open access genomic databases, privacy and informational risk.

