Privacy Preservation over Big Data in Cloud Systems

Author(s):  
Xuyun Zhang ◽  
Chang Liu ◽  
Surya Nepal ◽  
Chi Yang ◽  
Jinjun Chen


Author(s):  
Shalin Eliabeth S. ◽  
Sarju S.

Big data privacy preservation is one of the most pressing issues in industry today. Data privacy problems often go unidentified when input data is published in a cloud environment. Data privacy preservation in Hadoop deals with hiding sensitive information in an input dataset and publishing it to the distributed environment. This paper investigates the problem of big data anonymization for privacy preservation from the perspectives of scalability and execution time. At present, many cloud applications that rely on big data anonymization face the same kinds of problems. To address them, a data anonymization algorithm called Two-Phase Top-Down Specialization (TPTDS) is introduced and implemented in Hadoop. The input big data consists of 45,222 records of adult information with 15 attributes. Using multidimensional anonymization in the MapReduce framework, the proposed TPTDS algorithm is implemented in Hadoop to increase the efficiency of the big data processing system. Experiments were conducted with TPTDS on Hadoop in both one-dimensional and multidimensional MapReduce frameworks; multidimensional anonymization gave the better result on the adult dataset, as indicated by the IGPL values generated by the algorithm. Datasets are generalized in a top-down manner, and anonymization is performed through specialization operations on a taxonomy tree. The experiments show that the solution improves the IGPL values and the anonymity parameter and decreases the execution time of big data privacy preservation compared with the existing algorithm. These experimental results point to useful applications in distributed environments.
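
The specialization step on a taxonomy tree can be illustrated with a minimal single-machine sketch. It is not the TPTDS implementation from the paper: it omits the two-phase MapReduce partitioning and the IGPL-based ranking of candidate specializations, and simply accepts any specialization that keeps the data k-anonymous. The taxonomy, the attribute values, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical taxonomy tree for one attribute (parent -> children).
# Nodes that do not appear as keys are leaves, i.e. raw attribute values.
TAXONOMY = {
    "Any": ["Secondary", "University"],
    "Secondary": ["9th", "12th"],
    "University": ["Bachelors", "Masters"],
}

def leaves(node):
    """Return the set of raw values covered by a taxonomy node."""
    children = TAXONOMY.get(node)
    if not children:
        return {node}
    covered = set()
    for child in children:
        covered |= leaves(child)
    return covered

def generalize(value, cut):
    """Map a raw value to the node of the current cut that covers it."""
    for node in cut:
        if value in leaves(node):
            return node
    raise ValueError(f"{value!r} is not covered by the cut")

def is_k_anonymous(values, cut, k):
    """Check that every generalized value occurs at least k times."""
    counts = Counter(generalize(v, cut) for v in values)
    return all(count >= k for count in counts.values())

def top_down_specialize(values, k):
    """Start at the root and specialize while k-anonymity still holds.

    Assumption: candidates are tried in cut order; a faithful version
    would rank them by a score such as IGPL before picking one.
    """
    cut = ["Any"]
    changed = True
    while changed:
        changed = False
        for node in list(cut):
            children = TAXONOMY.get(node)
            if not children:
                continue  # leaf: cannot be specialized further
            candidate = [n for n in cut if n != node] + children
            if is_k_anonymous(values, candidate, k):
                cut = candidate
                changed = True
                break
    return cut

if __name__ == "__main__":
    records = ["9th", "9th", "12th", "Bachelors", "Bachelors", "Masters", "Masters"]
    cut = top_down_specialize(records, k=2)
    print(sorted(cut))
    print([generalize(v, cut) for v in records])
```

With the sample records and k = 2, "University" can be split into its children, while "Secondary" stays generalized because "12th" occurs only once.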


Author(s):  
Sebastian Dippl ◽  
Michael C. Jaeger ◽  
Achim Luhn ◽  
Alexandra Shulman-Peleg ◽  
Gil Vernik

While it is common to use storage in a cloud-based manner, the question of true interoperability is rarely fully addressed. This question becomes even more relevant because the steadily growing amount of data that needs to be stored will soon exceed the capacity of a single system in terms of resources, availability, and network throughput. The logical conclusion is that a network of systems needs to be created that can cope with the requirements of big data applications and data deluge scenarios. This chapter shows how federation and interoperability fit into a cloud storage scenario. The authors look at the challenges that federation imposes on autonomous, heterogeneous, and distributed cloud systems, and present approaches that help deal with the special requirements introduced by the VISION Cloud use cases from the healthcare, media, telecommunications, and enterprise domains. Finally, the authors give an overview of how VISION Cloud addresses these requirements in its research scenarios and architecture.


Author(s):  
Ramgopal Kashyap ◽  
Albert D. Piersson

The motivation behind this chapter is to highlight the characteristics, security issues, advantages, and disadvantages of big data. In recent research, the issues and challenges stem from the exponential growth of social media data and of other images and videos. Big data security threats are rising, which affects data heterogeneity, adaptability, and privacy-preserving analytics. Big data analytics helps cyber security, but no new application can be envisioned without delivering new types of information, working on data-driven calculations, and consuming a certain amount of information. This chapter demonstrates how the inherent attributes of big data are protected.


Author(s):  
Nancy Victor ◽  
Daphne Lopez

Data privacy plays a noteworthy part in today's digital world, where information is gathered at exceptional rates from different sources. Privacy-preserving data publishing (PPDP) refers to the process of publishing personal data without compromising the privacy of individuals in any manner. A variety of approaches have been devised to protect consumer privacy by applying traditional anonymization mechanisms. But these mechanisms are not well suited for Big Data, because the data generated nowadays is not only structured. Data generated at very high velocities from various sources includes unstructured and semi-structured information, and thus becomes very difficult to process using traditional mechanisms. This chapter focuses on the various challenges of Big Data, on PPDM and PPDP techniques for Big Data, and on how well they can be scaled to process historical and real-time data together using the Lambda architecture. A distributed framework for privacy preservation in Big Data that incorporates natural language processing techniques is also proposed in this chapter.


2018 ◽  
Vol 5 (1) ◽  
Author(s):  
P. Ram Mohan Rao ◽  
S. Murali Krishna ◽  
A. P. Siva Kumar
