Privacy and Security Policies in Big Data - Advances in Information Security, Privacy, and Ethics
Latest Publications


Total documents: 12 (past five years: 0)
H-index: 2 (past five years: 0)

Published by IGI Global
ISBN: 9781522524861, 9781522524878

Author(s):  
Trupti Vishwambhar Kenekar ◽  
Ajay R. Dani

Big Data is a collection of structured, unstructured, and semi-structured data gathered from various sources, so it is important to mine it while protecting the privacy of individual records. Differential privacy is one of the strongest available measures, providing a rigorous privacy guarantee. The chapter proposes differentially private frequent itemset mining using MapReduce, which requires less time for privately mining large datasets. It discusses the problem of preserving data privacy, the challenges of doing so in a big data environment, and data privacy techniques and their applications to unstructured data. Analyses of experimental results on structured and unstructured datasets are also presented.
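
As a rough illustration of the technique named above, the Python sketch below counts small itemsets in a map/reduce style and perturbs each support count with Laplace noise before filtering by a noisy support threshold. The epsilon value, sensitivity, threshold, and sample transactions are illustrative assumptions, not the chapter's algorithm or data.

import math
import random
from collections import Counter
from itertools import combinations

EPSILON = 1.0      # illustrative privacy budget
SENSITIVITY = 1.0  # adding/removing one transaction changes a count by at most 1
MIN_SUPPORT = 2    # illustrative noisy-support threshold

def laplace_noise(scale):
    # Sample Laplace(0, scale) noise by inverse transform sampling.
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def map_phase(transaction, max_size=2):
    # Map step: emit (itemset, 1) pairs for all small itemsets in a transaction.
    items = sorted(set(transaction))
    for size in range(1, max_size + 1):
        for itemset in combinations(items, size):
            yield itemset, 1

def reduce_phase(pairs):
    # Reduce step: sum counts per itemset, then add Laplace noise before release.
    counts = Counter()
    for itemset, count in pairs:
        counts[itemset] += count
    scale = SENSITIVITY / EPSILON
    return {k: v + laplace_noise(scale) for k, v in counts.items()}

transactions = [["milk", "bread"], ["milk", "eggs"], ["bread", "eggs", "milk"]]
pairs = (p for t in transactions for p in map_phase(t))
noisy_counts = reduce_phase(pairs)
frequent = {k: v for k, v in noisy_counts.items() if v >= MIN_SUPPORT}
print(frequent)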


Author(s):  
Jens Kohler ◽  
Christian Richard Lorenz ◽  
Markus Gumbel ◽  
Thomas Specht ◽  
Kiril Simov

In recent years, Cloud Computing has drastically changed IT architectures in enterprises across various industries and countries. Dynamically scalable resources such as CPUs, storage space, and virtual networks promise cost savings, as huge initial infrastructure investments are no longer required. This development also makes Cloud Computing a promising technology driver for Big Data, since unstructured data without a concrete, predefined schema (variety) can be stored in emerging NoSQL architectures. However, fully exploiting these advantages requires integrating a trustworthy third-party public cloud provider, which raises challenging and still unsolved questions concerning security, compliance, anonymization, and privacy. To address these challenges, this work presents, implements, and evaluates a security-by-distribution approach for NoSQL document stores that distributes data across multiple cloud providers so that each provider holds only a small data chunk that is worthless without the others.
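
The security-by-distribution idea can be sketched in Python as follows, using in-memory stand-ins for document stores at three providers; the provider names, the round-robin field routing, and the sample document are hypothetical placeholders rather than the chapter's implementation.

import uuid

# Hypothetical in-memory stand-ins for document stores at three cloud providers.
providers = {"provider_a": {}, "provider_b": {}, "provider_c": {}}

def distribute(document):
    # Split a document's fields round-robin across providers; return a record id.
    record_id = str(uuid.uuid4())
    names = list(providers)
    for i, (field, value) in enumerate(sorted(document.items())):
        chunk = providers[names[i % len(names)]]
        # Each provider sees only (record_id, field, value) for a subset of fields.
        chunk.setdefault(record_id, {})[field] = value
    return record_id

def reassemble(record_id):
    # Recombine the fragments; only a client that can reach all providers succeeds.
    document = {}
    for chunk in providers.values():
        document.update(chunk.get(record_id, {}))
    return document

rid = distribute({"name": "Alice", "account": "X-1", "diagnosis": "B20", "city": "Mannheim"})
print(reassemble(rid))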


Author(s):  
Sonali Tidke

MongoDB is a NoSQL database management system that does not follow the commonly used relational database model. MongoDB supports horizontal scaling across a large number of servers, potentially tens, hundreds, or even thousands. This horizontal scaling is performed using sharding, a database partitioning technique that splits a large database into smaller parts that are easier to manage and faster to access. Hundreds of NoSQL databases are available on the market, but each NoSQL product differs in features, implementation, and behavior, and NoSQL and RDBMS systems solve different sets of problems with different requirements. MongoDB provides a powerful JSON-based query language that combines much of the expressiveness of SQL with the flexibility of JSON. Along with select/from/where-style queries, MongoDB supports aggregation, sorting, joins, and nested arrays and collections, and offers indexes and many other features to improve query performance.
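
For illustration, a hedged PyMongo example of the query, aggregation (including a $lookup join), and indexing features mentioned above; the connection string, database, collection, and field names are assumptions.

from pymongo import ASCENDING, MongoClient

# Hypothetical connection; in a sharded cluster this would point at a mongos router.
# Sharding itself is configured at the cluster level (e.g. sh.shardCollection in the shell).
client = MongoClient("mongodb://localhost:27017")
db = client["shop"]
orders = db["orders"]

# select/from/where-style query: orders over 100 for one customer, newest first.
recent = orders.find({"customer_id": 42, "total": {"$gt": 100}}).sort("created_at", -1)

# Aggregation pipeline: revenue per customer, joined with the customers collection.
pipeline = [
    {"$group": {"_id": "$customer_id", "revenue": {"$sum": "$total"}}},
    {"$lookup": {"from": "customers", "localField": "_id",
                 "foreignField": "_id", "as": "customer"}},
    {"$sort": {"revenue": -1}},
]
top_customers = list(orders.aggregate(pipeline))

# A compound secondary index to speed up the query above.
orders.create_index([("customer_id", ASCENDING), ("total", ASCENDING)])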


Author(s):  
Krishnan Umachandran ◽  
Debra Sharon Ferdinand-James

Continued technological advancements of the 21st century afford massive data generation in sectors of our economy, including the domains of agriculture, manufacturing, and education. However, harnessing such large-scale data with modern technologies for effective decision-making is an evolving science that requires knowledge of Big Data management and analytics. Big data in agriculture, manufacturing, and education are varied, ranging from voluminous text to images and graphs. Applying Big Data science techniques (e.g., functional algorithms) for extracting intelligence affords decision makers quick responses to productivity, market resilience, and student enrollment challenges in today's unpredictable markets. This chapter employs data science for potential solutions to Big Data applications in agriculture, manufacturing, and, to a lesser extent, education, using modern technological tools such as Hadoop, Hive, Sqoop, and MongoDB.
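
As a small, hedged sketch of the Hadoop side of that toolchain, the following Python mapper and reducer follow the Hadoop Streaming convention of tab-separated key/value lines; counting terms in raw agricultural or enrollment text logs, the hypothetical file name streaming_wc.py, and the local test pipeline in the comments are assumptions for illustration, not the chapter's workflow.

#!/usr/bin/env python3
# Local test: cat logs.txt | python3 streaming_wc.py | sort | python3 streaming_wc.py reduce
# On a cluster, the same script is passed as mapper and reducer to hadoop-streaming.jar.
import sys
from itertools import groupby

def mapper(lines):
    # Emit (term, 1) for every whitespace-separated term in the input text.
    for line in lines:
        for term in line.strip().lower().split():
            print(f"{term}\t1")

def reducer(lines):
    # Hadoop sorts mapper output by key, so identical terms arrive together.
    parsed = (line.rstrip("\n").split("\t", 1) for line in lines)
    for term, group in groupby(parsed, key=lambda kv: kv[0]):
        print(f"{term}\t{sum(int(count) for _, count in group)}")

if __name__ == "__main__":
    if sys.argv[1:] == ["reduce"]:
        reducer(sys.stdin)
    else:
        mapper(sys.stdin)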


Author(s):  
Marmar Moussa ◽  
Steven A. Demurjian

This chapter presents a survey of the most important security and privacy issues related to large-scale data sharing and mining in big data, with a focus on differential privacy as a promising approach for achieving privacy, especially in the statistical databases often used in healthcare. A case study applying differential privacy in the healthcare domain is presented, and the chapter analyzes and compares the major differentially private data release strategies and noise mechanisms, such as the Laplace and exponential mechanisms. The background section discusses several security and privacy approaches in big data, including authentication and encryption protocols and privacy-preserving techniques such as k-anonymity. Next, the chapter introduces the differential privacy concepts used in the interactive and non-interactive data sharing models and the various noise mechanisms used. An instrumental case study then examines the effect of applying differential privacy in analytics. The chapter explores future trends and, finally, provides a conclusion.
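
To make the comparison concrete, here is a minimal Python sketch of the exponential mechanism the chapter discusses, used to noisily select the most frequent diagnosis code from an illustrative statistical table; the epsilon value, codes, and counts are assumptions.

import math
import random

EPSILON = 0.5  # illustrative privacy budget

def exponential_mechanism(candidates, utility, sensitivity):
    # Pick a candidate with probability proportional to exp(eps * u / (2 * sensitivity)).
    weights = [math.exp(EPSILON * utility(c) / (2 * sensitivity)) for c in candidates]
    total = sum(weights)
    r = random.random() * total
    cumulative = 0.0
    for candidate, weight in zip(candidates, weights):
        cumulative += weight
        if r <= cumulative:
            return candidate
    return candidates[-1]

# Illustrative record counts per diagnosis code in a statistical health database.
counts = {"E11": 120, "I10": 98, "J45": 45}
# Utility of releasing a code is its count; one patient changes any count by at most 1.
released = exponential_mechanism(list(counts), lambda code: counts[code], sensitivity=1)
print("noisily selected most frequent diagnosis:", released)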


Author(s):  
Ritesh Anilkumar Gangwal ◽  
Ratnadeep R. Deshmukh ◽  
M. Emmanuel

Big data, as the name suggests, refers to very large quantities of data being processed. With the advent of social media, the data now available includes text, images, audio, and video, and the need to process data in such a variety of formats led to the concept of Big Data processing. Big data techniques evolved to overcome these challenges, and various tools such as MapReduce are available; to experience a cloud-based tool, this chapter works with Microsoft Azure, an integrated environment for Big Data analytics on a SaaS cloud platform. For the experiment, prostate cancer data is used to perform predictive analysis of cancer growth in the gland. An experiment based on the segmentation results of prostate MRI scans performs the predictive analytics using an SVM. Performance analysis with ROC curves, accuracy, and a confusion matrix provides the resulting analysis with visual artifacts. With the trained model, the proposed experiment can statistically predict cancer growth.
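
A hedged outline of the evaluation step described, using scikit-learn with synthetic feature vectors standing in for the chapter's prostate MRI segmentation features; the data generation and SVM parameters are assumptions.

from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for per-region features extracted from segmented MRI scans.
X, y = make_classification(n_samples=400, n_features=12, n_informative=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# SVM classifier; probability=True enables the scores needed for the ROC analysis.
model = SVC(kernel="rbf", probability=True, random_state=0)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
y_score = model.predict_proba(X_test)[:, 1]

print("accuracy:", accuracy_score(y_test, y_pred))
print("confusion matrix:\n", confusion_matrix(y_test, y_pred))
print("ROC AUC:", roc_auc_score(y_test, y_score))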


Author(s):  
Lokukaluge P. Perera ◽  
Brage Mo

Modern vessels are monitored by onboard Internet of Things (IoT) sensors and data acquisition (DAQ) systems to observe ship performance and navigation conditions. Such IoT systems can create various challenges for the shipping industry in large-scale data handling situations. These large-scale data handling issues are often categorized as "Big Data" challenges, and this chapter discusses various solutions to overcome them, consisting of a data-handling framework with various data analytics under onboard IoT. The basis for these analytics is data-driven models developed around the engine-propeller combinator diagrams of vessels. The respective results on data classification, sensor fault detection, data compression and expansion, integrity verification and regression, and visualization and decision support are presented along the proposed data handling framework for a selected vessel. Finally, the results are useful for energy efficiency and system reliability applications in shipping.
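
One element of those analytics, sensor fault detection, can be sketched in Python as a simple z-score check against a clean reference window of engine RPM readings; the field names, threshold, and sample values below are hypothetical, not the authors' data-driven models.

import statistics

def detect_sensor_faults(reference_rpm, samples, threshold=3.0):
    # Flag samples whose engine RPM is far (in z-score terms) from a clean reference window.
    mean = statistics.mean(reference_rpm)
    stdev = statistics.pstdev(reference_rpm)
    if stdev == 0:
        return []
    return [s for s in samples if abs(s["engine_rpm"] - mean) / stdev > threshold]

# Hypothetical clean reference window of engine RPM readings.
reference_rpm = [92.0, 93.5, 91.8, 92.7, 93.1, 92.4]

# New onboard DAQ readings (engine RPM vs. propeller pitch).
readings = [
    {"engine_rpm": 92.9, "propeller_pitch": 0.71},
    {"engine_rpm": 250.0, "propeller_pitch": 0.71},  # flagged as a likely sensor fault
]
print(detect_sensor_faults(reference_rpm, readings))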


Author(s):  
Shraddha Pankaj Phansalkar ◽  
Ajay Dani

Contemporary web applications are deployed on cloud data stores to meet requirements such as low latency and high scalability. Although cloud-based database applications achieve high performance with these features, they compromise by offering weaker consistency levels. Rationing the consistency guarantees of an application is necessary to achieve improved application performance. The proposed work is a paradigm shift from monotonic transaction consistency to selective data consistency in web database applications. The selective data consistency model enforces consistency for critical data objects and leaves the consistency of non-critical data objects to the underlying cloud data store; this selective consistency results in better performance of cloud-based applications. The consistency of a data object is defined from the user's perspective with a user-friendly consistency metric called the Consistency Index (CI). The selective data consistency model is implemented on a cloud data store with an OLTP workload, and its performance is measured.
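
As a hedged illustration of selective consistency (not the authors' implementation), the Python sketch below uses MongoDB write concerns as the consistency knob and a placeholder Consistency Index threshold to route writes; the collection names, threshold, and connection string are assumptions.

from pymongo import MongoClient
from pymongo.write_concern import WriteConcern

client = MongoClient("mongodb://localhost:27017")  # hypothetical connection
db = client["shop"]

# Critical data objects (e.g. account balances) get strong, majority-acknowledged writes.
critical = db.get_collection("accounts", write_concern=WriteConcern(w="majority", j=True))
# Non-critical data objects (e.g. page-view counters) accept weaker, faster acknowledgement.
non_critical = db.get_collection("page_views", write_concern=WriteConcern(w=1))

def write(document, consistency_index):
    # Route a write by a (placeholder) Consistency Index threshold: high CI => strict path.
    target = critical if consistency_index >= 0.8 else non_critical
    target.insert_one(document)

write({"account_id": 7, "balance": 120.50}, consistency_index=0.95)
write({"page": "/home", "views": 1}, consistency_index=0.10)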


Author(s):  
Meenu Gupta ◽  
Neha Singla

Data can be anything, but the extraction of useful information from a large database is known as data mining. Cloud computing refers to a pool of computing resources used to store and process huge amounts of data, and it can be correlated with data mining and Big Data Hadoop. Big data is high-volume, high-velocity, and/or high-variety information assets that require new forms of processing to enable enhanced decision making, insight discovery, and process optimization. Data growth, speed, and complexity are accompanied by the deployment of smart sensors and devices that transmit data (commonly called the Internet of Things), multimedia, and other sources of semi-structured and structured data. Big Data is a core element of nearly every digital transformation today.


Author(s):  
Sharvari C. Tamane ◽  
Vijender K. Solanki ◽  
Madhuri S. Joshi

The chapter is built on two important pillars: the basics of Big Data and its security concerns. It is organized into sections, starting with the basics of Big Data and concluding with security concerns, and is enriched with examples from different categories to make the text easy to understand. The chapter begins with an introduction to Big Data and its memory sizes, followed by examples, and explains the categories of Big Data as structured, semi-structured, and unstructured data. A discussion of operational data services and Big Data applications is also included to ensure a basic understanding for readers. The second portion of the chapter addresses security in Big Data, explaining its issues and challenges. This section also focuses on the paradigm shift from the cloud environment to the Big Data environment and the problems encountered by organizations. It discusses framework issues and concludes with the necessity of understanding security in Big Data, in view of the expansion of information technology infrastructure in the 21st century.

