Machine Learning Concepts for Correlated Big Data Privacy

Abstract: With data becoming a salient asset worldwide, dependence within data has kept growing, so the real-world datasets one works with today are highly correlated. Over the past few years, researchers have turned their attention to this aspect of data privacy and found that, where correlation exists among data, existing privacy algorithms cannot assure their stated privacy guarantees. Those guarantees were sufficient when no relation existed between the data in a dataset. Taking data correlation into account, there is therefore a pressing need to reconsider the privacy algorithms. Some research has used a well-known machine learning concept, data correlation analysis, to better understand the relationships between data, and this has given some promising results. A modest but still considerable amount of research has been done on correlated data privacy, but correlated big data privacy remains largely unexplored. The real-world datasets in use are often large (technically termed big data) and contain a high degree of data correlation. Hence, there is a serious need to understand and propose solutions for correlated big data privacy.


Machine learning concepts for correlated Big Data privacy

Journal Of Big Data ◽

10.1186/s40537-021-00530-x ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Sreemoyee Biswas ◽

Nilay Khare ◽

Pragati Agrawal ◽

Priyank Jain

Keyword(s):

Machine Learning ◽

Big Data ◽

Correlation Analysis ◽

Real World ◽

Data Privacy ◽

Correlated Data ◽

Data Correlation ◽

The Real ◽

Big Data Privacy ◽

Real World Datasets

Abstract: With data becoming a salient asset worldwide, dependence amongst data has kept growing. Hence the real-world datasets that one works with today are highly correlated. Over the past few years, researchers have turned their attention to this aspect of data privacy and found that, where correlation exists among data, existing privacy algorithms cannot assure the expected privacy guarantees. The privacy guarantees provided by existing algorithms were sufficient when no relation existed between the data in a dataset. Hence, taking data correlation into account, there is a pressing need to reconsider the privacy algorithms. Some research has considered using a well-known machine learning concept, data correlation analysis, to better understand the relationships between data. This concept has given some promising results as well. Though still limited, a considerable amount of research has been done on correlated data privacy. Researchers have provided solutions using probabilistic models, behavioral analysis, sensitivity analysis, information theory models, statistical correlation analysis, exhaustive combination analysis, temporal privacy leakages, and weighted hierarchical graphs. Nevertheless, the real-world datasets that researchers work with are often large (technically termed big data) and contain a high degree of data correlation. First, the data correlation in big data must be studied, and researchers are exploring different analysis techniques to find the most suitable one. Then, a measure to guarantee privacy for correlated big data might be proposed. This survey paper presents a detailed survey of the methods proposed by different researchers to deal with the problem of correlated data privacy and correlated big data privacy, and highlights the future scope in this area. The quantitative analysis of the reviewed articles suggests that data correlation is a significant threat to data privacy, and that this threat is magnified further with big data. When data correlation is considered and analyzed, parameters such as maximum queries executed and mean average error show better results than with other methods. Hence, there is a serious need to understand and propose solutions for correlated big data privacy.
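
As a rough illustration of the statistical correlation analysis this abstract mentions, the sketch below computes the Pearson coefficient between two columns of a dataset. The column names and values are hypothetical, and Pearson is only one common measure of correlation; the abstract does not say which measures the surveyed works rely on.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric columns."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("columns must have the same length, at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Unnormalized covariance and variances; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical columns: two attributes that move together record by record.
hours_online = [1.0, 2.0, 3.0, 4.0]
messages_sent = [2.0, 4.0, 6.0, 8.0]
r = pearson(hours_online, messages_sent)
```

A coefficient near 1 or -1 indicates that knowing one attribute reveals a lot about the other, which is the kind of dependence that can weaken guarantees designed for independent records.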


Big Data Privacy and Challenges for Machine Learning

2020 Fourth International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC) ◽

10.1109/i-smac49090.2020.9243527 ◽

2020 ◽

Author(s):

Puneet Himthani ◽

Ghanshyam Prasad Dubey ◽

Brij Mohan Sharma ◽

Ankur Taneja

Keyword(s):

Machine Learning ◽

Big Data ◽

Data Privacy ◽

Big Data Privacy


International Law and New Challenges to Democracy in the Digital Age: Big Data, Privacy and Interferences with the Political Process

SSRN Electronic Journal ◽

10.2139/ssrn.3430035 ◽

2019 ◽

Author(s):

Dominik Steiger

Keyword(s):

Big Data ◽

International Law ◽

Data Privacy ◽

Political Process ◽

Digital Age ◽

The Political ◽

Big Data Privacy ◽

New Challenges


Big Data Privacy Preservation Using Two Phase Top-Down Specialization Algorithm with Multidimensional Map Reduce Framework on Hadoop

International Journal of Distributed and Cloud Computing ◽

10.21863/ijdcc/2015.3.2.009 ◽

2015 ◽

Vol 3 (2) ◽

Author(s):

Shalin Eliabeth S. ◽

Sarju S.

Keyword(s):

Big Data ◽

Data Privacy ◽

Privacy Preservation ◽

Experimental Result ◽

Map Reduce ◽

Distributed Environment ◽

Top Down ◽

Two Phase ◽

Data Anonymization ◽

Big Data Privacy

Big data privacy preservation is one of the most troubling issues in industry today. Data privacy problems sometimes go unidentified when input data is published in a cloud environment. Data privacy preservation in Hadoop deals with hiding the input dataset and publishing it to the distributed environment. This paper investigates the problem of big data anonymization for privacy preservation from the perspectives of scalability, time, and related factors. At present, many cloud applications that rely on big data anonymization face the same kinds of problems. To address them, the paper introduces a data anonymization algorithm called Two-Phase Top-Down Specialization (TPTDS), implemented in Hadoop. For the anonymization, 45,222 records of adult information with 15 attribute values were taken as the input big data. The proposed TPTDS algorithm is implemented in Hadoop using multidimensional anonymization in the MapReduce framework, which increases the efficiency of the big data processing system. Experiments with the TPTDS algorithm on Hadoop in both one-dimensional and multidimensional MapReduce frameworks show better results for multidimensional anonymization on the input adult dataset. Datasets are generalized in a top-down manner, and the better result in the multidimensional MapReduce framework is reflected in the better IGPL values generated by the algorithm. The anonymization is performed with specialization operations on a taxonomy tree. The experiments show that the solution improves the IGPL values and the anonymity parameter and decreases the execution time of big data privacy preservation compared to the existing algorithm. This experimental result could lead to valuable applications in distributed environments.
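
The idea of specializing values on a taxonomy tree from the top down can be sketched as follows. This is a heavily simplified, single-attribute illustration and not the TPTDS algorithm itself: the taxonomy, the records, and the function names are hypothetical, there is no MapReduce or Hadoop component, and a plain "every group keeps at least k records" check stands in for the IGPL scoring the abstract refers to.

```python
from collections import Counter

# Hypothetical taxonomy for one attribute: parent -> children.
TAXONOMY = {
    "Any": ["Secondary", "University"],
    "Secondary": ["9th", "10th"],
    "University": ["Bachelors", "Masters"],
}


def leaves_under(node):
    """Return the set of leaf values covered by a taxonomy node."""
    children = TAXONOMY.get(node)
    if not children:
        return {node}
    covered = set()
    for child in children:
        covered |= leaves_under(child)
    return covered


def generalize(value, cut):
    """Map a raw leaf value to the node of the current cut that covers it."""
    for node in cut:
        if value in leaves_under(node):
            return node
    raise ValueError(f"{value!r} is not covered by the cut")


def top_down_specialize(values, k):
    """Start at the root and specialize while every group keeps >= k records."""
    cut = ["Any"]
    changed = True
    while changed:
        changed = False
        for node in list(cut):
            children = TAXONOMY.get(node)
            if not children:
                continue  # leaf node: cannot be specialized further
            trial = [n for n in cut if n != node] + children
            counts = Counter(generalize(v, trial) for v in values)
            # Accept the specialization only if no child group falls below k.
            if all(counts[c] == 0 or counts[c] >= k for c in children):
                cut = trial
                changed = True
    return [generalize(v, cut) for v in values]


records = ["9th", "9th", "10th", "Bachelors", "Bachelors", "Masters", "Masters"]
anonymized = top_down_specialize(records, k=2)
```

With k=2, the "University" node can be specialized because both of its children cover two records, while "Secondary" stays generalized because "10th" would otherwise form a group of one.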


Treatment of Mycobacterium avium–intracellulare complex lung disease in the real world: a retrospective big data analysis

Drugs & Therapy Perspectives ◽

10.1007/s40267-019-00687-9 ◽

2019 ◽

Vol 36 (2) ◽

pp. 75-82

Author(s):

Tomohide Iwao ◽

Genta Kato ◽

Isao Ito ◽

Toyohiro Hirai ◽

Tomohiro Kuroda

Keyword(s):

Big Data ◽

Data Analysis ◽

Lung Disease ◽

Mycobacterium Avium ◽

Real World ◽

Big Data Analysis ◽

The Real ◽

Mycobacterium Avium Intracellulare


A Review on Big Data: Privacy and Security Challenges

2021 3rd International Conference on Signal Processing and Communication (ICSPC) ◽

10.1109/icspc51351.2021.9451749 ◽

2021 ◽

Author(s):

Parth Goel ◽

Radhika Patel ◽

Dweepna Garg ◽

Amit Ganatra

Keyword(s):

Big Data ◽

Data Privacy ◽

Privacy And Security ◽

Security Challenges ◽

Big Data Privacy


Big Data Privacy Breach Prevention Strategies

2020 IEEE International Symposium on Sustainable Energy, Signal Processing and Cyber Security (iSSSC) ◽

10.1109/isssc50941.2020.9358878 ◽

2020 ◽

Author(s):

Shipra Varshney ◽

Dheeraj Munjal ◽

Orijit Bhattacharya ◽

Shagun Saboo ◽

Nikunj Aggarwal

Keyword(s):

Big Data ◽

Data Privacy ◽

Prevention Strategies ◽

Privacy Breach ◽

Big Data Privacy


Clinical efficacy of chloroquine derivatives in COVID-19 infection: comparative meta-analysis between the big data and the real world

New Microbes and New Infections ◽

10.1016/j.nmni.2020.100709 ◽

2020 ◽

Vol 38 ◽

pp. 100709 ◽

Cited By ~ 20

Author(s):

M. Million ◽

P. Gautret ◽

P. Colson ◽

Y. Roussel ◽

G. Dubourg ◽

...

Keyword(s):

Big Data ◽

Clinical Efficacy ◽

Real World ◽

Meta Analysis ◽

The Real


How to train your robot with deep reinforcement learning: lessons we have learned

The International Journal of Robotics Research ◽

10.1177/0278364920987859 ◽

2021 ◽

pp. 027836492098785

Author(s):

Julian Ibarz ◽

Jie Tan ◽

Chelsea Finn ◽

Mrinal Kalakrishnan ◽

Peter Pastor ◽

...

Keyword(s):

Machine Learning ◽

Reinforcement Learning ◽

Case Studies ◽

Real World ◽

Review Article ◽

The Real ◽

Complex Skills ◽

Real World Learning ◽

Level Sensor ◽

Embodied Agent

Deep reinforcement learning (RL) has emerged as a promising approach for autonomously acquiring complex behaviors from low-level sensor observations. Although a large portion of deep RL research has focused on applications in video games and simulated control, which does not connect with the constraints of learning in real environments, deep RL has also demonstrated promise in enabling physical robots to learn complex skills in the real world. At the same time, real-world robotics provides an appealing domain for evaluating such algorithms, as it connects directly to how humans learn: as an embodied agent in the real world. Learning to perceive and move in the real world presents numerous challenges, some of which are easier to address than others, and some of which are often not considered in RL research that focuses only on simulated domains. In this review article, we present a number of case studies involving robotic deep RL. Building off of these case studies, we discuss commonly perceived challenges in deep RL and how they have been addressed in these works. We also provide an overview of other outstanding challenges, many of which are unique to the real-world robotics setting and are not often the focus of mainstream RL research. Our goal is to provide a resource both for roboticists and machine learning researchers who are interested in furthering the progress of deep RL in the real world.


LRDM: Local Record-Driving Mechanism for Big Data Privacy Preservation in Social Networks

2016 IEEE First International Conference on Data Science in Cyberspace (DSC) ◽

10.1109/dsc.2016.94 ◽

2016 ◽

Cited By ~ 2

Author(s):

Weihao Li ◽

Hui Li

Keyword(s):

Social Networks ◽

Big Data ◽

Data Privacy ◽

Privacy Preservation ◽

Driving Mechanism ◽

Big Data Privacy
