Big Data Anonymization Requirements vs Privacy Models

Big data privacy preservation is one of the most pressing issues in industry today. Privacy problems often go unidentified when input data is published to a cloud environment. Privacy preservation in Hadoop concerns hiding sensitive values in an input dataset before publishing it to the distributed environment. This paper investigates big data anonymization for privacy preservation from the perspectives of scalability and execution time. Many cloud applications that anonymize big data currently face the same kinds of problems. To address them, the paper introduces a data anonymization algorithm, Two-Phase Top-Down Specialization (TPTDS), implemented in Hadoop. The input was the adult dataset: 45,222 records with 15 attributes. The proposed algorithm is implemented with multidimensional anonymization in the MapReduce framework, which increases the efficiency of the big data processing system. Experiments ran TPTDS on Hadoop in both one-dimensional and multidimensional MapReduce frameworks; multidimensional anonymization gave the better result on the adult dataset, as shown by the IGPL values the algorithm generated. Data sets are generalized in a top-down manner, and anonymization is performed through specialization operations on a taxonomy tree. The experiments show that, compared with the existing algorithm, the solution improves the IGPL values and the anonymity parameter and decreases the execution time of big data privacy preservation. These results are applicable to distributed environments.
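The abstract does not define IGPL or show a specialization step. As a rough illustration only, the sketch below scores one specialization on a taxonomy tree using the definition common in the top-down specialization literature, IGPL = information gain / (privacy loss + 1); that this is the exact formula used in the paper is an assumption. The taxonomy, the toy records, and the function names are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2

# Hypothetical taxonomy tree for an "education" attribute (parent -> children).
TAXONOMY = {
    "Any": ["Secondary", "University"],
    "Secondary": ["11th", "12th"],
    "University": ["Bachelors", "Masters"],
}

# Toy records: (raw education value, class label). Not the paper's data.
RECORDS = [
    ("11th", "<=50K"), ("11th", "<=50K"), ("12th", "<=50K"), ("12th", "<=50K"),
    ("Bachelors", ">50K"), ("Bachelors", ">50K"), ("Masters", ">50K"), ("Masters", ">50K"),
]

def leaves(node):
    """Return all leaf values under a taxonomy node."""
    children = TAXONOMY.get(node)
    if not children:
        return [node]
    return [leaf for child in children for leaf in leaves(child)]

def specialize(raw_value, parent):
    """Return the child of `parent` that covers `raw_value`."""
    for child in TAXONOMY[parent]:
        if raw_value in leaves(child):
            return child
    raise ValueError(f"{raw_value!r} is not under {parent!r}")

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def igpl(records, parent):
    """Score specializing `parent`, assuming every record is currently generalized to it."""
    groups = {}
    for raw_value, label in records:
        groups.setdefault(specialize(raw_value, parent), []).append(label)
    all_labels = [label for _, label in records]
    info_gain = entropy(all_labels) - sum(
        len(g) / len(records) * entropy(g) for g in groups.values()
    )
    # Anonymity before: one group holding every record. After: the smallest child group.
    privacy_loss = len(records) - min(len(g) for g in groups.values())
    return info_gain / (privacy_loss + 1)

print(igpl(RECORDS, "Any"))
```

On these toy records the specialization separates the two classes completely (information gain 1.0) and shrinks the smallest group from 8 to 4 records (privacy loss 4), so the score is 1.0 / (4 + 1) = 0.2. A top-down algorithm would repeatedly apply the highest-scoring specialization that still satisfies the anonymity requirement.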

Two-phase entropy based approach to big data anonymization

2016 International Conference on Computing, Communication and Automation (ICCCA) ◽

10.1109/ccaa.2016.7813693 ◽

2016 ◽

Cited By ~ 2

Author(s):

Ashish Ranjan ◽

Prabhat Ranjan

Keyword(s):

Big Data ◽

Two Phase ◽

Data Anonymization

Population-Based Linkage of Big Data in Dental Research

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph15112357 ◽

2018 ◽

Vol 15 (11) ◽

pp. 2357 ◽

Cited By ~ 6

Author(s):

Tim Joda ◽

Tuomas Waltimo ◽

Christiane Pauli-Magnus ◽

Nicole Probst-Hensch ◽

Nicola Zitzmann

Keyword(s):

Clinical Trials ◽

Big Data ◽

Healthcare Services ◽

Population Based ◽

Services Research ◽

Research Education ◽

Data Anonymization ◽

Dental Research ◽

Epidemiological Surveys ◽

Patient Consent

Population-based linkage of patient-level information opens new strategies for dental research to identify unknown correlations of diseases, prognostic factors, and novel treatment concepts and to evaluate healthcare systems. As clinical trials have become more complex and inefficient, register-based controlled (clinical) trials (RC(C)T) are a promising approach in dental research. RC(C)Ts provide comprehensive information on hard-to-reach populations and allow observations with minimal loss to follow-up, but they require large sample sizes while generating a high level of external validity. Collecting data is only valuable if it is done systematically according to harmonized and inter-linkable standards involving a universally accepted general patient consent. Secure data anonymization is crucial, but potential re-identification of individuals poses several challenges. Population-based linkage of big data is a game changer for epidemiological surveys in Public Health and will play a predominant role in future dental research by influencing healthcare services, research, education, biotechnology, insurance, social policy and governmental affairs.

Scalable Two-Phase Top-Down Specification for Big Data Anonymization Using Apache Pig

Advances in Intelligent Systems and Computing - Advances in Artificial Intelligence and Data Engineering ◽

10.1007/978-981-15-3514-7_75 ◽

2020 ◽

pp. 1009-1021

Author(s):

Anushree Raj ◽

Rio D’Souza

Keyword(s):

Big Data ◽

Top Down ◽

Two Phase ◽

Data Anonymization ◽

Apache Pig

A Secure Protocol for High-Dimensional Big Data Providing Data Privacy

Research Anthology on Privatizing and Securing Data ◽

10.4018/978-1-7998-8954-0.ch015 ◽

2021 ◽

pp. 327-343

Author(s):

Anitha J. ◽

Prasad S. P.

Keyword(s):

Big Data ◽

Data Storage ◽

Data Privacy ◽

Personal Information ◽

Technological Development ◽

High Dimensional ◽

Sensitive Information ◽

Data Anonymization ◽

Secure Protocol ◽

Data Owner

Due to recent technological development, the huge amount of data generated by social networking, sensor networks, the internet, etc., adds more challenges to data storage and processing tasks. During privacy-preserving data publishing (PPDP), the collected data may contain sensitive information about the data owner. Directly releasing it for further processing may violate the privacy of the data owner, so the data must be modified so that it does not disclose any personal information. Existing data anonymization techniques have a fixed scheme with a small number of dimensions. There are various types of attacks on data privacy, such as linkage attacks, homogeneity attacks, and background knowledge attacks. To provide an effective technique for maintaining data privacy in big data and preventing linkage attacks, this paper proposes a privacy-preserving protocol, UNION, for multi-party data providers. Experiments show that, compared with existing anonymization techniques, this technique provides better data utility on high-dimensional data and better scalability with respect to data size.
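To make the linkage attack mentioned in this abstract concrete, here is a minimal sketch, assuming a released table without names and an external table that an attacker already holds. It is not the UNION protocol, whose details the abstract does not give; the tables, field names, and the `generalize` helper are hypothetical.

```python
from collections import defaultdict

def linkage_attack(released, external, quasi_ids):
    """Return {name: diagnosis} for people whose quasi-identifiers match exactly one released row."""
    index = defaultdict(list)
    for row in released:
        index[tuple(row[q] for q in quasi_ids)].append(row)
    reidentified = {}
    for person in external:
        matches = index.get(tuple(person[q] for q in quasi_ids), [])
        if len(matches) == 1:  # a unique match links the person to a sensitive value
            reidentified[person["name"]] = matches[0]["diagnosis"]
    return reidentified

def generalize(row):
    """Coarsen the quasi-identifiers: keep a 3-digit ZIP prefix and a 10-year age band."""
    low = row["age"] // 10 * 10
    return {**row, "zip": row["zip"][:3] + "**", "age": f"{low}-{low + 9}"}

# Hypothetical released data (names removed) and attacker-held data (names known).
released = [
    {"zip": "47677", "age": 29, "diagnosis": "flu"},
    {"zip": "47602", "age": 22, "diagnosis": "asthma"},
    {"zip": "47678", "age": 27, "diagnosis": "diabetes"},
]
external = [
    {"name": "Alice", "zip": "47677", "age": 29},
    {"name": "Bob", "zip": "47602", "age": 22},
]

# Raw release: both people are uniquely re-identified.
print(linkage_attack(released, external, ["zip", "age"]))

# Generalized release: all three rows share one quasi-identifier group, so no unique match.
print(linkage_attack([generalize(r) for r in released],
                     [generalize(p) for p in external], ["zip", "age"]))
```

After generalization every released row falls into the same group of three, so the attacker can no longer single out a diagnosis. The cost is coarser data, which is the utility trade-off the abstract refers to.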

Toward a Posthumanist Ethics of Qualitative Research in a Big Data Era

American Behavioral Scientist ◽

10.1177/0002764218792701 ◽

2018 ◽

Vol 63 (6) ◽

pp. 669-698 ◽

Cited By ~ 4

Author(s):

Natasha S. Mauthner

Keyword(s):

Qualitative Research ◽

Big Data ◽

Ethical Issues ◽

Ethical Concern ◽

Power Relationship ◽

Ethical Practice ◽

Research Subjects ◽

Data Anonymization ◽

Research Participants ◽

The World

The Big Data phenomenon, and its uptake in qualitative research, raises ethical issues around data aggregation, data linkages, and data anonymization as well as concerns around changing meanings and possibilities of informed consent and privacy protection. In this article, I address the ethical issues that arise from Big Data through a posthumanist philosophical framework. The humanist ethics that underpins normative ethical concerns—as outlined above—focuses on the unequal power relationship between researchers and research subjects and the potential harm that research can cause to research participants. Ethical practice consists in following guidelines and codes of ethical conduct designed, not so much to avoid these power differentials, but to protect research participants from potential exploitation and infringements of their human rights. Unethical research is understood as research that breaches these principles and/or harms its research subjects. A posthumanist ethics treats knowledge-making itself as a matter of ethical concern. It shifts the focus away from the power of researchers over research participants toward the “world-making” powers of practices of inquiry: their ability to constitute (and not simply discover) the very nature of their objects/subjects of study. Its focus of ethical concern—what it regards as unethical—is research that claims to represent the world “as it really is.” On this approach, ethical practice consists in accounting for the ways in which research ontologically constitutes its objects and subjects of study. The critical intervention made possible by bringing a posthumanist perspective to bear on the ethics of qualitative research in a Big Data era is to foreground Big Data’s treatment of data as self-evident, and its positivist claim to represent the world innocently, accurately, and objectively, as matters of ethical concern.

Semantic-based graph data anonymization for big data analysis

2016 International Conference on Machine Learning and Cybernetics (ICMLC) ◽

10.1109/icmlc.2016.7872955 ◽

2016 ◽

Cited By ~ 1

Author(s):

Shu-Ming Hsieh ◽

Mao-Hsu Yen ◽

Li-Jen Kao

Keyword(s):

Big Data ◽

Data Analysis ◽

Big Data Analysis ◽

Graph Data ◽

Data Anonymization

A Secure Protocol for High-Dimensional Big Data Providing Data Privacy

Handbook of Research on Machine and Deep Learning Applications for Cyber Security - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-5225-9611-0.ch016 ◽

2020 ◽

pp. 347-363

Author(s):

Anitha J. ◽

Prasad S. P.

Keyword(s):

Big Data ◽

Data Storage ◽

Data Privacy ◽

Personal Information ◽

Technological Development ◽

High Dimensional ◽

Sensitive Information ◽

Data Anonymization ◽

Secure Protocol ◽

Data Owner

Due to recent technological development, the huge amount of data generated by social networking, sensor networks, the internet, etc., adds more challenges to data storage and processing tasks. During privacy-preserving data publishing (PPDP), the collected data may contain sensitive information about the data owner. Directly releasing it for further processing may violate the privacy of the data owner, so the data must be modified so that it does not disclose any personal information. Existing data anonymization techniques have a fixed scheme with a small number of dimensions. There are various types of attacks on data privacy, such as linkage attacks, homogeneity attacks, and background knowledge attacks. To provide an effective technique for maintaining data privacy in big data and preventing linkage attacks, this paper proposes a privacy-preserving protocol, UNION, for multi-party data providers. Experiments show that, compared with existing anonymization techniques, this technique provides better data utility on high-dimensional data and better scalability with respect to data size.

A Scalable and Cost-Effective Data Anonymization over Big Data using MapReduce on Cloud

i-manager’s Journal on Cloud Computing ◽

10.26634/jcc.2.2.3449 ◽

2015 ◽

Vol 2 (2) ◽

pp. 31-39 ◽

Cited By ~ 1

Author(s):

Shalin Elizabeth. S ◽

S. Sarju

Keyword(s):

Big Data ◽

Cost Effective ◽

Data Anonymization
