- Departments Involved in Enterprise Data Anonymization Program

Big data privacy preservation is one of the most pressing issues in current industry. Data privacy problems sometimes go unidentified when input data is published in a cloud environment. Data privacy preservation in Hadoop deals with hiding sensitive content of an input dataset before publishing it to the distributed environment. This paper investigates the problem of big data anonymization for privacy preservation from the perspectives of scalability and execution time. At present, many cloud applications that anonymize big data face the same kinds of problems. To address them, the paper introduces a data anonymization algorithm called Two-Phase Top-Down Specialization (TPTDS), implemented in Hadoop. The input big data consisted of 45,222 records of adult information with 15 attribute values. The proposed TPTDS algorithm was implemented in Hadoop using multidimensional anonymization in the MapReduce framework, with the aim of increasing the efficiency of the big data processing system. Experiments were conducted with TPTDS on Hadoop in both one-dimensional and multidimensional MapReduce frameworks, and multidimensional anonymization gave the better result on the input adult dataset. Data sets are generalized in a top-down manner, and the better result of the multidimensional MapReduce framework is reflected in the better IGPL values generated by the algorithm. The anonymization was performed with specialization operations on a taxonomy tree. The experiments show that the solution improves the IGPL values and the anonymity parameter, and decreases the execution time of big data privacy preservation compared with the existing algorithm. These experimental results should be of use to applications in distributed environments.
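The idea of specialization on a taxonomy tree can be illustrated with a minimal single-machine Python sketch. This is not the TPTDS implementation from the paper: the taxonomy, the attribute values, and the function names are hypothetical, and the sketch accepts the first specialization that keeps every group at size k or more instead of ranking candidates by IGPL.

```python
from collections import Counter

# Hypothetical taxonomy for a single attribute: parent -> children.
TAXONOMY = {
    "Any": ["Secondary", "University"],
    "Secondary": ["9th", "10th"],
    "University": ["Bachelors", "Masters"],
}


def build_parent_map(taxonomy):
    """Invert the taxonomy so each child points at its parent."""
    return {child: parent
            for parent, children in taxonomy.items()
            for child in children}


def generalize(value, cut, parents):
    """Walk up from a leaf value until a node of the current cut is reached."""
    while value not in cut:
        value = parents[value]
    return value


def min_group_size(values, cut, parents):
    """Size of the smallest non-empty group under the given cut."""
    counts = Counter(generalize(v, cut, parents) for v in values)
    return min(counts.values())


def specialize(values, k, taxonomy=TAXONOMY, root="Any"):
    """Start fully generalized and specialize while k-anonymity holds."""
    parents = build_parent_map(taxonomy)
    cut = {root}
    changed = True
    while changed:
        changed = False
        for node in sorted(cut):
            children = taxonomy.get(node)
            if not children:
                continue  # leaf node, nothing to specialize
            candidate = (cut - {node}) | set(children)
            if min_group_size(values, candidate, parents) >= k:
                cut = candidate
                changed = True
                break
    return cut


VALUES = ["9th"] * 3 + ["10th"] + ["Bachelors"] * 2 + ["Masters"] * 2
print(specialize(VALUES, k=2))
```

With these sample values and k=2, "University" can be split into its children because both resulting groups keep two records, while "Secondary" stays generalized because splitting it would isolate the single "10th" record.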

Attribute Compartmentation and Greedy UCC Discovery for High-Dimensional Data Anonymization

Proceedings of the Ninth ACM Conference on Data and Application Security and Privacy - CODASPY '19 ◽

10.1145/3292006.3300019 ◽

2019 ◽

Cited By ~ 1

Author(s):

Nikolai J. Podlesny ◽

Anne V.D.M. Kayem ◽

Christoph Meinel

Keyword(s):

High Dimensional Data ◽

High Dimensional ◽

Data Anonymization

Application of data anonymization in Learning Analytics

Proceedings of the 3rd International Conference on Applications of Intelligent Systems ◽

10.1145/3378184.3378229 ◽

2020 ◽

Author(s):

Janneth Chicaiza ◽

Ma. Carmen Cabrera-Loayza ◽

Rene Elizalde ◽

Nelson Piedra

Keyword(s):

Learning Analytics ◽

Data Anonymization

Two-phase entropy based approach to big data anonymization

2016 International Conference on Computing, Communication and Automation (ICCCA) ◽

10.1109/ccaa.2016.7813693 ◽

2016 ◽

Cited By ~ 2

Author(s):

Ashish Ranjan ◽

Prabhat Ranjan

Keyword(s):

Big Data ◽

Two Phase ◽

Data Anonymization

On the Role of Data Anonymization in Machine Learning Privacy

2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom) ◽

10.1109/trustcom50675.2020.00093 ◽

2020 ◽

Author(s):

Navoda Senavirathne ◽

Vicenc Torra

Keyword(s):

Machine Learning ◽

Data Anonymization

Population-Based Linkage of Big Data in Dental Research

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph15112357 ◽

2018 ◽

Vol 15 (11) ◽

pp. 2357 ◽

Cited By ~ 6

Author(s):

Tim Joda ◽

Tuomas Waltimo ◽

Christiane Pauli-Magnus ◽

Nicole Probst-Hensch ◽

Nicola Zitzmann

Keyword(s):

Clinical Trials ◽

Big Data ◽

Healthcare Services ◽

Population Based ◽

Services Research ◽

Research Education ◽

Data Anonymization ◽

Dental Research ◽

Epidemiological Surveys ◽

Patient Consent

Population-based linkage of patient-level information opens new strategies for dental research to identify unknown correlations of diseases, prognostic factors, and novel treatment concepts, and to evaluate healthcare systems. As clinical trials have become more complex and inefficient, register-based controlled (clinical) trials (RC(C)T) are a promising approach in dental research. RC(C)Ts provide comprehensive information on hard-to-reach populations and allow observations with minimal loss to follow-up, but require large sample sizes while generating a high level of external validity. Collecting data is only valuable if it is done systematically according to harmonized and inter-linkable standards involving a universally accepted general patient consent. Secure data anonymization is crucial, but potential re-identification of individuals poses several challenges. Population-based linkage of big data is a game changer for epidemiological surveys in Public Health and will play a predominant role in future dental research by influencing healthcare services, research, education, biotechnology, insurance, social policy and governmental affairs.

- The Different Phases of a Data Anonymization Program

The Complete Book of Data Anonymization ◽

10.1201/b13097-9 ◽

2013 ◽

pp. 74-87

Keyword(s):

Data Anonymization

SparkDA: RDD-Based High-Performance Data Anonymization Technique for Spark Platform

Network and System Security - Lecture Notes in Computer Science ◽

10.1007/978-3-030-36938-5_40 ◽

2019 ◽

pp. 646-662

Author(s):

Sibghat Ullah Bazai ◽

Julian Jang-Jaccard

Keyword(s):

High Performance ◽

Performance Data ◽

Data Anonymization

The issues connected with the anonymization of medical data. Part 2. Advanced anonymization and anonymization controlled by owner of protected sensitive data

HIGHER SCHOOL’S PULSE ◽

10.5604/01.3001.0003.3161 ◽

2014 ◽

Vol 8 (2) ◽

pp. 13-24 ◽

Cited By ~ 1

Author(s):

Arkadiusz Liber

Keyword(s):

Privacy Protection ◽

Differential Privacy ◽

Data Access ◽

Medical Data ◽

Medical Documentation ◽

Research Review ◽

Sensitive Data ◽

Data Anonymization ◽

Data Access Control ◽

Anonymized Data

Introduction: Medical documentation ought to be accessible with the preservation of its integrity as well as the protection of personal data. One way of protecting it against disclosure is anonymization. Contemporary methods ensure anonymity without the possibility of sensitive data access control. It seems that the future of sensitive data processing systems belongs to personalized methods. In the first part of the paper, the k-Anonymity, (X,Y)-Anonymity, (α,k)-Anonymity, and (k,e)-Anonymity methods were discussed. These are well-known elementary methods that are the subject of a significant number of publications. The works of Samarati, Sweeney, Wang, Wong and Zhang were used as the source papers for that part. The selection of these publications is justified by the wider research reviews led, for instance, by Fung, Wang, Fu and Y. However, it should be noted that the methods of anonymization derive from the methods of statistical database protection from the 1970s. Due to the interrelated content and literature references, the first and the second part of this article constitute an integral whole.

Aim of the study: The analysis of the methods of anonymization, the analysis of the methods of protection of anonymized data, and the study of a new type of privacy protection that lets the entity that the data concerns control the disclosure of sensitive data.

Material and methods: Analytical methods, algebraic methods.

Results: Material supporting the choice and analysis of ways of anonymizing medical data, and a new privacy protection solution enabling the control of sensitive data by the entities that the data concerns.

Conclusions: The paper analyzes solutions for data anonymization that ensure privacy protection in medical data sets. The methods of k-Anonymity, (X,Y)-Anonymity, (α,k)-Anonymity, (k,e)-Anonymity, (X,Y)-Privacy, LKC-Privacy, l-Diversity, (X,Y)-Linkability, t-Closeness, Confidence Bounding and Personalized Privacy were described, explained and analyzed. The analysis of solutions for controlling sensitive data by their owner was also conducted. Apart from the existing methods of anonymization, the analysis of methods of protection of anonymized data was included. In particular, the methods of δ-Presence, ε-Differential Privacy, (d,γ)-Privacy, (α,β)-Distributing Privacy and protections against (c,t)-isolation were analyzed. Moreover, the author introduced a new solution for the controlled protection of privacy. The solution is based on marking a protected field and on multi-key encryption of the sensitive value. The suggested way of marking the fields is in accordance with the XML standard. For the encryption, a cipher with (n,p) different keys was selected. To decipher the content, p of the n keys are used. The proposed solution makes it possible to apply brand new methods to control the privacy of disclosing sensitive data.
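Two of the elementary methods named in this abstract, k-Anonymity and l-Diversity, can be checked on an already generalized table with a short Python sketch. The table, the column names, and the function names are hypothetical and are not taken from the paper. The l-Diversity check uses the simple "distinct values" reading, which is only one of several variants.

```python
from collections import defaultdict


def equivalence_classes(rows, quasi_identifiers):
    """Group rows that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[q] for q in quasi_identifiers)
        groups[key].append(row)
    return groups


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every equivalence class holds at least k rows."""
    groups = equivalence_classes(rows, quasi_identifiers)
    return all(len(group) >= k for group in groups.values())


def is_l_diverse(rows, quasi_identifiers, sensitive, l):
    """True if every class has at least l distinct sensitive values."""
    groups = equivalence_classes(rows, quasi_identifiers)
    return all(len({row[sensitive] for row in group}) >= l
               for group in groups.values())


# Hypothetical, already generalized medical table.
ROWS = [
    {"age": "30-39", "zip": "130**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "130**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "148**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "148**", "diagnosis": "flu"},
]

QI = ["age", "zip"]
print(is_k_anonymous(ROWS, QI, k=2))
print(is_l_diverse(ROWS, QI, "diagnosis", l=2))
```

The sample table is 2-anonymous but not 2-diverse: the second class contains only one diagnosis, so an attacker who can place a person in that class learns the sensitive value. This is the kind of gap that motivates the methods beyond k-Anonymity listed above.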

A Survey on Data Anonymization Using Mapreduce on Cloud with Scalable Two-Phase Top-Down Approach

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.20.14773 ◽

2018 ◽

Vol 7 (2.20) ◽

pp. 254

Author(s):

M Dhasaratham ◽

R P. Singh

Keyword(s):

Large Scale ◽

Public Information ◽

Top Down ◽

Two Phase ◽

Data Anonymization ◽

Security Issues ◽

Cloud Applications ◽

Massive Information ◽

Broad Scale

Many services require users to share private data, such as electronic health records, for data analysis or mining, which raises privacy concerns. Anonymizing data sets by means of generalization to satisfy certain privacy requirements, for instance k-anonymity, is a widely used category of privacy-preserving techniques. At present, the scale of data in many cloud applications is increasing tremendously in line with the big data trend, making it a challenge for commonly used software tools to capture, manage, and process such large-scale data within a tolerable elapsed time. As a result, it is a challenge for existing anonymization approaches to achieve privacy preservation on privacy-sensitive large-scale data sets, because they lack scalability. In this paper, we propose a scalable two-phase top-down specialization (TDS) approach to anonymize large-scale data sets using the MapReduce framework on cloud. In both phases of our approach, we deliberately design a group of innovative MapReduce jobs to concretely accomplish the specialization computation in a highly scalable way. Experimental evaluation results demonstrate that with our approach, the scalability and efficiency of TDS can be significantly improved over existing approaches.
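As a rough illustration of why counting for k-anonymity fits the MapReduce model, the following Python sketch simulates a map step and a reduce step over in-memory partitions. It is not the authors' MapReduce job design: the record layout, the generalization function `by_decade`, and all other names are assumptions made for this example, and no Hadoop API is used.

```python
from collections import Counter
from functools import reduce


def map_partition(records, generalize):
    """Map step: count records per generalized group within one partition."""
    return Counter(generalize(record) for record in records)


def reduce_counts(partials):
    """Reduce step: merge the per-partition counts into global counts."""
    return reduce(lambda left, right: left + right, partials, Counter())


def violating_groups(partitions, generalize, k):
    """Groups whose global size is below k, i.e. that break k-anonymity."""
    totals = reduce_counts(map_partition(p, generalize) for p in partitions)
    return {group: size for group, size in totals.items() if size < k}


def by_decade(record):
    """Hypothetical generalization: age to decade, ZIP code to 3-digit prefix."""
    age, zipcode = record
    return (f"{age // 10 * 10}s", zipcode[:3] + "**")


# Two hypothetical partitions of (age, zip) records.
PARTITIONS = [
    [(34, "13053"), (36, "13068")],
    [(38, "13021"), (47, "14850")],
]

print(violating_groups(PARTITIONS, by_decade, k=2))
```

Each partition is counted independently, so the map step can run in parallel, and the reduce step only has to merge small count tables. In this sample, the single record in the ("40s", "148**") group is reported as violating 2-anonymity.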
