Privacy Preserving Data Mining

Author(s):  
Aris Gkoulalas-Divanis ◽  
Vassilios S. Verykios

Since its inception in 2000, privacy preserving data mining has gained increasing popularity in the data mining research community. This line of research can be primarily attributed to the growing concern of individuals, organizations and the government regarding the violation of privacy in the mining of their data by the existing data mining technology. As a result, a whole new body of research was introduced to allow for the mining of data, while at the same time prohibiting the leakage of any private and sensitive information. In this chapter, the authors introduce the readers to the field of privacy preserving data mining; they discuss the reasons that led to its inception, the most prominent research directions, as well as some important methodologies per direction. Following that, the authors focus their attention on very recently investigated methodologies for offering privacy during the mining of user mobility data. At the end of the chapter, they provide a roadmap along with potential future research directions both with respect to the field of privacy-aware mobility data mining and to privacy preserving data mining at large.

2014 ◽  
Vol 11 (2) ◽  
pp. 163-170
Author(s):  
Binli Wang ◽  
Yanguang Shen

Recently, with the rapid development of network, communications and computer technology, privacy preserving data mining (PPDM) has become an increasingly important research topic in the field of data mining. In a distributed environment, how to protect data privacy while mining large volumes of distributed data is a question of even broader reach. This paper describes current research on PPDM at home and abroad. It then classifies the typical uses and algorithms of PPDM in distributed environments and summarizes their advantages and disadvantages. Finally, it points out future research directions in the field.


2018 ◽  
Vol 12 (3) ◽  
pp. 141-163 ◽  
Author(s):  
S. Vijayarani Mohan ◽  
Tamilarasi Angamuthu

This article describes how privacy preserving data mining has become one of the most important and interesting research directions in data mining. With the help of data mining techniques, people can extract hidden information and discover patterns and relationships between the data items. In most situations, the extracted knowledge contains sensitive information about individuals and organizations. Moreover, this sensitive information can be misused for various purposes which violate the individual's privacy. Association rules frequently predetermine significant target marketing information about a business. Significant association rules provide knowledge to the data miner as they effectively summarize the data, while uncovering any hidden relations among items that hold in the data. Association rule hiding techniques are used for protecting the knowledge extracted by the sensitive association rules during the process of association rule mining. Association rule hiding refers to the process of modifying the original database in such a way that certain sensitive association rules disappear without seriously affecting the data and the non-sensitive rules. In this article, two new hiding techniques are proposed: a hiding technique based on a genetic algorithm (HGA) and a dummy items creation (DIC) technique. The HGA technique is used for hiding sensitive association rules, while the DIC technique hides the sensitive rules and also creates dummy items for the modified sensitive items. Experimental results show the performance of the proposed techniques.
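
The idea of association rule hiding described in this abstract can be illustrated with a small sketch. It is not the HGA or DIC technique from the article; it is a plain greedy heuristic, chosen here only for illustration, that removes one consequent item from supporting transactions until the rule's confidence drops below a threshold. The function names, the `min_conf` parameter, and the choice of which item to remove are all assumptions.

```python
from typing import List, Set


def count(db: List[Set[str]], itemset: Set[str]) -> int:
    """Number of transactions in db that contain every item of itemset."""
    return sum(1 for t in db if itemset <= t)


def confidence(db: List[Set[str]], antecedent: Set[str], consequent: Set[str]) -> float:
    """Confidence of the rule antecedent -> consequent (0.0 if the antecedent never occurs)."""
    n_antecedent = count(db, antecedent)
    if n_antecedent == 0:
        return 0.0
    return count(db, antecedent | consequent) / n_antecedent


def hide_rule(db: List[Set[str]], antecedent: Set[str], consequent: Set[str],
              min_conf: float) -> List[Set[str]]:
    """Return a sanitized copy of db in which the rule's confidence is below min_conf.

    Hypothetical greedy heuristic (an assumption, not the article's method):
    drop one consequent item from supporting transactions, one at a time.
    """
    sanitized = [set(t) for t in db]   # work on a copy; the original stays intact
    victim = sorted(consequent)[0]     # deterministic choice of the item to remove
    for t in sanitized:
        if confidence(sanitized, antecedent, consequent) < min_conf:
            break                      # rule is already hidden
        if antecedent | consequent <= t:
            t.discard(victim)          # this transaction no longer supports the rule
    return sanitized


# Toy database; the rule {bread} -> {milk} is treated as sensitive.
db = [{"bread", "milk"}, {"bread", "milk", "eggs"}, {"bread", "eggs"},
      {"milk", "eggs"}, {"bread", "milk"}]
sanitized = hide_rule(db, {"bread"}, {"milk"}, min_conf=0.5)
```

A heuristic like this changes as few transactions as it needs to, but it makes no attempt to protect non-sensitive rules, which is the side effect that dedicated hiding techniques try to limit.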


Author(s):  
Constanţa-Nicoleta Bodea ◽  
Maria-Iuliana Dascalu ◽  
Radu Ioan Mogos ◽  
Stelian Stancu

The reinforcement of technology-enhanced education has transformed education into a data-intensive domain. As in many other data-intensive domains, the interest in data analysis through various analytics is growing. The article starts by defining learning analytics (LA), with relevant views from the literature. A discussion of the relationships between LA, educational data mining and academic analytics is included in the background section. In the main section of the article, learning analytics, as an emerging trend in educational systems, is described by discussing the main issues, controversies and problems on this topic. The final part of the article presents the future research directions and the conclusion.


2008 ◽  
pp. 2379-2401 ◽  
Author(s):  
Igor Nai Fovino

Intense work in the area of data mining technology and in its applications to several domains has resulted in the development of a large variety of techniques and tools able to automatically and intelligently transform large amounts of data into knowledge relevant to users. However, as with other kinds of useful technologies, the knowledge discovery process can be misused. It can be used, for example, by malicious subjects in order to reconstruct sensitive information for which they do not have an explicit access authorization. This type of “attack” cannot easily be detected because, usually, the data used to guess the protected information is freely accessible. For this reason, many research efforts have recently been devoted to addressing the problem of privacy preservation in data mining. The mission of this chapter is therefore to introduce the reader to this new research field and to provide the proper instruments (in terms of concepts, techniques and examples) to allow a critical comprehension of the advantages, the limitations and the open issues of privacy preserving data mining techniques.


2008 ◽  
pp. 849-879
Author(s):  
Dan A. Simovici

This chapter presents data mining techniques that make use of metrics defined on the set of partitions of finite sets. Partitions are naturally associated with object attributes, and major data mining problems such as classification, clustering, and data preparation benefit from an algebraic and geometric study of the metric space of partitions. The metrics we find most useful are derived from a generalization of the entropic metric. We discuss techniques that produce smaller classifiers, allow incremental clustering of categorical data and help users better prepare training data for constructing classifiers. Finally, we discuss open problems and future research directions.
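
As a rough illustration of a metric on partitions, the sketch below computes the Shannon-entropy distance d(P, Q) = H(P | Q) + H(Q | P) between two partitions of the same finite set. This is only the standard Shannon case, assumed here for illustration; the chapter refers to a generalization of the entropic metric that is not reproduced. Representing a partition as a list of block labels, one per element, is also an assumption.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def conditional_entropy(p: Sequence[Hashable], q: Sequence[Hashable]) -> float:
    """Shannon conditional entropy H(P | Q) in bits.

    p[i] and q[i] are the block labels of element i in partitions P and Q.
    """
    n = len(p)
    joint = Counter(zip(p, q))   # sizes of the intersections of P-blocks with Q-blocks
    q_sizes = Counter(q)         # sizes of the Q-blocks
    return -sum((c / n) * math.log2(c / q_sizes[q_label])
                for (_, q_label), c in joint.items())


def partition_distance(p: Sequence[Hashable], q: Sequence[Hashable]) -> float:
    """Entropic distance d(P, Q) = H(P | Q) + H(Q | P)."""
    if len(p) != len(q):
        raise ValueError("both partitions must label the same set of elements")
    return conditional_entropy(p, q) + conditional_entropy(q, p)


# Two attributes of four objects, each attribute inducing a partition.
same_blocks = partition_distance([0, 0, 1, 1], ["a", "a", "b", "b"])
crossed_blocks = partition_distance([0, 0, 1, 1], [0, 1, 0, 1])
```

The distance is zero when the two labelings induce the same partition, regardless of how the blocks are named, and grows as the blocks of one partition cut across the blocks of the other.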


2008 ◽  
pp. 693-704
Author(s):  
Bhavani Thuraisingham

This article first describes the privacy concerns that arise due to data mining, especially for national security applications. Then we discuss privacy-preserving data mining. In particular, we view the privacy problem as a form of inference problem and introduce the notion of privacy constraints. We also describe an approach for privacy constraint processing and discuss its relationship to privacy-preserving data mining. Then we give an overview of the developments on privacy-preserving data mining that attempt to maintain privacy and at the same time extract useful information from data mining. Finally, some directions for future research on privacy as related to data mining are given.


Author(s):  
Md Mahbubur Rahim ◽  
Maryam Jabberzadeh ◽  
Nergiz Ilhan

E-procurement systems that have been in place for over a decade have begun incorporating digital tools like big data, cloud computing, internet of things, and data mining. Hence, there exists a rich literature on earlier e-procurement systems and advanced digitally-enabled e-procurement systems. Existing literature on these systems addresses many research issues (e.g., adoption) associated with e-procurement. However, one critical issue that has so far received no rigorous attention is the “unit of analysis,” a methodological concern of importance in the e-procurement research context. Hence, the aim of this chapter is twofold: 1) to discuss how the notion of “unit of analysis” has been conceptualised in the e-procurement literature and 2) to discuss how its use has been justified by e-procurement scholars to address the research issues under investigation. Finally, the chapter provides several interesting findings and outlines future research directions.


2022 ◽  
pp. 1477-1503
Author(s):  
Ali Al Mazari

HIV/AIDS big data analytics evolved as a potential initiative enabling the connection between three major scientific disciplines: (1) the emergence and evolution of HIV biology; (2) the complex clinical and medical problems and practices associated with the infections and diseases; and (3) the computational methods for the mining of HIV/AIDS biological, medical, and clinical big data. This chapter provides a review of the computational and data mining perspectives on HIV/AIDS in the big data era. The chapter focuses on the research opportunities in this domain, identifies the challenges facing the development of big data analytics in the HIV/AIDS domain, and then highlights the future research directions of big data in the healthcare sector.


Author(s):  
Boutheina Fessi ◽  
Yacine Djemaiel ◽  
Noureddine Boudriga

This chapter provides a review of the usefulness of applying data mining techniques to detect intrusions within dynamic environments and of their contribution to digital investigation. Numerous applications and models based on data mining analytics are described. The chapter also addresses different requirements that should be fulfilled to efficiently perform cyber-crime investigation based on data mining analytics. It closes by stating future research directions related to cyber-crime investigation and by presenting new trends in data mining techniques that deal with big data to detect attacks.


2016 ◽  
Vol 7 (3) ◽  
pp. 1-9 ◽  
Author(s):  
Sahar A. El-Rahman Ismail ◽  
Dalal Al Makhdhub ◽  
Amal A. Al Qahtani ◽  
Ghadah A. Al Shabanat ◽  
Nouf M. Omair ◽  
...  

We live in an information era where sensitive information extracted from data mining systems is vulnerable to exploitation. Privacy preserving data mining aims to prevent the discovery of sensitive information. Information hiding systems provide excellent privacy and confidentiality, where securing confidential communications in public channels can be achieved using steganography. Steganography techniques exploit cover media by hiding the payload's existence within appropriate multimedia carriers. This paper studies steganography techniques in the spatial and frequency domains, and then analyzes the performance of Discrete Cosine Transform (DCT) based steganography using the low frequency and the middle frequency, comparing them by Peak Signal to Noise Ratio (PSNR) and Mean Square Error (MSE). The experimental results show that the middle frequency has the larger message capacity and the best performance.
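
The two quality measures named in this abstract have standard definitions, sketched below for grayscale images stored as nested lists of pixel values. This is a minimal illustration, not the paper's evaluation code: the function names, the list-of-rows image representation, and the default peak value of 255 (8-bit images) are assumptions, and the DCT embedding step itself is not shown.

```python
import math
from typing import List

Image = List[List[int]]  # rows of grayscale pixel values (assumed representation)


def mse(cover: Image, stego: Image) -> float:
    """Mean Square Error between a cover image and its stego version."""
    if len(cover) != len(stego) or any(len(a) != len(b) for a, b in zip(cover, stego)):
        raise ValueError("images must have the same dimensions")
    squared_error = 0
    pixels = 0
    for cover_row, stego_row in zip(cover, stego):
        for c, s in zip(cover_row, stego_row):
            squared_error += (c - s) ** 2
            pixels += 1
    if pixels == 0:
        raise ValueError("images must not be empty")
    return squared_error / pixels


def psnr(cover: Image, stego: Image, max_value: int = 255) -> float:
    """Peak Signal to Noise Ratio in dB: 10 * log10(max_value^2 / MSE)."""
    error = mse(cover, stego)
    if error == 0:
        return math.inf  # identical images: no distortion at all
    return 10 * math.log10(max_value ** 2 / error)


# Toy 2x2 example: two pixels differ by one gray level after embedding.
cover = [[10, 20], [30, 40]]
stego = [[10, 21], [30, 39]]
distortion = mse(cover, stego)
quality_db = psnr(cover, stego)
```

A lower MSE and a higher PSNR both indicate that the stego image is closer to the cover image, which is why the two measures are usually reported together.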

