Association Rule Hiding Methods

2009 ◽  
pp. 2268-2274
Author(s):  
Vassilios S. Verykios

The enormous expansion of data collection and storage facilities has created an unprecedented increase in the need for data analysis and processing power. Data mining has long been the catalyst for automated and sophisticated data analysis and interrogation. Recent advances in data mining and knowledge discovery have had a controversial impact in both the scientific and technological arenas. On the one hand, data mining is capable of analyzing vast amounts of information within a minimal amount of time, an analysis that has exceeded the expectations of even the most imaginative scientists of the last decade. On the other hand, the excessive processing power of the intelligent algorithms brought by this new research area puts at risk the sensitive and confidential information that resides in large and distributed data stores. Privacy and security risks arising from the use of data mining techniques were first investigated in an early paper by O'Leary (1991). Clifton & Marks (1996) were the first to propose possible remedies for the protection of sensitive data and sensitive knowledge from the use of data mining. In particular, they suggested a variety of measures, such as controlled access to the data, fuzzification of the data, elimination of unnecessary groupings in the data, data augmentation, and data auditing. A subsequent paper by Clifton (2000) delivered concrete early results in the area by demonstrating an interesting approach to privacy protection that relies on sampling. A main result of Clifton's paper was to show how to determine the right sample size of the public data (the data to be disclosed to the public, with sensitive information trimmed off), while at the same time estimating the error that sampling introduces into the significance of the rules.
Agrawal and Srikant (2000) were the first to establish a new research area, privacy preserving data mining, whose goal is to address privacy and confidentiality issues arising in the mining of data. The authors proposed an approach known as data perturbation, which relies on disclosing a modified database with noisy data instead of the original database. The modified database can still produce patterns very similar to those of the original database.
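The perturbation idea can be illustrated with a minimal sketch (an illustration only, not Agrawal and Srikant's actual reconstruction procedure): additive random noise masks individual records while aggregate statistics, and hence minable patterns, stay close to the original.

```python
import random

def perturb(values, noise_range=5.0, seed=0):
    """Return a noisy copy of a numeric column: each value is shifted by
    uniform noise so individual records are masked, while aggregates
    such as the mean remain close to the original."""
    rng = random.Random(seed)
    return [v + rng.uniform(-noise_range, noise_range) for v in values]

ages = [23, 35, 41, 29, 52, 47, 31, 38]
noisy = perturb(ages)

# Individual records differ, but the mean is roughly preserved.
orig_mean = sum(ages) / len(ages)
noisy_mean = sum(noisy) / len(noisy)
print(round(orig_mean, 1), round(noisy_mean, 1))
```

Because each noise term is bounded by the noise range, the perturbed mean can deviate from the true mean by at most that range, and in practice by far less as the number of records grows.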


2014 ◽  
Vol 1 (2) ◽  
pp. 293-314 ◽  
Author(s):  
Jianqing Fan ◽  
Fang Han ◽  
Han Liu

Abstract Big Data bring new opportunities to modern society and challenges to data scientists. On the one hand, Big Data hold great promise for discovering subtle population patterns and heterogeneities that are not possible with small-scale data. On the other hand, the massive sample size and high dimensionality of Big Data introduce unique computational and statistical challenges, including scalability and storage bottlenecks, noise accumulation, spurious correlation, incidental endogeneity and measurement errors. These challenges are distinctive and require new computational and statistical paradigms. This paper gives an overview of the salient features of Big Data and how these features drive a paradigm change in statistical and computational methods as well as computing architectures. We also provide various new perspectives on Big Data analysis and computation. In particular, we emphasize the viability of the sparsest solution in a high-confidence set and point out that the exogeneity assumptions in most statistical methods for Big Data cannot be validated due to incidental endogeneity. They can lead to wrong statistical inferences and consequently wrong scientific conclusions.
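The spurious-correlation phenomenon mentioned above can be demonstrated with a small simulation (an illustrative sketch, not taken from the paper): with a fixed, modest sample size, the largest sample correlation between a response and a growing pool of completely independent noise variables keeps increasing, even though every true correlation is zero.

```python
import math
import random

def pearson(x, y):
    """Plain sample Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

rng = random.Random(42)
n = 50  # small, fixed sample size
y = [rng.gauss(0, 1) for _ in range(n)]

def max_spurious_corr(p):
    """Max |correlation| between y and p independent noise variables."""
    return max(abs(pearson([rng.gauss(0, 1) for _ in range(n)], y))
               for _ in range(p))

# With more candidate variables, the maximum correlation with pure
# noise grows, inviting false discoveries in high dimensions.
m_small = max_spurious_corr(10)
m_large = max_spurious_corr(1000)
print(round(m_small, 2), round(m_large, 2))
```

With thousands of noise variables and only fifty samples, the maximum spurious correlation typically exceeds 0.4, a value that would look like a strong signal if the dimensionality were ignored.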


Author(s):  
Vasaki Ponnusamy ◽  
Yen Pei Tay ◽  
Lam Hong Lee ◽  
Tang Jung Low ◽  
Cheah Wai Zhao

Internet of Things (IoT) has become a central theme in the current technology trend, whereby objects, people, or even animals and plants can exchange information over the Internet. IoT can be referred to as a network of interconnected devices, such as wearables, sensors and implantables, that have the ability to sense, interact and make collective decisions autonomously. In short, IoT enables a full spectrum of machine-to-machine communications equipped with distributed data collection capabilities and connected through the cloud to facilitate centralized data analysis. Despite its great potential, the reliability of IoT devices is impeded by limited energy supply, particularly if these devices are deployed in energy-scarce locations or where no human intervention is possible. The best possible deployment of IoT technology caters for unattended situations such as structural or environmental health monitoring. This opens up a new research area in the IoT energy efficiency domain. A possible alternative to address this energy constraint is to look into regenerating power for IoT devices, more precisely known as energy harvesting or energy scavenging. This chapter presents a review of various energy harvesting mechanisms, current applications of energy harvesting in the IoT domain, and their future design challenges.


2016 ◽  
Vol 2016 ◽  
pp. 1-11 ◽  
Author(s):  
Ivan Kholod ◽  
Ilya Petukhov ◽  
Andrey Shorov

This paper describes the construction of a Cloud for Distributed Data Analysis (CDDA) based on the actor model. The design maps data mining algorithms onto decomposed functional blocks, which are assigned to actors. Using actors allows users to move computation close to the stored data. The process does not require loading data sets into the cloud and allows users to analyze confidential information locally. The results of experiments show that the proposed approach outperforms established solutions in efficiency.
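The idea of moving computation to the data can be sketched with a toy actor-style example (an illustration only, not the CDDA implementation; the `DataActor` class and the partial-count message are hypothetical). Each actor owns a local data partition and receives computations as messages, so only partial results, never the raw data, leave the node.

```python
import queue
import threading

class DataActor:
    """A minimal actor that owns a local data partition. Computations
    arrive as messages (functions); the data stays local and only the
    result of each computation is sent back."""

    def __init__(self, partition):
        self.partition = partition
        self.mailbox = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            func, reply_to = self.mailbox.get()
            if func is None:  # poison pill: shut down the actor
                break
            reply_to.put(func(self.partition))

    def send(self, func, reply_to):
        self.mailbox.put((func, reply_to))

    def stop(self):
        self.mailbox.put((None, None))

# Coordinator: ship a partial-count computation to each actor and
# aggregate the replies, e.g. counting the support of item "2".
partitions = [[1, 2, 2, 3], [2, 2, 4], [5, 2]]
actors = [DataActor(p) for p in partitions]
replies = queue.Queue()
for a in actors:
    a.send(lambda data: data.count(2), replies)
partial_counts = [replies.get() for _ in actors]
total = sum(partial_counts)
print(total)  # 5: the global count of item "2" across all partitions
for a in actors:
    a.stop()
```

Aggregating partial counts this way is the same pattern used to compute global itemset supports from locally held transaction databases, which is what makes confidential data analyzable without centralizing it.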


2014 ◽  
Vol 23 (05) ◽  
pp. 1450004 ◽  
Author(s):  
Ibrahim S. Alwatban ◽  
Ahmed Z. Emam

In recent years, a new research area known as privacy preserving data mining (PPDM) has emerged and captured the attention of many researchers interested in preventing the privacy violations that may occur during data mining. In this paper, we provide a review of studies on PPDM in the context of association rules (PPARM). This paper systematically defines the scope of this survey and determines the PPARM models. The problems of each model are formally described, and we discuss the relevant approaches, techniques and algorithms that have been proposed in the literature. A profile of each model and the accompanying algorithms are provided with a comparison of the PPARM models.


Smart systems are among the most significant inventions of our time. These systems rely on powerful data mining techniques to achieve intelligence in decision making. Frequent itemset mining (FIM) has become one of the most significant research areas of data mining. The information present in databases is in general ambiguous and uncertain. In such databases, one should consider weighted FIM to discover itemsets that are significant from the end user's perspective. However, introducing a weight factor into FIM means that weighted frequent itemsets may no longer satisfy the downward closure property. Consequently, the search space of frequent itemsets cannot be pruned by downward closure, which leads to poor time efficiency. In this paper, we introduce two properties for weighted FIM: first, a weight judgment downward closure property (WD-FIM), and second, an existence property for its subsets. Based on these two properties, the WD-FIM algorithm is proposed to narrow the search space of weighted frequent itemsets and improve time efficiency. In addition, the completeness and time efficiency of the WD-FIM algorithm are analyzed theoretically. Finally, the performance of the proposed WD-FIM algorithm is verified on both synthetic and real datasets.
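Why weights break downward closure, and how a bound can restore pruning, can be shown with a hedged sketch (the weighted-support definition, the upper bound, and the toy data below are illustrative assumptions, not the paper's exact WD-FIM formulation). Support times the mean item weight is not anti-monotone, but support times the maximum item weight is, so the latter can safely bound the search.

```python
from itertools import combinations

# Hypothetical toy transactions and per-item weights (assumptions).
transactions = [{"a", "b"}, {"a", "b", "c"}, {"b", "c"},
                {"a", "c"}, {"a", "b", "c"}]
weights = {"a": 0.6, "b": 0.9, "c": 0.4}
min_wsup = 0.4  # weighted-support threshold

def support(itemset):
    return sum(itemset <= t for t in transactions) / len(transactions)

def weighted_support(itemset):
    # One common definition: plain support scaled by mean item weight.
    return support(itemset) * sum(weights[i] for i in itemset) / len(itemset)

def upper_bound(itemset):
    # The weighted support of any superset is at most this value
    # (support is anti-monotone; mean weight <= max weight), so the
    # bound is anti-monotone even though weighted support is not.
    return support(itemset) * max(weights.values())

def weighted_frequent_itemsets():
    result = []
    candidates = [frozenset([i]) for i in sorted(weights)]
    while candidates:
        seeds = []
        for c in candidates:
            if weighted_support(c) >= min_wsup:
                result.append(set(c))
            if upper_bound(c) >= min_wsup:  # prune by the bound
                seeds.append(c)
        candidates = sorted({a | b for a, b in combinations(seeds, 2)
                             if len(a | b) == len(a) + 1}, key=sorted)
    return result

print(weighted_frequent_itemsets())
```

On this toy data, {c} is infrequent under weighted support because of its low weight, yet it must still be extended via the bound, since a superset containing a heavier item could in principle qualify; that is exactly the gap a weight-aware closure property has to manage.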


2021 ◽  
Vol 14(63) (1) ◽  
pp. 122-136
Author(s):  
Maria Magdalena POPESCU

Fake news and deepfakes have lately been highlighted in informative videos, research papers and literature reviews as tools for disinformation, along with filter bubbles and echo chambers, polarization and mistrust. To counteract these unconventional weapons of word and imagery, a new research area called cognition security has been defined: a transdisciplinary area for understanding the threats that hybrid wars currently make use of and for determining the proper measures against non-kinetic offensives. For this, data mining and deep analysis are performed with digital instruments in a cognitive security system. Against this background, the present paper deconstructs the terms in an experimental monitoring of the media, to connect the realm of cognition security to its instruments in cognitive security.
Key words: fake news, deepfake, cognitive security, narrat


2020 ◽  
pp. 29-42
Author(s):  
Jörg Zimmermann ◽  
Armin B. Cremers

Abstract The term Artificial Intelligence was coined in 1956. Since then, this new research area has gone through several cycles of fast progress and periods of apparent stagnation. Today, the field has broadened and deepened significantly, and developed a rich variety of theoretical approaches and frameworks on the one side, and increasingly impressive practical applications on the other side. While a thorough foundation for a general theory of cognitive agents is still missing, there is a line of development within AI research which aims at foundational justifications for the design of cognitive agents, enabling the derivation of theorems characterizing the possibilities and limitations of computational cognitive agents.


2016 ◽  
Vol 4 (2) ◽  
pp. 109-117
Author(s):  
Sheena Angra ◽  
Sachin Ahuja

Data mining offers a new approach to data analysis, using techniques based on machine learning together with conventional methods, collectively known as educational data mining (EDM). Educational data mining has emerged as an interesting and useful research area for finding methods to improve the quality of education and to identify various patterns in educational settings. It is useful for extracting information about students, teachers, courses and administrators from educational institutes such as schools, colleges and universities, and helps to suggest interesting learning experiences to various stakeholders. This paper focuses on the applications of data mining in the field of education and the implementation of three widely used data mining techniques, using RapidMiner, on data collected through a survey.

