Incremental Algorithm for Association Rule Mining under Dynamic Threshold

2019 ◽  
Vol 9 (24) ◽  
pp. 5398 ◽  
Author(s):  
Iyad Aqra ◽  
Norjihan Abdul Ghani ◽  
Carsten Maple ◽  
José Machado ◽  
Nader Sohrabi Safa

Data mining discovers new knowledge from a database through an iterative process, which can be time consuming for massive datasets. Association rule mining (ARM) is a widely used knowledge discovery method, despite its shortcomings in mining large databases, and several approaches have been proposed to address them. Most of the proposed algorithms address incremental data, especially the case in which a large amount of data is added to the database after the latest mining run. A database supports three basic manipulation operations: add, delete, and update. Any method designed for incremental data must handle all three. The changing threshold is a long-standing problem in the data mining field: because decision making is an active process, the threshold is bound to change. Accordingly, the present study proposes an algorithm that avoids rescanning a previously mined database and allows retrieval of knowledge that satisfies several thresholds without repeating the mining process from scratch. In experiments, the proposed approach displayed high accuracy and reduced processing time by almost two-thirds of the original mining execution time.
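The idea of answering several thresholds without rescanning can be illustrated with a minimal sketch. This is not the authors' algorithm: it simply assumes that itemset counts from one scan are cached, so a new threshold is a filter over the cache and newly added transactions only update it. The function names (`count_itemsets`, `frequent_at`), the `max_size` cap, and the toy baskets are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_itemsets(transactions, max_size=2):
    """Scan the transactions once and count every itemset of up to max_size items."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            counts.update(combinations(items, k))
    return counts


def frequent_at(counts, n_transactions, min_support):
    """Filter the cached counts for one threshold; the database is not rescanned."""
    return {
        itemset: c / n_transactions
        for itemset, c in counts.items()
        if c / n_transactions >= min_support
    }


# Hypothetical toy data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
counts = count_itemsets(transactions)

# Two different thresholds are served from the same cached counts.
loose = frequent_at(counts, len(transactions), 0.5)
strict = frequent_at(counts, len(transactions), 0.75)

# An "add" operation only needs the new batch to be counted.
new_batch = [{"bread", "milk"}]
counts.update(count_itemsets(new_batch))
total = len(transactions) + len(new_batch)
after_add = frequent_at(counts, total, 0.5)
```

Counting every itemset is exponential in the transaction width, which is why the sketch caps the size at `max_size`; a practical method would have to bound what it caches in some other way.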

Author(s):  
Suma B. ◽  
Shobha G.

Association rule mining is a well-known data mining technique used for extracting hidden correlations between data items in large databases. In the majority of the situations, data mining results contain sensitive information about individuals and publishing such data will violate individual secrecy. The challenge of association rule mining is to preserve the confidentiality of sensitive rules when releasing the database to external parties. The association rule hiding technique conceals the knowledge extracted by the sensitive association rules by modifying the database. In this paper, we introduce a border-based algorithm for hiding sensitive association rules. The main purpose of this approach is to conceal the sensitive rule set while maintaining the utility of the database and association rule mining results at the highest level. The performance of the algorithm in terms of the side effects is demonstrated using experiments conducted on two real datasets. The results show that the information loss is minimized without sacrificing the accuracy.
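To make "hiding a rule by modifying the database" concrete, here is a naive support-reduction sketch. It is not the border-based algorithm described above and makes no attempt to minimize side effects: it only removes one chosen item from supporting transactions until the sensitive itemset falls below the mining threshold. The function `hide_itemset`, the `victim` parameter, and the toy data are assumptions for illustration.

```python
def hide_itemset(transactions, sensitive, min_support, victim):
    """Return a sanitized copy in which `sensitive` is no longer frequent.

    Naive strategy: delete the item `victim` from transactions that contain
    the whole sensitive itemset, one at a time, until its support drops
    below `min_support`.
    """
    sensitive = frozenset(sensitive)
    sanitized = [set(t) for t in transactions]
    n = len(sanitized)
    for t in sanitized:
        supporting = sum(sensitive <= u for u in sanitized)
        if supporting / n < min_support:
            break  # already hidden; stop modifying the data
        if sensitive <= t:
            t.discard(victim)
    return sanitized


# Hypothetical toy data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
sanitized = hide_itemset(transactions, {"bread", "milk"}, 0.5, victim="milk")
remaining = sum({"bread", "milk"} <= t for t in sanitized)
```

In this toy run only the first transaction is altered. Choosing which transactions and items to alter so that non-sensitive rules are disturbed as little as possible is the part that approaches such as the border-based one address.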


Author(s):  
Vasudha Bhatnagar ◽  
Anamika Gupta ◽  
Naveen Kumar

Association Rule Mining (ARM) is one of the important data mining tasks that has been extensively researched by the data-mining community and has found wide applications in industry. An Association Rule is a pattern that implies co-occurrence of events or items in a database. Knowledge of such relationships in a database can be employed in strategic decision making in both commercial and scientific domains. A typical application of ARM is market basket analysis, where associations between the different items are discovered to analyze the customer’s buying habits. The discovery of such associations can help to develop better marketing strategies. ARM has been extensively used in other applications such as spatial-temporal data, health care, bioinformatics, and web data (Hipp J., Güntzer U., Nakhaeizadeh G. 2000). An association rule is an implication of the form X → Y, where X and Y are disjoint sets of attributes/items. An association rule indicates that if a set of items X occurs in a transaction record then the set of items Y also occurs in the same record. X is called the antecedent of the rule and Y is called the consequent of the rule. Processing massive datasets to discover co-occurring items and generate interesting rules in reasonable time is the objective of all ARM algorithms. The task of discovering co-occurring sets of items cannot be easily accomplished using SQL, as a little reflection will reveal. A ‘Count’ aggregate query requires the condition to be specified in the where clause, so it finds the frequency of only one set of items at a time. In order to find all sets of co-occurring items in a database with n items, the number of queries that need to be written is exponential in n. This is the prime motivation for designing algorithms for efficient discovery of co-occurring sets of items, which are required to find the association rules.
In this article we focus on the algorithms for association rule mining (ARM) and the scalability issues in ARM. We assume familiarity of the reader with the motivation and applications of association rule mining.
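The rule form X → Y described above is usually scored with two standard measures, support and confidence. The following is a minimal sketch of both, assuming transactions are held in memory as Python sets; the function names and the toy baskets are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of X and Y together, divided by the support of X alone."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante else 0.0


# Hypothetical toy baskets, for illustration only.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
s = support(baskets, {"bread", "milk"})
c = confidence(baskets, {"bread"}, {"milk"})
```

Each call to `support` plays the role of one count query for one itemset. With n distinct items there are 2^n − 1 non-empty itemsets, which is the exponential blow-up the text refers to.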


2011 ◽  
Vol 402 ◽  
pp. 96-99 ◽  
Author(s):  
Yan Hu ◽  
Zhong Zheng ◽  
Jian Yang

In this work, data mining was applied to BOF steelmaking endpoint control. Through characteristic analysis of the key factors, a data sheet for endpoint control was formed. Potential knowledge was explored from the data sheet using an association rule mining algorithm, and expert rules were then obtained automatically. The results show that by combining the effective expert rules with the traditional BOF endpoint model, carbon content and temperature were predicted with high accuracy. It can therefore be a new research method for improving BOF automation.


Data mining is the method of extracting useful information from various repositories such as relational databases, transaction databases, spatial databases, temporal and time-series databases, data warehouses, and the World Wide Web. The functionalities of data mining include characterization and discrimination, classification and prediction, association rule mining, cluster analysis, and evolutionary analysis. Association rule mining is one of the most important data mining techniques and aims at extracting interesting relationships within the data. In this paper we study various association rule mining algorithms, compare them using synthetic data sets, and provide the results obtained from the experimental analysis.


2018 ◽  
Vol 7 (2) ◽  
pp. 284-288
Author(s):  
Doni Winarso ◽  
Anwar Karnaidi

Association rule analysis is a data mining technique used to find associative rules among combinations of items. This study uses the Apriori algorithm, which searches for item frequencies and for the items that appear most often. The results show that the Apriori algorithm can be used to analyze transaction data in order to determine which products should be promoted. The Apriori computation yields the purchasing pattern that occurs at PD. XYZ. Analysis of that pattern leads to the conclusion that the products to be promoted are economy wall paint and painting equipment in the form of hand brushes, with a support of 11% and a confidence of 75%.
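The level-wise search that Apriori performs can be sketched as follows. This is a compact textbook-style illustration, not the implementation used in the study; the item names and baskets are hypothetical and do not reproduce the reported support or confidence values.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item that appears in the data.
    current = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while current:
        # Count each candidate and keep those that meet the threshold.
        level = {}
        for cand in current:
            sup = sum(cand <= t for t in transactions) / n
            if sup >= min_support:
                level[cand] = sup
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        joined = {a | b for a, b in combinations(level, 2) if len(a | b) == k + 1}
        # Prune step: every k-item subset of a candidate must be frequent.
        current = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k))
        }
        k += 1
    return frequent


# Hypothetical toy baskets, for illustration only.
baskets = [
    {"paint", "brush"},
    {"paint", "roller"},
    {"paint", "brush", "roller"},
    {"brush"},
]
frequent = apriori(baskets, min_support=0.5)
```

The prune step is what keeps the search tractable: here the three-item candidate is discarded without being counted, because the pair of brush and roller is not frequent.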


Author(s):  
M. Nandhini ◽  
S. N. Sivanandam ◽  
S. Renugadevi

Data mining explores hidden patterns in huge quantities of data and provides a way of analyzing and categorizing the data. Associative classification (AC) integrates two data mining tasks, association rule mining and classification, and is used to classify unknown data. Although association rule mining techniques have been used successfully to construct classifiers, they fall short in generating a small set of significant class association rules (CARs) from which to build an accurate associative classifier. In this work, an attempt is made to generate significant CARs using the Artificial Bee Colony (ABC) algorithm, an optimization technique, in order to construct an efficient associative classifier. The associative classifier built from the CARs discovered by ABC achieves high predictive accuracy and interestingness values. The ABC-based AC gave promising results in experiments conducted on health care datasets from the UCI machine learning repository.


Author(s):  
Carson Kai-Sang Leung

The problem of association rule mining was introduced in 1993 (Agrawal et al., 1993). Since then, it has been the subject of numerous studies. Most of these studies focused on either performance issues or functionality issues. The former considered how to compute association rules efficiently, whereas the latter considered what kinds of rules to compute. Examples of the former include the Apriori-based mining framework (Agrawal & Srikant, 1994), its performance enhancements (Park et al., 1997; Leung et al., 2002), and the tree-based mining framework (Han et al., 2000); examples of the latter include extensions of the initial notion of association rules to other rules such as dependence rules (Silverstein et al., 1998) and ratio rules (Korn et al., 1998). In general, most of these studies basically considered the data mining exercise in isolation. They did not explore how data mining can interact with the human user, which is a key component in the broader picture of knowledge discovery in databases. Hence, they provided little or no support for user focus. Consequently, the user usually needs to wait for a long period of time to get numerous association rules, out of which only a small fraction may be interesting to the user. In other words, the user often incurs a high computational cost that is disproportionate to what he wants to get. This calls for constraint-based association rule mining.


Author(s):  
Anne Denton

Most data of practical relevance are structured in more complex ways than is assumed in traditional data mining algorithms, which are based on a single table. The concept of relations allows for discussing many data structures such as trees and graphs. Relational data have much generality and are of significant importance, as demonstrated by the ubiquity of relational database management systems. It is, therefore, not surprising that popular data mining techniques, such as association rule mining, have been generalized to relational data. An important aspect of the generalization process is the identification of challenges that are new to the generalized setting.


Author(s):  
Luminita Dumitriu

The concept of Quantitative Structure-Activity Relationship (QSAR), introduced by Hansch and co-workers in the 1960s, attempts to discover the relationship between the structure and the activity of chemical compounds (SAR), in order to allow the prediction of the activity of new compounds based on knowledge of their chemical structure alone. These predictions can be achieved by quantifying the SAR. Initially, statistical methods were applied to solve the QSAR problem. For example, pattern recognition techniques facilitate data dimension reduction and transformation techniques from multiple experiments to the underlying patterns of information. Partial least squares (PLS) is used for performing the same operations on the target properties. The predictive ability of this method can be tested using cross-validation on the test set of compounds. Later, data mining techniques were considered for this prediction problem. Among data mining techniques, the most popular ones are based on neural networks (Wang, Durst, Eberhart, Boyd, & Ben-Miled, 2004), on neuro-fuzzy approaches (Neagu, Benfenati, Gini, Mazzatorta, & Roncaglioni, 2002), or on genetic programming (Langdon & Barrett, 2004). All these approaches predict the activity of a chemical compound without being able to explain the predicted value. In order to increase understanding of the prediction process, descriptive data mining techniques have started to be used for the QSAR problem. These techniques are based on association rule mining. In this chapter, we describe the use of association rule-based approaches related to the QSAR problem.


Author(s):  
Ling Zhou ◽  
Stephen Yau

Association rule mining among frequent items has been extensively studied in data mining research. However, in recent years, there is an increasing demand for mining infrequent items (such as rare but expensive items). Since exploring interesting relationships among infrequent items has not been discussed much in the literature, in this chapter, the authors propose two simple, practical and effective schemes to mine association rules among rare items. Their algorithms can also be applied to frequent items with bounded length. Experiments are performed on the well-known IBM synthetic database. The authors’ schemes compare favorably to Apriori and FP-growth under the situation being evaluated. In addition, they explore quantitative association rule mining in transactional databases among infrequent items by associating quantities of items: some interesting examples are drawn to illustrate the significance of such mining.

