EARMGA and Apriori Algorithm's Performance Evaluation for Association Rule Mining

Association rule mining techniques are an important part of data mining, used to derive relationships between attributes in large databases. Association rule mining has attracted considerable interest among researchers because many challenging problems can be solved with it. Numerous algorithms have been developed for deriving association rules effectively. Not all algorithms give similar results in all scenarios, so understanding their relative merits is important. In this paper two association rule mining algorithms are analyzed: the popular Apriori algorithm and EARMGA (Evolutionary Association Rules Mining with Genetic Algorithm). The two algorithms are compared experimentally on different datasets, and parameters such as the number of rules generated, average support, average confidence, and covered records are reported.
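The comparison parameters named in this abstract can be illustrated with a minimal Python sketch. It is not the paper's evaluation code. The function name `rule_metrics`, the toy transactions, and the reading of "covered records" as the fraction of records matched by at least one rule are all assumptions made for illustration.

```python
def rule_metrics(transactions, rules):
    """Summarise a rule set over a non-empty list of transactions (sets of items).

    Each rule is an (antecedent, consequent) pair of sets.
    """
    n = len(transactions)
    supports, confidences = [], []
    covered = set()  # indices of records matched by at least one rule
    for antecedent, consequent in rules:
        antecedent = set(antecedent)
        both = antecedent | set(consequent)
        n_ante = sum(1 for t in transactions if antecedent <= t)
        matches = [i for i, t in enumerate(transactions) if both <= t]
        supports.append(len(matches) / n)
        confidences.append(len(matches) / n_ante if n_ante else 0.0)
        covered.update(matches)
    count = len(rules)
    return {
        "num_rules": count,
        "avg_support": sum(supports) / count if count else 0.0,
        "avg_confidence": sum(confidences) / count if count else 0.0,
        # Assumed definition: share of records matched by any rule.
        "covered_records": len(covered) / n,
    }


# Hypothetical toy data, not taken from the paper.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
rules = [({"bread"}, {"milk"}), ({"butter"}, {"bread"})]
metrics = rule_metrics(transactions, rules)
```

Here support is the fraction of records containing both sides of a rule, and confidence is that count divided by the number of records containing the antecedent.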

2011 ◽  
Vol 179-180 ◽  
pp. 55-59
Author(s):  
Ping Shui Wang

Association rule mining is one of the most active research areas, investigating the automatic extraction of previously unknown patterns or rules from large amounts of data. Association rules can be derived by mining large frequent candidate sets. To address the poor efficiency of the classical Apriori algorithm, which repeatedly scans the business database, and after studying existing association rule mining algorithms, we propose a new association rule mining algorithm based on a relation matrix. Theoretical analysis and experimental results show that the proposed algorithm is efficient and practical.


Author(s):  
Suma B. ◽  
Shobha G.

Association rule mining is a well-known data mining technique used for extracting hidden correlations between data items in large databases. In the majority of situations, data mining results contain sensitive information about individuals, and publishing such data will violate individual privacy. The challenge of association rule mining is to preserve the confidentiality of sensitive rules when releasing the database to external parties. The association rule hiding technique conceals the knowledge extracted by the sensitive association rules by modifying the database. In this paper, we introduce a border-based algorithm for hiding sensitive association rules. The main purpose of this approach is to conceal the sensitive rule set while maintaining the utility of the database and association rule mining results at the highest level. The performance of the algorithm in terms of side effects is demonstrated using experiments conducted on two real datasets. The results show that information loss is minimized without sacrificing accuracy.


2013 ◽  
Vol 9 (1) ◽  
pp. 1-27 ◽  
Author(s):  
Harihar Kalia ◽  
Satchidananda Dehuri ◽  
Ashish Ghosh

Association rule mining is one of the fundamental tasks of data mining. Conventional association rule mining algorithms, which use crisp sets, are meant for handling Boolean data. However, in real life quantitative data are voluminous and need careful attention for discovering knowledge. Therefore, to extract association rules from quantitative data, the dataset at hand must be partitioned into intervals and then converted into Boolean type. As a result, it may suffer from the sharp boundary problem. Hence, fuzzy association rules were developed to solve this problem by handling quantitative data using fuzzy sets. In this paper, the authors present an updated survey of fuzzy association rule mining procedures along with a discussion and relevant pointers for further research.
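The sharp boundary problem mentioned in this abstract can be illustrated with a small Python sketch. The attribute (age), the cut-off at 30, the overlap range 25 to 35, and the function names are hypothetical choices for illustration and do not come from the survey.

```python
def crisp_label(age, cut=30.0):
    """Crisp partition: every value belongs to exactly one interval."""
    return "young" if age < cut else "middle_aged"


def fuzzy_memberships(age, low=25.0, high=35.0):
    """Fuzzy partition: membership in 'young' falls linearly from 1 to 0
    between `low` and `high`; 'middle_aged' takes the complement."""
    if age <= low:
        young = 1.0
    elif age >= high:
        young = 0.0
    else:
        young = (high - age) / (high - low)
    return {"young": young, "middle_aged": 1.0 - young}


# Ages 29 and 31 are close, yet the crisp partition separates them fully.
crisp_29, crisp_31 = crisp_label(29), crisp_label(31)
# The fuzzy partition gives both ages partial membership in each set.
fuzzy_29, fuzzy_31 = fuzzy_memberships(29), fuzzy_memberships(31)
```

With the crisp cut, two nearly identical values contribute to entirely different Boolean items, whereas the fuzzy memberships change gradually across the boundary.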


Association Rule Mining (ARM) is a data mining approach for discovering rules that reveal latent associations among persisted entity sets. ARM has many significant real-world applications, such as finding interesting incidents, analyzing stock market data, and discovering hidden relationships in healthcare data, to mention a few. Many efficient algorithms for mining association rules are found in the existing literature, notably Apriori-based and Pattern-Growth approaches. A comprehensive understanding of them helps the data mining community and its stakeholders make expert decisions. Dynamically updating association rules that have already been discovered is very challenging because the changes are arbitrary and heterogeneous in the kinds of operations involved. When new instances are added to an existing dataset that has already been subjected to ARM, only those instances should be used for incremental mining of rules, instead of considering the whole dataset again. Recently, researchers have developed algorithms specifically for incremental ARM. They are broadly grouped into Apriori-based and Pattern-Growth techniques. This paper provides a review of Apriori-based and Pattern-Growth techniques that support incremental ARM.


2009 ◽  
Vol 08 (04) ◽  
pp. 345-352 ◽  
Author(s):  
Anjana Pandey ◽  
K. R. Pardasani

In this paper an attempt has been made to develop a progressive partitioning and counting inference approach for mining association rules in temporal databases. A temporal database, such as a sales database, is a set of transactions where each transaction T is a set of items and each item has an individual exhibition period. The existing models of association rule mining have problems in handling such transactions because they do not consider the exhibition period of each individual item and lack an equitable support counting basis for each item. As a remedy to this problem we propose an innovative algorithm, PPCI, that combines a progressive partition approach with a counting inference method to discover association rules in a temporal database. The basic idea of PPCI is to first segment the database into sub-databases in such a way that items in each sub-database have either a common starting time or a common ending time. Then, for each sub-database, PPCI progressively filters 1-itemsets with a cumulative filtering threshold based on vital partitioning characteristics. PPCI is also designed to employ a filtering threshold in each partition to prune cumulatively infrequent 1-itemsets early, and it uses a counting inference approach to minimise the number of pattern support counts performed when extracting frequent patterns. The execution time of PPCI is orders of magnitude smaller than that required by schemes directly extended from existing methods.


2014 ◽  
Vol 687-691 ◽  
pp. 1282-1285 ◽  
Author(s):  
Ying Sui

Information security is a matter of concern in every sector and industry, and vulnerabilities are an important cause of this issue. It is therefore necessary to analyze and predict the occurrence of vulnerabilities. This paper uses data from the CNNVD vulnerability database and the Apriori algorithm to analyze and predict the occurrence of software vulnerabilities. In the data preprocessing stage, changing the level of the vulnerability rules lets us dig out more concept associations. In the association rule evaluation stage, designing filters lets us find the rules that match the user's degree of interest. Finally, this paper demonstrates the effectiveness of this method through experiments.


2014 ◽  
Vol 687-691 ◽  
pp. 1337-1341
Author(s):  
Ran Bo Yao ◽  
An Ping Song ◽  
Xue Hai Ding ◽  
Ming Bo Li

In retail enterprises, choosing product groups on the basis of their sales records is an important problem. We should consider not only the direct benefits of a product, but also the benefits brought by cross selling. On the basis of mutual promotion in cross selling, in this paper we propose a new method to generate the optimal selection model. First, we use the Apriori algorithm to obtain the frequent item sets and analyze the association rule sets between products. We then analyze these results to generate the optimal product mixes and recommendation relationships in cross selling. The experimental results show that the proposed method has some practical value for cross-selling decisions.
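The first step described in this abstract, obtaining frequent item sets with Apriori, can be sketched in Python as follows. This is a minimal, unoptimized illustration of level-wise Apriori and not the authors' implementation. The function name `apriori_frequent_itemsets`, the toy baskets, and the 0.5 support threshold are assumptions.

```python
from itertools import combinations


def apriori_frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items}
    level = {s for s in level if support(s) >= min_support}

    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = support(itemset)
        k += 1
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must be frequent, then check support.
        level = {
            c for c in candidates
            if all(frozenset(sub) in frequent for sub in combinations(c, k - 1))
            and support(c) >= min_support
        }
    return frequent


# Hypothetical sales baskets, not taken from the paper.
baskets = [
    {"tea", "cake"},
    {"tea", "cake", "jam"},
    {"tea", "jam"},
    {"cake"},
]
frequent = apriori_frequent_itemsets(baskets, min_support=0.5)
```

Association rules between products could then be derived from `frequent` by comparing the support of each itemset with the supports of its subsets.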


2011 ◽  
Vol 1 (2) ◽  
Author(s):  
Venkatapathy Umarani ◽  
Muthusamy Punithavalli

The discovery of association rules is an important and challenging data mining task. Most existing algorithms for finding association rules require multiple passes over the entire database, and the I/O overhead incurred is extremely high for very large databases. An obvious approach to reducing the complexity of association rule mining is sampling. Recently, several sampling-based approaches have been developed to speed up association rule mining. An efficient progressive sampling-based approach is presented for mining association rules from large databases. First, frequent itemsets are mined from an initial sample, and the negative border is then computed from the mined frequent itemsets. Based on the support computed for the midpoint itemset in the sorted negative border, either the sample size is increased or association rules are mined from the sample. In this paper, we present an extensive analysis of the progressive sampling-based approach on different real-life datasets and, in addition, compare its performance with the well-known association rule mining algorithm Apriori. The experimental results show that the accuracy and computation time of the progressive sampling-based approach are effectively improved when mining association rules from the real-life datasets.
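The negative border used in this approach can be sketched in Python under the usual definition: the itemsets that are not frequent themselves but whose immediate subsets are all frequent. This sketch covers only the border computation, not the paper's sampling loop or its midpoint test. The function name `negative_border` and the toy itemsets are assumptions.

```python
from itertools import combinations


def negative_border(frequent_itemsets, items):
    """Itemsets that are not frequent but whose (k-1)-subsets all are.

    Assumes `frequent_itemsets` is downward closed, as Apriori output is.
    """
    frequent = {frozenset(s) for s in frequent_itemsets}
    frequent.add(frozenset())  # the empty set is trivially frequent
    border = set()
    for base in frequent:
        for item in items:
            if item in base:
                continue
            candidate = base | {item}
            if candidate in frequent:
                continue
            # Keep the candidate only if every immediate subset is frequent.
            if all(frozenset(sub) in frequent
                   for sub in combinations(candidate, len(candidate) - 1)):
                border.add(candidate)
    return border


# Hypothetical mined itemsets over items a, b, c, d.
mined = [{"a"}, {"b"}, {"c"}, {"a", "b"}, {"a", "c"}]
border = negative_border(mined, items={"a", "b", "c", "d"})
```

In this toy case the border holds `{d}` and `{b, c}`: both are infrequent, and all of their immediate subsets are frequent.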


Author(s):  
Emad Alsukhni ◽  
Ahmed AlEroud ◽  
Ahmad A. Saifan

Association rule mining is a very useful knowledge discovery technique for identifying co-occurrence patterns in transactional data sets. In this article, the authors propose an ontology-based framework to discover multi-dimensional association rules at different levels of a given ontology, subject to user-defined pre-processing constraints that may be identified using: 1) a hierarchy discovered in datasets; 2) the dimensions of those datasets; or 3) the features of each dimension. The proposed framework has post-processing constraints to drill down or roll up based on the rule level, making it possible to check the validity of the discovered rules in terms of the support and confidence measures without re-applying association rule mining algorithms. The authors conducted several preliminary experiments to test the framework on the Titanic dataset by identifying the association rules after pre- and post-constraints are applied. The results show that the framework can be practically applied for rule pruning and for discovering novel association rules.


2014 ◽  
Vol 918 ◽  
pp. 243-245
Author(s):  
Yu Ke Chen ◽  
Tai Xiang Zhao

Most incremental mining and online mining algorithms concentrate on finding association rules or patterns consistent with the entire current set of data. Users cannot easily obtain results from only the portion of the data that interests them. This may prevent the use of mining in online decision support for multidimensional data. To provide ad hoc, query-driven, online mining support, we first propose a relation called the multidimensional pattern relation to structurally and systematically store context and mining information for later analysis.

