A strategy for increasing the efficiency of rule discovery in data mining

Author(s):  
David McSherry

Data Mining ◽  
2013 ◽  
pp. 28-49
Author(s):  
Anthony Scime ◽  
Karthik Rajasethupathy ◽  
Kulathur S. Rajasethupathy ◽  
Gregg R. Murray

Data mining is a collection of algorithms for finding interesting and unknown patterns or rules in data. However, different algorithms can result in different rules from the same data. The process presented here exploits these differences to find particularly robust, consistent, and noteworthy rules among much larger potential rule sets. More specifically, this research focuses on using association rules and classification mining to select the persistently strong association rules. Persistently strong association rules are association rules that are verifiable by classification mining the same data set. The process for finding persistent strong rules was executed against two data sets obtained from the American National Election Studies. Analysis of the first data set resulted in one persistent strong rule and one persistent rule, while analysis of the second data set resulted in 11 persistent strong rules and 10 persistent rules. The persistent strong rule discovery process suggests these rules are the most robust, consistent, and noteworthy among the much larger potential rule sets.
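One simple reading of "verifiable by classification mining" can be sketched as a filter: keep an association rule only if a classification rule mined from the same data has the same conditions and the same outcome. The exact-match criterion, the rule representation, the function name, and the toy rules below are all assumptions made for illustration; the abstract does not spell out the matching procedure or what separates a persistent rule from a persistent strong rule.

```python
from typing import FrozenSet, List, Tuple

Rule = Tuple[FrozenSet[str], str]  # (conditions, outcome); hypothetical representation

def persistent_rules(association_rules: List[Rule],
                     classification_rules: List[Rule]) -> List[Rule]:
    """Keep association rules that also appear among the classification rules.

    Assumption: a rule counts as "verified" only on an exact match of
    conditions and outcome; the original process may use a looser test.
    """
    verified = {(frozenset(cond), outcome) for cond, outcome in classification_rules}
    return [(ante, cons) for ante, cons in association_rules
            if (frozenset(ante), cons) in verified]

# Toy rules with made-up attribute names (not taken from the election data).
assoc = [
    (frozenset({"A=1", "B=0"}), "Y=yes"),
    (frozenset({"C=1"}), "Y=no"),
    (frozenset({"A=1"}), "Y=yes"),
]
clf = [
    (frozenset({"A=1", "B=0"}), "Y=yes"),
    (frozenset({"D=1"}), "Y=no"),
]

kept = persistent_rules(assoc, clf)
print(kept)
```

Only the first toy association rule survives the filter, because it is the only one that also appears among the toy classification rules.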



Author(s):  
Ioannis N. Kouris ◽  
Christos Makris ◽  
Evangelos Theodoridis ◽  
Athanasios Tsakalidis

In recent years, we have witnessed explosive growth in the amount of data generated and stored in practically every field (e.g., science, business, medicine, and the military, to name a few). However, the ability to store more and more data has not been matched by the same rate of growth in processing power, and, therefore, much of the accumulated data remains unanalyzed today. Data mining, which can be defined as the process of applying computational techniques (i.e., algorithms implemented as computer programs) to find patterns in data, tries to bridge this gap. Data mining technologies include, among others, association rule discovery, classification, clustering, summarization, regression, and sequential pattern discovery (Adriaans & Zantinge, 1996; Chen, Han, & Yu, 1996; Fayyad, Piatetsky-Shapiro, & Smyth, 1996). Association rule discovery was motivated by applications known as market basket analysis, which find what kinds of products tend to be purchased together by customers (Agrawal, Imielinski, & Swami, 1993).



2010 ◽  
Vol 09 (01) ◽  
pp. 55-64 ◽  
Author(s):  
Fadi Thabtah ◽  
Qazafi Mahmood ◽  
Lee McCluskey ◽  
Hussein Abdel-Jaber

Associative classification is a branch of data mining that employs association rule discovery methods in classification problems. In this paper, we introduce a novel data mining method called Looking at the Class (LC), which can be utilised in associative classification approaches. Unlike known associative classification algorithms such as Classification Based on Associations (CBA), which combine disjoint itemsets regardless of their class labels in the training phase, our method joins only itemsets with similar class labels. This avoids many unnecessary itemset combinations during the learning step, and consequently yields substantial savings in computational time and memory. Moreover, a new prediction method that utilises multiple rules to make the prediction decision is also developed in this paper. The experimental results on different UCI datasets reveal that the LC algorithm outperformed CBA with respect to classification accuracy, memory usage, and execution time on most of the datasets we consider.
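The saving described above can be illustrated with a small sketch that compares joining every pair of disjoint itemsets with joining only pairs that share a class label. This is not the authors' LC implementation: the function names, the (itemset, class label) representation, and the toy itemsets are assumptions, and "similar class labels" is interpreted here as identical labels.

```python
from itertools import combinations
from typing import FrozenSet, List, Set, Tuple

LabelledItemset = Tuple[FrozenSet[str], str]  # (items, class label); hypothetical

def join_all(itemsets: List[LabelledItemset]) -> Set[FrozenSet[str]]:
    """Join every pair of disjoint itemsets, ignoring class labels."""
    return {a | b
            for (a, _), (b, _) in combinations(itemsets, 2)
            if not (a & b)}

def join_same_class(itemsets: List[LabelledItemset]) -> Set[LabelledItemset]:
    """Join only disjoint itemsets whose class labels are identical."""
    return {(a | b, label_a)
            for (a, label_a), (b, label_b) in combinations(itemsets, 2)
            if label_a == label_b and not (a & b)}

# Toy frequent 1-itemsets, each tagged with a class label.
frequent = [
    (frozenset({"outlook=sunny"}), "no"),
    (frozenset({"humidity=high"}), "no"),
    (frozenset({"wind=weak"}), "yes"),
    (frozenset({"temp=mild"}), "yes"),
]

all_candidates = join_all(frequent)
class_candidates = join_same_class(frequent)
print(len(all_candidates), len(class_candidates))
```

On these four toy itemsets the label-blind join produces six candidates while the class-aware join produces two, which is the kind of reduction the abstract attributes to joining by class.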



Author(s):  
Robert Cattral ◽  
Franz Oppacher ◽  
K. J. Lee Graham


2021 ◽  
Vol 1 (2) ◽  
pp. 54-66
Author(s):  
M. Hamdani Santoso

Data mining can generally be defined as a technique for finding (extracting) patterns or interesting information in large amounts of data that are meaningful for decision support. One of the best-known and most commonly used methods for association rule discovery is the Apriori algorithm, which finds sets of items that occur frequently together in transaction data stored in databases. Support and confidence are calculated and compared against minimum values to determine which association rules are produced. The association rules express, as a percentage, the purchasing activity for an itemset within a certain period of time, and are computed using the RapidMiner software. Test results with the Apriori algorithm show an association rule that meets the minimum confidence value: customers often buy toothpaste and detergent together. It is hoped that the information obtained by searching for patterns with the Apriori algorithm can improve future sales strategies.
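The support and confidence measures mentioned above can be computed directly from transaction data. The sketch below uses the standard definitions (support as the fraction of transactions containing an itemset, confidence as the support of the whole rule divided by the support of its antecedent); the baskets are invented for illustration and are not the study's data, and the function names are assumptions rather than RapidMiner's API.

```python
from typing import FrozenSet, Iterable, List

def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in the itemset."""
    items = frozenset(itemset)
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)

def confidence(transactions: List[FrozenSet[str]],
               antecedent: Iterable[str],
               consequent: Iterable[str]) -> float:
    """support(antecedent and consequent) / support(antecedent)."""
    ante = frozenset(antecedent)
    ante_support = support(transactions, ante)
    if ante_support == 0:
        return 0.0  # rule never applies, so report zero confidence
    return support(transactions, ante | frozenset(consequent)) / ante_support

# Made-up baskets for the rule {toothpaste} -> {detergent}.
baskets = [
    frozenset({"toothpaste", "detergent", "soap"}),
    frozenset({"toothpaste", "detergent"}),
    frozenset({"toothpaste", "bread"}),
    frozenset({"detergent", "milk"}),
    frozenset({"bread", "milk"}),
]

rule_support = support(baskets, {"toothpaste", "detergent"})
rule_confidence = confidence(baskets, {"toothpaste"}, {"detergent"})
print(rule_support, rule_confidence)
```

A rule would be reported only if both values reach the chosen minimum thresholds; with a hypothetical minimum support of 0.3 and minimum confidence of 0.6, the toy rule here would pass.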



Data Mining ◽  
2011 ◽  
pp. 191-208 ◽  
Author(s):  
Rafael S. Parpinelli ◽  
Heitor S. Lopes ◽  
Alex A. Freitas

This work proposes an algorithm for rule discovery called Ant-Miner (Ant Colony-Based Data Miner). The goal of Ant-Miner is to extract classification rules from data. The algorithm is based on recent research on the behavior of real ant colonies as well as on some data mining concepts. We compare the performance of Ant-Miner with the performance of the well-known C4.5 algorithm on six public domain data sets. The results provide evidence that: (a) Ant-Miner is competitive with C4.5 with respect to predictive accuracy; and (b) the rule sets discovered by Ant-Miner are simpler (smaller) than the rule sets discovered by C4.5.


