Uncovering Actionable Knowledge in Corporate Data with Qualified Association Rules

2011 ◽  
Vol 2 (2) ◽  
pp. 1-21 ◽  
Author(s):  
Nenad Jukic ◽  
Svetlozar Nestorov ◽  
Miguel Velasco ◽  
Jami Eddington

Association rules mining is one of the most successfully applied data mining methods in today’s business settings (e.g., Amazon or Netflix recommendations to customers). Qualified association rules mining is an extension of the association rules data mining method that uncovers previously unknown correlations that manifest themselves only under certain circumstances (e.g., on a particular day of the week). The goal is to improve the results of actions, for example by turning an underperforming campaign (spread too thin over the entire audience) into a highly targeted campaign that delivers results. Such correlations have so far not been easily reachable using standard data mining tools. This paper describes a method for straightforward discovery of qualified association rules and demonstrates the use of qualified association rules mining on an actual corporate data set. The data set is a subset of a corporate data warehouse for Sam’s Club, a division of Wal-Mart Stores, Inc. The experiments described in this paper illustrate how qualified association rules supplement standard association rules data mining methods and provide additional information that can be used to better target corporate actions.
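
To make the idea of a qualified rule concrete, the following is a minimal sketch, not the authors' method (which the abstract does not detail). It compares the support and confidence of a rule over all transactions with the same statistics computed separately per qualifier value. The toy transactions, the item names, and the function names `rule_stats` and `qualified_rule_stats` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data: (day_of_week, set of purchased items).
TRANSACTIONS = [
    ("Sat", {"grill", "charcoal"}),
    ("Sat", {"grill", "charcoal", "soda"}),
    ("Sat", {"grill", "soda"}),
    ("Mon", {"grill"}),
    ("Mon", {"charcoal"}),
    ("Mon", {"grill", "soda"}),
    ("Tue", {"soda"}),
    ("Tue", {"grill"}),
]


def rule_stats(transactions, antecedent, consequent):
    """Return (support, confidence) of the rule antecedent -> consequent."""
    n = len(transactions)
    has_antecedent = [items for _, items in transactions if antecedent <= items]
    has_both = [items for items in has_antecedent if consequent <= items]
    support = len(has_both) / n if n else 0.0
    confidence = len(has_both) / len(has_antecedent) if has_antecedent else 0.0
    return support, confidence


def qualified_rule_stats(transactions, antecedent, consequent):
    """Compute the same statistics separately for each qualifier value (the day)."""
    by_day = defaultdict(list)
    for day, items in transactions:
        by_day[day].append((day, items))
    return {
        day: rule_stats(rows, antecedent, consequent)
        for day, rows in by_day.items()
    }


overall = rule_stats(TRANSACTIONS, {"grill"}, {"charcoal"})
per_day = qualified_rule_stats(TRANSACTIONS, {"grill"}, {"charcoal"})
print("overall:", overall)
for day, stats in per_day.items():
    print(day, stats)
```

In this toy data the rule is weak overall but noticeably stronger when restricted to Saturday, which is the kind of circumstance-dependent correlation the abstract describes.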


2012 ◽  
Vol 3 (2) ◽  
pp. 24-41 ◽  
Author(s):  
Tutut Herawan ◽  
Prima Vitasari ◽  
Zailani Abdullah

One of the most popular techniques used in data mining applications is association rules mining. The purpose of this study is to apply an enhanced association rules mining method, called SLP-Growth (Significant Least Pattern Growth), to capture interesting rules from datasets on students suffering from mathematics and examination anxiety. The datasets were taken from a survey exploring study anxieties among engineering students at Universiti Malaysia Pahang (UMP). The results of this research provide useful information that helps educators make more accurate decisions about their students and adapt their teaching strategies accordingly. The results can also help students handle their fear of mathematics and examinations and improve the quality of learning.


2014 ◽  
Vol 998-999 ◽  
pp. 842-845 ◽  
Author(s):  
Jia Mei Guo ◽  
Yin Xiang Pei

Association rules extraction is one of the important goals of data mining and analysis. To address the information loss caused by crisp partitioning of numerical attributes, this article puts forward a fuzzy association rules mining method based on fuzzy logic. First, c-means clustering is used to generate fuzzy partitions and eliminate redundant data; then the original data set is mapped into fuzzy intervals; finally, fuzzy association rules are extracted from the fuzzy data set to provide a basis for sound decision-making. Results show that this method can effectively improve the efficiency of data mining and the semantic visualization and credibility of association rules.
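
The mapping from crisp values to fuzzy sets can be sketched as follows. This is an illustration under stated assumptions, not the article's implementation: it takes cluster centers as given (the clustering step is omitted), uses the standard fuzzy c-means membership formula with fuzzifier `m`, and takes the minimum of memberships as the fuzzy "and" when computing support. The data, centers, and function names are hypothetical.

```python
def fcm_memberships(x, centers, m=2.0):
    """Membership of value x in each cluster, per the fuzzy c-means formula.

    Assumes the centers are distinct and m > 1.
    """
    dists = [abs(x - c) for c in centers]
    if 0.0 in dists:
        # x lies exactly on a center: full membership in that cluster.
        return [1.0 if d == 0.0 else 0.0 for d in dists]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]


def fuzzy_support(records, centers_by_attr, fuzzy_items):
    """Average, over all records, of the minimum membership across fuzzy items.

    fuzzy_items maps an attribute name to the index of one of its fuzzy sets.
    """
    total = 0.0
    for record in records:
        total += min(
            fcm_memberships(record[attr], centers_by_attr[attr])[idx]
            for attr, idx in fuzzy_items.items()
        )
    return total / len(records)


# Hypothetical data: two numeric attributes, each with "low" (index 0)
# and "high" (index 1) fuzzy sets centered at 0 and 4.
RECORDS = [{"age": 1.0, "income": 1.0}, {"age": 2.0, "income": 2.0}]
CENTERS = {"age": [0.0, 4.0], "income": [0.0, 4.0]}

support_low_low = fuzzy_support(RECORDS, CENTERS, {"age": 0, "income": 0})
print(support_low_low)
```

Unlike a crisp partition, a value halfway between two centers contributes partially to both fuzzy sets rather than being forced into one interval.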


10.14311/931 ◽  
2008 ◽  
Vol 48 (1) ◽  
Author(s):  
P. Haluzová

This paper describes the application of data mining methods in the database of the DORIS transportation information system, currently used by the Prague Public Transit Company. The goal is to create knowledge about the behavior of objects within this information system. Data is analyzed partly with the help of descriptive statistical methods, and partly with the help of association rules, which may discover common combinations of attributes that occur most frequently within a given data set. Two types of quantifiers were used when creating the association rules; namely “founded implication” and “above average”. The results of the analysis are presented in the form of graphs and hypotheses. 


Author(s):  
Anthony Scime ◽  
Karthik Rajasethupathy ◽  
Kulathur S. Rajasethupathy ◽  
Gregg R. Murray

Data mining is a collection of algorithms for finding interesting and unknown patterns or rules in data. However, different algorithms can result in different rules from the same data. The process presented here exploits these differences to find particularly robust, consistent, and noteworthy rules among much larger potential rule sets. More specifically, this research focuses on using association rules and classification mining to select the persistently strong association rules. Persistently strong association rules are association rules that are verifiable by classification mining the same data set. The process for finding persistent strong rules was executed against two data sets obtained from the American National Election Studies. Analysis of the first data set resulted in one persistent strong rule and one persistent rule, while analysis of the second data set resulted in 11 persistent strong rules and 10 persistent rules. The persistent strong rule discovery process suggests these rules are the most robust, consistent, and noteworthy among the much larger potential rule sets.


Author(s):  
Barak Chizi ◽  
Lior Rokach ◽  
Oded Maimon

Dimensionality (i.e., the number of data set attributes or groups of attributes) constitutes a serious obstacle to the efficiency of most data mining algorithms (Maimon and Last, 2000). The main reason for this is that data mining algorithms are computationally intensive. This obstacle is sometimes known as the “curse of dimensionality” (Bellman, 1961). The objective of Feature Selection is to identify the important features in the data set and discard all other features as irrelevant or redundant information. Since Feature Selection reduces the dimensionality of the data, data mining algorithms can run faster and more effectively. In some cases, feature selection also improves the performance of the data mining method, mainly because it yields a more compact, more easily interpreted representation of the target concept. The filter approach (Kohavi, 1995; Kohavi and John, 1996) operates independently of the data mining method employed subsequently: undesirable features are filtered out of the data before learning begins. These algorithms use heuristics based on general characteristics of the data to evaluate the merit of feature subsets. A sub-category of filter methods, referred to here as rankers, consists of methods that employ some criterion to score each feature and provide a ranking. From this ordering, several feature subsets can be chosen by manually setting … There are three main approaches to feature selection: wrapper, filter and embedded. The wrapper approach (Kohavi, 1995; Kohavi and John, 1996) uses an inducer as a black box along with a statistical re-sampling technique such as cross-validation to select the best feature subset according to some predictive measure. The embedded approach (see for instance Guyon and Elisseeff, 2003) is similar to the wrapper approach in the sense that the features are specifically selected for a certain inducer, but it selects the features in the process of learning.
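
A ranker of the kind described above can be sketched in a few lines. The scoring criterion here (absolute Pearson correlation with the target) and the cutoff `k` are assumptions made for illustration; the text does not name a specific criterion or selection rule. The feature names and data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(columns, target):
    """Score each feature independently of any learner and sort, best first."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


def select_top_k(columns, target, k):
    """Keep the k highest-ranked features (k is an assumed, manually set cutoff)."""
    return [name for name, _ in rank_features(columns, target)[:k]]


# Hypothetical feature columns and a binary target.
COLUMNS = {
    "f_strong": [1, 2, 3, 4, 5, 6],
    "f_weak": [2, 1, 2, 1, 2, 1],
    "f_constant": [7, 7, 7, 7, 7, 7],
}
TARGET = [0, 0, 0, 1, 1, 1]

ranking = rank_features(COLUMNS, TARGET)
selected = select_top_k(COLUMNS, TARGET, k=2)
print(ranking)
print(selected)
```

Because the scores depend only on the data, the same ranking can be reused with any subsequent learner, which is what distinguishes a filter from a wrapper.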


2017 ◽  
Vol 9 (1) ◽  
pp. 38-49
Author(s):  
Fatma Önay Koçoğlu ◽  
İlkim Ecem Emre ◽  
Çiğdem Selçukcan Erol

The aim of this study is to analyze success in e-learning with data mining methods and to find potential patterns. To this end, 374,073 records from the 2013-14 period, taken from an institution operating in the e-learning field in Turkey, are used. The data set, collected from the information technology, banking and pharmaceutical industries, includes the employees' success and industry, the trainings they took, whether the trainings were completed, first login and last logout dates, training completion date, and duration of experience in training. Using this data set, the success status of participants is examined with data mining methods (C5.0, Random Forest and Gini). Based on the performance evaluation criteria of accuracy, error rate, specificity and F-score, C5.0 was chosen as the algorithm that gives the best performance. According to the results of the study, the employees' sector is not important; what matters is the completion status, the duration of experience, and the training.
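
The four evaluation criteria named above can all be derived from a binary confusion matrix. The sketch below uses the standard definitions and made-up counts; it does not reproduce the study's figures, and the function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, error rate, specificity and F-score from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    error_rate = (fp + fn) / total if total else 0.0
    # Specificity: share of actual negatives correctly classified.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-score (F1): harmonic mean of precision and recall.
    f_score = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "error_rate": error_rate,
        "specificity": specificity,
        "f_score": f_score,
    }


# Made-up counts for illustration only.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
print(metrics)
```

Comparing several classifiers on the same held-out data with these measures is one plausible way to arrive at a choice such as the one reported here.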

