Mining Association Rules on a NCR Teradata System

Author(s):  
Soon M. Chung ◽  
Murali Mangamuri

Data mining from relations is becoming increasingly important with the advent of parallel database systems. In this paper, we propose a new algorithm for mining association rules from relations. The new algorithm is an enhanced version of the SETM algorithm (Houtsma & Swami 1995), and it reduces the number of candidate itemsets considerably. We implemented and evaluated the new algorithm on a parallel NCR Teradata database system. The new algorithm is much faster than the SETM algorithm, and its performance is quite scalable.
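For readers unfamiliar with SETM, the following is a minimal in-memory sketch of the basic set-oriented pattern it is usually described with: count itemsets, keep the rows whose itemset is frequent, then extend each surviving row by joining it with the sales relation on the transaction id. It is not the enhanced algorithm proposed in the abstract above, and it does not run on a database system. The function name `setm`, the parameter `min_support` (an absolute count), and the toy data are assumptions made for illustration.

```python
from collections import Counter


def setm(transactions, min_support):
    """SETM-style frequent itemset mining (illustrative sketch only).

    transactions: dict mapping a transaction id to an iterable of items.
    min_support:  minimum number of transactions an itemset must appear in.
    Returns a dict mapping each frequent itemset (a sorted tuple) to its count.
    """
    # The "sales" relation: one (tid, item) row per purchased item, sorted.
    sales = sorted(
        (tid, item) for tid, items in transactions.items() for item in set(items)
    )

    # Group items by tid so the join below can look them up directly.
    # Each list is already sorted because `sales` is sorted.
    by_tid = {}
    for tid, item in sales:
        by_tid.setdefault(tid, []).append(item)

    frequent_itemsets = {}
    # Rows of the current relation: (tid, itemset), starting with 1-itemsets.
    rows = [(tid, (item,)) for tid, item in sales]

    while rows:
        # Count how many transactions support each candidate itemset.
        counts = Counter(itemset for _, itemset in rows)
        frequent = {s: n for s, n in counts.items() if n >= min_support}
        if not frequent:
            break
        frequent_itemsets.update(frequent)

        # Keep only rows whose itemset is frequent, then extend each one
        # with every later item bought in the same transaction.
        rows = [
            (tid, itemset + (item,))
            for tid, itemset in rows
            if itemset in frequent
            for item in by_tid[tid]
            if item > itemset[-1]
        ]

    return frequent_itemsets


if __name__ == "__main__":
    toy = {1: ["a", "b", "c"], 2: ["a", "b"], 3: ["a", "c"], 4: ["b", "c"]}
    for itemset, count in sorted(setm(toy, min_support=2).items()):
        print(itemset, count)
```

In a relational setting each step of the loop would correspond to a query (a grouped count, a filter, and a join on the transaction id), which is why this family of algorithms maps naturally onto a parallel database system.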

Author(s):  
Hong Shen ◽  
Susumu Horiguchi

The problem of mining association rules from databases was introduced by Agrawal, Imielinski, & Swami (1993). In this problem, we are given a set of items and a large collection of transactions, each of which is a subset (basket) of these items. The task is to find relationships between the occurrences of various items within those baskets. Mining association rules has been a central task of data mining, which is a recent research focus in database systems and machine learning and has interesting applications in various fields, including information management, query processing, and process control.


2016 ◽  
Vol 26 (1) ◽  
pp. 31-54 ◽  
Author(s):  
Jiexing Li ◽  
Jeffrey F. Naughton ◽  
Rimma V. Nehme

2007 ◽  
Vol 68 (5) ◽  
pp. 847-859 ◽  
Author(s):  
P. S. Kostenetskii ◽  
A. V. Lepikhov ◽  
L. V. Sokolinskii

Author(s):  
Ling Feng

The discovery of association rules from large amounts of structured or semi-structured data is an important data mining problem [Agrawal et al. 1993, Agrawal and Srikant 1994, Miyahara et al. 2001, Termier et al. 2002, Braga et al. 2002, Cong et al. 2002, Braga et al. 2003, Xiao et al. 2003, Maruyama and Uehara 2000, Wang and Liu 2000]. It has crucial applications in decision support and marketing strategy. The most prototypical application of association rules is market basket analysis using transaction databases from supermarkets. These databases contain sales transaction records, each of which details items bought by a customer in the transaction. Mining association rules is the process of discovering knowledge such as “80% of customers who bought diapers also bought beer, and 35% of customers bought both diapers and beer”, which can be expressed as “diaper ⇒ beer” (35%, 80%), where 80% is the confidence level of the rule, and 35% is the support level of the rule indicating how frequently the customers bought both diapers and beer. In general, an association rule takes the form X ⇒ Y (s, c), where X and Y are sets of items, and s and c are support and confidence, respectively. In the XML Era, mining association rules is confronted with more challenges than in the traditional well-structured world due to the inherent flexibilities of XML in both structure and semantics [Feng and Dillon 2005]. First, XML data has a more complex hierarchical structure than a database record. Second, elements in XML data have contextual positions, which thus carry the order notion. Third, XML data appears to be much bigger than traditional data. To address these challenges, the classic association rule mining framework originating with transactional databases needs to be re-examined.
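The support and confidence figures in the diapers-and-beer example can be reproduced with a short calculation. The sketch below assumes a rule of the form X ⇒ Y over flat market baskets; the function name `support_and_confidence` and the 80-basket toy data set are hypothetical and were chosen only so that the numbers match the 35% and 80% quoted above.

```python
def support_and_confidence(transactions, x, y):
    """Return (support, confidence) of the rule X => Y as fractions in [0, 1].

    support    = share of all baskets that contain every item of X and Y
    confidence = share of the baskets containing X that also contain Y
    """
    x, y = frozenset(x), frozenset(y)
    baskets = [frozenset(t) for t in transactions]
    if not baskets:
        raise ValueError("need at least one transaction")

    n_x = sum(1 for b in baskets if x <= b)          # baskets containing X
    n_xy = sum(1 for b in baskets if (x | y) <= b)   # baskets containing X and Y

    support = n_xy / len(baskets)
    confidence = n_xy / n_x if n_x else 0.0
    return support, confidence


# Hypothetical toy data: 80 baskets in total.
baskets = (
    [{"diapers", "beer"}] * 28   # bought both
    + [{"diapers"}] * 7          # diapers without beer
    + [{"milk"}] * 45            # neither
)

s, c = support_and_confidence(baskets, {"diapers"}, {"beer"})
print(f"support={s:.0%}, confidence={c:.0%}")
```

Here 28 of the 80 baskets contain both items (support 35%), and 28 of the 35 baskets that contain diapers also contain beer (confidence 80%).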


2008 ◽  
pp. 303-335
Author(s):  
Haorianto Cokrowijoyo Tjioe ◽  
David Taniar

Data mining applications have enormously altered the strategic decision-making processes of organizations. The application of association rules algorithms is one of the well-known data mining techniques that have been developed to cope with multidimensional databases. However, most of these algorithms focus on multidimensional data models for transactional data. As data warehouses can be presented using a multidimensional model, in this paper we provide another perspective to mine association rules in data warehouses by focusing on a measurement of summarized data. We propose four algorithms — VAvg, HAvg, WMAvg, and ModusFilter — to provide efficient data initialization for mining association rules in data warehouses by concentrating on the measurement of aggregate data. Then we apply those algorithms both on a non-repeatable predicate, which is known as mining normal association rules, using GenNLI, and a repeatable predicate using ComDims and GenHLI, which is known as mining hybrid association rules.

