A novel Bit Vector Product algorithm for mining frequent itemsets from large datasets using MapReduce framework

2017 ◽  
Vol 21 (2) ◽  
pp. 1365-1380 ◽  
Author(s):  
Sumalatha Saleti ◽  
R. B. V. Subramanyam

2012 ◽  
Vol 7 (23) ◽  
pp. 393-399
Author(s):  
Quanzhu Yao ◽  
Yubing Zhang ◽  
Jiulong Zhang

2004 ◽  
Vol 03 (04) ◽  
pp. 317-329 ◽  
Author(s):  
Imad Rahal ◽  
Dongmei Ren ◽  
William Perrizo

Association rule mining (ARM) is the data-mining process of finding all association rules in a dataset that satisfy user-defined measures of interest such as support and confidence. ARM usually proceeds by first mining all frequent itemsets, a step known to be very computationally intensive, from which rules are then derived in a straightforward manner. Frequent-itemset mining generally prunes the search space using the downward closure (or anti-monotonicity) property of support, which states that no itemset can be frequent unless all of its subsets are frequent. Many papers have addressed ARM, but few have focused on scalability over very large datasets (i.e., datasets containing a very large number of transactions). In this paper, we propose a new model for representing data and mining frequent itemsets. It is based on P-tree technology, which provides compression and fast logical operations over vertically structured data, and on set enumeration trees for fast itemset enumeration. Experimental results show large improvements for our approach on large datasets compared with other contemporary approaches in the literature.
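The downward closure property described above can be illustrated with a minimal level-wise sketch. This is not the P-tree or set-enumeration-tree method of the paper; it is a plain Apriori-style loop, and the function name `frequent_itemsets` and the toy transactions are assumptions made for illustration only.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise mining that prunes candidates via downward closure.

    transactions: list of sets of items (hypothetical toy input).
    min_support:  absolute support threshold (number of transactions).
    Returns a dict mapping each frequent itemset (frozenset) to its support.
    """
    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items
               if support(frozenset([i])) >= min_support}
    result = {}
    k = 1
    while current:
        for itemset in current:
            result[itemset] = support(itemset)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k}
        # Prune step (downward closure): a candidate survives only if
        # every one of its (k-1)-subsets is itself frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k - 1))}
        current = {c for c in candidates if support(c) >= min_support}
    return result


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    for itemset, count in sorted(frequent_itemsets(data, 3).items(),
                                 key=lambda kv: (len(kv[0]), sorted(kv[0]))):
        print(sorted(itemset), count)
```

The prune step is what keeps the candidate set small: an itemset is never counted against the data if any of its subsets has already been found infrequent.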


ARM is a significant area of knowledge mining that produces association rules, which are essential for decision making. Frequent itemset mining is challenging on large datasets: as the dataset grows, the effort and time needed to discover rules increase. This paper discusses ARM algorithms built on tree structures, such as FP-tree and FIN with the POC tree and PPC tree, with the aim of reducing overhead and running time. These algorithms use efficient data structures for mining frequent itemsets from the database. FIN uses the nodeset, a novel data structure, to extract frequent itemsets, and the POC tree to store frequent itemset information. These techniques are particularly useful in marketing. The implemented techniques show improved performance in terms of time and efficiency.
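As a rough illustration of the tree structures mentioned above, the following sketch builds the prefix-tree part of an FP-tree. It assumes the standard construction (drop infrequent items, order the rest by descending support, insert as a shared-prefix path) and omits the header table and node links; it does not cover FIN, the POC tree, or the PPC tree. The names `FPNode` and `build_fp_tree` are hypothetical.

```python
from collections import Counter


class FPNode:
    """One node of the prefix tree: an item, a count, and child nodes."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build a simplified FP-tree (no header table, no node links)."""
    # First pass: count in how many transactions each item occurs.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_support}

    root = FPNode()
    # Second pass: insert each transaction as a path from the root.
    for t in transactions:
        # Keep frequent items only, most frequent first (ties by item name),
        # so that transactions share as long a prefix as possible.
        ordered = sorted((i for i in set(t) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            node = node.children.setdefault(item, FPNode(item))
            node.count += 1
    return root


if __name__ == "__main__":
    tree = build_fp_tree([["a", "b"], ["b", "c"], ["a", "b", "c"], ["b"]], 2)
    print(tree.children["b"].count)
```

Because frequent items are ordered consistently, transactions that share common items collapse into shared paths, which is the source of the compression these tree-based methods rely on.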


2014 ◽  
Vol 989-994 ◽  
pp. 3747-3750
Author(s):  
Nai Li Liu

To address the weaknesses of the traditional Apriori algorithm, this paper presents an improved algorithm for mining frequent itemsets that constructs bit vectors and a graph. The algorithm deletes a node and its adjacent edges according to the number of edges the node has, traverses the graph to generate candidate itemsets, and verifies each candidate itemset using the bit vectors. Experimental results show that the improved algorithm is more efficient than the Apriori algorithm.
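The bit-vector verification step can be sketched as follows. This covers only the support check, not the graph construction or node deletion, whose details the abstract does not give. It assumes one bit vector per item with bit i set when transaction i contains the item, stored here as a Python integer; the function names are hypothetical.

```python
def build_bit_vectors(transactions):
    """Map each item to an integer whose bit i is set if transaction i has it."""
    vectors = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors


def support(itemset, vectors, num_transactions):
    """Support of an itemset = number of set bits in the AND of its item vectors."""
    acc = (1 << num_transactions) - 1  # start with all transactions selected
    for item in itemset:
        acc &= vectors.get(item, 0)    # an unknown item matches no transaction
    return bin(acc).count("1")


if __name__ == "__main__":
    data = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
    vecs = build_bit_vectors(data)
    print(support({"a", "b"}, vecs, len(data)))
    print(support({"a", "b", "c"}, vecs, len(data)))
```

Verifying a candidate this way needs one bitwise AND per item instead of a rescan of the transaction database, which is the usual motivation for bit-vector representations over the repeated scans of classic Apriori.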

