A novel Bit Vector Product algorithm for mining frequent itemsets from large datasets using MapReduce framework

Sumalatha Saleti; R. B. V. Subramanyam

doi:10.1007/s10586-017-1249-x

Cluster Computing ◽

10.1007/s10586-017-1249-x ◽

2017 ◽

Vol 21 (2) ◽

pp. 1365-1380 ◽

Cited By ~ 1

Author(s):

Sumalatha Saleti ◽

R. B. V. Subramanyam

Keyword(s):

Frequent Itemsets ◽

Large Datasets ◽

Vector Product ◽

MapReduce Framework ◽

Bit Vector ◽

Mining Frequent Itemsets

Mining Frequent Itemsets Based on A Vertical Bit-Vector Dot-Product CBD-Tree

Journal of Convergence Information Technology ◽

10.4156/jcit.vol7.issue23.46 ◽

2012 ◽

Vol 7 (23) ◽

pp. 393-399

Author(s):

Quanzhu Yao ◽

Yubing Zhang ◽

Jiulong Zhang

Keyword(s):

Frequent Itemsets ◽

Dot Product ◽

Bit Vector ◽

Mining Frequent Itemsets

A Scalable Vertical Model for Mining Association Rules

Journal of Information & Knowledge Management ◽

10.1142/s0219649204000912 ◽

2004 ◽

Vol 03 (04) ◽

pp. 317-329 ◽

Cited By ~ 2

Author(s):

Imad Rahal ◽

Dongmei Ren ◽

William Perrizo

Keyword(s):

Association Rules ◽

Association Rule ◽

Monotonicity Property ◽

Frequent Itemsets ◽

Large Datasets ◽

Rule Mining ◽

Very Large Datasets ◽

Computationally Intensive ◽

Downward Closure ◽

Mining Frequent Itemsets

Association rule mining (ARM) is the data-mining process of finding all association rules in a dataset that satisfy user-defined measures of interest such as support and confidence. ARM usually proceeds by first mining all frequent itemsets, a step known to be very computationally intensive, from which rules are then derived in a straightforward manner. Frequent itemset mining generally prunes the search space using the downward closure (or anti-monotonicity) property of support, which states that no itemset can be frequent unless all of its subsets are frequent. Many papers have addressed ARM, but few have focused on scalability over very large datasets (i.e., datasets containing a very large number of transactions). In this paper, we propose a new model for representing data and mining frequent itemsets that is based on P-tree technology, for compression and faster logical operations over vertically structured data, and on set enumeration trees, for fast itemset enumeration. The experimental results show large improvements for our approach on large datasets compared with other contemporary approaches in the literature.
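The downward closure property described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' P-tree model: plain Python sets of transaction IDs stand in for the vertical structure, and the function name `frequent_itemsets` and its parameters are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise mining over a vertical layout (item -> set of transaction IDs).

    Illustrative only: Python sets stand in for compressed vertical structures.
    Returns a dict mapping each frequent itemset (frozenset) to its support count.
    """
    # Build the vertical layout for single items.
    tidsets = {}
    for tid, items in enumerate(transactions):
        for item in items:
            tidsets.setdefault(frozenset([item]), set()).add(tid)

    # Keep only frequent 1-itemsets.
    current = {s: t for s, t in tidsets.items() if len(t) >= min_support}
    result = {s: len(t) for s, t in current.items()}

    k = 2
    while current:
        next_level = {}
        for a, b in combinations(list(current), 2):
            candidate = a | b
            if len(candidate) != k or candidate in next_level:
                continue
            # Downward closure: skip the candidate unless every
            # (k-1)-subset is already known to be frequent.
            if any(frozenset(sub) not in current
                   for sub in combinations(candidate, k - 1)):
                continue
            # Support of the union is the size of the tidset intersection.
            tids = current[a] & current[b]
            if len(tids) >= min_support:
                next_level[candidate] = tids
        result.update({s: len(t) for s, t in next_level.items()})
        current = next_level
        k += 1
    return result


transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
found = frequent_itemsets(transactions, min_support=3)
```

With a minimum support of 3, every single item and every pair is frequent in this toy dataset, while the triple {a, b, c} appears in only two transactions and is rejected.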

Implementation of Improved Association Rule Mining Algorithms for Fast Mining with Efficient Tree Structures on Large Datasets

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.b3876.129219 ◽

2019 ◽

Vol 9 (2) ◽

pp. 5136-5141

Keyword(s):

Association Rule ◽

Frequent Itemsets ◽

Large Datasets ◽

Frequent Itemset ◽

Rule Mining ◽

Tree Structures ◽

Significant Area ◽

Dataset Size ◽

Mining Algorithms ◽

Mining Frequent Itemsets

Association rule mining (ARM) is a significant area of knowledge mining that produces association rules, which are essential for decision making. Frequent itemset mining is challenging on large datasets: as the dataset size increases, the burden and time required to discover rules also increase. This paper discusses ARM algorithms with tree structures, such as FP-tree and FIN with the POC tree and PPC tree, for reducing overhead and running time. These algorithms use highly efficient data structures for mining frequent itemsets from the database. FIN uses the nodeset, a novel data structure, to extract frequent itemsets, and the POC tree to store frequent itemset information. These techniques are extremely helpful in marketing. The implemented techniques show improved performance in terms of time and efficiency.

Discovering Frequent Itemsets an Improved Algorithm of Bit Vector and Graph

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.989-994.3747 ◽

2014 ◽

Vol 989-994 ◽

pp. 3747-3750

Author(s):

Nai Li Liu

Keyword(s):

Frequent Itemsets ◽

Experimental Results ◽

Apriori Algorithm ◽

Bit Vector ◽

Mining Frequent Itemsets ◽

Improved Algorithm

To address the weaknesses of the traditional Apriori algorithm, this paper presents an improved algorithm for mining frequent itemsets that constructs a bit vector and a graph. The algorithm deletes a node and its adjacent edges according to the number of the node's edges, traverses the graph to generate candidate itemsets, and verifies each candidate itemset using the bit vector. Experimental results show that the improved algorithm is more efficient than the Apriori algorithm.
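The verification step mentioned in this abstract, checking a candidate itemset against bit vectors, might look roughly like the following Python sketch. It assumes one bit per transaction for each item and uses Python integers as bit vectors. The graph construction and node deletion are not reproduced, and the helper names `build_bit_vectors` and `support` are hypothetical.

```python
def build_bit_vectors(transactions):
    """Map each item to an integer whose bit i is set when transaction i contains it."""
    vectors = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vectors[item] = vectors.get(item, 0) | (1 << tid)
    return vectors


def support(itemset, vectors, n_transactions):
    """Count the transactions containing every item by AND-ing the bit vectors."""
    mask = (1 << n_transactions) - 1  # start with all transactions selected
    for item in itemset:
        mask &= vectors.get(item, 0)  # unknown items match no transaction
    return bin(mask).count("1")


transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
vectors = build_bit_vectors(transactions)
pair_support = support(["a", "b"], vectors, len(transactions))  # 2
```

Because each support count reduces to bitwise AND operations followed by a population count, a candidate can be verified without rescanning the transaction database.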
