A Memory Efficient Algorithm for Frequent Itemset Mining

Data stream mining algorithms operate under constraints on the space used and the time taken, which arise from the streaming nature of the data. The slack available in these constraints is inversely proportional to the streaming speed of the data. Because caching and mining streaming data is sensitive to these constraints, this paper devises a scalable, memory-efficient caching and frequent itemset mining model. The proposed model is an incremental approach that builds single-level, multi-node trees called bushes from each window of the streaming data; henceforth we refer to the proposed algorithm as Tree (bush) based Incremental Frequent Itemset Mining (TIFIM) over data streams.


Memory-efficient frequent-itemset mining

Proceedings of the 14th International Conference on Extending Database Technology - EDBT/ICDT '11 ◽

10.1145/1951365.1951420 ◽

2011 ◽

Cited By ~ 12

Author(s):

Benjamin Schlegel ◽

Rainer Gemulla ◽

Wolfgang Lehner

Keyword(s):

Frequent Itemset ◽

Frequent Itemset Mining ◽

Itemset Mining ◽

Memory Efficient


AT-Mine: An Efficient Algorithm of Frequent Itemset Mining on Uncertain Dataset

Journal of Computers ◽

10.4304/jcp.8.6.1417-1426 ◽

2013 ◽

Vol 8 (6) ◽

Cited By ~ 14

Author(s):

Le Wang ◽

Lin Feng ◽

Mingfei Wu

Keyword(s):

Efficient Algorithm ◽

Frequent Itemset ◽

Frequent Itemset Mining ◽

Itemset Mining


Frequent Itemset Mining: A Metadata Based Approach for Knowledge Discovery

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v6i3.316320 ◽

2018 ◽

Vol 6 (3) ◽

pp. 316-320

Author(s):

Basavaraj A. Goudannavar ◽

Prashant Bhat

Keyword(s):

Knowledge Discovery ◽

Frequent Itemset ◽

Frequent Itemset Mining ◽

Itemset Mining


Inverse Frequent Itemset Mining Based on FP-Tree

Journal of Software ◽

10.3724/sp.j.1001.2008.00338 ◽

2008 ◽

Vol 19 (2) ◽

pp. 338-350 ◽

Cited By ~ 2

Author(s):

Yu-Hong GUO

Keyword(s):

Frequent Itemset ◽

Frequent Itemset Mining ◽

Itemset Mining


A Synopsis Based Approach for Itemset Frequency Estimation over Massive Multi-Transaction Stream

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3465238 ◽

2021 ◽

Vol 16 (2) ◽

pp. 1-30

Author(s):

Guangtao Wang ◽

Gao Cong ◽

Ying Zhang ◽

Zhen Hai ◽

Jieping Ye

Keyword(s):

Frequency Estimation ◽

Frequent Itemsets ◽

Frequent Itemset ◽

Experimental Results ◽

Closure Property ◽

Frequent Itemset Mining ◽

Itemset Mining ◽

Minimum Value ◽

Downward Closure ◽

Bounded Size

Streams in which multiple transactions are associated with the same key are prevalent in practice, e.g., a customer has multiple shopping records arriving at different times. Itemset frequency estimation on such streams is very challenging since sampling-based methods, such as the popular reservoir sampling, cannot be used. In this article, we propose a novel k-Minimum Value (KMV) synopsis based method to estimate the frequency of itemsets over multi-transaction streams. First, we extract the KMV synopses for each item from the stream. Then, we propose a novel estimator to estimate the frequency of an itemset over the KMV synopses. Compared to the existing estimator, our method is not only more accurate and more efficient to calculate but also follows the downward-closure property. These properties enable the incorporation of our new estimator into existing frequent itemset mining (FIM) algorithms (e.g., FP-Growth) to mine frequent itemsets over multi-transaction streams. To demonstrate this, we implement a KMV synopsis based FIM algorithm by integrating our estimator into existing FIM algorithms, and we prove that it is capable of guaranteeing the accuracy of FIM with a bounded size of KMV synopsis. Experimental results on massive streams show that our estimator can significantly improve the accuracy of both itemset frequency estimation and FIM compared to the existing estimators.
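To make the two steps in the abstract above concrete (per-item KMV synopses, then an itemset estimate computed over them), here is a minimal Python sketch. It is not the estimator proposed in the article, whose details the abstract does not give. It uses the classic KMV intersection estimator instead, and it assumes that the frequency of an itemset is the number of distinct keys associated with every item in it. All names (`unit_hash`, `KMVSynopsis`, `build_synopses`, `estimate_itemset_frequency`) are hypothetical.

```python
import hashlib
import heapq


def unit_hash(key: str) -> float:
    """Map a key to a pseudo-uniform value in [0, 1)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


class KMVSynopsis:
    """Keeps the k smallest distinct hash values of the keys seen so far."""

    def __init__(self, k: int):
        self.k = k
        self.values = set()

    def add(self, key: str) -> None:
        h = unit_hash(key)
        if h in self.values:
            return
        if len(self.values) < self.k:
            self.values.add(h)
            return
        largest = max(self.values)  # O(k) scan; a heap would be faster
        if h < largest:
            self.values.remove(largest)
            self.values.add(h)


def build_synopses(stream, k):
    """stream yields (key, items) pairs; returns one synopsis per item."""
    synopses = {}
    for key, items in stream:
        for item in items:
            synopses.setdefault(item, KMVSynopsis(k)).add(key)
    return synopses


def estimate_itemset_frequency(synopses, itemset):
    """Estimate how many distinct keys are associated with every item.

    Assumption: classic KMV intersection estimator, not the article's.
    """
    items = list(itemset)
    if not items or any(i not in synopses for i in items):
        return 0.0
    parts = [synopses[i].values for i in items]
    common = set.intersection(*parts)
    # A synopsis that never filled up holds every key, so the count is exact.
    if all(len(synopses[i].values) < synopses[i].k for i in items):
        return float(len(common))
    k = min(len(p) for p in parts)
    smallest = heapq.nsmallest(k, set.union(*parts))  # synopsis of the union
    hits = sum(1 for v in smallest if v in common)
    # (fraction of the union sample in the intersection) * (union size estimate)
    return (hits / k) * (k - 1) / smallest[-1]
```

Each synopsis stores at most k hash values, so memory grows with the number of items rather than with the number of keys or transactions. While none of the synopses involved has filled up, the function returns an exact count; after that, the result is an estimate whose accuracy improves as k grows.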
