Finding Frequent Itemsets over Data Streams with Sliding Window

2018 ◽  
Vol 11 (8) ◽  
pp. 85-94
Author(s):  
Jeong Hee Hwang ◽  
Hyeok Kim ◽  
Jeong Hee Chi
2012 ◽  
Vol 256-259 ◽  
pp. 2910-2913
Author(s):  
Jun Tan

Online mining of frequent closed itemsets over streaming data is one of the most important issues in mining data streams. In this paper, we propose a novel sliding-window-based algorithm. The algorithm exploits lattice properties to limit the search to frequent closed itemsets that share at least one item with the new transaction. Experimental results on synthetic datasets show that the proposed algorithm is both time and space efficient.
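The idea of restricting the search to closed itemsets that share an item with the arriving transaction can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the names `add_transaction`, `frequent_closed`, `closed`, and `index` are hypothetical, the update rule is a generic intersection-based one, and only the arrival step is shown (removing the transaction that leaves the window is omitted).

```python
from collections import defaultdict


def add_transaction(closed, index, txn):
    """Update closed-itemset supports in place for one arriving transaction.

    closed: dict mapping frozenset -> support count (hypothetical layout)
    index:  defaultdict(set) mapping item -> closed itemsets containing it
    """
    txn = frozenset(txn)
    if not txn:
        return
    # Only closed itemsets sharing at least one item with txn can change.
    candidates = set()
    for item in txn:
        candidates |= index[item]

    # For every intersection with txn, keep the largest support among the
    # closed itemsets producing it; that is the support of its closure
    # before txn arrived (0 if txn itself was never covered).
    closure_support = {txn: 0}
    for c in candidates:
        inter = c & txn
        closure_support[inter] = max(closure_support.get(inter, 0), closed[c])

    # Each intersection is closed after the arrival and gains one occurrence.
    for itemset, support in closure_support.items():
        if itemset not in closed:
            for item in itemset:
                index[item].add(itemset)
        closed[itemset] = support + 1


def frequent_closed(closed, min_count):
    """Return the closed itemsets whose support reaches min_count."""
    return {s for s, n in closed.items() if n >= min_count}


closed = {}
index = defaultdict(set)
for t in (["a", "b", "c"], ["a", "b"], ["a", "c"]):
    add_transaction(closed, index, t)
```

The inverted `index` is what keeps the candidate set small: closed itemsets with no item in common with the new transaction are never visited.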


Author(s):  
Jia-Ling Koh ◽  
Shu-Ning Shin ◽  
Yuan-Bin Don

A data stream, an unbounded sequence of data elements generated at a rapid rate, provides a dynamic environment for collecting data. The knowledge embedded in a data stream is likely to change quickly over time. Therefore, capturing the recent trend of the data is an important issue when mining frequent itemsets over data streams. Although the sliding window model offers a good solution to this problem, the traditional approach has to maintain the complete occurrence information of patterns within a sliding window. To estimate the approximate supports of patterns within a sliding window, the frequency changing point (FCP) method is proposed for monitoring the recent occurrences of itemsets over a data stream. In addition to a basic design that assumes exactly one transaction arrives at each time point, the FCP method is extended to maintain recent patterns over a data stream in which a block of zero or more transactions arrives within each fixed time unit. The recently frequent itemsets or representative patterns are then discovered approximately from the maintained structure. Experimental studies demonstrate that the proposed algorithms achieve high true positive rates and guarantee no false dismissals in the results. A theoretical analysis is provided for this guarantee. In addition, the authors’ approach significantly reduces run-time memory usage compared with the previously proposed method.
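The time-unit window model described above, in which each time unit delivers a block of zero or more transactions, can be sketched in Python as follows. This is an exact-counting baseline for single items, written only to illustrate the window model; it is not the FCP method, whose summary structure the abstract does not specify. The class name `TimeWindowCounter` and its methods are hypothetical.

```python
from collections import Counter, deque


class TimeWindowCounter:
    """Exact single-item supports over the last `width` time units.

    Assumption: this keeps every transaction of the window in memory,
    which is the cost an approximate method would try to avoid.
    """

    def __init__(self, width):
        self.width = width
        self.blocks = deque()    # one list of transactions per time unit
        self.counts = Counter()  # occurrences of each item in the window
        self.size = 0            # number of transactions in the window

    def push_block(self, transactions):
        """Advance one time unit with a block of zero or more transactions."""
        block = [frozenset(t) for t in transactions]
        self.blocks.append(block)
        for t in block:
            self.counts.update(t)
        self.size += len(block)
        # Expire the oldest time unit once the window is over-full.
        if len(self.blocks) > self.width:
            expired = self.blocks.popleft()
            for t in expired:
                self.counts.subtract(t)
            self.size -= len(expired)

    def frequent_items(self, min_ratio):
        """Items whose support ratio in the current window is >= min_ratio."""
        if self.size == 0:
            return set()
        return {i for i, n in self.counts.items() if n >= min_ratio * self.size}


counter = TimeWindowCounter(width=2)
counter.push_block([["a", "b"], ["a"]])  # time unit 1: two transactions
counter.push_block([])                   # time unit 2: no transactions
counter.push_block([["b"]])              # time unit 3: unit 1 expires
```

Because the window is measured in time units rather than transactions, `size` varies from one moment to the next, so the support threshold is expressed as a ratio of the current window size.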


2018 ◽  
Vol 16 (6) ◽  
pp. 961-969
Author(s):  
Saihua Cai ◽  
Shangbo Hao ◽  
Ruizhi Sun ◽  
Gang Wu

Abstract: The huge volume of data streams makes it impractical to mine all recent frequent itemsets. Because the maximal frequent itemsets imply all the frequent itemsets and are far fewer in number, mining maximal frequent itemsets costs much less time and memory. This paper proposes an improved method called Recent Maximal Frequent Itemsets Mining (RMFIsM) to mine recent maximal frequent itemsets over data streams with a sliding window. The RMFIsM method uses two matrices to store the information of data streams: the first matrix stores the information of each transaction, and the second stores the frequent 1-itemsets. The frequent p-itemsets are mined through an “extension” process applied to the frequent 2-itemsets, and the maximal frequent itemsets are obtained by deleting the sub-itemsets of longer frequent itemsets. Finally, the performance of the RMFIsM method is evaluated through a series of experiments; the results show that the proposed RMFIsM method can mine recent maximal frequent itemsets efficiently.
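The general pipeline described above (a transaction matrix over the window, level-wise extension of frequent itemsets, then deletion of sub-itemsets) can be sketched in Python. This is an illustrative simplification, not the RMFIsM implementation: the function name `maximal_frequent_itemsets` is hypothetical, the transaction matrix is stored as one integer bitmask per item, and the whole window is re-mined on each call rather than updated incrementally.

```python
from collections import deque
from itertools import combinations


def maximal_frequent_itemsets(window, min_count):
    """Return the maximal frequent itemsets of the transactions in `window`."""
    items = sorted({i for t in window for i in t})
    # Bit k of rows[i] is set when transaction k of the window contains item i.
    rows = {i: sum(1 << k for k, t in enumerate(window) if i in t) for i in items}

    def support(mask):
        return bin(mask).count("1")

    # Frequent 1-itemsets, each paired with its transaction bitmask.
    level = {frozenset([i]): m for i, m in rows.items() if support(m) >= min_count}
    frequent = dict(level)
    # Extension step: join two frequent p-itemsets that differ in one item.
    while level:
        next_level = {}
        for a, b in combinations(level, 2):
            cand = a | b
            if len(cand) == len(a) + 1 and cand not in next_level:
                mask = level[a] & level[b]  # transactions containing all of cand
                if support(mask) >= min_count:
                    next_level[cand] = mask
        frequent.update(next_level)
        level = next_level
    # Delete every itemset that has a frequent proper superset.
    return {s for s in frequent if not any(s < t for t in frequent)}


# Sliding window holding the 4 most recent transactions (size is an assumption).
window = deque(maxlen=4)
for txn in (["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]):
    window.append(frozenset(txn))

result = maximal_frequent_itemsets(window, min_count=2)
```

Using `deque(maxlen=4)` makes the oldest transaction drop out automatically when a new one is appended, which is the simplest way to model a transaction-count sliding window.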

