Discovering Frequent Itemsets Reflected User Characteristics Using Weighted Batch based on Data Stream

2011 ◽  
Vol 11 (1) ◽  
pp. 56-64
Author(s):  
Bok-Il Seo ◽  
Jae-In Kim ◽  
Bu-Hyun Hwang
2010 ◽  
Vol 26-28 ◽  
pp. 118-122
Author(s):  
Chong Huan Xu ◽  
Chun Hua Ju

Based on the features of data streams and the sliding-window model, a new algorithm, A-MFI, is proposed for mining maximal frequent itemsets in data streams using a self-adjusting and orderly-compound policy. Operating on basic windows, the algorithm updates its information from fragments of the data stream and scans the stream only once, collecting the frequent itemsets and storing them in a frequent itemsets list as the stream flows. The core idea is to construct a self-adjusting and orderly-compound FP-tree, use mixed subset pruning techniques to reduce the search space, and merge nodes that have equal minsup in the same branch, compressing the tree into an orderly-compound FP-tree that avoids superset checking when mining maximal frequent itemsets. The experimental results show that the algorithm has higher time and space efficiency and good scalability.
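
To make the terms in this abstract concrete, the following is a minimal sketch of a sliding window built from basic windows (batches) and of what "maximal frequent itemset" means within it. It is not the A-MFI algorithm: it uses brute-force counting instead of an FP-tree, and the names `SlidingWindow`, `frequent_itemsets`, and `maximal_itemsets` are hypothetical.

```python
from collections import Counter, deque
from itertools import combinations


def frequent_itemsets(transactions, min_count):
    """Brute-force count of every itemset; suitable only for tiny examples."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_count}


def maximal_itemsets(frequent):
    """Keep only frequent itemsets that have no frequent proper superset."""
    return {s for s in frequent if not any(s < other for other in frequent)}


class SlidingWindow:
    """Holds the most recent `num_batches` basic windows (assumed design)."""

    def __init__(self, num_batches):
        # deque with maxlen drops the oldest batch automatically on overflow
        self.batches = deque(maxlen=num_batches)

    def push(self, batch):
        self.batches.append(list(batch))

    def maximal_frequent(self, min_count):
        transactions = [t for batch in self.batches for t in batch]
        return maximal_itemsets(frequent_itemsets(transactions, min_count))


window = SlidingWindow(num_batches=2)
window.push([{"a", "b", "c"}, {"a", "b"}])
window.push([{"a", "b", "c"}, {"c", "d"}])
result_1 = window.maximal_frequent(min_count=2)   # only {a, b, c} is maximal

window.push([{"d", "e"}, {"d", "e"}])             # first batch slides out
result_2 = window.maximal_frequent(min_count=2)   # {c} and {d, e}
```

In the first query, seven itemsets are frequent but a single maximal itemset summarizes them, which is why maximal itemsets are attractive for stream mining.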


Author(s):  
Rodrigo Salvador Monteiro ◽  
Geraldo Zimbrão ◽  
Holger Schwarz ◽  
Bernhard Mitschang ◽  
Jano Moreira de Souza

Calendar-based pattern mining aims at identifying patterns on specific calendar partitions. Examples of calendar partitions are every Monday, every first working day of each month, and every holiday. Providing flexible mining capabilities for calendar-based partitions is especially challenging in a data stream scenario: the calendar partitions of interest are not known a priori, and at each point in time only a subset of the detailed data is available. The authors show how a data warehouse approach can be applied to this problem. The data warehouse, which keeps track of frequent itemsets holding on different partitions of the original stream, has low storage requirements. Nevertheless, it allows sets of patterns to be derived that are complete and precise. Furthermore, the authors demonstrate the effectiveness of their approach through a series of experiments.
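
The idea of choosing the calendar partition after the data has arrived can be sketched as follows. This is a simplified illustration, not the authors' data warehouse design: it stores counts for all itemsets up to a small size per day, and the class `PartitionStore` and its methods are hypothetical names.

```python
from collections import Counter, defaultdict
from datetime import date
from itertools import combinations


class PartitionStore:
    """Keeps itemset counts per calendar day so partitions can be chosen later."""

    def __init__(self, max_size=2):
        self.max_size = max_size               # largest itemset size tracked (assumption)
        self.counts = defaultdict(Counter)     # day -> itemset -> count
        self.totals = Counter()                # day -> number of transactions

    def add(self, day, transaction):
        items = sorted(set(transaction))
        self.totals[day] += 1
        for k in range(1, self.max_size + 1):
            for combo in combinations(items, k):
                self.counts[day][frozenset(combo)] += 1

    def frequent(self, in_partition, min_support):
        """Relative support of itemsets over the days selected by `in_partition`."""
        days = [d for d in self.totals if in_partition(d)]
        total = sum(self.totals[d] for d in days)
        if total == 0:
            return {}
        merged = Counter()
        for d in days:
            merged.update(self.counts[d])
        return {s: c / total for s, c in merged.items() if c / total >= min_support}


store = PartitionStore()
store.add(date(2024, 1, 1), {"milk", "bread"})            # a Monday
store.add(date(2024, 1, 2), {"beer"})                     # a Tuesday
store.add(date(2024, 1, 8), {"milk", "bread", "eggs"})    # a Monday
store.add(date(2024, 1, 8), {"milk"})                     # a Monday

# The "every Monday" partition is only specified at query time.
mondays = store.frequent(lambda d: d.weekday() == 0, min_support=0.6)
```

Here `mondays` contains `{milk}`, `{bread}`, and `{milk, bread}`; the Tuesday transaction does not influence the result.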


2010 ◽  
Vol 26-28 ◽  
pp. 113-117
Author(s):  
Pei Shuai Chen ◽  
Chong Huan Xu

Mining maximal frequent itemsets has the advantage of producing a relatively small number of itemsets. Compared with mining frequent itemsets or frequent closed itemsets, such algorithms have higher time and space efficiency. Based on the features of data streams and the sliding-window model, a new algorithm, E-FPMFI, is proposed for mining maximal frequent itemsets in data streams using an orderly-compound policy. Operating on basic windows, the algorithm updates its information from fragments of the data stream and scans the stream only once, collecting the frequent itemsets and storing them in a frequent itemsets list. The algorithm constructs an FP-tree, then compresses the orderly FP-tree by merging nodes that have equal minsup in the same branch; it also uses a mixed subset pruning technique and avoids superset checking. The experimental results show that the algorithm has higher time and space efficiency and good scalability.
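
The node-merging step can be illustrated on a single branch. The sketch below assumes that "equal minsup" means equal support count, and it represents a root-to-leaf path as a plain list of `(item, count)` pairs rather than a real FP-tree; both the representation and the function name `compress_branch` are hypothetical.

```python
from itertools import groupby


def compress_branch(branch):
    """Merge consecutive nodes with equal support count into one compound node.

    `branch` is a root-to-leaf path given as (item, count) pairs.
    Returns a list of (items_tuple, count) compound nodes.
    """
    compressed = []
    # groupby groups only adjacent nodes, so order along the branch is preserved
    for count, group in groupby(branch, key=lambda node: node[1]):
        items = tuple(item for item, _ in group)
        compressed.append((items, count))
    return compressed


branch = [("a", 5), ("b", 5), ("c", 3), ("d", 2), ("e", 2)]
compressed = compress_branch(branch)
```

The five nodes collapse into three compound nodes: `("a", "b")` with count 5, `("c",)` with count 3, and `("d", "e")` with count 2. Items merged into one node always occur together along that branch, so they can be handled as a unit.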


2019 ◽  
Vol 4 (3) ◽  
pp. 237
Author(s):  
Maliha Momtaz ◽  
Abu Ahmed Ferdaus ◽  
Chowdhury Farhan Ahmed ◽  
Mohammad Samiullah

Author(s):  
Jia-Ling Koh ◽  
Shu-Ning Shin ◽  
Yuan-Bin Don

Recently, the data stream, which is an unbounded sequence of data elements generated at a rapid rate, has provided a dynamic environment for collecting data sources. The knowledge embedded in a data stream is likely to change quickly as time goes by. Therefore, catching the recent trend of data is an important issue when mining frequent itemsets over data streams. Although the sliding window model offers a good solution to this problem, the traditional approach has to maintain the complete occurrence information of patterns within a sliding window. For estimating the approximate supports of patterns within a sliding window, the frequency changing point (FCP) method is proposed for monitoring the recent occurrences of itemsets over a data stream. In addition to a basic design proposed under the assumption that exactly one transaction arrives at each time point, the FCP method is extended to maintain recent patterns over a data stream in which a block containing a varying number of transactions (zero or more) arrives within a fixed time unit. Accordingly, the recently frequent itemsets or representative patterns are discovered approximately from the maintained structure. Experimental studies demonstrate that the proposed algorithms achieve high true positive rates and guarantee no false dismissals in the results. A theoretical analysis is provided for the guarantee. In addition, the authors' approach significantly reduces run-time memory usage compared with the previously proposed method.
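
For contrast with the approximate FCP method, the following sketch shows the exact baseline the abstract calls the traditional approach: a time-based sliding window that keeps full counts for every block, where each time unit may carry zero or more transactions. It is not the FCP method, it tracks single items rather than itemsets to stay short, and the class `BlockWindow` is a hypothetical name.

```python
from collections import Counter, deque


class BlockWindow:
    """Exact item counts over the last `width` time units."""

    def __init__(self, width):
        self.width = width
        self.blocks = deque()     # one Counter per time unit, oldest first
        self.sizes = deque()      # number of transactions per time unit
        self.counts = Counter()   # item counts over the whole window
        self.total = 0            # transactions currently in the window

    def push(self, block):
        """Add the block of transactions for one time unit (may be empty)."""
        block_counts = Counter(item for t in block for item in set(t))
        self.blocks.append(block_counts)
        self.sizes.append(len(block))
        self.counts.update(block_counts)
        self.total += len(block)
        if len(self.blocks) > self.width:
            # Expire the oldest time unit and remove its contribution
            self.counts.subtract(self.blocks.popleft())
            self.total -= self.sizes.popleft()

    def frequent_items(self, min_support):
        if self.total == 0:
            return set()
        return {i for i, c in self.counts.items()
                if c > 0 and c / self.total >= min_support}


w = BlockWindow(width=2)
w.push([{"a", "b"}, {"a"}])
w.push([])                                   # a time unit with no transactions
first = w.frequent_items(0.5)                # {"a", "b"}
w.push([{"c"}, {"c", "b"}, {"c"}])           # the first block expires
second = w.frequent_items(0.5)               # {"c"}
```

The memory cost of this baseline grows with the window width because every block's counts are retained, which is the overhead an approximate summary is meant to reduce.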

