Data Stream Frequent Closed Item Sets Mining Based on Fast Sliding Window

Given the mobility and continuity of data streams, this paper presents an algorithm called NSWR that mines frequent item sets from a fast sliding window over a data stream, meeting the need to obtain frequent item sets from recently arrived data. NSWR stores data using an efficient bit-sequence representation of items within the sliding window; supports queries under different support thresholds through a hash-table-based method for querying frequent closed item set results; and offers a screening method, based on a classification of closed item sets, that reduces the number of item sets requiring closure checks and thereby lowers the computational complexity. Experiments show that the algorithm has better time and space efficiency.
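
The bit-sequence idea in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the class name `BitWindow` and its methods `add` and `support` are hypothetical, and the window is assumed to be count-based (the last `size` transactions). Each item keeps one integer whose bits mark the window positions where the item occurs, so the support of an item set is the number of set bits in the AND of its items' sequences.

```python
class BitWindow:
    """Sliding window over the last `size` transactions, one bit-sequence per item."""

    def __init__(self, size):
        self.size = size
        self.mask = (1 << size) - 1   # keeps only the `size` most recent bits
        self.bits = {}                # item -> int used as a bit-sequence
        self.seen = 0                 # number of transactions currently in the window

    def add(self, transaction):
        # Shift every sequence left by one; the oldest bit falls outside the mask.
        for item in list(self.bits):
            shifted = (self.bits[item] << 1) & self.mask
            if shifted:
                self.bits[item] = shifted
            else:
                del self.bits[item]   # item no longer occurs anywhere in the window
        # Set the newest bit for each item of the arriving transaction.
        for item in transaction:
            self.bits[item] = self.bits.get(item, 0) | 1
        self.seen = min(self.seen + 1, self.size)

    def support(self, itemset):
        if not itemset:
            return self.seen          # the empty set occurs in every transaction
        # AND the sequences: set bits mark transactions containing all items.
        acc = self.mask
        for item in itemset:
            acc &= self.bits.get(item, 0)
        return bin(acc).count("1")


window = BitWindow(size=3)
for t in [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]:
    window.add(t)
print(window.support({"a", "b"}))  # transactions {a,c}, {a,b,c}, {b} remain in the window
```

Sliding the window is a shift and a mask per item, so no transaction has to be rescanned when old data expires.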

AFARTICA

Journal of Database Management ◽

10.4018/jdm.2019070104 ◽

2019 ◽

Vol 30 (3) ◽

pp. 71-93

Author(s):

Saubhik Paladhi ◽

Sankhadeep Chatterjee ◽

Takaaki Goto ◽

Soumya Sen

Keyword(s):

Threshold Value ◽

Search Space ◽

Apriori Algorithm ◽

The Novel ◽

Novel Technique ◽

Frequent Item ◽

Artificial Cell ◽

Typical Item ◽

Support Threshold ◽

Frequent Item Sets

Frequent item-set mining has been studied exhaustively over the last decade. Several successful approaches have been proposed to identify the maximal frequent item-sets from a set of typical item-sets. The present work introduces a novel pruning mechanism that proves to be significantly time-efficient. The technique is based on the Artificial Cell Division (ACD) algorithm, which has been found to be highly successful in tasks that involve a multi-way search of the search space. The necessity conditions of the ACD process have been modified to handle the pruning procedure. The proposed algorithm, AFARTICA, has been compared with the Apriori algorithm implemented in WEKA. The experimental results show that AFARTICA outperforms the Apriori algorithm, and they indicate that the proposed algorithm performs better when the support threshold is higher for the same set of item-sets.

Mining Prominent Closed Frequent Item sets from Data Streams using Dynamic and Adaptive Minimum Support Threshold

2018 3rd International Conference on Computational Systems and Information Technology for Sustainable Solutions (CSITSS) ◽

10.1109/csitss.2018.8768749 ◽

2018 ◽

Author(s):

Pavitra Bai S. ◽

Ravi Kumar G.K.

Keyword(s):

Data Streams ◽

Minimum Support ◽

Frequent Item ◽

Support Threshold ◽

Frequent Item Sets

Mining Maximum Frequent Item Sets Over Data Streams Using Transaction Sliding Window Techniques

International Journal of Information Technology Convergence and Services ◽

10.5121/ijitcs.2013.3201 ◽

2013 ◽

Vol 3 (2) ◽

pp. 1-10

Author(s):

Neeraj ◽

Anuradha

Keyword(s):

Data Streams ◽

Sliding Window ◽

Frequent Item ◽

Frequent Item Sets

An Efficient Closed Frequent Item Sets Mining Algorithm-For Mining Closed Frequent Item Sets from Data Streams

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2016.5741 ◽

2016 ◽

Vol 13 (10) ◽

pp. 7467-7474

Author(s):

Venu Madhav Kuthadi ◽

Rajalakshmi Selvaraj

Keyword(s):

Data Streams ◽

Data Stream ◽

Processing Time ◽

Frequent Itemset ◽

Memory Usage ◽

Data Set ◽

Frequent Item ◽

Mining Algorithm ◽

Data Elements ◽

Frequent Item Sets

A data stream is a continuous sequence of data elements generated from a specified source. Mining frequent item sets in dynamic databases and data streams poses challenges that make the task harder than in static databases. Many frequent itemset mining methods have been developed, but they share the familiar problems of memory usage and processing time, because data elements in a stream arrive at a rapid rate and the incoming data is unbounded and possibly infinite. Given the high speed and large volume of incoming data, a frequent item set mining algorithm must operate within limited memory and processing time. To address this drawback of existing methods, this paper proposes a new algorithm, named CFIM, for mining closed frequent item sets from data streams based on their utility and consistency. During mining, a hash table is maintained to check whether a given item set is closed. Computing closed frequent item sets from the data stream minimizes memory usage and processing time. The performance of the proposed technique is analyzed using a synthetic data set and compared with existing mining techniques.
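
A hash-table closedness check of the kind mentioned above might look like the following sketch. The abstract does not say how CFIM keys its table, so keying by support count is an assumption here, and the function name `closed_itemsets` is hypothetical. An item set is closed when no proper superset has the same support, which means only item sets in the same support bucket need to be compared.

```python
from collections import defaultdict


def closed_itemsets(supports):
    """Return the item sets that have no proper superset with equal support.

    `supports` maps frozenset -> support count and is assumed to be mined already.
    """
    # Hash table (assumed layout): support count -> item sets with that support.
    by_support = defaultdict(list)
    for itemset, count in supports.items():
        by_support[count].append(itemset)

    closed = []
    for itemset, count in supports.items():
        # Only item sets in the same bucket can violate closedness.
        if not any(itemset < other for other in by_support[count]):
            closed.append(itemset)
    return closed


supports = {
    frozenset("a"): 3,
    frozenset("c"): 3,
    frozenset("ac"): 3,
    frozenset("b"): 2,
    frozenset("ab"): 2,
}
print(closed_itemsets(supports))  # {a} and {c} are absorbed by {a, c}; {b} by {a, b}
```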

A Method for Processing Top-k Continuous Query on Uncertain Data Stream in Sliding Window Model

WSEAS TRANSACTIONS ON SYSTEMS AND CONTROL ◽

10.37394/23203.2021.16.22 ◽

2021 ◽

Vol 16 ◽

pp. 261-269

Author(s):

Raja Azhan Syah Raja Wahab ◽

Siti Nurulain Mohd Rum ◽

Hamidah Ibrahim ◽

Fatimah Sidi ◽

Iskandar Ishak

Keyword(s):

Query Processing ◽

Data Streams ◽

Data Stream ◽

Uncertain Data ◽

Research Work ◽

Computational Cost ◽

Sliding Window ◽

Possible World ◽

Processing Methods ◽

Uncertain Data Streams

A data stream is a series of data generated sequentially over time from different sources. Processing such data is very important in many contemporary applications such as sensor networks, RFID technology, mobile computing and many more. The huge amount of data generated and its frequent changes within a short time make conventional processing methods insufficient. The Sliding Window Model (SWM) was introduced by Datar et al. to handle this problem. The main challenges are avoiding multiple scans of the whole data set, optimizing memory usage, and processing only the most recent tuples. In uncertain data the number of possible world instances grows exponentially, which makes it very difficult to answer top-k queries in the shortest possible time. Following the generation of rules and the probability theory of this model, a framework was envisaged to sustain the top-k processing algorithm over the SWM until the candidates expire. Based on the literature review, no existing work tackles the issues that arise from top-k query processing over the possible world instances of uncertain data streams within the SWM. The major issues resulting from these scenarios need to be addressed, especially the computational redundancy that increases computational cost within the SWM. Therefore, the main objective of this research is to propose top-k query processing methods over uncertain data streams in the SWM that utilize the score and the Possible World (PW) setting. In this study, a novel expiration and object indexing method is introduced to address the computational redundancy issues. We believe the proposed method can reduce computational costs by managing the insertion and exit policy for the right tuple candidates within a specified window frame. This research contributes to the area of computational query processing.
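
To make the exponential growth of possible worlds concrete, here is a brute-force baseline. It is not the indexing method proposed in the paper, and it rests on several assumptions: tuples are `(tuple_id, score, probability)` triples with unique ids, tuples exist independently of each other, and the window is count-based. The function name `topk_probabilities` is hypothetical. A window of n tuples yields 2**n worlds, which is exactly the cost a practical method has to avoid.

```python
from collections import deque
from itertools import product


def topk_probabilities(window, k):
    """Brute-force P(tuple is among the k highest scores) over all possible worlds."""
    tuples = list(window)
    result = {tid: 0.0 for tid, _, _ in tuples}
    # Each world fixes, for every tuple, whether it exists or not.
    for present in product([False, True], repeat=len(tuples)):
        world_prob = 1.0
        for (_, _, p), here in zip(tuples, present):
            world_prob *= p if here else 1.0 - p
        alive = [t for t, here in zip(tuples, present) if here]
        alive.sort(key=lambda t: t[1], reverse=True)
        for tid, _, _ in alive[:k]:
            result[tid] += world_prob
    return result


# deque(maxlen=...) drops the oldest tuple automatically when a new one arrives.
window = deque(maxlen=3)
for t in [("t1", 90, 0.5), ("t2", 80, 1.0), ("t3", 70, 0.5), ("t4", 60, 1.0)]:
    window.append(t)
print(topk_probabilities(window, k=2))
```

Here `t1` has already expired from the window, so only `t2`, `t3` and `t4` contribute worlds.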

An Approximate Approach for Maintaining Recent Occurrences of Itemsets in a Sliding Window over Data Streams

Complex Data Warehousing and Knowledge Discovery for Advanced Retrieval Development ◽

10.4018/978-1-60566-748-5.ch014 ◽

2010 ◽

pp. 308-327

Author(s):

Jia-Ling Koh ◽

Shu-Ning Shin ◽

Yuan-Bin Don

Keyword(s):

Data Streams ◽

Data Stream ◽

Traditional Approach ◽

Experimental Studies ◽

Dynamic Environment ◽

Sliding Window ◽

Fixed Time ◽

Frequent Itemsets ◽

Embedded Knowledge ◽

Data Elements

A data stream, an unbounded sequence of data elements generated at a rapid rate, provides a dynamic environment for collecting data sources. The knowledge embedded in a data stream is likely to change quickly over time. Therefore, capturing the recent trend of the data is an important issue when mining frequent itemsets over data streams. Although the sliding window model offers a good solution to this problem, the traditional approach has to maintain complete occurrence information for the patterns within a sliding window. To estimate the approximate supports of patterns within a sliding window, the frequency changing point (FCP) method is proposed for monitoring the recent occurrences of itemsets over a data stream. In addition to a basic design that assumes exactly one transaction arrives at each time point, the FCP method is extended to maintain recent patterns over a data stream in which a block of a varying number of transactions (zero or more) arrives within a fixed time unit. Accordingly, the recently frequent itemsets or representative patterns are discovered approximately from the maintained structure. Experimental studies demonstrate that the proposed algorithms achieve high true positive rates and guarantee no false dismissals in the results. A theoretical analysis is provided for this guarantee. In addition, the authors’ approach significantly reduces run-time memory usage compared with the previously proposed method.

estMax: Tracing Maximal Frequent Item Sets Instantly over Online Transactional Data Streams

IEEE Transactions on Knowledge and Data Engineering ◽

10.1109/tkde.2008.233 ◽

2009 ◽

Vol 21 (10) ◽

pp. 1418-1431 ◽

Cited By ~ 12

Author(s):

Ho Jin Woo ◽

Won Suk Lee

Keyword(s):

Data Streams ◽

Frequent Item ◽

Frequent Item Sets ◽

Transactional Data ◽

Transactional Data Streams

An Efficient Algorithm for Mining Of frequent items using incremental model

International Journal of Computer Science and Informatics ◽

10.47893/ijcsi.2011.1004 ◽

2011 ◽

pp. 18-22

Author(s):

Nibedita Panigrahi ◽

P.K. Pattnaik ◽

S.K. Padhi

Keyword(s):

Data Streams ◽

Efficient Algorithm ◽

Experimental Result ◽

Current Frequency ◽

The Past ◽

Frequent Item ◽

Incremental Model ◽

Current State ◽

Frequent Items ◽

Frequent Item Sets

Data mining is part of the knowledge discovery in databases (KDD) process. As technology advances, floods of data can be produced and shared in many applications such as wireless sensor networks or Web click streams. This calls for extracting useful information and knowledge from streams of data. In this paper, we propose an efficient algorithm with which the current frequencies of all frequent item sets can be produced immediately at any time. The current frequency of an item set in a stream is defined as its maximal frequency over all possible windows in the stream from any point in the past until the current state. The experimental results show that the proposed algorithm not only maintains a small summary of information for each item set but also consumes less memory than existing algorithms for mining frequent item sets over recent data streams.
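
The definition of current frequency given above can be computed directly, as in the following naive sketch. It rescans the whole stream, so it only illustrates the definition and is not the paper's summary-based algorithm; the function name `current_frequency` and the representation of the stream as a list of Python sets are assumptions.

```python
def current_frequency(stream, itemset):
    """Maximal frequency of `itemset` over every window that ends at the latest transaction."""
    best = 0.0
    count = 0
    # Walk backwards so that each step extends the window by one older transaction.
    for length, transaction in enumerate(reversed(stream), start=1):
        if itemset <= transaction:
            count += 1
        best = max(best, count / length)
    return best


stream = [{"a"}, {"a"}, {"b"}, {"a"}, {"b"}]
print(current_frequency(stream, {"a"}))
```

In this example the best window for `{"a"}` is the whole stream, with 3 occurrences in 5 transactions.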

Research of a De-Noising Algorithm Based on Sliding Window

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.461.355 ◽

2012 ◽

Vol 461 ◽

pp. 355-359

Author(s):

Wen Chuan Yang ◽

Ying Hua Song ◽

Ting Xi Gou

Keyword(s):

Knowledge Discovery ◽

Association Rules ◽

Sliding Window ◽

Experimental Results ◽

Data Set ◽

Model Based ◽

Frequent Item ◽

Telecom Service ◽

Mining Algorithms ◽

Frequent Item Sets

Top-quality, efficient service is increasingly important in the telecom industry. One of the challenging issues is dealing with atypical incidents. While traditional mining algorithms focus on high-frequency item sets, de-noising for atypical incidents remains an open problem. This paper proposes a de-noising model based on the sliding window. In this model, an FP-tree and multi-association rules are introduced to fix the thresholds of the sliding window. Experimental results demonstrate that the proposed algorithm can provide an appropriate data set for the knowledge discovery of atypical incidents.

Data Mining Algorithm of Frequent Probability Item Based on Sliding Window

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.602-605.3268 ◽

2014 ◽

Vol 602-605 ◽

pp. 3268-3271

Author(s):

Zhi Zhang ◽

Qi Fu

Keyword(s):

Data Stream ◽

Uncertain Data ◽

Sliding Window ◽

Relevant Information ◽

Data Mining Algorithm ◽

Performance Requirement ◽

Frequent Item ◽

Mining Algorithm ◽

Frequent Items ◽

And Performance

To meet the demand for mining uncertain data streams in large dynamic databases, a frequent probability item mining algorithm based on a sliding window is proposed. The mass data in the database is treated as a data stream. Within the window model of the data stream, the frequent item set is extracted according to the probability frequency distribution of the data. Compared with traditional algorithms, the approach overcomes the constraints of mining only certain data streams and mitigates the tendency to lose relevant information. The true information in the data is reflected fully, and the most accurate frequent items are mined. Simulation results show that the new algorithm mines frequent items accurately, with a higher accuracy rate than the traditional method, and processes the data quickly. It provides an effective strategy for analyzing large databases and can meet the memory and performance requirements of database analysis and mining.
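
The abstract does not spell out how probabilities enter the frequency test. One common model for uncertain data, used here purely as an assumption and not confirmed as the paper's method, is expected support: each transaction assigns every item an existence probability, and an item is frequent when the sum of its probabilities over the window reaches a minimum fraction of the window length. The names `frequent_items` and `min_ratio` are hypothetical.

```python
from collections import deque


def frequent_items(window, min_ratio):
    """Items whose expected support in `window` is at least `min_ratio * len(window)`.

    `window` holds dicts mapping item -> existence probability (assumed model).
    """
    expected = {}
    for transaction in window:
        for item, prob in transaction.items():
            expected[item] = expected.get(item, 0.0) + prob
    threshold = min_ratio * len(window)
    return {item: s for item, s in expected.items() if s >= threshold}


# Count-based sliding window: the oldest transaction is dropped automatically.
window = deque(maxlen=3)
for t in [{"a": 0.9, "b": 0.2}, {"a": 0.8}, {"a": 0.5, "b": 0.9}, {"b": 0.7, "c": 0.4}]:
    window.append(t)
print(frequent_items(window, min_ratio=0.5))
```

With the first transaction expired, only `b` reaches the threshold of 1.5 expected occurrences in this window.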
