Top-k Framework: Processing Snapshot & Continuous Queries over Uncertain Data Stream using Sliding Window

A data stream is a sequence of data items generated over time by different sources. Processing such data is important in many contemporary applications, such as sensor networks, RFID technology, and mobile computing. The huge volume of data generated and its frequent changes over short periods make conventional processing methods insufficient. The Sliding Window Model (SWM) was introduced by Datar et al. to handle this problem. The main challenges are avoiding multiple scans of the whole data set, optimizing memory usage, and processing only the most recent tuples. In uncertain data, the number of possible world instances grows exponentially, which makes it difficult to answer top-k queries in the shortest possible time. Following the generation of rules and the probability theory of this model, a framework is proposed to sustain a top-k processing algorithm over the SWM until the candidates expire. Based on the literature review, no existing work tackles the issues that arise from top-k query processing over the possible world instances of uncertain data streams within the SWM. The major issue in these scenarios is computational redundancy, which increases the computational cost within the SWM and needs to be addressed. Therefore, the main objective of this research is to propose top-k query processing methods over uncertain data streams in the SWM using the score and the Possible World (PW) setting. In this study, a novel expiration and object indexing method is introduced to address the computational redundancy issues. We believe the proposed method can reduce computational costs by managing the insertion and exit policy for the right tuple candidates within a specified window frame. This research contributes to the area of computational query processing.
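To make the setting concrete, the following is a minimal Python sketch of top-k querying over uncertain tuples in a count-based sliding window. It is not the expiration and indexing method the abstract proposes, which is not described here. It assumes that each tuple carries a score and an independent existence probability, that scores are distinct, and that a tuple is ranked by its probability of appearing among the k highest-scoring tuples across all possible worlds. The names `topk_probabilities` and `SlidingWindowTopK` are hypothetical.

```python
from collections import deque


def topk_probabilities(tuples, k):
    """Return {id: P(tuple is among the k highest-scoring existing tuples)}.

    `tuples` is a list of (id, score, prob). Assumes independent existence
    probabilities and distinct scores (illustrative assumptions).
    """
    ordered = sorted(tuples, key=lambda t: t[1], reverse=True)
    # dist[j] = probability that exactly j higher-scored tuples exist, j < k.
    # Mass for "k or more exist" is dropped, so sum(dist) = P(fewer than k exist).
    dist = [1.0] + [0.0] * (k - 1)
    result = {}
    for tid, _score, prob in ordered:
        result[tid] = prob * sum(dist)
        # Fold the current tuple into the distribution for lower-scored tuples.
        new = [0.0] * k
        for j in range(k):
            new[j] = dist[j] * (1.0 - prob)
            if j > 0:
                new[j] += dist[j - 1] * prob
        dist = new
    return result


class SlidingWindowTopK:
    """Count-based sliding window; recomputes the answer on each query."""

    def __init__(self, window_size, k):
        # deque(maxlen=...) drops the oldest tuple automatically on overflow.
        self.window = deque(maxlen=window_size)
        self.k = k

    def insert(self, tid, score, prob):
        self.window.append((tid, score, prob))

    def query(self):
        probs = topk_probabilities(list(self.window), self.k)
        return sorted(probs, key=probs.get, reverse=True)[: self.k]


w = SlidingWindowTopK(window_size=3, k=1)
w.insert("a", 10, 0.4)
w.insert("b", 8, 1.0)
w.insert("c", 5, 1.0)
first_answer = w.query()
w.insert("d", 20, 0.9)  # "a" expires from the window
second_answer = w.query()
```

The dynamic program avoids enumerating the exponentially many possible worlds, but this sketch still rescans the whole window on every query, which is the kind of redundant computation the abstract aims to reduce.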


Data Mining Algorithm of Frequent Probability Item Based on Sliding Window

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.602-605.3268 ◽ 2014 ◽ Vol 602-605 ◽ pp. 3268-3271

Author(s): Zhi Zhang ◽ Qi Fu

Keyword(s): Data Stream ◽ Uncertain Data ◽ Sliding Window ◽ Relevant Information ◽ Data Mining Algorithm ◽ Performance Requirement ◽ Frequent Item ◽ Mining Algorithm ◽ Frequent Items ◽ And Performance

To meet the demand for uncertain data stream mining in large dynamic databases, a frequent probability item mining algorithm based on a sliding window is proposed. The mass data in the database is regarded as a data stream. Within the window model of the data stream, the frequent item set is extracted according to the probability frequency distribution of the data. Compared with the traditional algorithm, the proposed algorithm overcomes the environmental constraints of mining certain (deterministic) data streams and reduces the tendency to lose relevant information. The true information in the data is reflected fully, and the most accurate frequent items are mined. Simulation results show that the new algorithm mines frequent items accurately, with a higher accuracy rate than the traditional method, and processes the data quickly. It provides an effective strategy for analyzing large databases and can meet the memory and performance requirements of database analysis and mining.
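As an illustration of the general idea, the sketch below maintains frequent items over a sliding window of uncertain transactions in Python. The abstract does not specify how frequency is defined, so this assumes the common expected-support measure: the sum of an item's existence probabilities over the transactions in the window. The class name `ExpectedSupportWindow` and its methods are hypothetical and do not reproduce the authors' algorithm.

```python
from collections import defaultdict, deque


class ExpectedSupportWindow:
    """Track expected support of items over the last `window_size` transactions."""

    def __init__(self, window_size):
        self.window_size = window_size
        self.window = deque()
        self.support = defaultdict(float)  # item -> sum of probabilities in window

    def insert(self, transaction):
        """Add a transaction given as {item: existence probability}."""
        self.window.append(transaction)
        for item, p in transaction.items():
            self.support[item] += p
        # Expire the oldest transaction and subtract its contribution,
        # so the window is never rescanned from scratch.
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            for item, p in expired.items():
                self.support[item] -= p
                if self.support[item] <= 1e-12:
                    del self.support[item]

    def frequent_items(self, min_expected_support):
        """Return {item: expected support} for items meeting the threshold."""
        return {
            item: s
            for item, s in self.support.items()
            if s >= min_expected_support
        }


win = ExpectedSupportWindow(window_size=2)
win.insert({"a": 0.9, "b": 0.2})
win.insert({"a": 0.8})
before = win.frequent_items(1.0)
win.insert({"b": 0.5, "c": 1.0})  # the first transaction expires
after = win.frequent_items(1.0)
```

Updating the support incrementally keeps the per-transaction cost proportional to the transaction size rather than the window size. Repeated floating-point addition and subtraction can accumulate small errors over long streams, so a production version might periodically recompute the sums.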
