Clustering over uncertain data stream

An efficient Algorithm for Frequent Pattern Mining over Uncertain Data Stream

2019 12th International Symposium on Computational Intelligence and Design (ISCID) ◽

10.1109/iscid.2019.00026 ◽

2019 ◽

Author(s):

Mingye Xie ◽

Long Tan

Keyword(s):

Efficient Algorithm ◽

Data Stream ◽

Pattern Mining ◽

Uncertain Data ◽

Frequent Pattern Mining ◽

Frequent Pattern

Online Clustering on Uncertain Data Stream

Journal of Physics Conference Series ◽

10.1088/1742-6596/1189/1/012025 ◽

2019 ◽

Vol 1189 ◽

pp. 012025

Author(s):

A Makhmutova ◽

I Anikin

Keyword(s):

Data Stream ◽

Uncertain Data ◽

Online Clustering

Research on Clustering Algorithm Based on Grid Density on Uncertain Data Stream

International Journal of Database Theory and Application ◽

10.14257/ijdta.2016.9.9.08 ◽

2016 ◽

Vol 9 (9) ◽

pp. 83-96

Author(s):

Tang Xianghong ◽

Yang Quanwei ◽

Zheng Yang

Keyword(s):

Data Stream ◽

Clustering Algorithm ◽

Uncertain Data

Classifier Ensemble Algorithm for Data Stream with Attribute Uncertainty

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2016.5747 ◽

2016 ◽

Vol 13 (10) ◽

pp. 7519-7525 ◽

Cited By ~ 1

Author(s):

Zhang Xing ◽

Wang MeiLi ◽

Zhang Yang ◽

Ning Jifeng

Keyword(s):

Decision Tree ◽

Data Stream ◽

High Speed ◽

Information Gain ◽

Uncertain Data ◽

Classifier Ensemble ◽

Ensemble Classifiers ◽

Decision Tree Algorithm ◽

Tree Algorithm ◽

Ensemble Algorithm

To build a classifier for uncertain data streams, an Ensemble of Uncertain Decision Tree Algorithm (EDTU) is proposed. First, the decision tree algorithm for uncertain data (DTU) is improved by changing how its information gain is calculated and by raising its efficiency so that it can keep up with high-speed data streams. Then, with this improved tree as the base classifier, a dynamic classifier ensemble algorithm is applied, and the classifiers that classify effectively are selected to form the ensemble. Experimental results on the SEA and Forest Covertype datasets demonstrate that the proposed EDTU algorithm is efficient in classifying data streams with uncertain attributes, and that its performance is stable under different parameter settings.
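The selection step described in this abstract can be illustrated with a minimal sketch. It is not the EDTU algorithm itself: the abstract does not state the selection criterion, so the accuracy threshold `min_accuracy`, the majority vote, and all function names below are assumptions made for illustration, and the base classifiers are plain callables rather than uncertain decision trees.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# Hypothetical types: a classifier maps a feature vector to a class label.
Classifier = Callable[[Sequence[float]], int]
Chunk = List[Tuple[Sequence[float], int]]  # (features, true label) pairs


def accuracy(clf: Classifier, chunk: Chunk) -> float:
    """Fraction of examples in the chunk that clf labels correctly."""
    correct = sum(1 for x, y in chunk if clf(x) == y)
    return correct / len(chunk)


def select_ensemble(classifiers: List[Classifier], chunk: Chunk,
                    min_accuracy: float = 0.6) -> List[Classifier]:
    """Keep only classifiers that are accurate enough on the latest chunk.

    The threshold rule is an assumption, not the criterion from the paper.
    """
    return [clf for clf in classifiers if accuracy(clf, chunk) >= min_accuracy]


def vote(ensemble: List[Classifier], x: Sequence[float]) -> int:
    """Predict by simple majority vote over the selected classifiers."""
    votes = Counter(clf(x) for clf in ensemble)
    return votes.most_common(1)[0][0]


# Toy base classifiers standing in for trained trees.
def clf_a(x: Sequence[float]) -> int:
    return int(x[0] > 0.5)


def clf_b(x: Sequence[float]) -> int:
    return 0  # always predicts class 0


def clf_c(x: Sequence[float]) -> int:
    return int(x[0] > 0.4)


latest_chunk: Chunk = [([0.2], 0), ([0.9], 1), ([0.7], 1), ([0.1], 0)]
ensemble = select_ensemble([clf_a, clf_b, clf_c], latest_chunk)
prediction = vote(ensemble, [0.8])
```

Re-running `select_ensemble` on each new chunk is what makes the ensemble dynamic in this sketch: a classifier that stops matching the current stream drops out of the vote.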

Clustering on Uncertain Data Stream over Sliding Windows

2015 Third International Conference on Advanced Cloud and Big Data ◽

10.1109/cbd.2015.32 ◽

2015 ◽

Author(s):

Li Tu

Keyword(s):

Data Stream ◽

Uncertain Data ◽

Sliding Windows

A Method for Processing Top-k Continuous Query on Uncertain Data Stream in Sliding Window Model

WSEAS Transactions on Systems and Control ◽

10.37394/23203.2021.16.22 ◽

2021 ◽

Vol 16 ◽

pp. 261-269

Author(s):

Raja Azhan Syah Raja Wahab ◽

Siti Nurulain Mohd Rum ◽

Hamidah Ibrahim ◽

Fatimah Sidi ◽

Iskandar Ishak

Keyword(s):

Query Processing ◽

Data Streams ◽

Data Stream ◽

Uncertain Data ◽

Research Work ◽

Computational Cost ◽

Sliding Window ◽

Possible World ◽

Processing Methods ◽

Uncertain Data Streams

A data stream is a series of data items generated sequentially over time from different sources. Processing such data is very important in many contemporary applications such as sensor networks, RFID technology, and mobile computing. The huge amount of data generated and its frequent changes in a short time make conventional processing methods insufficient. The Sliding Window Model (SWM) was introduced by Datar et al. to handle this problem. Avoiding multiple scans of the whole data set, optimizing memory usage, and processing only the most recent tuples are the main challenges. The number of possible world instances grows exponentially in uncertain data, which makes it highly difficult to complete top-k query processing in the shortest amount of time. Following the generation of rules and the probability theory of this model, a framework is envisaged to sustain a top-k processing algorithm over the SWM until the candidates expire. Based on the literature review, no existing work tackles the issue arising from top-k query processing over the possible world instances of uncertain data streams within the SWM. The major issue resulting from these scenarios needs to be addressed, especially the computational redundancy that increases computational cost within the SWM. Therefore, the main objective of this research work is to propose top-k query processing methods over uncertain data streams in the SWM utilizing the score and the Possible World (PW) setting. In this study, a novel expiration and object indexing method is introduced to address the computational redundancy issues. We believe the proposed method can reduce computational costs by managing the insertion and exit policy for the right tuple candidates within a specified window frame. This research work will contribute to the area of computational query processing.
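To make the possible-world blow-up concrete, the following is a minimal brute-force sketch, not the expiration and indexing method the abstract proposes. It assumes a count-based sliding window, tuples that exist independently with a given probability, and a hypothetical `UTuple` record; it enumerates all 2^n worlds of the current window and accumulates, for each tuple, the probability of appearing in the top-k by score.

```python
from collections import deque
from itertools import product
from typing import Deque, Dict, NamedTuple


class UTuple(NamedTuple):
    """Hypothetical uncertain tuple: id, score, and existence probability."""
    tid: str
    score: float
    prob: float


def topk_probabilities(window: Deque[UTuple], k: int) -> Dict[str, float]:
    """Probability that each tuple in the window ranks in the top-k.

    Enumerates every possible world (each tuple present or absent), so the
    cost is O(2^n) in the window size n. Tuples are assumed independent.
    """
    items = list(window)
    result = {t.tid: 0.0 for t in items}
    for mask in product([False, True], repeat=len(items)):
        world_prob = 1.0
        present = []
        for t, exists in zip(items, mask):
            world_prob *= t.prob if exists else 1.0 - t.prob
            if exists:
                present.append(t)
        # Rank the tuples that exist in this world by descending score.
        present.sort(key=lambda t: t.score, reverse=True)
        for t in present[:k]:
            result[t.tid] += world_prob
    return result


# Count-based window of the 3 most recent tuples; older ones expire.
window: Deque[UTuple] = deque(maxlen=3)
for t in [UTuple("a", 10, 0.5), UTuple("b", 8, 0.8), UTuple("c", 6, 1.0)]:
    window.append(t)
before = topk_probabilities(window, k=1)

window.append(UTuple("d", 9, 0.5))  # "a" falls out of the window
after = topk_probabilities(window, k=1)
```

Recomputing from scratch after every arrival, as done here, is exactly the kind of redundant work that grows with the window; doubling the window size squares the number of worlds enumerated.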

DDEUDSC: A Dynamic Distance Estimation using Uncertain Data Stream Clustering in mobile wireless sensor networks

Measurement ◽

10.1016/j.measurement.2014.05.040 ◽

2014 ◽

Vol 55 ◽

pp. 423-433 ◽

Cited By ~ 15

Author(s):

Qinghua Luo ◽

Xiaozhen Yan ◽

Junbao Li ◽

Yu Peng

Keyword(s):

Wireless Sensor Networks ◽

Sensor Networks ◽

Data Stream ◽

Uncertain Data ◽

Distance Estimation ◽

Wireless Sensor ◽

Mobile Wireless ◽

Stream Clustering ◽

Data Stream Clustering ◽

Mobile Wireless Sensor

A Review of Uncertain Data Stream Clustering Algorithms

2015 Eighth International Conference on Internet Computing for Science and Engineering (ICICSE) ◽

10.1109/icicse.2015.30 ◽

2015 ◽

Cited By ~ 2

Author(s):

Yue Yang ◽

Zhuo Liu ◽

Zhidan Xing

Keyword(s):

Data Stream ◽

Clustering Algorithms ◽

Uncertain Data ◽

Stream Clustering ◽

Data Stream Clustering

PROBABILISTIC QUERYING OVER UNCERTAIN DATA STREAMS

International Journal of Uncertainty Fuzziness and Knowledge-Based Systems ◽

10.1142/s0218488512500328 ◽

2012 ◽

Vol 20 (05) ◽

pp. 701-728 ◽

Cited By ~ 1

Author(s):

MOHAMMAD G. DEZFULI ◽

MOSTAFA S. HAGHJOO

Keyword(s):

Data Streams ◽

Data Stream ◽

Memory Management ◽

Query Language ◽

Uncertain Data ◽

Sensor Data ◽

Data Stream Management ◽

Probabilistic Data ◽

Stream Management ◽

Probabilistic Data Streams

The inherent imprecision of data in many applications motivates us to support uncertainty as a first-class concept. Data streams and probabilistic data have each received considerable attention recently, but in isolation. However, many applications, including sensor data management systems and object monitoring systems, need both in tandem. Our main contribution is the design of a probabilistic data stream management system, called Sarcheshmeh, for continuous querying over probabilistic data streams. Sarcheshmeh supports uncertainty from input data to final query results. In this paper, after reviewing the requirements and applications of probabilistic data streams, we present our new data model for probabilistic data streams and formally define our main logical operators. Then, we present the query language and physical operators. In addition, we introduce the architecture of Sarcheshmeh and describe some major challenges, such as memory management, and our floating precision mechanism for designing a more robust system. Finally, we report an evaluation of our system and the effect of floating precision on the tradeoff between accuracy and efficiency.

Queries for Uncertain Data on Dataspace Based on Effective Clustering Algorithm

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.380-384.1529 ◽

2013 ◽

Vol 380-384 ◽

pp. 1529-1532

Author(s):

Shuang Zhang ◽

Shi Xiong Zhang

Keyword(s):

Data Stream ◽

Clustering Algorithm ◽

Uncertain Data ◽

Effective Strategy ◽

Clustering Method ◽

Probabilistic Data ◽

Stream Clustering ◽

Data Stream Clustering ◽

Strong Cluster ◽

First Time

This paper presents P-Stream, an effective clustering algorithm for probabilistic data streams, developed here for the first time. For the uncertain tuples in the data stream, P-Stream introduces the concepts of strong, transitional, and weak clusters. With these concepts, an effective strategy for choosing the candidate cluster is designed, which can find a sound cluster for every continuously arriving data point. In this paper, we systematically define the dataspace and uncertain data, and propose an updated algorithm for queries on uncertain data based on the effective clustering algorithm.
