A Hybrid Approach for Clustering Uncertain Time Series

2021 ◽  
Vol 28 (4) ◽  
pp. 255-267
Author(s):  
Ruizhe Ma ◽  
Xiaoping Zhu ◽  
Li Yan

Information uncertainty is pervasive in real-world applications, and processing and analyzing uncertain data has become a crucial issue in data and knowledge engineering. In this paper, we concentrate on clustering uncertain time series, in which the uncertain value at each time point is represented by a probability density function. We propose a hybrid clustering approach for uncertain time series. Our approach first partitions the uncertain time series into a set of micro-clusters and then merges the micro-clusters following the idea of hierarchical clustering. We evaluate our approach experimentally. The results show that, compared with the traditional UK-means clustering algorithm, our approach achieves a clearly higher Adjusted Rand Index (ARI). In addition, the time efficiency of our clustering approach is significantly improved.
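The two-phase idea in this abstract (micro-clusters first, hierarchical merging second) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: each uncertain point is summarized by a hypothetical (mean, variance) pair, phase 1 is a simple leader-style pass with an assumed threshold `max_dist`, and phase 2 merges the closest pair of centroids until `k` clusters remain. The function names are invented for this example.

```python
from typing import List, Tuple

# Assumption: each time point's pdf is summarized by (mean, variance).
Series = List[Tuple[float, float]]


def expected_sq_dist(a: Series, b: Series) -> float:
    """Expected squared Euclidean distance between two independent
    uncertain series: E[(X-Y)^2] = (mu_x - mu_y)^2 + var_x + var_y."""
    return sum((ma - mb) ** 2 + va + vb for (ma, va), (mb, vb) in zip(a, b))


def micro_clusters(data: List[Series], max_dist: float) -> List[List[int]]:
    """Phase 1 (assumed leader-style pass): a series joins the micro-cluster
    whose leader (first member) is nearest, if within max_dist; otherwise
    it starts a new micro-cluster."""
    clusters: List[List[int]] = []
    for i, s in enumerate(data):
        best, best_d = None, max_dist
        for c in clusters:
            d = expected_sq_dist(s, data[c[0]])
            if d <= best_d:
                best, best_d = c, d
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters


def centroid(data: List[Series], members: List[int]) -> List[float]:
    """Pointwise average of the member means."""
    length = len(data[members[0]])
    return [sum(data[j][t][0] for j in members) / len(members)
            for t in range(length)]


def merge_micro_clusters(data: List[Series], clusters: List[List[int]],
                         k: int) -> List[List[int]]:
    """Phase 2: agglomerative merging of the closest centroid pair
    until only k clusters remain."""
    clusters = [list(c) for c in clusters]
    while len(clusters) > k:
        cents = [centroid(data, c) for c in clusters]
        pairs = [
            (sum((x - y) ** 2 for x, y in zip(cents[a], cents[b])), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        ]
        _, a, b = min(pairs)
        clusters[a].extend(clusters[b])
        del clusters[b]
    return clusters


# Toy data: five uncertain series of length 3, variance 0.1 everywhere.
toy = [
    [(0.0, 0.1), (0.0, 0.1), (0.0, 0.1)],
    [(0.2, 0.1), (0.1, 0.1), (0.0, 0.1)],
    [(5.0, 0.1), (5.0, 0.1), (5.0, 0.1)],
    [(5.1, 0.1), (4.9, 0.1), (5.0, 0.1)],
    [(1.5, 0.1), (1.5, 0.1), (1.5, 0.1)],
]
micro = micro_clusters(toy, max_dist=1.0)
final = merge_micro_clusters(toy, micro, k=2)
print(micro, final)
```

Working on micro-clusters rather than individual series means the pairwise merging step compares far fewer objects, which is one plausible reason a hybrid scheme of this kind can be faster than clustering every series directly.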

2016 ◽  
Vol 26 (09n10) ◽  
pp. 1361-1377 ◽  
Author(s):  
Daoyuan Li ◽  
Tegawende F. Bissyande ◽  
Jacques Klein ◽  
Yves Le Traon

Time series mining has become essential for extracting knowledge from the abundant data that flows out from many application domains. To overcome storage and processing challenges in time series mining, compression techniques are being used. In this paper, we investigate the loss/gain of performance of time series classification approaches when fed with lossy-compressed data. This extended empirical study is essential for reassuring practitioners, but also for providing more insights on how compression techniques can even be effective in smoothing and reducing noise in time series data. From a knowledge engineering perspective, we show that time series may be compressed by 90% using discrete wavelet transforms and still achieve remarkable classification accuracy, and that residual details left by popular wavelet compression techniques can sometimes even help to achieve higher classification accuracy than the raw time series data, as they better capture essential local features.
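To make the compression idea concrete, here is a minimal sketch of lossy wavelet compression: transform the series, keep only the largest 10% of coefficients, and reconstruct. The Haar wavelet, the power-of-two length requirement, and the function names are assumptions made for this example; the abstract does not say which wavelet family or thresholding rule the authors used.

```python
import math
from typing import List


def haar_forward(x: List[float]) -> List[float]:
    """Full orthonormal Haar decomposition; len(x) must be a power of two."""
    out = list(x)
    n = len(out)
    while n > 1:
        half = n // 2
        approx = [(out[2 * i] + out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        detail = [(out[2 * i] - out[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        out[:n] = approx + detail  # approximations first, details after
        n = half
    return out


def haar_inverse(coeffs: List[float]) -> List[float]:
    """Inverse of haar_forward."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged: List[float] = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged += [(a + d) / math.sqrt(2), (a - d) / math.sqrt(2)]
        out[:2 * n] = merged
        n *= 2
    return out


def compress(x: List[float], keep_ratio: float = 0.1) -> List[float]:
    """Zero all but the largest-magnitude coefficients (lossy step)."""
    coeffs = haar_forward(x)
    keep = max(1, int(len(coeffs) * keep_ratio))
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    kept = set(order[:keep])
    return [c if i in kept else 0.0 for i, c in enumerate(coeffs)]


signal = [math.sin(2 * math.pi * i / 64) for i in range(64)]
sparse = compress(signal, keep_ratio=0.1)   # roughly 90% of coefficients dropped
approx_signal = haar_inverse(sparse)        # smoothed reconstruction
```

A classifier would then be trained on `approx_signal` (or on `sparse` directly) instead of the raw series; discarding small coefficients also removes some high-frequency noise, which is the smoothing effect the abstract mentions.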


Author(s):  
Pēteris Grabusts ◽  
Arkady Borisov

Clustering Methodology for Time Series Mining

A time series is a sequence of real-valued data representing measurements of a variable at successive time intervals. Time series analysis is a well-known task; in recent years, however, research has explored the use of clustering for time series analysis. The main motivation for representing a time series in the form of clusters is to better capture the main characteristics of the data. The central goal of this paper was to investigate clustering methodology for time series data mining, to explore the capabilities of time series similarity measures, and to use them in the analysis of time series clustering results. The more complex similarity measures include the Longest Common Subsequence (LCSS) method. Two tasks were completed in this paper. The first task was to define time series similarity measures. It was established that the LCSS method detects time series similarity better than the Euclidean distance. The second task was to explore the capabilities of the classical k-means clustering algorithm in time series clustering. The experiment led to the conclusion that the results of time series clustering with the k-means algorithm correspond to the results obtained with the LCSS method, so the clustering results for the specific time series are adequate.
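For readers unfamiliar with LCSS on real-valued series, the following sketch shows the usual dynamic-programming formulation next to the Euclidean distance. The matching tolerance `eps`, the normalization by the shorter length, and the function names are assumptions for this example; the abstract does not specify the exact variant used.

```python
import math
from typing import List


def lcss_length(a: List[float], b: List[float], eps: float) -> int:
    """Length of the longest common subsequence, where two points
    'match' when they differ by at most eps."""
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(a[i - 1] - b[j - 1]) <= eps:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[n][m]


def lcss_similarity(a: List[float], b: List[float], eps: float) -> float:
    """LCSS length normalized to [0, 1] by the shorter series."""
    return lcss_length(a, b, eps) / min(len(a), len(b))


def euclidean(a: List[float], b: List[float]) -> float:
    """Plain Euclidean distance between equal-length series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


base = [1.0, 2.0, 3.0, 4.0, 5.0]
spike = [1.0, 2.0, 30.0, 4.0, 5.0]   # same shape with one outlier

print(lcss_similarity(base, spike, eps=0.5))
print(euclidean(base, spike))
```

In this toy case the single outlier dominates the Euclidean distance, while LCSS simply skips the unmatched point. That kind of robustness is one commonly cited reason LCSS can judge similarity better than the Euclidean distance.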


2021 ◽  
Vol 7 ◽  
pp. e534
Author(s):  
Kristoko Dwi Hartomo ◽  
Yessica Nataliani

This paper proposes a new model for time series forecasting that combines forecasting with a clustering algorithm. It introduces a new scheme that improves the forecasting results by grouping the time series data with the k-means clustering algorithm, and then uses the clustering result to produce the forecast. Forecasting results are usually affected by some user-defined parameters; therefore, a learning-based procedure is proposed to estimate the parameters used for forecasting. The parameter values are computed within the algorithm itself. Experiments show that the proposed model compares favorably with other forecasting algorithms: it has the smallest mean squared error, 13,007.91, and an average improvement rate of 19.83%.


In this paper, we analyze, model, predict and cluster Global Active Power, a time series obtained at one-minute intervals from the electricity sensors of a household. We analyze changes in seasonality and trends to model the data. We then compare forecasting methods such as SARIMA and LSTM for forecasting the household's sensor data, and combine them into a hybrid model that captures nonlinear variations better than either SARIMA or LSTM used in isolation. Finally, we cluster slices of the time series using a novel clustering algorithm that combines density-based and centroid-based approaches, in order to discover relevant, subtle clusters in the sensor data. Our experiments have yielded meaningful insights from the data at both a micro (day-to-day) granularity and a macro (weekly to monthly) granularity.


2016 ◽  
Vol 5 (6) ◽  
pp. 233-236
Author(s):  
Radzuan M. F. Nabilah ◽  
Zalinda Othman ◽  
Bakar A. Azuraliza
