K-MDTSC: K-Multi-Dimensional Time-Series Clustering Algorithm

Danilo Giordano; Marco Mellia; Tania Cerquitelli

doi:10.3390/electronics10101166

Electronics ◽

10.3390/electronics10101166 ◽

2021 ◽

Vol 10 (10) ◽

pp. 1166

Author(s):

Danilo Giordano ◽

Marco Mellia ◽

Tania Cerquitelli

Keyword(s):

Time Series ◽

Clustering Algorithm ◽

Welding Process ◽

Heterogeneous Data ◽

Current Data ◽

Industrial Plant ◽

Process Data ◽

Case Scenario ◽

Time Series Clustering ◽

Synthetic Datasets

The growing capability to collect data makes massive amounts of heterogeneous data available. Among these data, time-series represent a mother lode of information yet to be fully explored. Current data mining techniques have several shortcomings when analyzing time-series, especially when more than one time-series, i.e., multi-dimensional time-series, must be analyzed together to extract knowledge from the data. In this context, we present K-MDTSC (K-Multi-Dimensional Time-Series Clustering), a novel clustering algorithm specifically designed to deal with multi-dimensional time-series. Firstly, we demonstrate K-MDTSC's capability to group multi-dimensional time-series using synthetic datasets. We compare K-MDTSC results with k-Shape, a state-of-the-art time-series clustering algorithm based on K-means. Our results show that both K-MDTSC and k-Shape produce good clustering results. However, K-MDTSC outperforms k-Shape as the synthetic dataset becomes more complicated. Secondly, we apply K-MDTSC in a real case scenario where we are asked to replace scheduled maintenance with a predictive approach. To this end, we create a generalized pipeline to process data from the welding process of a real industrial plant. We apply K-MDTSC to create clusters of weldings based on their welding shape. Our results show that K-MDTSC identifies different welding profiles, but that the aging of the electrode does not negatively impact the welding process.
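For readers unfamiliar with the k-Shape baseline named in the abstract above, the following is a minimal sketch of the shape-based distance that k-Shape is known for: one minus the maximum normalized cross-correlation over all shifts. It assumes equal-length series that are already z-normalized. The helper `multi_dim_distance`, which simply sums the distance over dimensions, is a hypothetical illustration of how a one-dimensional measure could be extended; it is not taken from the K-MDTSC paper.

```python
import math


def cross_correlation(x, y, shift):
    """Sum of x[i + shift] * y[i] over the indices where both exist."""
    n = len(x)
    total = 0.0
    for i in range(n):
        j = i + shift
        if 0 <= j < n:
            total += x[j] * y[i]
    return total


def shape_based_distance(x, y):
    """1 minus the maximum normalized cross-correlation over all shifts."""
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    norm = math.sqrt(sum(v * v for v in x) * sum(v * v for v in y))
    if norm == 0.0:
        return 1.0  # convention chosen here for all-zero input
    n = len(x)
    best = max(cross_correlation(x, y, s) for s in range(-(n - 1), n))
    return 1.0 - best / norm


def multi_dim_distance(a, b):
    """Hypothetical extension: sum the distance over matching dimensions.

    `a` and `b` are lists of equal-length series, one per dimension.
    This aggregation is an assumption made for illustration only.
    """
    if len(a) != len(b):
        raise ValueError("objects must have the same number of dimensions")
    return sum(shape_based_distance(xa, xb) for xa, xb in zip(a, b))


pulse = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
shifted_pulse = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
distance = shape_based_distance(pulse, shifted_pulse)
```

Because the measure searches over shifts, the two pulses in the usage lines are treated as having the same shape even though they are misaligned in time.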

A time-series clustering algorithm for analyzing the changes of mobility pattern caused by COVID-19

10.1145/3486637.3489489 ◽

2021 ◽

Author(s):

Ziyi Zhang ◽

Diya Li ◽

Zhe Zhang ◽

Nicholas Duffield

Keyword(s):

Time Series ◽

Clustering Algorithm ◽

Mobility Pattern ◽

Time Series Clustering

Time Series Clustering Algorithm for Two-Modes Cyclic Biosignals

Biomedical Engineering Systems and Technologies - Communications in Computer and Information Science ◽

10.1007/978-3-642-29752-6_17 ◽

2013 ◽

pp. 233-245

Author(s):

Neuza Nunes ◽

Tiago Araújo ◽

Hugo Gamboa

Keyword(s):

Time Series ◽

Clustering Algorithm ◽

Time Series Clustering

Clustering Methodology for Time Series Mining

Scientific Journal of Riga Technical University Computer Sciences ◽

10.2478/v10143-010-0011-0 ◽

2009 ◽

Vol 40 (1) ◽

pp. 81-86

Author(s):

Pēteris Grabusts ◽

Arkady Borisov

Keyword(s):

Time Series ◽

Time Series Analysis ◽

Clustering Algorithm ◽

Time Series Data ◽

Similarity Measures ◽

Longest Common Subsequence ◽

Series Data ◽

Time Series Clustering ◽

Series Analysis ◽

Time Series Mining

A time series is a sequence of real data representing the measurements of a real variable at time intervals. Time series analysis is a well-known task; in recent years, however, research has been carried out on using clustering for time series analysis. The main motivation for representing a time series in the form of clusters is to better represent the main characteristics of the data. The central goal of the present research paper was to investigate clustering methodology for time series data mining, to explore the facilities of time series similarity measures, and to use them in the analysis of time series clustering results. More complicated similarity measures include the Longest Common Subsequence (LCSS) method. In this paper, two tasks have been completed. The first task was to define time series similarity measures. It has been established that the LCSS method gives better results in the detection of time series similarity than the Euclidean distance. The second task was to explore the facilities of the classical k-means clustering algorithm in time series clustering. The experiment led to the conclusion that the results of time series clustering with the k-means algorithm correspond to the results obtained with the LCSS method; thus the clustering results for the specific time series are adequate.
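To make the comparison between LCSS and the Euclidean distance concrete, here is a minimal sketch of a common real-valued LCSS formulation, in which two points match when they differ by at most a threshold. The threshold name `epsilon`, the normalization by the shorter length, and the sample data are assumptions for illustration and are not taken from the paper.

```python
import math


def lcss_length(a, b, epsilon):
    """Length of the longest common subsequence of a and b, where two
    values count as equal when they differ by at most epsilon."""
    rows, cols = len(a) + 1, len(b) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if abs(a[i - 1] - b[j - 1]) <= epsilon:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[-1][-1]


def lcss_similarity(a, b, epsilon):
    """LCSS length scaled to [0, 1] by the shorter series length."""
    if not a or not b:
        return 0.0
    return lcss_length(a, b, epsilon) / min(len(a), len(b))


def euclidean_distance(a, b):
    """Point-by-point Euclidean distance between equal-length series."""
    if len(a) != len(b):
        raise ValueError("series must have equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


clean = [1.0, 2.0, 3.0, 4.0, 5.0]
with_outlier = [1.0, 2.0, 50.0, 4.0, 5.0]  # one corrupted sample
similarity = lcss_similarity(clean, with_outlier, epsilon=0.5)
distance = euclidean_distance(clean, with_outlier)
```

In this example a single outlier dominates the Euclidean distance, while LCSS skips the unmatched point and still reports four of five samples in common, which is one way to read the robustness claim in the abstract.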

Time Series Clustering of Online Gambling Activities for Addicted Users’ Detection

Applied Sciences ◽

10.3390/app11052397 ◽

2021 ◽

Vol 11 (5) ◽

pp. 2397

Author(s):

Fernando Peres ◽

Enrico Fallacara ◽

Luca Manzoni ◽

Mauro Castelli ◽

Aleš Popovič ◽

...

Keyword(s):

Time Series ◽

Clustering Algorithm ◽

Online Gambling ◽

Social Experiment ◽

Time Series Clustering ◽

Pathological Gamblers ◽

Beneficial Effects ◽

Personal Consequences ◽

Online Gamblers ◽

Gambling Activities

Ever since the worldwide demand for gambling services started to spread, its expansion has continued steadily. To wit, online gambling is a major industry in every European country, generating billions of euros in revenue for commercial actors and governments alike. Despite such evidently beneficial effects, online gambling is ultimately a vast social experiment with potentially disastrous social and personal consequences that could result in an overall deterioration of social and familial relationships. Despite the relevance of this problem in society, there is a lack of tools for characterizing the behavior of online gamblers based on the data that are collected daily by betting platforms. This paper uses a time series clustering algorithm that can help decision-makers identify behaviors associated with potential pathological gamblers. In particular, experimental results obtained by analyzing sports event bets and blackjack data demonstrate the suitability of the proposed method for detecting critical (i.e., pathological) players. This algorithm is the first component of a system developed in collaboration with the Portuguese authority for the control of betting activities.

Linear Time Complexity Time Series Clustering with Symbolic Pattern Forest

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/406 ◽

2019 ◽

Author(s):

Xiaosheng Li ◽

Jessica Lin ◽

Liang Zhao

Keyword(s):

Time Series ◽

Data Storage ◽

Time Complexity ◽

Clustering Algorithm ◽

Time Series Data ◽

Linear Time ◽

Series Data ◽

Data Generation ◽

Time Series Clustering ◽

Group Structures

With the increasing power of data storage and advances in data generation and collection technologies, large volumes of time series data are becoming available and their content is changing rapidly. This requires data mining methods to have low time complexity in order to handle the huge and fast-changing data. This paper presents a novel time series clustering algorithm that has linear time complexity. The proposed algorithm partitions the data by checking some randomly selected symbolic patterns in the time series. Theoretical analysis is provided to show that group structures in the data can be revealed from this process. We evaluate the proposed algorithm extensively on all 85 datasets from the well-known UCR time series archive, and compare it with state-of-the-art approaches using statistical analysis. The results show that the proposed method is faster and more accurate than rival methods.
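The idea of partitioning data by checking for a symbolic pattern can be illustrated with a heavily simplified sketch. The discretization below (fixed breakpoints and a three-letter alphabet) and the function names are assumptions chosen for illustration; the paper's actual symbolic representation, its random pattern selection, and its forest construction are not reproduced here. The pattern is passed in explicitly so that the example stays deterministic.

```python
def to_symbols(series, breakpoints=(-0.5, 0.5), alphabet="abc"):
    """Map each value to a letter according to the interval it falls in.

    The breakpoints and alphabet are hypothetical defaults.
    """
    symbols = []
    for value in series:
        index = sum(1 for bp in breakpoints if value > bp)
        symbols.append(alphabet[index])
    return "".join(symbols)


def partition_by_pattern(dataset, pattern):
    """Split series indices by whether the symbolic pattern occurs."""
    with_pattern, without_pattern = [], []
    for idx, series in enumerate(dataset):
        if pattern in to_symbols(series):
            with_pattern.append(idx)
        else:
            without_pattern.append(idx)
    return with_pattern, without_pattern


dataset = [
    [-1.0, 0.0, 1.0, 0.0],  # symbols: "abcb"
    [1.0, 1.0, 1.0, 1.0],   # symbols: "cccc"
    [0.0, -1.0, 0.0, 1.0],  # symbols: "babc"
]
groups = partition_by_pattern(dataset, "bc")
```

Each series is read once to build its symbol string, so the cost of one such check grows linearly with the amount of data for a fixed, short pattern.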

A hidden Markov model-based K-means time series clustering algorithm

2010 IEEE International Conference on Intelligent Computing and Intelligent Systems ◽

10.1109/icicisys.2010.5658820 ◽

2010 ◽

Cited By ~ 2

Author(s):

Li-Li Wei ◽

Jing-Qiang Jiang

Keyword(s):

Time Series ◽

Markov Model ◽

Hidden Markov Model ◽

Clustering Algorithm ◽

Hidden Markov ◽

Time Series Clustering ◽

Model Based

Time Series Clustering Based on Singularity

International Journal of Computers Communications & Control ◽

10.15837/ijccc.2017.6.3002 ◽

2017 ◽

Vol 12 (6) ◽

pp. 790

Author(s):

Dan Chang ◽

Yunfang Ma ◽

Xueli Ding

Keyword(s):

Time Series ◽

Clustering Algorithm ◽

Time Series Clustering

Drawing on relevant theories of time series clustering, this work studies the similarity clustering of time series from the perspective of singularity and, to address the shortcomings of traditional clustering algorithms, proposes singularity-based time series clustering using the K-means and DBSCAN clustering algorithms. Following the general clustering process for time series, singularity-based clustering and K-means clustering are each applied and their results compared, showing that similarity clustering of time series from the perspective of singularity can better find what people are concerned about in time series.

Time series clustering based on sparse subspace clustering algorithm and its application to daily box-office data analysis

Neural Computing and Applications ◽

10.1007/s00521-018-3731-7 ◽

2018 ◽

Vol 31 (9) ◽

pp. 4809-4818 ◽

Cited By ~ 5

Author(s):

Yan Wang ◽

Yunian Ru ◽

Jianping Chai

Keyword(s):

Time Series ◽

Data Analysis ◽

Clustering Algorithm ◽

Subspace Clustering ◽

Time Series Clustering ◽

Box Office ◽

Sparse Subspace Clustering

Time series clustering in large data sets

Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis ◽

10.11118/actaun201159020075 ◽

2011 ◽

Vol 59 (2) ◽

pp. 75-80 ◽

Cited By ~ 4

Author(s):

Jiří Fejfar ◽

Jiří Šťastný

Keyword(s):

Time Series ◽

Digital Libraries ◽

Clustering Algorithm ◽

Learning Algorithm ◽

Large Data ◽

Data Sets ◽

Self Organizing Map ◽

Time Series Clustering ◽

Feature Vectors ◽

Cover Songs

The clustering of time series is a widely researched area, and there are many methods for dealing with this task. We currently use the Self-organizing map (SOM) with the unsupervised learning algorithm for clustering of time series. After the first experiment (Fejfar, Weinlichová, Šťastný, 2009) it seems that the whole concept of the clustering algorithm is correct, but that we have to perform time series clustering on a much larger dataset to obtain more accurate results and to find the correlation between configured parameters and results more precisely. A second requirement arose from the need for a well-defined evaluation of results. It seems useful to use sound recordings as instances of time series again. There are many recordings to use in digital libraries, and many interesting features and patterns can be found in this area. In this experiment we search for recordings with a similar development of information density. This can be used for musical form investigation, cover song detection and many other applications. The objective of the presented paper is to compare clustering results obtained with different parameters of the feature vectors and of the SOM itself. We describe time series in a simplistic way, evaluating standard deviations for separated parts of recordings. The resulting feature vectors are clustered with the SOM in batch training mode with different topologies, varying from a few neurons to large maps. Other algorithms usable for finding similarities between time series are discussed, and finally conclusions for further research are presented. We also present an overview of the related current literature and projects.
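The feature extraction described in the abstract above, standard deviations evaluated over separated parts of a recording, can be sketched as follows. The equal-width split, the use of the population standard deviation, and the function name are assumptions for illustration; the SOM training step itself is not shown.

```python
import statistics


def segment_std_features(series, n_segments):
    """Split a series into n_segments contiguous parts of near-equal
    length and return the population standard deviation of each part."""
    if n_segments <= 0 or len(series) < n_segments:
        raise ValueError("need at least one sample per segment")
    # Integer arithmetic keeps the segment boundaries deterministic.
    bounds = [i * len(series) // n_segments for i in range(n_segments + 1)]
    return [
        statistics.pstdev(series[bounds[i]:bounds[i + 1]])
        for i in range(n_segments)
    ]


# A quiet first half followed by an oscillating second half.
recording = [1.0, 1.0, 1.0, 1.0, 0.0, 2.0, 0.0, 2.0]
features = segment_std_features(recording, n_segments=2)
```

Each recording is reduced to a fixed-length vector regardless of its duration, which is what allows recordings of different lengths to be fed to the same clustering model.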

An Adaptive Density-Based Time Series Clustering Algorithm: A Case Study on Rainfall Patterns

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi5110205 ◽

2016 ◽

Vol 5 (11) ◽

pp. 205 ◽

Cited By ~ 1

Author(s):

Xiaomi Wang ◽

Yaolin Liu ◽

Yiyun Chen ◽

Yi Liu

Keyword(s):

Time Series ◽

Clustering Algorithm ◽

Time Series Clustering ◽

Rainfall Patterns