Time Series Chains: A Novel Tool for Time Series Data Mining

Since their introduction over a decade ago, time series motifs have become a fundamental tool for time series analytics, finding diverse uses in dozens of domains. In this work we introduce Time Series Chains, which are related to, but distinct from, time series motifs. Informally, time series chains are a temporally ordered set of subsequence patterns, such that each pattern is similar to the pattern that preceded it, but the first and last patterns are arbitrarily dissimilar. In the discrete space, this is similar to extracting the text chain “hit, hot, dot, dog” from a paragraph. The first and last words have nothing in common, yet they are connected by a chain of words with a small mutual difference. Time Series Chains can capture the evolution of systems, and help predict the future. As such, they potentially have implications for prognostics. In this work, we introduce a robust definition of time series chains, and a scalable algorithm that allows us to discover them in massive datasets.
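
The word-chain analogy in the abstract above can be checked with a short sketch. It only illustrates the discrete analogy, not the paper's chain-discovery algorithm; the helper names `hamming` and `is_chain` and the `max_step` parameter are hypothetical and chosen for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("words must have equal length")
    return sum(ca != cb for ca, cb in zip(a, b))


def is_chain(words, max_step=1):
    """True if every word differs from its predecessor by at most max_step letters."""
    return all(hamming(u, v) <= max_step for u, v in zip(words, words[1:]))


chain = ["hit", "hot", "dot", "dog"]

# Difference between each word and the one before it.
steps = [hamming(u, v) for u, v in zip(chain, chain[1:])]

# Difference between the first and last word of the chain.
drift = hamming(chain[0], chain[-1])

print(steps, drift, is_chain(chain))
```

Each link changes one letter, yet the first and last words share no letter in any position, which is the property the abstract describes for subsequence patterns.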

Particularities of data mining in medicine: lessons learned from patient medical time series data analysis

EURASIP Journal on Wireless Communications and Networking ◽

10.1186/s13638-019-1582-2 ◽

2019 ◽

Vol 2019 (1) ◽

Cited By ~ 2

Author(s):

Shadi Aljawarneh ◽

Aurea Anguera ◽

John William Atwood ◽

Juan A. Lara ◽

David Lizcano

Keyword(s):

Data Mining ◽

Time Series ◽

Knowledge Discovery ◽

Time Series Data ◽

Medical Patient ◽

Lessons Learned ◽

Physiological Signals ◽

Knowledge Discovery In Databases ◽

Series Data ◽

Data Mining Techniques

Abstract: Nowadays, large amounts of data are generated in the medical domain. Various physiological signals generated from different organs can be recorded to extract interesting information about patients’ health. The analysis of physiological signals is a hard task that requires the use of specific approaches such as the Knowledge Discovery in Databases process. The application of such a process in the domain of medicine has a series of implications and difficulties, especially regarding the application of data mining techniques to data, mainly time series, gathered from medical examinations of patients. The goal of this paper is to describe the lessons learned and the experience gathered by the authors in applying data mining techniques to real medical patient data, including time series. In this research, we carried out an exhaustive case study working on data from two medical fields: stabilometry (15 professional basketball players, 18 elite ice skaters) and electroencephalography (100 healthy patients, 100 epileptic patients). We applied a previously proposed knowledge discovery framework for classification purposes, obtaining good results in terms of classification accuracy (greater than 99% in both fields). The good results obtained in our research are the groundwork for the lessons learned and recommendations made in this position paper, which is intended as a guide for experts who face similar medical data mining projects.

A Simple and Efficient Method for Fault Diagnosis Using Time Series Data Mining

2007 IEEE International Electric Machines & Drives Conference ◽

10.1109/iemdc.2007.382734 ◽

2007 ◽

Cited By ~ 10

Author(s):

I. Aydin ◽

M. Karakose ◽

E. Akin

Keyword(s):

Data Mining ◽

Time Series ◽

Fault Diagnosis ◽

Efficient Method ◽

Time Series Data ◽

Series Data ◽

Time Series Data Mining

Adaptive Multiresolution and Dedicated Elastic Matching in Linear Time Complexity for Time Series Data Mining

Sixth International Conference on Intelligent Systems Design and Applications ◽

10.1109/isda.2006.84 ◽

2006 ◽

Cited By ~ 3

Author(s):

Pierre-francois Marteau ◽

Gildas Menier

Keyword(s):

Data Mining ◽

Time Series ◽

Time Complexity ◽

Time Series Data ◽

Linear Time ◽

Series Data ◽

Time Series Data Mining

Efficient Time Series Clustering and Its Application to Social Network Mining

Journal of Intelligent Systems ◽

10.1515/jisys-2014-0005 ◽

2014 ◽

Vol 23 (2) ◽

pp. 213-229 ◽

Cited By ~ 2

Author(s):

Cangqi Zhou ◽

Qianchuan Zhao

Keyword(s):

Time Series ◽

Social Network ◽

Information Diffusion ◽

Time Series Data ◽

Online Social Network ◽

Dissimilarity Measure ◽

Series Data ◽

Formation Mechanisms ◽

Network Analyses ◽

Definition Of

Abstract: Mining time series data is of great significance in various areas. To efficiently find representative patterns in these data, this article focuses on the definition of a valid dissimilarity measure and the acceleration of partitioning clustering, a common group of techniques used to discover typical shapes of time series. The dissimilarity measure is a crucial component in clustering. Some applications require it to be invariant to specific transformations. The rationale for using the angle between two time series to define a dissimilarity is analyzed. Moreover, our proposed measure satisfies the triangle inequality with specific restrictions. This property can be employed to accelerate clustering. An integrated algorithm is proposed. The experiments show that angle-based dissimilarity captures the essence of time series patterns that are invariant to amplitude scaling. In addition, the accelerated algorithm outperforms the standard one as redundancies are pruned. Our approach has been applied to discover typical patterns of information diffusion in an online social network. Analyses revealed the formation mechanisms of different patterns.
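
As a rough illustration of why an angle between two series is unaffected by amplitude scaling, the sketch below computes the plain angle between two equal-length series treated as vectors. The abstract does not give the authors' exact definition or the restrictions under which the triangle inequality holds, so the function `angle_dissimilarity` is an assumption made for illustration only.

```python
import math


def angle_dissimilarity(x, y):
    """Angle in radians between two equal-length series viewed as vectors.

    Illustrative assumption: this is the plain vector angle, not necessarily
    the measure defined in the article.
    """
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("angle is undefined for an all-zero series")
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / (norm_x * norm_y)))
    return math.acos(cosine)


base = [1.0, 2.0, 3.0, 2.0, 1.0]
scaled = [3.0 * v for v in base]   # same shape, larger amplitude
other = [3.0, 1.0, 0.0, 1.0, 3.0]  # different shape

same_shape = angle_dissimilarity(base, scaled)
diff_shape = angle_dissimilarity(base, other)
diff_shape_scaled = angle_dissimilarity(scaled, other)

print(same_shape, diff_shape, diff_shape_scaled)
```

Multiplying a series by a positive constant leaves its angle to any other series unchanged, so two series with the same shape but different amplitude have a dissimilarity of approximately zero.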

A PSO based time series data clustering using modified S-transform for data mining

International Journal of Data Mining Modelling and Management ◽

10.1504/ijdmmm.2011.041810 ◽

2011 ◽

Vol 3 (3) ◽

pp. 277

Author(s):

Ranjeeta Bisoi ◽

P.K. Dash

Keyword(s):

Data Mining ◽

Time Series ◽

Data Clustering ◽

Time Series Data ◽

Series Data ◽

S Transform

Clustering of Time Series Data

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch042 ◽

2011 ◽

pp. 258-263

Author(s):

Anne Denton

Keyword(s):

Data Mining ◽

Time Series ◽

Pattern Mining ◽

Time Series Data ◽

Frequent Pattern Mining ◽

Frequent Pattern ◽

Series Data ◽

Science And Engineering ◽

Data Mining Algorithms ◽

Mining Algorithms

Time series data is of interest to most science and engineering disciplines and analysis techniques have been developed for hundreds of years. There have, however, in recent years been new developments in data mining techniques, such as frequent pattern mining, that take a different perspective of data. Traditional techniques were not meant for such pattern-oriented approaches. There is, as a result, a significant need for research that extends traditional time-series analysis, in particular clustering, to the requirements of the new data mining algorithms.

Time Series Data Mining

10.1007/springerreference_65949 ◽

2011 ◽

Keyword(s):

Data Mining ◽

Time Series ◽

Time Series Data ◽

Series Data ◽

Time Series Data Mining

Discrete and Fuzzy Models of Time Series in the Tasks of Forecasting and Diagnostics

Axioms ◽

10.3390/axioms9020049 ◽

2020 ◽

Vol 9 (2) ◽

pp. 49

Author(s):

Anton Romanov ◽

Valeria Voronina ◽

Gleb Guskov ◽

Irina Moshkina ◽

Nadezhda Yarushkina

Keyword(s):

Data Mining ◽

Time Series ◽

Knowledge Base ◽

Intelligent Systems ◽

Time Series Data ◽

Series Data ◽

Quality Of Data ◽

Dynamic Data ◽

Production Program ◽

Time Series Data Mining

The development of the economy and the transition to Industry 4.0 create new challenges for artificial intelligence methods. Such challenges include the processing of large volumes of data, the analysis of various dynamic indicators, the discovery of complex dependencies in the accumulated data, and the forecasting of the state of processes. The main point of this study is the development of a set of analytical and prognostic methods. The methods described in this article are based on fuzzy logic, statistics, and time series data mining, because data extracted from dynamic systems are initially incomplete and have a high degree of uncertainty. The ultimate goal of the study is to improve the quality of data analysis in industrial and economic systems. The advantages of the proposed methods are flexibility and orientation to the high interpretability of dynamic data. The high level of interpretability and interoperability of dynamic data is achieved through a combination of time series data mining and knowledge base engineering methods. Merging the rules extracted from the time series with the knowledge base rules allows a forecast to be made when the length and nature of the time series are insufficient. The proposed methods are also based on the summarization of the results of process modeling for diagnosing technical systems, forecasting the economic condition of enterprises, and approaches to the technological preparation of production in a multi-productive production program with the application of type 2 fuzzy sets for time series modeling. Intelligent systems based on the proposed methods demonstrate an increase in the quality and stability of their functioning. This article contains a set of experiments to support this statement.

Exploratory Time Series Data Mining by Genetic Clustering

Mathematical Methods for Knowledge Discovery and Data Mining ◽

10.4018/978-1-59904-528-3.ch010 ◽

2011 ◽

pp. 157-178

Author(s):

T. Warren Liao

Keyword(s):

Data Mining ◽

Time Series ◽

Time Series Data ◽

Distance Measures ◽

Series Data ◽

Synthetic Control ◽

Data Set ◽

Univariate Time Series ◽

Genetic Clustering ◽

Data Objects

In this chapter, we present genetic algorithm (GA) based methods developed for clustering univariate time series with equal or unequal length as an exploratory step of data mining. These methods basically implement the k-medoids algorithm. Each chromosome encodes in binary the data objects serving as the k-medoids. To compare their performance, both fixed-parameter and adaptive GAs were used. We first employed the synthetic control chart data set to investigate the performance of three fitness functions, two distance measures, and other GA parameters such as population size, crossover rate, and mutation rate. Experiments were also run on two more sets of time series, with or without a known number of clusters: the cylinder-bell-funnel data and the novel battle simulation data. The clustering results are presented and discussed.
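
One plausible reading of the binary chromosome described above is a bit mask over the data objects, where a 1 marks an object chosen as a medoid. The sketch below decodes such a mask and scores it with a simple k-medoids cost. The bit-mask encoding, the Euclidean distance on equal-length series, and the cost function are all assumptions for illustration; the abstract does not specify the chapter's three fitness functions or two distance measures, and the names `decode_medoids` and `clustering_cost` are hypothetical.

```python
import math


def decode_medoids(chromosome):
    """Return the indices of the data objects flagged as medoids (bit == 1)."""
    return [i for i, bit in enumerate(chromosome) if bit == 1]


def euclidean(x, y):
    """Euclidean distance between two equal-length series (assumed measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def clustering_cost(chromosome, series, dist=euclidean):
    """Sum of distances from each series to its nearest medoid; lower is better."""
    medoids = decode_medoids(chromosome)
    if not medoids:
        raise ValueError("chromosome must select at least one medoid")
    return sum(min(dist(s, series[m]) for m in medoids) for s in series)


# Four toy series forming two obvious groups.
series = [[0, 0, 0], [0, 1, 0], [5, 5, 5], [5, 6, 5]]

good = [1, 0, 1, 0]  # one medoid per group
poor = [1, 1, 0, 0]  # both medoids in the same group

good_cost = clustering_cost(good, series)
poor_cost = clustering_cost(poor, series)

print(decode_medoids(good), good_cost, poor_cost)
```

A GA would evolve a population of such bit strings, using a cost like this one (or a function derived from it) as the fitness to minimize.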

Scalable Algorithm for Subsequence Similarity Search in Very Large Time Series Data on Cluster of Phi KNL

Communications in Computer and Information Science - Data Analytics and Management in Data Intensive Domains ◽

10.1007/978-3-030-23584-0_9 ◽

2019 ◽

pp. 149-164 ◽

Cited By ~ 1

Author(s):

Yana Kraeva ◽

Mikhail Zymbler

Keyword(s):

Time Series ◽

Similarity Search ◽

Large Time ◽

Time Series Data ◽

Series Data ◽

Scalable Algorithm
