Similarity Pattern Discovery Using Calendar Concept Hierarchy in Time Series Data

Author(s):  
Sungbo Seo ◽  
Long Jin ◽  
Jun Wook Lee ◽  
Keun Ho Ryu

2021 ◽  
pp. 1-20
Author(s):  
Fabian Kai-Dietrich Noering ◽  
Yannik Schroeder ◽  
Konstantin Jonas ◽  
Frank Klawonn

In technical systems, the analysis of similar situations is a promising technique for gaining information about the system’s state, health, or wear. Very often, situations cannot be defined in advance but need to be discovered as recurrent patterns within time series data of the system under consideration. This paper assesses different approaches to discovering frequent variable-length patterns in time series. Because of the success of artificial neural networks (NNs) in various research fields, a particular focus of this work is the applicability of NNs to the problem of pattern discovery in time series. We therefore applied and adapted a Convolutional Autoencoder and compared it to classical nonlearning approaches based on Dynamic Time Warping, on time series discretization, and on the Matrix Profile. These nonlearning approaches were also adapted to meet our requirements, such as the discovery of potentially time-scaled patterns in noisy time series. We evaluated the performance (quality, computing time, parametrization effort) of these approaches in an extensive test with synthetic data sets. Additionally, transferability to other data sets was tested using real-life vehicle data. We demonstrated the ability of Convolutional Autoencoders to discover patterns in an unsupervised way. Furthermore, the tests showed that the Autoencoder is able to discover patterns with a quality similar to that of classical nonlearning approaches.
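
As an illustration of why Dynamic Time Warping suits time-scaled patterns, the following is a minimal sketch of the textbook DTW distance in Python. It is not the adapted method evaluated in the paper; the function name `dtw_distance` and the absolute-difference cost are assumptions chosen for the example.

```python
def dtw_distance(a, b):
    """Textbook DTW distance between two numeric sequences.

    Hypothetical helper for illustration; uses |x - y| as the local cost
    and no warping-window constraint.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b stays (stretch b)
                cost[i][j - 1],      # b advances, a stays (stretch a)
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


pattern = [1, 2, 3]
stretched = [1, 2, 2, 3]  # same shape, locally slowed down
print(dtw_distance(pattern, stretched))
```

Because the alignment may repeat samples, a pattern and a locally stretched copy of it can align at zero cost, which a point-by-point Euclidean comparison cannot do for sequences of different length.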


2010 ◽  
Vol 24 (9) ◽  
pp. 1198-1210 ◽  
Author(s):  
Rulin Ouyang ◽  
Liliang Ren ◽  
Weiming Cheng ◽  
Chenghu Zhou

2019 ◽  
Author(s):  
Catherine Inibhunu ◽  
Carolyn McGregor

BACKGROUND High frequency data collected from monitors and sensors that provide measures relating to patients’ vital status in neonatal intensive care units (NICUs) have the potential to provide valuable insights, which can be crucial when making critical decisions about the care of premature and ill term infants. However, this exercise is not trivial when faced with huge volumes of data that are captured every second at the bedside/home. The ability to collect, analyze and understand any hidden relationships in the data that may be vital for clinical decision making is a central challenge.
OBJECTIVE The main goal of this research is to develop a method to detect and represent relationships that may exist in temporal abstractions (TA) and temporal patterns (TP) derived from time oriented data. The premise of this research is that in clinical care, the discovery of unknown relationships among physiological time oriented data can lead to detection of the onset of conditions, aid in classifying abnormal or normal behaviors, or derive patterns of an altered trajectory towards a problematic future state for a patient. That is, there is great potential to use this approach to uncover previously unknown pathophysiologies that are present in high speed physiological data.
METHODS This research introduces a TPR process and an associated TPRMine algorithm, which adopts a stepwise approach to temporal pattern discovery by first applying a scaled mathematical formulation of the time series data. This is achieved by modelling the problem space as a finite state machine in which, for a given timeframe, a time series segment transitions from one state to another based on probabilistic weights, and then quantifying the many paths the time series may take.
RESULTS The TPRMine algorithm has been designed, implemented and applied to patient physiological data streams captured from the McMaster Children’s Hospital NICU. The algorithm has been applied to determine the number of states a patient in a NICU bed can transition to in a given time period and to demonstrate the formulation of hypothesis tests. In addition, these states are quantified, leading to the creation of a vital score. With this, it is possible to determine the percentage of time a patient spends with a high or low vital score.
CONCLUSIONS The developed method allows understanding the number of states a patient may transition to in any given time period. Adding clinical context to the identified states facilitates state quantification, allowing the formulation of thresholds, which in turn leads to generating patient scores. This approach can be used to identify patients at risk of a clinical condition before the disease progresses. Additionally, the developed method facilitates identification of frequent patterns that could be associated with the generated thresholds.
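
The state-transition view described in METHODS and RESULTS can be sketched roughly as follows. This is not the TPRMine algorithm, whose details the abstract does not give; the helper names, the three-state labelling, and the thresholds are hypothetical and carry no clinical meaning.

```python
from collections import Counter, defaultdict


def to_states(values, low, high):
    """Map each sample to a coarse state using two (hypothetical) thresholds."""
    return ["low" if v < low else "high" if v > high else "normal" for v in values]


def transition_probabilities(states):
    """Estimate P(next state | current state) from consecutive state pairs."""
    counts = defaultdict(Counter)
    for cur, nxt in zip(states, states[1:]):
        counts[cur][nxt] += 1
    return {
        s: {t: c / sum(row.values()) for t, c in row.items()}
        for s, row in counts.items()
    }


def time_in_state(states):
    """Fraction of samples spent in each state."""
    total = len(states)
    return {s: c / total for s, c in Counter(states).items()}


# Made-up readings and thresholds, for illustration only.
readings = [120, 118, 95, 90, 130, 185, 190, 140]
states = to_states(readings, low=100, high=180)
probs = transition_probabilities(states)
shares = time_in_state(states)
print(probs)
print(shares)
```

The transition table plays the role of the probabilistic weights on the state machine, and the per-state shares correspond to the percentage of time spent in a given state.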


2005 ◽  
Vol 4 (2) ◽  
pp. 61-82 ◽  
Author(s):  
Jessica Lin ◽  
Eamonn Keogh ◽  
Stefano Lonardi

Data visualization techniques are very important for data analysis, since the human eye has been frequently advocated as the ultimate data-mining tool. However, there has been surprisingly little work on visualizing massive time series data sets. To this end, we developed VizTree, a time series pattern discovery and visualization system based on augmenting suffix trees. VizTree visually summarizes both the global and local structures of time series data at the same time. In addition, it provides novel interactive solutions to many pattern discovery problems, including the discovery of frequently occurring patterns (motif discovery), surprising patterns (anomaly detection), and query by content. VizTree works by transforming the time series into a symbolic representation, and encoding the data in a modified suffix tree in which the frequency and other properties of patterns are mapped onto colors and other visual properties. We demonstrate the utility of our system by comparing it with state-of-the-art batch algorithms on several real and synthetic data sets. Based on the tree structure, we further devise a coefficient which measures the dissimilarity between any two time series. This coefficient is shown to be competitive with the well-known Euclidean distance.
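
The idea of turning a time series into symbols and then counting how often each short word occurs can be sketched as below. This is a simplified stand-in, not VizTree itself: it z-normalizes the series, maps each value to one of three letters using the equiprobable Gaussian breakpoints (about ±0.43) commonly used for a three-letter alphabet, and counts sliding-window words with a `Counter` rather than a suffix tree. The function names are assumptions, and the piecewise aggregation step of a full symbolic pipeline is omitted.

```python
import bisect
import statistics
from collections import Counter

# Breakpoints that split a standard normal into three roughly equal areas.
BREAKPOINTS = [-0.43, 0.43]
ALPHABET = "abc"


def symbolize(series):
    """Z-normalize a series and map every value to a letter."""
    mean = statistics.fmean(series)
    std = statistics.pstdev(series)
    if std == 0:
        # A constant series carries no shape; map everything to the middle letter.
        return ALPHABET[1] * len(series)
    return "".join(
        ALPHABET[bisect.bisect_left(BREAKPOINTS, (v - mean) / std)]
        for v in series
    )


def word_counts(symbols, width):
    """Count every sliding-window word of the given width."""
    return Counter(symbols[i:i + width] for i in range(len(symbols) - width + 1))


series = [0, 1, 2, 0, 1, 2, 0, 1, 2]
symbols = symbolize(series)
counts = word_counts(symbols, 3)
print(symbols)
print(counts.most_common())
```

Words with high counts are candidate motifs, while words that occur rarely are candidate anomalies; a suffix tree stores the same frequencies in a form that can be drawn and browsed.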


2021 ◽  
Vol 25 (5) ◽  
pp. 1051-1072
Author(s):  
Fabian Kai-Dietrich Noering ◽  
Konstantin Jonas ◽  
Frank Klawonn

In technical systems, the analysis of similar load situations is a promising technique for gaining information about the system’s state, health, or wear. Very often, load situations are difficult to define by hand. Hence, these situations need to be discovered as recurrent patterns within multivariate time series data of the system under consideration. Unsupervised algorithms for finding such recurrent patterns in multivariate time series must be able to cope with very large data sets because the system might be observed over a very long time. In our previous work, we identified discretization-based approaches as very promising for variable-length pattern discovery because of their low computing time, which results from the simplification (symbolization) of the time series. In this paper, we propose additional preprocessing steps for the symbolic representation of time series, aiming at enhanced multivariate pattern discovery. Beyond that, we show the performance (quality and computing time) of our algorithms on a synthetic test data set as well as on a real-life example with 100 million time points. We also test our approach with increasing dimensionality of the time series.


Author(s):  
Hua Ling Deng ◽  
Yǔ Qiàn Sūn

The high volatility of world soybean prices has caused uncertainty and vulnerability, particularly in developing countries. The clustering of time series is a useful tool for discovering soybean price patterns in temporal data. However, traditional clustering methods cannot represent the continuity of price data very well, nor do they account for the correlation between factors. In this work, the authors apply Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data (TICC) to soybean price pattern discovery. This is a new method for multivariate time series clustering, which can simultaneously segment and cluster the time series data. Each pattern in the TICC method is defined by a Markov random field (MRF), characterizing the interdependencies between different factors of that pattern. Based on this representation, the characteristics of each pattern and the importance of each factor can be portrayed. The work provides a new way of thinking about market price prediction for agricultural products.

