Matrix Profile III: The Matrix Profile Allows Visualization of Salient Subsequences in Massive Time Series

Author(s):  
Chin-Chia Michael Yeh ◽  
Helga Van Herle ◽  
Eamonn Keogh
Keyword(s):  
2021 ◽  
Vol 60 ◽  
pp. 102431
Author(s):  
Hailin Li ◽  
Yenchun Jim Wu ◽  
Shijie Zhang ◽  
Jinchuan Zou

2017 ◽  
Vol 32 (1) ◽  
pp. 83-123 ◽  
Author(s):  
Chin-Chia Michael Yeh ◽  
Yan Zhu ◽  
Liudmila Ulanova ◽  
Nurjahan Begum ◽  
Yifei Ding ◽  
...  

2014 ◽  
Vol 30 (4) ◽  
pp. 1467-1485 ◽  
Author(s):  
Yufeng Gao ◽  
Yongxin Wu ◽  
Dayong Li ◽  
Ning Zhang ◽  
Fei Zhang

In dynamic analyses of important structures, seismic input may be defined in the form of time series. The response spectrum of this input time series is required to be compatible with a specified target response spectrum. Time-domain spectral matching, which is used to generate spectrum-compatible acceleration time series, is investigated in some detail. First, a new, improved adjustment wavelet is presented that prevents drift in the resulting velocity and displacement time series without applying a baseline correction. Next, the analytical solution of the matrix accounting for the cross-correlation of each wavelet is given in order to ensure the speed of the matching procedure. Finally, some aspects, such as the reduction factors and the matching order, are discussed to ensure the stability and efficiency of the matching procedure. The characteristics of the matching procedure are illustrated by numerical examples.


2020 ◽  
Vol 34 (4) ◽  
pp. 949-979
Author(s):  
Yan Zhu ◽  
Shaghayegh Gharghabi ◽  
Diego Furtado Silva ◽  
Hoang Anh Dau ◽  
Chin-Chia Michael Yeh ◽  
...  

2021 ◽  
pp. 1-20
Author(s):  
Fabian Kai-Dietrich Noering ◽  
Yannik Schroeder ◽  
Konstantin Jonas ◽  
Frank Klawonn

In technical systems, the analysis of similar situations is a promising technique for gaining information about the system’s state, health, or wear. Very often, situations cannot be defined in advance but must be discovered as recurrent patterns within time series data of the system under consideration. This paper assesses different approaches to discovering frequent variable-length patterns in time series. Because of the success of artificial neural networks (NNs) in various research fields, a particular focus of this work is the applicability of NNs to the problem of pattern discovery in time series. We therefore applied and adapted a Convolutional Autoencoder and compared it to classical nonlearning approaches based on Dynamic Time Warping, on time series discretization, and on the Matrix Profile. These nonlearning approaches were also adapted to fulfill our requirements, such as the discovery of potentially time-scaled patterns in noisy time series. We evaluated the performance (quality, computing time, parametrization effort) of these approaches in an extensive test with synthetic data sets. Additionally, transferability to other data sets was tested using real-life vehicle data. We demonstrated the ability of Convolutional Autoencoders to discover patterns in an unsupervised way. Furthermore, the tests showed that the Autoencoder is able to discover patterns of a quality similar to that of classical nonlearning approaches.


2019 ◽  
Author(s):  
V.M. Efimov ◽  
K.V. Efimov ◽  
V.Y. Kovaleva

In the 1940s, Karhunen and Loève proposed a method for processing a one-dimensional numeric time series by converting it into a multidimensional one by shifts. In effect, a one-dimensional number series was decomposed into several orthogonal time series. This method has been independently developed many times and applied in practice under various names (EOF, SSA, Caterpillar, etc.). Nowadays, the name SSA (Singular Spectral Analysis) is most often used. It turned out to be universal: it is applicable to any time series without requiring stationarity assumptions, and it automatically decomposes a time series into a trend, cyclic components, and noise. By the beginning of the 1980s, Takens showed that for a dynamical system such a method makes it possible to obtain an attractor from observations of only one of its variables, thereby giving the method a powerful theoretical basis. In the same years, the practical benefits of phase portraits became clear. In particular, they were used in the analysis and forecasting of animal abundance dynamics. In this paper we propose to extend SSA to a one-dimensional sequence of elements of any type, including numbers, symbols, figures, etc., and, as a special case, to molecular sequences. Technically, the problem is solved by almost the same algorithm as SSA. The sequence is cut by a sliding window into fragments of a given length. The matrix of Euclidean distances between all fragments is then calculated. This is always possible: for example, the square root of the Hamming distance between fragments is a Euclidean distance. For the resulting matrix, the principal components are calculated by the principal-coordinate method (PCo). Instead of a distance matrix, one can use a matrix of any similarity/dissimilarity indexes and apply methods of multidimensional scaling (MDS). The result will always be PCs in some Euclidean space. We call this method PCA-Seq. It is certainly an exploratory method, as is its special case, SSA.
For any sequence, including molecular ones, PCA-Seq makes it possible, without any additional assumptions, to obtain its principal components in numerical form and to visualize them as phase portraits. Long-term experience of applying SSA to numerical data gives every reason to believe that PCA-Seq will be no less useful in the analysis of non-numerical data, especially in hypothesis generation. PCA-Seq is implemented in the freely distributed Jacobi 4 package (http://mrherrn.github.io/JACOBI4/).
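The steps named in this abstract (sliding-window fragments, Hamming-based distances, principal coordinates) can be illustrated with a minimal Python sketch. It is not the Jacobi 4 implementation: all function names are hypothetical, only the first principal coordinate is computed, and power iteration is used as a stand-in for the full eigendecomposition a real implementation would likely use.

```python
import math

def sliding_fragments(seq, width):
    """Cut a sequence into all overlapping fragments of the given length."""
    return [seq[i:i + width] for i in range(len(seq) - width + 1)]

def hamming(a, b):
    """Number of positions at which two equal-length fragments differ."""
    return sum(x != y for x, y in zip(a, b))

def gram_from_squared_distances(d2):
    """Double-center a squared-distance matrix (the principal-coordinate step)."""
    n = len(d2)
    row_mean = [sum(row) / n for row in d2]
    total_mean = sum(row_mean) / n
    return [[-0.5 * (d2[i][j] - row_mean[i] - row_mean[j] + total_mean)
             for j in range(n)] for i in range(n)]

def first_principal_coordinate(b, iterations=500):
    """Leading eigenpair of the Gram matrix by power iteration.

    Assumption: power iteration replaces a full eigendecomposition here
    only to keep the sketch dependency-free.
    """
    n = len(b)
    v = [math.sin(i + 1.0) for i in range(n)]  # arbitrary deterministic start
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
        eigenvalue = norm
    scale = math.sqrt(eigenvalue)
    return eigenvalue, [x * scale for x in v]

def pca_seq_first_component(seq, width):
    """First principal coordinate of the fragments of a symbolic sequence."""
    frags = sliding_fragments(seq, width)
    # The square root of the Hamming distance is treated as a Euclidean
    # distance, so the squared distance is the Hamming distance itself.
    d2 = [[float(hamming(a, b)) for b in frags] for a in frags]
    return first_principal_coordinate(gram_from_squared_distances(d2))

if __name__ == "__main__":
    eigenvalue, coords = pca_seq_first_component("ACGTACGTACGT", 4)
    for position, value in enumerate(coords):
        print(position, round(value, 4))
```

Plotting the first coordinate against the second (not computed here) would give the kind of phase portrait the abstract refers to; identical fragments always receive identical coordinates.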


Author(s):  
John Ross ◽  
Igor Schreiber ◽  
Marcel O. Vlad

In this chapter we present an experimental test case of the deduction of a reaction pathway and mechanism by means of correlation metric construction from time-series measurements of the concentrations of chemical species. We choose as the system an enzymatic reaction network, the initial steps of glycolysis. Glycolysis is central in intermediary metabolism and has a high degree of regulation. The reaction pathway has been well studied and thus it is a good test for the theory. Further, the reaction mechanism of this part of glycolysis has been modeled extensively. The quantity and precision of the measurements reported here are sufficient to determine the matrix of correlation functions and, from this, a reaction pathway that is qualitatively consistent with the reaction mechanism established previously. The existence of unmeasured species did not compromise the analysis. The quantity and precision of the data were not excessive, and thus we expect the method to be generally applicable. This CMC experiment was carried out in a continuous-flow stirred-tank reactor (CSTR). The reaction network considered consists of eight enzymes, which catalyze the conversion of glucose into dihydroxyacetone phosphate and glyceraldehyde phosphate. The enzymes were confined to the reactor by an ultrafiltration membrane at the top of the reactor. The membrane was permeable to all low molecular weight species. The inputs are (1) a reaction buffer, which provides starting material for the reaction network to process, maintains pH and pMg, and contains any other species that act as constant constraints on the system dynamics, and (2) a set of “control species” (at least one), whose input concentrations are changed randomly every sampling period over the course of the experiment. The sampling period is chosen such that the system almost, but not quite, relaxes to a chosen nonequilibrium steady state. 
The system is kept near enough to its steady state to minimize trending (caused by the relaxation) in the time series, but far enough from the steady state that the time-lagged autocorrelation functions for each species decay to zero over three to five sampling periods. This long decay is necessary if temporal ordering in the network is to be analyzed.
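The decay criterion mentioned at the end of this abstract can be checked numerically. The Python sketch below computes a normalized time-lagged autocorrelation and reports the first lag at which it falls below a cutoff. The function names and the cutoff of 0.1 are assumptions made for illustration; they are not taken from the chapter.

```python
def lagged_autocorrelation(x, lag):
    """Normalized autocorrelation of series x at the given non-negative lag."""
    n = len(x)
    mean = sum(x) / n
    variance_sum = sum((v - mean) ** 2 for v in x)
    if variance_sum == 0.0:
        raise ValueError("autocorrelation is undefined for a constant series")
    covariance_sum = sum((x[t] - mean) * (x[t + lag] - mean)
                         for t in range(n - lag))
    return covariance_sum / variance_sum

def decay_lag(x, threshold=0.1, max_lag=20):
    """First lag at which |autocorrelation| drops below the threshold.

    Hypothetical helper: the threshold is an illustrative choice, and
    None is returned if no lag up to max_lag satisfies it.
    """
    for lag in range(1, min(max_lag, len(x) - 1) + 1):
        if abs(lagged_autocorrelation(x, lag)) < threshold:
            return lag
    return None
```

With one measurement per sampling period, a `decay_lag` between three and five for each species would correspond to the regime the abstract describes.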


2021 ◽  
Vol 5 (1) ◽  
pp. 45
Author(s):  
Eoin Cartwright ◽  
Martin Crane ◽  
Heather J. Ruskin

The Matrix Profile (MP) algorithm has the potential to revolutionise many areas of data analysis. In this article, several applications to financial time series are examined. Several approaches for the identification of similar behaviour patterns (or motifs) are proposed, illustrated, and the results discussed. While the MP is primarily designed for single series analysis, it can also be applied to multi-variate financial series. It still permits the initial identification of time periods with indicatively similar behaviour across individual market sectors and indexes, together with the assessment of wider applications, such as general market behaviour in times of financial crisis. In short, the MP algorithm offers considerable potential for detailed analysis, not only in terms of motif identification in financial time series, but also in terms of exploring the nature of underlying events.
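For readers unfamiliar with the Matrix Profile, the following Python sketch shows a brute-force version of the idea: for every subsequence, record the z-normalized Euclidean distance to its nearest non-overlapping neighbour. It is not the algorithm used in the article (published Matrix Profile algorithms such as STAMP and STOMP are far faster); the function names and the exclusion zone of half the window length are assumptions made for illustration.

```python
import math

def z_normalize(window):
    """Rescale a window to zero mean and unit variance (zeros if constant)."""
    mean = sum(window) / len(window)
    std = math.sqrt(sum((v - mean) ** 2 for v in window) / len(window))
    if std == 0.0:
        return [0.0] * len(window)
    return [(v - mean) / std for v in window]

def naive_matrix_profile(series, m):
    """Brute-force matrix profile and profile index for window length m."""
    n = len(series) - m + 1
    windows = [z_normalize(series[i:i + m]) for i in range(n)]
    exclusion = max(1, m // 2)  # assumption: skip trivially overlapping matches
    profile = [math.inf] * n
    index = [-1] * n
    for i in range(n):
        for j in range(n):
            if abs(i - j) < exclusion:
                continue
            dist = math.sqrt(sum((a - b) ** 2
                                 for a, b in zip(windows[i], windows[j])))
            if dist < profile[i]:
                profile[i] = dist
                index[i] = j
    return profile, index

if __name__ == "__main__":
    # The shape [0, 1, 3, 1, 0] occurs twice, at positions 0 and 9.
    prices = [0, 1, 3, 1, 0, 5, 2, 7, 4, 0, 1, 3, 1, 0, 9, 6, 8, 2]
    profile, index = naive_matrix_profile(prices, 5)
    motif_start = min(range(len(profile)), key=profile.__getitem__)
    print(motif_start, index[motif_start], profile[motif_start])
```

Low values in the profile mark motifs (repeated behaviour), while high values mark subsequences with no close match elsewhere in the series.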

