Improving discretization based pattern discovery for multivariate time series by additional preprocessing

In technical systems, the analysis of similar load situations is a promising technique for gaining information about the system’s state, health, or wear. Very often, load situations are difficult to define by hand. Hence, these situations need to be discovered as recurrent patterns within multivariate time series data of the system under consideration. Unsupervised algorithms for finding such recurrent patterns in multivariate time series must be able to cope with very large data sets because the system might be observed over a very long time. In our previous work, we identified discretization-based approaches as very promising for variable-length pattern discovery because of their low computing time, which results from the simplification (symbolization) of the time series. In this paper, we propose additional preprocessing steps for the symbolic representation of time series, aiming at enhanced multivariate pattern discovery. Beyond that, we show the performance (quality and computing time) of our algorithms on a synthetic test data set as well as on a real-life example with 100 million time points. We also test our approach with increasing dimensionality of the time series.
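
The symbolization idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the segment length, the breakpoints, the three-letter alphabet, and the function names `symbolize`, `combine`, and `recurrent_patterns` are all assumptions chosen for illustration, loosely following the well-known scheme of z-normalizing, averaging over segments, and mapping each average to a symbol.

```python
from collections import defaultdict
from statistics import mean, pstdev


def symbolize(series, segment_len=2, breakpoints=(-0.5, 0.5), alphabet="abc"):
    """Z-normalize a series, average it over fixed-length segments,
    and map each segment mean to a symbol (hypothetical parameters)."""
    mu = mean(series)
    sigma = pstdev(series) or 1.0  # avoid division by zero for constant series
    z = [(x - mu) / sigma for x in series]
    symbols = []
    for start in range(0, len(z) - segment_len + 1, segment_len):
        seg_mean = mean(z[start:start + segment_len])
        # The symbol index is the number of breakpoints below the segment mean.
        symbols.append(alphabet[sum(seg_mean > b for b in breakpoints)])
    return "".join(symbols)


def combine(words):
    """Merge per-dimension symbol strings into one sequence of symbol tuples."""
    return list(zip(*words))


def recurrent_patterns(symbols, length, min_count=2):
    """Return every symbol subsequence of the given length that occurs
    at least min_count times, with its start positions."""
    positions = defaultdict(list)
    for i in range(len(symbols) - length + 1):
        positions[tuple(symbols[i:i + length])].append(i)
    return {p: pos for p, pos in positions.items() if len(pos) >= min_count}


word = symbolize([0, 0, 10, 10, 0, 0, 10, 10])
print(word, recurrent_patterns(word, 2))
```

Because the search runs on short symbol strings rather than raw samples, counting repeated subsequences is cheap, which is the computing-time argument the abstract makes. For a multivariate series, `combine` turns one symbol string per dimension into a single sequence that `recurrent_patterns` can process unchanged.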

Pattern discovery in time series using autoencoder in comparison to nonlearning approaches

Integrated Computer-Aided Engineering ◽

10.3233/ica-210650 ◽

2021 ◽

pp. 1-20

Author(s):

Fabian Kai-Dietrich Noering ◽

Yannik Schroeder ◽

Konstantin Jonas ◽

Frank Klawonn

Keyword(s):

Time Series ◽

Time Series Data ◽

Pattern Discovery ◽

Computing Time ◽

Real Life ◽

Synthetic Data ◽

Series Data ◽

Data Sets ◽

Research Fields ◽

The Matrix

In technical systems, the analysis of similar situations is a promising technique for gaining information about the system’s state, health, or wear. Very often, situations cannot be defined in advance but need to be discovered as recurrent patterns within time series data of the system under consideration. This paper assesses different approaches to discovering frequent variable-length patterns in time series. Because of the success of artificial neural networks (NNs) in various research fields, a particular focus of this work is the applicability of NNs to the problem of pattern discovery in time series. We therefore applied and adapted a Convolutional Autoencoder and compared it to classical nonlearning approaches based on Dynamic Time Warping, time series discretization, and the Matrix Profile. These nonlearning approaches were also adapted to meet our requirements, such as the discovery of potentially time-scaled patterns in noisy time series. We showed the performance (quality, computing time, parametrization effort) of those approaches in an extensive test with synthetic data sets. Additionally, the transferability to other data sets was tested using real-life vehicle data. We demonstrated the ability of Convolutional Autoencoders to discover patterns in an unsupervised way. Furthermore, the tests showed that the Autoencoder is able to discover patterns with a quality similar to that of classical nonlearning approaches.
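
Of the nonlearning baselines named above, Dynamic Time Warping is the easiest to show in a few lines. The sketch below is the textbook dynamic-programming formulation with an absolute-difference cost, not the adapted variant used in the paper; the function name `dtw_distance` is an assumption.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences,
    using absolute difference as the local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats its sample
                cost[i][j - 1],      # b advances, a repeats its sample
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# A time-stretched copy of the same shape aligns at zero cost.
print(dtw_distance([0, 1, 2], [0, 1, 1, 2]))
```

The example call shows why the measure suits time-scaled patterns: repeating a sample stretches the sequence without adding alignment cost. The quadratic table is also the reason this family of methods is comparatively slow on long series.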

Dynamic Modelling of Sharia-Based Corporate, Islamic Index and Exchange Rate: VAR Model Application

JURNAL ILMIAH EKONOMI ISLAM ◽

10.29040/jiei.v6i2.1093 ◽

2020 ◽

Vol 6 (2) ◽

pp. 195

Author(s):

Hasrun Afandi Umpusinga ◽

Atika Riasari ◽

Fajrin Satria Dwi Kesumah

Keyword(s):

Time Series ◽

Exchange Rate ◽

Time Series Data ◽

Multivariate Time Series ◽

Dynamic Modelling ◽

Series Data ◽

Var Model ◽

Suitable Model ◽

Data Set ◽

Dynamic Relationship

Indonesia has recently become one of the largest users of sharia-compliant products, which raises the question of how the sharia stocks listed in Indonesia's most valuable sharia stock index perform and correlate with other variables, particularly exchange rates. The study aims to analyse the causal relationship and to forecast the performance of sharia-based stocks and their Islamic index in Indonesia, along with the volatility of the exchange rate. A Vector Autoregressive (VAR) model is applied to analyse the multivariate time series, as it is considered a suitable model for predicting time series data with multiple variables. The findings suggest that a VAR(1) model fits the data and can be used both to analyse the dynamic relationship and to forecast the data set for the next 24 weeks. While the forecast shows an increasing trend for the JII, both ANTM and EXR are predicted to have stable volatility. In addition, Granger causality tests indicate that the variables affect one another, and the impulse response function (IRF) shows that after a shock in one variable, another variable is relatively slow to return to zero in the short term.
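
A VAR(1) model expresses each variable at time t as an intercept plus a linear combination of all variables at time t-1. Multi-step forecasts are obtained by applying that relation repeatedly with the noise term set to zero. The sketch below illustrates only this recursion; the intercept `c`, the coefficient matrix `A`, and the starting values are made-up numbers, not estimates from the study.

```python
def var1_forecast(c, A, y_last, steps):
    """Iterate y_t = c + A @ y_{t-1} for the given number of steps
    and return the list of forecast vectors."""
    forecasts = []
    y = list(y_last)
    for _ in range(steps):
        y = [
            c[i] + sum(A[i][j] * y[j] for j in range(len(y)))
            for i in range(len(c))
        ]
        forecasts.append(y)
    return forecasts


# Hypothetical two-variable system; a 24-week horizon would use steps=24.
c = [1.0, 0.0]
A = [[0.5, 0.0], [0.25, 0.5]]
print(var1_forecast(c, A, y_last=[2.0, 4.0], steps=2))
```

In practice the coefficients would be estimated from data with a statistics package rather than set by hand.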

Production fault simulation and forecasting from time series data with machine learning in glove textile industry

Journal of Engineered Fibers and Fabrics ◽

10.1177/1558925019883462 ◽

2019 ◽

Vol 14 ◽

pp. 155892501988346 ◽

Cited By ~ 4

Author(s):

Mine Seçkin ◽

Ahmet Çağdaş Seçkin ◽

Aysun Coşkun

Keyword(s):

Machine Learning ◽

Time Series ◽

Production Process ◽

Time Series Data ◽

Real Life ◽

Machine Learning Algorithms ◽

Series Data ◽

Random Parameters ◽

Data Set ◽

Textile Sector

Although textile production is heavily automation-based, it is still regarded as largely unexplored territory with regard to Industry 4.0. When these developments are integrated into the textile sector, efficiency is expected to increase. A review of data mining and machine learning studies in the textile sector shows a lack of data sharing about production processes in enterprises because of commercial concerns and confidentiality. In this study, a method is presented for simulating a production process and performing regression on the resulting time series data with machine learning. The simulation was prepared for the annual production plan, and the corresponding faults were obtained based on information received from a textile glove enterprise and on production data. The data set was applied to various supervised machine learning methods to compare their learning performance. The errors that occur in the production process were created using random parameters in the simulation. To verify the hypothesis that the errors can be forecast, various machine learning algorithms were trained on the data set in the form of time series. The variable representing the number of faulty products could be forecast very successfully. When forecasting the faulty product parameter, the random forest algorithm achieved the highest success. As these error values were forecast with high accuracy even in a simulation that works with uniformly distributed random parameters, highly accurate forecasts are expected to be achievable in real-life applications as well.

Subsequence Time Series Clustering

Encyclopedia of Data Warehousing and Mining, Second Edition ◽

10.4018/978-1-60566-010-3.ch286 ◽

2011 ◽

pp. 1871-1876

Author(s):

Jason Chen

Keyword(s):

Time Series ◽

Traditional Method ◽

Time Series Data ◽

Large Data ◽

Series Data ◽

Data Sets ◽

Data Set ◽

Time Series Clustering ◽

Mining Community ◽

Representative Points

Clustering analysis is a tool used widely in the Data Mining community and beyond (Everitt et al. 2001). In essence, the method allows us to “summarise” the information in a large data set X by creating a very much smaller set C of representative points (called centroids) and a membership map relating each point in X to its representative in C. An obvious but special type of data set that one might want to cluster is a time series data set. Such data has a temporal ordering on its elements, in contrast to non-time series data sets. In this article we explore the area of time series clustering, focusing mainly on a surprising recent result showing that the traditional method for time series clustering is meaningless. We then survey the literature of recent papers and go on to argue how time series clustering can be made meaningful.
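
The two objects named in this abstract, the centroid set C and the membership map from X to C, can be made concrete with a short sketch. It extracts sliding-window subsequences from a series, assigns each to its nearest centroid, and recomputes the centroids once (a single step of the usual k-means loop). The function names and the toy data are assumptions for illustration only, and the sketch takes no position on the article's argument about whether such clustering is meaningful.

```python
def sliding_windows(series, width):
    """All contiguous subsequences of the given width."""
    return [tuple(series[i:i + width]) for i in range(len(series) - width + 1)]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def membership_map(points, centroids):
    """Index of the nearest centroid for every point in X."""
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
        for p in points
    ]


def update_centroids(points, membership, centroids):
    """Replace each centroid by the mean of its members (unchanged if empty)."""
    updated = []
    for k, old in enumerate(centroids):
        members = [p for p, m in zip(points, membership) if m == k]
        if not members:
            updated.append(old)
            continue
        updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
    return updated


X = sliding_windows([0, 1, 10, 11], width=2)
C = [(0, 0), (10, 10)]
members = membership_map(X, C)
print(members, update_centroids(X, members, C))
```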

Soybean Price Pattern Discovery Via Toeplitz Inverse Covariance-Based Clustering

International Journal of Agricultural and Environmental Information Systems ◽

10.4018/ijaeis.2019100101 ◽

2019 ◽

Vol 10 (4) ◽

pp. 1-17

Author(s):

Hua Ling Deng ◽

Yǔ Qiàn Sūn

Keyword(s):

Time Series ◽

Time Series Data ◽

Multivariate Time Series ◽

Pattern Discovery ◽

Market Price ◽

Series Data ◽

Temporal Data ◽

Price Prediction ◽

Soybean Price ◽

High Volatility

The high volatility of world soybean prices has caused uncertainty and vulnerability, particularly in developing countries. The clustering of time series is a useful tool for discovering soybean price patterns in temporal data. However, traditional clustering methods can neither represent the continuity of price data very well nor capture the correlation between factors. In this work, the authors apply Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data (TICC) to soybean price pattern discovery. This is a new method for multivariate time series clustering that can simultaneously segment and cluster the time series data. Each pattern in the TICC method is defined by a Markov random field (MRF) characterizing the interdependencies between different factors of that pattern. Based on this representation, the characteristics of each pattern and the importance of each factor can be portrayed. The work provides a new way of thinking about market price prediction for agricultural products.

Some statistical and CI models to predict chaotic high-frequency financial data

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189107 ◽

2020 ◽

Vol 39 (5) ◽

pp. 6419-6430

Author(s):

Dusan Marcek

Keyword(s):

Time Series Data ◽

Moving Average ◽

Methodological Approach ◽

Back Propagation ◽

Large Data ◽

Series Data ◽

Data Set ◽

Training Time ◽

Optimal Population ◽

Forecast Time

To forecast time series data, two methodological frameworks are considered: statistical modelling and computational intelligence modelling. The statistical approach is based on the theory of invertible ARIMA (Auto-Regressive Integrated Moving Average) models with the Maximum Likelihood (ML) estimation method. As a competing tool to statistical forecasting models, we use the popular classic neural network (NN) of the perceptron type. To train the NN, the Back-Propagation (BP) algorithm and heuristics such as the genetic and micro-genetic algorithms (GA and MGA) are applied to the large data set. A comparative analysis of the selected learning methods is performed and evaluated. From the experiments performed, we find that the optimal population size is likely 20, which gives the lowest training time of all NNs trained by the evolutionary algorithms, while the prediction accuracy is lower but still acceptable to managers.

Convex Hull Convolutive Non-Negative Matrix Factorization for Uncovering Temporal Patterns in Multivariate Time-Series Data

10.21437/interspeech.2016-571 ◽

2016 ◽

Cited By ~ 5

Author(s):

Colin Vaz ◽

Asterios Toutios ◽

Shrikanth S. Narayanan

Keyword(s):

Time Series ◽

Convex Hull ◽

Matrix Factorization ◽

Time Series Data ◽

Multivariate Time Series ◽

Temporal Patterns ◽

Series Data ◽

Non Negative Matrix Factorization

Modeling Low-risk Actions from Multivariate Time Series Data Using Distributional Reinforcement Learning

2020 11th International Conference on Awareness Science and Technology (iCAST) ◽

10.1109/icast51195.2020.9319476 ◽

2020 ◽

Author(s):

Yosuke Sato ◽

Jianwei Zhang

Keyword(s):

Time Series ◽

Reinforcement Learning ◽

Time Series Data ◽

Multivariate Time Series ◽

Low Risk ◽

Series Data

Remaining Useful Life Prediction Using Temporal Convolution with Attention

AI ◽

10.3390/ai2010005 ◽

2021 ◽

Vol 2 (1) ◽

pp. 48-70

Author(s):

Wei Ming Tan ◽

T. Hui Teo

Keyword(s):

Neural Network ◽

Time Series ◽

Time Series Data ◽

Remaining Useful Life ◽

Sensor Data ◽

Series Data ◽

Multiple Time ◽

Data Set ◽

Form Complex ◽

Useful Life

Prognostic techniques attempt to predict the Remaining Useful Life (RUL) of a subsystem or a component. Such techniques often use sensor data that are periodically measured and recorded into a time series data set. Such multivariate data sets form complex and non-linear inter-dependencies across recorded time steps and between sensors. Many existing prognostic algorithms have started to explore Deep Neural Networks (DNNs) and their effectiveness in the field. Although Deep Learning (DL) techniques outperform traditional prognostic algorithms, the networks are generally complex to deploy or train. This paper proposes a Multi-variable Time Series (MTS) focused approach to prognostics that implements a lightweight Convolutional Neural Network (CNN) with an attention mechanism. The convolution filters extract abstract temporal patterns from the multiple time series, while the attention mechanisms review the information across the time axis and select the relevant information. The results suggest that the proposed method not only produces superior accuracy of RUL estimation but also trains many times faster than previously reported approaches. The advantage of deploying the network is also demonstrated on a lightweight hardware platform, where it is not only much more compact but also more efficient in a resource-restricted environment.
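
The phrase "review the information across the time axis and select the relevant information" corresponds, in its most generic form, to a softmax-weighted sum over time steps. The sketch below shows only that generic operation and is not the architecture from the paper: in a real network the per-step scores would be produced by learned layers, whereas here they are passed in by hand, and the names `softmax` and `attend_over_time` are assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_over_time(features, scores):
    """Weight each time step's feature vector by its softmax score and
    sum over time, returning the context vector and the weights."""
    weights = softmax(scores)
    n_features = len(features[0])
    context = [
        sum(w * step[k] for w, step in zip(weights, features))
        for k in range(n_features)
    ]
    return context, weights


# Two time steps with two features each; equal scores give a plain average.
context, weights = attend_over_time([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0])
print(context, weights)
```

A much higher score for one time step moves the context vector toward that step's features, which is the sense in which attention "selects" information.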

Change Point Enhanced Anomaly Detection for IoT Time Series Data

Water ◽

10.3390/w13121633 ◽

2021 ◽

Vol 13 (12) ◽

pp. 1633

Author(s):

Elena-Simona Apostol ◽

Ciprian-Octavian Truică ◽

Florin Pop ◽

Christian Esposito

Keyword(s):

Time Series ◽

Anomaly Detection ◽

Change Point ◽

Time Series Data ◽

Multivariate Time Series ◽

Change Point Detection ◽

Change Points ◽

Series Data ◽

Prediction And Forecasting ◽

Point Detection

Due to the exponential growth of Internet of Things networks and the massive amount of time series data collected from them, it is essential to apply efficient methods for Big Data analysis in order to extract meaningful information and statistics. Anomaly detection is an important part of time series analysis, improving the quality of further analysis, such as prediction and forecasting. Thus, detecting sudden change points that belong to normal behavior and using them to distinguish normal from abnormal behavior, i.e., outliers, is a crucial step for minimizing the false positive rate and building accurate machine learning models for prediction and forecasting. In this paper, we propose a rule-based decision system that enhances anomaly detection in multivariate time series using change point detection. Our architecture uses a pipeline that automatically detects real anomalies and removes the false positives introduced by change points. We employ both traditional and deep learning unsupervised algorithms: in total, five anomaly detection and five change point detection algorithms. Additionally, we propose a new confidence metric based on the support for a time series point to be an anomaly and the support for the same point to be a change point. In our experiments, we use a large real-world dataset containing multivariate time series about water consumption collected from smart meters. As an evaluation metric, we use Mean Absolute Error (MAE). The low MAE values show that the algorithms accurately determine anomalies and change points. The experimental results strengthen our assumption that anomaly detection can be improved by determining and removing change points, and they validate the correctness of our proposed rules in real-world scenarios. Furthermore, the proposed rule-based decision support systems enable users to make informed decisions regarding the status of the water distribution network and to perform predictive and proactive maintenance effectively.
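
The idea of removing false positives that coincide with change points can be sketched as a simple voting rule. The abstract does not give the paper's actual rules or the formula of its confidence metric, so everything below is a hypothetical stand-in: "support" is taken to be the fraction of detectors that flag a time index, and a point is kept as an anomaly only if its anomaly support reaches a threshold and exceeds its change point support.

```python
def support(detections, index):
    """Fraction of detectors whose flagged-index set contains this index."""
    return sum(index in flagged for flagged in detections) / len(detections)


def filter_anomalies(anomaly_detections, change_detections, length, threshold=0.5):
    """Keep indices that enough anomaly detectors flag and that are not
    explained equally well or better as change points (hypothetical rule)."""
    kept = []
    for t in range(length):
        anomaly_support = support(anomaly_detections, t)
        change_support = support(change_detections, t)
        if anomaly_support >= threshold and anomaly_support > change_support:
            kept.append(t)
    return kept


# Three anomaly detectors and three change point detectors over 8 time steps.
anomalies = [{2, 5}, {2, 5}, {5}]
changes = [{5}, {5}, {5}]
print(filter_anomalies(anomalies, changes, length=8))
```

In this toy input, index 5 is flagged by every anomaly detector but also by every change point detector, so it is treated as a change point and dropped, while index 2 is kept.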
