G-CNN and double-referenced thresholding for detecting time series anomalies

Anomaly detection based on time series data is of great importance in many fields. Time series data produced by man-made systems usually comprise two parts: monitored data, which are the object of detection, and exogenous data, which carry control and feedback information. In this paper, we propose G-CNN, an architecture that combines gated recurrent units (GRU) with a convolutional neural network (CNN), which focus on the monitored and exogenous data, respectively. The most important contribution is a complementary double-referenced thresholding approach that processes prediction errors and calculates the threshold, balancing the minimization of false positives against that of false negatives. Experiments on two public datasets from aerospace and a new server machine dataset from an Internet company demonstrate the strong performance and broad applicability of the model. We also find that the monitored data are closely associated with the exogenous data, where present, and discuss the interpretability of G-CNN by visualizing the intermediate outputs of the neural networks.
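The abstract does not describe how the double-referenced thresholding is computed, so the following is not the authors' method. It is a minimal sketch of the generic idea the abstract starts from: turn prediction errors into a threshold and flag points whose error exceeds it. The mean-plus-k-sigma rule, the function names `error_threshold` and `flag_anomalies`, and the parameter `k` are all assumptions made for illustration.

```python
import statistics


def error_threshold(reference_errors, k=3.0):
    """Threshold = mean + k * population std of errors observed on normal data.

    This is a common textbook rule, assumed here for illustration only.
    """
    mean = statistics.fmean(reference_errors)
    std = statistics.pstdev(reference_errors)
    return mean + k * std


def flag_anomalies(actual, predicted, reference_errors, k=3.0):
    """Return the indices whose absolute prediction error exceeds the threshold."""
    limit = error_threshold(reference_errors, k)
    return [
        i
        for i, (a, p) in enumerate(zip(actual, predicted))
        if abs(a - p) > limit
    ]
```

In this sketch, raising `k` trades false positives for false negatives, which is the balance the abstract says the thresholding step is meant to strike.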

Predicting Plant Growth from Time-Series Data Using Deep Learning

Remote Sensing ◽ 10.3390/rs13030331 ◽ 2021 ◽ Vol 13 (3) ◽ pp. 331

Author(s): Robail Yasrab ◽ Jincheng Zhang ◽ Polina Smyth ◽ Michael P. Pound

Keyword(s): Time Series ◽ Deep Learning ◽ Plant Growth ◽ Time Series Data ◽ Plant Traits ◽ Domain Adaptation ◽ Series Data ◽ Plant Phenotyping ◽ Research Issues ◽ Public Datasets

Phenotyping involves the quantitative assessment of anatomical, biochemical, and physiological plant traits. Natural plant growth cycles can be extremely slow, hindering the experimental processes of phenotyping. Deep learning offers a great deal of support for automating and addressing key plant phenotyping research issues. Machine learning-based high-throughput phenotyping is a potential solution to the phenotyping bottleneck, promising to accelerate the experimental cycles within phenomic research. This research presents a study of deep networks’ potential to predict plants’ expected growth by generating segmentation masks of root and shoot systems into the future. We adapt an existing generative adversarial predictive network to this new domain. The result is an efficient plant leaf and root segmentation network that predicts what a leaf and root system will look like at a future time, based on time-series data of plant growth. We present benchmark results on two public datasets of Arabidopsis (A. thaliana) and Brassica rapa (Komatsuna) plants. The experimental results show strong performance and the capability of the proposed methods to match expert annotation. The proposed method is highly adaptable and can be trained (via transfer learning or domain adaptation) on different plant species and mutations.

Multi-Channel Fusion Classification Method Based on Time-Series Data

Sensors ◽ 10.3390/s21134391 ◽ 2021 ◽ Vol 21 (13) ◽ pp. 4391

Author(s): Xue-Bo Jin ◽ Aiqiang Yang ◽ Tingli Su ◽ Jian-Lei Kong ◽ Yuting Bai

Keyword(s): Time Series ◽ Time Series Data ◽ Short Term Memory ◽ Evidence Theory ◽ Recurrence Plot ◽ Series Data ◽ Public Datasets ◽ Application Fields ◽ Original Time

Time-series data arise in many application fields, and their classification is an important research direction in time-series data mining. In this paper, univariate time-series data are taken as the research object, and deep learning and broad learning systems (BLSs) are the basic methods used to explore the classification of multi-modal time-series data features. Long short-term memory (LSTM), gated recurrent unit, and bidirectional LSTM networks are used to learn and test the original time-series data; a Gramian angular field and a recurrence plot are used to encode the time-series data as images; and a BLS is employed for image learning and testing. Finally, Dempster–Shafer evidence theory (D–S evidence theory) is used to fuse the probability outputs of the two categories and obtain the final classification results. On public datasets, the proposed method obtains competitive results, compensating for the deficiencies of using only time-series data or only images for different types of datasets.
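Two of the building blocks named in this abstract can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' implementation: `gramian_angular_field` computes the summation variant of the Gramian angular field (the abstract does not say which variant is used), and `dempster_combine` applies Dempster's rule to two class-probability vectors on the assumption that every source puts all of its mass on single classes. Both function names are hypothetical.

```python
import math


def gramian_angular_field(series):
    """Encode a 1-D series as a Gramian angular summation field (list of rows)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Rescale to [-1, 1] so that each value can be read as cos(phi).
    x = [2 * (v - lo) / (hi - lo) - 1 for v in series]
    # sin(phi) is non-negative because phi = arccos(x) lies in [0, pi].
    s = [math.sqrt(max(0.0, 1 - v * v)) for v in x]
    # cos(phi_i + phi_j) = cos(phi_i) cos(phi_j) - sin(phi_i) sin(phi_j)
    n = len(x)
    return [[x[i] * x[j] - s[i] * s[j] for j in range(n)] for i in range(n)]


def dempster_combine(p1, p2):
    """Fuse two class-probability vectors with Dempster's rule.

    Assumes all mass sits on single classes, so the rule reduces to a
    normalized element-wise product.
    """
    if len(p1) != len(p2):
        raise ValueError("both sources must cover the same classes")
    joint = [a * b for a, b in zip(p1, p2)]
    agreement = sum(joint)  # equals 1 minus the conflict between the sources
    if agreement == 0:
        raise ValueError("sources are in total conflict")
    return [v / agreement for v in joint]
```

For example, `dempster_combine([0.6, 0.4], [0.7, 0.3])` strengthens the class both sources favor: the fused probability of the first class is 0.42 / 0.54, or about 0.78.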

A novel cross-validation strategy for artificial neural networks using distributed-lag environmental factors

PLoS ONE ◽ 10.1371/journal.pone.0244094 ◽ 2021 ◽ Vol 16 (1) ◽ pp. e0244094

Author(s): Chao-Yu Guo ◽ Tse-Wei Liu ◽ Yi-Hau Chen

Keyword(s): Time Series ◽ Cross Validation ◽ Time Series Data ◽ Series Data ◽ Prediction Errors ◽ Statistical Regression ◽ Ann Model ◽ Distributed Lag ◽ Minimum Number ◽ Artificial Neural

In recent years, machine learning methods have been applied to various prediction scenarios in time-series data. However, processing procedures that rearrange the order of longitudinal data, such as cross-validation (CV), may destroy the serial structure and lead to potentially biased outcomes. Regarding this issue, a recent study investigated how different types of CV methods influence the predictive errors in conventional time-series data. Here, we examine a more complex distributed lag nonlinear model (DLNM), which has been widely used to assess the cumulative impacts of past exposures on the current health outcome. This research extends the DLNM into an artificial neural network (ANN) and investigates how the ANN model reacts to various CV schemes that result in different predictive biases. We also propose a newly designed permutation ratio to evaluate the performance of the CV in the ANN. This ratio mimics the concept of the R-square in conventional statistical regression models. The results show that as the complexity of the ANN increases, the predicted outcome becomes more stable, and the bias shows a decreasing trend. Among the different settings of hyperparameters, the novel strategy, Leave One Block Out Cross-Validation (LOBO-CV), demonstrated much better results, and the lowest mean square error was observed. The hyperparameters of the ANN trained by the LOBO-CV yielded the minimum number of prediction errors. The newly proposed permutation ratio indicates that LOBO-CV can contribute up to 34% of the prediction accuracy.
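The abstract names Leave One Block Out Cross-Validation but does not define how blocks are formed. A minimal sketch, assuming the blocks are equal-sized contiguous runs of time-ordered samples and that each block serves as the test set exactly once, might look like this; the function name `leave_one_block_out` and its parameters are hypothetical.

```python
def leave_one_block_out(n_samples, n_blocks):
    """Yield (train_idx, test_idx) pairs over contiguous blocks.

    The order inside each block is preserved, so the serial structure of
    the test data is kept intact. The blocking scheme is an assumption.
    """
    if not 2 <= n_blocks <= n_samples:
        raise ValueError("need 2 <= n_blocks <= n_samples")
    # Block boundaries; blocks differ in size by at most one sample.
    bounds = [i * n_samples // n_blocks for i in range(n_blocks + 1)]
    for start, stop in zip(bounds, bounds[1:]):
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
```

Unlike a shuffled k-fold split, no test block is interleaved with training samples, which is the property the abstract is concerned with when it warns against rearranging longitudinal data.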

Semantic Anomaly Detection in Medical Time Series

German Medical Data Sciences: Bringing Data to Life - Studies in Health Technology and Informatics ◽ 10.3233/shti210059 ◽ 2021

Author(s): Sven Festag ◽ Cord Spreckelsen

Keyword(s): Time Series ◽ Time Series Data ◽ Spatial Clustering ◽ Series Data ◽ Adjusted Rand Index ◽ Decision Mechanism ◽ Ecg Denoising ◽ Unsupervised Deep Learning ◽ Gated Recurrent Units ◽ Time Frames

The main goal of this project was to define and evaluate a new unsupervised deep learning approach that can differentiate between normal and anomalous intervals of signals like the electrical activity of the heart (ECG). Denoising autoencoders based on recurrent neural networks with gated recurrent units were used for the semantic encoding of such time frames. A subsequent cluster analysis conducted in the code space served as the decision mechanism labelling samples as anomalies or normal intervals, respectively. The cluster ensemble method called cluster-based similarity partitioning proved itself well suited for this task when used in combination with density-based spatial clustering of applications with noise. The best performing system reached an adjusted Rand index of 0.11 on real-world ECG signals labelled by medical experts. This corresponds to a precision and recall regarding the detection task of around 0.72. The new general approach outperformed several state-of-the-art outlier recognition methods and can be applied to all kinds of (medical) time series data. It can serve as a basis for more specific detectors that work in an unsupervised fashion or that are partially guided by medical experts.
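The adjusted Rand index reported above compares a clustering with reference labels while correcting for chance agreement. The sketch below implements the standard pair-counting formula with the Python standard library; it is an illustration of the metric, not the authors' evaluation code, and the function name `adjusted_rand_index` is an assumption.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index of two labelings of the same samples."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two samples: trivially identical
    # Pairs of samples placed together in both, in a, and in b.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    together_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    together_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = together_a * together_b / total_pairs
    maximum = (together_a + together_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings use a single cluster
    return (together_both - expected) / (maximum - expected)
```

The index is 1 for identical partitions regardless of how the clusters are named, close to 0 for random agreement, and can be negative, as in `adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])`, which evaluates to -0.5.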

Predicting the failure of railway point machines by using Autoregressive Integrated Moving Average and Autoregressive-Kalman methods

Proceedings of the Institution of Mechanical Engineers Part F Journal of Rail and Rapid Transit ◽ 10.1177/0954409717748790 ◽ 2017 ◽ Vol 232 (6) ◽ pp. 1790-1799

Author(s): Sahand Abbasnejad ◽ Ahmad Mirabadi

Keyword(s): Time Series ◽ Time Series Data ◽ Moving Average ◽ Prediction Method ◽ Test Point ◽ Series Data ◽ Prediction Errors ◽ Autoregressive Integrated Moving Average ◽ Failure State ◽ Signal Processing Methods

In this paper, forecasting methods that use autoregressive integrated moving average (ARIMA) and autoregressive-Kalman (AR-Kalman) models are presented for predicting the failure state of S700K railway point machines. Using the stator current signal and signal processing methods such as the wavelet transform and statistical analysis, the authors acquired time series data of the point machine behavior from a near-failure test point machine. Prediction methods are implemented using the acquired time series data, and the results are compared with the specified failure margin. Furthermore, the proposed ARIMA method is compared with the AR-Kalman prediction method, and the prediction errors are analysed.
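The workflow described here (fit a model to the acquired series, forecast ahead, compare the forecast with a failure margin) can be sketched with a much simpler stand-in. The code below fits a plain AR(1) model by least squares; it has no differencing, moving-average terms, or Kalman filtering, so it is not the ARIMA or AR-Kalman method of the paper. The function names and the assumption that failure corresponds to the forecast rising to the margin are illustrative only.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    xs, ys = series[:-1], series[1:]
    if len(xs) < 2:
        raise ValueError("need at least three observations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("series is constant; the AR(1) slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    phi = sxy / sxx
    return mean_y - phi * mean_x, phi


def forecast(series, steps):
    """Iterate the fitted AR(1) recursion `steps` times past the last value."""
    c, phi = fit_ar1(series)
    out, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        out.append(last)
    return out


def steps_until_margin(series, margin, max_steps=100):
    """First forecast step (1-based) that reaches the margin, or None."""
    for step, value in enumerate(forecast(series, max_steps), start=1):
        if value >= margin:
            return step
    return None
```

With the toy series `[1, 2, 4, 8, 16]`, the fit recovers a slope of 2 and `steps_until_margin(series, 100)` returns 3, because the forecasts are 32, 64, and 128.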

FIKWaste: A Waste Generation Dataset from Three Restaurant Kitchens in Portugal

Data ◽ 10.3390/data6030025 ◽ 2021 ◽ Vol 6 (3) ◽ pp. 25

Author(s): Lucas Pereira ◽ Vitor Aguiar ◽ Fábio Vasconcelos

Keyword(s): Artificial Intelligence ◽ Time Series ◽ Big Data ◽ Time Series Data ◽ Series Data ◽ Inorganic Glass ◽ Waste Generation ◽ Consecutive Period ◽ Technical Details ◽ Public Datasets

In the era of big data and artificial intelligence, public datasets are becoming increasingly important for researchers to build and evaluate their models. This paper presents the FIKWaste dataset, which contains time series data for the volume of waste produced in three restaurant kitchens in Portugal. Organic (undifferentiated) and inorganic (glass, paper, and plastic) waste bins were monitored for a consecutive period of four weeks. In addition to the time series measurements, the FIKWaste dataset contains labels for waste disposal events, i.e., when the waste bins are emptied, and technical and non-technical details of the monitored kitchens.

Graphical Exploratory Data Analysis for Categorical Longitudinal and Time Series Data

PsycEXTRA Dataset ◽ 10.1037/e634372013-001 ◽ 2013

Author(s): Stephen J. Tueller ◽ Richard A. Van Dorn ◽ Georgiy Bobashev ◽ Barry Eggleston

Keyword(s): Time Series ◽ Data Analysis ◽ Exploratory Data Analysis ◽ Time Series Data ◽ Series Data ◽ Exploratory Data

Faktor-Faktor Yang Mempengaruhi Nilai Tukar Dollar Amerika Serikat Terhadap Rupiah Tahun 2000–2013 (Factors Affecting the Exchange Rate of the United States Dollar Against the Rupiah, 2000–2013)

Jurnal Riset Manajemen Sekolah Tinggi Ilmu Ekonomi Widya Wiwaha Program Magister Manajemen ◽ 10.32477/jrm.v1i2.72 ◽ 2017 ◽ Vol 1 (2) ◽ pp. 177-191

Author(s): Rizki Rahma Kusumadewi ◽ Wahyu Widayat

Keyword(s): Time Series ◽ Exchange Rate ◽ Money Supply ◽ Time Series Data ◽ The United States ◽ Economic Conditions ◽ Series Data ◽ Arch Model ◽ United States Dollar ◽ The Exchange Rate

The exchange rate is one tool for measuring a country’s economic conditions. Stable growth in a currency’s value indicates that the country’s economic conditions are relatively good or stable. This study analyzes the factors that affect the exchange rate of the Indonesian Rupiah against the United States Dollar over the period 2000-2013. The study uses secondary time series data comprising exports, imports, inflation, the BI rate, Gross Domestic Product (GDP), and the money supply (M1) on a quarterly basis, from the first quarter of 2000 to the fourth quarter of 2013. A time series regression using ARCH-GARCH, with the ARCH model selected, indicates that the variables that significantly influence the exchange rate are exports, inflation, the central bank rate, and the money supply (M1), whereas imports and GDP have no significant influence.
