G-CNN and double-referenced thresholding for detecting time series anomalies

2020 ◽  
pp. 1-12
Author(s):  
Liping Li ◽  
Zean Tian ◽  
Kenli Li ◽  
Cen Chen

Anomaly detection based on time series data is of great importance in many fields. Time series data produced by man-made systems usually include two parts: monitored and exogenous data, which are the detected object and the control/feedback information, respectively. In this paper, a so-called G-CNN architecture is proposed that combines gated recurrent units (GRU) with a convolutional neural network (CNN), which focus on the monitored and exogenous data, respectively. The most important contribution is the introduction of a complementary double-referenced thresholding approach that processes prediction errors and calculates a threshold, balancing the minimization of false positives against that of false negatives. The outstanding performance and extensive applicability of our model are demonstrated by experiments on two public datasets from aerospace and a new server machine dataset from an Internet company. It is also found that the monitored data is closely associated with the exogenous data, if any, and the interpretability of the G-CNN is discussed by visualizing the intermediate output of the neural networks.
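
As a rough illustration of thresholding on prediction errors, the sketch below smooths absolute errors and flags points above a mean-plus-k-standard-deviations cutoff. This is a generic baseline and not the double-referenced approach of the paper, whose details the abstract does not give; the function names and the parameters `alpha` and `k` are assumptions made for illustration.

```python
from statistics import mean, pstdev


def smooth_errors(errors, alpha=0.3):
    """Exponentially weighted moving average of absolute prediction errors."""
    smoothed = []
    prev = None
    for e in errors:
        e = abs(e)
        prev = e if prev is None else alpha * e + (1 - alpha) * prev
        smoothed.append(prev)
    return smoothed


def flag_anomalies(actual, predicted, k=3.0, alpha=0.3):
    """Return (indices above threshold, threshold).

    The threshold is mean + k * std of the smoothed errors, a common
    generic rule (assumed here, not taken from the paper).
    """
    errors = [a - p for a, p in zip(actual, predicted)]
    smoothed = smooth_errors(errors, alpha)
    threshold = mean(smoothed) + k * pstdev(smoothed)
    flagged = [i for i, s in enumerate(smoothed) if s > threshold]
    return flagged, threshold


if __name__ == "__main__":
    # Hypothetical data: a flat signal with one spike at index 25.
    actual = [1.0] * 50
    actual[25] = 10.0
    predicted = [1.0] * 50
    print(flag_anomalies(actual, predicted))
```

A larger `k` lowers false positives at the cost of more false negatives, which is the trade-off the abstract says the thresholding step has to balance.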

2021 ◽  
Vol 13 (3) ◽  
pp. 331
Author(s):  
Robail Yasrab ◽  
Jincheng Zhang ◽  
Polina Smyth ◽  
Michael P. Pound

Phenotyping involves the quantitative assessment of anatomical, biochemical, and physiological plant traits. Natural plant growth cycles can be extremely slow, hindering the experimental processes of phenotyping. Deep learning offers a great deal of support for automating and addressing key plant phenotyping research issues. Machine learning-based high-throughput phenotyping is a potential solution to the phenotyping bottleneck, promising to accelerate the experimental cycles within phenomic research. This research presents a study of deep networks’ potential to predict plants’ expected growth by generating segmentation masks of root and shoot systems into the future. We adapt an existing generative adversarial predictive network into this new domain. The results show an efficient plant leaf and root segmentation network that provides predictive segmentation of what a leaf and root system will look like at a future time, based on time-series data of plant growth. We present benchmark results on two public datasets of Arabidopsis (A. thaliana) and Brassica rapa (Komatsuna) plants. The experimental results show strong performance and the capability of the proposed methods to match expert annotation. The proposed method is highly adaptable and trainable (via transfer learning/domain adaptation) on different plant species and mutations.


Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4391
Author(s):  
Xue-Bo Jin ◽  
Aiqiang Yang ◽  
Tingli Su ◽  
Jian-Lei Kong ◽  
Yuting Bai

Time-series data arise in many application fields, and their classification is one of the important research directions in time-series data mining. In this paper, univariate time-series data are taken as the research object, and deep learning and broad learning systems (BLSs) are the basic methods used to explore the classification of multi-modal time-series data features. Long short-term memory (LSTM), gated recurrent unit, and bidirectional LSTM networks are used to learn and test the original time-series data; a Gramian angular field and recurrence plot are used to encode time-series data as images, and a BLS is employed for image learning and testing. Finally, Dempster–Shafer evidence theory (D–S evidence theory) is used to fuse the probability outputs of the two categories and obtain the final classification results. In tests on public datasets, the method proposed in this paper obtains competitive results, compensating for the deficiencies of using only time-series data or only images for different types of datasets.
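
To make the image-encoding step concrete, here is a minimal sketch of the summation variant of the Gramian angular field: the series is rescaled to [-1, 1], each value is mapped to an angle via arccos, and entry (i, j) of the image is cos(phi_i + phi_j). The abstract does not say which variant or preprocessing the authors used, so the min-max rescaling and the function name are assumptions.

```python
import math


def gramian_angular_field(series):
    """Encode a 1-D series as a Gramian angular summation field (list of rows)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must contain at least two distinct values")
    # Min-max rescale to [-1, 1] so that arccos is defined.
    scaled = [2 * (x - lo) / (hi - lo) - 1 for x in series]
    # Clamp to guard against tiny floating-point overshoot.
    phi = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]


if __name__ == "__main__":
    for row in gramian_angular_field([0.0, 1.0, 2.0]):
        print([round(v, 3) for v in row])
```

The resulting matrix is symmetric, and its diagonal equals 2x² - 1 for each rescaled value x, so the original series can be recovered up to the rescaling.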


PLoS ONE ◽  
2021 ◽  
Vol 16 (1) ◽  
pp. e0244094
Author(s):  
Chao-Yu Guo ◽  
Tse-Wei Liu ◽  
Yi-Hau Chen

In recent years, machine learning methods have been applied to various prediction scenarios in time-series data. However, some processing procedures such as cross-validation (CV) that rearrange the order of the longitudinal data might ruin the seriality and lead to a potentially biased outcome. Regarding this issue, a recent study investigated how different types of CV methods influence the predictive errors in conventional time-series data. Here, we examine a more complex distributed lag nonlinear model (DLNM), which has been widely used to assess the cumulative impacts of past exposures on the current health outcome. This research extends the DLNM into an artificial neural network (ANN) and investigates how the ANN model reacts to various CV schemes that result in different predictive biases. We also propose a newly designed permutation ratio to evaluate the performance of the CV in the ANN. This ratio mimics the concept of the R-square in conventional statistical regression models. The results show that as the complexity of the ANN increases, the predicted outcome becomes more stable, and the bias shows a decreasing trend. Among the different settings of hyperparameters, the novel strategy, Leave One Block Out Cross-Validation (LOBO-CV), demonstrated much better results, and the lowest mean square error was observed. The hyperparameters of the ANN trained by the LOBO-CV yielded the minimum number of prediction errors. The newly proposed permutation ratio indicates that LOBO-CV can contribute up to 34% of the prediction accuracy.
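
One plausible reading of a leave-one-block-out scheme is sketched below: the series is cut into contiguous blocks, and each block is held out once while the remaining blocks are used for training, so the order within every block is preserved. The abstract does not define LOBO-CV precisely; the equal-sized contiguous blocks and the function name `leave_one_block_out` are assumptions.

```python
def leave_one_block_out(n_samples, n_blocks):
    """Yield (train_indices, test_indices), holding out one contiguous block at a time."""
    if not 1 < n_blocks <= n_samples:
        raise ValueError("need 1 < n_blocks <= n_samples")
    # Spread any remainder over the first blocks so sizes differ by at most one.
    base, extra = divmod(n_samples, n_blocks)
    blocks = []
    start = 0
    for b in range(n_blocks):
        size = base + (1 if b < extra else 0)
        blocks.append(list(range(start, start + size)))
        start += size
    for held_out in range(n_blocks):
        test_idx = blocks[held_out]
        train_idx = [i for b, blk in enumerate(blocks) if b != held_out for i in blk]
        yield train_idx, test_idx


if __name__ == "__main__":
    for train_idx, test_idx in leave_one_block_out(10, 3):
        print("train:", train_idx, "test:", test_idx)
```

Unlike shuffled k-fold splitting, no observation is moved relative to its neighbours within a block, which is the property the abstract is concerned with.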


Author(s):  
Sven Festag ◽  
Cord Spreckelsen

The main goal of this project was to define and evaluate a new unsupervised deep learning approach that can differentiate between normal and anomalous intervals of signals like the electrical activity of the heart (ECG). Denoising autoencoders based on recurrent neural networks with gated recurrent units were used for the semantic encoding of such time frames. A subsequent cluster analysis conducted in the code space served as the decision mechanism labelling samples as anomalies or normal intervals, respectively. The cluster ensemble method called cluster-based similarity partitioning proved itself well suited for this task when used in combination with density-based spatial clustering of applications with noise. The best performing system reached an adjusted Rand index of 0.11 on real-world ECG signals labelled by medical experts. This corresponds to a precision and recall regarding the detection task of around 0.72. The new general approach outperformed several state-of-the-art outlier recognition methods and can be applied to all kinds of (medical) time series data. It can serve as a basis for more specific detectors that work in an unsupervised fashion or that are partially guided by medical experts.


Author(s):  
Sahand Abbasnejad ◽  
Ahmad Mirabadi

In this paper, forecasting methods that use autoregressive integrated moving average (ARIMA) and autoregressive-Kalman (AR-Kalman) models are presented for the prediction of the failure state of S700K railway point machines. Using signal processing methods such as wavelet transform and statistical analysis together with the stator current signal, the authors have acquired time series data of the point machine behavior from a near-failure test point machine. Prediction methods are implemented using the acquired time series data, and the results are compared with the specified failure margin. Furthermore, the proposed ARIMA method used in this study is compared with the AR-Kalman prediction method, and prediction errors are analysed.
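
The idea of forecasting a degradation signal and comparing it with a failure margin can be sketched with a much simpler stand-in model: a first-order autoregression fitted by least squares. This is not the ARIMA or AR-Kalman method of the paper; the function names, the `horizon` parameter, and the sample values are hypothetical.

```python
def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    if n < 2:
        raise ValueError("need at least three observations")
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("series is constant; AR(1) slope is undefined")
    phi = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my - phi * mx, phi


def steps_until_margin(series, margin, horizon=100):
    """Iterate the fitted AR(1) forward; return the first step reaching the margin, or None."""
    c, phi = fit_ar1(series)
    value = series[-1]
    for step in range(1, horizon + 1):
        value = c + phi * value
        if value >= margin:
            return step
    return None


if __name__ == "__main__":
    # Hypothetical degradation indicator rising by 0.5 per observation.
    history = [1.0, 1.5, 2.0, 2.5, 3.0]
    print(steps_until_margin(history, margin=5.0))
```

A full ARIMA model would add differencing and moving-average terms, and a Kalman filter would update the estimate as new measurements arrive; the comparison against a fixed margin stays the same.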


Data ◽  
2021 ◽  
Vol 6 (3) ◽  
pp. 25
Author(s):  
Lucas Pereira ◽  
Vitor Aguiar ◽  
Fábio Vasconcelos

In the era of big data and artificial intelligence, public datasets are becoming increasingly important for researchers to build and evaluate their models. This paper presents the FIKWaste dataset, which contains time series data for the volume of waste produced in three restaurant kitchens in Portugal. Organic (undifferentiated) and inorganic (glass, paper, and plastic) waste bins were monitored for a consecutive period of four weeks. In addition to the time series measurements, the FIKWaste dataset contains labels for waste disposal events, i.e., when the waste bins are emptied, and technical and non-technical details of the monitored kitchens.


2013 ◽  
Author(s):  
Stephen J. Tueller ◽  
Richard A. Van Dorn ◽  
Georgiy Bobashev ◽  
Barry Eggleston

Author(s):  
Rizki Rahma Kusumadewi ◽  
Wahyu Widayat

The exchange rate is one tool for measuring a country’s economic condition. Stable growth in a currency’s value indicates that the country’s economic condition is relatively good or stable. This study aims to analyze the factors that affect the exchange rate of the Indonesian Rupiah against the United States Dollar in the period 2000-2013. The data used in this study are secondary time series data, consisting of exports, imports, inflation, the BI rate, Gross Domestic Product (GDP), and the money supply (M1) on a quarterly basis, from the first quarter of 2000 to the fourth quarter of 2013. The time series regression uses ARCH-GARCH, and the selected ARCH model indicates that the variables that significantly influence the exchange rate are exports, inflation, the central bank rate, and the money supply (M1), whereas imports and GDP showed no influence.


2016 ◽  
Vol 136 (3) ◽  
pp. 363-372
Author(s):  
Takaaki Nakamura ◽  
Makoto Imamura ◽  
Masashi Tatedoko ◽  
Norio Hirai
