A Review on Outlier/Anomaly Detection in Time Series Data

2021 ◽  
Vol 54 (3) ◽  
pp. 1-33
Author(s):  
Ane Blázquez-García ◽  
Angel Conde ◽  
Usue Mori ◽  
Jose A. Lozano

Recent advances in technology have brought major breakthroughs in data collection, enabling a large amount of data to be gathered over time and thus generating time series. Mining this data has become an important task for researchers and practitioners in the past few years, including the detection of outliers or anomalies that may represent errors or events of interest. This review aims to provide a structured and comprehensive state-of-the-art on unsupervised outlier detection techniques in the context of time series. To this end, a taxonomy is presented based on the main aspects that characterize an outlier detection technique.

2017 ◽  
Vol 20 (2) ◽  
pp. 190-202 ◽  
Author(s):  
Kannan S. ◽  
Somasundaram K.

Purpose: Because daily transactions are large in volume and non-uniform, money laundering detection (MLD) is a time-consuming and difficult process. The main purpose of the proposed auto-regressive (AR) outlier-based MLD (AROMLD) is to reduce the time needed to handle large, non-uniform transaction sets.
Design/methodology/approach: The AR-based outlier design produces consistent asymptotic distributed results that enhance demand-forecasting abilities. In addition, the inter-quartile range (IQR) formulations proposed in this paper support detailed analysis of time-series data pairs.
Findings: The prediction of high dimensionality and the difficulties in the relationship/difference between the data pairs make time-series mining a complex task. The presence of domain invariance in time-series mining motivates the regressive formulation for outlier detection. The deep analysis of the time-varying process and the demand for forecasting combine the AR and IQR formulations for effective outlier detection.
Research limitations/implications: The present research focuses on detecting outliers in past financial transactions using the AR model. Predicting the possibility of an outlier in future transactions remains a major issue.
Originality/value: Without prior segmentation, ML detection suffers from dimensionality. In addition, the absence of a boundary separating normal from suspicious transactions introduces limitations. The lack of deep analysis and the time consumption are overcome by using the regression formulation.
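
The abstract does not give the AROMLD formulas, so the following is only a minimal sketch of the general idea it describes: fit an AR model to the series, then apply an IQR fence to the one-step-ahead residuals. The AR order (1), the least-squares fit, the fence multiplier `k=1.5`, and the function names `fit_ar1` and `ar_iqr_outliers` are all assumptions made for illustration, not the authors' method.

```python
from statistics import quantiles


def fit_ar1(series):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi)."""
    x_prev, x_next = series[:-1], series[1:]
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((a - mean_prev) * (b - mean_next) for a, b in zip(x_prev, x_next))
    var = sum((a - mean_prev) ** 2 for a in x_prev)
    phi = cov / var if var else 0.0
    c = mean_next - phi * mean_prev
    return c, phi


def ar_iqr_outliers(series, k=1.5):
    """Return indices t whose AR(1) residual lies outside the IQR fence.

    Assumed rule (hypothetical): flag r when r < Q1 - k*IQR or r > Q3 + k*IQR.
    """
    c, phi = fit_ar1(series)
    # residuals[0] belongs to t = 1, because t = 0 has no predecessor.
    residuals = [series[t] - (c + phi * series[t - 1]) for t in range(1, len(series))]
    q1, _, q3 = quantiles(residuals, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [t for t, r in enumerate(residuals, start=1) if r < low or r > high]


if __name__ == "__main__":
    # Illustrative transaction amounts with one injected spike at index 8.
    amounts = [100, 102, 101, 103, 102, 104, 103, 105, 500,
               106, 105, 107, 106, 108, 107, 109]
    print(ar_iqr_outliers(amounts))
```

Because the AR coefficients are fitted on data that includes the spike, the observation right after a spike can also receive an unusual residual; a robust fit would be one way to reduce that effect.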


2021 ◽  
Vol 27 (1) ◽  
pp. 55-60
Author(s):  
Sampson Twumasi-Ankrah ◽  
Simon Kojo Appiah ◽  
Doris Arthur ◽  
Wilhemina Adoma Pels ◽  
Jonathan Kwaku Afriyie ◽  
...  

This study examined the performance of six outlier detection techniques using a non-stationary time series dataset. Two key issues were of interest. Scenario one was to identify the method that could correctly detect the number of outliers introduced into the dataset, while scenario two was to find the technique that would over-detect the number of outliers introduced into the dataset, when a dataset contains only extreme maxima values, extreme minima values or both. The air passenger dataset was used with different numbers of outliers or extreme values, ranging from 1 to 10 and 40. The six outlier detection techniques used in this study were Mahalanobis distance, depth-based, robust kernel-based outlier factor (RKOF), generalized dispersion, Kth nearest neighbors distance (KNND), and principal component (PC) methods. When detecting extreme maxima, the Mahalanobis and the principal component methods performed better in correctly detecting outliers in the dataset. Also, the Mahalanobis method could identify more outliers than the others, making it the "best" method for the extreme minima category. The Kth nearest neighbors distance method was the "best" method for not over-detecting the number of outliers for extreme minima. However, the Mahalanobis distance and the principal component methods were the "best"-performing methods for not over-detecting the number of outliers for the extreme maxima category. Therefore, the Mahalanobis outlier detection technique is recommended for detecting outliers in non-stationary time series data.
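
The abstract does not say how the series was turned into multivariate observations or which cutoff was used, so the following is only a minimal sketch of Mahalanobis-distance outlier flagging on two-dimensional points. The 2-D setting, the chi-square cutoff at `p=0.975`, and the names `mahalanobis_sq` and `mahalanobis_outliers` are assumptions for illustration.

```python
import math


def mahalanobis_sq(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]] with n - 1 denominator.
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] * inverse(S) * [dx dy]^T, written out for the 2x2 case.
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def mahalanobis_outliers(points, p=0.975):
    """Indices whose squared distance exceeds the chi-square(2 df) p-quantile.

    For 2 degrees of freedom the quantile has the closed form -2 * ln(1 - p).
    The cutoff assumes roughly bivariate-normal data (an assumption here).
    """
    cutoff = -2.0 * math.log(1.0 - p)
    return [i for i, d2 in enumerate(mahalanobis_sq(points)) if d2 > cutoff]


if __name__ == "__main__":
    # A small grid of ordinary points plus one distant observation.
    pts = [(i % 5, i // 5) for i in range(20)] + [(50, 50)]
    print(mahalanobis_outliers(pts))
```

For a univariate series, one possible way to obtain 2-D points is to pair each value with its predecessor, `(x[t-1], x[t])`; that pairing is a choice made here, not something stated in the study. The mean and covariance above are not robust, so several outliers together can inflate the covariance and mask one another.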


Author(s):  
Tung Kieu ◽  
Bin Yang ◽  
Chenjuan Guo ◽  
Christian S. Jensen

We propose two solutions to outlier detection in time series based on recurrent autoencoder ensembles. The solutions exploit autoencoders built using sparsely-connected recurrent neural networks (S-RNNs). Such networks make it possible to generate multiple autoencoders with different neural network connection structures. The two solutions are ensemble frameworks, specifically an independent framework and a shared framework, both of which combine multiple S-RNN based autoencoders to enable outlier detection. This ensemble-based approach aims to reduce the effects of some autoencoders being overfitted to outliers, thereby improving overall detection quality. Experiments with two large real-world time series data sets, including univariate and multivariate time series, offer insight into the design properties of the proposed frameworks and demonstrate that the resulting solutions are capable of outperforming both baselines and the state-of-the-art methods.


2018 ◽  
Author(s):  
Rafael G. Vieira ◽  
Marcos A. Leone Filho ◽  
Robinson Semolini

Nowadays, time series data underlies countless research activities. Despite the wide range of techniques to capture and process all this information, issues such as analyzing large amounts of data and detecting unusual behaviors in them still pose a great challenge. In this context, this paper suggests SH-ESD+, a statistical technique that combines the Extreme Studentized Deviate (ESD) test and a decomposition procedure based on Loess to detect anomalies in time series data. The proposed technique employs robust metrics to identify anomalies more accurately, even in the presence of trend and seasonal spikes. Simulation studies are carried out to evaluate the effectiveness of SH-ESD+ using the published Numenta Anomaly Benchmark (NAB) collection. Computational results show that SH-ESD+ performs consistently when compared against state-of-the-art and classic detection techniques.
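
The abstract does not specify the SH-ESD+ procedure, so the following is not that algorithm. It is a deliberately simplified stand-in that shows only the general pattern of removing a seasonal component and then scoring the residuals with robust statistics. Two substitutions are made and should be read as assumptions: per-phase medians replace the Loess-based decomposition, and a fixed robust z-score cutoff (`threshold=3.5`) replaces the ESD test's critical values. The function name `seasonal_mad_anomalies` is hypothetical.

```python
from statistics import median


def seasonal_mad_anomalies(series, period, threshold=3.5):
    """Flag indices whose seasonally adjusted value has a large robust z-score.

    Simplified stand-in (assumption): per-phase medians instead of Loess,
    and a fixed cutoff instead of ESD critical values.
    """
    # Seasonal estimate: the median of all observations sharing a phase.
    seasonal = [median(series[phase::period]) for phase in range(period)]
    residuals = [x - seasonal[t % period] for t, x in enumerate(series)]

    # Robust location and scale: median and median absolute deviation (MAD).
    center = median(residuals)
    mad = median(abs(r - center) for r in residuals)
    if mad == 0:
        return []  # no spread left to score against
    scale = 1.4826 * mad  # makes MAD comparable to a standard deviation
    return [t for t, r in enumerate(residuals)
            if abs(r - center) / scale > threshold]


if __name__ == "__main__":
    pattern = [10, 20, 30, 20]  # illustrative seasonal shape, period 4
    data = [pattern[t % 4] + ((t * 7) % 5 - 2) for t in range(24)]
    data[13] += 40              # injected anomaly
    print(seasonal_mad_anomalies(data, period=4))
```

Median and MAD are used because, unlike the mean and standard deviation, they are barely moved by the anomalies being searched for, which is in the spirit of the "robust metrics" the abstract mentions.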


Author(s):  
Elangovan Ramanujam ◽  
S. Padmavathi

Innovations in and the applicability of time series data mining techniques have significantly increased researchers' interest in the problem of time series classification. Several algorithms have been proposed for this purpose, categorized under shapelet, interval, motif, and whole series-based techniques. Among these, the bag-of-words technique, an extensive application of the text mining approach, performs well due to its simplicity and effectiveness. To extend the efficiency of the bag-of-words technique, this paper proposes a discriminative supervised weighting scheme to identify the characteristic and representative pattern of a class for efficient classification. This paper uses a modified weighted matrix that discriminates between representative and non-representative patterns, which enables interpretability in classification. Experiments have been carried out to compare the performance of the proposed technique with state-of-the-art techniques in terms of accuracy and statistical significance.

