Unsupervised Outlier Detection in Time Series Data

Author(s):  
Z. Ferdousi ◽  
A. Maeda
2021 ◽  
Vol 54 (3) ◽  
pp. 1-33
Author(s):  
Ane Blázquez-García ◽  
Angel Conde ◽  
Usue Mori ◽  
Jose A. Lozano

Recent advances in technology have brought major breakthroughs in data collection, enabling a large amount of data to be gathered over time and thus generating time series. Mining this data has become an important task for researchers and practitioners in the past few years, including the detection of outliers or anomalies that may represent errors or events of interest. This review aims to provide a structured and comprehensive state-of-the-art on unsupervised outlier detection techniques in the context of time series. To this end, a taxonomy is presented based on the main aspects that characterize an outlier detection technique.


2017 ◽  
Vol 20 (2) ◽  
pp. 190-202 ◽  
Author(s):  
Kannan S. ◽  
Somasundaram K.

Purpose: Because daily transactions are large and non-uniform, money laundering detection (MLD) is a time-consuming and difficult process. The main purpose of the proposed auto-regressive (AR) outlier-based MLD (AROMLD) is to reduce the time needed to handle large, non-uniform transactions.
Design/methodology/approach: The AR-based outlier design produces consistent asymptotically distributed results that enhance demand-forecasting abilities. In addition, the inter-quartile range (IQR) formulations proposed in this paper support detailed analysis of time-series data pairs.
Findings: The prediction of high dimensionality and the difficulties in the relationship/difference between the data pairs make time-series mining a complex task. The presence of domain invariance in time-series mining motivates the regressive formulation for outlier detection. The deep analysis of the time-varying process and the demand for forecasting combine the AR and IQR formulations for effective outlier detection.
Research limitations/implications: The present research focuses on detecting outliers in previous financial transactions using the AR model. Predicting the possibility of an outlier in future transactions remains a major issue.
Originality/value: Without prior segmentation, ML detection suffers from dimensionality. In addition, the absence of a boundary separating normal from suspicious transactions induces limitations. The regression formulation overcomes the lack of deep analysis and the time consumption.
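The abstract does not give the AROMLD formulation itself, so the following is only a minimal sketch of the general idea of combining an AR model with IQR fences: fit an AR(1) model, then flag points whose one-step residual falls outside the usual 1.5 × IQR fences. The function name `ar1_iqr_outliers`, the AR order of 1, the fence factor `k`, and the synthetic data are all assumptions made for illustration.

```python
import random
import statistics


def ar1_iqr_outliers(series, k=1.5):
    """Return time indices whose AR(1) residual lies outside the IQR fences.

    Hypothetical helper: AR order 1 and fence factor k are illustrative choices.
    """
    mean = statistics.fmean(series)
    c = [x - mean for x in series]  # mean-centered series
    # Least-squares estimate of the AR(1) coefficient phi.
    num = sum(c[t] * c[t - 1] for t in range(1, len(c)))
    den = sum(v * v for v in c[:-1])
    phi = num / den
    # One-step-ahead residuals; residuals[i] belongs to time index i + 1.
    residuals = [c[t] - phi * c[t - 1] for t in range(1, len(c))]
    q1, _, q3 = statistics.quantiles(residuals, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [t for t, r in enumerate(residuals, start=1) if r < low or r > high]


# Synthetic AR(1) series (assumed data) with one injected spike at index 30.
random.seed(0)
x = [0.0]
for _ in range(199):
    x.append(0.6 * x[-1] + random.gauss(0, 1))
x[30] += 25

flagged_ar = ar1_iqr_outliers(x)
print(flagged_ar)
```

The point immediately after a spike may also be flagged, because its prediction is computed from the spiked value. A large spike also biases the least-squares estimate of the coefficient, which is one reason a robust fit might be preferred in practice.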


2021 ◽  
Vol 27 (1) ◽  
pp. 55-60
Author(s):  
Sampson Twumasi-Ankrah ◽  
Simon Kojo Appiah ◽  
Doris Arthur ◽  
Wilhemina Adoma Pels ◽  
Jonathan Kwaku Afriyie ◽  
...  

This study examined the performance of six outlier detection techniques using a non-stationary time series dataset. Two key issues were of interest. The first was which method could correctly detect the number of outliers introduced into the dataset, while the second was which technique would over-detect the number of outliers introduced into the dataset, when a dataset contains only extreme maxima values, extreme minima values, or both. The air passenger dataset was used with different outliers or extreme values ranging from 1 to 10 and 40. The six outlier detection techniques used in this study were Mahalanobis distance, depth-based, robust kernel-based outlier factor (RKOF), generalized dispersion, kth nearest neighbors distance (KNND), and principal component (PC) methods. When detecting extreme maxima, the Mahalanobis and the principal component methods performed better in correctly detecting outliers in the dataset. Also, the Mahalanobis method could identify more outliers than the others, making it the "best" method for the extreme minima category. The kth nearest neighbors distance method was the "best" method for not over-detecting the number of outliers for extreme minima. However, the Mahalanobis distance and the principal component methods were the "best"-performing methods for not over-detecting the number of outliers for the extreme maxima category. Therefore, the Mahalanobis outlier detection technique is recommended for detecting outliers in non-stationary time series data.
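The abstract does not say how the univariate series was turned into multivariate points for the Mahalanobis distance. As one possible illustration only, the sketch below assumes a lag-1 embedding, where each point is the pair (x[t-1], x[t]), and flags pairs whose squared Mahalanobis distance exceeds a chi-square quantile with 2 degrees of freedom. The function name, the embedding, the cutoff `p`, and the synthetic trend-plus-seasonality data are assumptions, not the study's setup.

```python
import math
import statistics


def mahalanobis_lag_outliers(series, p=0.975):
    """Flag time indices t whose lag pair (x[t-1], x[t]) is far from the bulk.

    Hypothetical helper: the lag-1 embedding and the cutoff p are assumptions.
    """
    pairs = [(series[t - 1], series[t]) for t in range(1, len(series))]
    a = [u for u, _ in pairs]
    b = [v for _, v in pairs]
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    # Entries of the 2x2 sample covariance matrix.
    saa = statistics.variance(a)
    sbb = statistics.variance(b)
    sab = sum((u - ma) * (v - mb) for u, v in pairs) / (len(pairs) - 1)
    det = saa * sbb - sab * sab
    # Chi-square quantile for 2 degrees of freedom has the closed form -2 ln(1 - p).
    threshold = -2.0 * math.log(1.0 - p)
    flagged = []
    for t, (u, v) in enumerate(pairs, start=1):
        du, dv = u - ma, v - mb
        # Squared Mahalanobis distance using the explicit 2x2 inverse.
        d2 = (sbb * du * du - 2.0 * sab * du * dv + saa * dv * dv) / det
        if d2 > threshold:
            flagged.append(t)
    return flagged


# Synthetic non-stationary series (assumed data): linear trend plus a
# 12-step seasonal cycle, with one extreme maximum injected at index 40.
series = [t + 5 * math.sin(2 * math.pi * t / 12) for t in range(60)]
series[40] += 60

flagged_md = mahalanobis_lag_outliers(series)
print(flagged_md)
```

With this embedding, an extreme value appears in two consecutive pairs, so the index following it can be flagged as well. The sample mean and covariance are themselves affected by the outliers, which is a known weakness of the plain (non-robust) Mahalanobis distance.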


2021 ◽  
Author(s):  
Nhung Le Thi ◽  
Benjamin Männel ◽  
Mihaela Jarema ◽  
Gopi Krishna Seemala ◽  
Kosuke Heki ◽  
...  

In data mining, outliers can lead to misleading interpretations of statistical results, particularly in deformation monitoring based on fluctuations and disturbances simulated by numerical models for the analysis of deformations. Therefore, outlier filtering cannot be ignored in data standardization. However, it is not likely that a filtering algorithm is efficient for every data pattern. We investigate five outlier filtering algorithms using MATLAB® (Release 2020a): moving average, moving median, quartiles, Grubbs, and generalized extreme Studentized deviation (GESD) to select the optimal algorithms applied for GNSS time series data. This study is conducted on two types of data used for ionosphere disturbance analysis in the region of the Ring of Fire and crustal deformation monitoring in Germany, one showing seasonal time series patterns and the other presenting the trend models. We apply the simple random sampling method that ensures the principles of unbiased surveying techniques. The optimal algorithm selection is based on the sensitivity of outlier detection and the capability of the central tendency measures. The algorithm robustness is also tested by altering random outliers but maintaining the standard distribution of each dataset. Our results show that the moving median algorithm is the most sensitive for outlier detection because the median is a robust statistic that is not affected by anomalies; it is followed in turn by quartiles, GESD, and Grubbs. The outlier filtering capability of the moving average algorithm is the least efficient, with a percentage of outlier detection below 20% compared to the moving median (corresponding to 95% probability). In deformation analysis, disturbances on numerical models are often the basis for motion assessment, while these anomalies are smoothed by moving median filtering. Hence, the quartiles algorithm can be considered in this case. Overall, the moving median is best suited to filter outliers for seasonal and trend time series data; in particular, for deformation analysis, the optimal solution is applying the quartiles or extending the threshold factor and the sliding window of the moving median.
Keywords: Outlier filtering, Time series, Deformation analysis, Moving median, Quartiles, MATLAB.
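The study used MATLAB, and the abstract does not list the exact settings. The sketch below is a plain Python approximation of a moving median filter, not the authors' code: a point is flagged when it lies more than `threshold` scaled median absolute deviations (MAD) from the median of a centered sliding window. The function name, the window length of 7, the threshold factor of 3, and the synthetic seasonal data are assumptions.

```python
import math
import statistics


def moving_median_outliers(series, window=7, threshold=3.0):
    """Flag indices that deviate strongly from the local median.

    Hypothetical helper: window length and threshold factor are assumptions.
    """
    half = window // 2
    flagged = []
    for i, x in enumerate(series):
        # Centered window, truncated at the ends of the series.
        win = series[max(0, i - half): i + half + 1]
        med = statistics.median(win)
        # The factor 1.4826 makes the MAD comparable to a standard deviation
        # for normally distributed data.
        scaled_mad = 1.4826 * statistics.median([abs(v - med) for v in win])
        # Windows with zero spread are skipped rather than flagged.
        if scaled_mad > 0 and abs(x - med) > threshold * scaled_mad:
            flagged.append(i)
    return flagged


# Synthetic seasonal series (assumed data) with one injected outlier.
seasonal = [10 * math.sin(2 * math.pi * t / 12) for t in range(48)]
seasonal[25] += 100

flagged_mm = moving_median_outliers(seasonal)
print(flagged_mm)
```

Raising `threshold` or widening `window` makes the filter less aggressive, which corresponds to the adjustment the abstract suggests for deformation analysis, where genuine disturbances should not be smoothed away.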


2017 ◽  
Vol 2017 ◽  
pp. 1-10
Author(s):  
Zhihua Li ◽  
Ziyuan Li ◽  
Ning Yu ◽  
Steven Wen

Physiological theories indicate that what the human visual system registers most strongly in time series data is its extreme values. Based on this principle, by researching the strategies of extreme-point-based hierarchy segmentation, the hierarchy-segmentation-based data extraction method for time series, and the ideas of locality outlier, a novel outlier detection model and method for time series are proposed. The presented algorithm intuitively assigns an outlier factor to each subsequence in the time series so that visual outlier detection becomes relatively direct. The experimental results demonstrate the average advantage of the developed method over the compared methods and its efficient data reduction capability for time series, which indicates the promising performance of the proposed method and its practical application value.

