Anomaly Detection for Univariate Time Series with Statistics and Deep Learning

Author(s):  
Jian-Bin Kao ◽  
Jehn-Ruey Jiang
2021 ◽  
Vol 11 (15) ◽  
pp. 6698
Author(s):  
Jehn-Ruey Jiang ◽  
Jian-Bin Kao ◽  
Yu-Lin Li

Thanks to the advance of novel technologies, such as sensors and Internet of Things (IoT) technologies, big amounts of data are continuously gathered over time, resulting in a variety of time series. A semi-supervised anomaly detection framework, called Tri-CAD, for univariate time series is proposed in this paper. Based on the Pearson product-moment correlation coefficient and Dickey–Fuller test, time series are first categorized into three classes: (i) periodic, (ii) stationary, and (iii) non-periodic and non-stationary time series. Afterwards, different mechanisms using statistics, wavelet transform, and deep learning autoencoder concepts are applied to different classes of time series for detecting anomalies. The performance of the proposed Tri-CAD framework is evaluated by experiments using three Numenta anomaly benchmark (NAB) datasets. The performance of Tri-CAD is compared with those of related methods, such as STL, SARIMA, LSTM, LSTM with STL, and ADSaS. The comparison results show that Tri-CAD outperforms the others in terms of the precision, recall, and F1-score.


IEEE Access ◽  
2021 ◽  
Vol 9 ◽  
pp. 120043-120065
Author(s):  
Kukjin Choi ◽  
Jihun Yi ◽  
Changhwa Park ◽  
Sungroh Yoon

2019 ◽  
Vol 38 ◽  
pp. 233-240 ◽  
Author(s):  
Mattia Carletti ◽  
Chiara Masiero ◽  
Alessandro Beghi ◽  
Gian Antonio Susto

IEEE Access ◽  
2019 ◽  
Vol 7 ◽  
pp. 1991-2005 ◽  
Author(s):  
Mohsin Munir ◽  
Shoaib Ahmed Siddiqui ◽  
Andreas Dengel ◽  
Sheraz Ahmed

2020 ◽  
Author(s):  
İsmail Sezen ◽  
Alper Unal ◽  
Ali Deniz

<p>Atmospheric pollution is one of the primary problems and high concentration levels are critical for human health and environment. This requires to study causes of unusual high concentration levels which do not conform to the expected behavior of the pollutant but it is not always easy to decide which levels are unusual, especially, when data is big and has complex structure. A visual inspection is subjective in most cases and a proper anomaly detection method should be used. Anomaly detection has been widely used in diverse research areas, but most of them have been developed for certain application domains. It also might not be always a good idea to identify anomalies by using data from near measurement sites because of spatio-temporal complexity of the pollutant. That’s why, it’s required to use a method which estimates anomalies from univariate time series data.</p><p>This work suggests a framework based on STL decomposition and extended isolation forest (EIF), which is a machine learning algorithm, to identify anomalies for univariate time series which has trend, multi-seasonality and seasonal variation. Main advantage of EIF method is that it defines anomalies by a score value.</p><p>In this study, a multi-seasonal STL decomposition has been applied on a univariate PM10 time series to remove trend and seasonal parts but STL is not resourceful to remove seasonal variation from the data. The remainder part still has 24 hours and yearly variation. To remove the variation, hourly and annual inter-quartile ranges (IQR) are calculated and data is standardized by dividing each value to corresponding IQR value. This process ensures removing seasonality in variation and the resulting data is processed by EIF to decide which values are anomaly by an objective criterion.</p>


Sign in / Sign up

Export Citation Format

Share Document