Improving Univariate Time Series Anomaly Detection Through Automatic Algorithm Selection and Human-in-the-Loop False-Positive Removement

Author(s):  
Cynthia Freeman ◽  
Ian Beaver ◽  
Abdullah Mueen

The existence of a time series anomaly detection method that performs well for all domains is a myth. Given a massive library of available methods, how can one select the best method for a given application? An extensive evaluation of every anomaly detection method is not feasible. Many existing anomaly detection systems also lack an avenue for human feedback, which is essential given how subjective the notion of an anomaly is. We present a technique for improving univariate time series anomaly detection through automatic algorithm selection and human-in-the-loop false-positive removement. These determinations were made by extensively experimenting with over 30 pre-annotated time series from the open-source Numenta Anomaly Benchmark repository. Once the highest performing anomaly detection methods are selected based on the characteristics of the time series, humans can annotate the predicted outliers, and these annotations are used to tune anomaly scores via subsequence similarity search and improve the selected methods for their data. This increases evaluation scores and reduces the need for annotation by 70% on predicted anomalies where annotation is used to improve F-scores.
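The idea of tuning anomaly scores with subsequence similarity search can be illustrated with a minimal sketch. This is not the authors' implementation: the z-normalized Euclidean distance, the brute-force scan, and every name here (`suppress_similar`, `max_dist`, `factor`) are assumptions chosen for illustration. Once a human marks one predicted anomaly as a false positive, the score at every position whose surrounding window looks like the marked one is scaled down.

```python
import math

def znorm(window):
    """Z-normalize a window; constant windows map to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0:
        return [0.0] * n
    return [(x - mean) / std for x in window]

def znorm_distance(a, b):
    """Euclidean distance between the z-normalized versions of a and b."""
    za, zb = znorm(a), znorm(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(za, zb)))

def suppress_similar(series, scores, false_positive_idx, m, max_dist, factor=0.0):
    """Scale down scores where the length-m window around a position
    resembles the window around a human-labelled false positive.
    Assumes len(series) >= m. Hypothetical helper, not from the paper."""
    half = m // 2

    def window_at(i):
        # Clamp the window so it always lies inside the series.
        start = min(max(i - half, 0), len(series) - m)
        return series[start:start + m]

    query = window_at(false_positive_idx)
    tuned = list(scores)
    for i in range(len(series)):
        if znorm_distance(window_at(i), query) <= max_dist:
            tuned[i] = scores[i] * factor
    return tuned
```

With this sketch, one annotation is propagated to every look-alike prediction, which is the sense in which such a search could cut the number of annotations a human has to supply. A production system would more likely use an optimized similarity search than this quadratic scan.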

2021 ◽  
Vol 72 ◽  
pp. 849-899
Author(s):  
Cynthia Freeman ◽  
Jonathan Merriman ◽  
Ian Beaver ◽  
Abdullah Mueen

The existence of an anomaly detection method that is optimal for all domains is a myth. Thus, there exists a plethora of anomaly detection methods which increases every year for a wide variety of domains. But a strength can also be a weakness; given this massive library of methods, how can one select the best method for their application? Current literature is focused on creating new anomaly detection methods or large frameworks for experimenting with multiple methods at the same time. However, and especially as the literature continues to expand, an extensive evaluation of every anomaly detection method is simply not feasible. To reduce this evaluation burden, we present guidelines to intelligently choose the optimal anomaly detection methods based on the characteristics the time series displays such as seasonality, trend, level change concept drift, and missing time steps. We provide a comprehensive experimental validation and survey of twelve anomaly detection methods over different time series characteristics to form guidelines based on several metrics: the AUC (Area Under the Curve), windowed F-score, and Numenta Anomaly Benchmark (NAB) scoring model. Applying our methodologies can save time and effort by surfacing the most promising anomaly detection methods instead of experimenting extensively with a rapidly expanding library of anomaly detection methods, especially in an online setting.


2016 ◽  
Vol 136 (3) ◽  
pp. 363-372
Author(s):  
Takaaki Nakamura ◽  
Makoto Imamura ◽  
Masashi Tatedoko ◽  
Norio Hirai

2020 ◽  
Vol 39 (4) ◽  
pp. 5243-5252
Author(s):  
Zhen Lei ◽  
Liang Zhu ◽  
Youliang Fang ◽  
Xiaolei Li ◽  
Beizhan Liu

Pattern recognition technology is applied to bridge health monitoring to handle abnormalities in the monitoring data, and detecting them is of great significance. For abnormal data detection, this paper proposes a single-variable pattern anomaly detection method based on KNN distance and a multivariate time series (MTS) anomaly detection method based on the covariance matrix and singular value decomposition. The method first compresses and segments the original data sequence at important points to obtain multiple time subsequences, then calculates the pattern distance between each pair of subsequences according to a time series similarity measure, and finally selects the abnormal patterns with the KNN method. The reliability of the method is verified through experiments. The experimental results show that the 5/7/9/11-nearest neighbors point to a specific number of nodes. The corresponding period in the original time series plot shows that the value of temperature sensor No. 6 stays at 32.5 degrees Celsius for up to one month. The detection algorithm controls the number of MTS subsequences through sliding windows and sliding intervals. The execution time is not large, and although different values of K give different results, most of the most obvious abnormal sequences can be detected. The results provide a reference for the study of anomaly detection in bridge health monitoring data.
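The segment, compare, and rank pipeline described in this abstract can be sketched roughly as follows. The sketch makes two loudly stated substitutions: fixed-length, non-overlapping segments stand in for the paper's important-point segmentation, and plain Euclidean distance stands in for its pattern distance. All function names are hypothetical.

```python
import math

def split_fixed(series, length):
    """Cut the series into consecutive, non-overlapping pieces of equal
    length; a trailing remainder shorter than `length` is dropped.
    (Assumption: replaces the paper's important-point segmentation.)"""
    return [series[i:i + length]
            for i in range(0, len(series) - length + 1, length)]

def euclidean(a, b):
    """Assumed stand-in for the paper's pattern distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_scores(subsequences, k):
    """Score each subsequence by the distance to its k-th nearest
    neighbour among the others. Requires k <= len(subsequences) - 1."""
    scores = []
    for i, s in enumerate(subsequences):
        dists = sorted(euclidean(s, t)
                       for j, t in enumerate(subsequences) if j != i)
        scores.append(dists[k - 1])
    return scores

def most_anomalous(series, length, k):
    """Return the start index of the highest-scoring subsequence
    together with all subsequence scores."""
    subs = split_fixed(series, length)
    scores = knn_scores(subs, k)
    best = max(range(len(subs)), key=scores.__getitem__)
    return best * length, scores
```

A subsequence that has few close neighbours receives a large k-th neighbour distance, so it ranks as the most abnormal pattern. Running the function with several values of `k` mirrors the abstract's comparison across different K.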


2021 ◽  
Vol 2021 ◽  
pp. 1-7
Author(s):  
Xuguang Liu

For anomaly detection in sensor data, traditional algorithms usually focus only on the continuity of single-source data and ignore the spatiotemporal correlation between multisource data, which reduces detection accuracy to a certain extent. In addition, with the rapid growth of sensor data, centralized cloud computing platforms cannot meet the real-time detection needs of large-scale abnormal data. To address these problems, a real-time detection method for abnormal IoT sensor data based on edge computing is proposed. First, sensor data is represented as time series, and the K-nearest neighbor (KNN) algorithm is used to detect outliers and isolated groups of the data stream in the time series. Second, an improved DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm is proposed that considers the spatiotemporal correlation between multisource data. Its parameters can be set according to the sample characteristics in the window, which overcomes the slow convergence caused by using global parameters with large samples, and it makes full use of data correlation to complete anomaly detection. Moreover, this paper proposes a distributed anomaly detection model for sensor data based on edge computing. It performs data processing on computing resources as close to the data source as possible, which improves the overall efficiency of data processing. Finally, simulation results show that the proposed method has higher computational efficiency and detection accuracy than traditional methods and is feasible.

