Online Anomaly Detection of Time Series at Scale

Author(s):  
Andrew Mason ◽  
Yifan Zhao ◽  
Hongmei He ◽  
Raymon Gompelman ◽  
Srikanth Mandava
Author(s):  
MRS. M. VIJAYALAKSHMI ◽  
MR. K . JANARDHAN

Global understanding of the sequence anomaly detection problem and how techniques proposed for different domains relate to each other. Our specific contributions are as follows: We identify three distinct formulations of the anomaly detection problem, and review techniques from many disparate and disconnected domains that address each of these formulations. Within each problem formulation, we group techniques into categories based on the nature of the underlying algorithm. For each category, we provide a basic anomaly detection technique, and show how the existing techniques are variants of the basic technique. This approach shows how different techniques within a category are related to or different from each other. Our categorization reveals new variants and combinations that have not been investigated before for anomaly detection. We also provide a discussion of the relative strengths and weaknesses of different techniques. We show how techniques developed for one problem formulation can be adapted to solve a different formulation, thereby providing several novel adaptations to solve the different problem formulations. We highlight the applicability of the techniques that handle discrete sequences to other related areas such as online anomaly detection and time series anomaly detection.


2016 ◽  
Vol 136 (3) ◽  
pp. 363-372
Author(s):  
Takaaki Nakamura ◽  
Makoto Imamura ◽  
Masashi Tatedoko ◽  
Norio Hirai

2020 ◽  
Author(s):  
Bo Zhang ◽  
Hongyu Zhang ◽  
Pablo Moscato

Complex software-intensive systems, especially distributed systems, generate logs for troubleshooting. The logs are text messages recording system events, which can help engineers determine the system's runtime status. This paper proposes a novel approach named ADR (Anomaly Detection by workflow Relations) that employs the matrix nullspace to mine numerical relations from log data. The mined relations can be used for both offline and online anomaly detection and facilitate fault diagnosis. We have evaluated ADR on log data collected from two distributed systems, HDFS (Hadoop Distributed File System) and BGL (IBM Blue Gene/L supercomputer system). ADR successfully mined 87 and 669 numerical relations from the logs and used them to detect anomalies with high precision and recall. For online anomaly detection, ADR employs PSO (Particle Swarm Optimization) to find the optimal sliding window size and achieves fast anomaly detection. The experimental results confirm that ADR is effective for both offline and online anomaly detection.
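The nullspace idea in this abstract can be illustrated with a minimal sketch. It is not the ADR implementation: it assumes logs have already been parsed into an event-count matrix (one row per session, one column per event type), and the function names `mine_relations` and `is_anomalous`, the tolerances, and the toy data are all hypothetical. Any vector v in the nullspace of that matrix satisfies X v = 0, so it encodes a linear relation that holds in every normal session; a new session that violates such a relation is flagged.

```python
import numpy as np

def mine_relations(counts, tol=1e-9):
    """Return a basis of the nullspace of the event-count matrix.

    Each column of the result is one linear relation among event counts
    that holds for every row (session) of `counts`.
    """
    counts = np.asarray(counts, dtype=float)
    # Right singular vectors whose singular values are (near) zero span the nullspace.
    _, s, vt = np.linalg.svd(counts, full_matrices=True)
    rank = int(np.sum(s > tol * max(s.max(), 1.0)))
    return vt[rank:].T  # shape: (n_event_types, n_relations)

def is_anomalous(count_vector, relations, tol=1e-6):
    """Flag a session whose event counts violate any mined relation."""
    residual = np.asarray(count_vector, dtype=float) @ relations
    return bool(np.any(np.abs(residual) > tol))

# Toy data (assumed): columns are counts of "open", "write", "close" events.
# In every normal session the number of opens equals the number of closes.
normal = np.array([[1, 3, 1],
                   [2, 5, 2],
                   [1, 0, 1],
                   [3, 7, 3]])
relations = mine_relations(normal)

print(is_anomalous([2, 4, 2], relations))  # opens == closes
print(is_anomalous([2, 4, 1], relations))  # a close event is missing
```

For online use, the same check would be applied to counts accumulated over a sliding window; the window-size tuning that the abstract attributes to PSO is not covered by this sketch.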


Author(s):  
Cong Gao ◽  
Ping Yang ◽  
Yanping Chen ◽  
Zhongmin Wang ◽  
Yue Wang

With the large-scale deployment of wireless sensor networks, anomaly detection for sensor data is becoming increasingly important in various fields. As a vital form of sensor data, time series exhibit three main types of anomaly: point anomaly, pattern anomaly, and sequence anomaly. In production environments, the analysis of pattern anomalies is the most rewarding. However, the traditional cloud computing processing model struggles with large amounts of widely distributed data. This paper presents an edge-cloud collaboration architecture for pattern anomaly detection of time series. A task migration algorithm is developed to alleviate the problem of backlogged detection tasks at edge nodes. In addition, the detection tasks related to long-term correlation and short-term correlation in time series are allocated to the cloud and the edge nodes, respectively. A multi-dimensional feature representation scheme is devised to conduct efficient dimension reduction. Two key components of the feature representation, trend identification and feature point extraction, are elaborated. Based on the result of feature representation, pattern anomaly detection is performed with an improved kernel density estimation method. Finally, extensive experiments are conducted with synthetic data sets and real-world data sets.
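As a rough illustration of density-based detection, the sketch below uses a plain Gaussian kernel density estimate, not the improved method the abstract refers to. It assumes each pattern has already been reduced to a single scalar feature; the function names, bandwidth, quantile threshold, and sample values are hypothetical choices made for the example. A pattern is flagged when its estimated density falls below a low quantile of the densities of the reference patterns.

```python
import numpy as np

def gaussian_kde_density(train, query, bandwidth):
    """Estimate the density at each query point with a Gaussian kernel."""
    train = np.asarray(train, dtype=float)
    query = np.asarray(query, dtype=float)
    # Pairwise scaled distances between query points and training points.
    z = (query[:, None] - train[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    return kernel.mean(axis=1) / bandwidth

def flag_low_density(train, query, bandwidth=0.5, quantile=0.05):
    """Flag query points whose density is below a quantile of the training densities."""
    threshold = np.quantile(gaussian_kde_density(train, train, bandwidth), quantile)
    return gaussian_kde_density(train, query, bandwidth) < threshold

# Assumed scalar features extracted from normal patterns (e.g. a trend slope).
features = np.array([1.0, 1.1, 0.9, 1.2, 1.0, 0.8, 1.1, 0.95])
flags = flag_low_density(features, np.array([1.05, 5.0]))
print(flags)  # one flag per query pattern; True marks a low-density pattern
```

The quantile threshold is a convenience for the example; with multi-dimensional features the same structure applies, with the scaled distance replaced by a vector norm.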

