Real-Time Deep Learning-Based Anomaly Detection Approach for Multivariate Data Streams with Apache Flink

2021 ◽  
pp. 39-49
Author(s):  
Tae Wook Ha ◽  
Jung Mo Kang ◽  
Myoung Ho Kim
2018 ◽  
Vol 14 (10) ◽  
pp. 155014771880330 ◽  
Author(s):  
Li Cheng ◽  
Yijie Wang ◽  
Yong Zhou ◽  
Xingkong Ma

Due to the increasing arrival rate and complex relationships of behavior data streams, detecting sequential behavior anomalies efficiently and accurately has become an emerging challenge. However, most of the existing literature simply calculates an anomaly score for each segmented sequence, and little work investigates data stream segmentation and structural relationships in depth. Moreover, existing studies cannot meet efficiency requirements because of the large number of projected subsequences. In this article, we propose EADetection, an efficient and accurate sequential behavior anomaly detection approach over data streams. EADetection uses time interval and fuzzy logic–based correlation to segment the event stream adaptively over a rolling window. Through dynamic projection space–based fast pruning, a large number of repeated patterns are eliminated to improve detection efficiency. Meanwhile, EADetection calculates the anomaly score by top-k pattern–based abnormal scoring built on a directed loop graph–based storage strategy, which ensures detection accuracy. In particular, we design and implement a streaming anomaly detection system based on EADetection to perform real-time detection. Extensive experiments confirm that EADetection achieves real-time detection and improves accuracy: it significantly reduces latency by 36.8% and reduces the false positive rate by 6.4% compared with an existing approach.
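
The time interval–based part of the segmentation step can be illustrated with a minimal sketch. It is not the EADetection algorithm: the fuzzy logic–based correlation, pruning, and scoring are omitted, and the `(timestamp, label)` event format, the `segment_by_gap` name, and the fixed `max_gap` threshold are all assumptions made for illustration.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

# Hypothetical event format: (timestamp in seconds, event label).
Event = Tuple[float, str]


def segment_by_gap(events: Iterable[Event], max_gap: float) -> Iterator[List[Event]]:
    """Split a time-ordered event stream into segments.

    A new segment starts whenever the time between two consecutive
    events exceeds max_gap (a fixed threshold, assumed here).
    """
    segment: List[Event] = []
    last_ts: Optional[float] = None
    for ts, label in events:
        if last_ts is not None and ts - last_ts > max_gap:
            yield segment  # close the current segment at the gap
            segment = []
        segment.append((ts, label))
        last_ts = ts
    if segment:
        yield segment  # emit the trailing segment


stream = [(0.0, "login"), (1.0, "read"), (1.5, "write"),
          (10.0, "login"), (10.2, "delete")]
segments = list(segment_by_gap(stream, max_gap=5.0))
for seg in segments:
    print([label for _, label in seg])
```

Because the function is a generator, it consumes events one at a time and can run over an unbounded stream; only the currently open segment is held in memory.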


2017 ◽  
Vol 8 (3) ◽  
pp. 677-696 ◽  
Author(s):  
Milan Flach ◽  
Fabian Gans ◽  
Alexander Brenning ◽  
Joachim Denzler ◽  
Markus Reichstein ◽  
...  

Abstract. Today, many processes at the Earth's surface are constantly monitored by multiple data streams. These observations have become central to advancing our understanding of vegetation dynamics in response to climate or land use change. Another set of important applications is monitoring effects of extreme climatic events, other disturbances such as fires, or abrupt land transitions. One important methodological question is how to reliably detect anomalies in an automated and generic way within multivariate data streams, which typically vary seasonally and are interconnected across variables. Although many algorithms have been proposed for detecting anomalies in multivariate data, only a few have been investigated in the context of Earth system science applications. In this study, we systematically combine and compare feature extraction and anomaly detection algorithms for detecting anomalous events. Our aim is to identify suitable workflows for automatically detecting anomalous patterns in multivariate Earth system data streams. We rely on artificial data that mimic typical properties and anomalies in multivariate spatiotemporal Earth observations like sudden changes in basic characteristics of time series such as the sample mean, the variance, changes in the cycle amplitude, and trends. This artificial experiment is needed as there is no gold standard for the identification of anomalies in real Earth observations. Our results show that a well-chosen feature extraction step (e.g., subtracting seasonal cycles, or dimensionality reduction) is more important than the choice of a particular anomaly detection algorithm. Nevertheless, we identify three detection algorithms (k-nearest neighbors mean distance, kernel density estimation, a recurrence approach) and their combinations (ensembles) that outperform other multivariate approaches as well as univariate extreme-event detection methods. 
Our results therefore provide an effective workflow to automatically detect anomalies in Earth system science data.
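
The workflow described in this abstract, a feature extraction step followed by a detection algorithm, can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes a fixed, known cycle length (`period`), uses the per-phase mean as the seasonal cycle, and scores each time step by its mean Euclidean distance to its `k` nearest neighbors. The function names and the toy data are hypothetical.

```python
import math
from typing import List, Sequence


def remove_seasonal_cycle(series: Sequence[float], period: int) -> List[float]:
    """Subtract the mean value of each phase of the cycle from the series."""
    sums = [0.0] * period
    counts = [0] * period
    for i, x in enumerate(series):
        sums[i % period] += x
        counts[i % period] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return [x - means[i % period] for i, x in enumerate(series)]


def knn_mean_distance(points: Sequence[Sequence[float]], k: int) -> List[float]:
    """Score each point by the mean distance to its k nearest neighbors."""
    if len(points) < 2:
        raise ValueError("need at least two points")
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        nearest = dists[:k]
        scores.append(sum(nearest) / len(nearest))
    return scores


# Toy data: two seasonal variables (period 4, five cycles) with one
# joint anomaly injected at time step 10.
var_a = [0.0, 1.0, 0.0, -1.0] * 5
var_b = [2.0, 0.0, -2.0, 0.0] * 5
var_a[10] += 5.0
var_b[10] += 5.0

residual_a = remove_seasonal_cycle(var_a, period=4)
residual_b = remove_seasonal_cycle(var_b, period=4)
points = list(zip(residual_a, residual_b))
scores = knn_mean_distance(points, k=3)
most_anomalous = scores.index(max(scores))
print(most_anomalous)
```

The brute-force neighbor search is quadratic in the number of time steps, which is acceptable for a sketch but would need an index structure for long series.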


2016 ◽  
Author(s):  
Milan Flach ◽  
Fabian Gans ◽  
Alexander Brenning ◽  
Joachim Denzler ◽  
Markus Reichstein ◽  
...  

Abstract. Today, many processes at the Earth's surface are constantly monitored by multiple data streams. These observations have become central to advancing our understanding of, e.g., vegetation dynamics in response to climate or land use change. Another set of important applications is monitoring effects of extreme climatic events, other disturbances such as fires, or abrupt land transitions. One important methodological question is how to reliably detect anomalies in an automated and generic way within multivariate data streams, which typically vary seasonally and are interconnected across variables. Although many algorithms have been proposed for detecting anomalies in multivariate data, only a few have been investigated in the context of Earth system science applications. In this study, we systematically combine and compare feature extraction and anomaly detection algorithms for detecting anomalous events. Our aim is to identify suitable workflows for automatically detecting anomalous patterns in multivariate Earth system data streams. We rely on artificial data that mimic typical properties and anomalies in multivariate spatiotemporal Earth observations. This artificial experiment is needed as there is no 'gold standard' for the identification of anomalies in real Earth observations. Our results show that a well-chosen feature extraction step (e.g. subtracting seasonal cycles, or dimensionality reduction) is more important than the choice of a particular anomaly detection algorithm. Nevertheless, we identify three detection algorithms (k-nearest neighbours mean distance, kernel density estimation, a recurrence approach) and their combinations (ensembles) that outperform other multivariate approaches as well as univariate extreme event detection methods. Our results therefore provide an effective workflow to automatically detect anomalies in Earth system science data.
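
Kernel density estimation, one of the detection algorithms named above, can be sketched in a few lines. This is only an illustration under stated assumptions, not the configuration used in the study: it uses an isotropic Gaussian kernel with a hand-picked `bandwidth`, a leave-one-out density estimate, and treats the point with the lowest density as the most anomalous. The function name and the sample points are hypothetical.

```python
import math
from typing import List, Sequence


def kde_density(points: Sequence[Sequence[float]], bandwidth: float) -> List[float]:
    """Leave-one-out Gaussian kernel density estimate at each point.

    A low density means few other observations lie nearby, which is
    read here as a sign of an anomaly.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    dim = len(points[0])
    # Normalization constant of an isotropic Gaussian kernel in `dim` dimensions.
    norm = (n - 1) * (bandwidth * math.sqrt(2.0 * math.pi)) ** dim
    densities = []
    for i, p in enumerate(points):
        total = sum(
            math.exp(-math.dist(p, q) ** 2 / (2.0 * bandwidth ** 2))
            for j, q in enumerate(points)
            if j != i
        )
        densities.append(total / norm)
    return densities


# Four observations near the origin and one far away from them.
points = [(0.0, 0.1), (0.1, 0.0), (-0.1, 0.05), (0.05, -0.1), (3.0, 3.0)]
densities = kde_density(points, bandwidth=0.5)
lowest_density_index = densities.index(min(densities))
print(lowest_density_index)
```

The bandwidth controls how local the estimate is; in practice it would be chosen from the data rather than fixed by hand as it is here.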


2019 ◽  
Vol 15 (6) ◽  
pp. 814-823
Author(s):  
Jakup Fondaj ◽  
Zirije Hasani
