A study of the effect of alternative similarity measures on the performance of graph-based anomaly detection algorithms

Abstract. Anomaly detection is challenging, especially for large datasets in high dimensions. Here, we explore a general anomaly detection framework based on dimensionality reduction and unsupervised clustering. DRAMA is released as a general Python package that implements this framework with a wide range of built-in options. The approach identifies the primary prototypes in the data; anomalies are then detected by their large distances from the prototypes, either in the latent space or in the original, high-dimensional space. DRAMA is tested on a wide variety of simulated and real datasets, in up to 3000 dimensions, and is found to be robust and highly competitive with commonly used anomaly detection algorithms, especially in high dimensions. The flexibility of the DRAMA framework allows for significant optimization once some examples of anomalies are available, making it ideal for online anomaly detection, active learning, and highly unbalanced datasets. DRAMA also naturally provides clustering of outliers for subsequent analysis.
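The prototype-distance idea in this abstract can be illustrated with a minimal sketch. It is not the DRAMA package or its API: the dimensionality-reduction step is omitted, plain k-means stands in for the clustering stage, and the names `nearest`, `fit_prototypes`, and `anomaly_scores` are hypothetical helpers introduced only for this example.

```python
import math


def nearest(point, prototypes):
    """Return (index, distance) of the prototype closest to `point`."""
    distances = [math.dist(point, p) for p in prototypes]
    index = min(range(len(prototypes)), key=distances.__getitem__)
    return index, distances[index]


def fit_prototypes(data, k, iterations=20):
    """Plain k-means; seeds are evenly spaced rows so the run is deterministic."""
    if not 1 <= k <= len(data):
        raise ValueError("k must be between 1 and len(data)")
    step = len(data) // k
    prototypes = [tuple(data[i * step]) for i in range(k)]
    for _ in range(iterations):
        members = [[] for _ in range(k)]
        for point in data:
            members[nearest(point, prototypes)[0]].append(point)
        # Move each prototype to the mean of its members (keep it if empty).
        prototypes = [
            tuple(sum(axis) / len(group) for axis in zip(*group)) if group else old
            for group, old in zip(members, prototypes)
        ]
    return prototypes


def anomaly_scores(data, prototypes):
    """Score each point by its distance to the nearest prototype."""
    return [nearest(point, prototypes)[1] for point in data]


# Toy data (an assumption for illustration): two tight groups and one
# planted outlier in the last row.
data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11),
        (5, 20)]
prototypes = fit_prototypes(data, k=2)
scores = anomaly_scores(data, prototypes)
print(scores.index(max(scores)))  # index of the highest-scoring point
```

On this toy data the planted outlier receives the largest score. In a setting closer to the abstract, the same scoring would be applied after projecting the data into a latent space.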

Intrusion Detection based on Sequential Information preserving Log Embedding Methods and Anomaly Detection Algorithms

IEEE Access ◽

10.1109/access.2021.3071763 ◽

2021 ◽

pp. 1-1

Author(s):

Czangyeob Kim ◽

Myeongjun Jang ◽

Seungwan Seo ◽

Kyeongchan Park ◽

Pilsung Kang

Keyword(s):

Intrusion Detection ◽

Anomaly Detection ◽

Detection Algorithms ◽

Sequential Information ◽

Embedding Methods

Evaluation of Anomaly Detection Algorithms for the Real-World Applications

2020 25th International Conference on Pattern Recognition (ICPR) ◽

10.1109/icpr48806.2021.9413265 ◽

2021 ◽

Author(s):

Marija Ivanovska ◽

Janez Pers ◽

Domen Tabernik ◽

Danijel Skocaj

Keyword(s):

Anomaly Detection ◽

Real World ◽

The Real ◽

Detection Algorithms ◽

Real World Applications

Unsupervised Anomaly Detection with Distillated Teacher-Student Network Ensemble

Entropy ◽

10.3390/e23020201 ◽

2021 ◽

Vol 23 (2) ◽

pp. 201

Author(s):

Qinfeng Xiao ◽

Jing Wang ◽

Youfang Lin ◽

Wenbo Gongsa ◽

Ganghui Hu ◽

...

Keyword(s):

Anomaly Detection ◽

Multivariate Data ◽

Failure Detection ◽

Superior Performance ◽

Detection Algorithms ◽

Teacher Student ◽

Model Complex ◽

Unsupervised Anomaly Detection ◽

Real World Datasets ◽

Complex Features

We address the problem of unsupervised anomaly detection for multivariate data. Traditional machine learning based anomaly detection algorithms rely on specific assumptions about normal patterns and fail to model complex feature interactions and relations. Recent deep learning based methods are promising for extracting representations from complex features. These methods train an auxiliary task, e.g., reconstruction or prediction, on normal samples. They further assume that the model performs poorly on the auxiliary task for anomalies, since anomalies are never seen during model optimization. However, this assumption does not always hold in practice. Deep models may also perform the auxiliary task well on anomalous samples, so that anomalies go undetected. To effectively detect anomalies in multivariate data, this paper introduces a teacher-student distillation based framework, Distillated Teacher-Student Network Ensemble (DTSNE). The teacher-student distillation paradigm is able to deal with high-dimensional complex features. In addition, an ensemble of student networks is better able to avoid generalizing the auxiliary task performance to anomalous samples. To validate the effectiveness of our model, we conduct extensive experiments on real-world datasets. Experimental results show superior performance of DTSNE over competing methods. Analysis and discussion of the behavior of our model are also provided in the experiment section.

Hybrid DBN monitoring and anomaly detection algorithms for on-line SHM

2015 Annual Reliability and Maintainability Symposium (RAMS) ◽

10.1109/rams.2015.7105184 ◽

2015 ◽

Cited By ~ 5

Author(s):

Chonlagarn Iamsumang ◽

Ali Mosleh ◽

Mohammad Modarres

Keyword(s):

Anomaly Detection ◽

Detection Algorithms ◽

On Line

Quantitative comparison of unsupervised anomaly detection algorithms for intrusion detection

Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing - SAC '19 ◽

10.1145/3297280.3297314 ◽

2019 ◽

Cited By ~ 5

Author(s):

Filipe Falcão ◽

Tommaso Zoppi ◽

Caio Barbosa Viera Silva ◽

Anderson Santos ◽

Baldoino Fonseca ◽

...

Keyword(s):

Intrusion Detection ◽

Anomaly Detection ◽

Quantitative Comparison ◽

Detection Algorithms ◽

Unsupervised Anomaly Detection

Multivariate anomaly detection for Earth observations: a comparison of algorithms and feature extraction techniques

Earth System Dynamics ◽

10.5194/esd-8-677-2017 ◽

2017 ◽

Vol 8 (3) ◽

pp. 677-696 ◽

Cited By ~ 12

Author(s):

Milan Flach ◽

Fabian Gans ◽

Alexander Brenning ◽

Joachim Denzler ◽

Markus Reichstein ◽

...

Keyword(s):

Feature Extraction ◽

Anomaly Detection ◽

Data Streams ◽

Multivariate Data ◽

Earth System ◽

Earth System Science ◽

Sample Mean ◽

System Science ◽

Detection Algorithms ◽

Earth Observations

Abstract. Today, many processes at the Earth's surface are constantly monitored by multiple data streams. These observations have become central to advancing our understanding of vegetation dynamics in response to climate or land use change. Another set of important applications is monitoring effects of extreme climatic events, other disturbances such as fires, or abrupt land transitions. One important methodological question is how to reliably detect anomalies in an automated and generic way within multivariate data streams, which typically vary seasonally and are interconnected across variables. Although many algorithms have been proposed for detecting anomalies in multivariate data, only a few have been investigated in the context of Earth system science applications. In this study, we systematically combine and compare feature extraction and anomaly detection algorithms for detecting anomalous events. Our aim is to identify suitable workflows for automatically detecting anomalous patterns in multivariate Earth system data streams. We rely on artificial data that mimic typical properties and anomalies in multivariate spatiotemporal Earth observations like sudden changes in basic characteristics of time series such as the sample mean, the variance, changes in the cycle amplitude, and trends. This artificial experiment is needed as there is no gold standard for the identification of anomalies in real Earth observations. Our results show that a well-chosen feature extraction step (e.g., subtracting seasonal cycles, or dimensionality reduction) is more important than the choice of a particular anomaly detection algorithm. Nevertheless, we identify three detection algorithms (k-nearest neighbors mean distance, kernel density estimation, a recurrence approach) and their combinations (ensembles) that outperform other multivariate approaches as well as univariate extreme-event detection methods. Our results therefore provide an effective workflow to automatically detect anomalies in Earth system science data.
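One of the detectors named in this abstract, the k-nearest neighbors mean distance, can be sketched in a few lines. This is a generic illustration, not the authors' implementation: the function name `knn_mean_distance`, the Euclidean metric, and the toy data are assumptions, and no feature extraction step (such as subtracting a seasonal cycle) is applied.

```python
import math


def knn_mean_distance(points, k):
    """Score each point by the mean Euclidean distance to its k nearest neighbours."""
    if not 1 <= k < len(points):
        raise ValueError("k must be at least 1 and smaller than the number of points")
    scores = []
    for i, x in enumerate(points):
        # Distances from x to every other point, smallest first.
        distances = sorted(math.dist(x, y) for j, y in enumerate(points) if j != i)
        scores.append(sum(distances[:k]) / k)
    return scores


# Toy multivariate observations (hypothetical): four similar rows and one
# row that lies far from the rest.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (3.0, 3.0)]
knn_scores = knn_mean_distance(points, k=2)
print(knn_scores.index(max(knn_scores)))  # index of the most isolated row
```

Larger scores mark observations that sit far from their neighbours. This brute-force version compares every pair of points, so it is only suitable for small samples.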

Multivariate Anomaly Detection for Earth Observations: A Comparison of Algorithms and Feature Extraction Techniques

10.5194/esd-2016-51 ◽

2016 ◽

Cited By ~ 1

Author(s):

Milan Flach ◽

Fabian Gans ◽

Alexander Brenning ◽

Joachim Denzler ◽

Markus Reichstein ◽

...

Keyword(s):

Feature Extraction ◽

Anomaly Detection ◽

Data Streams ◽

Multivariate Data ◽

Detection Methods ◽

Earth System ◽

Earth System Science ◽

System Science ◽

Detection Algorithms ◽

Earth Observations

Abstract. Today, many processes at the Earth's surface are constantly monitored by multiple data streams. These observations have become central to advancing our understanding of, e.g., vegetation dynamics in response to climate or land use change. Another set of important applications is monitoring effects of climatic extreme events, other disturbances such as fires, or abrupt land transitions. One important methodological question is how to reliably detect anomalies in an automated and generic way within multivariate data streams, which typically vary seasonally and are interconnected across variables. Although many algorithms have been proposed for detecting anomalies in multivariate data, only a few have been investigated in the context of Earth system science applications. In this study, we systematically combine and compare feature extraction and anomaly detection algorithms for detecting anomalous events. Our aim is to identify suitable workflows for automatically detecting anomalous patterns in multivariate Earth system data streams. We rely on artificial data that mimic typical properties and anomalies in multivariate spatiotemporal Earth observations. This artificial experiment is needed as there is no 'gold standard' for the identification of anomalies in real Earth observations. Our results show that a well-chosen feature extraction step (e.g., subtracting seasonal cycles, or dimensionality reduction) is more important than the choice of a particular anomaly detection algorithm. Nevertheless, we identify 3 detection algorithms (k-nearest neighbours mean distance, kernel density estimation, a recurrence approach) and their combinations (ensembles) that outperform other multivariate approaches as well as univariate extreme-event detection methods. Our results therefore provide an effective workflow to automatically detect anomalies in Earth system science data.

Evaluation schemes for video and image anomaly detection algorithms

10.1117/12.2224667 ◽

2016 ◽

Author(s):

Shibin Parameswaran ◽

Josh Harguess ◽

Christopher Barngrover ◽

Scott Shafer ◽

Michael Reese

Keyword(s):

Anomaly Detection ◽

Detection Algorithms

Heuristic algorithms for best match graph editing

Algorithms for Molecular Biology ◽

10.1186/s13015-021-00196-3 ◽

2021 ◽

Vol 16 (1) ◽

Author(s):

David Schaller ◽

Manuela Geiß ◽

Marc Hellmuth ◽

Peter F. Stadler

Keyword(s):

Heuristic Algorithms ◽

Sequence Data ◽

Similarity Measures ◽

Set Partitioning ◽

Attractive Alternative ◽

Biological Sequence ◽

Detection Algorithms ◽

Empirical Estimates ◽

Mathematical Phylogenetics ◽

Multiple Species

Abstract.

Background: Best match graphs (BMGs) are a class of colored digraphs that naturally appear in mathematical phylogenetics as a representation of the pairwise most closely related genes among multiple species. An arc connects a gene x with a gene y from another species (vertex color) Y whenever it is one of the phylogenetically closest relatives of x. BMGs can be approximated with the help of similarity measures between gene sequences, albeit not without errors. Empirical estimates thus will usually violate the theoretical properties of BMGs. The corresponding graph editing problem can be used to guide error correction for best match data. Since the arc set modification problems for BMGs are NP-complete, efficient heuristics are needed if BMGs are to be used for the practical analysis of biological sequence data.

Results: Since BMGs have a characterization in terms of consistency of a certain set of rooted triples (binary trees on three vertices) defined on the set of genes, we consider heuristics that operate on triple sets. As an alternative, we show that there is a close connection to a set partitioning problem that leads to a class of top-down recursive algorithms that are similar to Aho’s supertree algorithm and give rise to BMG editing algorithms that are consistent in the sense that they leave BMGs invariant. Extensive benchmarking shows that community detection algorithms for the partitioning steps perform best for BMG editing.

Conclusion: Noisy BMG data can be corrected with sufficient accuracy and efficiency to make BMGs an attractive alternative to classical phylogenetic methods.
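The similarity-based approximation mentioned in the Background can be sketched as follows. This only builds an estimated best match arc set from pairwise similarity scores; it does not implement the paper's editing heuristics. The function name `best_match_arcs`, the gene labels, and the similarity values are hypothetical.

```python
def best_match_arcs(species_of, similarity):
    """Estimate best match arcs from pairwise similarity scores.

    species_of: dict mapping gene -> species label (the vertex color).
    similarity: dict mapping frozenset({x, y}) -> score (higher = closer).
    Returns the set of arcs (x, y) such that y is a most similar gene
    to x among all genes of y's species, for every species other than x's.
    """
    arcs = set()
    for x, species_x in species_of.items():
        # Group candidate genes by species, skipping x's own species.
        candidates = {}
        for y, species_y in species_of.items():
            if species_y != species_x:
                candidates.setdefault(species_y, []).append(y)
        for genes in candidates.values():
            best = max(similarity[frozenset((x, y))] for y in genes)
            # Ties are kept: every gene reaching the best score gets an arc.
            arcs.update((x, y) for y in genes
                        if similarity[frozenset((x, y))] == best)
    return arcs


# Hypothetical toy input: one gene in species A, two in B, one in C.
species_of = {"a1": "A", "b1": "B", "b2": "B", "c1": "C"}
similarity = {
    frozenset(("a1", "b1")): 0.9,
    frozenset(("a1", "b2")): 0.4,
    frozenset(("a1", "c1")): 0.7,
    frozenset(("b1", "c1")): 0.8,
    frozenset(("b2", "c1")): 0.3,
}
arcs = best_match_arcs(species_of, similarity)
print(sorted(arcs))
```

The relation is not symmetric: in this toy input `b2` points to `a1`, but `a1` points only to `b1` within species B. Because measured similarities are noisy, an arc set estimated this way may not be a valid BMG, which is the situation the editing problem addresses.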
