A Comparative Evaluation of AutoEncoder-Based Unsupervised Anomaly Detection Methods Applied on Space Payload

The need for robust unsupervised anomaly detection in streaming data is increasing rapidly in the current era of smart devices, where enormous amounts of data are gathered from numerous sensors. These sensors record the internal state of a machine, the external environment, and the interaction of machines with other machines and humans. It is of prime importance to leverage this information to minimize machine downtime, or even avoid it completely through constant monitoring. Since each device generates a different type of streaming data, the anomaly detection technique that performs best usually depends on the data type. For some types of data and use cases, statistical anomaly detection techniques work better, whereas for others, deep-learning-based techniques are preferred. In this paper, we present a novel anomaly detection technique, FuseAD, which takes advantage of both statistical and deep-learning-based approaches by fusing them together in a residual fashion. The results show an increase in area under the curve (AUC) compared to state-of-the-art anomaly detection methods when FuseAD is tested on a publicly available dataset (Yahoo Webscope benchmark). These results suggest that the fusion-based technique can obtain the best of both worlds by combining the strengths of the two approaches and compensating for their weaknesses. We also perform an ablation study to quantify the contribution of the individual components in FuseAD, i.e., the statistical ARIMA model and the deep-learning-based convolutional neural network (CNN) model.
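
The abstract above reports its results as AUC. As a minimal sketch, assuming AUC here means the area under the ROC curve, the value can be computed as the probability that a randomly chosen anomaly receives a higher score than a randomly chosen normal point. The function name `roc_auc` and the toy scores are illustrative and do not come from the paper.

```python
from typing import Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC via pairwise comparison (labels: 1 = anomaly, 0 = normal).

    Each (anomaly, normal) pair counts 1 if the anomaly scores higher,
    0.5 on a tie, and 0 otherwise. This is quadratic in the input size,
    which is fine for illustration.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomaly and one normal point")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical anomaly scores for four points; the last two are anomalies.
    print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect separation of anomalies from normal points.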

egoDetect: Visual Detection and Exploration of Anomaly in Social Communication Network

Sensors ◽

10.3390/s20205895 ◽

2020 ◽

Vol 20 (20) ◽

pp. 5895

Author(s):

Jiansu Pu ◽

Jingwen Zhang ◽

Hui Shao ◽

Tingting Zhang ◽

Yunbo Rao

Keyword(s):

Social Networks ◽

Anomaly Detection ◽

Communication Networks ◽

Social Communication ◽

Detection Method ◽

Detection Methods ◽

Visualization System ◽

Egocentric Network ◽

Unsupervised Anomaly Detection ◽

The Relationship

The development of the Internet has made social communication increasingly important for maintaining relationships between people. However, advertising and fraud are also growing rapidly and seriously affect our daily life, e.g., by causing money and time losses, spam, and privacy problems. Therefore, it is very important to detect anomalies in social networks. However, existing anomaly detection methods cannot guarantee detection accuracy. Moreover, due to the lack of labeled data, the detection results cannot be used directly. In other words, human analysts are still needed in the loop to provide judgment for decision making. To help experts analyze and explore the results of anomaly detection in social networks more objectively and effectively, we propose a novel visualization system, egoDetect, which can detect anomalies in social communication networks efficiently. Based on an unsupervised anomaly detection method, the system can detect anomalies without training and provide an overview quickly. We then explore an ego's topology and the relationship between egos and alters by designing a novel glyph based on the egocentric network. The system also provides rich interactions that let experts quickly navigate to users of interest for further exploration. We use a real call dataset provided by an operator to evaluate our system. The results show that the proposed system is effective for anomaly detection in social networks.

Evaluation of Unsupervised Anomaly Detection Methods in Sentiment Mining

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.i8012.078919 ◽

2019 ◽

Vol 8 (9) ◽

pp. 1080-1085

Keyword(s):

Anomaly Detection ◽

Computation Time ◽

Vital Role ◽

Detection Methods ◽

Research Focus ◽

Comparative Performance ◽

Detection Techniques ◽

Detection Algorithms ◽

Sentiment Mining ◽

Unsupervised Anomaly Detection

Anomaly detection plays a vital role in data preprocessing and in mining outstanding points for marketing, network sensors, fraud detection, intrusion detection, and stock market analysis. Recent studies concentrate more on outlier detection for real-time datasets. Anomaly detection research currently focuses on developing innovative machine learning methods and on improving computation time. Sentiment mining is the process of discovering how people feel about a particular topic. Although many anomaly detection techniques have been proposed, comparative performance evaluations on sentiment mining datasets are lacking. In this study, three popular unsupervised anomaly detection approaches (density-based, statistical, and cluster-based) are evaluated on a movie review sentiment mining dataset. This paper sets a baseline for anomaly detection methods in sentiment mining research. The results show that the density-based method (LOF) is best suited to the movie review sentiment dataset.
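
The density-based method named above, the local outlier factor (LOF), can be illustrated with a small pure-Python sketch. It follows the standard LOF definition (k-distance, reachability distance, local reachability density). The neighborhood size `k` and the toy 2-D points are assumptions chosen for illustration and are unrelated to the movie review dataset used in the study.

```python
import math
from typing import List, Sequence

Point = Sequence[float]


def lof_scores(points: List[Point], k: int = 2) -> List[float]:
    """Local outlier factor of every point (scores near 1 = inlier, well above 1 = outlier)."""
    n = len(points)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of points")
    dist = [[math.dist(p, q) for q in points] for p in points]

    # k-distance and k-neighborhood of each point (ties at the k-distance are included).
    k_dist: List[float] = []
    neighbors: List[List[int]] = []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        kd = dist[i][order[k - 1]]
        k_dist.append(kd)
        neighbors.append([j for j in order if dist[i][j] <= kd])

    # Local reachability density: inverse of the mean reachability distance
    # from the point to its neighbors.
    lrd: List[float] = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        mean_reach = sum(reach) / len(reach)
        lrd.append(1.0 / mean_reach if mean_reach > 0 else math.inf)

    # LOF: mean ratio of the neighbors' density to the point's own density.
    scores: List[float] = []
    for i in range(n):
        if math.isinf(lrd[i]):
            scores.append(1.0)  # duplicate points sit in maximally dense regions
        else:
            scores.append(sum(lrd[j] / lrd[i] for j in neighbors[i]) / len(neighbors[i]))
    return scores


if __name__ == "__main__":
    # Four clustered points and one isolated point (toy data).
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
    for point, score in zip(data, lof_scores(data, k=2)):
        print(point, round(score, 2))
```

On this toy data the four clustered points receive a score of 1 and the isolated point a score well above 1. The implementation is quadratic in the number of points and is meant for reading, not for large datasets.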

The MVTec Anomaly Detection Dataset: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection

International Journal of Computer Vision ◽

10.1007/s11263-020-01400-4 ◽

2021 ◽

Author(s):

Paul Bergmann ◽

Kilian Batzner ◽

Michael Fauser ◽

David Sattlegger ◽

Carsten Steger

Keyword(s):

Computer Vision ◽

Anomaly Detection ◽

Structural Changes ◽

Performance Metrics ◽

Generative Models ◽

Detection Methods ◽

Generative Adversarial Networks ◽

Natural Image ◽

Advantages And Disadvantages ◽

Unsupervised Anomaly Detection

The detection of anomalous structures in natural image data is of utmost importance for numerous tasks in the field of computer vision. The development of methods for unsupervised anomaly detection requires data on which to train and evaluate new approaches and ideas. We introduce the MVTec anomaly detection dataset containing 5354 high-resolution color images of different object and texture categories. It contains normal, i.e., defect-free images intended for training and images with anomalies intended for testing. The anomalies manifest themselves in the form of over 70 different types of defects such as scratches, dents, contaminations, and various structural changes. In addition, we provide pixel-precise ground truth annotations for all anomalies. We conduct a thorough evaluation of current state-of-the-art unsupervised anomaly detection methods based on deep architectures such as convolutional autoencoders, generative adversarial networks, and feature descriptors using pretrained convolutional neural networks, as well as classical computer vision methods. We highlight the advantages and disadvantages of multiple performance metrics as well as threshold estimation techniques. This benchmark indicates that methods that leverage descriptors of pretrained networks outperform all other approaches and deep-learning-based generative models show considerable room for improvement.

Unsupervised modeling anomaly detection in discussion forums posts using global vectors for text representation

Natural Language Engineering ◽

10.1017/s1351324920000066 ◽

2020 ◽

Vol 26 (5) ◽

pp. 551-578

Author(s):

Paweł Cichosz

Keyword(s):

Anomaly Detection ◽

English Language ◽

Learning Task ◽

Detection Methods ◽

Support Vector ◽

Discussion Forums ◽

Text Data ◽

Polish Language ◽

Unsupervised Anomaly Detection ◽

Detection Quality

Anomaly detection can be seen as an unsupervised learning task in which a predictive model created on historical data is used to detect outlying instances in new data. This work addresses a possibly promising but relatively uncommon application of anomaly detection to text data. Two English-language and one Polish-language Internet discussion forums devoted to psychoactive substances obtained from home-grown plants, such as hashish or marijuana, serve as text sources that are both realistic and possibly interesting on their own, due to potential associations with drug-related crime. The utility of two different vector text representations is examined: the simple bag of words representation and a more refined Global Vectors (GloVe) representation, which is an example of the increasingly popular word embedding approach. Both are combined with two unsupervised anomaly detection methods, one based on one-class support vector machines (SVM) and one based on dissimilarity to k-medoids clusters. The GloVe representation is found to be clearly more useful for anomaly detection, permitting better detection quality and ameliorating the curse of dimensionality issues with text clustering. The cluster dissimilarity approach combined with this representation outperforms one-class SVM with respect to detection quality and appears to be a more promising approach to anomaly detection in text data.
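
The cluster dissimilarity idea described above can be sketched as follows: cluster historical document vectors with k-medoids, then score a new document by its distance to the nearest medoid. This is only an illustration under stated assumptions. The abstract does not specify the dissimilarity measure or the clustering details, so cosine distance, the deterministic initialization, and the toy bag-of-words vectors are all choices made here, not the author's.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cosine similarity; vectors with zero norm are treated as maximally distant."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def k_medoids(vectors: List[Vector], k: int, iterations: int = 20) -> List[int]:
    """Simple alternating k-medoids; returns the indices of the medoid vectors."""
    medoids = list(range(k))  # assumption: first k vectors as a deterministic start
    for _ in range(iterations):
        # Assignment step: attach every vector to its nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, v in enumerate(vectors):
            nearest = min(medoids, key=lambda m: cosine_distance(v, vectors[m]))
            clusters[nearest].append(i)
        # Update step: the new medoid minimizes total distance within its cluster.
        new_medoids = []
        for m, members in clusters.items():
            if not members:
                new_medoids.append(m)  # keep the medoid of an empty cluster
                continue
            new_medoids.append(min(
                members,
                key=lambda c: sum(cosine_distance(vectors[c], vectors[j]) for j in members),
            ))
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return medoids


def anomaly_scores(new: List[Vector], train: List[Vector], medoids: List[int]) -> List[float]:
    """Score each new vector by its distance to the nearest medoid (higher = more anomalous)."""
    return [min(cosine_distance(v, train[m]) for m in medoids) for v in new]


if __name__ == "__main__":
    # Toy bag-of-words counts over a five-word vocabulary, two topics.
    train = [[3, 1, 0, 0, 0], [2, 1, 0, 0, 0], [0, 0, 2, 3, 0], [0, 0, 1, 2, 0]]
    medoids = k_medoids(train, k=2)
    # First document resembles topic one; second uses only an unseen word.
    print(anomaly_scores([[3, 1, 0, 0, 0], [0, 0, 0, 0, 4]], train, medoids))
```

The same scoring function would apply unchanged to dense embedding vectors such as averaged GloVe vectors, since it only relies on a pairwise distance.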

Transfer Learning for Anomaly Detection through Localized and Unsupervised Instance Selection

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6068 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6054-6061

Author(s):

Vincent Vercruyssen ◽

Wannes Meert ◽

Jesse Davis

Keyword(s):

Anomaly Detection ◽

Transfer Learning ◽

Real World ◽

Learning Algorithm ◽

Detection Task ◽

Detection Methods ◽

Water Usage ◽

Unsupervised Anomaly Detection ◽

Anomalous Water ◽

Real World Problems

Anomaly detection attempts to identify instances that deviate from expected behavior. Constructing performant anomaly detectors on real-world problems often requires some labeled data, which can be difficult and costly to obtain. However, often one considers multiple, related anomaly detection tasks. Therefore, it may be possible to transfer labeled instances from a related anomaly detection task to the problem at hand. This paper proposes a novel transfer learning algorithm for anomaly detection that selects and transfers relevant labeled instances from a source anomaly detection task to a target one. Then, it classifies target instances using a novel semi-supervised nearest-neighbors technique that considers both unlabeled target and transferred, labeled source instances. The algorithm outperforms a multitude of state-of-the-art transfer learning methods and unsupervised anomaly detection methods on a large benchmark. Furthermore, it outperforms its rivals on a real-world task of detecting anomalous water usage in retail stores.

Unsupervised Anomaly Detection Approach for Time-Series in Multi-Domains Using Deep Reconstruction Error

Symmetry ◽

10.3390/sym12081251 ◽

2020 ◽

Vol 12 (8) ◽

pp. 1251 ◽

Cited By ~ 1

Author(s):

Tsatsral Amarbayasgalan ◽

Van Huy Pham ◽

Nipon Theera-Umpon ◽

Keun Ho Ryu

Keyword(s):

Time Series ◽

Anomaly Detection ◽

Real Time ◽

Reconstruction Error ◽

Ar Model ◽

Detection Methods ◽

Series Data ◽

Detection Approach ◽

Unsupervised Anomaly Detection ◽

Anomaly Detector

Automatic anomaly detection for time-series is critical in a variety of real-world domains such as fraud detection, fault diagnosis, and patient monitoring. Current anomaly detection methods correctly detect only a remarkably low proportion of the actual abnormalities. Furthermore, most datasets do not provide labels and therefore require unsupervised approaches. To address these problems, we propose a novel deep-learning-based unsupervised anomaly detection approach (RE-ADTS) for time-series data, which is applicable to both batch and real-time anomaly detection. RE-ADTS consists of two modules: the time-series reconstructor and the anomaly detector. The time-series reconstructor module uses an autoregressive (AR) model to find an optimal window width and prepares the subsequences for further analysis according to that width. It then uses a deep autoencoder (AE) model to learn the data distribution, which is used to reconstruct a time-series close to the normal one. Anomalies yield a higher reconstruction error (RE) than normal data. This module outputs the RE and a compressed representation of each subsequence. The anomaly detector module then labels the corresponding time-series as normal or anomalous using an RE-based anomaly threshold. For batch anomaly detection, a combination of a density-based clustering technique and the anomaly threshold is employed. For real-time anomaly detection, only the anomaly threshold is used, without the clustering process. We conducted two types of experiments on a total of 52 publicly available time-series benchmark datasets for batch and real-time anomaly detection. Experimental results show that the proposed RE-ADTS outperformed state-of-the-art publicly available anomaly detection methods in most cases.
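
The thresholding step in the real-time case can be sketched generically: split the series into windows, reconstruct each window, and flag windows whose reconstruction error exceeds a threshold. The sketch below is not the RE-ADTS implementation. The `reconstruct` callable is a placeholder for a trained autoencoder (here a constant-mean reconstructor), the window width is fixed by hand instead of being chosen by an AR model, and the "mean + k standard deviations" threshold rule is an assumption, since the abstract does not state how the threshold is derived.

```python
import statistics
from typing import Callable, List, Sequence

Window = Sequence[float]


def sliding_windows(series: Sequence[float], width: int) -> List[Window]:
    """All contiguous subsequences of the given width."""
    return [series[i:i + width] for i in range(len(series) - width + 1)]


def reconstruction_error(window: Window, reconstructed: Window) -> float:
    """Mean squared error between a window and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(window, reconstructed)) / len(window)


def fit_threshold(train_errors: Sequence[float], k: float = 3.0) -> float:
    """Assumed rule: mean + k * std of the errors observed on normal data."""
    return statistics.fmean(train_errors) + k * statistics.pstdev(train_errors)


def detect(windows: List[Window],
           reconstruct: Callable[[Window], Window],
           threshold: float) -> List[bool]:
    """True for every window whose reconstruction error exceeds the threshold."""
    return [reconstruction_error(w, reconstruct(w)) > threshold for w in windows]


if __name__ == "__main__":
    width = 2  # hypothetical window width
    normal = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 1.1, 0.9, 1.0]
    level = statistics.fmean(normal)

    def reconstruct(window: Window) -> Window:
        # Placeholder for an autoencoder: reproduces only the normal level.
        return [level] * len(window)

    train_errors = [reconstruction_error(w, reconstruct(w))
                    for w in sliding_windows(normal, width)]
    threshold = fit_threshold(train_errors)

    stream = [1.0, 0.95, 5.0, 1.05, 1.0]  # contains one spike
    print(detect(sliding_windows(stream, width), reconstruct, threshold))
    # [False, True, True, False]
```

Both windows that contain the spike are flagged, while windows that stay near the normal level are not.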

Graph Regularized Deep Sparse Representation for Unsupervised Anomaly Detection

Computational Intelligence and Neuroscience ◽

10.1155/2021/4026132 ◽

2021 ◽

Vol 2021 ◽

pp. 1-19

Author(s):

Shicheng Li ◽

Shumin Lai ◽

Yan Jiang ◽

Wenle Wang ◽

Yugen Yi

Keyword(s):

Anomaly Detection ◽

Sparse Representation ◽

Original Data ◽

Feature Representation ◽

Detection Methods ◽

Local Geometry ◽

L1 Norm ◽

Graph Regularization ◽

Structure Information ◽

Unsupervised Anomaly Detection

Anomaly detection (AD) aims to distinguish the data points that are inconsistent with the overall pattern of the data. Recently, unsupervised anomaly detection methods have attracted considerable attention. In these methods, feature representation (FR) plays an important role and directly affects the performance of anomaly detection. Sparse representation (SR) can be regarded as one of the matrix factorization (MF) methods and is a powerful tool for FR. However, the original SR has some limitations. On the one hand, it learns only shallow feature representations, which leads to poor anomaly detection performance. On the other hand, it ignores the local geometric structure information of the data. To address these shortcomings, a graph regularized deep sparse representation (GRDSR) approach is proposed for unsupervised anomaly detection in this work. In GRDSR, a deep representation framework is first designed by extending the single-layer MF to a multilayer MF for extracting hierarchical structure from the original data. Next, a graph regularization term is introduced to capture the intrinsic local geometric structure information of the original data during FR, so that the deep features preserve the neighborhood relationships well. Then, an L1-norm-based sparsity constraint is added to enhance the discriminative ability of the deep features. Finally, the reconstruction error is used to distinguish anomalies. To demonstrate the effectiveness of the proposed approach, we conduct extensive experiments on ten datasets. Compared with state-of-the-art methods, the proposed approach achieves the best performance.

A Comparative Evaluation of Unsupervised Anomaly Detection Algorithms for Multivariate Data

PLoS ONE ◽

10.1371/journal.pone.0152173 ◽

2016 ◽

Vol 11 (4) ◽

pp. e0152173 ◽

Cited By ~ 197

Author(s):

Markus Goldstein ◽

Seiichi Uchida

Keyword(s):

Anomaly Detection ◽

Comparative Evaluation ◽

Multivariate Data ◽

Detection Algorithms ◽

Unsupervised Anomaly Detection
