Pre-clustering algorithm for anomaly detection and clustering that uses variable size buckets

Abstract: Modern operation and maintenance systems are usually required to monitor many types and large quantities of machine key performance indicators (KPIs) simultaneously with limited resources. To tackle this problem, we propose a highly compatible time series anomaly detection model based on the K-means clustering algorithm with a new Wavelet Feature Distance (WFD). Our work is inspired by ideas from the image processing and signal processing domains. To boost accuracy, the model first clusters the time series datasets with K-means and then detects abnormalities within them. Our experiments show significant accuracy improvements over traditional algorithms, and excellent compatibility and operating efficiency compared with algorithms based on deep learning.

Unsupervised Anomaly Detection Using HDG-Clustering Algorithm

Neural Information Processing - Lecture Notes in Computer Science, 2008, pp. 356-365

DOI: 10.1007/978-3-540-69162-4_37

Cited By ~ 3

Author(s): Cheng-Fa Tsai, Chia-Chen Yen

Keyword(s): Anomaly Detection, Clustering Algorithm, Unsupervised Anomaly Detection

Failure Modeling of a Propulsion Subsystem: Unsupervised and Semi-Supervised Approaches to Anomaly Detection

International Journal of Pattern Recognition and Artificial Intelligence, 2019, Vol 33 (11), pp. 1940019

DOI: 10.1142/s0218001419400196

Cited By ~ 2

Author(s): Catherine Cheung, Julio J. Valdés, Richard Salas Chavez, Srishti Sehgal

Keyword(s): Anomaly Detection, Clustering Algorithm, Operating Conditions, Sensor Data, Support Vector, Training Models, Detection Techniques, Failure Modeling, Supervised Classifiers, Unsupervised Anomaly Detection

In this work, sensor data from a diesel engine system, and specifically its turbocharger subsystem, were analyzed. An incident in which the turbocharger seized was recorded by dozens of standard turbocharger-related sensors. Training models to distinguish between normal healthy operating conditions and deteriorated conditions offers an opportunity to develop prognostic and predictive tools that could help prevent a similar occurrence in the future. Analysis of this event also provides an opportunity to identify changes in equipment indicators with a known outcome. A number of data analysis tools were used to characterize the healthy and deteriorated states of the turbocharger system, including various supervised classification techniques as well as semi-supervised and unsupervised anomaly detection techniques. The leader clustering algorithm was also implemented to reduce the amount of data needed to train and develop the models. This paper describes the results of this modeling process, validated by testing on healthy data from the same propulsion system and from a second, distinct one. Although the problem posed challenges due to the severely imbalanced class distribution, the supervised classifiers, in particular Support Vector Machine (SVM) and Random Forest (RF), performed very well in all metrics, while the unsupervised anomaly detection models achieved near-perfect accuracy in identifying healthy turbocharger states.
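The abstract above mentions leader clustering as a data-reduction step but gives no details. The following is a minimal sketch of the textbook single-pass leader algorithm, not the authors' implementation: each point joins the first existing leader within a distance threshold, otherwise it becomes a new leader, and the leaders then serve as a reduced training set. The function name `leader_clustering`, the Euclidean metric, the `threshold` value, and the sample points are all assumptions chosen for illustration.

```python
import math


def leader_clustering(points, threshold):
    """Single-pass leader clustering (illustrative sketch).

    Returns (leaders, assignments), where assignments[i] is the index
    of the leader that points[i] was attached to.
    """
    leaders = []      # representative points, in order of discovery
    assignments = []  # leader index for every input point
    for p in points:
        for idx, leader in enumerate(leaders):
            # Attach to the first leader that is close enough.
            if math.dist(p, leader) <= threshold:
                assignments.append(idx)
                break
        else:
            # No leader within the threshold: p becomes a new leader.
            leaders.append(p)
            assignments.append(len(leaders) - 1)
    return leaders, assignments


# Hypothetical 2-D sensor readings forming two tight groups.
sample = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 4.9), (0.05, 0.05)]
leaders, assignments = leader_clustering(sample, threshold=1.0)
print(leaders)      # the reduced data set
print(assignments)  # which leader each original point maps to
```

Because the algorithm makes one pass and never revisits a decision, the result depends on the order of the input and on the chosen threshold; a larger threshold yields fewer leaders and therefore a stronger reduction.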
