Pre-clustering algorithm for anomaly detection and clustering that uses variable size buckets

Author(s):  
Manish Sharma ◽  
Durga Toshniwal
2021 ◽  
Vol 2113 (1) ◽  
pp. 012062
Author(s):  
Weihong Wang ◽  
Zhuolin Wu ◽  
Xuan Liu ◽  
Lei Jia ◽  
Xiaoguang Wang

Abstract For modern operation and maintenance systems, they are usually required to monitor multiple types and large quantities of machine’s key performance indicators (KPIs) at the same time with limited resources. In this paper, to tackle these problems, we propose a highly compatible time series anomaly detection model based on K-means clustering algorithm with a new Wavelet Feature Distance (WFD). Our work is inspired by some ideas from image processing and signal processing domain. Our model detects abnormalities in the time series datasets which are first clustered by K-means to boost the accuracy. Our experiments show significant accuracy improvements compared with traditional algorithms, and excellent compatibilities and operating efficiencies compared with algorithms based on deep learning.


Author(s):  
Catherine Cheung ◽  
Julio J. Valdés ◽  
Richard Salas Chavez ◽  
Srishti Sehgal

In this work, the sensor data related to a diesel engine system and specifically its turbocharger subsystem were analyzed. An incident where the turbocharger seized was recorded by dozens of standard turbocharger-related sensors. By training models to distinguish between normal healthy operating conditions and deteriorated conditions, there is an opportunity to develop prognostic and predictive tools to ideally help prevent a similar occurrence in the future. Analysis of this event provides an opportunity to identify changes in equipment indicators with a known outcome. A number of data analysis tools were used to characterize the healthy and deteriorated states of the turbocharger system, including various supervised classification as well as semi-supervised and unsupervised anomaly detection techniques. The leader clustering algorithm was also implemented to reduce the amount of data to train and develop the models. This paper describes the results of this modeling process, validated by testing on healthy data from the same propulsion system and a second distinct one. Although this problem posed challenges due to the severely imbalanced class distribution, the supervised classifiers, in particular Support Vector Machine (SVM) and Random Forest (RF), performed very well in all metrics while the unsupervised anomaly detection models achieved near-perfect accuracy for identifying healthy turbocharger states.


Sign in / Sign up

Export Citation Format

Share Document