Machine Learning Based Anomaly Detection of Log Files Using Ensemble Learning and Self-Attention

Author(s):  
Markus Falt ◽  
Stefan Forsstrom ◽  
Tingting Zhang


Author(s):  
Zhiguo Ding ◽  
Minrui Fei ◽  
Dajun Du

Online anomaly detection for stream data has attracted attention recently: the detector must make an accurate and timely judgment on each incoming observation. However, the inherent characteristics of stream data, such as rapid generation, enormous volume, and a dynamically evolving distribution, make developing an effective online anomaly detection method challenging. The main objective of this paper is to propose an adaptive online anomaly detection method for stream data. This is achieved by combining the isolation principle with online ensemble learning, further optimized with statistical histograms. Three main algorithms are developed: an online detector building algorithm, an anomaly detection algorithm, and an adaptive detector updating algorithm. To evaluate the proposed method, four large datasets recorded from real events were taken from the UCI Machine Learning Repository. Extensive simulations on these datasets show that the method is effective and robust across different scenarios.
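The paper's own algorithms are not reproduced here. As a rough illustration of the general pattern it builds on (an isolation-based detector that is periodically rebuilt as the stream evolves), the following Python sketch refits scikit-learn's IsolationForest on a sliding window; the window size, refit interval, and contamination rate are illustrative assumptions, not the authors' adaptive, histogram-optimized method.

# Illustrative sketch: isolation-based detection over a sliding window.
# Window size, refit schedule, and contamination rate are assumptions, not
# the adaptive, histogram-optimized updating scheme described in the paper.
from collections import deque

import numpy as np
from sklearn.ensemble import IsolationForest

WINDOW_SIZE = 2000      # how many recent observations are kept for refitting
REFIT_EVERY = 500       # rebuild the detector after this many new observations

window = deque(maxlen=WINDOW_SIZE)
detector = None

def process(observation, step):
    """Return True if the incoming observation is judged anomalous."""
    global detector
    window.append(observation)
    if detector is None or step % REFIT_EVERY == 0:
        detector = IsolationForest(n_estimators=100, contamination=0.01)
        detector.fit(np.asarray(window))
    # IsolationForest.predict returns -1 for anomalies and 1 for normal points.
    return detector.predict(np.asarray(observation).reshape(1, -1))[0] == -1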


2020 ◽  
Vol 34 (09) ◽  
pp. 13648-13649
Author(s):  
Yue Zhao ◽  
Xuejian Wang ◽  
Cheng Cheng ◽  
Xueying Ding

Model combination, often regarded as a key sub-field of ensemble learning, has been widely used in both academic research and industry applications. To facilitate this process, we propose and implement an easy-to-use Python toolkit, combo, to aggregate models and scores under various scenarios, including classification, clustering, and anomaly detection. In a nutshell, combo provides a unified and consistent way to combine both raw and pretrained models from popular machine learning libraries, e.g., scikit-learn, XGBoost, and LightGBM. With accessibility and robustness in mind, combo is designed with detailed documentation, interactive examples, continuous integration, code coverage, and maintainability checks; it can be installed easily through the Python Package Index (PyPI) or from https://github.com/yzhao062/combo.
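The combo API itself is not reproduced here; as a stand-in that shows the same model-combination idea with standard scikit-learn calls, the sketch below averages the probability outputs of several heterogeneous classifiers via soft voting. See the combo README for the library's own aggregators.

# Stand-in sketch of model combination using scikit-learn soft voting;
# this is not combo's API, only the underlying idea it packages.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Combine heterogeneous base models by averaging their predicted probabilities.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier(max_depth=5)),
        ("rf", RandomForestClassifier(n_estimators=100)),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)
print("combined accuracy:", ensemble.score(X_test, y_test))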


2020 ◽  
Vol 2020 ◽  
pp. 1-8
Author(s):  
Chunbo Liu ◽  
Yitong Ren ◽  
Mengmeng Liang ◽  
Zhaojun Gu ◽  
Jialiang Wang ◽  
...  

Machine learning techniques are essential for system log anomaly detection. However, system log data are prone to class overlap because many log entries are highly similar, and this overlap seriously degrades anomaly detection performance. To address the class overlap problem in system logs, this paper proposes an anomaly detection model designed for it. We first calculate each sample's membership in the different classes (normal or anomalous) and use its fuzziness to separate the samples in the overlapping region from the rest of the data. AdaBoost, an ensemble learning approach, is then used to detect anomalies in the overlapping data. Compared with single machine learning algorithms, ensemble learning classifies the overlapping region more accurately and thus detects system log anomalies more reliably. We also discuss the possible impact of different voting methods on the ensemble results. Experimental results show that the model can be applied effectively with a variety of base algorithms and improves every evaluation measure.
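As a rough sketch of the two-stage idea described above (flag the overlapping region by fuzziness, then let an ensemble handle it), the snippet below approximates a sample's fuzziness by how close a base classifier's predicted probability is to 0.5 and trains AdaBoost only on the flagged samples; the base learner and the 0.2 threshold are illustrative assumptions rather than the paper's exact procedure.

# Illustrative two-stage sketch: fuzziness-based splitting, then AdaBoost on
# the overlapping region. Threshold and base learner are assumptions.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression

def overlap_mask(X, y, threshold=0.2):
    """Flag samples whose class membership is ambiguous (probability near 0.5)."""
    base = LogisticRegression(max_iter=1000).fit(X, y)
    p_anomaly = base.predict_proba(X)[:, 1]
    fuzziness = 1.0 - 2.0 * np.abs(p_anomaly - 0.5)  # 1 = fully ambiguous, 0 = certain
    return fuzziness > threshold

def fit_overlap_detector(X, y):
    """Train an ensemble only on the hard, class-overlapping samples."""
    mask = overlap_mask(X, y)
    clf = AdaBoostClassifier(n_estimators=200)
    clf.fit(X[mask], y[mask])
    return clf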


Sensors ◽  
2021 ◽  
Vol 21 (14) ◽  
pp. 4805
Author(s):  
Saad Abbasi ◽  
Mahmoud Famouri ◽  
Mohammad Javad Shafiee ◽  
Alexander Wong

Human operators often diagnose industrial machinery by listening for anomalous sounds. Given recent advances in machine learning, automated acoustic anomaly detection can enable reliable maintenance of machinery. However, deep-learning-driven anomaly detection methods often require extensive computational resources, prohibiting their deployment in factories. Here we explore a machine-driven design exploration strategy to create OutlierNets, a family of highly compact deep convolutional autoencoder architectures featuring as few as 686 parameters, model sizes as small as 2.7 KB, and as few as 2.8 million FLOPs, with detection accuracy matching or exceeding published architectures with as many as 4 million parameters. The architectures are deployed on an Intel Core i5 as well as an ARM Cortex-A72 to assess performance on hardware likely to be used in industry. Latency measurements show that the OutlierNets architectures can achieve as much as 30x lower latency than published networks.
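The machine-designed OutlierNets architectures themselves are not spelled out in this abstract; the PyTorch sketch below only shows the generic pattern they belong to, a small convolutional autoencoder whose reconstruction error on a spectrogram patch serves as the anomaly score. All layer sizes here are illustrative assumptions, not the published designs.

# Generic compact convolutional autoencoder for spectrogram anomaly scoring.
# Layer sizes are illustrative; this is not the machine-designed OutlierNets.
import torch
import torch.nn as nn

class TinyAudioAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(8, 4, kernel_size=3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(4, 8, kernel_size=3, stride=2,
                               padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(8, 1, kernel_size=3, stride=2,
                               padding=1, output_padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def anomaly_score(model, spectrogram):
    """Higher reconstruction error suggests an anomalous machine sound."""
    with torch.no_grad():
        recon = model(spectrogram)
    return torch.mean((recon - spectrogram) ** 2).item()

# Example: score a random 1x1x64x64 spectrogram patch (batch, channel, H, W).
model = TinyAudioAE().eval()
print(anomaly_score(model, torch.randn(1, 1, 64, 64)))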

