Exploring Ensemble-Based Class Imbalance Learners for Intrusion Detection in Industrial Control Networks

2021 ◽  
Vol 5 (4) ◽  
pp. 72
Author(s):  
Maya Hilda Lestari Louk ◽  
Bayu Adhi Tama

Classifier ensembles have been used in the industrial cybersecurity sector for many years. However, their efficacy and reliability for intrusion detection systems remain open questions in current research, largely because the underlying data are heavily imbalanced. This article addresses that gap in the literature by illustrating the benefits of ensemble-based models for identifying threats and attacks in a cyber-physical power grid. We provide a framework that compares nine cost-sensitive individual and ensemble models designed specifically for handling imbalanced data: cost-sensitive C4.5, roughly balanced bagging, random oversampling bagging, random undersampling bagging, synthetic minority oversampling bagging, random undersampling boosting, synthetic minority oversampling boosting, AdaC2, and EasyEnsemble. Each model's performance is tested on a range of benchmark power system datasets using balanced accuracy, the Kappa statistic, and AUC. Our findings show that EasyEnsemble significantly outperformed its rivals across the board. Furthermore, undersampling and oversampling strategies were effective in boosting-based ensembles but not in bagging-based ensembles.
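To illustrate why balanced accuracy and the Kappa statistic are preferred over plain accuracy on imbalanced data, here is a minimal Python sketch. It is not taken from the article: the function names and the toy labels (95 normal events, 5 attacks, and a classifier that always predicts "normal") are assumptions chosen purely for illustration.

```python
from collections import Counter


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so every class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        hits = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


def cohen_kappa(y_true, y_pred):
    """Agreement between labels and predictions, corrected for chance."""
    n = len(y_true)
    p_observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_expected = sum(
        true_counts[c] * pred_counts.get(c, 0) for c in true_counts
    ) / (n * n)
    if p_expected == 1.0:
        # Degenerate case: only one class appears in both vectors.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy data: 95 normal events, 5 attacks.
y_true = ["normal"] * 95 + ["attack"] * 5
# A trivial classifier that never raises an alarm.
y_pred = ["normal"] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
print(accuracy, bal_acc, kappa)
```

In this toy case plain accuracy is 0.95 even though no attack is detected, while balanced accuracy drops to 0.5 and Kappa to 0, which is the behaviour that makes these metrics suitable for imbalanced intrusion data.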

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Sikha Bagui ◽  
Kunqi Li

Machine learning plays an increasingly significant role in the building of Network Intrusion Detection Systems. However, machine learning models trained on imbalanced cybersecurity data cannot effectively recognize the minority classes, which are the attacks. One way to address this issue is resampling, which adjusts the ratio between the classes to make the data more balanced. This research examines the influence of resampling on the performance of Artificial Neural Network multi-class classifiers. The resampling methods (random undersampling, random oversampling, random undersampling combined with random oversampling, random undersampling with the Synthetic Minority Oversampling Technique, and random undersampling with the Adaptive Synthetic Sampling Method) were applied to the benchmark cybersecurity datasets KDD99, UNSW-NB15, UNSW-NB17 and UNSW-NB18. Macro precision, macro recall, and macro F1-score were used to evaluate the results. The patterns found were: first, oversampling increases the training time and undersampling decreases it; second, if the data is extremely imbalanced, both oversampling and undersampling increase recall significantly; third, if the data is not extremely imbalanced, resampling does not have much of an impact; fourth, with resampling, mostly oversampling, more of the minority data (attacks) were detected.
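The two basic resampling strategies named in this abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' pipeline: the function names, the fixed seed, and the toy dataset are assumptions, and the synthetic methods (SMOTE, ADASYN) are not covered here.

```python
import random
from collections import Counter


def _group_by_class(X, y):
    """Collect the rows of X under their class label."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return groups


def random_undersample(X, y, seed=0):
    """Randomly drop rows until every class matches the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_class(X, y)
    target = min(len(rows) for rows in groups.values())
    X_out, y_out = [], []
    for label, rows in groups.items():
        for row in rng.sample(rows, target):  # sample without replacement
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


def random_oversample(X, y, seed=0):
    """Randomly duplicate rows until every class matches the largest class."""
    rng = random.Random(seed)
    groups = _group_by_class(X, y)
    target = max(len(rows) for rows in groups.values())
    X_out, y_out = [], []
    for label, rows in groups.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


# Hypothetical toy data: 90 normal flows and 10 attack flows.
X = [[i, i % 7] for i in range(100)]
y = ["normal"] * 90 + ["attack"] * 10

X_under, y_under = random_undersample(X, y)
X_over, y_over = random_oversample(X, y)
print(Counter(y_under), Counter(y_over))
```

The size difference between the two outputs (20 rows versus 180 rows in this toy case) is consistent with the first reported pattern: oversampling enlarges the training set and therefore the training time, while undersampling shrinks both.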


2020 ◽  
Vol 26 (2) ◽  
pp. 47-53
Author(s):  
Richard Paes ◽  
David C. Mazur ◽  
Bruce K. Venne ◽  
Jack Ostrzenski

Author(s):  
Haicheng Qu ◽  
Jianzhong Zhou ◽  
Jitao Qin ◽  
Xiaorong Tian

In traditional network anomaly detection algorithms, the anomaly threshold needs to be defined manually. Against this background, this study proposes an anomaly detection algorithm (VAEOCSVM) that combines the variational auto-encoder (VAE) and the one-class support vector machine (OCSVM) to detect anomalies in industrial control networks. First, the VAE model is used to obtain the distribution of the original normal sample data, represented by a low-dimensional code; the reconstruction error of the VAE model is merged into the new input. Then, using the OCSVM's hinge-loss objective function and a random Fourier feature approximation of the radial basis function (RBF) kernel, the OCSVM model is represented as a deep neural network and solved by gradient descent. Finally, the decision function of the OCSVM model is constructed from the solved parameters to detect abnormal data. The proposed algorithm is compared with other machine-learning-based anomaly detection algorithms on multiple indicators such as precision, recall, and [Formula: see text] score. Experimental results on various datasets show that the proposed algorithm recognizes outliers better than the other machine-learning-based anomaly detection algorithms.
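The random Fourier feature step mentioned in this abstract can be sketched as follows. This is a generic illustration of the standard technique, not the authors' implementation: it assumes the kernel form k(x, y) = exp(-gamma * ||x - y||^2), and the function names, the feature count, the value of gamma, and the sample vectors are all hypothetical. The VAE and the hinge-loss training of the OCSVM are not shown.

```python
import math
import random


def make_rff_map(dim, n_features, gamma, seed=0):
    """Build an explicit feature map whose inner product approximates
    the RBF kernel exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    # Frequencies are drawn from the kernel's spectral density, N(0, 2*gamma).
    scale = math.sqrt(2.0 * gamma)
    W = [[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(n_features)]
    # Random phases are drawn uniformly from [0, 2*pi).
    b = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    norm = math.sqrt(2.0 / n_features)

    def transform(x):
        return [
            norm * math.cos(sum(w_j * x_j for w_j, x_j in zip(w, x)) + b_i)
            for w, b_i in zip(W, b)
        ]

    return transform


def rbf_kernel(x, y, gamma):
    """Exact RBF kernel value, used here only as a reference."""
    return math.exp(-gamma * sum((a - c) ** 2 for a, c in zip(x, y)))


# Hypothetical settings and inputs for illustration.
phi = make_rff_map(dim=3, n_features=2000, gamma=0.5, seed=0)
x = [0.2, -0.1, 0.4]
y = [0.0, 0.3, 0.1]

approx = sum(a * c for a, c in zip(phi(x), phi(y)))
exact = rbf_kernel(x, y, gamma=0.5)
print(approx, exact)
```

Because the kernel becomes an ordinary inner product of explicit features, a linear model on top of `phi(x)` can be trained with gradient descent, which is what makes it possible to express a kernel OCSVM as a neural network layer.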


2012 ◽  
Vol 22 (6) ◽  
pp. 477-493 ◽  
Author(s):  
Youngjoon Won ◽  
Mi-Jung Choi ◽  
Byungchul Park ◽  
James Won-Ki Hong
