scholarly journals A GAN and Feature Selection-Based Oversampling Technique for Intrusion Detection

2021 ◽  
Vol 2021 ◽  
pp. 1-15
Author(s):  
Xiaodong Liu ◽  
Tong Li ◽  
Runzi Zhang ◽  
Di Wu ◽  
Yongheng Liu ◽  
...  

In recent years, there have been numerous cyber security issues that have caused considerable damage to the society. The development of efficient and reliable Intrusion Detection Systems (IDSs) is an effective countermeasure against the growing cyber threats. In modern high-bandwidth, large-scale network environments, traditional IDSs suffer from a high rate of missed and false alarms. Researchers have introduced machine learning techniques into intrusion detection with good results. However, due to the scarcity of attack data, such methods’ training sets are usually unbalanced, affecting the analysis performance. In this paper, we survey and analyze the design principles and shortcomings of existing oversampling methods. Based on the findings, we take the perspective of imbalance and high dimensionality of datasets in the field of intrusion detection and propose an oversampling technique based on Generative Adversarial Networks (GAN) and feature selection. Specifically, we model the complex high-dimensional distribution of attacks based on Gradient Penalty Wasserstein GAN (WGAN-GP) to generate additional attack samples. We then select a subset of features representing the entire dataset based on analysis of variance, ultimately generating a rebalanced low-dimensional dataset for machine learning training. To evaluate the effectiveness of our proposal, we conducted experiments based on the NSL-KDD, UNSW-NB15, and CICIDS-2017 datasets. The experimental results show that our method can effectively improve the detection performance of machine learning models and outperform the baselines.

Author(s):  
Iqbal H. Sarker ◽  
Yoosef B. Abushark ◽  
Fawaz Alsolami ◽  
Asif Irshad Khan

Cyber security has recently received enormous attention in today’s security concerns, due to the popularity of the Internet-of-Things (IoT), the tremendous growth of computer networks, and the huge number of relevant applications. Thus, detecting various cyber-attacks or anomalies in a network and building an effective intrusion detection system that performs an essential role in today’s security is becoming more important. Artificial intelligence, particularly machine learning techniques, can be used for building such a data-driven intelligent intrusion detection system. In order to achieve this goal, in this paper, we present an Intrusion Detection Tree (“IntruDTree”) machine-learning-based security model that first takes into account the ranking of security features according to their importance and then build a tree-based generalized intrusion detection model based on the selected important features. This model is not only effective in terms of prediction accuracy for unseen test cases but also minimizes the computational complexity of the model by reducing the feature dimensions. Finally, the effectiveness of our IntruDTree model was examined by conducting experiments on cybersecurity datasets and computing the precision, recall, fscore, accuracy, and ROC values to evaluate. We also compare the outcome results of IntruDTree model with several traditional popular machine learning methods such as the naive Bayes classifier, logistic regression, support vector machines, and k-nearest neighbor, to analyze the effectiveness of the resulting security model.


Author(s):  
Ly Vu ◽  
Quang Uy Nguyen

Machine learning-based intrusion detection hasbecome more popular in the research community thanks to itscapability in discovering unknown attacks. To develop a gooddetection model for an intrusion detection system (IDS) usingmachine learning, a great number of attack and normal datasamples are required in the learning process. While normaldata can be relatively easy to collect, attack data is muchrarer and harder to gather. Subsequently, IDS datasets areoften dominated by normal data and machine learning modelstrained on those imbalanced datasets are ineffective in detect-ing attacks. In this paper, we propose a novel solution to thisproblem by using generative adversarial networks to generatesynthesized attack data for IDS. The synthesized attacks aremerged with the original data to form the augmented dataset.Three popular machine learning techniques are trained on theaugmented dataset. The experiments conducted on the threecommon IDS datasets and one our own dataset show thatmachine learning algorithms achieve better performance whentrained on the augmented dataset of the generative adversarialnetworks compared to those trained on the original datasetand other sampling techniques. The visualization techniquewas also used to analyze the properties of the synthesizeddata of the generative adversarial networks and the others.


Author(s):  
Daniel Kobla Gasu

The internet has become an indispensable resource for exchanging information among users, devices, and organizations. However, the use of the internet also exposes these entities to myriad cyber-attacks that may result in devastating outcomes if appropriate measures are not implemented to mitigate the risks. Currently, intrusion detection and threat detection schemes still face a number of challenges including low detection rates, high rates of false alarms, adversarial resilience, and big data issues. This chapter describes a focused literature survey of machine learning (ML) and data mining (DM) methods for cyber analytics in support of intrusion detection and cyber-attack detection. Key literature on ML and DM methods for intrusion detection is described. ML and DM methods and approaches such as support vector machine, random forest, and artificial neural networks, among others, with their variations, are surveyed, compared, and contrasted. Selected papers were indexed, read, and summarized in a tabular format.


Author(s):  
Mushtaq Talb Tally ◽  
Haleh Amintoosi

With the development of web applications nowadays, intrusions represent a crucial aspect in terms of violating the security policies. Intrusions can be defined as a specific change in the normal behavior of the network operations that intended to violate the security policies of a particular network and affect its performance. Recently, several researchers have examined the capabilities of machine learning techniques in terms of detecting intrusions. One of the important issues behind using the machine learning techniques lies on employing proper set of features. Since the literature has shown diversity of feature types, there is a vital demand to apply a feature selection approach in order to identify the most appropriate features for intrusion detection. This study aims to propose a hybrid method of Genetic Algorithm and Support Vector Machine. GA has been as a feature selection in order to select the best features, while SVM has been used as a classification method to categorize the behavior into normal and intrusion based on the selected features from GA. A benchmark dataset of intrusions (NSS-KDD) has been in the experiment. In addition, the proposed method has been compared with the traditional SVM. Results showed that GA has significantly improved the SVM classification by achieving 0.927 of f-measure.


Symmetry ◽  
2020 ◽  
Vol 12 (5) ◽  
pp. 754 ◽  
Author(s):  
Iqbal H. Sarker ◽  
Yoosef B. Abushark ◽  
Fawaz Alsolami ◽  
Asif Irshad Khan

Cyber security has recently received enormous attention in today’s security concerns, due to the popularity of the Internet-of-Things (IoT), the tremendous growth of computer networks, and the huge number of relevant applications. Thus, detecting various cyber-attacks or anomalies in a network and building an effective intrusion detection system that performs an essential role in today’s security is becoming more important. Artificial intelligence, particularly machine learning techniques, can be used for building such a data-driven intelligent intrusion detection system. In order to achieve this goal, in this paper, we present an Intrusion Detection Tree (“IntruDTree”) machine-learning-based security model that first takes into account the ranking of security features according to their importance and then build a tree-based generalized intrusion detection model based on the selected important features. This model is not only effective in terms of prediction accuracy for unseen test cases but also minimizes the computational complexity of the model by reducing the feature dimensions. Finally, the effectiveness of our IntruDTree model was examined by conducting experiments on cybersecurity datasets and computing the precision, recall, fscore, accuracy, and ROC values to evaluate. We also compare the outcome results of IntruDTree model with several traditional popular machine learning methods such as the naive Bayes classifier, logistic regression, support vector machines, and k-nearest neighbor, to analyze the effectiveness of the resulting security model.


2021 ◽  
Vol 13 (22) ◽  
pp. 12337
Author(s):  
Abdullah Alharbi ◽  
Adil Hussain Seh ◽  
Wael Alosaimi ◽  
Hashem Alyami ◽  
Alka Agrawal ◽  
...  

Machine learning (ML) is one of the dominating technologies practiced in both the industrial and academic domains throughout the world. ML algorithms can examine the threats and respond to intrusions and security incidents swiftly in an instinctive way. It plays a critical function in providing a proactive security mechanism in the cybersecurity domain. Cybersecurity ensures the real time protection of information, information systems, and networks from intruders. Several security and privacy reports have cited that there has been a rapid increase in both the frequency and the number of cybersecurity breaches in the last decade. Information security has been compromised by intruders at an alarming rate. Anomaly detection, phishing page identification, software vulnerability diagnosis, malware identification, and denial of services attacks are the main cyber-security issues that demand effective solutions. Researchers and experts have been practicing different approaches to address the current cybersecurity issues and challenges. However, in this research endeavor, our objective is to make an idealness assessment of machine learning-based intrusion detection systems (IDS) under the hesitant fuzzy (HF) conditions, using a multi-criteria decision making (MCDM)-based analytical hierarchy process (AHP) and technique for order of preference by similarity to ideal-solutions (TOPSIS). Hesitant fuzzy sets are useful for addressing decision-making situations in which experts must overcome the reluctance to make a conclusion. The proposed research project would assist the machine learning practitioners and cybersecurity specialists in identifying, selecting, and prioritizing cybersecurity-related attributes for intrusion detection systems, and build more ideal and effective intrusion detection systems.


Cyber security is a major problem of modern society so that Vulnerabilities of computer Network is become easy with the help of technologies and human skills. Now day’s difference type of attacks occurred for example DOS attack, Probing, R2U, R2L virus, port scans, buffer overflow, CGI Attack and flooding etc. We need a platform where a system can be developed for recognition and prevention of these attacks. In This paper, most of the latest methods are summarised to implement IDS for cyber security. Intrusion Detection Systems is a most suitable solution for cyber attacks. Machine learning based Intrusion Detection Systems have high accuracy, in rapidly changing environment. This paper discusses which type of ML techniques has low accuracy, so it explore some research area for researcher.


Sign in / Sign up

Export Citation Format

Share Document