An Approach to Feature Selection in Intrusion Detection Systems Using Machine Learning Algorithms

Services provided by information technology have developed rapidly and are now widely used in day-to-day life. Securing such systems and applications against intrusions remains one of the major issues of the current era. Detecting anomalies among regular events involves several steps, such as data pre-processing, feature selection, and classification. Many computational models aim to discriminate the samples of each group accurately by identifying candidate features prior to the learning phase. This research studies the implementation of a combined feature selection technique, the GRRF-FWSVM method, applied to the benchmark anomaly detection dataset KDD CUP 99. The results indicate that the proposed hybrid model is effective at identifying anomalies, reaching a detection rate of about 98.55% for the intrusion detection system in comparison with the two most common benchmark models.

Detecting cybersecurity attacks across different network features and learners

Journal Of Big Data ◽

10.1186/s40537-021-00426-w ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Joffrey L. Leevy ◽

John Hancock ◽

Richard Zuech ◽

Taghi M. Khoshgoftaar

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Operating Characteristic ◽

Characteristic Curve ◽

Machine Learning Algorithms ◽

Feature Selection Technique ◽

Impact Performance ◽

Detection Model ◽

Wide Range ◽

Research Questions

Machine learning algorithms efficiently trained on intrusion detection datasets can detect network traffic capable of jeopardizing an information system. In this study, we use the CSE-CIC-IDS2018 dataset to investigate the effect of ensemble feature selection on the performance of seven classifiers. CSE-CIC-IDS2018 is big data (about 16,000,000 instances), publicly available, modern, and covers a wide range of realistic attack types. Our contribution is centered around answers to three research questions. The first question is, “Does feature selection impact performance of classifiers in terms of Area Under the Receiver Operating Characteristic Curve (AUC) and F1-score?” The second question is, “Does including the Destination_Port categorical feature significantly impact performance of LightGBM and Catboost in terms of AUC and F1-score?” The third question is, “Does the choice of classifier: Decision Tree (DT), Random Forest (RF), Naive Bayes (NB), Logistic Regression (LR), Catboost, LightGBM, or XGBoost, significantly impact performance in terms of AUC and F1-score?” These research questions are all answered in the affirmative and provide valuable, practical information for the development of an efficient intrusion detection model. To the best of our knowledge, we are the first to use an ensemble feature selection technique with the CSE-CIC-IDS2018 dataset.
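The two metrics named in the research questions can be illustrated with a minimal sketch. It is not taken from the paper: the function names `auc` and `f1_score`, the toy labels, the scores, and the 0.5 threshold are all assumptions made for illustration. AUC is computed here as the fraction of (attack, benign) pairs in which the attack sample receives the higher score, which is equivalent to the area under the ROC curve.

```python
from typing import Sequence


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1 = harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = attack, 0 = benign.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
# Assumed decision threshold of 0.5 to turn scores into hard predictions.
predictions = [1 if s >= 0.5 else 0 for s in scores]

toy_auc = auc(labels, scores)           # 3 of 4 pairs ranked correctly
toy_f1 = f1_score(labels, predictions)  # precision 1.0, recall 0.5
```

The pairwise loop is quadratic in the number of samples, so on a dataset the size of CSE-CIC-IDS2018 a sort-based implementation from an established library would be the practical choice; the sketch only shows what the two numbers mean.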

Development of an efficient classifier using proposed sensitivity-based feature selection technique for intrusion detection system

International Journal of Information and Computer Security ◽

10.1504/ijics.2018.089594 ◽

2018 ◽

Vol 10 (1) ◽

pp. 80

Author(s):

H.S. Hota ◽

Dinesh K. Sharma ◽

A.K. Shrivas

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Feature Selection Technique ◽

Selection Technique

Hybridization of Machine Learning Algorithm in Intrusion Detection System

Handbook of Research on Machine and Deep Learning Applications for Cyber Security - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-5225-9611-0.ch008 ◽

2020 ◽

pp. 150-175

Author(s):

Amudha P. ◽

Sivakumari S.

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Classification Accuracy ◽

Intrusion Detection System ◽

Learning Algorithm ◽

Detection System ◽

Principal Component ◽

Machine Learning Algorithms ◽

Feature Selection Technique ◽

Efficient Manner

In recent years, the field of machine learning has grown rapidly, both in the development of techniques and in their application to intrusion detection. The computational complexity of machine learning algorithms increases rapidly as the number of features in a dataset increases. Selecting the significant features reduces the number of features in the dataset, which is critical for improving the classification accuracy and speed of the algorithms. Achieving high accuracy and detection rates while lowering false alarm rates is also a major challenge in designing an intrusion detection system. The major motivation of this work is to address these issues by hybridizing machine learning and swarm intelligence algorithms to enhance the performance of intrusion detection systems. It also emphasizes applying principal component analysis as a feature selection technique on an intrusion detection dataset to identify the most suitable feature subsets, which may provide high-quality results in a fast and efficient manner.

Fusion of Feature Selection and Random Forest for an Anomaly-Based Intrusion Detection System

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2019.8332 ◽

2019 ◽

Vol 16 (8) ◽

pp. 3603-3607 ◽

Cited By ~ 1

Author(s):

Shraddha Khonde ◽

V. Ulagamuthalvi

Keyword(s):

Feature Selection ◽

Random Forest ◽

Intrusion Detection ◽

Real Time ◽

Intrusion Detection System ◽

New Technologies ◽

Detection System ◽

Sensitive Data ◽

Detection Systems ◽

New Type

In the current network scenario, hackers and intruders have become a major threat. As new technologies emerge quickly and are used extensively, security plays an important role. Most computers in a network can easily be compromised by attacks, and the growing number of new attack types is a serious concern. Securing sensitive data is a high-priority issue that should be addressed immediately. Highly efficient Intrusion Detection Systems (IDS) that detect various types of network attacks are available nowadays, but what is required is an IDS intelligent enough to detect and analyze all types of new threats on the network. Maximum accuracy is expected of any such intelligent intrusion detection system. An Intrusion Detection System can be hardware or software that monitors and analyzes all network activity to detect malicious activity inside the network. It also informs the administrator and helps them deal with malicious packets, which, if they enter the network, can harm many of the connected computers. In our work we have implemented an intelligent IDS that helps the administrator analyze real-time network traffic by classifying packets entering the system as normal or malicious. This paper focuses mainly on the feature selection techniques used to reduce the number of features in the KDD-99 dataset. It also explains the algorithm used for classification, Random Forest, which uses a forest of trees to classify real-time packets as normal or malicious. Random Forest is an ensemble technique: its final output is derived by combining the outputs of the trees that make up the forest. The KDD-99 dataset is used in the experiments to train all the trees. The results show that the Random Forest algorithm gives higher accuracy in a distributed network, with a reduced false alarm rate.
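The combining step described in the abstract, where the forest's output is derived from the outputs of its trees, can be sketched as a majority vote. This is an illustration only, not the authors' implementation: the three "trees" below are hand-written threshold rules rather than trained models, and the feature layout `(duration, src_bytes, failed_logins)`, the thresholds, and the name `forest_predict` are all hypothetical.

```python
from collections import Counter
from typing import Callable, List, Sequence

# A "tree" here is any function mapping a feature vector to a class label.
Tree = Callable[[Sequence[float]], str]


def forest_predict(trees: List[Tree], sample: Sequence[float]) -> str:
    """Return the label that receives the most votes across all trees."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for trained trees. Each one looks at a single feature
# of a packet record laid out as (duration, src_bytes, failed_logins).
stub_trees: List[Tree] = [
    lambda x: "malicious" if x[2] > 3 else "normal",      # many failed logins
    lambda x: "malicious" if x[1] > 10000 else "normal",  # unusually large payload
    lambda x: "malicious" if x[0] > 300 else "normal",    # very long connection
]

suspicious = (2.0, 50000.0, 5.0)  # two of the three rules fire
ordinary = (1.0, 200.0, 0.0)      # no rule fires

label_suspicious = forest_predict(stub_trees, suspicious)
label_ordinary = forest_predict(stub_trees, ordinary)
```

With an odd number of trees and two classes the vote cannot tie. In a real Random Forest each tree would be trained on a bootstrap sample of the data with a random subset of features considered at each split, which is what makes the individual votes diverse enough for the majority to be useful.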

AN ARCHITECTURAL FRAMEWORK FOR ANT LION OPTIMIZATION-BASED FEATURE SELECTION TECHNIQUE FOR CLOUD INTRUSION DETECTION SYSTEM USING BAYESIAN CLASSIFIER

i-manager’s Journal on Cloud Computing ◽

10.26634/jcc.5.2.15691 ◽

2018 ◽

Vol 5 (2) ◽

pp. 36

Author(s):

HARUNA ATABO CHRISTOPHER ◽

JIMOH YAKUBU ◽

SHAFI'I MUHAMMAD ABDULHAMID ◽

ABDULMALIK D. MOHAMMED ◽

◽

...

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Bayesian Classifier ◽

Feature Selection Technique ◽

Selection Technique ◽

Architectural Framework ◽

Ant Lion

A novel intrusion detection system for wireless mesh network with hybrid feature selection technique based on GA and MI

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-169421 ◽

2018 ◽

Vol 34 (3) ◽

pp. 1243-1250 ◽

Cited By ~ 7

Author(s):

R. Vijayanand ◽

D. Devaraj ◽

B. Kannapiran

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Wireless Mesh Network ◽

Detection System ◽

Mesh Network ◽

Feature Selection Technique ◽

Selection Technique ◽

Wireless Mesh

Fuzzy Rule-Based Layered Classifier and Entropy-Based Feature Selection for Intrusion Detection System

Handbook of Research on Cyber Crime and Information Privacy - Advances in Information Security, Privacy, and Ethics ◽

10.4018/978-1-7998-5728-0.ch015 ◽

2021 ◽

pp. 289-309

Author(s):

Devaraju Sellappan ◽

Ramakrishnan Srinivasan

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Fuzzy Rule ◽

Intrusion Detection Systems ◽

Rule Based ◽

Detection Systems ◽

Positive Rate ◽

Rule Based Classifier

Intrusion detection systems must detect vulnerabilities in a network consistently and also perform efficiently under a huge amount of traffic. They must be capable of detecting emerging and proactive threats in the network. Various classifiers are used to classify system activity as normal or intrusive. In this chapter, a layered fuzzy rule-based classifier is proposed to detect the various intrusions and improve the performance of the intrusion detection system, and fuzzy entropy-based feature selection is proposed to identify the relevant features. The KDD dataset contains various attacks, which are grouped into four classes: Denial-of-Service (DoS), Probe, Remote-to-Local (R2L), and User-to-Root (U2R). A real-time dataset is also considered in this research. Experimental results show that the proposed method provides a good detection rate, minimizes the false positive rate, and requires less computational time.

Review of Ensemble-Based Filter Feature Selection Techniques for Building Intrusion Detection System

Journal of Network Security Computer Networks ◽

10.46610/jonscn.2021.v07i02.003 ◽

2021 ◽

Vol 7 (2) ◽

Author(s):

Ishita Karna ◽

Aniket Madam ◽

Chinmay Deokule ◽

Rahul Adhao ◽

Vinod Pachghare

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Ensemble Learning ◽

Detection System ◽

Critical Role ◽

Monitoring Network ◽

Detection Systems ◽

Computational Overhead ◽

Optimal Subset ◽

Feature Selection Techniques

Intrusion detection systems (IDS) play a critical role in network security by monitoring network traffic for malicious activities and detecting vulnerability exploits against target applications or computers. A large number of redundant and irrelevant features increases the dimensionality of the dataset, which increases the computational overhead on the system and reduces its performance. This paper studies different filter-based feature selection techniques for improving system performance. Feature selection techniques are used to select well-performing subsets of features, and ensemble learning then selects an optimal subset by combining these multiple subsets. The paper explores this combination of feature selection and ensemble learning. It also reviews the performance of the algorithms implemented in existing research in terms of accuracy, false alarm rates, and true positive rates, and notes their shortcomings.
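One simple way to combine several filter-selected subsets into a single subset is a vote over the top-ranked features of each filter. The sketch below is one possible combination rule under stated assumptions, not a method taken from the review: the function names `top_k` and `ensemble_select`, the three score tables, the feature names, and the parameters `k` and `min_votes` are all hypothetical.

```python
from collections import Counter
from typing import Dict, List


def top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Return the k highest-scoring features (ties broken alphabetically)."""
    return sorted(scores, key=lambda f: (-scores[f], f))[:k]


def ensemble_select(score_tables: List[Dict[str, float]], k: int, min_votes: int) -> List[str]:
    """Keep features that appear in the top-k list of at least min_votes filters."""
    votes: Counter = Counter()
    for scores in score_tables:
        votes.update(top_k(scores, k))
    return sorted(f for f, v in votes.items() if v >= min_votes)


# Made-up relevance scores from three hypothetical filter methods
# (higher means more relevant); real filters would compute these from data.
filter_a = {"duration": 0.10, "src_bytes": 0.90, "dst_bytes": 0.70, "flag": 0.30}
filter_b = {"duration": 0.20, "src_bytes": 0.80, "dst_bytes": 0.10, "flag": 0.60}
filter_c = {"duration": 0.05, "src_bytes": 0.50, "dst_bytes": 0.60, "flag": 0.40}

selected = ensemble_select([filter_a, filter_b, filter_c], k=2, min_votes=2)
```

Here `src_bytes` is in the top two of all three filters and `dst_bytes` in two of them, so both survive a two-vote threshold while `flag` and `duration` are dropped. Setting `min_votes` to 1 gives the union of the subsets and setting it to the number of filters gives their intersection.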
