An Approach to Feature Selection in Intrusion Detection Systems Using Machine Learning Algorithms

2020 ◽  
Vol 16 (4) ◽  
pp. 48-58
Author(s):  
Kavitha G. ◽  
Elango N. M.

The rapid development of various services that are provided by information technology has been widely accepted by the users who are making use of such services in their day-to-day life activities. Securing such a system application from various intrusions still remains to be a one of the major issues in the current era. Detecting such anomalies from the regular events involves various steps such as data pre-processing, feature selection, and classification. Many of the computational models intend to accurately discriminate the samples of each group for better classification by identifying candidate features prior to the learning phase. This research studies the implementation of a combined feature selection technique such as the GRRF-FWSVM method which is applied to the benchmarked anomaly detection dataset KDD CUP 99. The results prove the novel proposed hybrid model is an effective method in identifying anomalies and it increases the detection rate of about 98.55% of the intrusion detection system with the two most common benchmark models.

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Joffrey L. Leevy ◽  
John Hancock ◽  
Richard Zuech ◽  
Taghi M. Khoshgoftaar

AbstractMachine learning algorithms efficiently trained on intrusion detection datasets can detect network traffic capable of jeopardizing an information system. In this study, we use the CSE-CIC-IDS2018 dataset to investigate ensemble feature selection on the performance of seven classifiers. CSE-CIC-IDS2018 is big data (about 16,000,000 instances), publicly available, modern, and covers a wide range of realistic attack types. Our contribution is centered around answers to three research questions. The first question is, “Does feature selection impact performance of classifiers in terms of Area Under the Receiver Operating Characteristic Curve (AUC) and F1-score?” The second question is, “Does including the Destination_Port categorical feature significantly impact performance of LightGBM and Catboost in terms of AUC and F1-score?” The third question is, “Does the choice of classifier: Decision Tree (DT), Random Forest (RF), Naive Bayes (NB), Logistic Regression (LR), Catboost, LightGBM, or XGBoost, significantly impact performance in terms of AUC and F1-score?” These research questions are all answered in the affirmative and provide valuable, practical information for the development of an efficient intrusion detection model. To the best of our knowledge, we are the first to use an ensemble feature selection technique with the CSE-CIC-IDS2018 dataset.


Author(s):  
Amudha P. ◽  
Sivakumari S.

In recent years, the field of machine learning grows very fast both on the development of techniques and its application in intrusion detection. The computational complexity of the machine learning algorithms increases rapidly as the number of features in the datasets increases. By choosing the significant features, the number of features in the dataset can be reduced, which is critical to progress the classification accuracy and speed of algorithms. Also, achieving high accuracy and detection rate and lowering false alarm rates are the major challenges in designing an intrusion detection system. The major motivation of this work is to address these issues by hybridizing machine learning and swarm intelligence algorithms for enhancing the performance of intrusion detection system. It also emphasizes applying principal component analysis as feature selection technique on intrusion detection dataset for identifying the most suitable feature subsets which may provide high-quality results in a fast and efficient manner.


2019 ◽  
Vol 16 (8) ◽  
pp. 3603-3607 ◽  
Author(s):  
Shraddha Khonde ◽  
V. Ulagamuthalvi

Considering current network scenario hackers and intruders has become a big threat today. As new technologies are emerging fast, extensive use of these technologies and computers, what plays an important role is security. Most of the computers in network can be easily compromised with attacks. Big issue of concern is increase in new type of attack these days. Security to the sensitive data is very big threat to deal with, it need to consider as high priority issue which should be addressed immediately. Highly efficient Intrusion Detection Systems (IDS) are available now a days which detects various types of attacks on network. But we require the IDS which is intelligent enough to detect and analyze all type of new threats on the network. Maximum accuracy is expected by any of this intelligent intrusion detection system. An Intrusion Detection System can be hardware or software that analyze and monitors all activities of network to detect malicious activities happened inside the network. It also informs and helps administrator to deal with malicious packets, which if enters in network can harm more number of computers connected together. In our work we have implemented an intellectual IDS which helps administrator to analyze real time network traffic. IDS does it by classifying packets entering into the system as normal or malicious. This paper mainly focus on techniques used for feature selection to reduce number of features from KDD-99 dataset. This paper also explains algorithm used for classification i.e., Random Forest which works with forest of trees to classify real time packet as normal or malicious. Random forest makes use of ensembling techniques to give final output which is derived by combining output from number of trees used to create forest. Dataset which is used while performing experiments is KDD-99. This dataset is used to train all trees to get more accuracy with help of random forest. From results achieved we can observe that random forest algorithm gives more accuracy in distributed network with reduced false alarm rate.


Author(s):  
Devaraju Sellappan ◽  
Ramakrishnan Srinivasan

Intrusion detection systems must detect the vulnerability consistently in a network and also perform efficiently with the huge amount of traffic. Intrusion detection systems must be capable of detecting emerging and proactive threats in the networks. Various classifiers are used to classify the threats as normal or intrusive by supervising the system activity. In this chapter, layered fuzzy rule-based classifier is proposed to detect the various intrusions, and fuzzy entropy-based feature selection is proposed to identify the relevant features. Layered fuzzy rule-based classifier is proposed to improve the performance of the intrusion detection system. KDD dataset contains various attacks; these attacks are grouped into four classes, namely Denial-of-Service (DoS), Probe, Remote-to-Local (R2L), and User-to-Root (U2R). Real-time dataset is also considered in this research. Experimental result shows that the proposed method provides good detection rate, minimizes the false positive rate, and less computational time.


Author(s):  
Ishita Karna ◽  
Aniket Madam ◽  
Chinmay Deokule ◽  
Rahul Adhao ◽  
Vinod Pachghare

Intrusion detection systems (IDS) play a critical role in network security by monitoring network traffic for malicious activities and detecting vulnerability exploits against target applications or computers. A large number of redundant and irrelevant features increase the dimensionality of the dataset, which increases the computational overhead on the system and reduces its performance. This paper studies different filter-based feature selection techniques to improve performance of system. Feature selection techniques are used to select a well performing subset of features followed by technique of ensemble learning, which selects an optimal subset of features by combining multiple subsets of features. Feature selection combined with ensemble learning is explored in this paper. The performance of the algorithms implemented in existing research in terms of accuracy, false alarm rates, and true positive rates is explored, and their shortcomings are observed.


Sign in / Sign up

Export Citation Format

Share Document