Feature Selection for Intrusion Detection with Neural Networks and Support Vector Machines

Author(s):  
Srinivas Mukkamala ◽  
Andrew H. Sung

Computational intelligence (CI) methods are increasingly being used for problem solving, and CI-type learning machines are being used for intrusion detection. Intrusion detection is a problem of general interest to transportation infrastructure protection, since one of its necessary tasks is to protect the computers responsible for the infrastructure’s operational control, and an effective intrusion detection system (IDS) is essential for ensuring network security. Two classes of learning machines for IDSs are studied: artificial neural networks (ANNs) and support vector machines (SVMs). SVMs are shown to be superior to ANNs in three critical respects of IDSs: SVMs train and run an order of magnitude faster; they scale much better; and they give higher classification accuracy. A related issue is ranking the importance of input features, which is itself a problem of great interest. Since elimination of the insignificant (or useless) inputs leads to a simplified problem and possibly faster and more accurate detection, feature selection is very important in intrusion detection. Two methods for feature ranking are presented: the first one is independent of the modeling tool, while the second method is specific to SVMs. The two methods were applied to identify the important features in the 1999 Defense Advanced Research Projects Agency intrusion data set. It was shown that the two methods produce results that are largely consistent. Experimental results indicated that SVM-based IDSs with a reduced number of features can deliver enhanced or comparable performance. An SVM-based IDS for class-specific detection is proposed.
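The abstract does not spell out how the model-independent ranking works. As a minimal sketch, assuming a leave-one-feature-out scheme (an assumption for illustration, not necessarily the authors' procedure), each feature can be scored by how much test accuracy drops when it is removed. The function names, the toy data, and the nearest-centroid classifier are all hypothetical stand-ins; any ANN or SVM evaluator could be passed in instead.

```python
import numpy as np

def nearest_centroid_accuracy(X_train, y_train, X_test, y_test):
    """Hypothetical stand-in classifier; an ANN or SVM could be used instead."""
    classes = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in classes])
    dists = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float(np.mean(classes[dists.argmin(axis=1)] == y_test))

def rank_features_by_removal(X_train, y_train, X_test, y_test, evaluate):
    """Return (feature index, accuracy drop) pairs, most important first."""
    baseline = evaluate(X_train, y_train, X_test, y_test)
    drops = []
    for j in range(X_train.shape[1]):
        keep = np.arange(X_train.shape[1]) != j      # drop feature j only
        acc = evaluate(X_train[:, keep], y_train, X_test[:, keep], y_test)
        drops.append(baseline - acc)                 # large drop = important feature
    order = np.argsort(drops)[::-1]
    return [(int(j), float(drops[j])) for j in order]

# Toy data (assumed): 5 features, only feature 0 carries class information.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 400)
X = rng.normal(size=(400, 5))
X[:, 0] += 3.0 * y
ranking = rank_features_by_removal(X[:300], y[:300], X[300:], y[300:],
                                   nearest_centroid_accuracy)
```

Because the evaluator is passed in as a function, the ranking loop itself does not depend on which learning machine is used.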

2011 ◽  
Vol 38 (1) ◽  
pp. 306-313 ◽  
Author(s):  
Shi-Jinn Horng ◽  
Ming-Yang Su ◽  
Yuan-Hsin Chen ◽  
Tzong-Wann Kao ◽  
Rong-Jian Chen ◽  
...  

Author(s):  
Mohammad Reza Daliri

In this article, we propose a feature selection strategy that uses a binary particle swarm optimization algorithm for the diagnosis of different medical diseases. Support vector machines were used for the fitness function of the binary particle swarm optimization. We evaluated the proposed method on four databases from the machine learning repository: the single photon emission computed tomography heart database, the Wisconsin breast cancer data set, the Pima Indians diabetes database, and the Dermatology data set. The results indicate that, with fewer selected features, we obtained a higher accuracy in diagnosing heart, cancer, diabetes, and erythematosquamous diseases. The results were compared with the traditional feature selection methods, namely the F-score and the information gain, and our method achieved superior accuracy. Compared with the genetic algorithm for feature selection, the proposed method shows higher accuracy on all data sets except one. In addition, in comparison with other methods that used the same data, our approach achieves higher performance using fewer features.
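The search described above can be sketched as follows. This is a minimal illustration of binary particle swarm optimization over feature masks, not the authors' implementation: the inertia and acceleration constants, the swarm size, and the toy data are assumptions, and the fitness function is a nearest-centroid classifier swapped in for the SVM so that the example needs only NumPy.

```python
import numpy as np

def centroid_accuracy(X, y, mask):
    """Stand-in fitness: training accuracy of a nearest-centroid classifier
    on the selected columns (the abstract uses an SVM here instead)."""
    if not mask.any():
        return 0.0                                   # empty feature set is useless
    Xs = X[:, mask]
    centroids = np.stack([Xs[y == c].mean(axis=0) for c in (0, 1)])
    dists = ((Xs[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return float(np.mean(dists.argmin(axis=1) == y))

def binary_pso(X, y, fitness, n_particles=12, n_iters=30, seed=0):
    rng = np.random.default_rng(seed)
    shape = (n_particles, X.shape[1])
    pos = (rng.random(shape) < 0.5).astype(float)    # 1 = feature selected
    vel = np.zeros(shape)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(X, y, p.astype(bool)) for p in pos])
    g = pbest_fit.argmax()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(n_iters):
        r1, r2 = rng.random(shape), rng.random(shape)
        # Standard PSO velocity update (constants are assumed values).
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        vel = np.clip(vel, -4.0, 4.0)
        # Binary step: each bit is set to 1 with probability sigmoid(velocity).
        pos = (rng.random(shape) < 1.0 / (1.0 + np.exp(-vel))).astype(float)
        fit = np.array([fitness(X, y, p.astype(bool)) for p in pos])
        improved = fit > pbest_fit
        pbest[improved], pbest_fit[improved] = pos[improved], fit[improved]
        g = pbest_fit.argmax()
        if pbest_fit[g] > gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest.astype(bool), float(gbest_fit)

# Toy data (assumed): 6 features, only the first two are informative.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, 200)
X = rng.normal(size=(200, 6))
X[:, :2] += 3.0 * y[:, None]
best_mask, best_fit = binary_pso(X, y, centroid_accuracy)
```

This fitness rewards accuracy only. A term penalizing the number of selected features could be added if smaller subsets are preferred.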


An intrusion detection system (IDS) monitors a network or data for abnormal activity and issues an alert when such activity is discovered. Numerous IDS techniques are in use today, but a major problem with all of them is their performance. Various works have addressed this issue using support vector machines and multilayer perceptrons. Supervised learning models such as support vector machines, with their associated learning algorithms, are used to analyze data for both regression and classification. An IDS must analyze large volumes of traffic to check for suspicious activity, and it must do so successfully; hence, an efficient and fast classification algorithm is required. Machine learning techniques such as neural networks and extreme learning machines (ELMs) are used, and both are highly regarded. Extreme learning machines are feedforward neural networks with a single hidden layer that are trained for classification without backpropagation. Once an intrusion is detected by the ELM-based IDS, the type of intrusion is then identified efficiently using the random forest technique (multiclass classification), with a higher rate of accuracy and precision. The well-known NSL_KDD dataset is used for training and testing these IDS algorithms. This work finds that, compared with artificial neural networks and logistic regression, extreme learning machines provide a much better intrusion detection rate, 93.96%, and are also more efficient, with an execution time of 38 seconds.
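To make the "one hidden layer, no backpropagation" idea concrete, here is a minimal sketch of an extreme learning machine. The hidden weights are drawn at random and left fixed, and only the output weights are fitted, in closed form, by least squares. The hidden-layer size, the tanh activation, the function names, and the toy two-class data are assumptions for illustration. This is not the paper's NSL_KDD pipeline, and it omits the random forest stage.

```python
import numpy as np

def train_elm(X, y, n_hidden=20, n_classes=2, seed=0):
    """Fit an ELM: random fixed hidden layer, least-squares output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights, never trained
    b = rng.normal(size=n_hidden)                # random hidden biases, never trained
    H = np.tanh(X @ W + b)                       # hidden-layer activations
    T = np.eye(n_classes)[y]                     # one-hot targets
    beta = np.linalg.pinv(H) @ T                 # closed-form solve, no backpropagation
    return W, b, beta

def predict_elm(model, X):
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

# Toy data (assumed): two well-separated Gaussian clusters in 4 dimensions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 1.0, (100, 4)), rng.normal(2.0, 1.0, (100, 4))])
y = np.array([0] * 100 + [1] * 100)
model = train_elm(X, y)
accuracy = float(np.mean(predict_elm(model, X) == y))
```

Training reduces to a single pseudo-inverse computation, which is consistent with the short execution time the abstract attributes to ELMs.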

