Efficient Classification of DDoS Attacks Using an Ensemble Feature Selection Algorithm

2017 ◽  
Vol 29 (1) ◽  
pp. 71-83 ◽  
Author(s):  
Khundrakpam Johnson Singh ◽  
Tanmay De

Abstract In the current cyber world, distributed denial of service (DDoS) attacks are among the most severe cyber threats, making websites and other online resources unavailable to legitimate clients. Unlike other cyber threats that breach security parameters, DDoS is a short-term attack that brings down the server temporarily. Appropriate selection of features plays a crucial role in the effective detection of DDoS attacks. Too many irrelevant features not only produce unrelated class categories but also increase computation overhead. In this article, we propose an ensemble feature selection algorithm to determine which attributes in the given training datasets are efficient in categorizing the classes. The result of the ensemble algorithm is compared with a threshold value to decide which features to select. The selected features are used as training inputs for various classifiers in order to find the classifier that yields maximum accuracy. We use a multilayer perceptron as the final classifier, as it provides better accuracy than other conventional classification models. The proposed method classifies new datasets into either attack or normal classes with an efficiency of 98.3% and also reduces the overall computation time. We use the CAIDA 2007 dataset to evaluate the performance of the proposed method using MATLAB and Weka 3.6 simulators.
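The abstract does not name the individual selectors or the rule used to combine them, so the following is only a minimal sketch of the general idea, under stated assumptions: each selector produces one score per feature, scores are min-max normalized, the ensemble score is their mean, and a feature is kept when that mean reaches the threshold. The feature names, the scores, and the 0.5 threshold are all hypothetical.

```python
from statistics import mean


def min_max_normalize(scores):
    """Rescale a {feature: score} mapping to the [0, 1] range."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so no feature is preferred over another.
        return {name: 0.0 for name in scores}
    return {name: (value - lo) / (hi - lo) for name, value in scores.items()}


def ensemble_select(score_tables, threshold):
    """Average normalized scores across selectors; keep features at or above threshold.

    Assumption: averaging is the combining rule. The source abstract does not
    specify how the ensemble result is formed.
    """
    normalized = [min_max_normalize(table) for table in score_tables]
    features = list(normalized[0])
    combined = {
        name: mean([table[name] for table in normalized]) for name in features
    }
    selected = sorted(name for name, score in combined.items() if score >= threshold)
    return combined, selected


# Hypothetical per-feature scores from three hypothetical selectors.
selector_a = {"pkt_rate": 0.9, "src_entropy": 0.6, "ttl": 0.1, "pkt_size": 0.3}
selector_b = {"pkt_rate": 40.0, "src_entropy": 35.0, "ttl": 5.0, "pkt_size": 12.0}
selector_c = {"pkt_rate": 0.5, "src_entropy": 0.4, "ttl": 0.1, "pkt_size": 0.2}

combined, selected = ensemble_select(
    [selector_a, selector_b, selector_c], threshold=0.5
)
print(selected)
```

Normalizing before averaging keeps a selector with a large numeric range (here `selector_b`) from dominating the ensemble score.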

Diabetes has become a serious problem nowadays, so serious precautions are needed to eradicate it. To do so, we must first know its level of occurrence. In this project, we predict the level of occurrence of diabetes using Random Forest, a machine learning algorithm. Using patients' Electronic Health Records (EHR), we can build accurate models that predict the presence of diabetes.


2020 ◽  
Author(s):  
Maria Irmina Prasetiyowati ◽  
Nur Ulfa Maulidevi ◽  
Kridanto Surendro

Abstract Feature selection is a preprocessing technique that aims to remove unnecessary features and speed up the algorithm's work process. One feature selection technique calculates the information gain value of each feature in a dataset. A threshold value is then applied to the information gain values to select features. Generally, the threshold value is chosen freely, or a value of 0.05 is used. This study proposes determining the threshold value using the standard deviation of the information gain values generated by the features in the dataset. The determination of this threshold value was tested on ten original datasets and on datasets that had been transformed by FFT and IFFT, then classified using Random Forest. The average accuracy and the average time required for Random Forest classification using the proposed threshold value are better than those obtained from feature selection with a threshold value of 0.05 and from the Correlation-Based Feature Selection algorithm. Likewise, the average accuracy of the proposed threshold on the transformed datasets is better than that of the threshold value of 0.05 and the Correlation-Based Feature Selection algorithm. However, the average time required is higher (slower).
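The thresholding idea can be illustrated with a small sketch: compute the information gain of each feature, take the standard deviation of those values as the threshold, and keep the features whose gain exceeds it. Several details here are assumptions not stated in the abstract: features are treated as discrete, the population standard deviation is used, and the comparison is strict. The toy dataset and feature names are hypothetical.

```python
import math
from collections import Counter
from statistics import pstdev


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(column, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(column, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_by_std_threshold(dataset, labels):
    """Keep features whose information gain exceeds the std. dev. of all gains.

    Assumptions: population standard deviation and a strict '>' comparison.
    """
    gains = {name: information_gain(col, labels) for name, col in dataset.items()}
    threshold = pstdev(list(gains.values()))
    kept = [name for name, gain in gains.items() if gain > threshold]
    return gains, threshold, kept


# Hypothetical toy data: one predictive, one weak, and one constant feature.
labels = ["yes", "yes", "no", "no", "yes", "no"]
dataset = {
    "strong": ["x", "x", "y", "y", "x", "y"],
    "weak": ["p", "q", "p", "q", "p", "q"],
    "constant": ["k", "k", "k", "k", "k", "k"],
}

gains, threshold, kept = select_by_std_threshold(dataset, labels)
print(kept)
```

Because the threshold is derived from the spread of the gains themselves, it adapts to each dataset instead of relying on a fixed value such as 0.05.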


Genomics ◽  
2020 ◽  
Vol 112 (2) ◽  
pp. 1916-1925 ◽  
Author(s):  
Pilar García-Díaz ◽  
Isabel Sánchez-Berriel ◽  
Juan A. Martínez-Rojas ◽  
Ana M. Diez-Pascual
