An empirical study to estimate the stability of random forest classifier on the hybrid features recommended by filter based feature selection technique

2019 ◽  
Vol 11 (2) ◽  
pp. 339-358
Author(s):  
S. L. Shiva Darshan ◽  
C. D. Jaidhar
Cybersecurity ◽  
2022 ◽  
Vol 5 (1) ◽  
Author(s):  
Raisa Abedin Disha ◽  
Sajjad Waheed

Abstract: To protect networks, resources, and sensitive data, the intrusion detection system (IDS) has become a fundamental component of organizations for preventing cybercriminal activity. Several approaches have been introduced and implemented to thwart malicious activities so far. Because machine learning (ML) methods have proven effective, the proposed approach applies several ML models to intrusion detection. To evaluate model performance, the UNSW-NB 15 and Network TON_IoT datasets were used for offline analysis. Both datasets are newer than the NSL-KDD dataset and better represent modern-day attacks. The performance analysis was carried out by training and testing Decision Tree (DT), Gradient Boosting Tree (GBT), Multilayer Perceptron (MLP), AdaBoost, Long Short-Term Memory (LSTM), and Gated Recurrent Unit (GRU) models on the binary classification task. Because IDS performance deteriorates with a high-dimensional feature vector, an optimal set of features was selected using a Gini Impurity-based Weighted Random Forest (GIWRF) model as the embedded feature selection technique. This technique uses Gini impurity as the splitting criterion of the trees and adjusts the weights of the two classes of the imbalanced data so that the learning algorithm accounts for the class distribution. Based on the importance scores, 20 features were selected from UNSW-NB 15 and 10 features from the Network TON_IoT dataset. The experimental results showed that DT performed better with the feature selection technique than the other models trained in this experiment. Moreover, the proposed GIWRF-DT outperformed the existing methods surveyed in the literature in terms of F1 score.
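The class-weighted Gini impurity described in the abstract can be illustrated with a minimal sketch. This is not the authors' GIWRF implementation: the function names are hypothetical, the "balanced" weighting formula (n_samples / (n_classes * class_count)) is an assumed choice of class weights, and a single threshold split stands in for the many splits a random forest would aggregate into importance scores.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Assumed weighting: n_samples / (n_classes * class_count) per class."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def weighted_gini(labels, weights):
    """Gini impurity 1 - sum(p_c^2), where p_c is the weighted share of class c."""
    totals = Counter()
    for y in labels:
        totals[y] += weights[y]
    total = sum(totals.values())
    if total == 0:
        return 0.0
    return 1.0 - sum((w / total) ** 2 for w in totals.values())


def impurity_decrease(values, labels, threshold, weights):
    """Weighted Gini decrease from splitting one feature at `threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]

    def mass(ys):
        # Total sample weight carried by a group of labels.
        return sum(weights[y] for y in ys)

    children = (mass(left) * weighted_gini(left, weights)
                + mass(right) * weighted_gini(right, weights)) / mass(labels)
    return weighted_gini(labels, weights) - children


# Toy imbalanced data: 8 "normal" (0) records and 2 "attack" (1) records.
labels = [0] * 8 + [1] * 2
weights = balanced_class_weights(labels)
uniform = {0: 1.0, 1: 1.0}

informative = [0.0] * 8 + [1.0] * 2      # separates the classes perfectly
uninformative = [0.0, 1.0] * 5           # unrelated to the label

print(weighted_gini(labels, uniform), weighted_gini(labels, weights))
print(impurity_decrease(informative, labels, 0.5, weights))
print(impurity_decrease(uninformative, labels, 0.5, weights))
```

With the weights applied, the minority class contributes as much mass to the impurity as the majority class, so a split that isolates the rare attack records earns a larger decrease than it would under uniform weights. Ranking features by such decreases, accumulated over the trees of a forest, is the general idea behind impurity-based importance scores.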


Author(s):  
Hua Tang ◽  
Chunmei Zhang ◽  
Rong Chen ◽  
Po Huang ◽  
Chenggang Duan ◽  
...  

Author(s):  
Uttamarani Pati ◽  
Papia Ray ◽  
Arvind R. Singh

Abstract: Very short term load forecasting (VSTLF) plays a pivotal role in helping utility workers make sound decisions about generation scheduling, the size of the spinning reserve, and maintaining equilibrium between the power generated by the utility and the load demand. However, developing an effective VSTLF model is challenging because real-time data are noisy and load demand variations exhibit complicated features over time. This research work proposes a hybrid approach for VSTLF that combines an incomplete fuzzy decision system (IFDS) with a genetic algorithm (GA) based feature selection technique for hour-ahead load forecasting. The proposed work aims to determine the relevant load features and eliminate redundant ones to form a less complex forecasting model. The proposed method takes the time of day, temperature, humidity, and dew point as inputs and outputs the forecasted load. The input data and historical load data were collected from the Northern Regional Load Dispatch Centre (NRLDC), New Delhi, for December 2009, January 2010, and February 2010. To validate the efficacy of the proposed method, its performance is compared with that of conventional AI techniques such as ANN and ANFIS, which are also integrated with GA-based feature selection to boost their performance. The accuracy of these techniques is measured by their mean absolute percentage error (MAPE) and normalized root mean square error (nRMSE). Compared with the conventional AI techniques and with methods reported in previous studies, the proposed method is found to have acceptable accuracy for 1 h ahead electrical load forecasting.
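The two error measures named in the abstract can be sketched as follows. The abstract does not say how nRMSE is normalized, so dividing the RMSE by the mean of the actual load is an assumption here (normalizing by the range of the actual values is another common convention). The function names and the sample values are illustrative only and are not taken from the NRLDC data.

```python
import math


def mape(actual, forecast):
    """Mean absolute percentage error, in percent. Actual values must be nonzero."""
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


def nrmse(actual, forecast):
    """RMSE divided by the mean actual load (assumed normalization)."""
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    rmse = math.sqrt(sum(squared) / len(squared))
    return rmse / (sum(actual) / len(actual))


# Illustrative hourly loads (arbitrary units), not real measurements.
actual_load = [100.0, 200.0]
forecast_load = [110.0, 180.0]

print(mape(actual_load, forecast_load))
print(nrmse(actual_load, forecast_load))
```

MAPE weights every hour equally in relative terms, while nRMSE penalizes large absolute misses more heavily, which is why forecasting studies often report both.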

