Fuzzy-rough Information Gain Ratio Approach to Filter-wrapper Feature Selection

2017 ◽  
Vol 30 (9) ◽  
2021 ◽  
Vol 8 (3) ◽  
pp. 457
Author(s):  
Nitami Lestari Putri ◽  
Radityo Adi Nugroho ◽  
Rudy Herteno

<p><em>Intrusion Detection System</em> merupakan suatu sistem yang dikembangkan untuk memantau dan memfilter aktivitas jaringan dengan mengidentifikasi serangan. Karena jumlah data yang perlu diperiksa oleh IDS sangat besar dan banyaknya fitur-fitur asing yang dapat membuat proses analisis menjadi sulit untuk mendeteksi pola perilaku yang mencurigakan, maka IDS perlu mengurangi jumlah data yang akan diproses dengan cara mengurangi fitur yang dapat dilakukan dengan seleksi fitur. Pada penelitian ini mengkombinasikan dua metode perangkingan fitur yaitu <em>Information Gain Ratio </em>dan <em>Correlation </em>dan mengklasifikasikannya menggunakan algoritma <em>K-Nearest Neighbor</em>. Hasil perankingan dari kedua metode dibagi menjadi dua kelompok. Pada kelompok pertama dicari nilai mediannya dan untuk kelompok kedua dihapus. Lalu dilakukan klasifikasi <em>K-Nearest Neighbor</em> dengan menggunakan 10 kali validasi silang dan dilakukan pengujian dengan nilai k=5. Penerapan pemodelan yang diusulkan menghasilkan akurasi tertinggi sebesar 99.61%. Sedangkan untuk akurasi tanpa seleksi fitur menghasilkan akurasi tertinggi sebesar 99.59%.</p><p> </p><p class="Judul2"><strong><em>Abstract</em></strong></p><p class="Abstract"><em>Intrusion Detection System is a system that was developed for monitoring and filtering activity in network with identified of attack. Because of the amount of the data that need to be checked by IDS is very large and many foreign feature that can make the analysis process difficult for detection suspicious pattern of behavior, so that IDS need for reduce amount of the data to be processed by reducing features that can be done by feature selection. In this study, combines two methods of feature ranking is Information Gain Ratio and Correlation and classify it using K-Nearest Neighbor algorithm. The result of feature ranking from the both methods divided into two groups. in the first group searched for the median value and in the second group is removed. Then do the classification of  K-Nearest Neighbor using 10 fold cross validation and do the tests with values k=5. The result of the  proposed modelling produce the highest accuracy of 99.61%. While the highest accuracy value of the not using the feature selection is 99.59%.</em></p>


2021 ◽  
Vol 11 (5) ◽  
pp. 7714-7719
Author(s):  
S. Nuanmeesri ◽  
W. Sriurai

The goal of the current study is to develop a diagnosis model for chili pepper disease diagnosis by applying filter and wrapper feature selection methods as well as a Multi-Layer Perceptron Neural Network (MLPNN). The data used for developing the model include 1) types, 2) causative agents, 3) areas of infection, 4) growth stages of infection, 5) conditions, 6) symptoms, and 7) 14 types of chili pepper diseases. These datasets were applied to the 3 feature selection techniques, including information gain, gain ratio, and wrapper. After selecting the key features, the selected datasets were utilized to develop the diagnosis model towards the application of MLPNN. According to the model’s effectiveness evaluation results, estimated by 10-fold cross-validation, it can be seen that the diagnosis model developed by applying the wrapper method along with MLPNN provided the highest level of effectiveness, with an accuracy of 98.91%, precision of 98.92%, and recall of 98.89%. The findings showed that the developed model is applicable.


Sign in / Sign up

Export Citation Format

Share Document