A New Feature Selection Algorithm for DNA Dataset

Author(s):  
Disha Khode ◽  
Antara Bhattacharya
Electronics ◽  
2020 ◽  
Vol 9 (1) ◽  
pp. 144 ◽  
Author(s):  
Yan Naung Soe ◽  
Yaokai Feng ◽  
Paulus Insap Santosa ◽  
Rudy Hartanto ◽  
Kouichi Sakurai

The widespread deployment of Internet of Things (IoT) devices makes our lives more convenient and industries more efficient. However, it also makes cyber-attacks much easier to carry out, because so many IoT devices are deployed and most of them lack the resources (i.e., computation and storage capacity) to run conventional intrusion detection systems (IDSs). In this study, a lightweight machine learning-based IDS using a new feature selection algorithm is designed and implemented on a Raspberry Pi, and its performance is verified using a public dataset collected from an IoT environment. To make the system lightweight, we propose a new feature selection algorithm, called the correlated-set thresholding on gain-ratio (CST-GR) algorithm, which selects only the features that are truly necessary. Because feature selection is conducted separately for three specific kinds of cyber-attacks, the number of selected features can be reduced significantly, which makes the classifiers very small and fast. More importantly, because the features that matter for each kind of attack are exploited, good detection performance can be expected. The performance of our proposal is examined in detail with different machine learning algorithms in order to learn which of them is the best option for our system. The experimental results indicate that the new feature selection algorithm selects only a very small number of features for each kind of attack. The detection system is therefore lightweight enough to be implemented and run in the Raspberry Pi environment with almost no sacrifice in detection performance.
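
The abstract does not spell out the steps of CST-GR, so the following is only a minimal sketch of the gain-ratio thresholding idea that the name suggests, not the authors' algorithm. It assumes discrete feature values and a binary label for a single kind of attack; the function names, the feature names (`flag`, `port_parity`) and the threshold value are all hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of the labels given the feature, divided by the
    feature's own entropy (its split information)."""
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0


def select_by_gain_ratio(columns, labels, threshold):
    """Keep the features whose gain ratio reaches the (assumed) threshold."""
    return [name for name, col in columns.items()
            if gain_ratio(col, labels) >= threshold]


# Hypothetical toy data: 1 = attack traffic, 0 = normal traffic.
labels = [1, 1, 0, 0]
columns = {
    "flag": [1, 1, 0, 0],         # perfectly separates the two classes
    "port_parity": [0, 1, 0, 1],  # carries no information about the label
}
selected = select_by_gain_ratio(columns, labels, threshold=0.5)
```

Running the selection once per attack type, each time with that attack's own labels, would give one small feature set per classifier, which is the property the abstract relies on to keep the classifiers small.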


2011 ◽  
Vol 24 (6) ◽  
pp. 904-914 ◽  
Author(s):  
Jieming Yang ◽  
Yuanning Liu ◽  
Zhen Liu ◽  
Xiaodong Zhu ◽  
Xiaoxu Zhang

2010 ◽  
Vol 139-141 ◽  
pp. 2506-2512 ◽  
Author(s):  
Sheng Li ◽  
Chun Liang Zhang ◽  
Xia Yue

To avoid losing useful information, this paper extracts feature information from the fault signals of rotating machinery in several domains: the amplitude domain, the time domain and the time-frequency domain. Because multi-dimensional feature extraction is prone to the “curse of dimensionality”, the FDR principle from data mining is introduced to measure the classification ability of each individual feature, and the cross-correlation coefficient is adopted to address the fact that evaluating features individually neglects the interrelationships between them. On this basis, a new feature selection algorithm is constructed. Finally, the resulting feature vectors are used to train an SVM model and to perform recognition. The experimental results show that the fault diagnosis system is valid and robust.
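
The abstract gives no formulas, so the sketch below shows one common way to combine the two ingredients it names, and should not be read as the paper's procedure. It assumes that FDR denotes Fisher's discriminant ratio for a two-class problem, that features are ranked by FDR, and that a candidate is skipped when its absolute cross-correlation with an already chosen feature exceeds a cutoff. The feature names (`rms`, `peak`, `kurtosis`), the data and the `max_corr` parameter are hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance


def fdr(values, labels):
    """Fisher's discriminant ratio of one feature for classes 0 and 1:
    squared difference of class means over the sum of class variances."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    denom = pvariance(a) + pvariance(b)
    return (mean(a) - mean(b)) ** 2 / denom if denom > 0 else float("inf")


def pearson(x, y):
    """Pearson cross-correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


def select_features(columns, labels, k, max_corr=0.9):
    """Greedy selection: visit features in descending FDR order and keep a
    feature only if it is not too correlated with any feature already kept."""
    ranked = sorted(columns, key=lambda name: fdr(columns[name], labels),
                    reverse=True)
    chosen = []
    for name in ranked:
        if all(abs(pearson(columns[name], columns[c])) <= max_corr
               for c in chosen):
            chosen.append(name)
        if len(chosen) == k:
            break
    return chosen


# Hypothetical toy data: 0 = healthy machine, 1 = faulty machine.
labels = [0, 0, 0, 1, 1, 1]
columns = {
    "rms": [1.0, 1.2, 0.8, 3.0, 3.2, 2.8],
    "peak": [2.0, 2.6, 1.4, 6.0, 6.6, 5.4],      # nearly a rescaled copy of rms
    "kurtosis": [3.0, 3.5, 2.5, 4.0, 3.4, 4.6],
}
chosen = select_features(columns, labels, k=2)
```

In this toy data `peak` has the second-highest FDR but is almost perfectly correlated with `rms`, so the greedy pass skips it and keeps the less discriminative but less redundant `kurtosis`. That is the kind of interrelationship a purely per-feature ranking would miss.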

