Classification Models and Hybrid Feature Selection Method to Improve Crop Performance

In this paper classification models and hybrid feature selection methods are implanted on benchmark dataset on the Mango and Maize. Particle Swarm Optimization–Support Vector Machine (PSO-SVM) classification algorithm for the selection of important features from the Mango and Maize datasets to analysis and also compare with the novel classification techniques. Various experiments conducted on these datasets, provide more generated rules and high selection of features using PSO-SVM algorithm and Fuzzy Decision Tree. The proposed method yield high accuracy output as compared to the existing methods with minimum Error Rate and Maximum Positive Rate.

Download Full-text

Implementasi teknik seleksi fitur pada klasifikasi malware Android menggunakan support vector machine (SVM)

Repositor ◽

10.22219/repositor.v1i1.1 ◽

2019 ◽

Vol 1 (1) ◽

pp. 1

Author(s):

Hendra Saputra ◽

Setio Basuki ◽

Mahar Faiqurahman

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Feature Selection ◽

Feature Selection Method ◽

Selection Method ◽

Support Vector ◽

Chi Square ◽

Android Malware ◽

Correlation Based Feature Selection ◽

Selection Of

AbstrakPertumbuhan Malware Android telah meningkat secara signifikan seiring dengan majunya jaman dan meninggkatnya keragaman teknik dalam pengembangan Android. Teknik Machine Learning adalah metode yang saat ini bisa kita gunakan dalam memodelkan pola fitur statis dan dinamis dari Malware Android. Dalam tingkat keakurasian dari klasifikasi jenis Malware peneliti menghubungkan antara fitur aplikasi dengan fitur yang dibutuhkan dari setiap jenis kategori Malware. Kategori jenis Malware yang digunakan merupakan jenis Malware yang banyak beredar saat ini. Untuk mengklasifikasi jenis Malware pada penelitian ini digunakan Support Vector Machine (SVM). Jenis SVM yang akan digunakan adalah class SVM one against one menggunakan Kernel RBF. Fitur yang akan dipakai dalam klasifikasi ini adalah Permission dan Broadcast Receiver. Untuk meningkatkan akurasi dari hasil klasifikasi pada penelitian ini digunakan metode Seleksi Fitur. Seleksi Fitur yang digunakan ialah Correlation-based Feature Selection (CSF), Gain Ratio (GR) dan Chi-Square (CHI). Hasil dari Seleksi Fitur akan di evaluasi bersama dengan hasil yang tidak menggunakan Seleksi Fitur. Akurasi klasifikasi Seleksi Fitur CFS menghasilkan akurasi sebesar 90.83% , GR dan CHI sebesar 91.25% dan data yang tidak menggunakan Seleksi Fitur sebesar 91.67%. Hasil dari pengujian menunjukan bahwa Permission dan Broadcast Receiver bisa digunakan dalam mengklasifikasi jenis Malware, akan tetapi metode Seleksi Fitur yang digunakan mempunyai akurasi yang berada sedikit dibawah data yang tidak menggunakan Seleksi Fitur. Kata kunci: klasifikasi malware android, seleksi fitur, SVM dan multi class SVM one agains one Abstract Android Malware has growth significantly along with the advance of the times and the increasing variety of technique in the development of Android. Machine Learning technique is a method that now we can use in the modeling the pattern of a static and dynamic feature of Android Malware. In the level of accuracy of the Malware type classification, the researcher connect between the application feature with the feature required by each types of Malware category. The category of malware used is a type of Malware that many circulating today, to classify the type of Malware in this study used Support Vector Machine (SVM). The SVM type wiil be used is class SVM one against one using the RBF Kernel. The feature will be used in this classification are the Permission and Broadcast Receiver. To improve the accuracy of the classification result in this study used Feature Selection method. Selection of feature used are Correlation-based Feature Selection (CFS), Gain Ratio (GR) and Chi-Square (CHI). Result from Feature Selection will be evaluated together with result that not use Feature Selection. Accuracy Classification Feature Selection CFS result accuracy of 90.83%, GR and CHI of 91.25% and data that not use Feature Selection of 91.67%. The result of testing indicate that permission and broadcast receiver can be used in classyfing type of Malware, but the Feature Selection method that used have accuracy is a little below the data that are not using Feature Selection. Keywords: Classification Android Malware, Feature Selection, SVM and Multi Class SVM one against one

Download Full-text

A Hybrid Approach for Feature Selection Based on Genetic Algorithm and Recursive Feature Elimination

International Journal of Information System Modeling and Design ◽

10.4018/ijismd.2021040102 ◽

2021 ◽

Vol 12 (2) ◽

pp. 17-38

Author(s):

Pooja Rani ◽

Rajneesh Kumar ◽

Anurag Jain ◽

Sunil Kumar Chawla

Keyword(s):

Machine Learning ◽

Genetic Algorithm ◽

Feature Selection ◽

Hybrid Approach ◽

Feature Selection Method ◽

Classification Systems ◽

Recursive Feature Elimination ◽

Support Vector ◽

Sensitivity Specificity ◽

Selection Of

Machine learning has become an integral part of our life in today's world. Machine learning when applied to real-world applications suffers from the problem of high dimensional data. Data can have unnecessary and redundant features. These unnecessary features affect the performance of classification systems used in prediction. Selection of important features is the first step in developing any decision support system. In this paper, the authors have proposed a hybrid feature selection method GARFE by integrating GA (genetic algorithm) and RFE (recursive feature elimination) algorithms. Efficiency of proposed method is analyzed using support vector machine classifier on the scale of accuracy, sensitivity, specificity, precision, F-measure, and execution time parameters. Proposed GARFE method is also compared to eight other feature selection methods. Results demonstrate that the proposed GARFE method has increased the performance of classification systems by removing irrelevant and redundant features.

Download Full-text

Reputation Scoring Fake News Using Text Mining

ACMIT Proceedings ◽

10.33555/acmit.v4i1.52 ◽

2017 ◽

Vol 4 (1) ◽

pp. 12-17

Author(s):

Ahmad Firdaus

Keyword(s):

Feature Selection ◽

Decision Tree ◽

Text Categorization ◽

Information Gain ◽

Feature Selection Method ◽

Support Vector ◽

Stable Level ◽

Vector Machines ◽

Selection Of

The classification of hoax news or news with incorrect information is one of the text categorization applications.Like text-based categorization of machine applications in general, this system consists of pre-processing andexecution of classification models. In this study, experiments were conducted to select the best technique in each sub-process by using 1200 articles hoax and 600 articles no hoax collected manually. This research Triedexperimenting to determine the best preprocessing stages between stop removals and stemming and showing the results of the deception Tree algorithm achieving an accuracy of 100% concluded above naive byes more stable level of accuracy in the number of datasets used in all candidates. Information gain, TFIDF and GGA based on using Naive Byes algorithm, supporting Vector Machine and Decision Tree no significant percentage change occurred on all candidates. But after using GGA (Optimize Generation) feature selection there is an increase of accuracy level The results of a comparison of classification algorithms between Naive Byes, decision trees and Support Vector machines combined with the GGA feature selection method for classifying the best result is generated by the selection of GGA + Decision Tree feature on candidate 2 (Paslon2) 100% and in the selection of the Information Gain + Decision Tree Feature selection with the lowest accuracy Candidate 3 at 36.67%, but overall improvement of accuracy Occurred on all algorithm after using feature selection and Naive byes more stable level of accuracy in the number of datasets used in all candidates.

Download Full-text

A fuzzy gaussian rank aggregation ensemble feature selection method for microarray data

International Journal of Knowledge-based and Intelligent Engineering Systems ◽

10.3233/kes-190134 ◽

2021 ◽

Vol 24 (4) ◽

pp. 289-301

Author(s):

B. Venkatesh ◽

J. Anuradha

Keyword(s):

Feature Selection ◽

Microarray Data ◽

Classification Accuracy ◽

Performance Metrics ◽

Feature Selection Method ◽

Selection Method ◽

Support Vector ◽

Svm Classifier ◽

Binary Particle Swarm Optimization ◽

Selection Methods

In Microarray Data, it is complicated to achieve more classification accuracy due to the presence of high dimensions, irrelevant and noisy data. And also It had more gene expression data and fewer samples. To increase the classification accuracy and the processing speed of the model, an optimal number of features need to extract, this can be achieved by applying the feature selection method. In this paper, we propose a hybrid ensemble feature selection method. The proposed method has two phases, filter and wrapper phase in filter phase ensemble technique is used for aggregating the feature ranks of the Relief, minimum redundancy Maximum Relevance (mRMR), and Feature Correlation (FC) filter feature selection methods. This paper uses the Fuzzy Gaussian membership function ordering for aggregating the ranks. In wrapper phase, Improved Binary Particle Swarm Optimization (IBPSO) is used for selecting the optimal features, and the RBF Kernel-based Support Vector Machine (SVM) classifier is used as an evaluator. The performance of the proposed model are compared with state of art feature selection methods using five benchmark datasets. For evaluation various performance metrics such as Accuracy, Recall, Precision, and F1-Score are used. Furthermore, the experimental results show that the performance of the proposed method outperforms the other feature selection methods.

Download Full-text

Design of Text Categorization System Based on SVM

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.532-533.1191 ◽

2012 ◽

Vol 532-533 ◽

pp. 1191-1195 ◽

Cited By ~ 1

Author(s):

Zhen Yan Liu ◽

Wei Ping Wang ◽

Yong Wang

Keyword(s):

Feature Extraction ◽

Feature Selection ◽

Text Categorization ◽

Feature Selection Method ◽

Extraction Methods ◽

Support Vector ◽

Text Representation ◽

Text Feature ◽

Categorization System ◽

Classifier Training

This paper introduces the design of a text categorization system based on Support Vector Machine (SVM). It analyzes the high dimensional characteristic of text data, the reason why SVM is suitable for text categorization. According to system data flow this system is constructed. This system consists of three subsystems which are text representation, classifier training and text classification. The core of this system is the classifier training, but text representation directly influences the currency of classifier and the performance of the system. Text feature vector space can be built by different kinds of feature selection and feature extraction methods. No research can indicate which one is the best method, so many feature selection and feature extraction methods are all developed in this system. For a specific classification task every feature selection method and every feature extraction method will be tested, and then a set of the best methods will be adopted.

Download Full-text

Feature Selection Method Based on Mutual Information and Support Vector Machine

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s021800142150021x ◽

2021 ◽

pp. 2150021

Author(s):

Gang Liu ◽

Chunlei Yang ◽

Sen Liu ◽

Chunbao Xiao ◽

Bin Song

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Mutual Information ◽

Classification Accuracy ◽

Feature Selection Method ◽

Selection Method ◽

Support Vector ◽

Svm Classifier ◽

Standard Data ◽

Feature Dimension

A feature selection method based on mutual information and support vector machine (SVM) is proposed in order to eliminate redundant feature and improve classification accuracy. First, local correlation between features and overall correlation is calculated by mutual information. The correlation reflects the information inclusion relationship between features, so the features are evaluated and redundant features are eliminated with analyzing the correlation. Subsequently, the concept of mean impact value (MIV) is defined and the influence degree of input variables on output variables for SVM network based on MIV is calculated. The importance weights of the features described with MIV are sorted by descending order. Finally, the SVM classifier is used to implement feature selection according to the classification accuracy of feature combination which takes MIV order of feature as a reference. The simulation experiments are carried out with three standard data sets of UCI, and the results show that this method can not only effectively reduce the feature dimension and high classification accuracy, but also ensure good robustness.

Download Full-text

A Hybrid Feature Selection Method Based on Symmetrical Uncertainty and Support Vector Machine for High-Dimensional Data Classification

Intelligent Information and Database Systems - Lecture Notes in Computer Science ◽

10.1007/978-3-319-54472-4_67 ◽

2017 ◽

pp. 721-727 ◽

Cited By ~ 2

Author(s):

Yongjun Piao ◽

Keun Ho Ryu

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

High Dimensional Data ◽

Feature Selection Method ◽

Data Classification ◽

Selection Method ◽

High Dimensional ◽

Support Vector ◽

Symmetrical Uncertainty

Download Full-text

Simultaneous Channel and Feature Selection of Fused EEG Features Based on Sparse Group Lasso

BioMed Research International ◽

10.1155/2015/703768 ◽

2015 ◽

Vol 2015 ◽

pp. 1-13 ◽

Cited By ~ 11

Author(s):

Jin-Jia Wang ◽

Fang Xue ◽

Hui Li

Keyword(s):

Feature Selection ◽

Feature Selection Method ◽

Group Lasso ◽

High Dimensional ◽

Test Accuracy ◽

Gradient Descent Method ◽

Feature Subset ◽

Eeg Signals ◽

Sparse Group Lasso ◽

Selection Of

Feature extraction and classification of EEG signals are core parts of brain computer interfaces (BCIs). Due to the high dimension of the EEG feature vector, an effective feature selection algorithm has become an integral part of research studies. In this paper, we present a new method based on a wrapped Sparse Group Lasso for channel and feature selection of fused EEG signals. The high-dimensional fused features are firstly obtained, which include the power spectrum, time-domain statistics, AR model, and the wavelet coefficient features extracted from the preprocessed EEG signals. The wrapped channel and feature selection method is then applied, which uses the logistical regression model with Sparse Group Lasso penalized function. The model is fitted on the training data, and parameter estimation is obtained by modified blockwise coordinate descent and coordinate gradient descent method. The best parameters and feature subset are selected by using a 10-fold cross-validation. Finally, the test data is classified using the trained model. Compared with existing channel and feature selection methods, results show that the proposed method is more suitable, more stable, and faster for high-dimensional feature fusion. It can simultaneously achieve channel and feature selection with a lower error rate. The test accuracy on the data used from international BCI Competition IV reached 84.72%.

Download Full-text

An Approach Based on the Improved SVM Algorithm for Identifying Malware in Network Traffic

Security and Communication Networks ◽

10.1155/2021/5518909 ◽

2021 ◽

Vol 2021 ◽

pp. 1-14

Author(s):

Bo Liu ◽

Jinfu Chen ◽

Songling Qin ◽

Zufa Zhang ◽

Yisong Liu ◽

...

Keyword(s):

Network Traffic ◽

Cyber Security ◽

False Positive ◽

False Positive Rate ◽

Support Vector ◽

Traffic Classification ◽

Svm Algorithm ◽

Network Traffic Classification ◽

Positive Rate ◽

Function Selection

Due to the growth and popularity of the internet, cyber security remains, and will continue, to be an important issue. There are many network traffic classification methods or malware identification approaches that have been proposed to solve this problem. However, the existing methods are not well suited to help security experts effectively solve this challenge due to their low accuracy and high false positive rate. To this end, we employ a machine learning-based classification approach to identify malware. The approach extracts features from network traffic and reduces the dimensionality of the features, which can effectively improve the accuracy of identification. Furthermore, we propose an improved SVM algorithm for classifying the network traffic dubbed Optimized Facile Support Vector Machine (OFSVM). The OFSVM algorithm solves the problem that the original SVM algorithm is not satisfactory for classification from two aspects, i.e., parameter optimization and kernel function selection. Therefore, in this paper, we present an approach for identifying malware in network traffic, called Network Traffic Malware Identification (NTMI). To evaluate the effectiveness of the NTMI approach proposed in this paper, we collect four real network traffic datasets and use a publicly available dataset CAIDA for our experiments. Evaluation results suggest that the NTMI approach can lead to higher accuracy while achieving a lower false positive rate compared with other identification methods. On average, the NTMI approach achieves an accuracy of 92.5% and a false positive rate of 5.527%.

Download Full-text

A Machine Learning Approach to EEG-based Prediction of Human Affective States Using Recursive Feature Elimination Method

MATEC Web of Conferences ◽

10.1051/matecconf/202133504001 ◽

2021 ◽

Vol 335 ◽

pp. 04001

Author(s):

Didar Dadebayev ◽

Goh Wei Wei ◽

Tan Ee Xion

Keyword(s):

Feature Selection ◽

Fractal Dimension ◽

Emotion Recognition ◽

Affective Computing ◽

Feature Selection Method ◽

Recursive Feature Elimination ◽

Support Vector ◽

Emotional States ◽

Affective States ◽

Emotion Classification

Emotion recognition, as a branch of affective computing, has attracted great attention in the last decades as it can enable more natural brain-computer interface systems. Electroencephalography (EEG) has proven to be an effective modality for emotion recognition, with which user affective states can be tracked and recorded, especially for primitive emotional events such as arousal and valence. Although brain signals have been shown to correlate with emotional states, the effectiveness of proposed models is somewhat limited. The challenge is improving accuracy, while appropriate extraction of valuable features might be a key to success. This study proposes a framework based on incorporating fractal dimension features and recursive feature elimination approach to enhance the accuracy of EEG-based emotion recognition. The fractal dimension and spectrum-based features to be extracted and used for more accurate emotional state recognition. Recursive Feature Elimination will be used as a feature selection method, whereas the classification of emotions will be performed by the Support Vector Machine (SVM) algorithm. The proposed framework will be tested with a widely used public database, and results are expected to demonstrate higher accuracy and robustness compared to other studies. The contributions of this study are primarily about the improvement of the EEG-based emotion classification accuracy. There is a potential restriction of how generic the results can be as different EEG dataset might yield different results for the same framework. Therefore, experimenting with different EEG dataset and testing alternative feature selection schemes can be very interesting for future work.

Download Full-text