PROTEOMIC BIOMARKER IDENTIFICATION FOR DIAGNOSIS OF EARLY RELAPSE IN OVARIAN CANCER

Ovarian cancer recurs at a rate of 75%, within a few months to several years after therapy. Early recurrence, though it responds better to treatment, is difficult to detect. Surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) mass spectrometry has shown the potential to accurately identify disease biomarkers that help early diagnosis. A major challenge in interpreting SELDI-TOF data is the high dimensionality of the feature space. To tackle this problem, we have developed a multi-step data processing method composed of a t-test, binning, and backward feature selection. A new algorithm, support vector machine-Markov blanket/recursive feature elimination (SVM-MB/RFE), is presented for the backward feature selection. This method integrates minimum-weight feature elimination by SVM-RFE with information-theory-based removal of redundant and irrelevant features by Markov blanket. Subsequently, SVM was used for classification. We applied the biomarker selection algorithm to 113 serum samples to identify early relapse in ovarian cancer patients after primary therapy. To validate the performance of the proposed algorithm, experiments were carried out in comparison with several other feature selection and classification algorithms.
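
The t-test filtering step mentioned in this abstract can be illustrated with a small sketch. It is not the authors' code: the abstract does not say which t-test variant was used, so Welch's unequal-variance statistic is an assumption here, and the function names and toy intensities are made up for illustration.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t-statistic for two independent samples (unequal variances)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

def rank_features_by_t(X, y):
    """Return feature indices sorted by |t|, most discriminative first.

    X is a list of samples (each a list of intensities); y holds 0/1 labels.
    """
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append(abs(welch_t(pos, neg)))
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)

# Toy data (made up): feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 4.0], [0.9, 6.0], [3.0, 5.5], [3.1, 4.2], [2.8, 5.8]]
y = [0, 0, 0, 1, 1, 1]
ranking = rank_features_by_t(X, y)
```

In a pipeline of this kind, only the top-ranked features would be passed on to binning and backward selection, which keeps the later, more expensive steps tractable.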

Evaluation of Feature Selection Techniques in a Multifrequency Large Amplitude Pulse Voltammetric Electronic Tongue

Engineering Proceedings ◽

10.3390/ecsa-7-08242 ◽

2020 ◽

Vol 2 (1) ◽

pp. 62

Author(s):

Luis F. Villamil-Cubillos ◽

Jersson X. Leon-Medina ◽

Maribel Anaya ◽

Diego A. Tibaduiza

Keyword(s):

Feature Selection ◽

Large Amplitude ◽

Electronic Tongue ◽

Supervised Machine Learning ◽

Recursive Feature Elimination ◽

Support Vector ◽

Variance Filter ◽

Supervised Machine Learning Classifiers ◽

Voltammetric Electronic Tongue ◽

Feature Selection Techniques

An electronic tongue is a device composed of a sensor array that takes advantage of the cross-sensitivity of several sensors to perform classification and quantification in liquid substances. In practice, electronic tongues generate a large amount of information that needs to be correctly analyzed to determine which interactions and features are most relevant for distinguishing one substance from another. This work focuses on implementing and validating feature selection methodologies in the liquid classification process of a multifrequency large amplitude pulse voltammetric (MLAPV) electronic tongue. A multi-layer perceptron neural network (MLP NN) and a support vector machine (SVM) were used as supervised machine learning classifiers. Several feature selection techniques were used: variance filter, ANOVA F-value, Recursive Feature Elimination, and model-based selection. Both 5-fold cross-validation and GridSearchCV were used to evaluate the performance of the feature selection methodology by testing various configurations and determining the best one. The methodology was validated on an imbalanced MLAPV electronic tongue dataset of 13 different liquid substances, reaching a classification accuracy of 93.85%.
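
Two of the filter techniques named here, the variance filter and the ANOVA F-value, are simple enough to sketch with the standard library. This is an illustrative sketch with made-up data and hypothetical function names, not the authors' pipeline; it computes roughly what scikit-learn's `VarianceThreshold` and `f_classif` compute.

```python
from statistics import mean, pvariance

def variance_filter(X, threshold=0.0):
    """Indices of features whose population variance exceeds the threshold."""
    return [j for j in range(len(X[0]))
            if pvariance([row[j] for row in X]) > threshold]

def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    groups = {}
    for v, c in zip(values, labels):
        groups.setdefault(c, []).append(v)
    grand = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Toy data (made up): feature 0 is constant, feature 1 tracks the class,
# feature 2 is noise.
X = [[1.0, 0.1, 2.0], [1.0, 0.2, 3.0], [1.0, 1.1, 2.5],
     [1.0, 1.0, 2.2], [1.0, 2.1, 2.9], [1.0, 2.0, 2.4]]
y = ["a", "a", "b", "b", "c", "c"]

kept = variance_filter(X)                                   # drops feature 0
scores = {j: anova_f([row[j] for row in X], y) for j in kept}
best = max(scores, key=scores.get)
```

The variance filter ignores the labels entirely, so it is usually run first; the F-value then ranks the surviving features by how well they separate the classes.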

The Evaluation of Accuracy Performance in an Enhanced Embedded Feature Selection for Unstructured Text Classification

Iraqi Journal of Science ◽

10.24996/ijs.2020.61.12.28 ◽

2020 ◽

pp. 3397-3407

Author(s):

Nur Syafiqah Mohd Nafis ◽

Suryanti Awang

Keyword(s):

Feature Selection ◽

Text Classification ◽

Training Dataset ◽

Recursive Feature Elimination ◽

High Dimensional ◽

Significant Feature ◽

Support Vector ◽

Svm Classifier ◽

Text Documents ◽

Text Document

Text documents are unstructured and high dimensional. Effective feature selection is required to select the most important and significant features from the sparse feature space. Thus, this paper proposes an embedded feature selection technique based on Term Frequency-Inverse Document Frequency (TF-IDF) and Support Vector Machine-Recursive Feature Elimination (SVM-RFE) for unstructured and high dimensional text classification. This technique has the ability to measure a feature’s importance in a high-dimensional text document. In addition, it aims to increase the efficiency of feature selection and hence obtain promising text classification accuracy. TF-IDF acts as a filter approach that measures the importance of features in the text documents at the first stage. SVM-RFE uses a backward feature elimination scheme to recursively remove insignificant features from the filtered feature subsets at the second stage. This research runs sets of experiments using text documents retrieved from a benchmark repository comprising a collection of Twitter posts. Pre-processing is applied to extract relevant features. After that, the pre-processed features are divided into training and testing datasets. Next, feature selection is implemented on the training dataset by calculating the TF-IDF score for each feature. SVM-RFE is applied for feature ranking as the next feature selection step. Only top-ranked features are selected for text classification using the SVM classifier. The experiments show that the proposed technique is able to achieve 98% accuracy, outperforming other existing techniques. In conclusion, the proposed technique is able to select the significant features in unstructured and high dimensional text documents.
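
The first-stage TF-IDF scoring can be sketched as follows. The abstract does not state which TF-IDF variant was used, so the formula below (relative term frequency times log(N / df), no smoothing) is an assumption, and the tokenized toy posts are made up.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses relative term frequency times log(N / df); other TF-IDF variants
    (smoothing, normalization) would give different numbers.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: (c / len(doc)) * math.log(n_docs / df[t])
                        for t, c in tf.items()})
    return weights

# Toy corpus (made up), already tokenized.
docs = [["great", "phone", "great", "camera"],
        ["bad", "phone"],
        ["great", "battery"]]
w = tf_idf(docs)
top_term = max(w[0], key=w[0].get)  # highest-weighted term in the first post
```

A filter stage of this kind would keep only terms whose weight clears some cutoff, so that the slower recursive elimination stage starts from a much smaller vocabulary.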

A Machine Learning Approach to EEG-based Prediction of Human Affective States Using Recursive Feature Elimination Method

MATEC Web of Conferences ◽

10.1051/matecconf/202133504001 ◽

2021 ◽

Vol 335 ◽

pp. 04001

Author(s):

Didar Dadebayev ◽

Goh Wei Wei ◽

Tan Ee Xion

Keyword(s):

Feature Selection ◽

Fractal Dimension ◽

Emotion Recognition ◽

Affective Computing ◽

Feature Selection Method ◽

Recursive Feature Elimination ◽

Support Vector ◽

Emotional States ◽

Affective States ◽

Emotion Classification

Emotion recognition, as a branch of affective computing, has attracted great attention in recent decades because it can enable more natural brain-computer interface systems. Electroencephalography (EEG) has proven to be an effective modality for emotion recognition, with which user affective states can be tracked and recorded, especially for primitive emotional dimensions such as arousal and valence. Although brain signals have been shown to correlate with emotional states, the effectiveness of proposed models is somewhat limited. The challenge is improving accuracy, and appropriate extraction of valuable features might be a key to success. This study proposes a framework that combines fractal dimension features with a recursive feature elimination approach to enhance the accuracy of EEG-based emotion recognition. Fractal dimension and spectrum-based features are to be extracted and used for more accurate emotional state recognition. Recursive Feature Elimination will be used as the feature selection method, whereas the classification of emotions will be performed by the Support Vector Machine (SVM) algorithm. The proposed framework will be tested on a widely used public database, and results are expected to demonstrate higher accuracy and robustness compared to other studies. The contribution of this study is primarily the improvement of EEG-based emotion classification accuracy. How far the results generalize may be limited, as different EEG datasets might yield different results for the same framework. Therefore, experimenting with different EEG datasets and testing alternative feature selection schemes are promising directions for future work.

Hybrid adapted fast correlation FCBF-support vector machine recursive feature elimination for feature selection

Intelligent Decision Technologies ◽

10.3233/idt-190014 ◽

2020 ◽

Vol 14 (3) ◽

pp. 269-279

Author(s):

Hayet Djellali ◽

Nacira Ghoualmi-Zine ◽

Souad Guessoum

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Recursive Feature Elimination ◽

Support Vector ◽

Svm Classifier ◽

Hybrid Architecture ◽

Features Selection ◽

K Nearest Neighbors ◽

Correlation Based Feature Selection ◽

Embedded Method

This paper investigates feature selection methods based on a hybrid architecture that uses a feature selection algorithm called Adapted Fast Correlation Based Feature Selection and Support Vector Machine Recursive Feature Elimination (AFCBF-SVMRFE). AFCBF-SVMRFE has three stages and combines the SVMRFE embedded method with correlation-based feature selection. The first stage is a relevance analysis, the second is a redundancy analysis, and the third is a performance evaluation and feature restoration stage. Experiments show that the proposed method, tested with two classifiers, Support Vector Machine (SVM) and K-nearest neighbors (KNN), provides the best accuracy on various datasets. The SVM classifier outperforms the KNN classifier on these data. AFCBF-SVMRFE outperforms the FCBF multivariate filter, SVMRFE, particle swarm optimization (PSO), and artificial bee colony (ABC).
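
The relevance and redundancy analyses in FCBF-style methods are built on symmetrical uncertainty, SU(x, y) = 2 I(x; y) / (H(x) + H(y)). The sketch below shows that measure for discrete features only; how the adapted variant (AFCBF) modifies it is not described in the abstract, so this is the standard definition rather than the authors' version, and the toy vectors are made up.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def symmetrical_uncertainty(x, y):
    """SU(x, y) = 2 * I(x; y) / (H(x) + H(y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both variables are constant; nothing to share
    # Mutual information via I(x; y) = H(x) + H(y) - H(x, y).
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2 * mutual_info / (hx + hy)

# Toy data (made up): one feature copies the class, one is independent of it.
labels = [0, 0, 1, 1]
f_copy = [0, 0, 1, 1]
f_noise = [0, 1, 0, 1]
su_copy = symmetrical_uncertainty(f_copy, labels)
su_noise = symmetrical_uncertainty(f_noise, labels)
```

Relevance analysis ranks features by SU with the class label; redundancy analysis then compares SU between pairs of features and drops one of a pair when they are more informative about each other than about the class.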

Credit Scoring Using Support Vector Machine: A Comparative Analysis

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.433-440.6527 ◽

2012 ◽

Vol 433-440 ◽

pp. 6527-6533 ◽

Cited By ~ 2

Author(s):

S. Harikrishna ◽

M.A.H. Farquad ◽

Shabana

Keyword(s):

Feature Selection ◽

Credit Scoring ◽

Decision Makers ◽

Recursive Feature Elimination ◽

Second Step ◽

Support Vector ◽

Ensemble Of Classifiers ◽

Well Efficiency ◽

Credit Data ◽

Intelligent Models

Credit scoring is the use of statistical or intelligent models to transform relevant data into numerical measures that guide management and decision makers in decisions such as accept/reject, pricing, pay/no pay, and collections. This study focuses on predicting whether a credit applicant can be categorized as good or bad from the supplied data. Many researchers have recently worked on ensembles of classifiers for such problems. The literature shows that feature selection reduces the complexity of the system and also improves accuracy. This paper analyzes the efficiency of SVM used in tandem for feature selection and for classification, and its application to credit scoring. In the first step, SVM-RFE (Recursive Feature Elimination) is employed for feature selection, and in the second step various architectures of SVM, namely standard SVM, PSO-SVM, and EVO-SVM, are employed for classification. The effectiveness of the tested approaches is evaluated using UK credit data and German credit data. It is observed that feature selection using SVM-RFE not only simplifies the process of credit scoring but also improves the accuracy of the system.
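
The SVM-RFE step can be illustrated with a minimal sketch: train a linear SVM, drop the feature with the smallest squared weight, and repeat. The trainer below is a crude batch subgradient descent written only to keep the example self-contained; it is an assumption standing in for a real SVM solver, and the hyperparameters and toy data are made up.

```python
def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=200):
    """Fit w, b by batch subgradient descent on 0.5*|w|^2 + C * hinge loss.

    Labels in y must be +1 or -1. A crude stand-in for a real SVM solver.
    """
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = list(w), 0.0  # gradient of the regularizer is w
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # sample violates the margin
                grad_w = [g - C * yi * xj for g, xj in zip(grad_w, xi)]
                grad_b -= C * yi
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def svm_rfe(X, y, n_keep):
    """Repeatedly retrain and drop the feature with the smallest squared weight."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        del active[worst]
    return active

# Toy data (made up): feature 0 is informative, feature 1 is always zero,
# feature 2 is a scaled-down copy of feature 0.
X = [[2.0, 0.0, 0.2], [1.5, 0.0, 0.15], [-2.0, 0.0, -0.2], [-1.5, 0.0, -0.15]]
y = [1, 1, -1, -1]
selected_two = svm_rfe(X, y, n_keep=2)
selected_one = svm_rfe(X, y, n_keep=1)
```

Retraining after every removal is what distinguishes RFE from a one-shot weight ranking: the weights of the remaining features are re-estimated once a competing feature is gone. The surviving features would then feed the second-step classifier.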

An Adaptive Genetic Algorithm with Recursive Feature Elimination Approach for Predicting Malaria Vector Gene Expression Data Classification using Support Vector Machine Kernels

Walailak Journal of Science and Technology (WJST) ◽

10.48048/wjst.2021.9849 ◽

2021 ◽

Vol 18 (17) ◽

Author(s):

Micheal Olaolu AROWOLO ◽

Marion Olubunmi ADEBIYI ◽

Chiebuka Timothy NNODIM ◽

Sulaiman Olaniyi ABDULSALAM ◽

Ayodele Ariyo ADEBIYI

Keyword(s):

Gene Expression ◽

Genetic Algorithm ◽

Support Vector Machine ◽

Feature Selection ◽

Malaria Vector ◽

Recursive Feature Elimination ◽

Support Vector ◽

Adaptive Genetic Algorithm ◽

Rna Seq ◽

Gene Expressions

As mosquito parasites breed across many parts of sub-Saharan Africa, infected cells embrace an unpredictable and erratic life period. Millions of individual parasites have gene expressions. Ribonucleic acid sequencing (RNA-seq) is a popular transcriptional technique that has improved the detection of major genetic probes. RNA-seq analysis generally requires computational improvements from machine learning techniques, since it computes interpretations of gene expressions. For this study, an adaptive genetic algorithm (A-GA) with recursive feature elimination (RFE), termed A-GA-RFE, was used as the feature selection algorithm to detect important information in a high-dimensional gene expression malaria vector RNA-seq dataset. Support Vector Machine (SVM) kernels were used as the classification algorithms to evaluate its predictive performance. The feasibility of this study was confirmed using an RNA-seq dataset from the mosquito Anopheles gambiae. The technique achieved accuracy rates of 98.3% and 96.7%, respectively. Highlights: a dimensionality reduction method based on feature selection; classification using a support vector machine; classification of a malaria vector dataset using an adaptive GA-RFE-SVM.

SVM-RFE Based Feature Selection and Taguchi Parameters Optimization for Multiclass SVM Classifier

The Scientific World JOURNAL ◽

10.1155/2014/795624 ◽

2014 ◽

Vol 2014 ◽

pp. 1-10 ◽

Cited By ~ 30

Author(s):

Mei-Ling Huang ◽

Yung-Hsiang Hung ◽

W. M. Lee ◽

R. K. Li ◽

Bo-Ru Jiang

Keyword(s):

Feature Selection ◽

Classification Accuracy ◽

Explanatory Power ◽

Disease Diagnosis ◽

Parameters Optimization ◽

Recursive Feature Elimination ◽

Support Vector ◽

Svm Classifier ◽

Classification Problems ◽

Class Variable

Recently, the support vector machine (SVM) has shown excellent performance in classification and prediction and is widely used in disease diagnosis and medical assistance. However, SVM only functions well on two-group classification problems. This study combines feature selection and SVM recursive feature elimination (SVM-RFE) to investigate the classification accuracy of multiclass problems for the Dermatology and Zoo databases. The Dermatology dataset contains 33 feature variables, 1 class variable, and 366 testing instances; the Zoo dataset contains 16 feature variables, 1 class variable, and 101 testing instances. The feature variables in the two datasets were sorted in descending order by explanatory power, and different feature sets were selected by SVM-RFE to explore classification accuracy. Meanwhile, the Taguchi method was combined with the SVM classifier to optimize the parameters C and γ and so increase accuracy for multiclass classification. The experimental results show that the classification accuracy can exceed 95% after SVM-RFE feature selection and Taguchi parameter optimization for the Dermatology and Zoo databases.
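
For orientation, the two tuned parameters can be placed in the standard formulation. The abstract does not name the kernel, so reading γ as the width of a radial basis function (RBF) kernel is an assumption based on common convention; C is the soft-margin penalty.

```latex
\min_{w,\,b,\,\xi}\ \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i
\quad\text{s.t.}\quad y_i\bigl(w^{\top}\phi(x_i)+b\bigr)\ge 1-\xi_i,\ \ \xi_i\ge 0,
\qquad
K(x_i,x_j)=\phi(x_i)^{\top}\phi(x_j)=\exp\!\bigl(-\gamma\lVert x_i-x_j\rVert^{2}\bigr)
```

A larger C penalizes margin violations more heavily, and a larger γ makes the kernel more local, so the two interact and are usually tuned jointly.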
