Multiclass Response Feature Selection and Cancer Tumour Classification With Support Vector Machine

Author(s):  
A. W. Banjoko ◽  
W. B. Yahya ◽  
M. K. Garba

Background & Aim: In this study, an efficient Support Vector Machine (SVM) algorithm for feature selection and classification of multi-category tumour classes of biological samples using gene expression profiles was proposed. Methods: The feature selection interface of the algorithm employed the F-statistic of the ANOVA-like testing scheme at a chosen family-wise error rate, which ensured efficient detection of false-positive genes. The gene subsets selected by this method were further screened for optimality using the misclassification error rates yielded by each of them and by their combinations in a sequential selection manner. In a 10-fold cross-validation, the optimal values of the SVM parameters with an appropriate kernel were determined for tissue sample classification using the one-versus-all approach. The entire data matrix was randomly partitioned into a 95% training set to train the SVM classifier and a 5% test set to evaluate the predictive performance of the classifier over 1,000 Monte-Carlo cross-validation runs. A published microarray breast cancer dataset with five clinical endpoints was employed to validate the results from the simulation studies. Results: Results from the Monte-Carlo study showed excellent performance of the SVM classifier, with higher prediction accuracy of the tissue samples based on the few gene biomarkers selected by the proposed feature selection method. Conclusion: SVM could be considered as a classification of multi-category tumour classes of biological
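The F-statistic screening step mentioned in the abstract above can be illustrated with a minimal sketch. It computes a one-way ANOVA F-statistic per gene and ranks genes by it. The gene names and expression values are hypothetical toy data, and the family-wise error rate thresholding is omitted because it needs the F distribution, which the standard library does not provide. This is an illustration of the general technique, not the authors' implementation.

```python
from statistics import mean

def anova_f(groups):
    """One-way ANOVA F-statistic for one feature.

    groups: list of lists, one list of expression values per tumour class.
    """
    k = len(groups)                      # number of classes
    n = sum(len(g) for g in groups)      # total number of samples
    grand_mean = mean(v for g in groups for v in g)
    group_means = [mean(g) for g in groups]
    # Between-class and within-class sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    # A feature with zero within-class variance separates the classes perfectly.
    return float("inf") if ms_within == 0 else ms_between / ms_within

# Hypothetical expression values: three classes, three samples each.
toy = {
    "gene_a": [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]],
    "gene_b": [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [1.0, 2.0, 3.0]],
    "gene_c": [[1.0, 2.0, 3.0], [11.0, 12.0, 13.0], [21.0, 22.0, 23.0]],
}

# Rank genes from most to least class-discriminative.
ranked = sorted(toy, key=lambda g: anova_f(toy[g]), reverse=True)
print(ranked)
```

A gene whose class means are far apart relative to its within-class spread gets a large F value, so it ranks ahead of a gene whose values look the same in every class.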

Author(s):  
Gang Liu ◽  
Chunlei Yang ◽  
Sen Liu ◽  
Chunbao Xiao ◽  
Bin Song

A feature selection method based on mutual information and support vector machine (SVM) is proposed in order to eliminate redundant features and improve classification accuracy. First, the local correlation between features and the overall correlation are calculated by mutual information. The correlation reflects the information inclusion relationship between features, so the features are evaluated and redundant features are eliminated by analyzing the correlation. Subsequently, the concept of mean impact value (MIV) is defined, and the degree of influence of the input variables on the output variables of the SVM network is calculated based on MIV. The importance weights of the features, as described by MIV, are sorted in descending order. Finally, the SVM classifier is used to implement feature selection according to the classification accuracy of feature combinations, taking the MIV order of the features as a reference. The simulation experiments are carried out with three standard UCI data sets, and the results show that this method can not only effectively reduce the feature dimension and achieve high classification accuracy, but also ensure good robustness.
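To make the redundancy idea concrete, the following minimal sketch estimates the mutual information between two discrete features from their empirical frequencies. The function name and the toy feature vectors are assumptions for illustration. The abstract does not specify how the authors estimate mutual information or define MIV, so neither is reproduced here.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts.
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi

# Hypothetical binary features.
f1 = [0, 0, 1, 1]
f2 = [0, 0, 1, 1]   # exact copy of f1, so fully redundant with it
f3 = [0, 1, 0, 1]   # statistically independent of f1 in this sample

print(mutual_information(f1, f2))  # high: f2 adds no new information
print(mutual_information(f1, f3))  # zero: f3 carries different information
```

A feature that shares high mutual information with one already kept is a candidate for removal, since it contributes little that the retained feature does not already provide.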


Worldwide, breast cancer is the leading type of cancer in women, accounting for 25% of all cases. Survival rates in developed countries are higher than those in developing countries. This has led to the importance of computer-aided diagnostic methods for early detection of breast cancer, which in turn reduces the death rate. This paper explores the scope of biomarkers that can be used to predict breast cancer from anthropometric data. This experimental study aims at computing and comparing various classification models (Binary Logistic Regression, Ball Vector Machine (BVM), C4.5, Partial Least Square (PLS) for Classification, Classification Tree, Cost sensitive Classification Tree, Cost sensitive Decision Tree, Support Vector Machine for Classification, Core Vector Machine, ID3, K-Nearest Neighbor, Linear Discriminant Analysis (LDA), Log-Reg TRIRLS, Multi Layer Perceptron (MLP), Multinomial Logistic Regression (MLR), Naïve Bayes (NB), PLS for Discriminant Analysis, PLS for LDA, Random Tree (RT), Support Vector Machine (SVM)) for the UCI Coimbra breast cancer dataset. The feature selection algorithms (Backward Logit, Fisher Filtering, Forward Logit, ReliefF, Step disc) are applied to find the minimum set of attributes that can achieve better accuracy. To ascertain the accuracy results, Jack-knife cross-validation is conducted for the algorithms. The Core Vector Machine classification algorithm outperforms the other nineteen algorithms, with an accuracy of 82.76%, sensitivity of 76.92% and specificity of 87.50% for the three selected attributes, Age, Glucose and Resistin, using the ReliefF feature selection algorithm.
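The validation step can be sketched as follows, assuming that the jack-knife procedure named in the abstract amounts to leave-one-out cross-validation, which the source does not spell out. A 1-nearest-neighbour rule stands in for the classifiers compared in the study, and the samples are hypothetical toy values, not the Coimbra data.

```python
def nearest_label(train, query):
    """Label of the training sample closest to query (squared Euclidean distance)."""
    closest = min(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    return closest[1]

def leave_one_out_accuracy(samples):
    """Hold each sample out once, train on the rest, and score the held-out sample.

    samples: list of (feature_tuple, label) pairs.
    """
    correct = 0
    for i, (features, label) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]   # every sample except the i-th
        if nearest_label(train, features) == label:
            correct += 1
    return correct / len(samples)

# Hypothetical one-feature samples forming two well-separated groups.
toy = [
    ((1.0,), "a"), ((1.2,), "a"), ((0.9,), "a"),
    ((3.0,), "b"), ((3.1,), "b"), ((2.8,), "b"),
]
print(leave_one_out_accuracy(toy))
```

Because every sample serves as the test case exactly once, the estimate uses all of the data, which matters for small clinical datasets.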


Mekatronika ◽  
2021 ◽  
Vol 3 (1) ◽  
pp. 27-31
Author(s):  
Ken-ji Ee ◽  
Ahmad Fakhri Bin Ab. Nasir ◽  
Anwar P. P. Abdul Majeed ◽  
Mohd Azraai Mohd Razman ◽  
Nur Hafieza Ismail

An animal classification system is a technology that classifies the animal class (type) automatically and is useful in many applications. Many types of learning models have recently been applied to this technology. Nonetheless, it is worth noting that the extraction of features and the classification of animal features are non-trivial, particularly in the deep learning approach, for a successful animal classification system. The use of Transfer Learning (TL) has been demonstrated to be a powerful tool in the extraction of essential features. However, the employment of such a method in animal classification applications is somewhat limited. The present study aims to determine a suitable TL-conventional classifier pipeline for animal classification. The VGG16 and VGG19 models were used to extract features and were then coupled with either a k-Nearest Neighbour (k-NN) or a Support Vector Machine (SVM) classifier. Prior to that, a total of 4000 images were gathered, covering five classes: cows, goats, buffalos, dogs, and cats. The data were split in a ratio of 80:20 for training and testing. The classifiers' hyperparameters were tuned by the grid search approach with five-fold cross-validation. The study demonstrated that the best TL pipeline identified is VGG16 with an optimised SVM, as it yielded an average classification accuracy of 0.975. The findings of the present investigation could facilitate animal classification applications, e.g. for monitoring animals in wildlife.
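The tuning procedure described above, grid search scored by five-fold cross-validation, can be sketched in plain Python. This is a simplified illustration: a one-dimensional k-NN classifier stands in for the VGG-feature classifiers, the data are hypothetical, and the folds are assigned by index rather than by random shuffling.

```python
from collections import Counter

def knn_predict(train, query, k):
    """Majority label among the k training points nearest to query."""
    nearest = sorted(train, key=lambda item: abs(item[0] - query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def k_fold_indices(n, folds):
    """Assign sample i to fold i % folds (assumes the data order is already mixed)."""
    return [[i for i in range(n) if i % folds == f] for f in range(folds)]

def cv_accuracy(data, k, folds=5):
    """Mean held-out accuracy of k-NN over the folds."""
    scores = []
    for held_out in k_fold_indices(len(data), folds):
        held = set(held_out)
        train = [d for i, d in enumerate(data) if i not in held]
        test = [data[i] for i in held_out]
        hits = sum(knn_predict(train, x, k) == y for x, y in test)
        scores.append(hits / len(test))
    return sum(scores) / len(scores)

def grid_search(data, grid, folds=5):
    """Score every candidate k and return the best one with all scores."""
    results = {k: cv_accuracy(data, k, folds) for k in grid}
    best = max(results, key=results.get)
    return best, results

# Hypothetical data: ten "low" points and ten "high" points.
data = [(float(i), "low") for i in range(10)] + \
       [(100.0 + i, "high") for i in range(10)]
best_k, scores = grid_search(data, grid=[1, 3, 5])
print(best_k, scores)
```

In the setting the abstract describes, the search would run on the 80% training portion only, with the 20% test portion kept aside for the final accuracy estimate.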


Author(s):  
Rashmi K. Thakur ◽  
Manojkumar V. Deshpande

Sentiment analysis is one of the popular techniques gaining attention in recent times. Nowadays, people use the available resources to find user reviews of public transportation, movies, hotel reservations, etc., that meet their needs. Hence, sentiment classification is an essential process employed to determine the positive and negative responses. This paper presents an approach for sentiment classification of train reviews using the MapReduce model with the proposed Kernel Optimized-Support Vector Machine (KO-SVM) classifier. The MapReduce framework handles big data using a mapper, which performs feature extraction, and a reducer, which classifies the review based on KO-SVM classification. The feature extraction process utilizes features that are classification-specific and SentiWordNet-based. KO-SVM adopts SVM for the classification, where the exponential kernel is replaced by an optimized kernel whose weights are found using a novel optimizer, the Self-adaptive Lion Algorithm (SLA). In a comparative analysis, the performance of the KO-SVM classifier is compared with SentiWordNet, NB, NN, and LSVM, using the evaluation metrics specificity, sensitivity, and accuracy, on the train review and movie review databases. The proposed KO-SVM classifier attained maximum sensitivity of 93.46% and 91.249%, specificity of 74.485% and 70.018%, and accuracy of 84.341% and 79.611%, respectively, for the train review and movie review databases.
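For reference, the three evaluation metrics named above follow directly from the counts of a binary confusion matrix. The sketch below uses hypothetical counts chosen for easy arithmetic. They are not taken from the paper.

```python
def binary_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from confusion-matrix counts.

    tp/fn: positive reviews classified correctly/incorrectly.
    tn/fp: negative reviews classified correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)                  # true positive rate
    specificity = tn / (tn + fp)                  # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)    # overall fraction correct
    return sensitivity, specificity, accuracy

# Hypothetical counts: 100 positive and 100 negative reviews.
sens, spec, acc = binary_metrics(tp=90, fn=10, tn=70, fp=30)
print(sens, spec, acc)
```

Accuracy is a weighted average of sensitivity and specificity, with weights given by the share of positive and negative samples, so it always lies between the two.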


2020 ◽  
Vol 14 (3) ◽  
pp. 269-279
Author(s):  
Hayet Djellali ◽  
Nacira Ghoualmi-Zine ◽  
Souad Guessoum

This paper investigates feature selection methods based on a hybrid architecture, using a feature selection algorithm called Adapted Fast Correlation Based Feature selection and Support Vector Machine Recursive Feature Elimination (AFCBF-SVMRFE). The AFCBF-SVMRFE has three stages and combines the SVMRFE embedded method with Correlation based Features Selection. The first stage is the relevance analysis, the second is a redundancy analysis, and the third is a performance evaluation and feature restoration stage. Experiments show that the proposed method, tested with different classifiers (Support Vector Machine, SVM, and K nearest neighbors, KNN), provides the best accuracy on various datasets. The SVM classifier outperforms the KNN classifier on these data. The AFCBF-SVMRFE outperforms the FCBF multivariate filter, SVMRFE, Particle Swarm Optimization (PSO) and Artificial Bee Colony (ABC).


Author(s):  
PETER MC LEOD ◽  
BRIJESH VERMA

This paper presents a novel technique for the classification of suspicious areas in digital mammograms. The proposed technique is based on clustering the input data into numerous clusters and amalgamating them with a Support Vector Machine (SVM) classifier. The technique is called multi-cluster support vector machine (MCSVM) and is designed to provide a fast-converging technique with good generalization abilities, leading to improved classification into the benign or malignant class. The proposed MCSVM technique has been evaluated on data from the Digital Database of Screening Mammography (DDSM) benchmark database. The experimental results showed that the proposed MCSVM classifier achieves better results than a standard SVM. A paired t-test and ANOVA showed that the results are statistically significant.

