Effective Feature Set Selection and Centroid Classifier Algorithm for Web Services Discovery

Text preprocessing and document classification play a vital role in web services discovery. Nearest centroid classifiers have mostly been employed in high-dimensional applications, including genomics. Feature selection is a major problem for all classifiers, and in this paper we propose an effective feature selection procedure followed by web services discovery through the centroid classifier algorithm. The task is to assign a document to one or more classes effectively. Although it is simple and robust, the centroid classifier is not widely used for document classification because of its computational complexity and large memory requirements. We address these problems through dimensionality reduction and effective feature set selection before training and testing the classifier. Our preliminary experiments show that the proposed method outperforms other algorithms mentioned in the literature, including K-Nearest Neighbors, the Naive Bayes classifier and Support Vector Machines.
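As a rough illustration of the pipeline this abstract describes (feature selection first, then a centroid classifier), the sketch below keeps the most frequent terms and assigns a document to the class whose centroid is most similar by cosine. The selection criterion (document frequency), the function names and the toy service descriptions are all assumptions made for illustration; the abstract does not state which procedure the authors use.

```python
import math
from collections import Counter, defaultdict

def top_k_terms(docs, k):
    # Document frequency: number of documents that contain each term.
    # Ranking by document frequency is an assumed, illustrative criterion.
    df = Counter(term for doc in docs for term in set(doc.split()))
    ranked = sorted(df.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

def vectorize(doc, vocab):
    # Term-frequency vector over the selected vocabulary, L2-normalised.
    counts = Counter(doc.split())
    vec = [float(counts[t]) for t in vocab]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec

def train_centroids(docs, labels, vocab):
    # A class centroid is the mean of the vectors of its training documents.
    sums = defaultdict(lambda: [0.0] * len(vocab))
    sizes = Counter(labels)
    for doc, label in zip(docs, labels):
        for i, v in enumerate(vectorize(doc, vocab)):
            sums[label][i] += v
    return {c: [v / sizes[c] for v in s] for c, s in sums.items()}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def classify(doc, centroids, vocab):
    # Pick the class whose centroid is closest in cosine similarity.
    vec = vectorize(doc, vocab)
    return max(sorted(centroids), key=lambda c: cosine(vec, centroids[c]))

# Hypothetical toy service descriptions, not data from the paper.
docs = [
    "weather forecast service city",
    "weather temperature service",
    "currency exchange rate service",
    "currency convert rate",
]
labels = ["weather", "weather", "finance", "finance"]

vocab = top_k_terms(docs, 4)
centroids = train_centroids(docs, labels, vocab)
print(classify("weather service for city", centroids, vocab))
print(classify("convert currency rate", centroids, vocab))
```

Reducing the vocabulary before computing centroids is what keeps the vectors, and therefore memory use and similarity computations, small.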


BREAST CANCER DETECTION USING RSFS-BASED FEATURE SELECTION ALGORITHMS IN THERMAL IMAGES

Biomedical Engineering Applications Basis and Communications ◽

10.4015/s1016237221500204 ◽

2021 ◽

pp. 2150020

Author(s):

Nazila Darabi ◽

Abdalhossein Rezai ◽

Seyedeh Shahrbanoo Falahieh Hamidpour

Keyword(s):

Breast Cancer ◽

Feature Selection ◽

Cancer Detection ◽

Vital Role ◽

Support Vector ◽

Computer Aided Detection ◽

K Nearest Neighbors ◽

Cad System ◽

Common Cancer ◽

Selection Algorithms

Breast cancer is a common cancer in women. Accurate and early detection of breast cancer can play a vital role in treatment. This paper presents and evaluates a thermogram-based Computer-Aided Detection (CAD) system for the detection of breast cancer. In this CAD system, feature selection uses the Random Subset Feature Selection (RSFS) algorithm on its own and in two hybrids: the minimum Redundancy Maximum Relevance (mRMR) algorithm with RSFS, and a Genetic Algorithm (GA) with RSFS. The Support Vector Machine (SVM) and k-Nearest Neighbors (kNN) algorithms are used as classifiers. The proposed CAD system is verified using MATLAB 2017 and a dataset composed of breast images from 78 patients. With RSFS alone for feature selection, kNN and SVM achieve accuracies of 85.36% and 75% and sensitivities of 94.11% and 79.31%, respectively. With the hybrid GA and RSFS algorithm, kNN and SVM achieve accuracies of 83.87% and 69.56% and sensitivities of 96% and 81.81%, respectively. With the hybrid mRMR and RSFS algorithm, kNN and SVM achieve accuracies of 77.41% and 73.07% and sensitivities of 98% and 72.72%, respectively.
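The two metrics reported in this abstract can be computed from a confusion matrix as sketched below. This is a minimal illustration using the standard definitions (accuracy as the fraction of correct predictions, sensitivity as the fraction of positive cases detected); the labels are made-up toy values, not the paper's data, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    # Count true/false positives and negatives for a binary problem.
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    # Fraction of all cases that were classified correctly.
    return (tp + tn) / (tp + tn + fp + fn)

def sensitivity(tp, fn):
    # Fraction of positive cases that were detected (recall).
    return tp / (tp + fn)

# Hypothetical labels: 1 = abnormal, 0 = normal.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(accuracy(tp, tn, fp, fn), sensitivity(tp, fn))
```

Because sensitivity ignores the negative class, a classifier can have high sensitivity and modest accuracy at the same time, as in the kNN results quoted above.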


An automatic ECG arrhythmia diagnosis system using support vector machines optimised with GOA and entropy-based feature selection procedure

International Journal of Medical Engineering and Informatics ◽

10.1504/ijmei.2022.119309 ◽

2022 ◽

Vol 14 (1) ◽

pp. 52

Author(s):

Abdullah Jafari Chashmi ◽

Mehdi Chehel Amirani

Keyword(s):

Feature Selection ◽

Support Vector Machines ◽

Selection Procedure ◽

Support Vector ◽

Diagnosis System ◽

Vector Machines ◽

Ecg Arrhythmia


An automatic ECG arrhythmia diagnosis system using support vector machines optimised with GOA and entropy-based feature selection procedure

International Journal of Medical Engineering and Informatics ◽

10.1504/ijmei.2021.10035812 ◽

2021 ◽

Vol 1 (1) ◽

pp. 1

Author(s):

Abdullah Jafari Chashmi ◽

Mehdi Chehel Amirani

Keyword(s):

Feature Selection ◽

Support Vector Machines ◽

Selection Procedure ◽

Support Vector ◽

Diagnosis System ◽

Vector Machines ◽

Ecg Arrhythmia


Identification of Urdu Ghazal Poets using SVM

Mehran University Research Journal of Engineering and Technology ◽

10.22581/muet1982.1904.07 ◽

2020 ◽

Vol 38 (4) ◽

pp. 935-944

Author(s):

Nida Tariq ◽

Iqra Ijaz ◽

Muhammad Kamran Malik ◽

Zubair Malik ◽

Faisal Bukhari

Keyword(s):

Feature Selection ◽

Support Vector ◽

Sentence Structure ◽

Writing Style ◽

K Nearest Neighbors ◽

Chi Square ◽

Urdu Literature ◽

Vector Machines ◽

Two Factors ◽

Feature Selection Techniques

Urdu literature has a rich tradition of poetry in many forms, one of which is the ghazal. Urdu poetry structures are mainly of Arabic origin. Its sentence structure is complex and differs from everyday language, which makes it hard to classify. Our research focuses on identifying the poet when given ghazals as input. No previous work has addressed this task. Two main factors that help categorize and classify a given text are its content and writing style. Urdu poets such as Mirza Ghalib, Mir Taqi Mir, Iqbal and many others each have a distinct writing style and topics of interest. Our model accounts for these two factors and classifies ghazals using different classification models: SVM (Support Vector Machines), Decision Tree, Random Forest, Naïve Bayes and KNN (K-Nearest Neighbors). We have also applied feature selection techniques such as the chi-square model and L1-based feature selection. For experimentation, we prepared a dataset of about 4000 ghazals. We also compared the accuracy of the different classifiers and report the best results for the collected dataset.
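Chi-square feature selection, mentioned in this abstract, scores each term by how strongly its presence depends on the class. The sketch below uses the standard 2x2 contingency form of the statistic; the placeholder tokens (`w1`, `w2`, ...), the class names and the helper names are hypothetical, and the abstract does not say how the authors configured the method.

```python
def chi_square(a, b, c, d):
    # a: documents in the class that contain the term
    # b: documents outside the class that contain the term
    # c: documents in the class without the term
    # d: documents outside the class without the term
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0

def term_scores(docs, labels, target):
    # Score every term in the corpus against one target class.
    vocab = {t for doc in docs for t in doc.split()}
    scores = {}
    for term in vocab:
        a = b = c = d = 0
        for doc, label in zip(docs, labels):
            has_term = term in doc.split()
            in_class = label == target
            if has_term and in_class:
                a += 1
            elif has_term:
                b += 1
            elif in_class:
                c += 1
            else:
                d += 1
        scores[term] = chi_square(a, b, c, d)
    return scores

# Placeholder tokens stand in for words; this is not real poetry data.
docs = ["w1 w2", "w1 w3", "w4 w2", "w4 w5"]
labels = ["poet_a", "poet_a", "poet_b", "poet_b"]

scores = term_scores(docs, labels, "poet_a")
for term in sorted(scores, key=lambda t: (-scores[t], t)):
    print(term, round(scores[term], 3))
```

Terms that occur equally often in every class, such as `w2` here, score zero and would be dropped first when only the highest-scoring features are kept.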


Persian Handwritten Number Recognition Using Adapted Framing Feature and Support Vector Machines

International Journal of Computational Intelligence and Applications ◽

10.1142/s1469026816500048 ◽

2016 ◽

Vol 15 (01) ◽

pp. 1650004 ◽

Cited By ~ 3

Author(s):

Hedieh Sajedi ◽

Mehran Bahador

Keyword(s):

Support Vector Machines ◽

Recognition Rate ◽

Nearest Neighbors ◽

Polynomial Kernel ◽

Support Vector ◽

K Nearest Neighbors ◽

New Approach ◽

Number Recognition ◽

Vector Machines

In this paper, a new approach for segmentation and recognition of Persian handwritten numbers is presented. The method combines the framing feature technique with the outer profile feature; we call the combination the adapted framing feature. In the proposed approach, segmentation of the numbers into digits is carried out automatically. In the classification stage, Support Vector Machines (SVM) and k-Nearest Neighbors (k-NN) are used. Experiments are conducted on the IFHCDB database, consisting of 17,740 numeral images, and the HODA database, consisting of 102,352 numeral images. At the isolated digit level, SVM with a polynomial kernel achieves a recognition rate of 99.27% on IFHCDB and 99.07% on HODA. The experiments show that the proposed method achieves higher accuracy than previous research.
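For readers unfamiliar with the polynomial kernel named in this abstract, a minimal sketch of its usual form follows: K(x, y) = (gamma * <x, y> + coef0) ** degree. The parameter names and default values are assumptions chosen for illustration; the abstract does not give the degree or other settings used by the authors.

```python
def poly_kernel(x, y, degree=2, gamma=1.0, coef0=1.0):
    # Inner product of the two feature vectors.
    dot = sum(a * b for a, b in zip(x, y))
    # Polynomial kernel: similarity in an implicit higher-degree feature space.
    return (gamma * dot + coef0) ** degree

# Two hypothetical two-dimensional feature vectors.
print(poly_kernel([1.0, 2.0], [3.0, 4.0]))
```

An SVM uses such a kernel in place of the plain inner product, which lets it fit a non-linear decision boundary without constructing the higher-degree features explicitly.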


Minimax feature selection problem for constructing a classifier using support vector machines

Computational Mathematics and Mathematical Physics ◽

10.1134/s0965542510050143 ◽

2010 ◽

Vol 50 (5) ◽

pp. 917-925

Author(s):

Yu. V. Goncharov

Keyword(s):

Feature Selection ◽

Support Vector Machines ◽

Selection Problem ◽

Support Vector ◽

Feature Selection Problem ◽

Vector Machines


High dimensional data classification and feature selection using support vector machines

European Journal of Operational Research ◽

10.1016/j.ejor.2017.08.040 ◽

2018 ◽

Vol 265 (3) ◽

pp. 993-1004 ◽

Cited By ~ 63

Author(s):

Bissan Ghaddar ◽

Joe Naoum-Sawaya

Keyword(s):

Feature Selection ◽

Support Vector Machines ◽

High Dimensional Data ◽

Data Classification ◽

High Dimensional ◽

Support Vector ◽

Vector Machines


Radar Emitter Signal Recognition Based on Feature Selection and Support Vector Machines

Lecture Notes in Computer Science - Advances in Intelligent Computing ◽

10.1007/11538059_74 ◽

2005 ◽

pp. 707-716 ◽

Cited By ~ 2

Author(s):

Gexiang Zhang ◽

Zhexin Cao ◽

Yajun Gu ◽

Weidong Jin ◽

Laizhao Hu

Keyword(s):

Feature Selection ◽

Support Vector Machines ◽

Support Vector ◽

Signal Recognition ◽

Vector Machines


Inspeção Automática de Defeitos em Madeiras de Pinus usando Visão Computacional [Automatic Inspection of Defects in Pine Wood Using Computer Vision]

Revista de Informática Teórica e Aplicada ◽

10.22456/2175-2745.7033 ◽

2008 ◽

Vol 15 (2) ◽

pp. 203-218

Author(s):

Luiz E. S. Oliveira ◽

Paulo R. Cavalin ◽

Alceu S. Britto Jr ◽

Alessandro L. Koerich

Keyword(s):

Neural Networks ◽

Genetic Algorithms ◽

Feature Selection ◽

Defect Detection ◽

Color Image ◽

Support Vector ◽

Grayscale Image ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Vector Machines

This paper addresses the issue of detecting defects in pine wood using features extracted from grayscale images. The proposed feature set is based on the concept of texture and is computed from co-occurrence matrices. The features provide measures of properties such as smoothness, coarseness, and regularity. Comparative experiments using a color-image-based feature set extracted from percentile histograms are carried out to demonstrate the efficiency of the proposed feature set. Two different learning paradigms, neural networks and support vector machines, and a feature selection algorithm based on multi-objective genetic algorithms were considered in our experiments. The experimental results show that, after feature selection, the grayscale-image-based feature set achieves very competitive performance on wood defect detection relative to the color-image-based features.
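The co-occurrence matrices mentioned in this abstract count how often pairs of gray levels appear at a fixed offset from each other. The sketch below builds one such matrix for a small image and derives two common texture measures from it, contrast and energy. The choice of these two measures, the horizontal offset and the 4x4 toy image are assumptions for illustration; the abstract does not list the exact features the authors compute.

```python
def glcm(image, levels, dx=1, dy=0):
    # counts[i][j] = number of pixel pairs where a pixel with gray level i
    # has a neighbour with gray level j at offset (dx, dy).
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                counts[image[y][x]][image[ny][nx]] += 1
    return counts

def contrast(counts):
    # Weighted by squared gray-level difference: high for coarse texture.
    total = sum(map(sum, counts))
    return sum(c * (i - j) ** 2
               for i, row in enumerate(counts)
               for j, c in enumerate(row)) / total

def energy(counts):
    # Sum of squared pair probabilities: high for regular, uniform texture.
    total = sum(map(sum, counts))
    return sum((c / total) ** 2 for row in counts for c in row)

# Toy 4x4 image with gray levels 0..3 (not data from the paper).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

matrix = glcm(image, levels=4)
print(matrix)
print(contrast(matrix), energy(matrix))
```

In practice several offsets and directions are usually combined, and the resulting measures form the feature vector passed to the classifier.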


A Survey on Phishing Detection and The Importance of Feature Selection In Data Mining Classification Algorithms

Issue 4 - Journal of Science and Technology ◽

10.46243/jst.2020.v5.i6.pp11-18 ◽

2020 ◽

pp. 11-18

Keyword(s):

Data Mining ◽

Feature Selection ◽

Support Vector ◽

Classification Algorithms ◽

End User ◽

Preparation Methods ◽

Survey Paper ◽

Vector Machines ◽

Feature Selection Techniques ◽

Phishing Detection

In this era of the Internet, the issue of information security is at its peak. One of the main threats in this cyber world is the phishing attack, an email or website fraud method that targets a genuine webpage or email and hacks it without the consent of the end user. Various techniques help to classify whether a website or an email is legitimate or fake. The major contributors to the detection of these phishing frauds include classification algorithms, feature selection techniques or dataset preparation methods, and feature extraction, which plays an important role in detection as well as in prevention of these attacks. This survey paper studies the effect of all these contributors and the approaches applied in recent papers. The classification algorithms implemented include Decision Tree, Random Forest, Support Vector Machines, Logistic Regression, Lazy K Star, Naive Bayes and J48.
