Performance Comparison of Feature Selection and Extraction Methods with Random Instance Selection

2021 ◽  
pp. 115072
Author(s):  
Milad Malekipirbazari ◽  
Vural Aksakalli ◽  
Waleed Shafqat ◽  
Andrew Eberhard
2014 ◽  
Vol 2014 ◽  
pp. 1-9 ◽  
Author(s):  
Yuchen Qiu ◽  
Jie Song ◽  
Xianglan Lu ◽  
Yuhua Li ◽  
Bin Zheng ◽  
...  

Background. The purpose of this study is to identify a set of features that optimizes the performance of metaphase chromosome detection under high-throughput scanning microscopy. In the development of a computer-aided detection (CAD) scheme, feature selection is critically important because it directly determines the accuracy of the scheme. Although many features have been examined previously, the optimal choice of features is often application specific. Methods. In this experiment, 200 bone marrow cells were first acquired with a high-throughput scanning microscope. Nine different features were then applied individually to group the captured images into clinically analyzable and unanalyzable classes. The performance of each feature was assessed with a receiver operating characteristic (ROC) method. Results. The results show that the number of labeled regions in each acquired image is suitable for the first, on-line CAD scheme. For the second, off-line CAD scheme, the results suggest combining four feature extraction methods: the number of labeled regions, average region area, average region pixel value, and the standard deviation of either region distance or circularity. Conclusion. This study demonstrates an effective method of feature selection and comparison that can facilitate the future optimization of CAD schemes for high-throughput scanning microscopes.
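As a rough illustration of assessing features one at a time with an ROC method, the sketch below ranks each feature by its area under the ROC curve (AUC), computed through the rank-sum identity. It is not the authors' implementation: the function names `auc` and `rank_features`, the feature names, and the toy values are all assumptions made for the example.

```python
def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) identity.

    labels use 1 for the positive class (e.g. analyzable) and 0 otherwise.
    Tied score pairs count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def rank_features(feature_table, labels):
    """Score every feature on its own and sort from most to least separating."""
    ranked = []
    for name, values in feature_table.items():
        a = auc(values, labels)
        # A feature that is inversely related to the class separates it
        # equally well, so use the better of the two orientations.
        ranked.append((name, max(a, 1.0 - a)))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Hypothetical toy data: 3 analyzable (1) and 3 unanalyzable (0) images.
labels = [1, 1, 1, 0, 0, 0]
features = {
    "num_labeled_regions": [12, 15, 11, 3, 4, 5],
    "avg_region_area": [40, 22, 35, 30, 25, 38],
}
ranking = rank_features(features, labels)
for name, score in ranking:
    print(f"{name}: AUC = {score:.3f}")
```

An AUC near 1.0 means the feature alone separates the two classes well; a value near 0.5 means it carries little information by itself.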


2012 ◽  
Vol 532-533 ◽  
pp. 1191-1195 ◽  
Author(s):  
Zhen Yan Liu ◽  
Wei Ping Wang ◽  
Yong Wang

This paper introduces the design of a text categorization system based on the Support Vector Machine (SVM). It analyzes the high-dimensional character of text data and explains why SVM is suitable for text categorization. The system is constructed according to its data flow and consists of three subsystems: text representation, classifier training, and text classification. The core of the system is classifier training, but text representation directly influences the accuracy of the classifier and the performance of the system. The text feature vector space can be built with different kinds of feature selection and feature extraction methods. No research indicates which method is best, so many feature selection and feature extraction methods are implemented in this system. For a specific classification task, every feature selection method and every feature extraction method is tested, and the set of best-performing methods is then adopted.


2021 ◽  
Vol 6 (22) ◽  
pp. 51-59
Author(s):  
Mustazzihim Suhaidi ◽  
Rabiah Abdul Kadir ◽  
Sabrina Tiun

Extracting features from input data is vital for successful classification and machine learning tasks. Classification is the process of assigning an object to one of a set of predefined categories. Many different feature selection and feature extraction methods exist and are widely used. Feature extraction is a transformation of large input data into a low-dimensional feature vector, which serves as the input to a classification or other machine learning algorithm. The task of feature extraction poses major challenges, which are discussed in this paper. The main challenge is to learn and extract knowledge from text datasets in order to make correct decisions. The objective of this paper is to give an overview of methods used in feature extraction for various applications, using a dataset of texts taken from social media.
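To make the idea of turning raw text into a numeric feature vector concrete, here is a minimal TF-IDF sketch using only the Python standard library. TF-IDF is chosen purely for illustration and is not necessarily one of the methods surveyed in the paper; the helper names `tokenize` and `tfidf_vectors`, the smoothed IDF formula, and the sample posts are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def tfidf_vectors(docs):
    """Map each document to a TF-IDF vector over a shared, sorted vocabulary."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for doc in tokenized for t in doc})
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(t for doc in tokenized for t in set(doc))
    # Smoothed inverse document frequency (one common variant, assumed here).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


# Hypothetical social-media-style posts.
posts = ["great phone great camera", "bad battery", "great battery life"]
vocab, vectors = tfidf_vectors(posts)
for post, vec in zip(posts, vectors):
    print(post, [round(v, 3) for v in vec])
```

Each vector has one entry per vocabulary term, so on a real corpus the vocabulary would typically be reduced by a feature selection step or a dimensionality-reducing transform before classification.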


PLoS ONE ◽  
2019 ◽  
Vol 14 (3) ◽  
pp. e0213584 ◽  
Author(s):  
Olga Krakovska ◽  
Gregory Christie ◽  
Andrew Sixsmith ◽  
Martin Ester ◽  
Sylvain Moreno
