An Improved Feature Selection Based on Effective Range for Classification

Feature selection is a key issue in machine learning and related fields. The results of feature selection directly affect a classifier's accuracy and generalization performance. Recently, a statistical feature selection method named effective range based gene selection (ERGS) was proposed. However, ERGS considers only the overlapping area (OA) among the effective ranges of each class for every feature; it fails to handle the case in which one effective range includes another. To overcome this limitation, this paper proposes a novel and efficient statistical feature selection approach called improved feature selection based on effective range (IFSER). In IFSER, an including area (IA) is introduced to characterize the inclusion relation of effective ranges. Moreover, the proportion of samples of every class that fall in both OA and IA is also taken into consideration for each feature. As a result, IFSER outperforms the original ERGS and some other state-of-the-art algorithms. Experiments on several well-known databases demonstrate the effectiveness of the proposed method.
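The difference between an overlapping area and an inclusion relation can be illustrated with a small sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the effective range is taken to be the class mean plus or minus `k` standard deviations, and the helper names (`effective_range`, `overlap_length`, `includes`, `fraction_inside`) are hypothetical rather than taken from ERGS or IFSER.

```python
from statistics import mean, pstdev


def effective_range(values, k=1.0):
    """Interval mean -/+ k * std for one class's values of a single feature.

    The width factor k is an assumption made for this sketch.
    """
    centre = mean(values)
    spread = pstdev(values)
    return (centre - k * spread, centre + k * spread)


def overlap_length(a, b):
    """Length of the intersection of two closed intervals (0.0 if disjoint)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def includes(outer, inner):
    """True when `inner` lies entirely inside `outer` (the inclusion relation)."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def fraction_inside(values, interval):
    """Proportion of samples whose value falls inside `interval`."""
    low, high = interval
    return sum(low <= v <= high for v in values) / len(values)


# Hypothetical values of one feature for two classes with the same centre.
class_a = [1.0, 2.0, 3.0, 4.0, 5.0]  # wide spread
class_b = [2.5, 3.0, 3.5]            # narrow spread

range_a = effective_range(class_a)
range_b = effective_range(class_b)

b_inside_a = includes(range_a, range_b)          # range_b is nested in range_a
shared = overlap_length(range_a, range_b)        # equals the width of range_b
share_of_a = fraction_inside(class_a, range_b)   # part of class_a in the shared region
```

In this toy case the overlap length alone cannot distinguish a partial overlap from a full inclusion, which is the situation the abstract says IA is meant to capture; the sample proportion gives an additional signal about how many points actually lie in the shared region.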

A Robust Gene Selection Method for Microarray-based Cancer Classification

Cancer Informatics ◽

10.4137/cin.s3794 ◽

2010 ◽

Vol 9 ◽

pp. CIN.S3794 ◽

Cited By ~ 21

Author(s):

Xiaosheng Wang ◽

Osamu Gotoh

Keyword(s):

Gene Expression ◽

Feature Selection ◽

Gene Selection ◽

Information Gain ◽

Expression Profiles ◽

Feature Selection Method ◽

Gene Expression Profiles ◽

Molecular Classification ◽

Selection Method ◽

Chi Square

Gene selection is of vital importance in the molecular classification of cancer using high-dimensional gene expression data. Because of the distinct characteristics inherent to specific cancerous gene expression profiles, developing flexible and robust feature selection methods is crucial. We investigated the properties of a feature selection approach proposed in our previous work, which generalizes the feature selection method based on the depended degree of attribute in rough sets. We compared this method with established methods (the depended degree, chi-square, information gain, Relief-F and symmetric uncertainty) and analyzed its properties through a series of classification experiments. The results revealed that our method was superior to the canonical depended degree of attribute based method in robustness and applicability. Moreover, the method was comparable to the other four commonly used methods. More importantly, the method can reveal the inherent classification difficulty of different gene expression datasets, which reflects the inherent biology of specific cancers.
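As a reminder of what one of the baseline criteria named above measures, here is a minimal sketch of information gain for a single discretized feature. It follows the textbook definition (label entropy minus the label entropy conditioned on the feature value); the toy data and function names are hypothetical and are not drawn from the paper's experiments.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy obtained by splitting on a discrete feature."""
    groups = defaultdict(list)
    for value, label in zip(feature, labels):
        groups[value].append(label)
    conditional = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Hypothetical discretized expression levels for four samples.
labels = ["tumor", "tumor", "normal", "normal"]
informative = ["high", "high", "low", "low"]    # separates the classes perfectly
uninformative = ["high", "low", "high", "low"]  # independent of the class

gain_informative = information_gain(informative, labels)
gain_uninformative = information_gain(uninformative, labels)
```

A filter method of this kind ranks genes by such a score and keeps the top-ranked ones.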

Statistical Feature Selection Approach for Classification of Emotions From Speech

SSRN Electronic Journal ◽

10.2139/ssrn.3527262 ◽

2020 ◽

Author(s):

Nilima Salankar ◽

Anjali Mishra

Keyword(s):

Feature Selection ◽

Statistical Feature ◽

Selection Approach ◽

Feature Selection Approach

A feature selection method based on effective range and SVM-RFE

International Journal of Wireless and Mobile Computing ◽

10.1504/ijwmc.2018.10016727 ◽

2018 ◽

Vol 15 (2) ◽

pp. 105 ◽

Cited By ~ 1

Author(s):

Yuansheng Yang ◽

Yifei Mao

Keyword(s):

Feature Selection ◽

Feature Selection Method ◽

Selection Method ◽

Effective Range

Flow Feature Selection Method Based on Statistics

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.1030-1032.1709 ◽

2014 ◽

Vol 1030-1032 ◽

pp. 1709-1712

Author(s):

Kai Min Song ◽

Xun Yi Ren

Keyword(s):

Feature Selection ◽

Feature Selection Method ◽

Selection Method ◽

Experimental Results ◽

Identification Algorithm ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Statistical Feature ◽

Flow Feature

Building on research into flow identification algorithms based on statistical features, this paper proposes a statistical feature selection algorithm that reduces the number of features used in identification and increases the speed of flow identification. The experimental results show that the algorithm effectively reduces the number of features and improves the efficiency of identification.

A Two-Stage Method Based on Multiobjective Differential Evolution for Gene Selection

Computational Intelligence and Neuroscience ◽

10.1155/2021/5227377 ◽

2021 ◽

Vol 2021 ◽

pp. 1-16

Author(s):

Shuangbao Song ◽

Xingqian Chen ◽

Zheng Tang ◽

Yuki Todo

Keyword(s):

Gene Expression ◽

Feature Selection ◽

Multiobjective Optimization ◽

Differential Evolution ◽

Gene Selection ◽

Feature Selection Method ◽

Selection Method ◽

Selection Problem ◽

Objective Functions ◽

Two Stage

Microarray gene expression data provide a prospective way to diagnose disease and classify cancer. However, in bioinformatics, the gene selection problem, i.e., how to select the most informative genes from thousands of genes, remains challenging. This problem is a specific feature selection problem with high-dimensional features and small sample sizes. In this paper, a two-stage method combining a filter feature selection method and a wrapper feature selection method is proposed to solve the gene selection problem. In contrast to common methods, the proposed method models the gene selection problem as a multiobjective optimization problem. Both stages employ the same multiobjective differential evolution (MODE) as the search strategy but incorporate different objective functions. The three objective functions of the filter method are mainly based on mutual information. The two objective functions of the wrapper method are the number of selected features and the classification error of a naive Bayes (NB) classifier. Finally, the performance of the proposed method is tested and analyzed on six benchmark gene expression datasets. The experimental results verified that this paper provides a novel and effective way to solve the gene selection problem by applying a multiobjective optimization algorithm.
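The two wrapper objectives mentioned above (number of selected features and classification error) are both minimized, so candidate gene subsets are compared by Pareto dominance rather than by a single score. The sketch below shows only that dominance relation and a non-dominated filter; it is not the MODE search described in the abstract, and the candidate values are hypothetical.

```python
def dominates(a, b):
    """True if `a` is no worse than `b` in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]


# Hypothetical candidates: (number of selected genes, classification error).
candidates = [(5, 0.10), (10, 0.08), (10, 0.12), (20, 0.08), (3, 0.20)]

front = pareto_front(candidates)
```

Here `(10, 0.12)` is discarded because `(5, 0.10)` uses fewer genes with lower error, and `(20, 0.08)` is discarded because `(10, 0.08)` reaches the same error with fewer genes. The remaining points represent different trade-offs between subset size and accuracy.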

A feature selection method based on effective range and SVM-RFE

International Journal of Wireless and Mobile Computing ◽

10.1504/ijwmc.2018.095669 ◽

2018 ◽

Vol 15 (2) ◽

pp. 105

Author(s):

Yifei Mao ◽

Yuansheng Yang

Keyword(s):

Feature Selection ◽

Feature Selection Method ◽

Selection Method ◽

Effective Range

A Novel Statistical Feature Selection Approach for Text Categorization

Journal of Information Processing Systems ◽

10.3745/jips.02.0076 ◽

2017 ◽

Cited By ~ 4

Author(s):

Mohamed Abdel Fattah

Keyword(s):

Feature Selection ◽

Text Categorization ◽

Statistical Feature ◽

Selection Approach ◽

Feature Selection Approach

Clustering as feature selection method in spam classification: uncovering sick-leave sellers

Applied Computing and Informatics ◽

10.1108/aci-09-2021-0248 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Mariam Elhussein ◽

Samiha Brahimi

Keyword(s):

Feature Selection ◽

Sick Leave ◽

Feature Selection Method ◽

Classification Performance ◽

Selection Method ◽

Content Type ◽

Similar Work ◽

Classifier Performance ◽

Feature Selection Approach ◽

Practical Implications

Purpose: This paper aims to propose a novel way of using textual clustering as a feature selection method. It is applied to identify the most important keywords in profile classification. The method is demonstrated through the problem of sick-leave promoters on Twitter.

Design/methodology/approach: Four machine learning classifiers were used on a total of 35,578 tweets posted on Twitter. The data were manually labeled into two categories: promoter and nonpromoter. Classification performance was compared when the proposed clustering feature selection approach and the standard feature selection were applied.

Findings: Random forest achieved the highest accuracy, 95.91%, which is higher than that of the similar work compared. Furthermore, using clustering as a feature selection method improved the sensitivity of the model from 73.83% to 98.79%. Sensitivity (recall) is the most important measure of classifier performance when detecting promoters' accounts that have spam-like behavior.

Research limitations/implications: The method applied is novel; more testing is needed on other datasets before its results can be generalized.

Practical implications: The model applied can be used by Saudi authorities to report accounts that sell sick leave online.

Originality/value: The research proposes a new way in which textual clustering can be used in feature selection.
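For readers unfamiliar with the metric highlighted in the findings, sensitivity (recall) is the share of actual positives that the classifier detects: TP / (TP + FN). The short sketch below uses made-up counts purely for illustration; they are not the paper's confusion matrix.

```python
def sensitivity(true_positives, false_negatives):
    """Recall: fraction of actual positives that were correctly detected."""
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        raise ValueError("sensitivity is undefined without any actual positives")
    return true_positives / actual_positives


# Hypothetical counts: 90 promoter accounts detected, 10 missed.
example_recall = sensitivity(true_positives=90, false_negatives=10)
```

A high sensitivity means few promoter accounts are missed, which is why the abstract treats it as the key measure for this task.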

IMMIGRATE: A Margin-Based Feature Selection Method with Interaction Terms

Entropy ◽

10.3390/e22030291 ◽

2020 ◽

Vol 22 (3) ◽

pp. 291

Author(s):

Ruzhang Zhao ◽

Pengyu Hong ◽

Jun S. Liu

Keyword(s):

Feature Selection ◽

State Of The Art ◽

Feature Selection Method ◽

Selection Method ◽

Global Information ◽

Interaction Terms ◽

Feature Interactions ◽

Wide Range ◽

Margin Maximization ◽

Base Learner

Traditional hypothesis-margin research focuses on obtaining large margins and on feature selection. In this work, we show that the robustness of margins is also critical and can be measured using entropy. In addition, our approach provides clear mathematical formulations and explanations to uncover feature interactions, which are often lacking in large hypothesis-margin based approaches. We design an algorithm, termed IMMIGRATE (Iterative max-min entropy margin-maximization with interaction terms), for training the weights associated with the interaction terms. IMMIGRATE simultaneously utilizes both local and global information and can be used as a base learner in Boosting. We evaluate IMMIGRATE on a wide range of tasks, in which it demonstrates exceptional robustness and achieves state-of-the-art results with high interpretability.
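As background for the term "hypothesis margin", the sketch below computes the classical nearest-neighbour version used by Relief-style methods: half the difference between the distance to the nearest sample of another class (near miss) and the distance to the nearest sample of the same class (near hit). This is only the unweighted baseline notion, under the assumption of plain Euclidean distance; it does not implement IMMIGRATE's entropy-based margins or interaction terms, and the data are hypothetical.

```python
from math import dist


def hypothesis_margin(index, points, labels):
    """0.5 * (distance to near miss - distance to near hit) for one sample."""
    x, y = points[index], labels[index]
    hits = [dist(x, p) for i, (p, label) in enumerate(zip(points, labels))
            if label == y and i != index]
    misses = [dist(x, p) for p, label in zip(points, labels) if label != y]
    if not hits or not misses:
        raise ValueError("need at least one other sample of each kind")
    return 0.5 * (min(misses) - min(hits))


# Hypothetical two-class data laid out along one axis.
points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0), (5.0, 0.0)]
labels = [0, 0, 1, 1]

margin_first = hypothesis_margin(0, points, labels)   # hit at 1.0, miss at 4.0
margin_second = hypothesis_margin(1, points, labels)  # hit at 1.0, miss at 3.0
```

A positive margin indicates that the sample is closer to its own class than to any other; margin-based feature selection methods learn feature weights that make such margins large.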
