A Multi-Objective Multi-Label Feature Selection Algorithm Based on Shapley Value

Multi-label learning aims to learn functions that assign each sample its true label set. As data grow richer, feature dimensionality keeps increasing. However, high-dimensional data may contain noise, which makes multi-label learning more difficult. Feature selection is an effective way to reduce data dimensionality. In feature selection research, multi-objective optimization algorithms have shown excellent global optimization performance, and the Pareto relationship handles conflicting objectives in multi-objective problems well. Therefore, a Shapley value-fused feature selection algorithm for multi-label learning (SHAPFS-ML) is proposed. The method takes multi-label criteria as the optimization objectives, and the proposed Shapley value-based crossover and mutation operators help identify relevant, redundant, and irrelevant features. Experimental comparisons on real-world datasets show that SHAPFS-ML is an effective feature selection method for multi-label classification that can reduce the classification algorithm's computational complexity and improve classification accuracy.
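
To make the role of the Shapley value concrete, here is a minimal sketch of how it can separate relevant, redundant, and irrelevant features. It is not the SHAPFS-ML operators themselves, which the abstract does not specify. The function names, the feature names "a" to "d", and the toy scoring function are all hypothetical, and the exact enumeration used here is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley value of each feature for a set function `value`.

    `value` takes a frozenset of feature names and returns a score.
    The cost is exponential in len(features), so this is for toy sizes only.
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):  # k = size of a coalition that excludes f
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_value(subset):
    """Hypothetical subset score: 'a' and 'b' are redundant copies of one
    signal, 'c' carries independent signal, and 'd' is irrelevant."""
    score = 0.0
    if "a" in subset or "b" in subset:
        score += 0.6
    if "c" in subset:
        score += 0.3
    return score


phi = shapley_values(["a", "b", "c", "d"], toy_value)
for name in sorted(phi):
    print(name, round(phi[name], 3))
```

In this toy setting the redundant pair splits the credit for their shared signal, the irrelevant feature receives zero, and the values sum to the score of the full feature set.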


A Mixed Feature Selection Method Considering Interaction

Mathematical Problems in Engineering ◽

10.1155/2015/989067 ◽

2015 ◽

Vol 2015 ◽

pp. 1-10 ◽

Cited By ~ 3

Author(s):

Zilin Zeng ◽

Hongjun Zhang ◽

Rui Zhang ◽

Youliang Zhang

Keyword(s):

Feature Selection ◽

Rough Sets ◽

Feature Selection Method ◽

Feature Space ◽

Feature Interaction ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Neighborhood Rough Sets ◽

Real World Datasets ◽

Neighborhood Interaction

Feature interaction has gained considerable attention recently. However, many feature selection methods that consider interaction are designed only for categorical features. This paper proposes a mixed feature selection algorithm based on neighborhood rough sets that can be used to search for interacting features. Feature relevance, feature redundancy, and feature interaction are defined in the framework of neighborhood rough sets. A neighborhood interaction weight factor, which reflects whether a feature is redundant or interactive, is proposed, and a neighborhood interaction weight based feature selection algorithm (NIWFS) is put forward. To evaluate the performance of the proposed algorithm, we compare NIWFS with three other feature selection algorithms, INTERACT, NRS, and NMI, in terms of classification accuracy and the number of selected features, using C4.5 and IB1. The results on ten real-world datasets indicate that NIWFS not only handles mixed datasets directly but also reduces the dimensionality of the feature space while achieving the highest average accuracies.


A Multi-objective Non-Dominated Sorted Artificial Bee Colony Feature Selection Algorithm for Medical Datasets

Indian Journal of Science and Technology ◽

10.17485/ijst/2016/v9i45/102290 ◽

2016 ◽

Vol 9 (45) ◽

Cited By ~ 4

Author(s):

Bhuvaneswari Ragothaman ◽

B. Sarojini

Keyword(s):

Feature Selection ◽

Artificial Bee Colony ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Multi Objective ◽

Bee Colony


Predicting the Severity of Bug Reports Based on Feature Selection

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194018500158 ◽

2018 ◽

Vol 28 (04) ◽

pp. 537-558 ◽

Cited By ~ 4

Author(s):

Wenjie Liu ◽

Shanshan Wang ◽

Xin Chen ◽

He Jiang

Keyword(s):

Feature Selection ◽

Software Maintenance ◽

Feature Selection Method ◽

Selection Methods ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Bug Reports ◽

Single Feature ◽

Bug Report ◽

Severity Prediction

In the software maintenance process, predicting the severity of bug reports is an important activity. However, manually identifying the severity of bug reports is tedious and time-consuming, so automatic methods for predicting severity are urgently needed. In general, a bug report contains a large amount of descriptive natural-language text, which results in a high-dimensional feature set that poses serious challenges to traditional automatic methods. Therefore, we attempt to use automatic feature selection methods to improve the performance of severity prediction for bug reports. In this paper, we introduce a ranking-based strategy to improve existing feature selection algorithms and propose an ensemble feature selection algorithm that combines existing ones. To verify the performance of our method, we run experiments on the bug reports of Eclipse and Mozilla and compare against eight commonly used feature selection methods. The experimental results show that the ranking-based strategy can effectively improve the performance of severity prediction for bug reports by up to 54.76% on average in terms of [Formula: see text]-measure, and that it can also significantly reduce the dimension of the feature set. Meanwhile, the ensemble feature selection method achieves better results than a single feature selection algorithm.
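
As an illustration of combining several feature selectors through their rankings, the sketch below averages per-selector ranks and orders features by mean rank. This is one generic aggregation rule and an assumption on our part. The abstract does not say how the paper's ranking-based strategy or ensemble is actually built, and the selector names, terms, and scores here are hypothetical.

```python
def ranks_from_scores(scores):
    """Map feature -> rank (1 = best) for scores where higher is better.
    Ties are broken alphabetically so the result is deterministic."""
    ordered = sorted(scores, key=lambda f: (-scores[f], f))
    return {f: i + 1 for i, f in enumerate(ordered)}


def ensemble_rank(score_tables):
    """Order features by their mean rank across several selectors."""
    features = list(score_tables[0])
    rank_tables = [ranks_from_scores(t) for t in score_tables]
    mean_rank = {
        f: sum(r[f] for r in rank_tables) / len(rank_tables) for f in features
    }
    return sorted(features, key=lambda f: (mean_rank[f], f))


# Hypothetical scores that three selectors assign to three bug-report terms.
selector_a = {"crash": 0.9, "typo": 0.1, "null": 0.5}
selector_b = {"crash": 0.7, "typo": 0.2, "null": 0.8}
selector_c = {"crash": 0.6, "typo": 0.3, "null": 0.4}

ranking = ensemble_rank([selector_a, selector_b, selector_c])
print(ranking)
```

Working on ranks rather than raw scores sidesteps the fact that different selectors produce scores on incomparable scales.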


A feature selection algorithm combining information gain and multi-objective genetic search for intrusion detection system

MATEC Web of Conferences ◽

10.1051/matecconf/202133608008 ◽

2021 ◽

Vol 336 ◽

pp. 08008

Author(s):

Tao Xie

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection Rate ◽

Information Gain ◽

Detection System ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Genetic Search ◽

Multi Objective

To improve the detection rate and speed of intrusion detection systems, this paper proposes a feature selection algorithm. The algorithm uses information gain to rank the features in descending order, and then uses a multi-objective genetic algorithm to search the ranked features step by step for the optimal feature combination. We classified the Kddcup98 dataset into five classes, DOS, PROBE, R2L, and U2R, and conducted numerous experiments on each class. Experimental results show that, for each class of attack, the proposed algorithm not only speeds up feature selection but also significantly improves the detection rate.
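
The first stage described above, ranking features by information gain in descending order, can be sketched as follows for discrete features. The column names and the four toy records are hypothetical, and the subsequent multi-objective genetic search is not shown because the abstract gives no detail on it.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(column, labels):
    """H(labels) minus the entropy of labels conditioned on a discrete column."""
    n = len(labels)
    groups = {}
    for v, y in zip(column, labels):
        groups.setdefault(v, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_by_information_gain(table, labels):
    """Return feature names sorted by descending gain, plus the gains."""
    gains = {name: information_gain(col, labels) for name, col in table.items()}
    order = sorted(gains, key=lambda name: (-gains[name], name))
    return order, gains


# Hypothetical toy connection records: 'flag' separates the classes
# perfectly, while 'proto' carries no information about them.
labels = ["dos", "dos", "normal", "normal"]
table = {
    "proto": ["tcp", "udp", "tcp", "udp"],
    "flag": ["S0", "S0", "SF", "SF"],
}

order, gains = rank_by_information_gain(table, labels)
print(order, gains)
```

A search procedure could then consider features in this order, so that uninformative features are examined last or not at all.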


A Feature Selection Method for Improved Clonal Algorithm Towards Intrusion Detection

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001416590138 ◽

2016 ◽

Vol 30 (05) ◽

pp. 1659013 ◽

Cited By ~ 7

Author(s):

Chunyong Yin ◽

Luyu Ma ◽

Lu Feng

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

False Positive ◽

False Positive Rate ◽

Feature Selection Method ◽

Selection Method ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Positive Rate ◽

Better Than

Intrusion detection is a security mechanism used to detect attacks and intrusion behaviors. Because existing clonal selection algorithms applied to intrusion detection suffer from low accuracy and a high false positive rate, we propose a feature selection method for an improved clonal algorithm. The improved method detects intrusion behavior by selecting the best individual overall and cloning it. Experimental results show that the feature selection algorithm performs better than the traditional feature selection algorithm across different classifiers, and that the final detection results are better than those of the traditional clonal algorithm, with 99.6% accuracy and a 0.1% false positive rate.


Machine Learning Based Clinical Diagnosis of Liver Patients with Instance Replacement

Journal of Mobile Multimedia ◽

10.13052/jmm1550-4646.1827 ◽

2021 ◽

Author(s):

J. V. D. Prasad ◽

A. Raghuvira Pratap ◽

Babu Sallagundla

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Research Work ◽

Feature Selection Method ◽

Learning Model ◽

Disease Classification ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Huge Data ◽

Machine Learning Model

With the rapid increase in the volume of clinical data, prediction and data analysis have become very difficult. Machine learning models make it easier to work with such large amounts of data. However, a machine learning model faces many challenges, one of which is feature selection. In this research work, we propose a novel feature selection method based on statistical procedures to increase the performance of the machine learning model. Furthermore, we have tested the feature selection algorithm on a liver disease classification dataset, and the results obtained show the efficiency of the proposed method.


Artificial Bee Colony–Based Feature Selection Algorithm for Cyberbullying

The Computer Journal ◽

10.1093/comjnl/bxaa066 ◽

2020 ◽

Author(s):

Esra Sarac Essiz ◽

Murat Oturakci

Keyword(s):

Feature Selection ◽

Artificial Bee Colony ◽

Feature Selection Method ◽

Classification Performance ◽

Selection Method ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Traditional Methods ◽

Bee Colony ◽

Nature Inspired Algorithm

Artificial bee colony (ABC) is a nature-inspired optimization algorithm modeled on the search behaviour of honey bees. The main aim of this study is to examine the effects of an ABC-based feature selection algorithm on classification performance for cyberbullying, which has become a significant worldwide social issue in recent years. To this end, the classification performance of the proposed ABC-based feature selection method is compared with that of three traditional methods: information gain, ReliefF, and chi-square. Experimental results show that the ABC-based feature selection method outperforms the three traditional methods for the detection of cyberbullying. The macro-averaged F-measure on the data set increases from 0.659 to 0.8 with the proposed ABC-based feature selection method.
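
For readers unfamiliar with the metric reported above, the macro-averaged F-measure is the unweighted mean of the per-class F1 scores. The sketch below computes it from scratch. The label names and the six toy predictions are hypothetical and unrelated to the study's data.

```python
def macro_f_measure(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    f_scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f_scores.append(2 * precision * recall / (precision + recall))
        else:
            f_scores.append(0.0)  # class never predicted correctly
    return sum(f_scores) / len(f_scores)


# Hypothetical toy labels: two bullying messages and four benign ones.
y_true = ["bully", "bully", "ok", "ok", "ok", "ok"]
y_pred = ["bully", "ok", "ok", "ok", "ok", "bully"]

score = macro_f_measure(y_true, y_pred)
print(score)
```

Because every class counts equally, the minority class (here "bully") influences the score as much as the majority class, which is why the measure suits imbalanced detection tasks.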


Health economics using feature selection algorithm and regression method for prediction of ending cash

Current Signal Transduction Therapy ◽

10.2174/1574362414666191022162244 ◽

2019 ◽

Vol 14 ◽

Author(s):

Zinat Ansari

Keyword(s):

Feature Selection ◽

Health Economics ◽

Feature Selection Method ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Significance Level ◽

Short And Long Term ◽

Attribute Evaluation ◽

Interest Payments

Background: Health economics is among the academic fields that can help improve economic decision-making, such as determining cash prices. The prediction of ending cash is fundamental for internal and external users and can be useful in health economics. The most important purpose of financial reporting is to present information for predicting ending cash. The aim of this research is therefore to predict the ending cash value using feature selection and the MLR method for 2010-2012. Methods: Feature selection algorithms (Best-First, Greedy-Stepwise and Ranker) were employed in this research to identify the relevant data that affect ending cash. Results: Based on the results of the feature selection methods, the following features were indicated as the most relevant for determining ending cash by Best-First (CFS-Subset-Evaluation) and Greedy-Stepwise (CFS-Subset-Evaluation): interest payments for loans, dividends received from short and long term deposits, total net flow of investment activities, net increase (decrease) in cash, and beginning cash. Net outflow, dividends, dividends paid, interest payments for loans, and dividends received from short and long term deposits were the most important data as indicated by the Ranker (Info-Gain-Attribute-Evaluation, Gain-Ratio-Attribute-Evaluation and Symmetricer-Attribute-Evaluation). According to the Ranker (Principal-Components and ReliefF-Attribute-Evaluation), the best data for determining ending cash include beginning cash, interest payments for loans, dividends, net increase (decrease) in cash, and dividends received from short and long term deposits. The findings also indicated a positive and highly significant correlation between dividends received from short and long term deposits and beginning cash (1.00**), at a significance level of 0.01, whereas the observed correlation between interest payments for loans and ending cash (0.999**), at a significance level of 0.01, was negatively significant. Conclusions: The present research attempted to reduce the volume of data required for predicting ending cash by employing feature selection, so as to save both money and time.


MRF-RFS: A Modified Random Forest Recursive Feature Selection Algorithm for Nasopharyngeal Carcinoma Segmentation

Methods of Information in Medicine ◽

10.1055/s-0040-1721791 ◽

2020 ◽

Vol 59 (04/05) ◽

pp. 151-161

Author(s):

Yuchen Fei ◽

Fengyu Zhang ◽

Chen Zu ◽

Mei Hong ◽

Xingchen Peng ◽

...

Keyword(s):

Feature Selection ◽

Random Forest ◽

Nasopharyngeal Carcinoma ◽

Soft Tissues ◽

Feature Selection Method ◽

Selection Method ◽

Feature Subset ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Tumor Margins

Background: An accurate and reproducible method to delineate tumor margins is of great importance in clinical diagnosis and treatment. In nasopharyngeal carcinoma (NPC), due to limitations such as high variability, low contrast, and discontinuous boundaries in presenting soft tissues, the tumor margin can be extremely difficult to identify in magnetic resonance imaging (MRI), which increases the challenge of the NPC segmentation task. Objectives: The purpose of this work is to develop a semiautomatic algorithm for NPC image segmentation that requires minimal human intervention while delineating tumor margins with high accuracy and reproducibility. Methods: In this paper, we propose a novel feature selection algorithm for identifying the margin of NPC images, named modified random forest recursive feature selection (MRF-RFS). Specifically, to obtain a more discriminative feature subset for segmentation, a modified recursive feature selection method is applied to the original handcrafted feature set. Moreover, we combine the proposed feature selection method with the classical random forest (RF) in the training stage to take full advantage of its intrinsic property (i.e., the feature importance measure). Results: To evaluate the segmentation performance, we verify our method on the T1-weighted MRI images of 18 NPC patients. The experimental results demonstrate that the proposed MRF-RFS method outperforms the baseline methods and deep learning methods on the task of segmenting NPC images. Conclusion: The proposed method could be effective in NPC diagnosis and useful for guiding radiation therapy.


Flow Feature Selection Method Based on Statistics

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.1030-1032.1709 ◽

2014 ◽

Vol 1030-1032 ◽

pp. 1709-1712

Author(s):

Kai Min Song ◽

Xun Yi Ren

Keyword(s):

Feature Selection ◽

Feature Selection Method ◽

Selection Method ◽

Experimental Results ◽

Identification Algorithm ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Statistical Feature ◽

Flow Feature

Building on research into flow identification algorithms based on statistical features, this paper puts forward a statistical feature selection algorithm to reduce the number of features used in identification and to increase the speed of flow identification. The experimental results show that the algorithm can effectively reduce the number of features and improve the efficiency of identification.
