F-test feature selection in Stacking ensemble model for breast cancer prediction

Breast cancer prediction datasets are usually class imbalanced: the numbers of samples in the malignant and benign patient classes differ significantly. Over-sampling techniques can re-balance such datasets so that more effective prediction models can be constructed. Moreover, some related studies have used feature selection to remove irrelevant features from the datasets for further performance improvement. However, because the order in which feature selection and over-sampling are combined yields different training sets for the prediction model, it is unknown which order performs better. In this paper, the information gain (IG) and genetic algorithm (GA) feature selection methods and the synthetic minority over-sampling technique (SMOTE) are used in different combinations. Experimental results on two breast cancer datasets show that, for highly class-imbalanced datasets, combining feature selection and over-sampling outperforms using either feature selection or over-sampling alone. In particular, performing IG first and SMOTE second is the better choice. For other datasets with a small class imbalance ratio and fewer features, performing SMOTE alone is enough to construct an effective prediction model.
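As a rough illustration of the "IG first, SMOTE second" order described in this abstract, the sketch below uses only the Python standard library. The median-split information gain, the simplified SMOTE interpolation, every function name, and the toy data are assumptions made for illustration; this is not the authors' implementation or experimental setup.

```python
import math
import random
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(column, labels):
    """IG of one numeric feature after a binary split at its median (an assumption)."""
    threshold = sorted(column)[len(column) // 2]
    left = [lab for x, lab in zip(column, labels) if x < threshold]
    right = [lab for x, lab in zip(column, labels) if x >= threshold]
    gain = entropy(labels)
    for part in (left, right):
        if part:
            gain -= len(part) / len(labels) * entropy(part)
    return gain

def select_top_k(X, y, k):
    """Indices of the k features with the highest information gain."""
    gains = [information_gain([row[j] for row in X], y) for j in range(len(X[0]))]
    ranked = sorted(range(len(gains)), key=lambda j: gains[j], reverse=True)
    return sorted(ranked[:k])

def smote(minority, n_new, k_neighbors=3, seed=0):
    """Simplified SMOTE: interpolate between a minority sample and one of its
    nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k_neighbors]
        other = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

def ig_then_smote(X, y, k, minority_label=1, seed=0):
    """Feature selection first, then over-sampling in the reduced feature space."""
    keep = select_top_k(X, y, k)
    X_sel = [[row[j] for j in keep] for row in X]
    minority = [row for row, lab in zip(X_sel, y) if lab == minority_label]
    n_new = (len(y) - len(minority)) - len(minority)  # samples needed to balance
    X_new = smote(minority, n_new, seed=seed)
    return keep, X_sel + X_new, y + [minority_label] * n_new

# Toy data (hypothetical): column 0 is unrelated to the label, column 1 separates the classes.
X, y = [], []
for i in range(20):
    label = 1 if i < 5 else 0  # 5 minority vs 15 majority samples
    X.append([float((i * 7) % 5), 3.0 * label + 0.1 * i])
    y.append(label)

keep, X_bal, y_bal = ig_then_smote(X, y, k=1)
print(keep, Counter(y_bal))
```

Swapping the two steps inside `ig_then_smote` (over-sampling before ranking) would give the reverse order that the abstract compares against; in that case the information gain is computed on the synthetic samples as well.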


Breast Cancer Prediction using Feature Selection and Ensemble Voting

2019 International Conference on System Science and Engineering (ICSSE), 2019. DOI: 10.1109/icsse.2019.8823106

Author(s): Quang H. Nguyen, Trang T.T. Do, Yijing Wang, Sin Swee Heng, Kelly Chen, ...

Keyword(s): Breast Cancer, Feature Selection, Cancer Prediction


Breast Cancer Prediction Using Different Classification Algorithms with Various Feature Selection Strategies

2021. DOI: 10.1109/icicos53627.2021.9651867

Author(s): Mohamad Sabha, Bulent Tugrul

Keyword(s): Breast Cancer, Feature Selection, Classification Algorithms, Cancer Prediction, Selection Strategies


Augmentation of Classifier Accuracy through Implication of Feature Selection for Breast Cancer Prediction

International Journal of Recent Technology and Engineering - 2, 2019, Vol 8 (2), pp. 6396-6399. DOI: 10.35940/ijrte.b2216.078219

Keyword(s): Breast Cancer, Feature Selection, Random Forest, Multilayer Perceptrons, Accuracy Rate, Cancer Prediction, Malignant Breast, Selection For, Breast Lumps, Feature Selection Techniques

Breast cancer examination and prediction pose major challenges to researchers in medical applications. Breast cancer examination distinguishes benign from malignant breast lumps, while breast cancer prediction plays a large part in foretelling when breast cancer is likely to recur in patients whose cancers have been excised. Feature selection is a preliminary step used to find the best subsets of attributes. In this paper, the authors discuss the performance of five classifiers, Sequential Minimal Optimization (SMO), Multilayer Perceptrons, Kstar, Decision Table, and Random Forest, with and without feature selection. The results show that applying two feature selection techniques, correlation-based and information-based, each with a ranker algorithm, increases the accuracy rate of the classifiers. After applying the feature selection techniques, the accuracy of SMO, Multilayer Perceptrons, Kstar, Decision Trees, and Random Forest was observed to improve.
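The comparison this abstract describes (the same classifier evaluated with and without ranked feature selection) can be sketched in plain Python. Everything here is an assumption for illustration: the ranker scores features by absolute Pearson correlation with the class label, a leave-one-out 1-nearest-neighbour classifier stands in for the five classifiers named above, and the toy data is contrived so that one uninformative column dominates the distance computation.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (v - my) for x, v in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((v - my) ** 2 for v in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def rank_by_correlation(X, y):
    """(feature_index, |r|) pairs, most to least correlated with the label."""
    scores = [(j, abs(pearson([row[j] for row in X], y))) for j in range(len(X[0]))]
    return sorted(scores, key=lambda s: s[1], reverse=True)

def loo_1nn_accuracy(X, y, features):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    restricted to the given feature indices."""
    correct = 0
    for i, row in enumerate(X):
        a = [row[j] for j in features]
        nearest = min(
            (k for k in range(len(X)) if k != i),
            key=lambda k: math.dist(a, [X[k][j] for j in features]),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)

# Toy data (hypothetical): column 0 is large-scale noise, columns 1 and 2 carry signal.
y = [0, 0, 0, 0, 1, 1, 1, 1]
X = [
    [10.0, 1.0, 0.0], [-10.0, 1.2, 1.0], [5.0, 0.9, 0.0], [-5.0, 1.1, 1.0],
    [10.0, 3.0, 1.0], [-10.0, 3.2, 2.0], [5.0, 2.9, 1.0], [-5.0, 3.1, 2.0],
]

ranking = rank_by_correlation(X, y)
top2 = [j for j, _ in ranking[:2]]  # keep the two best-ranked features

acc_all = loo_1nn_accuracy(X, y, [0, 1, 2])
acc_selected = loo_1nn_accuracy(X, y, top2)
print(ranking, acc_all, acc_selected)
```

The gap between `acc_all` and `acc_selected` on this data is exaggerated by construction; it only shows the mechanics of the with/without comparison and says nothing about the accuracy gains reported in the paper.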
