F-test feature selection in Stacking ensemble model for breast cancer prediction

2020 ◽ Vol 171 ◽ pp. 1561-1570
Author(s): R Dhanya ◽ Irene Rose Paul ◽ Sai Sindhu Akula ◽ Madhumathi Sivakumar ◽ Jyothisha J Nair

2021 ◽ Vol 11 (14) ◽ pp. 6574
Author(s): Min-Wei Huang ◽ Chien-Hung Chiu ◽ Chih-Fong Tsai ◽ Wei-Chao Lin

Breast cancer prediction datasets are usually class imbalanced: the numbers of samples in the malignant and benign patient classes differ significantly. Over-sampling techniques can re-balance such datasets so that more effective prediction models can be constructed. Moreover, some related studies have used feature selection to remove irrelevant features for further performance improvement. However, because the order in which feature selection and over-sampling are combined yields different training sets for the prediction model, it is unknown which order performs better. In this paper, the information gain (IG) and genetic algorithm (GA) feature selection methods are combined with the synthetic minority over-sampling technique (SMOTE) in different orders. Experimental results on two breast cancer datasets show that, for highly class-imbalanced datasets, combining feature selection and over-sampling outperforms using either feature selection or over-sampling alone. In particular, performing IG first and SMOTE second is the better choice. For other datasets with a small class imbalance ratio and fewer features, performing SMOTE alone is enough to construct an effective prediction model.
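
The "IG first, SMOTE second" ordering described in this abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the median-split discretisation, the simplified SMOTE-style interpolation, the function names, and the toy data are all assumptions made for illustration, and the code assumes a binary problem with at least two minority samples.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(column, labels):
    """IG of one numeric feature after a binary split at its median (assumed discretisation)."""
    threshold = sorted(column)[len(column) // 2]
    left = [y for x, y in zip(column, labels) if x < threshold]
    right = [y for x, y in zip(column, labels) if x >= threshold]
    n = len(labels)
    remainder = sum(len(part) / n * entropy(part) for part in (left, right) if part)
    return entropy(labels) - remainder


def select_top_k(X, y, k):
    """Keep the k features with the highest information gain."""
    n_features = len(X[0])
    gains = [information_gain([row[j] for row in X], y) for j in range(n_features)]
    keep = sorted(sorted(range(n_features), key=lambda j: -gains[j])[:k])
    return [[row[j] for j in keep] for row in X], keep


def smote(X, y, minority, k_neighbors=3, seed=0):
    """Simplified SMOTE-style over-sampling: interpolate between minority neighbours."""
    rng = random.Random(seed)
    minority_rows = [row for row, label in zip(X, y) if label == minority]
    # Binary case: add enough synthetic rows to match the majority class size.
    n_needed = (len(y) - len(minority_rows)) - len(minority_rows)
    X_out, y_out = [list(row) for row in X], list(y)
    for _ in range(max(n_needed, 0)):
        base = rng.choice(minority_rows)
        neighbours = sorted(
            (row for row in minority_rows if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k_neighbors]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment between base and neighbour
        X_out.append([b + gap * (o - b) for b, o in zip(base, other)])
        y_out.append(minority)
    return X_out, y_out


# Toy data (hypothetical): features 0 and 2 separate the classes, feature 1 is noise.
X = [
    [1.0, 7.0, 0.2], [1.2, 3.0, 0.1], [0.9, 6.0, 0.3], [1.1, 2.0, 0.2],
    [1.3, 8.0, 0.4], [0.8, 1.0, 0.1], [1.0, 5.0, 0.3], [1.2, 4.0, 0.2],
    [3.0, 4.5, 0.9], [3.2, 2.5, 1.1], [2.9, 6.5, 1.0],
]
y = [0] * 8 + [1] * 3  # 0 = benign (majority), 1 = malignant (minority)

# Step 1: feature selection on the original, imbalanced data.
X_sel, keep = select_top_k(X, y, k=2)
# Step 2: over-sample the minority class in the reduced feature space.
X_bal, y_bal = smote(X_sel, y, minority=1)
print(keep, Counter(y_bal))
```

Swapping the two steps (over-sampling first, then ranking features on the re-balanced data) gives the other ordering compared in the abstract; the two orders can produce different training sets because the synthetic samples change the class distribution that the feature scores are computed from.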


Author(s): Quang H. Nguyen ◽ Trang T.T. Do ◽ Yijing Wang ◽ Sin Swee Heng ◽ Kelly Chen ◽ ...

2019 ◽ Vol 8 (2) ◽ pp. 6396-6399

Breast cancer examination and prediction pose major challenges to researchers in medical applications. Breast cancer examination distinguishes benign from malignant breast lumps, while breast cancer prediction aims to foretell when breast cancer is likely to recur in patients whose cancers have been excised. Feature selection is a preliminary step used to find the best subsets of attributes. In this paper, the authors discuss the performance of five classifiers, namely Sequential Minimal Optimization (SMO), Multilayer Perceptron, KStar, Decision Table and Random Forest, with and without feature selection. The results show that applying two feature selection techniques, correlation-based and information-based, each with a ranker algorithm, increases the accuracy of the classifiers.
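
The "with and without feature selection" comparison can be sketched as follows. This is only an illustration under stated assumptions: features are ranked by absolute Pearson correlation with the class label (which may differ from the correlation-based evaluator used in the paper), a leave-one-out 1-nearest-neighbour classifier stands in for the five classifiers named above, and the toy data are hypothetical and deliberately built so that an unscaled noisy feature dominates the distance computation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def rank_features(X, y):
    """Feature indices ordered from most to least correlated with the label."""
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: -scores[j])


def loo_accuracy_1nn(X, y):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier (stand-in model)."""
    correct = 0
    for i, row in enumerate(X):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: math.dist(row, X[j]),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)


# Toy data (hypothetical): feature 0 is informative, feature 1 is large-scale noise.
X = [
    [1.0, 10.0], [1.2, 50.0], [0.9, 30.0], [1.1, 70.0],
    [3.0, 12.0], [3.2, 52.0], [2.9, 32.0], [3.1, 72.0],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

ranking = rank_features(X, y)
top = ranking[:1]  # keep only the best-ranked feature
X_selected = [[row[j] for j in top] for row in X]

acc_all = loo_accuracy_1nn(X, y)
acc_selected = loo_accuracy_1nn(X_selected, y)
print(f"all features: {acc_all:.2f}, selected: {acc_selected:.2f}")
```

The gap between the two accuracies here is an artefact of the constructed data, not a reproduction of the paper's results; on real datasets the effect of feature selection depends on the classifier and on how many ranked features are kept.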

