Fuzzy PCA and Support Vector Machines for Breast Cancer Classification

2018 ◽  
Vol 7 (3.7) ◽  
pp. 62 ◽  
Author(s):  
Mohamad Faiz Dzulkalnine ◽  
Roselina Sallehuddin ◽  
Yusliza Yusoff ◽  
Nor Haizan Mohamed Radzi ◽  
Noorfa Haszlinna Mustaffa

Breast cancer is the leading cause of death among women worldwide, and early detection can increase patients' chance of survival. However, expert systems and machine learning diagnosis are burdened by irrelevant data and noise, which can reduce prediction accuracy and increase computational time. In this paper, Fuzzy Principal Component Analysis (FPCA) and Support Vector Machines (SVM) are proposed for the classification of a breast cancer dataset. Experimental results on a public breast cancer dataset show that the proposed method, FPCA-SVM, outperformed the benchmark models in terms of accuracy, specificity, sensitivity, and AUC value. The proposed model can assist doctors and medical practitioners in the early detection of breast cancer.
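The AUC value mentioned in this abstract can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it that way on made-up toy scores. The function name `auc_score` and the data are illustrative assumptions, not taken from the paper, and the code does not implement FPCA or SVM.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up toy data (assumption): 1 = malignant, 0 = benign.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
auc = auc_score(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of samples, which is fine for illustration; a rank-based computation would be the usual choice for larger datasets.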

Author(s):  
Zuherman Rustam ◽  
Yasirly Amalia ◽  
Sri Hartini ◽  
Glori Stephani Saragih

Breast cancer is abnormal cell growth in the breast that proliferates uncontrollably and forms a tumor. The tumor can be benign or malignant. A benign tumor is not cancerous and is usually not dangerous to health, whereas a malignant tumor is cancerous and can be dangerous to health. A specialist doctor diagnoses the patient and prescribes treatment based on whether the tumor is benign or malignant. Machine learning offers a time-efficient way to identify cancer cells: the machine learns patterns from the information in the dataset. Support vector machines and linear discriminant analysis are common methods for cancer classification. In this study, linear discriminant analysis and support vector machines are compared in terms of accuracy, sensitivity, specificity, and F1-score to determine which method is better at classifying a breast cancer dataset. The results show that the support vector machine performs better than linear discriminant analysis, with an accuracy of 98.77%.
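The four measures used in this comparison all follow from the confusion matrix of a binary classifier. The sketch below shows the standard definitions on made-up labels. The helper name `binary_metrics` and the data are illustrative assumptions and do not reproduce the study's results.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, specificity and F1 from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate (recall)
        "specificity": ratio(tn, tn + fp),   # true negative rate
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Made-up toy labels (assumption): 1 = malignant, 0 = benign.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In a diagnostic setting, sensitivity is often the measure to watch, since a false negative corresponds to a missed malignant case.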


2016 ◽  
Vol 762 ◽  
pp. 012050 ◽  
Author(s):  
Raquel Pezoa ◽  
Luis Salinas ◽  
Claudio Torres ◽  
Steffen Härtel ◽  
Cristián Maureira-Fredes ◽  
...  

2020 ◽  
Vol 14 ◽  

Breast cancer (BC) is among the most common and leading causes of death in women throughout the world. Classification and data analysis tools are now widely used in the medical field for diagnosis, prognosis, and decision making, helping to lower the risk of people dying or suffering from disease. Advanced machine learning methods give hope to patients because they help doctors detect potentially fatal diseases such as breast cancer early and provide accurate outcomes. However, the results depend heavily on the feature selection and classification techniques used to build a strong machine learning model. In this paper, a performance comparison is conducted using four classifiers, Multilayer Perceptron (MLP), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Random Forest, on the Wisconsin Breast Cancer dataset to identify the most effective predictors. The main goal is to apply the best machine learning classification methods to predict breast cancer as benign or malignant, evaluated by accuracy, F-measure, precision, and recall. Experimental results show that Random Forest achieves the highest accuracy, 99.26%, on this dataset and feature set, while SVM and KNN reach 97.78% and 97.04% respectively. MLP shows the lowest accuracy, 94.07%. All experiments are conducted using RStudio as the data mining platform.
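Of the four classifiers compared, K-Nearest Neighbors is the simplest to write out: a query point takes the majority label of its k closest training points. The following is a minimal sketch on made-up two-feature data, assuming Euclidean distance and k = 3. It is written in Python rather than the R environment the study used, and `knn_predict` and the data are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Label a query point by majority vote among its k nearest neighbors."""
    # Sort training points by Euclidean distance to the query.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Made-up toy data (assumption): two well-separated clusters.
train_X = [(1.0, 1.2), (0.8, 0.9), (1.3, 1.1),
           (5.0, 5.2), (4.8, 5.1), (5.3, 4.9)]
train_y = ["benign", "benign", "benign",
           "malignant", "malignant", "malignant"]

pred_a = knn_predict(train_X, train_y, (1.1, 1.0))
pred_b = knn_predict(train_X, train_y, (5.1, 5.0))
print(pred_a, pred_b)
```

On real data the features would normally be scaled first, because distance-based voting is sensitive to the units of each feature.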


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Pooja Rani ◽  
Rajneesh Kumar ◽  
Anurag Jain

Purpose: Decision support systems developed using machine learning classifiers have become a valuable tool in predicting various diseases. However, the performance of these systems is adversely affected by missing values in medical datasets. Imputation methods are used to predict these missing values. In this paper, a new imputation method called hybrid imputation optimized by the classifier (HIOC) is proposed to predict missing values efficiently.

Design/methodology/approach: The proposed HIOC uses a classifier to combine multivariate imputation by chained equations (MICE), K nearest neighbor (KNN), and mean and mode imputation methods in an optimum way. The performance of HIOC has been compared with that of MICE, KNN, and the mean and mode methods. Four classifiers, support vector machine (SVM), naive Bayes (NB), random forest (RF), and decision tree (DT), have been used to evaluate the performance of the imputation methods.

Findings: The results show that HIOC performed efficiently even with a high rate of missing values. It reduced root mean square error (RMSE) by up to 17.32% in the heart disease dataset and 34.73% in the breast cancer dataset. Correct prediction of missing values improved the accuracy of the classifiers in predicting diseases. It increased classification accuracy by up to 18.61% in the heart disease dataset and 6.20% in the breast cancer dataset.

Originality/value: The proposed HIOC is a new hybrid imputation method that can efficiently predict missing values in any medical dataset.
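Imputation quality in this kind of study is typically measured by hiding known values, imputing them, and computing the RMSE between the imputed and true values. The sketch below does this for mean imputation only, one of the baseline methods named above. It does not implement HIOC, MICE, or KNN imputation, and the helper names and data are made-up assumptions.

```python
import math


def mean_impute(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = sum(observed) / len(observed)
    return [fill if v is None else v for v in values]


def rmse_on_missing(truth, masked, imputed):
    """RMSE between true and imputed values, over the masked positions only."""
    sq_errors = [(t - i) ** 2
                 for t, m, i in zip(truth, masked, imputed) if m is None]
    return math.sqrt(sum(sq_errors) / len(sq_errors))


# Made-up toy column (assumption): two of four values are hidden.
truth = [2.0, 4.0, 6.0, 8.0]
masked = [2.0, None, 6.0, None]

imputed = mean_impute(masked)
rmse = rmse_on_missing(truth, masked, imputed)
print(imputed, round(rmse, 3))
```

A relative improvement such as those quoted in the findings would then be `(rmse_baseline - rmse_new) / rmse_baseline`, computed between two imputation methods on the same masked positions.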

