An Efficient System for Early Diagnosis of Breast Cancer using Support Vector Machine

Cancer claims many lives every year, and among women breast cancer causes the most cancer deaths. Numerous studies have applied data mining techniques to improve the prediction of breast cancer risk. About 1.1 million cases of breast cancer were reported in 2004, and the numbers have risen over the years with increasing industrialization and urbanization. Breast cancer was earlier observed mostly in high-income countries such as America, but it is now also a serious problem in middle- and low-income countries in Africa, Latin America and Asia. The main objective of this paper is to create a model that can more efficiently and accurately categorize a case as malignant or benign, based on the numerical values of attributes derived from ultrasound images of breast cancer. The paper uses a Support Vector Machine (SVM) for prediction and compares it with other algorithms, namely CART, Logistic Regression and KNN, on training and test accuracy. Of the algorithms compared, SVM gives the most accurate results.


Prediction of benign and malignant breast cancer using data mining techniques

Journal of Algorithms & Computational Technology ◽

10.1177/1748301818756225 ◽

2018 ◽

Vol 12 (2) ◽

pp. 119-126 ◽

Cited By ~ 43

Author(s):

Vikas Chaurasia ◽

Saurabh Pal ◽

BB Tiwari

Keyword(s):

Breast Cancer ◽

Data Mining ◽

Low Income ◽

Prediction Models ◽

Naive Bayes ◽


Low Income Countries ◽

Breast Cancer Dataset ◽

Cancer Dataset ◽

Rbf Network

Breast cancer is the second most common cancer in women. Around 1.1 million cases were recorded in 2004. Observed rates of this cancer increase with industrialization and urbanization, and also with facilities for early detection. It remains much more common in high-income countries but is now increasing rapidly in middle- and low-income countries, including in Africa, much of Asia, and Latin America. Breast cancer is fatal in under half of all cases and is the leading cause of death from cancer in women, accounting for 16% of all cancer deaths worldwide. This paper reports on the use of available technological advances to develop prediction models for breast cancer survivability. We used three popular data mining algorithms (Naïve Bayes, RBF Network, J48) to develop the prediction models on a large dataset (683 breast cancer cases). We also used 10-fold cross-validation to obtain an unbiased estimate of the performance of the three prediction models for comparison. The results (based on average accuracy on the breast cancer dataset) indicated that Naïve Bayes is the best predictor, with 97.36% accuracy on the holdout sample (better than any prediction accuracy reported in the literature); RBF Network came second with 96.77% accuracy, and J48 came third with 93.41% accuracy.
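The 10-fold cross-validation mentioned in this abstract can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' setup: the names `k_fold_indices` and `cross_val_accuracy` are hypothetical, and the `fit` and `predict` callables stand in for whichever classifier (Naïve Bayes, RBF Network, J48) is being evaluated. Only the sample count of 683 is taken from the abstract.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, test_idx
        start += size


def cross_val_accuracy(fit, predict, X, y, k=10, seed=0):
    """Average held-out accuracy of a model over k folds.

    `fit(X_train, y_train)` returns a model and `predict(model, x)` returns
    a label; both are placeholders for a real classifier.
    """
    scores = []
    for train_idx, test_idx in k_fold_indices(len(y), k, seed):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct = sum(predict(model, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


# Each of the 683 cases is used for testing exactly once.
folds = list(k_fold_indices(683, k=10))
fold_sizes = [len(test) for _, test in folds]
```

Because every case is held out exactly once, the averaged score is computed only from predictions on data the model did not see during training.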


Using Stratified Sample and Grid Search to Improve Disease Prediction Accuracy of SVM

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.295-298.644 ◽

2013 ◽

Vol 295-298 ◽

pp. 644-647 ◽

Cited By ~ 1

Author(s):

Yu Kai Yao ◽

Hong Mei Cui ◽

Ming Wei Len ◽

Xiao Yun Chen

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Classification Accuracy ◽

Prediction Accuracy ◽

Support Vector ◽

Disease Prediction ◽

Data Mining Algorithm ◽

Grid Search ◽

Mining Algorithm ◽

Stratified Sample

SVM (Support Vector Machine) is a powerful data mining algorithm used mainly for classification and regression tasks. In this paper, SVM is used for disease prediction. We integrate stratified sampling and grid search to improve the classification accuracy of SVM, and propose an improved algorithm named SGSVM: Stratified sample and Grid search based SVM. To test the performance of SGSVM, heart-disease data from UCI are used in our experiment. The results show that SGSVM clearly improves classification accuracy, which is especially valuable in disease prediction.
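The two ingredients named in this abstract, stratified sampling and grid search, can be sketched in plain Python. This is an illustrative sketch, not the SGSVM algorithm itself: `stratified_split`, `grid_search` and `toy_score` are hypothetical names, the parameter names `C` and `gamma` are assumed because they are the usual SVM settings to tune, and `toy_score` is a made-up stand-in for the validation accuracy of a trained SVM.

```python
import itertools
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Split indices so each class keeps roughly its share in both parts."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for members in by_class.values():
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)


def grid_search(param_grid, score_fn):
    """Try every parameter combination and keep the best-scoring one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Toy labels: 60 negatives and 40 positives.
labels = [0] * 60 + [1] * 40
train_idx, test_idx = stratified_split(labels, test_fraction=0.25)


def toy_score(C, gamma):
    # Placeholder score surface (an assumption); in practice this would train
    # an SVM with (C, gamma) and return its validation accuracy.
    return -((C - 10) ** 2) - (gamma - 0.1) ** 2


best_params, best_score = grid_search(
    {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1.0]}, toy_score
)
```

Stratifying before the search keeps the class ratio of the full dataset in both the training and test parts, so the scores compared by the grid search are not distorted by an unlucky split.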


Improving the performance of support-vector machine by selecting the best features by Gray Wolf algorithm to increase the accuracy of diagnosis of breast cancer

Journal Of Big Data ◽

10.1186/s40537-019-0247-7 ◽

2019 ◽

Vol 6 (1) ◽

Cited By ~ 3

Author(s):

Seyed Reza Kamel ◽

Reyhaneh YaghoubZadeh ◽

Maryam Kheirabadi

Keyword(s):

Breast Cancer ◽

Data Mining ◽

Support Vector Machine ◽

Feature Selection Method ◽

Breast Cancer Diagnosis ◽

Support Vector ◽

Gray Wolf ◽

Diagnose Breast Cancer ◽

Computer Methods ◽

A New Technique

Breast cancer is one of the most common diseases among women, and its early diagnosis is of paramount importance. Because diagnosing the disease is time-consuming, computational methods are extremely important for early detection. Today, the main emphasis is on data mining as one of the computer methods used in diagnosis. In the present study, we combined feature selection by Gray Wolf Optimization (GWO) with a support vector machine (SVM), a new technique with high accuracy compared to other methods in this classification task, to increase the accuracy of breast cancer diagnosis. The proposed method was evaluated on the UCI dataset, and functional parameters and various statistical criteria were applied in MATLAB to assess the validity of the results. The proposed method improved the evaluated criteria and increased the accuracy of diagnosis by 27.68% compared to former works in the field. It could therefore be concluded that the proposed method has a higher ability to diagnose breast cancer than previous techniques.


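Perbandingan Teknik Klasifikasi Neural Network, Support Vector Machine, dan Naive Bayes dalam Mendeteksi Kanker Payudara [Comparison of Neural Network, Support Vector Machine, and Naive Bayes Classification Techniques in Detecting Breast Cancer]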

BINA INSANI ICT JOURNAL ◽

10.51211/biict.v7i1.1343 ◽

2020 ◽

Vol 7 (1) ◽

pp. 53

Author(s):

Derisma Derisma ◽

Fajri Febrian

Keyword(s):

Breast Cancer ◽

Neural Network ◽

Data Mining ◽

Support Vector Machine ◽

Naive Bayes ◽


Support Vector ◽

Accuracy Rate ◽

Cancer Disease ◽

Network Support

Breast cancer is a type of cancer commonly found in women. In Indonesia, breast cancer ranks first among hospitalized patients across all hospitals. This study aimed to carry out a computation-based diagnosis of breast cancer that reports a patient's cancer status based on the accuracy of the algorithm. The study used Python with Orange and the Wisconsin Breast Cancer dataset to model breast cancer classification. The data mining methods applied were Neural Network, Support Vector Machine, and Naive Bayes. Kernel SVM was the best classification algorithm, with an accuracy of 98.9%, and Naive Bayes was the lowest, with 96.1%. Keywords: breast cancer, neural network, support vector machine, naive bayes


Efficient Data-Mining Algorithm for Predicting Heart Disease Based on an Angiographic Test

Malaysian Journal of Medical Sciences ◽

10.21315/mjms2021.28.5.12 ◽

2021 ◽

Vol 28 (5) ◽

pp. 118-129

Author(s):

Alabi Waheed Banjoko ◽

Kawthar Opeyemi Abdulazeez

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Heart Disease ◽

Cross Validation ◽

Classification Method ◽

Support Vector ◽

Data Mining Algorithm ◽

Machine Method ◽

Mining Algorithm ◽

Splitting Ratio

Background: The computerised classification and prediction of heart disease can help medical personnel reach a fast and accurate diagnosis. This study presents an efficient classification method for predicting heart disease using a data-mining algorithm. Methods: The algorithm uses the weighted support vector machine method to classify heart disease based on a binary response that indicates the presence or absence of heart disease as the result of an angiographic test. The optimal values of the support vector machine and Radial Basis Function kernel parameters for the heart disease classification were determined via 10-fold cross-validation. The heart disease data was partitioned into training and testing sets using different splitting ratios. Each training set was used to train the classification method, while the predictive power of the method was evaluated on the corresponding test set using the Monte-Carlo cross-validation resampling technique. The effect of the different splitting ratios on the method was also observed. Results: The misclassification error rate was used to compare the performance of the method with three selected machine learning methods, and the proposed method was observed to perform best in all cases considered. Conclusion: The results illustrate that the classification algorithm presented can effectively predict the heart disease status of an individual based on the results of an angiographic test.
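The evaluation procedure described here (repeated random splits at several splitting ratios, scored by misclassification error) can be sketched as below. This is a hedged illustration only: the function names are hypothetical, the data are synthetic, and the midpoint-threshold classifier is a deliberately simple stand-in, not the weighted support vector machine used in the study.

```python
import random


def misclassification_rate(y_true, y_pred):
    """Fraction of predictions that disagree with the true labels."""
    wrong = sum(t != p for t, p in zip(y_true, y_pred))
    return wrong / len(y_true)


def monte_carlo_cv(X, y, fit, predict, train_fraction, n_repeats=100, seed=0):
    """Average test error over repeated random train/test splits."""
    rng = random.Random(seed)
    n_train = int(len(y) * train_fraction)
    errors = []
    for _ in range(n_repeats):
        order = list(range(len(y)))
        rng.shuffle(order)
        train, test = order[:n_train], order[n_train:]
        model = fit([X[i] for i in train], [y[i] for i in train])
        preds = [predict(model, X[i]) for i in test]
        errors.append(misclassification_rate([y[i] for i in test], preds))
    return sum(errors) / len(errors)


def fit_midpoint(X_train, y_train):
    # Stand-in classifier (an assumption, NOT a weighted SVM): the decision
    # threshold is the midpoint between the two class means.
    pos = [x for x, label in zip(X_train, y_train) if label == 1]
    neg = [x for x, label in zip(X_train, y_train) if label == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_midpoint(threshold, x):
    return 1 if x > threshold else 0


# Synthetic one-feature data: 100 cases per class, class means 0 and 3.
data_rng = random.Random(42)
X = [data_rng.gauss(0, 1) for _ in range(100)] + [data_rng.gauss(3, 1) for _ in range(100)]
y = [0] * 100 + [1] * 100

# Mean test error for each train/test splitting ratio.
error_by_ratio = {
    ratio: monte_carlo_cv(X, y, fit_midpoint, predict_midpoint, ratio)
    for ratio in (0.6, 0.7, 0.8)
}
```

Unlike k-fold cross-validation, the Monte-Carlo variant draws each split independently, so the splitting ratio and the number of repeats can be chosen separately.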


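Komparasi Algoritma Naive Bayes, Decision Tree dan Support Vector Machine untuk Prediksi Penyakit Kanker Payudara [Comparison of the Naive Bayes, Decision Tree and Support Vector Machine Algorithms for Breast Cancer Prediction]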

Jurnal Teknik Komputer ◽

10.31294/jtk.v7i1.9191 ◽

2021 ◽

Vol 7 (1) ◽

pp. 51-54

Author(s):

Lusa Indah Prahartiwi ◽

Wulan Dari

Keyword(s):

Breast Cancer ◽

Data Mining ◽

Support Vector Machine ◽

Decision Tree ◽

Naive Bayes ◽


Support Vector

Breast cancer is the most common cancer in women worldwide, accounting for 25.4% of all new cases diagnosed in 2018. Cancer is a large group of diseases that can start in almost any organ or tissue of the body when abnormal cells grow uncontrollably, go beyond their usual boundaries to invade adjoining parts of the body, and/or spread to other organs. Breast cancer can be predicted using data mining. Data mining can discover meaningful new correlations, patterns and trends by sifting through large amounts of data stored in repositories, using pattern recognition technologies as well as statistical and mathematical techniques. This study compares the performance of the Naive Bayes, Decision Tree and Support Vector Machine algorithms for predicting breast cancer. The dataset used is the secondary Breast Cancer Coimbra dataset taken from the UCI Repository. The results show that the Support Vector Machine algorithm achieved the highest accuracy, 74.29%, compared with the Naive Bayes and Decision Tree algorithms.


Data Mining Algorithm and the Effectiveness of Mathematics Classroom Teaching based on Support Vector Machine

International Journal of Database Theory and Application ◽

10.14257/ijdta.2016.9.11.15 ◽

2016 ◽

Vol 9 (11) ◽

pp. 163-174 ◽

Cited By ~ 1

Author(s):

Tang Qiang

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Mathematics Classroom ◽

Classroom Teaching ◽

Support Vector ◽

Data Mining Algorithm ◽

Mining Algorithm


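MODEL KLASIFIKASI KEPUASAN MAHASISWA TEKNIK TERHADAP SARANA PEMBELAJARAN MENGGUNAKAN DATA MINING [A Classification Model of Engineering Students' Satisfaction with Learning Facilities Using Data Mining]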

Jurnal Teknologi Informasi Jurnal Keilmuan dan Aplikasi Bidang Teknik Informatika ◽

10.47111/jti.v14i2.1222 ◽

2020 ◽

Vol 14 (2) ◽

pp. 112-118

Author(s):

Ariesta Lestari ◽

Elga Mariati ◽

Widiatry Widiatry

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Decision Tree ◽

Naive Bayes ◽


Support Vector ◽

Data Mining Algorithm ◽

Prediction System ◽

Data Mining Approach

Students are one of the stakeholders in a university. Therefore, students' perception of the quality of learning facilities and infrastructure is important to the university's performance. The Faculty of Engineering of the University of Palangka Raya has not comprehensively evaluated students' satisfaction with its learning facilities. In this research, data mining methods were used to classify whether students are satisfied with the quality of the learning facilities in the Faculty of Engineering. The research compared three data mining algorithms, Decision Tree C4.5, Support Vector Machine, and Naïve Bayes, to find the best algorithm for the prediction system. Of the 948 responses collected, 61% of respondents were satisfied with the quality of the learning facilities and infrastructure, while 39% were dissatisfied. Decision Tree C4.5 performed best, with an accuracy of 88% and a precision of 98%, compared with Naïve Bayes and Support Vector Machine.


KLASIFIKASI SMS SPAM MENGGUNAKAN SUPPORT VECTOR MACHINE [SMS Spam Classification Using Support Vector Machine]

Jurnal Pilar Nusa Mandiri ◽

10.33480/pilar.v15i2.693 ◽

2019 ◽

Vol 15 (2) ◽

pp. 275-280

Author(s):

Agus Setiyono ◽

Hilman F Pardede

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Decision Tree ◽

Naive Bayes ◽


Support Vector ◽

Spam Detection ◽

Support Vector Machine Algorithm ◽

Data Mining Techniques ◽

To Receive

It is now common for a cellphone to receive spam messages. The large number of received messages makes it difficult for a human to classify them as spam or not spam. One way to overcome this problem is to use data mining for automatic classification. In this paper, we investigate three data mining techniques, namely Support Vector Machine, Multinomial Naïve Bayes and Decision Tree, for automatic spam detection. Our experimental results show that Support Vector Machine is the best of the three evaluated algorithms: it achieves 98.33% accuracy, while Multinomial Naïve Bayes achieves 98.13% and Decision Tree 97.10%.
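To make the text-classification setting concrete, here is a small sketch of one of the compared baselines, Multinomial Naïve Bayes with Laplace smoothing, written with the Python standard library only. It is not the authors' implementation: the function names, the whitespace tokenizer and the four training messages are all invented for illustration, and the best-performing method in the paper (Support Vector Machine) is not shown.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train_multinomial_nb(messages, labels, alpha=1.0):
    """Estimate log priors and Laplace-smoothed log word likelihoods."""
    vocab = {word for msg in messages for word in tokenize(msg)}
    word_counts = {label: Counter() for label in set(labels)}
    for msg, label in zip(messages, labels):
        word_counts[label].update(tokenize(msg))
    class_totals = Counter(labels)
    model = {"vocab": vocab, "log_prior": {}, "log_likelihood": {}}
    for label, counts in word_counts.items():
        model["log_prior"][label] = math.log(class_totals[label] / len(labels))
        denom = sum(counts.values()) + alpha * len(vocab)
        model["log_likelihood"][label] = {
            word: math.log((counts[word] + alpha) / denom) for word in vocab
        }
    return model


def predict_nb(model, text):
    """Return the label with the highest posterior log score."""
    scores = {}
    for label, prior in model["log_prior"].items():
        likelihood = model["log_likelihood"][label]
        # Words never seen during training are ignored.
        scores[label] = prior + sum(
            likelihood[w] for w in tokenize(text) if w in model["vocab"]
        )
    return max(scores, key=scores.get)


# Invented toy messages, not data from the paper.
train_messages = [
    "win a free prize now",
    "free cash offer claim now",
    "are we meeting for lunch today",
    "see you at home tonight",
]
train_labels = ["spam", "spam", "ham", "ham"]

model = train_multinomial_nb(train_messages, train_labels)
prediction_spam = predict_nb(model, "claim your free prize")
prediction_ham = predict_nb(model, "lunch at home today")
```

The same word-count features could be passed to an SVM or a decision tree, which is what makes a three-way comparison like the one in the abstract possible.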


Predicting Risk of Antenatal Depression and Anxiety Using Multi-Layer Perceptrons and Support Vector Machines

Journal of Personalized Medicine ◽

10.3390/jpm11030199 ◽

2021 ◽

Vol 11 (3) ◽

pp. 199

Author(s):

Fajar Javed ◽

Syed Omer Gilani ◽

Seemab Latif ◽

Asim Waris ◽

Mohsin Jamil ◽

...

Keyword(s):

Low Income ◽

Operating Characteristic ◽

Mental Health Problems ◽

Characteristic Curve ◽

Antenatal Depression ◽

Low Income Countries ◽

Support Vector ◽

Depression And Anxiety ◽

Gynecology And Obstetrics ◽

Operating Characteristic Curve

Perinatal depression and anxiety are defined as the mental health problems a woman faces during pregnancy, around childbirth, and after delivery. While these problems occur often and affect all family members including the infant, they can easily go undetected and underdiagnosed. The prevalence rates of antenatal depression and anxiety worldwide, especially in low-income countries, are extremely high. The vast majority of affected women suffer from mild to moderate depression, which risks impairing the child–mother relationship and infant health, and a few women end up taking their own lives. Owing to high costs and non-availability of resources, it is almost impossible to diagnose every pregnant woman for depression/anxiety, whereas under-detection can have a lasting impact on the health of mother and child. This work proposes a multi-layer perceptron based neural network (MLP-NN) classifier to predict the risk of depression and anxiety in pregnant women. We trained and evaluated our proposed system on a Pakistani dataset of 500 women in their antenatal period. ReliefF was used for feature selection before classifier training. Evaluation metrics such as accuracy, sensitivity, specificity, precision, F1 score, and area under the receiver operating characteristic curve were used to evaluate the performance of the trained model. The multilayer perceptron and the support vector classifier achieved an area under the receiver operating characteristic curve of 88% and 80% for antenatal depression and 85% and 77% for antenatal anxiety, respectively. The system can be used as a facilitator for screening women during their routine visits to the hospital's gynecology and obstetrics departments.
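The evaluation metrics listed in this abstract can be computed as in the following sketch, which uses only the Python standard library. The function names and the eight toy labels and scores are invented for illustration; the code assumes binary labels (1 = at risk, 0 = not at risk), that both classes are present, and that at least one positive prediction is made, since otherwise some ratios are undefined.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, precision and F1 score."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case (ties count half).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy labels and classifier scores, not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

The area under the curve is computed from the raw scores rather than the thresholded predictions, which is why it can differ from accuracy for the same classifier.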
