A Comprehensive Performance Analysis of Various Classifier Models for Coronary Artery Disease Prediction

Cardiovascular disease (CVD) is a major cause of death worldwide. Earlier diagnosis reduces the mortality rate. Machine learning (ML) algorithms give promising results in disease diagnosis and are now widely accepted by medical experts as clinical decision support systems. In this work, the most popular ML models are investigated and compared with one another for heart disease prediction on various metrics. The base classifiers used are Support Vector Machine (SVM), Logistic Regression, Naïve Bayes, Decision Tree, and K-Nearest Neighbour. Bagging and boosting techniques are then applied over these individual classifiers to improve the performance of the system. On the Cleveland and Statlog datasets, Naïve Bayes as an individual classifier gives the maximum accuracy of 85.13% and 84.81%, respectively. Bagging improves the accuracy of the decision tree, which is identified as a weak classifier, by 7%, a significant improvement in identifying CVD.
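The bagging idea in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the weak learner is a one-feature threshold rule (a decision stump) standing in for a full decision tree, and the toy data and the names `fit_stump`, `bagging_fit`, and `bagging_predict` are hypothetical.

```python
import random
from collections import Counter

def fit_stump(X, y):
    """Fit a one-feature threshold rule (a deliberately weak learner)."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            for left, right in ((0, 1), (1, 0)):
                preds = [left if row[f] <= t else right for row in X]
                errors = sum(p != label for p, label in zip(preds, y))
                if best is None or errors < best[0]:
                    best = (errors, f, t, left, right)
    _, f, t, left, right = best
    return lambda row: left if row[f] <= t else right

def bagging_fit(X, y, n_estimators=25, seed=0):
    """Train each weak learner on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models

def bagging_predict(models, row):
    """Aggregate the ensemble by majority vote."""
    votes = Counter(m(row) for m in models)
    return votes.most_common(1)[0][0]

# Toy data (hypothetical): two numeric features, binary label.
X = [[1, 5], [2, 4], [3, 7], [6, 1], [7, 3], [8, 2]]
y = [0, 0, 0, 1, 1, 1]
models = bagging_fit(X, y)
predictions = [bagging_predict(models, row) for row in X]
```

Each stump sees a different bootstrap sample, so individual stumps disagree on borderline rows; the majority vote averages out that variance, which is the usual explanation for why bagging tends to help unstable learners such as decision trees.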



International Journal of Cognitive Informatics and Natural Intelligence ◽

10.4018/ijcini.20211001.oa36 ◽

2021 ◽

Vol 15 (4) ◽

pp. 1-14

Author(s):

Baranidharan Balakrishnan ◽

Vinoth Kumar C. N. S.

Keyword(s):

Heart Disease ◽

Decision Tree ◽

Naive Bayes ◽

Vascular Diseases ◽

Clinical Decision ◽

Disease Diagnosis ◽

Naïve Bayes ◽

Support Vector ◽

Disease Prediction ◽

Weak Classifier


A Hybrid System to Improve the Performance of Diabetes Disease Prediction using Genetic Algorithm

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.b7374.129219 ◽

2020 ◽

Vol 9 (2) ◽

pp. 1720-1726

Keyword(s):

Genetic Algorithm ◽

Support Vector Machine ◽

Mortality Rate ◽

Decision Tree ◽

Prediction Model ◽

Naive Bayes ◽

Medical Science ◽

Naïve Bayes ◽

Support Vector ◽

Disease Prediction

Data mining currently plays a significant role in healthcare. It helps extract hidden patterns from clinical datasets for further analysis and can also be used to build tools for medical management systems. Among life-threatening diseases, diabetes mellitus is regarded as a serious disease worldwide, and because of its mortality rate, early prediction and diagnosis are very important. Considerable research is under way to reduce both the complications caused by diabetes and the mortality rate. Medical science needs to analyze an enormous quantity of clinical data for diagnosis using machine learning techniques. In recent approaches, disease datasets may contain insignificant and irrelevant features, leading to less accurate results. The aim of this paper is to analyze the existing prediction systems and develop a hybrid disease prediction model that uses the Genetic Algorithm with Naïve Bayes, Decision Tree, and Support Vector Machine classifiers for better accuracy. The proposed diabetes prediction model achieves accuracies of 0.8182, 0.8052, and 0.8312 with the Naïve Bayes, Decision Tree, and Support Vector Machine classifiers, respectively. The experimental results show that the Support Vector Machine provides higher accuracy than the other classifiers in all cases. The Pima Indian diabetes dataset is used to construct the proposed model.
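One common way to pair a Genetic Algorithm with a classifier is to let it search over feature subsets. The sketch below assumes that reading, which the abstract implies but does not spell out. The fitness function is a hypothetical stand-in for classifier accuracy, and the function names, the feature count of 8, and all parameter values are illustrative choices.

```python
import random

def genetic_feature_selection(n_features, fitness, pop_size=20,
                              generations=30, mutation_rate=0.1, seed=0):
    """Search over 0/1 feature masks; `fitness` scores a mask (higher is better)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_features)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        def tournament():
            # Pick two random masks and keep the fitter one.
            a, b = rng.sample(population, 2)
            return a if fitness(a) >= fitness(b) else b
        children = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_features)        # single-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]                   # bit-flip mutation
            children.append(child)
        population = children
        best = max(population, key=fitness)
    return best

# Hypothetical stand-in for cross-validated classifier accuracy:
# features 0, 2 and 5 are "informative"; every selected feature costs a little.
INFORMATIVE = {0, 2, 5}

def toy_fitness(mask):
    gain = sum(1.0 for i, g in enumerate(mask) if g and i in INFORMATIVE)
    return gain - 0.1 * sum(mask)

best_mask = genetic_feature_selection(8, toy_fitness)
```

In a real pipeline, `toy_fitness` would be replaced by a function that trains the chosen classifier on the masked columns and returns a validation score. Elitism guarantees that the best score never gets worse from one generation to the next.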


Heart Disease Prediction Model Using Naïve Bayes Algorithm and Machine Learning Techniques

International Journal of Engineering & Technology ◽

10.14419/ijet.v10i1.31310 ◽

2021 ◽

Vol 10 (1) ◽

pp. 46

Author(s):

Maria Yousef ◽

Prof. Khaled Batiha

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Heart Disease ◽

Prediction Model ◽

Naive Bayes ◽

Naïve Bayes ◽

Machine Learning Techniques ◽

Support Vector ◽

Disease Prediction ◽

Prediction Systems

Heart disease has become one of the major health problems affecting people worldwide, and deaths due to heart disease are increasing day by day. Heart disease prediction systems therefore play an important role in prevention, assisting doctors in making the right decision when diagnosing heart disease. Existing prediction systems suffer from high dimensionality in the selected features: many redundant or irrelevant features increase prediction time and decrease prediction accuracy. This paper aims to address the dimensionality problem by proposing a new mixed model for heart disease prediction, NB-SKDR, based on the Naïve Bayes (NB) algorithm and several machine learning techniques: Support Vector Machine, K-Nearest Neighbors, Decision Tree, and Random Forest. The prediction model consists of three main phases: preprocessing, feature selection, and classification. Its main objective is to improve the performance of the prediction system and find the best subset of features. The proposed approach uses the Naïve Bayes technique, based on Bayes' theorem, to select the best subset of features for the classification phase and to handle the high dimensionality problem by discarding unnecessary features and keeping only the important ones, in an attempt to improve the efficiency and accuracy of the classifiers. By determining the dependency between attributes, the method reduces the number of features from 13 to 6: age, gender, blood pressure, fasting blood sugar, cholesterol, and exercise-induced angina.
Dependent attributes are those in which one attribute depends on another in deciding the value of the class attribute. The dependency between attributes is measured by conditional probability, which can be computed easily with Bayes' theorem. In the classification phase, the proposed system uses different classification algorithms, namely Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), and K-Nearest Neighbors (KNN), as classifiers for predicting whether a patient has heart disease. The model is trained and evaluated on the Cleveland Heart Disease database, which contains 13 features and 303 samples. Different algorithms use different rules to produce different representations of knowledge, so the algorithms for the model are selected on the basis of their performance. In this work, DT, SVM, RF, and KNN are applied and compared to identify the algorithm best suited to achieving high accuracy in heart disease prediction. After the Naive Bayes method is combined with each of these classifiers, the performance of the combined algorithms is evaluated with specificity, sensitivity, and accuracy. The experimental results show that, of the four classification models, the combination of the Naive Bayes feature selection approach and the SVM classifier with an RBF kernel predicts heart disease with the highest accuracy, 98%. Finally, the proposed approach is compared with two other systems that use different approaches in the feature selection step: the first is based on the Genetic Algorithm (GA) technique and the second on Principal Component Analysis (PCA). The comparison shows that the Naive Bayes selection approach of the proposed system outperforms the GA and PCA approaches in terms of prediction accuracy.
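The three evaluation metrics named in this abstract follow directly from the confusion matrix. The sketch below uses the standard definitions; the function names and the ten-patient example are hypothetical and are not taken from the paper.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, tn, fp, fn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return tp, tn, fp, fn

def binary_metrics(y_true, y_pred, positive=1):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred, positive)
    return {
        # Share of all patients classified correctly.
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        # Share of diseased patients the model detects (true positive rate).
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        # Share of healthy patients the model clears (true negative rate).
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }

# Hypothetical labels: 1 = heart disease, 0 = no heart disease.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = binary_metrics(y_true, y_pred)
```

Here the model finds 3 of 4 diseased patients and clears 5 of 6 healthy ones, so sensitivity is 3/4, specificity is 5/6, and accuracy is 8/10.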


Heart Disease Prediction System using Data Mining Classification Techniques: Naïve Bayes, KNN, and Decision Tree

International Journal of Advanced Trends in Computer Science and Engineering ◽

10.30534/ijatcse/2020/82932020 ◽

2020 ◽

Vol 9 (3) ◽

pp. 3028-3035

Author(s):

Maria Theresa Viega

Keyword(s):

Data Mining ◽

Heart Disease ◽

Decision Tree ◽

Naive Bayes ◽

Naïve Bayes ◽

Disease Prediction ◽

Prediction System ◽

Classification Techniques ◽

Using Data


KLASIFIKASI SMS SPAM MENGGUNAKAN SUPPORT VECTOR MACHINE [SMS Spam Classification Using Support Vector Machine]

Jurnal Pilar Nusa Mandiri ◽

10.33480/pilar.v15i2.693 ◽

2019 ◽

Vol 15 (2) ◽

pp. 275-280

Author(s):

Agus Setiyono ◽

Hilman F Pardede

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Decision Tree ◽

Naive Bayes ◽

Naïve Bayes ◽

Support Vector ◽

Spam Detection ◽

Support Vector Machine Algorithm ◽

Data Mining Techniques ◽

To Receive

It is now common for a cellphone to receive spam messages. The large number of messages received makes it difficult for a person to classify them manually as spam or not spam. One way to overcome this problem is to use data mining for automatic classification. In this paper, we investigate three data mining techniques, Support Vector Machine, Multinomial Naïve Bayes, and Decision Tree, for automatic spam detection. Our experimental results show that Support Vector Machine is the best of the three evaluated algorithms, achieving 98.33% accuracy, while Multinomial Naïve Bayes achieves 98.13% and Decision Tree 97.10%.
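To make the Multinomial Naïve Bayes baseline concrete, here is a minimal sketch of that classifier with add-one (Laplace) smoothing on whitespace-split tokens. The four training messages, the function names, and the tokenization are assumptions for illustration; the paper's preprocessing and data are not reproduced here.

```python
import math
from collections import Counter, defaultdict

def train_multinomial_nb(messages, labels, alpha=1.0):
    """Estimate log priors and smoothed per-class word log likelihoods."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for text, label in zip(messages, labels):
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    model = {}
    for label in class_counts:
        total = sum(word_counts[label].values())
        denom = total + alpha * len(vocab)  # Laplace-smoothed denominator
        model[label] = {
            "log_prior": math.log(class_counts[label] / len(labels)),
            "log_likelihood": {
                w: math.log((word_counts[label][w] + alpha) / denom)
                for w in vocab
            },
        }
    return model

def predict(model, text):
    """Return the class with the highest posterior log score."""
    scores = {}
    for label, params in model.items():
        score = params["log_prior"]
        for w in text.lower().split():
            if w in params["log_likelihood"]:  # skip out-of-vocabulary words
                score += params["log_likelihood"][w]
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical training messages.
messages = ["win a free prize now", "free cash offer",
            "call me when you arrive", "see you at lunch"]
labels = ["spam", "spam", "ham", "ham"]
model = train_multinomial_nb(messages, labels)
label_a = predict(model, "free prize offer")
label_b = predict(model, "see you at lunch")
```

Working in log space avoids numeric underflow when many word probabilities are multiplied, and the smoothing term keeps a word unseen in one class from forcing that class's probability to zero.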


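Tinjauan Algoritma RoI (Region of Interest) Dengan Metode Pengambangan Otsu Dan Klasterisasi K-Mean; Hasil Dan Tantangannya [A Review of RoI (Region of Interest) Algorithms Using Otsu Thresholding and K-Means Clustering: Results and Challenges]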

Informatik : Jurnal Ilmu Komputer ◽

10.52958/iftk.v16i2.1961 ◽

2020 ◽

Vol 16 (2) ◽

pp. 75

Author(s):

Didit Widiyanto

Keyword(s):

Support Vector Machine ◽

Decision Tree ◽

Naive Bayes ◽

Region Of Interest ◽

Naïve Bayes ◽

Support Vector ◽

Gray Level

The accuracy of an image classification is determined by the classifier. Although the RoI (Region of Interest) does not directly determine accuracy, it determines the scope of the image classification. Three algorithms can be used as RoI algorithms: Balanced Histogram Thresholding (BHT), the Otsu algorithm, and the K-Means clustering algorithm. This paper reviews the Otsu algorithm and the K-Means clustering algorithm as used by five researchers. Of the five, three applied the Otsu algorithm and two applied the K-Means algorithm as the RoI algorithm. After the RoI operation, all five applied the GLCM (Gray Level Co-occurrence Matrix) algorithm to extract texture features. The extracted features were classified using various classifiers, including SVM (Support Vector Machine), Naive Bayes, and Decision Tree. Finally, comparing the results of the five researchers, the highest accuracy obtained was 100%, with an SVM classifier using the Otsu algorithm as the RoI algorithm, and the lowest accuracy was 52%, using the Otsu algorithm on the S channel of HSV (Hue, Saturation, Value) images.
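Otsu's method picks the gray level that maximizes the between-class variance of the two resulting pixel groups. The sketch below implements that standard criterion on a flat list of 8-bit gray values. The synthetic pixel data and the function name are hypothetical, and none of the reviewed papers' pipelines are reproduced.

```python
def otsu_threshold(pixels, levels=256):
    """Return t such that pixels <= t form the background class."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(levels):
        w_bg += hist[t]            # number of pixels at or below t
        sum_bg += t * hist[t]      # sum of their gray values
        if w_bg == 0:
            continue               # no background pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break                  # no foreground pixels left
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t

# Synthetic bimodal "image": a dark cluster and a bright cluster.
pixels = [10] * 50 + [12] * 50 + [200] * 50 + [205] * 50
threshold = otsu_threshold(pixels)
region_of_interest = [p for p in pixels if p > threshold]
```

Any threshold between the two clusters separates them equally well; this implementation returns the lowest such gray level because it only replaces the current best on a strict improvement.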


Mobile Money Fraud Prediction—A Cross-Case Analysis on the Efficiency of Support Vector Machines, Gradient Boosted Decision Trees, and Naïve Bayes Algorithms

Information ◽

10.3390/info11080383 ◽

2020 ◽

Vol 11 (8) ◽

pp. 383

Author(s):

Francis Effirim Botchey ◽

Zhen Qin ◽

Kwesi Hughes-Lartey

Keyword(s):

Developing Countries ◽

Support Vector Machines ◽

Decision Tree ◽

Naive Bayes ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Support Vector ◽

Mobile Money ◽

Vector Machines ◽

Boosted Decision Tree

The onset of COVID-19 has re-emphasized the importance of FinTech, especially in developing countries, as the major powers of the world already enjoy the advantages that come with its adoption. Handling of physical cash has been established as a means of transmitting the novel coronavirus. Research has also established that being unbanked raises the risk of sinking into abject poverty. Over the years, developing countries have piloted various forms of FinTech, but the one that has come to stay is Mobile Money Transactions (MMT). As mobile money transactions attempt to gain a foothold, they face several problems, the most important of which is mobile money fraud. This paper seeks to provide a solution to this problem by examining machine learning algorithms based on support vector machines (kernel-based), gradient boosted decision trees (tree-based), and Naïve Bayes (probabilistic), taking into consideration the imbalanced nature of the dataset. Our experiments showed that gradient boosted decision trees hold great potential in combating mobile money fraud, as they produced near-perfect results.
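The imbalance mentioned in this abstract matters because plain accuracy can look excellent on a fraud dataset while the model catches nothing. The worked example below uses hypothetical counts (10 fraudulent transactions out of 1000) and illustrative function names; it is not the paper's data or evaluation code.

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_recall_f1(y_true, y_pred, positive=1):
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical data: 1000 transactions, 10 of them fraudulent (label 1).
y_true = [1] * 10 + [0] * 990

# A model that never flags fraud.
always_legit = [0] * 1000

# A model that catches 8 of 10 frauds and raises 5 false alarms.
catches_most = [1] * 8 + [0] * 2 + [1] * 5 + [0] * 985

acc_naive = accuracy(y_true, always_legit)
prf_naive = precision_recall_f1(y_true, always_legit)
acc_model = accuracy(y_true, catches_most)
prf_model = precision_recall_f1(y_true, catches_most)
```

The do-nothing model scores 99% accuracy with zero recall, while the useful model scores only slightly higher accuracy (99.3%) but recall of 0.8. On imbalanced data, precision, recall, and F1 separate the two models far more clearly than accuracy does.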


Future Prediction of Diabetics using XG Booster Classifiers

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.c5144.029320 ◽

2020 ◽

Vol 9 (3) ◽

pp. 2128-2132

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Naive Bayes ◽

Naïve Bayes ◽

The Body ◽

Machine Learning Algorithms ◽

Support Vector ◽

Common Disease ◽

Data Set ◽

Glucose Content

Diabetes is one of the most common diseases today. This work proposes predicting the disease with machine learning techniques, so that its risk factors can be identified and its progression prevented. Early prediction can help control the disease and save lives. For early prediction, we collected a data set with 8 attributes from 200 diabetic patients. The patient's sugar level is assessed from features such as glucose content in the body and age. The main machine learning algorithms considered are Support Vector Machine (SVM), Naive Bayes (NB), K-Nearest Neighbor (KNN), and Decision Tree (DT). In the existing system, Naive Bayes reaches an accuracy of 66% and Decision Tree 70 to 71%, which is not satisfactory. With XGBoost classifiers, accuracy reaches 74% for Naïve Bayes and 89 to 90% for Decision Tree. The proposed system therefore reports accuracy in a more acceptable range. A dataset of 729 patients is stored in MongoDB; the reports of 129 of these patients are used for prediction and the remainder are used for training.
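The split described at the end of this abstract (129 of 729 patients held out for prediction, the remaining 600 used for training) amounts to a holdout split of roughly 18% test data. The sketch below assumes a random shuffle with a fixed seed; the abstract does not say how the 129 patients were chosen, and the patient IDs and the function name are placeholders.

```python
import random

def holdout_split(records, n_test, seed=0):
    """Shuffle a copy of the records and hold out the first n_test of them."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]   # (train, test)

patient_ids = list(range(729))                    # placeholder records
train_ids, test_ids = holdout_split(patient_ids, 129)
test_fraction = len(test_ids) / len(patient_ids)
```

Keeping the test patients out of training is what makes the reported accuracy an estimate of performance on unseen patients.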


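Komparasi Algoritma Kasifikasi dengan Pendekatan Level Data Untuk Menangani Data Kelas Tidak Seimbang [Comparison of Classification Algorithms with a Data-Level Approach for Handling Imbalanced Class Data]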

JURNAL ILMIAH ILMU KOMPUTER ◽

10.35329/jiik.v3i1.60 ◽

2017 ◽

Vol 3 (1) ◽

pp. 1-6

Author(s):

Ahmad Ilham

Keyword(s):

Neural Network ◽

Support Vector Machine ◽

Linear Regression ◽

Decision Tree ◽

Naive Bayes ◽

Naïve Bayes ◽

Support Vector ◽

Level Data ◽

Under Sampling

The class imbalance problem has an adverse effect on prediction accuracy. Many earlier studies have used classification algorithms to deal with imbalanced class data. This study presents under-sampling and over-sampling techniques for handling imbalanced class data. These techniques are applied at the preprocessing stage to balance the class distribution in the data. Experimental results show that the neural network (NN) outperforms decision tree (DT), linear regression (LR), naïve Bayes (NB), and support vector machine (SVM).
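The two data-level techniques named in this abstract can be illustrated with plain random resampling. This is a minimal sketch assuming random under-sampling (drop majority rows) and random over-sampling (duplicate minority rows). The abstract does not state which variants the study used, and the function name and toy data are hypothetical.

```python
import random
from collections import Counter, defaultdict

def resample(X, y, strategy="under", seed=0):
    """Balance classes by random under-sampling or random over-sampling."""
    if strategy not in ("under", "over"):
        raise ValueError("strategy must be 'under' or 'over'")
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    sizes = [len(rows) for rows in by_class.values()]
    target = min(sizes) if strategy == "under" else max(sizes)

    X_out, y_out = [], []
    for label, rows in by_class.items():
        if len(rows) >= target:
            # Keep `target` rows, sampled without replacement.
            chosen = rng.sample(rows, target)
        else:
            # Pad the minority class with random duplicates.
            extra = [rng.choice(rows) for _ in range(target - len(rows))]
            chosen = rows + extra
        X_out.extend(chosen)
        y_out.extend([label] * target)
    return X_out, y_out

# Toy imbalanced data: 8 rows of class 0, 2 rows of class 1.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2
X_under, y_under = resample(X, y, "under")
X_over, y_over = resample(X, y, "over")
class_counts_under = Counter(y_under)
class_counts_over = Counter(y_over)
```

Resampling should be applied to the training portion only. Balancing the test set as well would misstate how the classifier behaves on the real class distribution.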


Prediction of Heart Disease using Machine Learning

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b1081.0982s1019 ◽

2019 ◽

Vol 8 (2S10) ◽

pp. 474-477

Keyword(s):

Machine Learning ◽

Heart Disease ◽

Support Vector Machines ◽

Naive Bayes ◽

Naïve Bayes ◽

Support Vector ◽

Data Set ◽

Vector Machines ◽

Naive Bayes Classification ◽

Naïve Bayes Classification

Machine learning is one of the fastest-growing fields today. Machine learning (ML) and Artificial Neural Networks (ANN) are helpful in the detection and diagnosis of various heart diseases. Naïve Bayes classification is an important classification approach in machine learning. Heart disease comprises a range of disorders affecting the heart, including blood vessel problems, irregular heartbeat, weak heart muscles, congenital heart defects, cardiovascular disease, and coronary artery disease. Coronary heart disease is a common type of heart disease. It reduces blood flow to the heart, which can lead to a heart attack. In this paper, a UCI Machine Learning Repository data set of patients suffering from heart disease is analyzed using Naïve Bayes classification and support vector machines, and the classification accuracy of both methods is evaluated. The implementation is done in R.
