RB-Bayes algorithm for the prediction of diabetic in Pima Indian dataset

<p>Diabetes is a major concern all over the world. It is increasing at a fast pace. People can avoid diabetes at an early stage without any test. The goal of this paper is to predict the probability of whether the person has a risk of diabetes or not at an early stage. This would lead to having a great impact on their quality of human life. The datasets are Pima Indians diabetes and Cleveland coronary illness and consist of 768 records. Though there are a number of solutions available for information extraction from a huge datasets and to predict the possibility of having diabetes, but the accuracy of their mining process is far from accurate. For achieving highest accuracy, the issue of zero probability which is generally faced by naïve bayes analysis needs to be addressed suitably. The proposed framework RB-Bayes aims to extract the required information with high accuracy that could survive the problem of zero probability and also configure accuracy with other methods like Support Vector Machine, Naive Bayes, and K Nearest Neighbor. We calculated mean to handle missing data and calculated probability for yes (positive) and no (negative). The highest value between yes and no decide the value for the tuple. It is mostly used in text classification. The outcomes on Pima Indian diabetes dataset demonstrate that the proposed methodology enhances the precision as a contrast with other regulated procedures. The accuracy of the proposed methodology large dataset is 72.9%.</p>

Download Full-text

Effectiveness of Classification Methods on the Diabetes System

Asian Journal of Research in Computer Science ◽

10.9734/ajrcos/2021/v12i330287 ◽

2021 ◽

pp. 33-43

Author(s):

Ahmed T. Shawky ◽

Ismail M. Hagag

Keyword(s):

Machine Learning ◽

Naive Bayes ◽

Early Stage ◽

Research Paper ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Support Vector ◽

K Nearest Neighbor ◽

Machine Learning Classification ◽

Bayes Algorithm

In today’s world using data mining and classification is considered to be one of the most important techniques, as today’s world is full of data that is generated by various sources. However, extracting useful knowledge out of this data is the real challenge, and this paper conquers this challenge by using machine learning algorithms to use data for classifiers to draw meaningful results. The aim of this research paper is to design a model to detect diabetes in patients with high accuracy. Therefore, this research paper using five different algorithms for different machine learning classification includes, Decision Tree, Support Vector Machine (SVM), Random Forest, Naive Bayes, and K- Nearest Neighbor (K-NN), the purpose of this approach is to predict diabetes at an early stage. Finally, we have compared the performance of these algorithms, concluding that K-NN algorithm is a better accuracy (81.16%), followed by the Naive Bayes algorithm (76.06%).

Download Full-text

KOMPARASI NAÏVE BAYES, SUPPORT VECTOR MACHINE DAN K-NEAREST NEIGHBOR UNTUK MENGETAHUI AKURASI TERTINGGI PADA PREDIKSI KELANCARAN PEMBAYARAN TV KABEL

ILKOM Jurnal Ilmiah ◽

10.33096/ilkom.v11i1.408.11-16 ◽

2019 ◽

Vol 11 (1) ◽

pp. 11-16

Author(s):

Mohamad Efendi Lasulika

Keyword(s):

Neural Network ◽

Support Vector Machine ◽

Nearest Neighbor ◽

Naive Bayes ◽

Naïve Bayes ◽

Support Vector ◽

K Nearest Neighbor ◽

Data Types ◽

Neural Network Algorithm ◽

Bayes Algorithm

One obstacle of the default payment is the lack of analysis in the new customer acceptance process which is only reviewed from the form provided at registration, as for the purpose of this study to find out the highest accuracy results from the comparison of Naïve Bayes, SVM and K-NN Algorithms. It can be seen that the Naïve Bayes algorithm which has the highest accuracy value is 96%, while the K-Neural Network algorithm has the highest accuracy at K = 3 which is 92%, while Support Vector Machine only gets accuracy of 66%. The ROC Curve results show that Naïve Bayes achieved the best AUC value of 0.99. Comparison between data mining classification algorithms namely Naïve Bayes, K-Neural Network and Support Vector Machine for predicting smooth payment using multivariate data types, Naïve Bayes method is an accurate algorithm and this method is also very dominant towards other methods. Based on Accuracy, AUC and T-tests this method falls into the best classification category.

Download Full-text

Prediksi Harga Minyak Kelapa Sawit Dalam Investasi Dengan Membandingkan Algoritma Naïve Bayes, Support Vector Machine dan K-Nearest Neighbor

IT for Society ◽

10.33021/itfs.v4i1.1181 ◽

2019 ◽

Vol 4 (1) ◽

Author(s):

Deny Haryadi ◽

Rila Mandala

Keyword(s):

Support Vector Machine ◽

Nearest Neighbor ◽

Naive Bayes ◽

Naïve Bayes ◽

Support Vector ◽

K Nearest Neighbor

Harga minyak kelapa sawit bisa mengalami kenaikan, penurunan maupun tetap setiap hari karena faktor yang mempengaruhi harga minyak kelapa sawit seperti harga minyak nabati lain (minyak kedelai dan minyak canola), harga minyak mentah dunia, maupun nilai tukar riil antara kurs dolar terhadap mata uang negara produsen (rupiah, ringgit, dan canada) atau mata uang negara konsumen (rupee). Untuk itu dibutuhkan prediksi harga minyak kelapa sawit yang cukup akurat agar para investor bisa mendapatkan keuntungan sesuai perencanaan yang dibuat. tujuan dari penelitian ini yaitu untuk mengetahui perbandingan accuracy, precision, dan recall yang dihasilkan oleh algoritma Naïve Bayes, Support Vector Machine, dan K-Nearest Neighbor dalam menyelesaikan masalah prediksi harga minyak kelapa sawit dalam investasi. Berdasarkan hasil pengujian dalam penelitian yang telah dilakukan, algoritma Support Vector Machine memiliki accuracy, precision, dan recall dengan jumlah paling tinggi dibandingkan dengan algoritma Naïve Bayes dan algoritma K-Nearest Neighbor. Nilai accuracy tertinggi pada penelitian ini yaitu 82,46% dengan precision tertinggi yaitu 86% dan recall tertinggi yaitu 89,06%.

Download Full-text

Comparative analysis on bayesian classification for breast cancer problem

Bulletin of Electrical Engineering and Informatics ◽

10.11591/eei.v8i4.1628 ◽

2019 ◽

Vol 8 (4) ◽

Author(s):

Wan Nor Liyana Wan Hassan Ibeni ◽

Mohd Zaki Mohd Salikon ◽

Aida Mustapha ◽

Saiful Adli Daud ◽

Mohd Najib Mohd Salleh

Keyword(s):

Breast Cancer ◽

Bayesian Networks ◽

Nearest Neighbor ◽

Naive Bayes ◽

Likelihood Estimation ◽

Predictive Distribution ◽

Naïve Bayes ◽

Machine Learning Algorithms ◽

Support Vector ◽

K Nearest Neighbor

The problem of imbalanced class distribution or small datasets is quite frequent in certain fields especially in medical domain. However, the classical Naive Bayes approach in dealing with uncertainties within medical datasets face with the difficulties in selecting prior distributions, whereby parameter estimation such as the maximum likelihood estimation (MLE) and maximum a posteriori (MAP) often hurt the accuracy of predictions. This paper presents the full Bayesian approach to assess the predictive distribution of all classes using three classifiers; naïve bayes (NB), bayesian networks (BN), and tree augmented naïve bayes (TAN) with three datasets; Breast cancer, breast cancer wisconsin, and breast tissue dataset. Next, the prediction accuracies of bayesian approaches are also compared with three standard machine learning algorithms from the literature; K-nearest neighbor (K-NN), support vector machine (SVM), and decision tree (DT). The results showed that the best performance was the bayesian networks (BN) algorithm with accuracy of 97.281%. The results are hoped to provide as base comparison for further research on breast cancer detection. All experiments are conducted in WEKA data mining tool.

Download Full-text

COMPARATIVE STUDY OF CLASSIFICATION ALGORITHMS: HOLDOUTS AS ACCURACY ESTIMATION

CogITo Smart Journal ◽

10.31154/cogito.v1i1.2.13-23 ◽

2016 ◽

Vol 1 (1) ◽

pp. 13 ◽

Cited By ~ 1

Author(s):

Debby Erce Sondakh

Keyword(s):

Decision Tree ◽

Nearest Neighbor ◽

Naive Bayes ◽

Decision Rules ◽

Naïve Bayes ◽

Support Vector ◽

Classification Algorithms ◽

K Nearest Neighbor ◽

Accuracy Estimation ◽

F Measure

Penelitian ini bertujuan untuk mengukur dan membandingkan kinerja lima algoritma klasifikasi teks berbasis pembelajaran mesin, yaitu decision rules, decision tree, k-nearest neighbor (k-NN), naïve Bayes, dan Support Vector Machine (SVM), menggunakan dokumen teks multi-class. Perbandingan dilakukan pada efektifiatas algoritma, yaitu kemampuan untuk mengklasifikasi dokumen pada kategori yang tepat, menggunakan metode holdout atau percentage split. Ukuran efektifitas yang digunakan adalah precision, recall, F-measure, dan akurasi. Hasil eksperimen menunjukkan bahwa untuk algoritma naïve Bayes, semakin besar persentase dokumen pelatihan semakin tinggi akurasi model yang dihasilkan. Akurasi tertinggi naïve Bayes pada persentase 90/10, SVM pada 80/20, dan decision tree pada 70/30. Hasil eksperimen juga menunjukkan, algoritma naïve Bayes memiliki nilai efektifitas tertinggi di antara lima algoritma yang diuji, dan waktu membangun model klasiifikasi yang tercepat, yaitu 0.02 detik. Algoritma decision tree dapat mengklasifikasi dokumen teks dengan nilai akurasi yang lebih tinggi dibanding SVM, namun waktu membangun modelnya lebih lambat. Dalam hal waktu membangun model, k-NN adalah yang tercepat namun nilai akurasinya kurang.

Download Full-text

Comparison of the Performance of the k-Nearest Neighbor, Naïve Bayes Classifier and Support Vector Machine Algorithm With SMOTE for Classification of Bully Behavior on the WhatsApp Messenger Application

Proceedings of the 1st International Conference on Folklore, Language, Education and Exhibition (ICOFLEX 2019) ◽

10.2991/assehr.k.201230.028 ◽

2020 ◽

Author(s):

Irwansyah Saputra ◽

Puput Irfansyah ◽

Erlando Doni Sirait ◽

Dwi Dani Apriyani ◽

Michael Sonny

Keyword(s):

Support Vector Machine ◽

Nearest Neighbor ◽

Naive Bayes ◽

Naïve Bayes ◽

Support Vector ◽

Support Vector Machine Algorithm ◽

Bayes Classifier ◽

K Nearest Neighbor ◽

Bully Behavior

Download Full-text

KOMPARASI ALGORITMA KLASIFIKASI PADA ANALISIS REVIEW HOTEL

Jurnal Pilar Nusa Mandiri ◽

10.33480/pilar.v14i2.1023 ◽

2018 ◽

Vol 14 (2) ◽

pp. 261

Author(s):

Lila Dini Utami

Keyword(s):

Support Vector Machine ◽

Nearest Neighbor ◽

Naive Bayes ◽

Service Providers ◽

Naïve Bayes ◽

Support Vector ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

K Nearest Neighbor Algorithm ◽

Auc Value

At this time the freedom to express opinions in oral and written forms about everything is very easy. This activity can be used to make decisions by some business people. Especially by service providers, such as hotels. This will be very useful in the development of the hotel business itself. But the review data must be processed using the right algorithm. So this study was conducted to find out which algorithms are more feasible to use to get the highest accuracy. The methods used are Naïve Bayes (NB), Support Vector Machine (SVM), and k-Nearest Neighbor (k-NN). From the process that has been done, the results of Naïve Bayes accuracy are 71.50% with the AUC value is 0.500, Support Vector Machine is 72.50% with the AUC value is 0.936 and the accuracy results if using the k-Nearest Neighbor algorithm is 75.00% with the AUC value is 0.500. The use of the k-Nearest Neighbor algorithm can help in making more appropriate decisions for hotel reviews at this time.

Download Full-text

Sentimen Analisis Stay Home menggunakan metode klasifikasi Naive Bayes, Support Vector Machine, dan k-Nearest Neighbor

Paradigma - Jurnal Komputer dan Informatika ◽

10.31294/p.v22i2.8237 ◽

2020 ◽

Vol 22 (2) ◽

pp. 169-174

Author(s):

Ikhwanul Hakim ◽

Arifin Nugroho ◽

Sulaeman Hadi Sukmana ◽

Windu Gata

Keyword(s):

Support Vector Machine ◽

Nearest Neighbor ◽

Naive Bayes ◽

Naïve Bayes ◽

Support Vector ◽

K Nearest Neighbor

Download Full-text

KOMPARASI METODE KLASIFIKASI PADA ANALISIS SENTIMEN USAHA WARALABA BERDASARKAN DATA TWITTER

Jurnal Pilar Nusa Mandiri ◽

10.33480/pilar.v15i2.752 ◽

2019 ◽

Vol 15 (2) ◽

pp. 267-274

Author(s):

Tati Mardiana ◽

Hafiz Syahreva ◽

Tuslaela Tuslaela

Keyword(s):

Neural Network ◽

Support Vector Machine ◽

Decision Tree ◽

Nearest Neighbor ◽

Naive Bayes ◽

Confusion Matrix ◽

Naïve Bayes ◽

Support Vector ◽

K Nearest Neighbor

Saat ini usaha waralaba di Indonesia memiliki daya tarik yang relatif tinggi. Namun, para pelaku usaha banyak juga yang mengalami kegagalan. Bagi seseorang yang ingin memulai usaha perlu mempertimbangkan sentimen masyarakat terhadap usaha waralaba. Meskipun demikian, tidak mudah untuk melakukan analisis sentimen karena banyaknya jumlah percakapan di Twitter terkait usaha waralaba dan tidak terstruktur. Tujuan penelitian ini adalah melakukan komparasi akurasi metode Neural Network, K-Nearest Neighbor, Naïve Bayes, Support Vector Machine, dan Decision Tree dalam mengekstraksi atribut pada dokumen atau teks yang berisi komentar untuk mengetahui ekspresi didalamnya dan mengklasifikasikan menjadi komentar positif dan negatif. Penelitian ini menggunakan data realtime dari tweets pada Twitter. Selanjutnya mengolah data tersebut dengan terlebih dulu membersihkannya dari noise dengan menggunakan Phyton. Hasil pengujian dengan confusion matrix diperoleh nilai akurasi Neural Network sebesar 83%, K-Nearest Neighbor sebesar 52%, Support Vector Machine sebesar 83%, dan Decision Tree sebesar 81%. Penelitian ini menunjukkan metode Support Vector Machine dan Neural Network paling baik untuk mengklasifikasikan komentar positif dan negatif terkait usaha waralaba.

Download Full-text

DIAGNOSA HAMA DAN PENYAKIT PADA TANAMAN PADI MENGGUNAKAN METODE NAIVE BAYES DAN K-NEAREST NEIGHBOR

Jurnal Komputer dan Informatika ◽

10.35508/jicon.v8i2.2906 ◽

2020 ◽

Vol 8 (2) ◽

pp. 156-162

Author(s):

Restanti M Bianome ◽

Yelly Y Nabuasa ◽

Derwin R Sina

Keyword(s):

Test Data ◽

Nearest Neighbor ◽

Naive Bayes ◽

Naïve Bayes ◽

Case Based Reasoning ◽

K Nearest Neighbor ◽

Rice Plants ◽

Average Value ◽

Bayes Algorithm ◽

Degree Of Similarity

This study builds systems Case Based Reasoning (CBR) to diagnose pests and diseases in rice plants using Naïve Bayes algorithm and K-Nearest Neighbor. CBR is one method of solving the problem with new cases of decision making based on the solution of previous cases by calculating the degree of similarity (similarity), The case consists of 13 species and 10 types of disease pests of rice plants. The degree of similarity can be determined by indexing and nonindexing. Indexing is the process of grouping the cases by classes that have been determined, while nonindexing a process without grouping cases. Based on cross validation testing using average values obtained accuracy of 92.88% to 153 test data on testing using the indexing and the average value of 89.63% accuracy of the test data in the test 153 using nonindexing.

Download Full-text