Instance Selection dengan Naïve Bayes pada Klasifikasi Kanker Serviks (Instance Selection with Naïve Bayes for Cervical Cancer Classification)

2021 ◽  
Vol 5 (2) ◽  
pp. 83-91
Author(s):  
Fari Katul Fikriah

Several diseases are deadly for women, one of which is cervical cancer. The disease grows and develops very slowly, so treatment that begins when the cancer is detected early facilitates healing; conversely, a cancer that is not detected early becomes dangerous and deadly because it is relatively difficult to cure. Biopsy is one way to detect the presence of cancer. In a previous study, classification of cervical cancer reached a highest accuracy of 97.515% using the decision tree method with several feature selection techniques. For this reason, this research uses the decision tree (C4.5), logistic function, and ZeroR classification methods, preceded by instance selection with Naïve Bayes, which reduces the elimination of missing values, with the aim of achieving higher accuracy than previous studies. C4.5 gives the best result among the classification methods in this study, with an accuracy of 99.69%.

2020 ◽  
Vol 4 (4) ◽  
pp. 635-641
Author(s):  
Nurul Chamidah ◽  
Mayanda Mega Santoni ◽  
Nurhafifah Matondang

Oversampling is a technique that balances the number of records per class by generating additional records for the class with few records, so that its size matches that of the class with many records. In this study, oversampling is applied to a hypertension dataset in which the hypertensive class has fewer records than the non-hypertensive class. The study aims to evaluate the effect of oversampling on the classification of this dataset, which consists of hypertensive and non-hypertensive classes, using Naïve Bayes, Decision Tree, and Artificial Neural Network (ANN), and to find the best model among the three algorithms. The data are processed by imputing missing values, oversampling, and transforming the data into the same range; Naïve Bayes, Decision Tree, and ANN are then used to build classification models. With 80% of the data used as training data to build the models and 20% as validation data to test them, classification performance in terms of accuracy, precision, and recall improved on the oversampled data compared with the data without oversampling. The best performance in this study was obtained with ANN: accuracy 0.91, precision 0.86, and recall 0.99.
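The balancing step and the 80/20 split described above can be illustrated with a minimal sketch. It assumes plain random oversampling (duplicating minority records); the abstract does not say which oversampling method the authors used, nor whether oversampling was applied before or after the split. The function names `random_oversample` and `train_validation_split` and the toy records are hypothetical.

```python
import random
from collections import Counter


def random_oversample(records, labels, seed=0):
    """Duplicate randomly chosen records of the smaller classes until
    every class has as many records as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for rec, lab in zip(records, labels):
        by_class.setdefault(lab, []).append(rec)
    target = max(len(recs) for recs in by_class.values())
    out_records, out_labels = [], []
    for lab, recs in by_class.items():
        # Draw (with replacement) the number of records the class is missing.
        extra = [rng.choice(recs) for _ in range(target - len(recs))]
        for rec in recs + extra:
            out_records.append(rec)
            out_labels.append(lab)
    return out_records, out_labels


def train_validation_split(records, labels, train_fraction=0.8, seed=0):
    """Shuffle the indices and cut them into training and validation parts."""
    rng = random.Random(seed)
    idx = list(range(len(records)))
    rng.shuffle(idx)
    cut = int(round(train_fraction * len(idx)))
    train, val = idx[:cut], idx[cut:]
    return ([records[i] for i in train], [labels[i] for i in train],
            [records[i] for i in val], [labels[i] for i in val])


# Hypothetical toy data: 8 non-hypertensive records, 2 hypertensive records.
records = [[i] for i in range(10)]
labels = ["non-hypertensive"] * 8 + ["hypertensive"] * 2

balanced_records, balanced_labels = random_oversample(records, labels)
print(Counter(balanced_labels))  # both classes now have 8 records
```

In this sketch the two helpers are independent, so either order can be tried. Oversampling only the training portion keeps duplicated records out of the validation data, which is one common choice rather than something the abstract states.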


2021 ◽  
Vol 5 (4) ◽  
pp. 646
Author(s):  
Rani Puspita ◽  
Agus Widodo

BPJS is very helpful because one of its goals is to provide good health services for its members. However, when many people use the service, it attracts more pros and cons. The researchers therefore perform sentiment analysis, within the field of data mining, of BPJS users on the social media platform Twitter, using 1000 records that are filtered down to 903 because some records are duplicates. The researchers used the KNN, Decision Tree, and Naïve Bayes methods and compared the accuracy of the three. The tool used was RapidMiner version 9.7.2. The results show that sentiment analysis of Twitter data on BPJS services with the KNN method reached an accuracy of 95.58%, with class precision of 45.00% for pred. negative, 0.00% for pred. positive, and 96.83% for pred. neutral. The Decision Tree method reached an accuracy of 96.13%, with class precision of 55.00% for pred. negative, 0.00% for pred. positive, and 97.28% for pred. neutral. Finally, the Naïve Bayes method achieved 89.14% accuracy, with class precision of 16.67% for pred. negative, 1.64% for pred. positive, and 98.40% for pred. neutral.
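The gap between the high overall accuracy and the low precision of some predicted classes can be illustrated with a short sketch. It shows how accuracy and per-class precision are computed from actual and predicted labels. The labels below are hypothetical toy data, not the study's tweets, and the function names are illustrative only.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Fraction of records whose predicted label equals the actual label."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


def class_precision(actual, predicted):
    """For each predicted class, the fraction of those predictions that are
    correct. Classes that are never predicted are omitted because the ratio
    is undefined for them."""
    predicted_counts = Counter(predicted)
    correct_counts = Counter(p for a, p in zip(actual, predicted) if a == p)
    return {c: correct_counts[c] / n for c, n in predicted_counts.items()}


# Hypothetical toy labels dominated by the neutral class.
actual = ["neutral"] * 8 + ["negative", "positive"]
predicted = ["neutral"] * 7 + ["negative"] + ["negative", "neutral"]

print(accuracy(actual, predicted))         # 0.8
print(class_precision(actual, predicted))  # neutral 0.875, negative 0.5
```

When one class dominates the data, a model can score high accuracy mainly by predicting that class, which is why the per-class figures are worth reading alongside the accuracy.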


2019 ◽  
Vol 15 (2) ◽  
pp. 275-280
Author(s):  
Agus Setiyono ◽  
Hilman F Pardede

It is now common for a cellphone to receive spam messages. The large number of received messages makes it difficult for humans to classify them as spam or not spam. One way to overcome this problem is to use data mining for automatic classification. In this paper, we investigate several data mining techniques, namely Support Vector Machine, Multinomial Naïve Bayes, and Decision Tree, for automatic spam detection. Our experimental results show that Support Vector Machine is the best of the three evaluated algorithms: it achieves 98.33% accuracy, while Multinomial Naïve Bayes achieves 98.13% and Decision Tree 97.10%.
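As a rough illustration of one of the compared techniques, the following is a minimal Multinomial Naïve Bayes sketch with Laplace smoothing. Whitespace tokenization, the smoothing constant `alpha`, the function names, and the four toy messages are all assumptions made for illustration; the abstract does not describe the authors' preprocessing or data.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(messages, labels, alpha=1.0):
    """Estimate log priors and smoothed per-class word log-likelihoods."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, lab in zip(messages, labels):
        words = text.lower().split()
        word_counts[lab].update(words)
        vocab.update(words)
    model = {}
    for lab in class_counts:
        total = sum(word_counts[lab].values())
        denom = total + alpha * len(vocab)  # Laplace-smoothed denominator
        model[lab] = {
            "log_prior": math.log(class_counts[lab] / len(labels)),
            "log_likelihood": {
                w: math.log((word_counts[lab][w] + alpha) / denom)
                for w in vocab
            },
        }
    return model


def predict(model, text):
    """Return the class with the highest log posterior score."""
    words = text.lower().split()
    scores = {}
    for lab, params in model.items():
        score = params["log_prior"]
        for w in words:
            # Words never seen in training are ignored.
            if w in params["log_likelihood"]:
                score += params["log_likelihood"][w]
        scores[lab] = score
    return max(scores, key=scores.get)


# Hypothetical toy messages.
messages = ["win free prize now", "free cash win",
            "see you at lunch", "call me after lunch"]
labels = ["spam", "spam", "not spam", "not spam"]

model = train_multinomial_nb(messages, labels)
print(predict(model, "free prize"))  # spam
print(predict(model, "lunch"))       # not spam
```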


2019 ◽  
Vol 64 (2) ◽  
pp. 53-71
Author(s):  
Botond Benedek ◽  
Ede László

Customer segmentation represents a true challenge in the automobile insurance industry, as datasets are large, multidimensional, and unbalanced, and segmentation also requires a unique price determination based on the risk profile of the customer. Furthermore, the price determination of an insurance policy or the validity of a compensation claim must, in most cases, be an instant decision. Therefore, the purpose of this research is to identify an easily usable data mining tool that is capable of identifying key automobile insurance fraud indicators, facilitating the segmentation. In addition, the methods used by the tool should be based primarily on numerical and categorical variables, as there is no well-functioning text mining tool for Central Eastern European languages. Hence, we chose the SQL Server Analysis Services (SSAS) tool and compared the performance of the decision tree, neural network, and Naïve Bayes methods. The results suggest that decision tree and neural network are more suitable than Naïve Bayes; however, the best conclusions can be drawn when the decision tree and neural network are used together.


Author(s):  
Kholoud Maswadi ◽  
Norjihan Abdul Ghani ◽  
Suraya Hamid ◽  
Muhammads Babar Rasheed

2014 ◽  
Vol 2014 ◽  
pp. 1-16 ◽  
Author(s):  
Qingchao Liu ◽  
Jian Lu ◽  
Shuyan Chen ◽  
Kangjia Zhao

This study presents the applicability of the Naïve Bayes classifier ensemble for traffic incident detection. The standard Naive Bayes (NB) has been applied to traffic incident detection and has achieved good results. However, the detection result of the practically implemented NB depends on the choice of the optimal threshold, which is determined mathematically by using Bayesian concepts in the incident-detection process. To avoid the burden of choosing the optimal threshold and tuning the parameters and, furthermore, to improve the limited classification performance of the NB and to enhance the detection performance, we propose an NB classifier ensemble for incident detection. In addition, we also propose to combine the Naïve Bayes and decision tree (NBTree) to detect incidents. In this paper, we discuss extensive experiments that were performed to evaluate the performances of three algorithms: standard NB, NB ensemble, and NBTree. The experimental results indicate that the performances of five rules of the NB classifier ensemble are significantly better than those of standard NB and slightly better than those of NBTree in terms of some indicators. More importantly, the performances of the NB classifier ensemble are very stable.
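To make the idea of ensemble "rules" concrete, here is a hedged sketch of how the posterior outputs of several base classifiers can be combined. The abstract does not name its five rules; the product, sum, max, min, and majority-vote rules used here are classic combination rules chosen as an assumption, and the posterior values are hypothetical.

```python
import math
from collections import Counter


def combine(posteriors, rule):
    """Combine base-classifier outputs into one decision.

    posteriors: list of dicts, one per base classifier, each mapping
    class -> P(class | x).
    rule: "product", "sum", "max", "min", or "majority".
    """
    if rule == "majority":
        # Each classifier votes for its own most probable class.
        votes = Counter(max(p, key=p.get) for p in posteriors)
        return votes.most_common(1)[0][0]
    reducers = {"product": math.prod, "sum": sum, "max": max, "min": min}
    reducer = reducers[rule]
    classes = list(posteriors[0])
    scores = {c: reducer([p[c] for p in posteriors]) for c in classes}
    return max(scores, key=scores.get)


# Hypothetical posteriors from three base Naive Bayes classifiers.
posteriors = [
    {"incident": 0.90, "normal": 0.10},
    {"incident": 0.40, "normal": 0.60},
    {"incident": 0.45, "normal": 0.55},
]

for rule in ("product", "sum", "max", "min", "majority"):
    print(rule, combine(posteriors, rule))
```

With these toy values the majority vote returns "normal" while the other four rules return "incident", which shows that the choice of rule can change the decision. Each rule selects the highest-scoring class directly, so no separate decision threshold has to be tuned in this sketch.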


2019 ◽  
Vol 3 (3) ◽  
pp. 103
Author(s):  
Ni Wayan Wardani ◽  
Ni Kadek Ariasih

Customers are one of the main assets of a retail company. A company must be able to recognize the character of its customers so that it can retain existing customers and keep them from ceasing to buy and moving to a competing retail company (churn). One suitable model for recognizing customer character is the RFM (Recency, Frequency, Monetary) model. The RFM model produces customer classes, and within each customer class data mining can be used to analyze or predict whether a customer will remain a customer or churn. The data used come from customer data and sales data at UD. Mawar Sari. The customer classes of UD. Mawar Sari produced by the RFM model are Dormant, Everyday, Golden, and Superstar. The prediction models in this study are built with the Decision Tree C4.5 and Naïve Bayes algorithms. In all customer classes, the Naïve Bayes algorithm performs better than the Decision Tree C4.5 algorithm, with recall of 95.92%, precision of 84.15%, and accuracy of 83.49%, and the customer classes with high churn potential are Dormant B, Dormant E, and Dormant F. Keywords: Churn Prediction, RFM, C4.5, Naïve Bayes
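The three RFM quantities can be derived from a sales log as in the sketch below. It only computes raw recency, frequency, and monetary values per customer; how the study maps these to the Dormant, Everyday, Golden, and Superstar classes is not stated in the abstract and is not attempted here. The function name `rfm`, the transaction layout, and the toy transactions are hypothetical.

```python
from datetime import date


def rfm(transactions, today):
    """Compute recency (days since last purchase), frequency (number of
    purchases), and monetary value (total spent) per customer.

    transactions: list of (customer_id, purchase_date, amount) tuples.
    """
    summary = {}
    for customer, day, amount in transactions:
        rec = summary.setdefault(
            customer, {"last": day, "frequency": 0, "monetary": 0.0})
        rec["last"] = max(rec["last"], day)  # keep the most recent purchase
        rec["frequency"] += 1
        rec["monetary"] += amount
    return {
        customer: {
            "recency_days": (today - rec["last"]).days,
            "frequency": rec["frequency"],
            "monetary": rec["monetary"],
        }
        for customer, rec in summary.items()
    }


# Hypothetical toy sales log.
transactions = [
    ("A", date(2019, 1, 5), 100.0),
    ("A", date(2019, 3, 1), 50.0),
    ("B", date(2018, 6, 10), 20.0),
]

scores = rfm(transactions, today=date(2019, 3, 31))
print(scores["A"])  # recency 30 days, frequency 2, monetary 150.0
```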


2018 ◽  
Vol 7 (1.7) ◽  
pp. 137 ◽  
Author(s):  
Danda Shashank Reddy ◽  
Chinta Naga Harshitha ◽  
Carmel Mary Belinda

Nowadays many advanced techniques are proposed for diagnosing brain tumors, such as magnetic resonance imaging, computed tomography scan, angiogram, spinal tap, and biopsy. Based on the diagnosis, it is easy to predict treatment. All types of brain tumor have been officially reclassified by the World Health Organization. There are 120 types of brain tumor; almost every tumor has the same symptoms, which makes it difficult to predict treatment. For this reason, we propose more accurate and efficient algorithms for predicting the type of brain tumor: Naïve Bayes classification and the decision tree algorithm. The main focus is on solving the tumor classification problem using these algorithms. The main goal is to show that prediction with the decision tree algorithm is simpler and easier than with the Naïve Bayes algorithm.

