Impute, Select, Decision Tree and Naïve Bayes (ISE-DNC): An Ensemble Learning Approach to Classify the Lung Cancer

2020 ◽  
Author(s):  
Bhanumathi S ◽  
Dr. Chandrashekara S N
2019 ◽  
Vol 15 (2) ◽  
pp. 275-280
Author(s):  
Agus Setiyono ◽  
Hilman F Pardede

It is now common for a cellphone to receive spam messages. Great number of received messages making it difficult for human to classify those messages to Spam or no Spam.  One way to overcome this problem is to use Data Mining for automatic classifications. In this paper, we investigate various data mining techniques, named Support Vector Machine, Multinomial Naïve Bayes and Decision Tree for automatic spam detection. Our experimental results show that Support Vector Machine algorithm is the best algorithm over three evaluated algorithms. Support Vector Machine achieves 98.33%, while Multinomial Naïve Bayes achieves 98.13% and Decision Tree is at 97.10 % accuracy.


2019 ◽  
Vol 64 (2) ◽  
pp. 53-71
Author(s):  
Botond Benedek ◽  
Ede László

Abstract Customer segmentation represents a true challenge in the automobile insurance industry, as datasets are large, multidimensional, unbalanced and it also requires a unique price determination based on the risk profile of the customer. Furthermore, the price determination of an insurance policy or the validity of the compensation claim, in most cases must be an instant decision. Therefore, the purpose of this research is to identify an easily usable data mining tool that is capable to identify key automobile insurance fraud indicators, facilitating the segmentation. In addition, the methods used by the tool, should be based primarily on numerical and categorical variables, as there is no well-functioning text mining tool for Central Eastern European languages. Hence, we decided on the SQL Server Analysis Services (SSAS) tool and to compare the performance of the decision tree, neural network and Naïve Bayes methods. The results suggest that decision tree and neural network are more suitable than Naïve Bayes, however the best conclusion can be drawn if we use the decision tree and neural network together.


Author(s):  
Kholoud Maswadi ◽  
Norjihan Abdul Ghani ◽  
Suraya Hamid ◽  
Muhammads Babar Rasheed

2021 ◽  
Vol 30 (1) ◽  
pp. 774-792
Author(s):  
Mazin Abed Mohammed ◽  
Dheyaa Ahmed Ibrahim ◽  
Akbal Omran Salman

Abstract Spam electronic mails (emails) refer to harmful and unwanted commercial emails sent to corporate bodies or individuals to cause harm. Even though such mails are often used for advertising services and products, they sometimes contain links to malware or phishing hosting websites through which private information can be stolen. This study shows how the adaptive intelligent learning approach, based on the visual anti-spam model for multi-natural language, can be used to detect abnormal situations effectively. The application of this approach is for spam filtering. With adaptive intelligent learning, high performance is achieved alongside a low false detection rate. There are three main phases through which the approach functions intelligently to ascertain if an email is legitimate based on the knowledge that has been gathered previously during the course of training. The proposed approach includes two models to identify the phishing emails. The first model has proposed to identify the type of the language. New trainable model based on Naive Bayes classifier has also been proposed. The proposed model is trained on three types of languages (Arabic, English and Chinese) and the trained model has used to identify the language type and use the label for the next model. The second model has been built by using two classes (phishing and normal email for each language) as a training data. The second trained model (Naive Bayes classifier) has been applied to identify the phishing emails as a final decision for the proposed approach. The proposed strategy is implemented using the Java environments and JADE agent platform. The testing of the performance of the AIA learning model involved the use of a dataset that is made up of 2,000 emails, and the results proved the efficiency of the model in accurately detecting and filtering a wide range of spam emails. The results of our study suggest that the Naive Bayes classifier performed ideally when tested on a database that has the biggest estimate (having a general accuracy of 98.4%, false positive rate of 0.08%, and false negative rate of 2.90%). This indicates that our Naive Bayes classifier algorithm will work viably on the off chance, connected to a real-world database, which is more common but not the largest.


2014 ◽  
Vol 2014 ◽  
pp. 1-16 ◽  
Author(s):  
Qingchao Liu ◽  
Jian Lu ◽  
Shuyan Chen ◽  
Kangjia Zhao

This study presents the applicability of the Naïve Bayes classifier ensemble for traffic incident detection. The standard Naive Bayes (NB) has been applied to traffic incident detection and has achieved good results. However, the detection result of the practically implemented NB depends on the choice of the optimal threshold, which is determined mathematically by using Bayesian concepts in the incident-detection process. To avoid the burden of choosing the optimal threshold and tuning the parameters and, furthermore, to improve the limited classification performance of the NB and to enhance the detection performance, we propose an NB classifier ensemble for incident detection. In addition, we also propose to combine the Naïve Bayes and decision tree (NBTree) to detect incidents. In this paper, we discuss extensive experiments that were performed to evaluate the performances of three algorithms: standard NB, NB ensemble, and NBTree. The experimental results indicate that the performances of five rules of the NB classifier ensemble are significantly better than those of standard NB and slightly better than those of NBTree in terms of some indicators. More importantly, the performances of the NB classifier ensemble are very stable.


2019 ◽  
Vol 3 (3) ◽  
pp. 103
Author(s):  
Ni Wayan Wardani ◽  
Ni Kadek Ariasih

Pelanggan adalah salah satu aset utama bagi perusahaan ritel. Perusahaan harus dapat mengenali bagaimana karakter pelanggan mereka sehingga mereka dapat mempertahankan pelanggan yang sudah ada agar tidak berhenti membeli dan pindah ke perusahaan ritel yang bersaing (churn). Salah satu model yang tepat untuk mengenali karakter pelanggan adalah model RFM (Recency, Frekuensi, Moneter). Model RFM mampu menghasilkan kelas pelanggan dan di setiap kelas pelanggan dapat dianalisis atau diprediksi dengan konsep data mining apakah pelanggan tetap sebagai pelanggan atau churn. Data yang digunakan berasal dari data pelanggan dan data penjualan di UD. Mawar Sari. Kelas pelanggan UD Mawar Sari yang dihasilkan dari model RFM adalah Dormant, Everyday, Golden dan Superstar. Konsep data mining dengan membangun model prediksi dalam penelitian ini menggunakan algoritma Decision Tree C4.5 dan Naïve Bayes. Di semua kelas pelanggan kinerja Algoritma Naïve Bayes lebih baik daripada Algoritma Decision Tree C4.5 dengan Recall 95,92%, Precision 84,15%, dan Accuracy 83,49% dan kelas pelanggan yang memiliki potensi churn tinggi adalah Dormant B, Dormant E, dan Dormant F.Kata Kunci: Prediksi Churn, RFM, C4.5, Naïve Bayes


2018 ◽  
Vol 7 (1.7) ◽  
pp. 137 ◽  
Author(s):  
Danda Shashank Reddy ◽  
Chinta Naga Harshitha ◽  
Carmel Mary Belinda

Now a day’s many advanced techniques are proposed in diagnosing the tumor in brain like magnetic resonance imaging, computer tomography scan, angiogram, spinal tap and biospy. Based on diagnosis it is easy to predict treatment. All of the types of brain tumor are officially reclassified by the World Health Organization. Brain tumors are of 120 types, almost each tumor is having same symptoms and it is difficult to predict treatment. For this regard we are proposing more accurate and efficient algorithm in predicting the type of brain tumor is Naïve Bayes’ classification and decision tree algorithm. The main focus is on solving tumor classification problem using these algorithms. Here the main goal is to show that the prediction through the decision tree algorithm is simple and easy than the Naïve Bayes’ algorithm.


2020 ◽  
Vol 16 (2) ◽  
pp. 75
Author(s):  
Didit Widiyanto

Akurasi sebuah klasifikasi citra ditentukan oleh pengklasifikasi.  Meskipun RoI (Region of Interest) tidak menentukan secara langsung akurasi, namun RoI menentukan lingkup klasifikasi citra.   Terdapat tiga algoritma yang dapat digunakan sebagai algoritma RoI yaitu; Balanced Histogram Thresholding (BHT), algoritma Otsu, dan algoritma klasterisasi K-Means.  Paper ini meninjau algoritma Otsu dan algoritma klasterisasi K-Means yang digunakan oleh lima peneliti.  Dari ke lima peneliti; tiga peneliti menerapkan algoritma Otsu dan dua peneliti menerapkan algoritma K-Means sebagai algoritma RoI. Setelah operasi RoI, ke lima peneliti menerapkan algoritma GLCM (Gray Level Co-occurance Matrix) sebagai pengekstraksi ciri tekstur.  Hasil ekstraksi ciri diklasifikasi dengan menggunakan berbagai pengklasifikasi antara lain SVM (Support Vector Machine), Naive Bayes, dan Decision Tree. Akhirnya dengan membandingkan hasil dari ke lima peneliti, akurasi tertinggi diperoleh sebesar 100% dengan pengklasifikasi SVM menggunakan algoritma Otsu sebagai algoritma RoI, dan akurasi terendah adalah sebesar52% yang menggunakan algoritma Otsu pada kanal S dari citra HSV (Hue, Saturation Value).


2020 ◽  
Vol 17 (1) ◽  
pp. 37-42
Author(s):  
Yuris Alkhalifi ◽  
Ainun Zumarniansyah ◽  
Rian Ardianto ◽  
Nila Hardi ◽  
Annisa Elfina Augustia

Non-Cash Food Assistance or Bantuan Pangan Non-Tunai (BPNT) is food assistance from the government given to the Beneficiary Family (KPM) every month through an electronic account mechanism that is used only to buy food at the Electronic Shop Mutual Assistance Joint Business Group Hope Family Program (e-Warong KUBE PKH ) or food traders working with Bank Himbara. In its distribution, BPNT still has problems that occur that are experienced by the village apparatus especially the apparatus of Desa Wanasari on making decisions, which ones are worthy of receiving (poor) and not worthy of receiving (not poor). So one way that helps in making decisions can be done through the concept of data mining. In this study, a comparison of 2 algorithms will be carried out namely Naive Bayes Classifier and Decision Tree C.45. The total sample used is as much as 200 head of household data which will then be divided into 2 parts into validation techniques is 90% training data and 10% test data of the total sample used then the proposed model is made in the RapidMiner application and then evaluated using the Confusion Matrix table to find out the highest level of accuracy from 2 of these methods. The results in this classification indicate that the level of accuracy in the Naive Bayes Classifier method is 98.89% and the accuracy level in the Decision Tree C.45 method is 95.00%. Then the conclusion that in this study the algorithm with the highest level of accuracy is the Naive Bayes Classifier algorithm method with a difference in the accuracy rate of 3.89%.


Sign in / Sign up

Export Citation Format

Share Document