A Comparative Study of Bug Classification Algorithms

Author(s):  
Naresh Kumar Nagwani ◽  
Shrish Verma

The performance of ten classic algorithms for classifying software bugs is compared across different bug repositories. The algorithms included in the study are Naïve Bayes, Naïve Bayes Multinomial, Discriminative Multinomial Naïve Bayes (DMNB), J48, Support Vector Machine, Radial Basis Function (RBF) Neural Network, Classification using Clustering, Classification using Regression, Adaptive Boosting (AdaBoost) and Bagging. These algorithms are applied to four open source bug repositories, namely Android, JBoss-Seam, Mozilla and MySql. The classification is evaluated using 10-fold cross validation. Accuracy and F-measure are compared for all of the algorithms. The concept of a software bug taxonomy hierarchy is also introduced, with eleven standard bug categories (classes). The comparative study also covers the effect of the number of categories on classifier performance in terms of accuracy and F-measure. The results are presented in tabular and graphical form.
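
As a rough illustration of the evaluation protocol named in this abstract, the sketch below splits sample indices into ten folds and computes accuracy and F-measure from predicted labels. The function names are hypothetical, and the macro (unweighted) averaging of per-class F1 is an assumption; the abstract does not say how F-measure was averaged across classes.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f_measure(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (averaging scheme is an assumption)."""
    scores = []
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

In a full experiment, each classifier would be trained on `train_idx` and scored on `test_idx` for every fold, with the two metrics averaged over the ten folds.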

2016 ◽  
Vol 1 (1) ◽  
pp. 13 ◽  
Author(s):  
Debby Erce Sondakh

This study aims to measure and compare the performance of five machine-learning-based text classification algorithms, namely decision rules, decision tree, k-nearest neighbor (k-NN), naïve Bayes, and Support Vector Machine (SVM), using multi-class text documents. The comparison concerns the algorithms' effectiveness, that is, their ability to classify documents into the correct category, using the holdout (percentage split) method. The effectiveness measures used are precision, recall, F-measure, and accuracy. The experimental results show that, for naïve Bayes, the larger the percentage of training documents, the higher the accuracy of the resulting model. Naïve Bayes reaches its highest accuracy at a 90/10 split, SVM at 80/20, and decision tree at 70/30. The results also show that naïve Bayes has the highest effectiveness among the five algorithms tested and the fastest classification model build time, at 0.02 seconds. The decision tree classifies text documents with higher accuracy than SVM, but its model build time is slower. In terms of model build time, k-NN is the fastest, but its accuracy is lower.
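
The holdout (percentage split) method mentioned in this abstract can be sketched as follows. This is a minimal illustration only: `percentage_split`, the shuffling step, and the fixed seed are assumptions, not details taken from the study.

```python
import random


def percentage_split(documents, train_fraction, seed=0):
    """Shuffle the documents and split them into training and test parts."""
    shuffled = list(documents)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# The three split ratios named in the abstract: 90/10, 80/20, and 70/30.
for fraction in (0.9, 0.8, 0.7):
    train, test = percentage_split(range(100), fraction)
    print(fraction, len(train), len(test))
```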


2016 ◽  
Vol 2016 ◽  
pp. 1-12 ◽  
Author(s):  
Abdulaziz Y. Barnawi ◽  
Ismail M. Keshta

Maximizing the lifetime of wireless sensor networks (WSNs) is a primary objective in the design of these networks. Intelligent energy management models can help designers achieve this objective. These models aim to reduce the number of sensors selected to report environmental measurements and, hence, achieve higher energy efficiency while maintaining the desired level of accuracy in the reported measurements. In this paper, we present a comparative study of three intelligent models based on Naive Bayes, Multilayer Perceptron (MLP), and Support Vector Machine (SVM) classifiers. Simulation results show that Linear-SVM selects sensors that produce higher energy efficiency than those selected by MLP and Naive Bayes for the same WSN Lifetime Extension Factor.


2020 ◽  
Vol 5 ◽  
pp. 19-24
Author(s):  
Dyah Retno Utari ◽  
Arief Wibowo

Motor vehicle insurance is a line of business that covers losses or the risk of damage arising from the various events that can befall a vehicle. Competition in the insurance business, particularly for motor vehicles, demands innovation and strategy to keep the business sustainable. One step a company can take is to predict the continuation (renewal) status of vehicle insurance policies by analyzing customer profile and transaction data. Predicting policyholders' decisions is very important for the company, because it can inform marketing strategies that influence customers' decisions to renew their policies. This study proposes a model for predicting the continuation status of vehicle insurance policies using majority voting over the classification results of data mining algorithms such as Naive Bayes, Support Vector Machine, and Decision Tree. Testing with a confusion matrix shows a best accuracy of 93.57%, with precision reaching 97.20%, recall 95.20%, and F-measure 95.30%. The best evaluation scores were obtained with the majority voting approach, which outperformed prediction models based on a single classifier.
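
Majority voting over several classifiers' outputs can be sketched as below. The label names and prediction lists are hypothetical illustrations, not data from the study, and the study's actual implementation is not described in the abstract.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority vote.

    predictions: one list of predicted labels per classifier, all the same length.
    """
    combined = []
    for votes in zip(*predictions):
        # most_common(1) returns the label with the highest vote count.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical outputs of three classifiers for four policies.
naive_bayes = ["renew", "lapse", "renew", "lapse"]
svm = ["renew", "renew", "lapse", "lapse"]
decision_tree = ["lapse", "renew", "renew", "lapse"]

print(majority_vote([naive_bayes, svm, decision_tree]))
```

With three voters and two classes a tie cannot occur; other configurations would need an explicit tie-breaking rule.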


2021 ◽  
Author(s):  
Anshika Arora ◽  
Pinaki Chakraborty ◽  
M.P.S. Bhatia

Excessive smartphone use throughout the day, together with dependence on the device for social interaction, entertainment, and information retrieval, may lead users to develop nomophobia, which makes them feel anxious when their smartphone is unavailable. This study describes the usefulness of real-time smartphone usage data for predicting nomophobia severity using machine learning. Data were collected from 141 undergraduate students, capturing their perception of their smartphone use through the Nomophobia Questionnaire (NMP-Q) and their real-time smartphone usage patterns through a purpose-built Android application. Supervised machine learning models, including Random Forest, Decision Tree, Support Vector Machines, Naïve Bayes, and K-Nearest Neighbor, are trained using two feature sets: the first comprises only the NMP-Q features, and the second comprises real-time smartphone usage features along with the NMP-Q features. The performance of these models is evaluated using F-measure and area under the ROC curve, and all the models perform better when provided with smartphone usage features along with the NMP-Q features. Naïve Bayes outperforms the other models in predicting nomophobia, achieving an F-measure of 0.891 and an ROC area of 0.933.


2020 ◽  
Vol 17 (11) ◽  
pp. 5109-5112
Author(s):  
Raj Kumar ◽  
Sanjay Singla

During software development, almost 30–35 percent of the cost is due to testing. If a bug travels from one phase to succeeding phases without detection, it will increase the cost of software development, and software quality may be compromised as a result. The use of data mining algorithms for software bug classification is therefore highly desirable. Bug severity may be categorised into S1, S2, S3, S4 and S5, depending on the impact of the bug. In this paper, multiclass classification of bug severity is performed using SVM, KNN, Decision Tree and Naïve Bayes. A comparative analysis of these algorithms is carried out with respect to accuracy, precision, recall and execution time.


2019 ◽  
Vol 13 (1) ◽  
pp. 11
Author(s):  
Yoga Pristyanto

In data mining, researchers often overlook the balance of the class distribution in a dataset. This can cause fairly serious difficulties for classification algorithms, because in theory most classifiers assume a relatively balanced distribution, so the performance of a classification algorithm falls short of its potential. Therefore, this study applies an ensemble method with the addition of adaptive boosting to address the problem. The tests conducted in this study show that the ensemble method with adaptive boosting can improve the performance of classification algorithms. Naive Bayes with Adaptive Boosting achieves an accuracy of 91.98%, sensitivity of 91.98%, specificity of 96.49%, and g-mean of 94.21%. Support Vector Machine with Adaptive Boosting achieves an accuracy of 91.52%, sensitivity of 91.52%, specificity of 96.29%, and g-mean of 93.88%. Decision Tree with Adaptive Boosting achieves an accuracy of 94.37%, sensitivity of 94.37%, specificity of 97.73%, and g-mean of 96.03%. This indicates that the ensemble method with Adaptive Boosting can be a solution for improving algorithm performance on imbalanced datasets. Keywords: adaptive boosting, data mining, ensemble, class imbalance, classification.


2019 ◽  
Vol 15 (2) ◽  
pp. 275-280
Author(s):  
Agus Setiyono ◽  
Hilman F Pardede

It is now common for a cellphone to receive spam messages. The large number of received messages makes it difficult for a human to classify them as spam or not spam. One way to overcome this problem is to use data mining for automatic classification. In this paper, we investigate three data mining techniques, namely Support Vector Machine, Multinomial Naïve Bayes, and Decision Tree, for automatic spam detection. Our experimental results show that Support Vector Machine is the best of the three evaluated algorithms: it achieves 98.33% accuracy, while Multinomial Naïve Bayes achieves 98.13% and Decision Tree 97.10%.

