Ensemble Machine Learning Model for Software Defect Prediction

Software defect prediction is an important activity in software firms. It supports the production of quality software through reliable defect prediction, defect elimination, and identification of defect-prone modules. Researchers have proposed many defect prediction approaches, but these conventional approaches tend to suffer from low classification accuracy and are time-consuming and laborious. To address these shortcomings, this paper proposes a novel multi-model ensemble machine learning model for software defect prediction. The ensemble combines k-Nearest Neighbour (kNN), a Generalized Linear Model with Elastic Net Regularization (GLMNet), and Linear Discriminant Analysis (LDA), with Random Forest as base learner. An ensemble can reduce inconsistency between training and test datasets and eliminate bias in the training and testing phases of the model, thereby overcoming the downsides of existing techniques. Experiments were conducted on the CM1, JM1, KC3, and PC3 datasets from the NASA PROMISE repository using RStudio. The ensemble achieved a prediction accuracy of 87.69% on CM1, 81.11% on JM1, 90.70% on PC3, and 94.74% on KC3, for an average of 88.56% across the four datasets. Compared in terms of AUC with seven state-of-the-art techniques from the literature, the ensemble achieved 87%, the best of those considered. The results show that the ensemble model effectively predicts defects in the PROMISE datasets, which are notorious for their noisy features and high dimensionality, and suggest that ensemble machine learning is a promising direction for software defect prediction.
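
The kind of ensemble described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, which was built in RStudio. The abstract's wording about Random Forest is ambiguous, so the sketch assumes a stacking arrangement in which Random Forest combines the predictions of kNN, an elastic-net logistic regression (standing in for GLMNet), and LDA. The synthetic data, the feature scaling, and all hyperparameters are illustrative assumptions, not values from the paper.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for a defect dataset (assumption: not NASA data).
X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           weights=[0.85, 0.15], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

# First-level models named in the abstract; scaling is an added assumption.
base_learners = [
    ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
    ("enet", make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty="elasticnet", solver="saga",
                           l1_ratio=0.5, max_iter=5000))),
    ("lda", LinearDiscriminantAnalysis()),
]

# Assumption: Random Forest acts as the combiner over cross-validated predictions.
ensemble = StackingClassifier(
    estimators=base_learners,
    final_estimator=RandomForestClassifier(n_estimators=200, random_state=0),
    cv=5,
)
ensemble.fit(X_train, y_train)

accuracy = accuracy_score(y_test, ensemble.predict(X_test))
auc = roc_auc_score(y_test, ensemble.predict_proba(X_test)[:, 1])
print(f"accuracy={accuracy:.3f}  AUC={auc:.3f}")
```

Accuracy and AUC are both reported because the abstract uses both; on imbalanced defect data the two can diverge noticeably.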

Optimal Machine learning Model for Software Defect Prediction

International Journal of Intelligent Systems and Applications ◽ 10.5815/ijisa.2019.02.05 ◽ 2019 ◽ Vol 11 (2) ◽ pp. 36-48

Author(s): Tripti Lamba ◽ Kavita ◽ A.K. Mishra

Keyword(s): Machine Learning ◽ Learning Model ◽ Defect Prediction ◽ Software Defect Prediction ◽ Software Defect ◽ Machine Learning Model ◽ Optimal Machine

A Machine Learning Model Comparison and Selection Framework for Software Defect Prediction Using VIKOR

10.1007/978-3-030-85540-6_113 ◽ 2021 ◽ pp. 890-898

Author(s): Miguel Ángel Quiroz Martinez ◽ Byron Alcívar Martínez Tayupanda ◽ Sulay Stephanie Camatón Paguay ◽ Luis Andy Briones Peñafiel

Keyword(s): Machine Learning ◽ Model Comparison ◽ Learning Model ◽ Defect Prediction ◽ Software Defect Prediction ◽ Software Defect ◽ Machine Learning Model ◽ Selection Framework

An Improved Approach to Software Defect Prediction using a Hybrid Machine Learning Model

2018 20th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing (SYNASC) ◽ 10.1109/synasc.2018.00074 ◽ 2018

Author(s): Diana-Lucia Miholca

Keyword(s): Machine Learning ◽ Learning Model ◽ Defect Prediction ◽ Software Defect Prediction ◽ Software Defect ◽ Machine Learning Model ◽ Hybrid Machine

Improvement in Software Defect Prediction Outcome Using Principal Component Analysis and Ensemble Machine Learning Algorithms

International Conference on Intelligent Data Communication Technologies and Internet of Things (ICICI) 2018 - Lecture Notes on Data Engineering and Communications Technologies ◽ 10.1007/978-3-030-03146-6_44 ◽ 2018 ◽ pp. 397-406 ◽ Cited By ~ 2

Author(s): N. Dhamayanthi ◽ B. Lavanya

Keyword(s): Machine Learning ◽ Principal Component Analysis ◽ Learning Algorithms ◽ Principal Component ◽ Machine Learning Algorithms ◽ Defect Prediction ◽ Software Defect Prediction ◽ Prediction Outcome ◽ Software Defect ◽ Ensemble Machine Learning

An Efficient Machine Learning Model for Prediction of Acute Myocardial Infarction

Recent Advances in Computer Science and Communications ◽ 10.2174/2666255813666200325104317 ◽ 2020 ◽ Vol 13

Author(s): Dhilsath Fathima.M ◽ S. Justin Samuel ◽ R. Hari Haran

Keyword(s): Machine Learning ◽ Myocardial Infarction ◽ Acute Myocardial Infarction ◽ Logistic Regression ◽ Decision Tree ◽ Learning Model ◽ Training Dataset ◽ Data Set ◽ Machine Learning Model ◽ Proposed Model

Aim: This work develops an improved and robust machine learning model for predicting Myocardial Infarction (MI), which could have substantial clinical impact. Objectives: This paper explains how to build a machine learning based computer-aided analysis system for early and accurate prediction of MI, using the Framingham Heart Study dataset for validation and evaluation. The proposed system is intended to support medical professionals in predicting myocardial infarction. Methods: The proposed model uses mean imputation to fill in missing values in the dataset, then applies principal component analysis (PCA) to extract features and improve classifier performance. After PCA, the reduced data are partitioned into training and test sets: 70% of the data is used to train four popular classifiers (support vector machine, k-nearest neighbor, logistic regression, and decision tree), and the remaining 30% is used to evaluate the models using the confusion matrix, accuracy, precision, sensitivity, F1-score, and the AUC-ROC curve. Results: Logistic regression achieved higher accuracy than the k-NN, SVM, and decision tree classifiers, and PCA performed well as a feature extraction method for the proposed model. Logistic regression also showed a good mean accuracy and standard deviation of accuracy compared with the other three algorithms. The AUC-ROC curves (Figures 4 and 5) show that logistic regression has a good AUC-ROC score of around 70% compared with the k-NN and decision tree algorithms.
Conclusion: From the result analysis, we infer that the proposed machine learning model can act as a decision-making system for predicting acute myocardial infarction at an earlier stage than existing machine learning based prediction models. It can predict the presence of acute myocardial infarction in humans from heart disease risk factors, helping to decide when to start lifestyle modification and medical treatment to prevent heart disease.
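
The Methods steps above (mean imputation, PCA, a 70/30 split, and four classifiers) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the authors' code: the data are synthetic instead of the Framingham Heart Study dataset, and the number of principal components, the added feature scaling, and all hyperparameters are hypothetical choices. The sketch also splits the data first and fits the imputer and PCA on the training portion only, a small departure from the order given in the abstract that keeps test-set statistics out of the fitted transforms.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data with about 5% of values knocked out (assumption).
X, y = make_classification(n_samples=500, n_features=15, n_informative=6,
                           random_state=1)
rng = np.random.default_rng(1)
X[rng.random(X.shape) < 0.05] = np.nan

# 70% of the data for training, 30% for testing, as in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1)

# The four classifiers named in the abstract, with illustrative settings.
classifiers = {
    "svm": SVC(),
    "knn": KNeighborsClassifier(n_neighbors=5),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=1),
}

scores = {}
for name, clf in classifiers.items():
    # Mean imputation -> scaling (added assumption) -> PCA -> classifier.
    model = make_pipeline(SimpleImputer(strategy="mean"), StandardScaler(),
                          PCA(n_components=8), clf)
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))

for name, score in scores.items():
    print(f"{name}: accuracy={score:.3f}")
```

Wrapping the preprocessing and each classifier in one pipeline means every model sees identically transformed features, so differences in the scores reflect the classifiers rather than the preprocessing.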

Class Imbalance Issue in Software Defect Prediction Models by various Machine Learning Techniques: An Empirical Study

10.1109/icscc51209.2021.9528170 ◽ 2021

Author(s): Sushant Kumar Pandey ◽ Anil Kumar Tripathi

Keyword(s): Machine Learning ◽ Empirical Study ◽ Prediction Models ◽ Class Imbalance ◽ Machine Learning Techniques ◽ Defect Prediction ◽ Software Defect Prediction ◽ Software Defect ◽ Learning Techniques ◽ Defect Prediction Models

SDP-ML: An Automated Approach of Software Defect Prediction employing Machine Learning Techniques

10.1109/icecit54077.2021.9641218 ◽ 2021

Author(s): Md Nasir Uddin ◽ Bixin Li ◽ Md Naim Mondol ◽ Md Mostafizur Rahman ◽ Md Suman Mia ◽ ...

Keyword(s): Machine Learning ◽ Machine Learning Techniques ◽ Defect Prediction ◽ Software Defect Prediction ◽ Software Defect ◽ Learning Techniques

A Novel Ensemble Machine Learning Model for Prediction of Zika Virus T-Cell Epitopes

10.1007/978-981-16-6285-0_23 ◽ 2021 ◽ pp. 275-292

Author(s): Syed Nisar Hussain Bukhari ◽ Amit Jain ◽ Ehtishamul Haq

Keyword(s): Machine Learning ◽ T Cell ◽ Zika Virus ◽ Learning Model ◽ T Cell Epitopes ◽ Ensemble Machine Learning ◽ Machine Learning Model

Towards an Ensemble Machine Learning Model of Random Subspace Based Functional Tree Classifier for Snow Avalanche Susceptibility Mapping

IEEE Access ◽ 10.1109/access.2020.3014816 ◽ 2020 ◽ Vol 8 ◽ pp. 145968-145983 ◽ Cited By ~ 3

Author(s): Amirhosein Mosavi ◽ Ataollah Shirzadi ◽ Bahram Choubin ◽ Fereshteh Taromideh ◽ Farzaneh Sajedi Hosseini ◽ ...

Keyword(s): Machine Learning ◽ Learning Model ◽ Susceptibility Mapping ◽ Snow Avalanche ◽ Random Subspace ◽ Ensemble Machine Learning ◽ Machine Learning Model ◽ Tree Classifier

Ensemble Machine Learning Model for Mortality Prediction Inside Intensive Care Unit

Studies in Computational Intelligence - Medical Informatics and Bioimaging Using Artificial Intelligence ◽ 10.1007/978-3-030-91103-4_14 ◽ 2021 ◽ pp. 245-258

Author(s): Nora El-Rashidy ◽ Shaker El-Sappagh ◽ Samir Abdelrazik ◽ Hazem El-Bakry

Keyword(s): Machine Learning ◽ Intensive Care Unit ◽ Intensive Care ◽ Learning Model ◽ Mortality Prediction ◽ Ensemble Machine Learning ◽ Machine Learning Model
