Impact of Feature Selection Methods on the Predictive Performance of Software Defect Prediction Models: An Extensive Empirical Study

Feature selection (FS) is a feasible solution for mitigating the high dimensionality problem, and many FS methods have been proposed in the context of software defect prediction (SDP). However, empirical studies on the impact and effectiveness of FS methods on SDP models often produce contradictory experimental results and inconsistent findings. These contradictions can be attributed to study limitations such as small datasets, limited FS search methods, and unsuitable prediction models within the scope of the respective studies. It is hence critical to conduct an extensive empirical study that addresses these contradictions, guides researchers, and strengthens the scientific rigor of experimental conclusions. In this study, we investigated the impact of 46 FS methods using Naïve Bayes and Decision Tree classifiers over 25 software defect datasets from 4 software repositories (NASA, PROMISE, ReLink, and AEEEM). The resulting prediction models were evaluated based on accuracy and AUC values. Scott–KnottESD and the novel Double Scott–KnottESD rank statistical methods were used for statistical ranking of the studied FS methods. The experimental results showed that there is no single best FS method, as their respective performances depend on the choice of classifier, performance evaluation metric, and dataset. However, we recommend the use of statistical-based, probability-based, and classifier-based filter feature ranking (FFR) methods, respectively, in SDP. For filter subset selection (FSS) methods, correlation-based feature selection (CFS) with metaheuristic search methods is recommended. For wrapper feature selection (WFS) methods, the IWSS-based WFS method is recommended as it outperforms the conventional SFS and LHS-based WFS methods.
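
To make the idea of filter feature ranking (FFR) concrete, here is a minimal sketch that scores each feature independently of any classifier and keeps the top k. Information gain is used as the scoring function only as an assumed example; the function names, feature names, and toy data are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy reduction in the labels after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels, top_k):
    """Score every feature on its own and return the top_k feature names."""
    scores = {name: information_gain(values, labels) for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_k], scores


# Hypothetical toy data: discretized metrics for six modules (1 = defective).
columns = {
    "high_complexity": [1, 1, 1, 0, 0, 0],
    "many_authors": [1, 0, 1, 0, 1, 0],
    "large_file": [1, 1, 1, 1, 0, 0],
}
labels = [1, 1, 1, 0, 0, 0]

selected, scores = rank_features(columns, labels, top_k=2)
print(selected)
```

Because the ranking never consults a classifier, the same selected subset could then be passed to Naïve Bayes, a Decision Tree, or any other learner; wrapper methods differ precisely in that they evaluate candidate subsets with the classifier itself.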

Class Imbalance Issue in Software Defect Prediction Models by various Machine Learning Techniques: An Empirical Study

10.1109/icscc51209.2021.9528170 ◽

2021 ◽

Author(s):

Sushant Kumar Pandey ◽

Anil Kumar Tripathi

Keyword(s):

Machine Learning ◽

Empirical Study ◽

Prediction Models ◽

Class Imbalance ◽

Machine Learning Techniques ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect ◽

Learning Techniques ◽

Defect Prediction Models

An empirical study on pareto based multi-objective feature selection for software defect prediction

Journal of Systems and Software ◽

10.1016/j.jss.2019.03.012 ◽

2019 ◽

Vol 152 ◽

pp. 215-238 ◽

Cited By ~ 9

Author(s):

Chao Ni ◽

Xiang Chen ◽

Fangfang Wu ◽

Yuxiang Shen ◽

Qing Gu

Keyword(s):

Feature Selection ◽

Empirical Study ◽

Defect Prediction ◽

Software Defect Prediction ◽

Multi Objective ◽

Software Defect ◽

Selection For

Impact of feature selection on classification via clustering techniques in software defect prediction

Journal of Computer Science and Its Application ◽

10.4314/jcsia.v26i1.8 ◽

2020 ◽

Vol 26 (1) ◽

Cited By ~ 1

Author(s):

F.E. Usman-Hamza ◽

A.F. Atte ◽

A.O. Balogun ◽

H.A. Mojeed ◽

A.O. Bajeh ◽

...

Keyword(s):

Feature Selection ◽

Information Gain ◽

Feature Selection Method ◽

Predictive Performance ◽

Defect Prediction ◽

Software Defect Prediction ◽

Selection Methods ◽

Clustering Techniques ◽

Software Defect ◽

The Impact

Software testing using software defect prediction aims to detect as many defects as possible before the software is released, which plays an important role in ensuring quality and reliability. Software defect prediction can be modeled as a classification problem in which software modules are assigned to one of two classes, defective or non-defective, using classification algorithms. This study investigated the impact of feature selection methods on classification via clustering techniques for software defect prediction. Three clustering techniques (Farthest First Clusterer, K-Means, and Make-Density Clusterer) and three feature selection methods (Chi-Square, Clustering Variation, and Information Gain) were applied to software defect datasets from the NASA repository. The best software defect prediction model was Farthest First with the Information Gain feature selection method, with an accuracy of 78.69%, a precision of 0.804, and a recall of 0.788. The experimental results showed that clustering techniques used as classifiers gave good predictive performance and that feature selection methods further enhanced it. This indicates that classification via clustering can give competitive results against standard classification methods, with the advantage that no model has to be trained on a labeled dataset, so it can be used on unlabeled datasets.

Keywords: Classification, Clustering, Feature Selection, Software Defect Prediction

Vol. 26, No 1, June, 2019
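
The classification-via-clustering idea in this abstract can be illustrated with a small sketch: modules are grouped without labels, and each cluster is then treated as a class. The sketch uses plain k-means with fixed initial centroids and maps clusters to classes by majority vote over a few labeled modules; both choices, along with all names and data, are assumptions made for illustration and do not reproduce the study's setup.

```python
import math


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def k_means(points, centroids, iterations=10):
    """Plain Lloyd's algorithm starting from the given initial centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its cluster (keep it if empty).
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


def map_clusters_to_classes(points, labels, centroids):
    """Give each cluster the majority label of the labeled points inside it."""
    votes = [{} for _ in centroids]
    for p, y in zip(points, labels):
        v = votes[nearest(p, centroids)]
        v[y] = v.get(y, 0) + 1
    return [max(v, key=v.get) if v else None for v in votes]


def predict(point, centroids, cluster_class):
    """Classify a module by the class attached to its nearest cluster."""
    return cluster_class[nearest(point, centroids)]


# Hypothetical normalized metrics (size, complexity) for six modules.
points = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (0.8, 0.9), (0.9, 0.8), (0.85, 0.95)]
labels = [0, 0, 0, 1, 1, 1]  # 1 = defective; used only for the mapping step

centroids = k_means(points, [points[0], points[3]])
cluster_class = map_clusters_to_classes(points, labels, centroids)
print(predict((0.7, 0.75), centroids, cluster_class))
```

The clustering step itself never sees the labels, which is what makes the approach attractive when labeled defect data is scarce; labels enter only when naming the clusters.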

The Use of Ensemble-Based Data Preprocessing Techniques for Software Defect Prediction

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194014400105 ◽

2014 ◽

Vol 24 (09) ◽

pp. 1229-1253 ◽

Cited By ~ 3

Author(s):

Kehan Gao ◽

Taghi M. Khoshgoftaar ◽

Amri Napolitano

Keyword(s):

Feature Selection ◽

Prediction Models ◽

Measurement Data ◽

Class Imbalance ◽

Data Preprocessing ◽

High Dimensionality ◽

Training Dataset ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect

Software defect prediction models that use software metrics such as code-level measurements and defect data to build classification models are useful tools for identifying potentially problematic program modules. The effectiveness of detecting such modules is affected by the software measurements used, making data preprocessing an important step during software quality prediction. Generally, there are two problems affecting software measurement data: high dimensionality (where a training dataset has an extremely large number of independent attributes, or features) and class imbalance (where a training dataset has one class with relatively many more members than the other class). In this paper, we present a novel form of ensemble learning based on boosting that incorporates data sampling to alleviate class imbalance and feature (software metric) selection to address high dimensionality. As we adopt two different sampling methods (Random Undersampling (RUS) and Synthetic Minority Oversampling (SMOTE)) in the technique, we have two forms of our new ensemble-based approach: selectRUSBoost and selectSMOTEBoost. To evaluate the effectiveness of these new techniques, we apply them to two groups of datasets from two real-world software systems. In the experiments, four learners and nine feature selection techniques are employed to build our models. We also consider versions of the technique which do not incorporate feature selection, and compare all four techniques (the two different ensemble-based approaches which utilize feature selection and the two versions which use sampling only). The experimental results demonstrate that selectRUSBoost is generally more effective in improving defect prediction performance than selectSMOTEBoost, and that the techniques with feature selection yield better predictions than the techniques without feature selection.
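
For readers unfamiliar with Random Undersampling (RUS), the following minimal sketch shows the sampling step in isolation: majority-class rows are dropped at random until the classes are balanced. It is not the selectRUSBoost algorithm, which embeds sampling and feature selection inside boosting; the 1:1 target ratio, the function name, and the toy data are all assumptions made for illustration.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows so every class ends up as small as the rarest one."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    # Assumed target: a 1:1 class ratio, set by the size of the minority class.
    target = min(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        for row in rng.sample(group, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Hypothetical imbalanced data: eight non-defective modules, two defective.
rows = [(i, i * 2) for i in range(10)]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

balanced_rows, balanced_labels = random_undersample(rows, labels)
print(Counter(balanced_labels))
```

In a boosting setting, a resampling step of this kind would typically be repeated in every round so that each weak learner sees a different balanced subset.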

An Empirical Study on Software Defect Prediction Using CodeBERT Model

Applied Sciences ◽

10.3390/app11114793 ◽

2021 ◽

Vol 11 (11) ◽

pp. 4793

Author(s):

Cong Pan ◽

Minyan Lu ◽

Biao Xu

Keyword(s):

Deep Learning ◽

Software Engineering ◽

Empirical Study ◽

Empirical Studies ◽

Language Model ◽

Prediction Performance ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect ◽

Cross Project

Deep learning-based software defect prediction has become popular in recent years. The recent release of the CodeBERT model has made it possible to perform many software engineering tasks. We propose various CodeBERT models targeting software defect prediction, including CodeBERT-NT, CodeBERT-PS, CodeBERT-PK, and CodeBERT-PT. We perform empirical studies using these models in cross-version and cross-project software defect prediction to investigate whether using a neural language model like CodeBERT could improve prediction performance. We also investigate the effects of different prediction patterns in software defect prediction using CodeBERT models. The empirical results are further discussed.

Incremental Feature Selection Method for Software Defect Prediction

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b1252.0782s319 ◽

2019 ◽

Vol 8 (2S3) ◽

pp. 1345-1353 ◽

Cited By ~ 1

Keyword(s):

Feature Selection ◽

Software Metrics ◽

Prediction Models ◽

Search Algorithm ◽

Feature Selection Method ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect ◽

Defect Prediction Models ◽

Selection Of

Software defect prediction models are essential for understanding the quality attributes that software organizations need in order to deliver better software reliability. This paper focuses on the selection of attributes for software quality estimation on incremental databases. A new dimensionality reduction method, Wilk’s Lambda Average Threshold (WLAT), is presented for selecting the optimal features used to classify modules as fault-prone or not. This paper uses software metrics and defect data collected from benchmark datasets. The comparative results confirm that the statistical search algorithm (WLAT) outperforms the other relevant feature selection methods for most classifiers. The main advantage of the proposed WLAT method is that the selected features can be reused when the database size increases or decreases, without the need to extract features afresh. In addition, the performance of the defect prediction models either remains unchanged or improves even after eliminating 85% of the software metrics.

Empirical Analysis of Rank Aggregation-Based Multi-Filter Feature Selection Methods in Software Defect Prediction

Electronics ◽

10.3390/electronics10020179 ◽

2021 ◽

Vol 10 (2) ◽

pp. 179

Author(s):

Abdullateef O. Balogun ◽

Shuib Basri ◽

Saipunidzam Mahamad ◽

Said Jadid Abdulkadir ◽

Luiz Fernando Capretz ◽

...

Keyword(s):

Feature Selection ◽

Prediction Models ◽

Rank Aggregation ◽

Selection Problem ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect ◽

Filter Methods ◽

Rank List ◽

Viable Solution

Selecting the most suitable filter method that will produce a subset of features with the best performance remains an open problem, known as the filter rank selection problem. A viable solution to this problem is to independently apply a mixture of filter methods and evaluate the results. This study proposes novel rank aggregation-based multi-filter feature selection (FS) methods to address high dimensionality and the filter rank selection problem in software defect prediction (SDP). The proposed methods use rank aggregation mechanisms to combine the rank lists generated by individual filter methods into a single aggregated rank list. They aim to resolve the filter selection problem by using multiple filter methods of diverse computational characteristics to produce a disjoint and complete feature rank list superior to those of individual filter rank methods. The effectiveness of the proposed methods was evaluated with Decision Tree (DT) and Naïve Bayes (NB) models on defect datasets from the NASA repository. In the experimental results, the proposed methods had a greater positive impact on the prediction performance of the NB and DT models than the other FS methods tested. This makes the combination of filter rank methods a viable solution to the filter rank selection problem and a means of enhancing prediction models in SDP.
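
The general shape of rank aggregation can be shown with a short sketch that merges several ranked feature lists into one. The abstract does not specify which aggregation mechanisms the study uses, so the mean-rank rule below is an assumed stand-in; the filter names, feature names, and rankings are hypothetical.

```python
def aggregate_ranks(rank_lists):
    """Combine ranked feature lists (best first) by mean rank position."""
    features = rank_lists[0]
    mean_rank = {
        f: sum(ranking.index(f) for ranking in rank_lists) / len(rank_lists)
        for f in features
    }
    # Lower mean rank is better; ties are broken by name for determinism.
    return sorted(features, key=lambda f: (mean_rank[f], f))


# Hypothetical rankings of the same four metrics from three filter methods.
filter_a = ["loc", "cc", "fan_in", "comments"]
filter_b = ["cc", "loc", "comments", "fan_in"]
filter_c = ["loc", "fan_in", "cc", "comments"]

aggregated = aggregate_ranks([filter_a, filter_b, filter_c])
print(aggregated)
```

Every input list must rank the same set of features for this rule to be meaningful; a top-k cut on the aggregated list would then give the final selected subset.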

An Empirical Study on the Impact of Class Overlap in Just-in-Time Software Defect Prediction (S)

10.18293/seke2021-076 ◽

2021 ◽

Author(s):

Minyang Yi

Keyword(s):

Empirical Study ◽

Defect Prediction ◽

Just In Time ◽

Software Defect Prediction ◽

Software Defect ◽

The Impact

The Effect of the Dataset Size on the Accuracy of Software Defect Prediction Models: An Empirical Study

INTELIGENCIA ARTIFICIAL ◽

10.4114/intartif.vol24iss68pp72-88 ◽

2021 ◽

Vol 24 (68) ◽

pp. 72-88

Author(s):

Mohammad Alshayeb ◽

Mashaan A. Alshammari

Keyword(s):

Feature Selection ◽

Prediction Model ◽

Prediction Models ◽

Fault Prediction ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect ◽

Dataset Size ◽

Defect Prediction Models ◽

Selection Algorithms

The ongoing development of computer systems requires massive software projects. Running the components of these huge projects for testing purposes might be a costly process; therefore, parameter estimation can be used instead. Software defect prediction models are crucial for software quality assurance. This study investigates the impact of dataset size and feature selection algorithms on software defect prediction models. We use two approaches to build software defect prediction models: a statistical approach and a machine learning approach with support vector machines (SVMs). The fault prediction model was built based on four datasets of different sizes. Additionally, four feature selection algorithms were used. We found that applying the SVM defect prediction model to datasets with a reduced number of measures as features may enhance the accuracy of the fault prediction model. Also, it directs the test effort to maintain the most influential set of metrics. We also found that the running time of the SVM fault prediction model is not consistent with dataset size. Therefore, having fewer metrics does not guarantee a shorter execution time. From the experiments, we found that dataset size has a direct influence on the SVM fault prediction model. However, reduced datasets performed the same as or slightly worse than the original datasets.
