An empirical study on software defect prediction with a simplified metric set

Class Imbalance Issue in Software Defect Prediction Models by various Machine Learning Techniques: An Empirical Study

10.1109/icscc51209.2021.9528170 ◽

2021 ◽

Author(s):

Sushant Kumar Pandey ◽

Anil Kumar Tripathi

Keyword(s):

Machine Learning ◽

Empirical Study ◽

Prediction Models ◽

Class Imbalance ◽

Machine Learning Techniques ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect ◽

Learning Techniques ◽

Defect Prediction Models

Software defect prediction using K‐PCA and various kernel‐based extreme learning machine: an empirical study

IET Software ◽

10.1049/iet-sen.2020.0119 ◽

2020 ◽

Vol 14 (7) ◽

pp. 768-782

Author(s):

Sushant Kumar Pandey ◽

Deevashwer Rathee ◽

Anil Kumar Tripathi

Keyword(s):

Empirical Study ◽

Extreme Learning Machine ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect ◽

Learning Machine

An Empirical Study of Model-Agnostic Interpretation Technique for Just-in-Time Software Defect Prediction

Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering - Collaborative Computing: Networking, Applications and Worksharing ◽

10.1007/978-3-030-92635-9_25 ◽

2021 ◽

pp. 420-438

Author(s):

Xingguang Yang ◽

Huiqun Yu ◽

Guisheng Fan ◽

Zijie Huang ◽

Kang Yang ◽

...

Keyword(s):

Empirical Study ◽

Defect Prediction ◽

Just In Time ◽

Software Defect Prediction ◽

Software Defect

An empirical study on pareto based multi-objective feature selection for software defect prediction

Journal of Systems and Software ◽

10.1016/j.jss.2019.03.012 ◽

2019 ◽

Vol 152 ◽

pp. 215-238 ◽

Cited By ~ 9

Author(s):

Chao Ni ◽

Xiang Chen ◽

Fangfang Wu ◽

Yuxiang Shen ◽

Qing Gu

Keyword(s):

Feature Selection ◽

Empirical Study ◽

Defect Prediction ◽

Software Defect Prediction ◽

Multi Objective ◽

Software Defect ◽

Selection For

Software Defect Prediction Using Propositionalization Based Data Preprocessing: An Empirical Study

2018 2nd International Conference on Data Science and Business Analytics (ICDSBA) ◽

10.1109/icdsba.2018.00021 ◽

2018 ◽

Author(s):

CholMyong Pak ◽

Tiantian Wang ◽

Xiaohong Su

Keyword(s):

Empirical Study ◽

Data Preprocessing ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect

An empirical study to investigate oversampling methods for improving software defect prediction using imbalanced data

Neurocomputing ◽

10.1016/j.neucom.2018.04.090 ◽

2019 ◽

Vol 343 ◽

pp. 120-140 ◽

Cited By ~ 11

Author(s):

Ruchika Malhotra ◽

Shine Kamal

Keyword(s):

Empirical Study ◽

Imbalanced Data ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect

An Empirical Study on Software Defect Prediction Using CodeBERT Model

Applied Sciences ◽

10.3390/app11114793 ◽

2021 ◽

Vol 11 (11) ◽

pp. 4793

Author(s):

Cong Pan ◽

Minyan Lu ◽

Biao Xu

Keyword(s):

Deep Learning ◽

Software Engineering ◽

Empirical Study ◽

Empirical Studies ◽

Language Model ◽

Prediction Performance ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect ◽

Cross Project

Deep learning-based software defect prediction has become popular in recent years. Recently, the release of the CodeBERT model has made it possible to perform many software engineering tasks. We propose several CodeBERT models targeting software defect prediction: CodeBERT-NT, CodeBERT-PS, CodeBERT-PK, and CodeBERT-PT. We perform empirical studies with these models on cross-version and cross-project software defect prediction to investigate whether a neural language model such as CodeBERT can improve prediction performance. We also investigate the effects of different prediction patterns in software defect prediction using CodeBERT models. The empirical results are further discussed.

A Ranking-Oriented Approach to Cross-Project Software Defect Prediction: An Empirical Study

Proceedings of the 27th International Conference on Software Engineering and Knowledge Engineering ◽

10.18293/seke2016-047 ◽

2016 ◽

Cited By ~ 2

Author(s):

Guoan You ◽

Yutao Ma

Keyword(s):

Empirical Study ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect ◽

Oriented Approach ◽

Cross Project

Micro-interaction Metrics Based Software Defect Prediction with Machine Learning, Immune Inspired and Evolutionary Classifiers: An Empirical Study

Proceedings of First International Conference on Information and Communication Technology for Intelligent Systems: Volume 1 - Smart Innovation, Systems and Technologies ◽

10.1007/978-3-319-30933-0_24 ◽

2016 ◽

pp. 221-233

Author(s):

Arvinder Kaur ◽

Kamaldeep Kaur

Keyword(s):

Machine Learning ◽

Empirical Study ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect

Impact of Feature Selection Methods on the Predictive Performance of Software Defect Prediction Models: An Extensive Empirical Study

Symmetry ◽

10.3390/sym12071147 ◽

2020 ◽

Vol 12 (7) ◽

pp. 1147 ◽

Cited By ~ 2

Author(s):

Abdullateef O. Balogun ◽

Shuib Basri ◽

Saipunidzam Mahamad ◽

Said J. Abdulkadir ◽

Malek A. Almomani ◽

...

Keyword(s):

Feature Selection ◽

Empirical Study ◽

Prediction Models ◽

Empirical Studies ◽

Experimental Results ◽

Defect Prediction ◽

Software Defect Prediction ◽

Search Methods ◽

Software Defect ◽

The Impact

Feature selection (FS) is a feasible solution for mitigating the high-dimensionality problem, and many FS methods have been proposed in the context of software defect prediction (SDP). However, empirical studies on the impact and effectiveness of FS methods on SDP models often lead to contradictory experimental results and inconsistent findings. These contradictions can be attributed to study limitations such as small datasets, limited FS search methods, and unsuitable prediction models within the scope of the respective studies. It is hence critical to conduct an extensive empirical study to address these contradictions, to guide researchers, and to buttress the scientific tenacity of experimental conclusions. In this study, we investigated the impact of 46 FS methods using Naïve Bayes and Decision Tree classifiers over 25 software defect datasets from 4 software repositories (NASA, PROMISE, ReLink, and AEEEM). The resulting prediction models were evaluated based on accuracy and AUC values. Scott–KnottESD and the novel Double Scott–KnottESD rank statistical methods were used for statistical ranking of the studied FS methods. The experimental results showed that there is no single best FS method, as performance depends on the choice of classifier, performance evaluation metric, and dataset. However, we recommend the use of statistical-based, probability-based, and classifier-based filter feature ranking (FFR) methods, respectively, in SDP. For filter subset selection (FSS) methods, correlation-based feature selection (CFS) with metaheuristic search methods is recommended. For wrapper feature selection (WFS) methods, the IWSS-based WFS method is recommended, as it outperforms the conventional SFS and LHS-based WFS methods.
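
To make the idea of a statistical-based filter feature ranking (FFR) method concrete, the following is a minimal sketch only and not one of the 46 methods evaluated in the study. It assumes a simple score, the absolute Pearson correlation between each metric column and the 0/1 defect label, and uses synthetic data; the function names (`abs_correlation`, `rank_features`, `make_toy_data`) and the three toy features are hypothetical and chosen for illustration.

```python
import math
import random


def abs_correlation(xs, ys):
    """Absolute Pearson correlation between a feature column and 0/1 labels."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no ranking information
    return abs(cov) / math.sqrt(var_x * var_y)


def rank_features(rows, labels):
    """Score every feature, then return indices from highest to lowest score."""
    n_features = len(rows[0])
    scores = [
        abs_correlation([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    ranking = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranking, scores


def make_toy_data(n=200, seed=0):
    """Synthetic modules (assumption): column 1 is strongly tied to the label,
    column 2 weakly, column 0 is pure noise. Roughly 30% are defective."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        defective = 1 if rng.random() < 0.3 else 0
        informative = rng.gauss(3.0 if defective else 0.0, 1.0)
        weak = rng.gauss(0.5 if defective else 0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        rows.append([noise, informative, weak])
        labels.append(defective)
    return rows, labels


rows, labels = make_toy_data()
ranking, scores = rank_features(rows, labels)
top_k = ranking[:2]  # keep the two highest-ranked metrics
print("ranking:", ranking)
print("scores:", [round(s, 3) for s in scores])
```

A filter method of this kind scores features independently of any classifier, which is what separates it from the wrapper methods mentioned in the abstract; the reduced feature set in `top_k` would then be passed to a classifier such as Naïve Bayes or a decision tree and evaluated with accuracy or AUC.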