Establishing a software defect prediction model via effective dimension reduction

Just-in-time software defect prediction (JIT-SDP) is a fine-grained software defect prediction technique that aims to identify defective code changes in software systems. Effort-aware software defect prediction additionally takes the cost of code inspection into account, so that more defective code changes can be found with limited testing resources. Traditional effort-aware defect prediction models mainly measure effort by the number of lines of code (LOC) and rarely consider additional factors. This paper proposes a novel effort measurement method called Multi-Metric Joint Calculation (MMJC). When measuring effort, MMJC takes into account not only LOC but also the distribution of modified code across different files (Entropy), the number of developers who changed the files (NDEV), and developer experience (EXP). In the simulation experiments, MMJC is combined with Linear Regression, Decision Tree, Random Forest, LightGBM, Support Vector Machine, and Neural Network, respectively, to build software defect prediction models. Several comparative experiments are conducted between the MMJC-based models and baseline models. The results show that, compared with the baseline models, the indicators ACC and [Formula: see text] of the MMJC-based models improve by 35.3% and 15.9% on average, respectively, across the three verification scenarios.
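The abstract does not give the MMJC formula, so the following is only a minimal sketch of two ingredients it names. `change_entropy` computes the Shannon entropy of modified lines across files, a common reading of the Entropy metric, and `recall_at_effort` computes the share of defective changes found within an inspection budget, which is how ACC is usually defined in effort-aware work (20% of total effort). Both function names, the ranking by a caller-supplied score, and the use of LOC as the effort value are assumptions made for illustration.

```python
import math
from typing import Sequence


def change_entropy(lines_per_file: Sequence[int]) -> float:
    """Shannon entropy (base 2) of how modified lines spread across files."""
    total = sum(lines_per_file)
    if total == 0:
        return 0.0
    entropy = 0.0
    for n in lines_per_file:
        if n > 0:
            p = n / total
            entropy -= p * math.log2(p)
    return entropy


def recall_at_effort(scores: Sequence[float],
                     efforts: Sequence[float],
                     labels: Sequence[int],
                     budget: float = 0.2) -> float:
    """Fraction of defective changes found when changes are inspected in
    descending score order until `budget` of the total effort is spent.

    Assumption: `scores` is whatever the model ranks by (for example a
    predicted defect density) and `efforts` is the inspection cost per change.
    """
    total_effort = sum(efforts)
    total_defects = sum(labels)
    if total_effort <= 0 or total_defects == 0:
        return 0.0
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    spent = 0.0
    found = 0
    for i in order:
        # Stop at the first change that would exceed the effort budget.
        if spent + efforts[i] > budget * total_effort:
            break
        spent += efforts[i]
        found += labels[i]
    return found / total_defects
```

Swapping a different effort measure into `efforts` changes which changes fit inside the budget, which is the kind of effect an alternative effort measure such as MMJC would have on this metric.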


Software Defect Prediction Using Hybrid Distribution Base Balance Instance Selection and Radial Basis Function Classifier

International Journal of System Dynamics Applications ◽

10.4018/ijsda.2019070103 ◽

2019 ◽

Vol 8 (3) ◽

pp. 53-75 ◽

Cited By ~ 1

Author(s):

Mrutyunjaya Panda

Keyword(s):

Prediction Model ◽

Radial Basis Function ◽

Basis Function ◽

Rapid Development ◽

Defect Prediction ◽

Instance Selection ◽

Software Defect Prediction ◽

Software Defect ◽

Radial Basis ◽

Base Balance

Software is an important part of human life, and with the rapid development of software engineering, the demand for reliable software with few defects is increasingly pressing. This article proposes building a software defect prediction model from various software metrics, using publicly available historical software defect datasets collected from several projects. Such a prediction model can enable software engineers to take proactive action to enhance software quality from the early stages of the software development cycle. The article introduces a hybrid classification method (DBBRBF) that combines distribution base balance (DBB) based instance selection with a radial basis function (RBF) neural network classifier to obtain better prediction than existing research. The experimental results, together with post-hoc statistical significance tests, show the effectiveness of the proposed approach.


Software Defect Prediction Model Based on Stacked Denoising Auto-Encoder

Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering - Artificial Intelligence for Communications and Networks ◽

10.1007/978-3-030-22971-9_2 ◽

2019 ◽

pp. 18-27

Author(s):

Yu Zhu ◽

Dongjin Yin ◽

Yingtao Gan ◽

Lanlan Rui ◽

Guoxin Xia

Keyword(s):

Prediction Model ◽

Defect Prediction ◽

Software Defect Prediction ◽

Model Based ◽

Software Defect


Exhaustive and heuristic search approaches for learning a software defect prediction model

Engineering Applications of Artificial Intelligence ◽

10.1016/j.engappai.2009.10.001 ◽

2010 ◽

Vol 23 (1) ◽

pp. 34-40 ◽

Cited By ~ 19

Author(s):

Parag C. Pendharkar

Keyword(s):

Prediction Model ◽

Heuristic Search ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect


Deep Learning Software Defect Prediction Methods for Cloud Environments Research

Scientific Programming ◽

10.1155/2021/2323100 ◽

2021 ◽

Vol 2021 ◽

pp. 1-11

Author(s):

Wenjian Liu ◽

Baoping Wang ◽

Wennan Wang

Keyword(s):

Deep Learning ◽

Prediction Model ◽

Real Time ◽

Defect Prediction ◽

Prediction Methods ◽

Learning Approach ◽

Software Defect Prediction ◽

Imbalance Problem ◽

Software Defect ◽

Ladder Network

This paper provides an in-depth study and analysis of software defect prediction methods in a cloud environment and applies a deep learning approach to defect prediction. A cost penalty term is added to the supervised part of the deep ladder network; that is, the misclassification cost of different classes is added to the model. A cost-sensitive deep ladder network-based software defect prediction model is proposed, which effectively mitigates the negative impact of the class imbalance problem on defect prediction. To address the lack or insufficiency of historical data from the same project, a manifold learning-based geodesic cross-project software defect prediction method is proposed. Drawing on data from other projects, a transfer learning approach is used to embed the source and target datasets into a Gaussian manifold. The kernel encapsulates the incremental changes between the differences and commonalities of the two domains. The resulting subspace approximates the two distributions formed by the source and target data transformations, and traditional within-project software defect classifiers are then used to predict labels. Real-time defect prediction is found to be more practical because it has a smaller amount of code to review: only individual changes need to be reviewed rather than entire files or packages, which also makes it easier for developers to assign fixes to defects. More importantly, this paper combines deep belief network (DBN) techniques with fine-grained real-time defect prediction and uses TCA techniques to deal with data imbalance, proposing an improved deep belief network approach for real-time defect prediction. The machine learning classifier underlying the DBN is varied across experimental studies. The results not only validate the effectiveness of using TCA techniques to solve the data imbalance problem but also show that the defect prediction model learned by the improved method has better prediction performance.
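The idea of adding a misclassification cost per class to the supervised loss can be sketched as a weighted binary cross-entropy. This is an illustration of the general cost-sensitive technique, not the paper's ladder-network loss; the function name and the cost values `cost_fn` and `cost_fp` are hypothetical.

```python
import math
from typing import Sequence


def cost_sensitive_log_loss(probs: Sequence[float],
                            labels: Sequence[int],
                            cost_fn: float = 5.0,
                            cost_fp: float = 1.0) -> float:
    """Mean binary cross-entropy with a separate penalty per class.

    `cost_fn` scales the loss on defective samples (label 1), so missing a
    defect costs more; `cost_fp` scales the loss on clean samples (label 0).
    Both default values are illustrative assumptions.
    """
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        if y == 1:
            total += -cost_fn * math.log(p)
        else:
            total += -cost_fp * math.log(1.0 - p)
    return total / len(labels)
```

With `cost_fn` larger than `cost_fp`, an equally confident mistake on the minority (defective) class is penalized more heavily, which is one way to counter class imbalance during training.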


Implementation of Chaotic Gaussian Particle Swarm Optimization for Optimize Learning-to-Rank Software Defect Prediction Model Construction

Journal of Physics Conference Series ◽

10.1088/1742-6596/978/1/012079 ◽

2018 ◽

Vol 978 ◽

pp. 012079 ◽

Cited By ~ 1

Author(s):

M A Buchari ◽

S Mardiyanto ◽

B Hendradjaya

Keyword(s):

Particle Swarm Optimization ◽

Prediction Model ◽

Learning To Rank ◽

Particle Swarm ◽

Model Construction ◽

Defect Prediction ◽

Software Defect Prediction ◽

Swarm Optimization ◽

Software Defect


Research and Application of Software Defect Prediction based on BP-Migration learning

MATEC Web of Conferences ◽

10.1051/matecconf/201823203017 ◽

2018 ◽

Vol 232 ◽

pp. 03017

Author(s):

Jie Zhang ◽

Gang Wang ◽

Haobo Jiang ◽

Fangzheng Zhao ◽

Guilin Tian

Keyword(s):

Prediction Model ◽

Historical Data ◽

Defect Prediction ◽

Software Project ◽

Data Sets ◽

Software Defect Prediction ◽

Software Module ◽

Data Set ◽

Software Defect ◽

Project Data

Software defect prediction has been an important part of software engineering research since the 1970s. The technique analyzes the metrics and defect information of historical software modules in order to predict defects in new software modules. Currently, most software defect prediction models are built on data from a single software project: the training data sets used to construct the model and the test data sets used to validate it come from the same project. In practice, however, traditional prediction methods perform poorly for new projects or projects with little historical data. When the historical data is insufficient, the software defect prediction model cannot be fully trained, and it is difficult to achieve high prediction accuracy. In cross-project prediction, the problem to be faced is the difference in data distribution between projects. To address these problems, this paper presents a software defect prediction model based on transfer learning and a traditional software defect prediction model. The model uses existing project data sets to predict software defects across projects. The main work of this article includes: 1) Data preprocessing. This step includes feature correlation analysis, noise reduction, and so on, which effectively limits the interference of over-fitting and noisy data with the prediction results. 2) Transfer learning. This step analyzes two different but related project data sets and reduces the impact of data distribution differences. 3) Artificial neural networks. To address class imbalance in the data set, an artificial neural network with dynamically selected training samples reduces the influence of the imbalance between positive and negative samples on the prediction results. The ReLink and AEEEM data sets are used to evaluate performance in terms of F-measure, the ROC curve, and AUC. Experiments show that the model has high predictive performance.
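The two scalar measures named in the evaluation can be computed as in this minimal sketch. It uses the standard definitions of F-measure (harmonic mean of precision and recall) and AUC (probability that a defective module is scored above a clean one, with ties counted as one half); the function names are illustrative, and nothing here reproduces the paper's experimental setup.

```python
from typing import Sequence


def f_measure(predicted: Sequence[int], actual: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for the defective class (1)."""
    tp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    fp = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 0)
    fn = sum(1 for p, a in zip(predicted, actual) if p == 0 and a == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(scores: Sequence[float], actual: Sequence[int]) -> float:
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for s, a in zip(scores, actual) if a == 1]
    neg = [s for s, a in zip(scores, actual) if a == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))
```

AUC does not depend on a classification threshold, which makes it a common companion to F-measure on imbalanced defect data sets.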


Software defect prediction using semi-supervised learning with dimension reduction

Proceedings of the 27th IEEE/ACM International Conference on Automated Software Engineering - ASE 2012 ◽

10.1145/2351676.2351734 ◽

2012 ◽

Cited By ~ 31

Author(s):

Huihua Lu ◽

Bojan Cukic ◽

Mark Culp

Keyword(s):

Dimension Reduction ◽

Supervised Learning ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect
