An Improved CNN Model for Within-Project Software Defect Prediction

To improve software reliability, software defect prediction is used to find software bugs and prioritize testing efforts. Recently, some researchers introduced deep learning models, such as the deep belief network (DBN) and the state-of-the-art convolutional neural network (CNN), and used features automatically extracted from abstract syntax trees (ASTs) to improve defect prediction performance. However, the research on the CNN model failed to reach clear conclusions because of its limited dataset size, insufficiently repeated experiments, and outdated baseline selection. To solve these problems, we built the PROMISE Source Code (PSC) dataset, which enlarges the original dataset used in the CNN research; we named that original dataset the Simplified PROMISE Source Code (SPSC) dataset. Then, we proposed an improved CNN model for within-project defect prediction (WPDP) and compared our results to existing CNN results and an empirical study. Our experiment was based on a 30-repetition holdout validation and a 10 × 10 cross-validation. Experimental results showed that our improved CNN model was comparable to the existing CNN model and significantly outperformed the state-of-the-art machine learning models for WPDP. Furthermore, we defined hyperparameter instability and examined the threat and opportunity it presents for deep learning models in defect prediction.
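
The 10 × 10 cross-validation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the term means 10 repetitions of 10-fold cross-validation, each with a fresh random partition; the function name `repeated_kfold_indices` and the sample count are hypothetical and not taken from the paper, and the sketch does no class stratification.

```python
import random


def repeated_kfold_indices(n_samples, n_splits=10, n_repeats=10, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for n_repeats x n_splits CV."""
    rng = random.Random(seed)
    for repeat in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # new random partition for every repetition
        # Deal the shuffled samples into folds as evenly as possible.
        folds = [indices[i::n_splits] for i in range(n_splits)]
        for fold, test_idx in enumerate(folds):
            train_idx = [i for j, f in enumerate(folds) if j != fold for i in f]
            yield repeat, fold, train_idx, test_idx


# Hypothetical project with 50 source files.
splits = list(repeated_kfold_indices(n_samples=50))
print(len(splits))  # one train/test split per (repetition, fold) pair
```

Each of the resulting splits would train and evaluate one model, and the reported score would be the average over all splits.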

Software Defect Prediction Using a Hybrid Model Based on Semantic Features Learned from the Source Code

Knowledge Science, Engineering and Management - Lecture Notes in Computer Science ◽ 10.1007/978-3-030-29551-6_23 ◽ 2019 ◽ pp. 262-274

Author(s): Diana-Lucia Miholca ◽ Gabriela Czibula

Keyword(s): Hybrid Model ◽ Source Code ◽ Defect Prediction ◽ Semantic Features ◽ Software Defect Prediction ◽ Model Based ◽ Software Defect

Deep Transfer Learning for Source Code Modeling

International Journal of Software Engineering and Knowledge Engineering ◽ 10.1142/s0218194020500230 ◽ 2020 ◽ Vol 30 (05) ◽ pp. 649-668

Author(s): Yasir Hussain ◽ Zhiqiu Huang ◽ Yu Zhou ◽ Senzhang Wang

Keyword(s): Neural Network ◽ Deep Learning ◽ Transfer Learning

Software defect prediction: A multi-criteria decision-making approach

Nigerian Journal of Technological Research ◽ 10.4314/njtr.v15i1.7 ◽ 2020 ◽ Vol 15 (1) ◽ pp. 35-42

Author(s): A.O. Balogun ◽ A.O. Bajeh ◽ H.A. Mojeed ◽ A.G. Akintola

Keyword(s): Machine Learning ◽ Software Testing ◽ Evaluation Metrics ◽ Defect Prediction ◽ Software Systems ◽ Software Defect Prediction ◽ Learning Models ◽ Decision Method ◽ Software Defect ◽ Machine Learning Models

Failures of software systems discovered through software testing are rampant because modern software systems are large and complex. Software testing, which is an integral part of the software development life cycle (SDLC), consumes both human and capital resources. As such, software defect prediction (SDP) mechanisms are deployed to strengthen the software testing phase of the SDLC by predicting defect-prone modules or components in software systems. Machine learning models have been used to develop SDP models with great success. Moreover, some studies have highlighted that a combination of machine learning models in the form of an ensemble is better than single SDP models in terms of prediction accuracy. However, the measured performance of machine learning models can vary across predictive evaluation metrics. Thus, more studies are needed to establish the effectiveness of ensemble SDP models over single SDP models. This study proposes using Multi-Criteria Decision Method (MCDM) techniques to rank machine learning models. Two MCDM techniques, the Analytic Network Process (ANP) and the Preference Ranking Organization Method for Enrichment Evaluation (PROMETHEE), are applied to 9 machine learning models with 11 performance evaluation metrics and 11 software defect datasets. The experimental results showed that ensemble SDP models are the most appropriate SDP models, as Boosted SMO and Boosted PART ranked highest for each of the MCDM techniques. The experimental results also validated the position that accuracy should not be the only performance evaluation metric for SDP models. In conclusion, performance metrics other than predictive accuracy should be considered when ranking and evaluating machine learning models.

Keywords: Ensemble; Multi-Criteria Decision Method; Software Defect Prediction
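
To illustrate how a multi-criteria ranking can differ from a ranking by accuracy alone, the following is a minimal sketch of PROMETHEE II net-flow ranking. It assumes the simplest ("usual") preference function, equal weights, and criteria where higher is better; the abstract does not state which preference functions or weights the study used. The model names and scores are hypothetical and are not results from the paper.

```python
def promethee_ii(scores, weights):
    """Rank alternatives by PROMETHEE II net outranking flow.

    scores:  dict mapping alternative name -> list of criterion values
             (higher is better for every criterion in this sketch)
    weights: list of criterion weights that sum to 1
    """
    names = list(scores)
    n = len(names)

    def preference(a, b):
        # "Usual" preference function: a criterion counts fully for a
        # as soon as a is strictly better than b on it.
        return sum(w for w, fa, fb in zip(weights, scores[a], scores[b]) if fa > fb)

    net_flow = {}
    for a in names:
        positive = sum(preference(a, b) for b in names if b != a) / (n - 1)
        negative = sum(preference(b, a) for b in names if b != a) / (n - 1)
        net_flow[a] = positive - negative
    ranking = sorted(names, key=lambda name: net_flow[name], reverse=True)
    return ranking, net_flow


# Hypothetical scores per model: [accuracy, AUC, F-measure].
scores = {
    "model_A": [0.80, 0.70, 0.60],
    "model_B": [0.78, 0.85, 0.72],
    "model_C": [0.75, 0.65, 0.55],
}
weights = [1 / 3, 1 / 3, 1 / 3]

ranking, net_flow = promethee_ii(scores, weights)
print(ranking)
```

In this made-up example, `model_A` has the highest accuracy, yet `model_B` obtains the highest net flow because it wins on the other two criteria.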

Deep Learning Software Defect Prediction Methods for Cloud Environments Research

Scientific Programming ◽ 10.1155/2021/2323100 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-11

Author(s): Wenjian Liu ◽ Baoping Wang ◽ Wennan Wang

Keyword(s): Deep Learning ◽ Prediction Model ◽ Real Time ◽ Defect Prediction ◽ Prediction Methods ◽ Learning Approach ◽ Software Defect Prediction ◽ Imbalance Problem ◽ Software Defect ◽ Ladder Network

This paper provides an in-depth study and analysis of software defect prediction methods in a cloud environment and applies a deep learning approach to software defect prediction. A cost penalty term is added to the supervised part of the deep ladder network; that is, the misclassification cost of each class is added to the model. The resulting cost-sensitive deep ladder network-based software defect prediction model effectively mitigates the negative impact of the class imbalance problem on defect prediction. To address the lack or insufficiency of historical data from the same project, a manifold learning-based geodesic cross-project software defect prediction method is proposed. Drawing on data from other projects, a transfer learning approach is used to embed the source and target datasets into a Gaussian manifold. The kernel captures the incremental changes between the two domains, covering both their differences and their commonalities. The resulting subspace approximates the two distributions formed by the source and target data transformations, and traditional within-project software defect classifiers are then used to predict labels. Real-time defect prediction is found to be more practical because there is less code to review: only individual changes need to be reviewed rather than entire files or packages, which also makes it easier for developers to assign fixes to defects.
More importantly, this paper combines deep belief network (DBN) techniques with fine-grained real-time defect prediction and uses TCA techniques to deal with data imbalance, proposing an improved deep belief network approach for real-time defect prediction. The machine learning classifier underlying the DBN is varied across experimental studies. The results validate the effectiveness of using TCA techniques to solve the data imbalance problem and show that the defect prediction model learned by the improved method has better prediction performance.
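
The idea of adding per-class misclassification costs to a supervised loss can be sketched as a cost-weighted binary cross-entropy. This is only an illustration of the general technique, not the paper's ladder-network loss; the function name `cost_sensitive_bce` and the cost values are assumptions chosen for the example.

```python
import math


def cost_sensitive_bce(y_true, y_prob, cost_fn=5.0, cost_fp=1.0, eps=1e-12):
    """Mean binary cross-entropy with a separate cost per error type.

    cost_fn scales the loss on defective samples (label 1), so missed defects
    are penalized more; cost_fp scales the loss on clean samples (label 0).
    """
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        if y == 1:
            total += -cost_fn * math.log(p)
        else:
            total += -cost_fp * math.log(1.0 - p)
    return total / len(y_true)


# Hypothetical labels (1 = defective) and predicted defect probabilities.
y_true = [1, 0, 0, 0]
y_prob = [0.3, 0.2, 0.1, 0.4]

weighted = cost_sensitive_bce(y_true, y_prob)
unweighted = cost_sensitive_bce(y_true, y_prob, cost_fn=1.0, cost_fp=1.0)
print(weighted > unweighted)
```

With equal costs the function reduces to the ordinary binary cross-entropy; raising `cost_fn` makes errors on the minority (defective) class contribute more to the loss.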

Deep learning based software defect prediction

Neurocomputing ◽ 10.1016/j.neucom.2019.11.067 ◽ 2020 ◽ Vol 385 ◽ pp. 100-110 ◽ Cited By ~ 2

Author(s): Lei Qiao ◽ Xuesong Li ◽ Qasim Umer ◽ Ping Guo

Keyword(s): Deep Learning ◽ Defect Prediction ◽ Software Defect Prediction ◽ Software Defect

The state of the art of deep learning models in medical science and their challenges

Multimedia Systems ◽ 10.1007/s00530-020-00694-1 ◽ 2020

Author(s): Chandradeep Bhatt ◽ Indrajeet Kumar ◽ V. Vijayakumar ◽ Kamred Udham Singh ◽ Abhishek Kumar

Keyword(s): Deep Learning ◽ State Of The Art ◽ Medical Science ◽ The State ◽ Learning Models

Research on software defect prediction technology based on deep learning

2021 2nd International Conference on Computing and Data Science (CDS) ◽ 10.1109/cds52072.2021.00024 ◽ 2021

Author(s): Pengcheng Jiang

Keyword(s): Deep Learning ◽ Defect Prediction ◽ Software Defect Prediction ◽ Software Defect

Defects in The Next Release; Software Defect Prediction Based on Source Code Versions

Electrical Engineering (ICEE), Iranian Conference on ◽ 10.1109/icee.2018.8472535 ◽ 2018

Author(s): Molouk Mishmast Nehi ◽ Zahra Fakhrpoor ◽ Mohammad R. Moosavi

Keyword(s): Source Code ◽ Defect Prediction ◽ Software Defect Prediction ◽ Software Defect

An Empirical Study on Software Defect Prediction Using CodeBERT Model

Applied Sciences ◽ 10.3390/app11114793 ◽ 2021 ◽ Vol 11 (11) ◽ pp. 4793

Author(s): Cong Pan ◽ Minyan Lu ◽ Biao Xu

Keyword(s): Deep Learning ◽ Software Engineering ◽ Empirical Study ◽ Empirical Studies ◽ Language Model ◽ Prediction Performance ◽ Defect Prediction ◽ Software Defect Prediction ◽ Software Defect ◽ Cross Project

Deep learning-based software defect prediction has become popular. The recent release of the CodeBERT model has made it possible to perform many software engineering tasks. We propose several CodeBERT models targeting software defect prediction: CodeBERT-NT, CodeBERT-PS, CodeBERT-PK, and CodeBERT-PT. We perform empirical studies using these models in cross-version and cross-project software defect prediction to investigate whether a neural language model like CodeBERT can improve prediction performance. We also investigate the effects of different prediction patterns in software defect prediction using CodeBERT models. The empirical results are further discussed.

Deep Learning for Software Defect Prediction in time

2018 Fifth International Conference on Parallel, Distributed and Grid Computing (PDGC) ◽ 10.1109/pdgc.2018.8745804 ◽ 2018

Author(s): Monika Yadav ◽ Vijendra Singh ◽ Priyanka Rastogi

Keyword(s): Deep Learning ◽ Defect Prediction ◽ Software Defect Prediction ◽ Software Defect
