Local modelling approach for cross-project defect prediction

Intelligent Decision Technologies ◽

10.3233/idt-210130 ◽

2021 ◽

pp. 1-15

Author(s):

Nayeem Ahmad Bhat ◽

Sheikh Umar Farooq

Keyword(s):

Probability Of Detection ◽

Cluster Models ◽

Defect Prediction ◽

Local Cluster ◽

Modeling Approach ◽

Local Modeling ◽

Overall Performance ◽

The Cross ◽

Project Data ◽

Cross Project

Prediction approaches used for cross-project defect prediction (CPDP) are usually impractical because of high false-alarm rates or low detection rates. Instance-based data filter techniques that improve CPDP performance are time-consuming, and the entire filter procedure must be repeated each time a new test set arrives for prediction. We propose a local modeling approach to make use of ever-increasing cross-project data for CPDP. We cluster the cross-project data, train one prediction model per cluster, and predict each target test instance using the model of its corresponding cluster. Using statistical methods, we compared the performance of within-project prediction, cross-project prediction, and our local modeling approach on seven NASA datasets. Compared to within-project prediction, cross-project prediction increased the probability of detection (PD), but also increased the probability of false alarm (PF) and decreased overall performance as measured by Balance. Applying local modeling decreased PF, with an accompanying decrease in PD, and improved overall performance in terms of Balance. Moreover, compared to one state-of-the-art filter technique, the Burak filter, our approach is simple and fast, performs comparably, and opens a new perspective on using ever-increasing cross-project data for defect prediction. Therefore, when insufficient within-project data is available, we recommend training local cluster models rather than a single global model on cross-project datasets.
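The cluster, train-per-cluster, and route-to-cluster steps described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names neither the clustering algorithm nor the learner, so plain k-means and a nearest-class-centroid classifier are assumed stand-ins here, and all function names and the toy data are hypothetical. Balance is computed with the definition commonly used in the defect prediction literature, the normalized distance from the ideal point PF = 0, PD = 1.

```python
import math
import random

def dist2(a, b):
    """Squared Euclidean distance between two metric vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(rows):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def nearest(x, centroids, allowed=None):
    """Index of the centroid closest to x, optionally restricted to `allowed`."""
    idx = range(len(centroids)) if allowed is None else allowed
    return min(idx, key=lambda i: dist2(x, centroids[i]))

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; an assumed stand-in for the clustering step."""
    centroids = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Keep the old centroid if a cluster happens to become empty.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids

def train_local_models(X, y, k):
    """Cluster the cross-project data and fit one small model per cluster.

    Each "model" is a nearest-class-centroid classifier (an assumption):
    a dict mapping label -> mean metric vector of that label in the cluster.
    """
    centroids = kmeans(X, k)
    grouped = [{} for _ in range(k)]
    for x, label in zip(X, y):
        grouped[nearest(x, centroids)].setdefault(label, []).append(x)
    models = [{lab: mean(rows) for lab, rows in g.items()} for g in grouped]
    return centroids, models

def predict(x, centroids, models):
    """Route x to its nearest non-empty cluster and apply that cluster's model."""
    usable = [i for i, m in enumerate(models) if m]
    model = models[nearest(x, centroids, usable)]
    return min(model, key=lambda lab: dist2(x, model[lab]))

def pd_pf_balance(y_true, y_pred):
    """PD (recall), PF (false-alarm rate), and Balance; label 1 = defective.

    Assumes y_true contains at least one instance of each class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    pd = tp / (tp + fn)
    pf = fp / (fp + tn)
    balance = 1 - math.sqrt((0 - pf) ** 2 + (1 - pd) ** 2) / math.sqrt(2)
    return pd, pf, balance

# Toy cross-project data: two regions where the metric/defect relation differs.
X_cross = [[0, 0], [0, 1], [1, 0], [3, 3], [3, 4], [4, 3],
           [100, 100], [100, 101], [101, 100],
           [103, 103], [103, 104], [104, 103]]
y_cross = [0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0]
centroids, models = train_local_models(X_cross, y_cross, k=2)

# Hypothetical target-project instances, each routed to its own cluster model.
X_target = [[0.5, 0.5], [3.5, 3.5], [100.5, 100.5], [103.5, 103.5]]
y_pred = [predict(x, centroids, models) for x in X_target]
print(y_pred, pd_pf_balance([0, 1, 1, 0], y_pred))
```

In the toy data, higher metric values indicate defects in one region and clean modules in the other, which is the kind of heterogeneity a single global model would struggle with. Because the cluster models are trained once, a new test set only needs the routing step, unlike a filter that is rerun for every test set.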

Cross-version defect prediction: use historical data, cross-project data, or both?

Empirical Software Engineering ◽

10.1007/s10664-019-09777-8 ◽

2020 ◽

Vol 25 (2) ◽

pp. 1573-1595

Author(s):

Sousuke Amasaki

Keyword(s):

Historical Data ◽

Defect Prediction ◽

Project Data ◽

Cross Project

The cross-project defect prediction based on PSO and Feature Dependent Naive Bayes

Journal of Physics Conference Series ◽

10.1088/1742-6596/1237/2/022126 ◽

2019 ◽

Vol 1237 ◽

pp. 022126

Author(s):

Zhexi Yao ◽

Li Sun ◽

Tao Zhang ◽

Jinbo Wang

Keyword(s):

Naive Bayes ◽

Naïve Bayes ◽

Defect Prediction ◽

The Cross ◽

Cross Project

An Empirical Study of Training Data Selection Methods for Ranking-Oriented Cross-Project Defect Prediction

Sensors ◽

10.3390/s21227535 ◽

2021 ◽

Vol 21 (22) ◽

pp. 7535

Author(s):

Haoyu Luo ◽

Heng Dai ◽

Weiqiang Peng ◽

Wenhua Hu ◽

Fuyang Li

Keyword(s):

Training Data ◽

Data Selection ◽

Defect Prediction ◽

Selection Methods ◽

Industrial Project ◽

Software Modules ◽

Industrial Projects ◽

Project Data ◽

Training Data Selection ◽

Cross Project

Ranking-oriented cross-project defect prediction (ROCPDP), which ranks the software modules of a new target industrial project by predicted defect number or density, has been suggested in the literature. A major concern in ROCPDP is the distribution difference between the source-project (cross-project) data and the target-project (within-project) data, which degrades prediction performance. To investigate the impact of training data selection methods on the performance of ROCPDP models, we examined the practical effects of nine training data selection methods, including a global filter, which does not filter out any cross-project data. Additionally, the prediction performance of ROCPDP models trained on cross-project data filtered by these selection methods was compared with that of ranking-oriented within-project defect prediction (ROWPDP) models trained on sufficient and on limited within-project data. Eleven available defect datasets from industrial projects were evaluated using two ranking performance measures, FPA and Norm(Popt). The results showed no statistically significant differences among the nine training data selection methods in terms of FPA and Norm(Popt). The performance of ROCPDP models trained on filtered cross-project data was not comparable with that of ROWPDP models trained on sufficient historical within-project data. However, ROCPDP models trained on filtered cross-project data achieved better performance values than ROWPDP models trained on limited historical within-project data. Therefore, we recommend that software quality teams exploit other project datasets to perform ROCPDP when there is no or limited within-project data.
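The abstract does not define FPA. As a rough illustration, the sketch below uses the fault-percentile-average as it is usually defined in ranking-oriented defect prediction: rank the modules by predicted defect count, then average, over every cutoff m, the fraction of actual defects found in the top m modules. The function name and the example numbers are hypothetical, and the paper's exact formulation may differ.

```python
def fpa(predicted, actual):
    """Fault-percentile-average of a ranking (assumed common definition).

    predicted: predicted defect count (or density) per module
    actual:    actual defect count per module (must not sum to zero)
    """
    k = len(actual)
    total = sum(actual)
    # Rank modules from highest to lowest predicted defect count.
    order = sorted(range(k), key=lambda i: predicted[i], reverse=True)
    covered = 0      # actual defects found in the top-m modules so far
    acc = 0.0        # running sum of the top-m defect fractions
    for i in order:
        covered += actual[i]
        acc += covered / total
    return acc / k

# Hypothetical example with four modules.
actual = [3, 1, 0, 0]
good_ranking = fpa([5, 2, 1, 0], actual)   # most defective modules ranked first
bad_ranking = fpa([0, 1, 2, 5], actual)    # most defective modules ranked last
print(good_ranking, bad_ranking)
```

A ranking that places the most defective modules first scores higher, which is why the measure suits the case where a team can only inspect the top of the list.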

An exploratory study about the cross-project defect prediction: Impact of using different classification algorithms and a measure of performance in building predictive models

2015 Latin American Computing Conference (CLEI) ◽

10.1109/clei.2015.7360033 ◽

2015 ◽

Cited By ~ 2

Author(s):

Ricardo F. P. Satin ◽

Igor Scaliante Wiese ◽

Reginaldo Re

Keyword(s):

Predictive Models ◽

Exploratory Study ◽

Defect Prediction ◽

Classification Algorithms ◽

The Cross ◽

Measure Of Performance ◽

Cross Project

Heterogeneous Cross Project Defect Prediction in Software

SSRN Electronic Journal ◽

10.2139/ssrn.3580671 ◽

2020 ◽

Author(s):

Sonali Srivastava ◽

Shikha Rani ◽

Shailly Singh ◽

Saurabh Singh ◽

Rohit Vashisht

Keyword(s):

Defect Prediction ◽

Cross Project

Joint feature representation learning and progressive distribution matching for cross-project defect prediction

Information and Software Technology ◽

10.1016/j.infsof.2021.106588 ◽

2021 ◽

Vol 137 ◽

pp. 106588

Author(s):

Quanyi Zou ◽

Lu Lu ◽

Zhanyu Yang ◽

Xiaowei Gu ◽

Shaojian Qiu

Keyword(s):

Representation Learning ◽

Feature Representation ◽

Defect Prediction ◽

Distribution Matching ◽

Cross Project

Cross project defect prediction: a comprehensive survey with its SWOT analysis

Innovations in Systems and Software Engineering ◽

10.1007/s11334-020-00380-5 ◽

2021 ◽

Author(s):

Yogita Khatri ◽

Sandeep Kumar Singh

Keyword(s):

Swot Analysis ◽

Defect Prediction ◽

Comprehensive Survey ◽

Cross Project

An investigation of cross-project learning in online just-in-time software defect prediction

Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering ◽

10.1145/3377811.3380403 ◽

2020 ◽

Cited By ~ 1

Author(s):

Sadia Tabassum ◽

Leandro L. Minku ◽

Danyi Feng ◽

George G. Cabral ◽

Liyan Song

Keyword(s):

Defect Prediction ◽

Just In Time ◽

Software Defect Prediction ◽

Project Learning ◽

Software Defect ◽

Cross Project

An Improved SDA Based Defect Prediction Framework for Both Within-Project and Cross-Project Class-Imbalance Problems

IEEE Transactions on Software Engineering ◽

10.1109/tse.2016.2597849 ◽

2017 ◽

Vol 43 (4) ◽

pp. 321-339 ◽

Cited By ~ 57

Author(s):

Xiao-Yuan Jing ◽

Fei Wu ◽

Xiwei Dong ◽

Baowen Xu

Keyword(s):

Class Imbalance ◽

Defect Prediction ◽

Cross Project

Cross-project smell-based defect prediction

Soft Computing ◽

10.1007/s00500-021-06254-7 ◽

2021 ◽

Author(s):

Bruno Sotto-Mayor ◽

Meir Kalech

Keyword(s):

Defect Prediction ◽

Cross Project
