A cross‐project defect prediction method based on multi‐adaptation and nuclear norm

Cross-Project Defect Prediction Method Based on Manifold Feature Transformation

Future Internet ◽

10.3390/fi13080216 ◽

2021 ◽

Vol 13 (8) ◽

pp. 216

Author(s):

Yu Zhao ◽

Yi Zhu ◽

Qiao Yu ◽

Xiaoying Chen

Keyword(s):

Prediction Model ◽

Data Distribution ◽

Prediction Method ◽

Feature Space ◽

Defect Prediction ◽

Software Project ◽

Feature Transformation ◽

Traditional Methods ◽

The Difference ◽

Cross Project

Traditional software defect prediction methods train a model on part of a project's data and predict the defect labels of the remaining data in the same project. In practice, however, the project to be predicted is usually brand new and lacks enough labeled data to build a defect prediction model, so traditional methods no longer apply. Cross-project defect prediction instead builds the model from the labeled data of projects of the same type as the target project, which addresses the shortage of labeled data in traditional methods. However, the difference in data distribution between the source project and the target project reduces prediction performance. To solve this problem, this paper proposes a cross-project defect prediction method based on manifold feature transformation. The method transforms the original feature space of the projects into a manifold space, then reduces the difference in data distribution between the transformed source project and the transformed target project in that space, and finally uses the transformed source project to train a naive Bayes prediction model with better performance. A comparative experiment was carried out on the Relink and AEEEM datasets. The results show that, compared with the benchmark method and several cross-project defect prediction methods, the proposed method effectively reduces the difference in data distribution between the source and target projects and obtains a higher F1 value, a metric commonly used to evaluate binary classifiers.
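The F1 value mentioned in the abstract is the harmonic mean of precision and recall. The following is a minimal sketch of that standard definition, not code from the paper; the function name `f1_score` and the sample labels are illustrative assumptions, with 1 standing for "defective" and 0 for "clean".

```python
def f1_score(y_true, y_pred):
    """Return the F1 value for binary labels (1 = defective, 0 = clean)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # No defective module was correctly predicted, so F1 is 0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for five modules of a target project.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 1, 1]
score = f1_score(y_true, y_pred)
```

Because F1 ignores true negatives, it is less inflated than accuracy when defective modules are rare, which is typical of defect datasets.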


Dissimilarity Space Based Multi-Source Cross-Project Defect Prediction

Algorithms ◽

10.3390/a12010013 ◽

2019 ◽

Vol 12 (1) ◽

pp. 13

Author(s):

Shengbing Ren ◽

Wanying Zhang ◽

Hafiz Shahbaz Munir ◽

Lei Xia

Keyword(s):

Nearest Neighbor ◽

Prediction Method ◽

Poor Performance ◽

Performance Model ◽

Training Data ◽

Cluster Center ◽

Defect Prediction ◽

K Nearest Neighbor ◽

Target Set ◽

Cross Project

Software defect prediction is an important means of guaranteeing software quality. Because a project often lacks sufficient historical data to train a classifier, cross-project defect prediction (CPDP) has been recognized as a fundamental approach. However, traditional defect prediction methods represent samples by their feature attributes, which cannot avoid negative transfer and may result in poorly performing models in CPDP. This paper proposes a multi-source cross-project defect prediction method based on dissimilarity space (DM-CPDP). The method not only retains the original information but also captures each sample's relationship with other objects, which enhances the ability of the sample attributes to discriminate the class label. The method first uses density-based clustering to construct the prototype set from the cluster centers of samples in the target set. Then, the arc-cosine kernel is used to calculate the dissimilarities between the prototype set and the samples of the source domain or the target set, forming the dissimilarity space. In this space, the training set is obtained with the earth mover’s distance (EMD) method. The unlabeled samples converted from the target set are labeled with the k-Nearest Neighbor (KNN) algorithm. Finally, the model is learned from the training data with the TrAdaBoost method and used to predict new potential defects. The experimental results show that this approach performs better than other traditional CPDP methods.
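The idea of a dissimilarity space can be sketched as follows: each sample is re-expressed as a vector of its dissimilarities to a fixed prototype set, and unlabeled samples are then labeled by majority vote among their nearest neighbors in that space. This is only an illustration under stated assumptions, not the DM-CPDP implementation: Euclidean distance is swapped in for the arc-cosine kernel, the prototypes are given by hand rather than found by density-based clustering, and the EMD and TrAdaBoost steps are omitted. All function names are hypothetical.

```python
import math


def to_dissimilarity_space(samples, prototypes):
    """Map each sample to its vector of distances to the prototypes.

    Assumption: Euclidean distance stands in for the arc-cosine kernel.
    """
    return [[math.dist(s, p) for p in prototypes] for s in samples]


def knn_label(query, labeled_points, labels, k=3):
    """Label `query` by majority vote of its k nearest labeled points."""
    ranked = sorted(
        range(len(labeled_points)),
        key=lambda i: math.dist(query, labeled_points[i]),
    )
    votes = [labels[i] for i in ranked[:k]]
    # Binary vote: 1 (defective) only if strictly more than half agree.
    return 1 if 2 * sum(votes) > len(votes) else 0


# Hypothetical two-feature data; prototypes would come from clustering.
prototypes = [(0.0, 0.0), (10.0, 10.0)]
source = [(0.0, 1.0), (1.0, 0.0), (9.0, 10.0), (10.0, 9.0)]
source_labels = [0, 0, 1, 1]

source_ds = to_dissimilarity_space(source, prototypes)
target_ds = to_dissimilarity_space([(1.0, 1.0)], prototypes)
predicted = knn_label(target_ds[0], source_ds, source_labels, k=3)
```

The dimensionality of the new representation equals the number of prototypes, so the choice of prototype set controls both the cost and the expressiveness of the space.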


On Applicability of Cross-project Defect Prediction Method for Multi-Versions Projects

Proceedings of the 13th International Conference on Predictive Models and Data Analytics in Software Engineering - PROMISE ◽

10.1145/3127005.3127015 ◽

2017 ◽

Cited By ~ 4

Author(s):

Sousuke Amasaki

Keyword(s):

Prediction Method ◽

Defect Prediction ◽

Cross Project


Dissimilarity Space Based Multi-Source Cross-Project Defect Prediction

10.20944/preprints201811.0461.v1 ◽

2018 ◽

Author(s):

Shengbing Ren ◽

Wanying Zhang ◽

Hafiz Shahbaz Munir ◽

Lei Xia

Keyword(s):

Prediction Method ◽

Poor Performance ◽

Performance Model ◽

Cluster Center ◽

Defect Prediction ◽

Software Defect ◽

Density Based Clustering ◽

Important Means ◽

Target Set ◽

Cross Project

Software defect prediction is an important means of guaranteeing software quality. Because a project often lacks sufficient historical data to train a classifier, cross-project defect prediction (CPDP) has been recognized as a fundamental approach. However, traditional defect prediction methods represent samples by their feature attributes, which cannot avoid negative transfer and may result in poorly performing models in CPDP. This paper proposes a multi-source cross-project defect prediction method based on dissimilarity space (DM-CPDP). The method first uses density-based clustering to construct the prototype set from the cluster centers of samples in the target set. Then, the arc-cosine kernel is used to form the dissimilarity space, and in this space the training set is obtained with the earth mover’s distance (EMD) method. The unlabeled samples converted from the target set are labeled with the KNN algorithm. Finally, we use the TrAdaBoost method to establish the prediction model. The experimental results show that our approach performs better than other traditional CPDP methods.


Heterogeneous Cross Project Defect Prediction in Software

SSRN Electronic Journal ◽

10.2139/ssrn.3580671 ◽

2020 ◽

Author(s):

Sonali Srivastava ◽

Shikha Rani ◽

Shailly Singh ◽

Saurabh Singh ◽

Rohit Vashisht

Keyword(s):

Defect Prediction ◽

Cross Project


Joint feature representation learning and progressive distribution matching for cross-project defect prediction

Information and Software Technology ◽

10.1016/j.infsof.2021.106588 ◽

2021 ◽

Vol 137 ◽

pp. 106588

Author(s):

Quanyi Zou ◽

Lu Lu ◽

Zhanyu Yang ◽

Xiaowei Gu ◽

Shaojian Qiu

Keyword(s):

Representation Learning ◽

Feature Representation ◽

Defect Prediction ◽

Distribution Matching ◽

Cross Project


Cross project defect prediction: a comprehensive survey with its SWOT analysis

Innovations in Systems and Software Engineering ◽

10.1007/s11334-020-00380-5 ◽

2021 ◽

Author(s):

Yogita Khatri ◽

Sandeep Kumar Singh

Keyword(s):

Swot Analysis ◽

Defect Prediction ◽

Comprehensive Survey ◽

Cross Project


An investigation of cross-project learning in online just-in-time software defect prediction

Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering ◽

10.1145/3377811.3380403 ◽

2020 ◽

Cited By ~ 1

Author(s):

Sadia Tabassum ◽

Leandro L. Minku ◽

Danyi Feng ◽

George G. Cabral ◽

Liyan Song

Keyword(s):

Defect Prediction ◽

Just In Time ◽

Software Defect Prediction ◽

Project Learning ◽

Software Defect ◽

Cross Project


An Improved SDA Based Defect Prediction Framework for Both Within-Project and Cross-Project Class-Imbalance Problems

IEEE Transactions on Software Engineering ◽

10.1109/tse.2016.2597849 ◽

2017 ◽

Vol 43 (4) ◽

pp. 321-339 ◽

Cited By ~ 57

Author(s):

Xiao-Yuan Jing ◽

Fei Wu ◽

Xiwei Dong ◽

Baowen Xu

Keyword(s):

Class Imbalance ◽

Defect Prediction ◽

Cross Project


Cross-project smell-based defect prediction

Soft Computing ◽

10.1007/s00500-021-06254-7 ◽

2021 ◽

Author(s):

Bruno Sotto-Mayor ◽

Meir Kalech

Keyword(s):

Defect Prediction ◽

Cross Project
