Training Data Selection Using Ensemble Dataset Approach for Software Defect Prediction

Cyber Security and Computer Science - Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering ◽

10.1007/978-3-030-52856-0_19 ◽

2020 ◽

pp. 243-256

Author(s):

Md Fahimuzzman Sohan ◽

Md Alamgir Kabir ◽

Mostafijur Rahman ◽

S. M. Hasan Mahmud ◽

Touhid Bhuiyan

Keyword(s):

Training Data ◽

Data Selection ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect ◽

Training Data Selection

An Improved Method for Training Data Selection for Cross-Project Defect Prediction

Arabian Journal for Science and Engineering ◽

10.1007/s13369-021-06088-3 ◽

2021 ◽

Author(s):

Nayeem Ahmad Bhat ◽

Sheikh Umar Farooq

Keyword(s):

Training Data ◽

Data Selection ◽

Defect Prediction ◽

Improved Method ◽

Selection For ◽

Training Data Selection ◽

Cross Project

Training data selection for cross-project defect prediction

Proceedings of the 9th International Conference on Predictive Models in Software Engineering - PROMISE '13 ◽

10.1145/2499393.2499395 ◽

2013 ◽

Cited By ~ 50

Author(s):

Steffen Herbold

Keyword(s):

Training Data ◽

Data Selection ◽

Defect Prediction ◽

Selection For ◽

Training Data Selection ◽

Cross Project

Training data selection for imbalanced cross-project defect prediction

Computers & Electrical Engineering ◽

10.1016/j.compeleceng.2021.107370 ◽

2021 ◽

Vol 94 ◽

pp. 107370

Author(s):

Shang Zheng ◽

Jinjing Gai ◽

Hualong Yu ◽

Haitao Zou ◽

Shang Gao

Keyword(s):

Training Data ◽

Data Selection ◽

Defect Prediction ◽

Selection For ◽

Training Data Selection ◽

Cross Project

CDS: A Cross–Version Software Defect Prediction Model With Data Selection

IEEE Access ◽

10.1109/access.2020.3001440 ◽

2020 ◽

Vol 8 ◽

pp. 110059-110072

Author(s):

Jie Zhang ◽

Jiajing Wu ◽

Chuan Chen ◽

Zibin Zheng ◽

Michael R. Lyu

Keyword(s):

Prediction Model ◽

Data Selection ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect

Search Based Training Data Selection For Cross Project Defect Prediction

Proceedings of the 12th International Conference on Predictive Models and Data Analytics in Software Engineering - PROMISE 2016 ◽

10.1145/2972958.2972964 ◽

2016 ◽

Cited By ~ 10

Author(s):

Seyedrebvar Hosseini ◽

Burak Turhan ◽

Mika Mäntylä

Keyword(s):

Training Data ◽

Data Selection ◽

Defect Prediction ◽

Selection For ◽

Training Data Selection ◽

Cross Project

Empirical Study of Software Defect Prediction: A Systematic Mapping

Symmetry ◽

10.3390/sym11020212 ◽

2019 ◽

Vol 11 (2) ◽

pp. 212 ◽

Cited By ~ 14

Author(s):

Le Son ◽

Nakul Pritam ◽

Manju Khari ◽

Raghvendra Kumar ◽

Pham Phuong ◽

...

Keyword(s):

Software Quality ◽

Empirical Studies ◽

Training Data ◽

Defect Prediction ◽

Software Defect Prediction ◽

Systematic Mapping ◽

Software Defect ◽

Multi Stage ◽

Stage Process ◽

Or Organization

Software defect prediction has been one of the key areas of exploration in the domain of software quality. In this paper, we perform a systematic mapping to analyze all the software defect prediction literature available from 1995 to 2018 using a multi-stage process. A total of 156 studies are selected in the first step, and the final mapping is conducted based on these studies. The ability of a model to learn from data that does not come from the same project or organization will help organizations that do not have sufficient training data or are going to start work on new projects. The findings of this research are useful not only to the software engineering domain but also to empirical studies, which mainly focus on symmetry, as they provide step-by-step solutions to the questions raised in the article.

An Exploratory Study of Search Based Training Data Selection for Cross Project Defect Prediction

2018 44th Euromicro Conference on Software Engineering and Advanced Applications (SEAA) ◽

10.1109/seaa.2018.00048 ◽

2018 ◽

Author(s):

Seyedrebvar Hosseini ◽

Burak Turhan

Keyword(s):

Exploratory Study ◽

Training Data ◽

Data Selection ◽

Defect Prediction ◽

Selection For ◽

Training Data Selection ◽

Cross Project

An Empirical Study of Training Data Selection Methods for Ranking-Oriented Cross-Project Defect Prediction

Sensors ◽

10.3390/s21227535 ◽

2021 ◽

Vol 21 (22) ◽

pp. 7535

Author(s):

Haoyu Luo ◽

Heng Dai ◽

Weiqiang Peng ◽

Wenhua Hu ◽

Fuyang Li

Keyword(s):

Training Data ◽

Data Selection ◽

Defect Prediction ◽

Selection Methods ◽

Industrial Project ◽

Software Modules ◽

Industrial Projects ◽

Project Data ◽

Training Data Selection ◽

Cross Project

Ranking-oriented cross-project defect prediction (ROCPDP), which ranks software modules of a new target industrial project based on the predicted defect number or density, has been suggested in the literature. A major concern of ROCPDP is the distribution difference between the source project (a.k.a. within-project) data and target project (a.k.a. cross-project) data, which evidently degrades prediction performance. To investigate the impact of training data selection methods on the performance of ROCPDP models, we examined the practical effects of nine training data selection methods, including a global filter, which does not filter out any cross-project data. Additionally, the prediction performance of ROCPDP models trained on cross-project data filtered by these selection methods was compared with that of ranking-oriented within-project defect prediction (ROWPDP) models trained on sufficient and limited within-project data. Eleven available defect datasets from industrial projects were considered and evaluated using two ranking performance measures, i.e., FPA and Norm(Popt). The results showed no statistically significant differences among these nine training data selection methods in terms of FPA and Norm(Popt). The performance of ROCPDP models trained on filtered cross-project data was not comparable with that of ROWPDP models trained on sufficient historical within-project data. However, ROCPDP models trained on filtered cross-project data achieved better performance values than ROWPDP models trained on limited historical within-project data. Therefore, we recommend that software quality teams exploit other project datasets to perform ROCPDP when there is no or limited within-project data.
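
The abstract names FPA as a ranking measure without defining it. As a rough illustration only, the sketch below assumes the usual reading of FPA (fault-percentile-average): for each m from 1 to k, take the share of all actual defects found in the m modules with the highest predicted defect counts, then average those shares. The function name and the toy data are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def fault_percentile_average(predicted: Sequence[float],
                             actual: Sequence[int]) -> float:
    """Average share of actual defects captured by the top-m predicted
    modules, for m = 1..k (assumed FPA definition, see lead-in)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    k = len(actual)
    total_defects = sum(actual)
    if k == 0 or total_defects == 0:
        raise ValueError("need at least one module and one actual defect")

    # Rank modules from highest to lowest predicted defect count.
    order = sorted(range(k), key=lambda i: predicted[i], reverse=True)

    captured = 0      # actual defects found in the top-m modules so far
    share_sum = 0.0   # running sum of captured / total_defects over m
    for i in order:
        captured += actual[i]
        share_sum += captured / total_defects
    return share_sum / k


# Toy data (hypothetical): four modules, four actual defects in total.
actual_defects = [0, 3, 1, 0]
good_ranking = fault_percentile_average([0.1, 2.5, 0.9, 0.0], actual_defects)
poor_ranking = fault_percentile_average([2.5, 0.0, 0.1, 0.9], actual_defects)
```

Under this definition, a model that places the defect-heavy modules first scores higher (`good_ranking` exceeds `poor_ranking` here), which is the sense in which the measure rewards ranking quality rather than classification accuracy.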

Local and Regional Hour-Ahead Forecasts of Solar Irradiance with Training Data Selection and Support Vector Regression

IEEJ Transactions on Power and Energy ◽

10.1541/ieejpes.136.898 ◽

2016 ◽

Vol 136 (12) ◽

pp. 898-907 ◽

Cited By ~ 2

Author(s):

Joao Gari da Silva Fonseca Junior ◽

Hideaki Ohtake ◽

Takashi Oozeki ◽

Kazuhiko Ogimoto

Keyword(s):

Support Vector Regression ◽

Solar Irradiance ◽

Training Data ◽

Data Selection ◽

Support Vector ◽

Training Data Selection

Software Defect Prediction Incremental Model using Ensemble Learning

International Journal of Performability Engineering ◽

10.23940/ijpe.20.11.p9.17711780 ◽

2020 ◽

Vol 16 (11) ◽

pp. 1771

Author(s):

Wang Shibo ◽

Li Yong ◽

Mi Wenbo ◽

Liu Ying

Keyword(s):

Ensemble Learning ◽

Defect Prediction ◽

Software Defect Prediction ◽

Incremental Model ◽

Software Defect
