Data collection for Software Defect Prediction - An exploratory case study of open source software projects

Author(s):  
Goran Mausa ◽  
Tihana Galinac Grbac ◽  
Bojana Dalbelo Basic
2021 ◽  
Vol 11 (5) ◽  
pp. 2002
Author(s):  
Jonggu Kang ◽  
Sunjae Kwon ◽  
Duksan Ryu ◽  
Jongmoon Baik

Software plays a central role in recent vehicle innovations, and consequently the amount of software has grown rapidly in recent decades. The safety-critical nature of ships, one type of vehicle, makes software quality assurance (SQA) a fundamental prerequisite. Just-in-time software defect prediction (JIT-SDP) applies software defect prediction (SDP) to commit-level code changes in order to allocate SQA resources effectively. The first case study of SDP in the maritime domain reported feasible prediction performance. However, the prediction model still has room for improvement because its parameters have not yet been optimized. Harmony search (HS) is a widely used, music-inspired meta-heuristic optimization algorithm. In this article, we demonstrate that JIT-SDP achieves better prediction performance when HS-based parameter optimization is applied with balance as the fitness value. Using two real-world datasets from a maritime software project, we obtained an optimized model that exceeds the baseline of the previous case study across various defect-to-non-defect class imbalance ratios. Experiments with open source software also showed better recall for all datasets, even though balance was the performance index being optimized. HS-based parameter-optimized JIT-SDP can therefore be applied to maritime domain software with a high class imbalance ratio. Finally, we expect that our research can be extended to improve the performance of JIT-SDP not only in maritime domain software but also in open source software.
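To illustrate the idea of HS-based parameter optimization with balance as the fitness value, here is a minimal Python sketch. It is not the authors' implementation. The balance formula is the one commonly used in the SDP literature (distance from the ideal point of probability of detection PD = 1 and probability of false alarm PF = 0), which the abstract itself does not spell out. The function names, the parameters `hms`, `hmcr`, `par`, `bw`, and their default values are all assumptions chosen for illustration.

```python
import math
import random


def balance(pd, pf):
    """Balance: 1 minus the normalized distance from the ideal point (PD=1, PF=0).

    Assumed definition; the abstract does not give the formula.
    """
    return 1.0 - math.sqrt((0.0 - pf) ** 2 + (1.0 - pd) ** 2) / math.sqrt(2.0)


def harmony_search(fitness, bounds, hms=10, hmcr=0.9, par=0.3, bw=0.05,
                   iters=500, seed=0):
    """Maximize `fitness` over real-valued parameters constrained to `bounds`.

    hms: harmony memory size, hmcr: memory consideration rate,
    par: pitch adjustment rate, bw: bandwidth as a fraction of each range.
    All defaults are illustrative, not taken from the paper.
    """
    rng = random.Random(seed)
    # Harmony memory: random initial candidate solutions and their scores.
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(hms)]
    scores = [fitness(h) for h in memory]
    for _ in range(iters):
        new = []
        for j, (lo, hi) in enumerate(bounds):
            if rng.random() < hmcr:
                # Memory consideration: reuse a value from a stored harmony.
                value = memory[rng.randrange(hms)][j]
                if rng.random() < par:
                    # Pitch adjustment: small perturbation scaled by bandwidth.
                    value += rng.uniform(-1.0, 1.0) * bw * (hi - lo)
            else:
                # Random selection from the full range.
                value = rng.uniform(lo, hi)
            new.append(min(hi, max(lo, value)))
        score = fitness(new)
        # Replace the worst stored harmony if the new one is better.
        worst = min(range(hms), key=scores.__getitem__)
        if score > scores[worst]:
            memory[worst], scores[worst] = new, score
    best = max(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]
```

In a real JIT-SDP setting, `fitness` would train a classifier with the candidate parameters and return `balance(pd, pf)` measured on validation data. A toy call such as `harmony_search(lambda p: -(p[0] - 0.3) ** 2, [(0.0, 1.0)])` only exercises the search loop.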


2018 ◽  
Vol 2018 ◽  
pp. 1-13 ◽  
Author(s):  
Haijin Ji ◽  
Song Huang

Various data preprocessing methods and classifiers have previously been established and evaluated for cross-project software defect prediction (SDP). These approaches have provided relatively acceptable prediction results for different software projects. However, to the best of our knowledge, few researchers have combined data preprocessing with the construction of a robust classifier to improve prediction performance in SDP. Therefore, this paper presents a new, complete framework for predicting fault-prone software modules. The proposed framework consists of instance filtering, feature selection, instance reduction, and the establishment of a new classifier. Additionally, after performing a Kolmogorov-Smirnov test, we find that the 21 main software metrics commonly follow a non-normal distribution. Therefore, the newly proposed classifier is built on the maximum correntropy criterion (MCC), which is well known for its effectiveness in handling non-Gaussian noise. To evaluate the new framework, the experimental study is carefully designed using nine open-source software projects with 32 releases, obtained from the PROMISE data repository. Prediction accuracy is evaluated using the F-measure. State-of-the-art methods for Cross-Project Defect Prediction are also included for comparison. All of the evidence derived from the experiments verifies the effectiveness and robustness of our new framework.
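The two quantities named in this abstract can be sketched in a few lines of Python. This is only an illustration of the standard definitions, not the authors' classifier: the F-measure is the harmonic mean of precision and recall, and the sample correntropy below uses an unnormalized Gaussian kernel. The function names and the kernel width `sigma` are assumptions.

```python
import math


def f_measure(tp, fp, fn):
    """F-measure from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def correntropy(y_true, y_pred, sigma=1.0):
    """Sample correntropy with an unnormalized Gaussian kernel.

    Each error contributes exp(-e^2 / (2 * sigma^2)), so a very large error
    contributes almost nothing instead of dominating, as it would in a
    squared-error loss. `sigma` is a hypothetical kernel width.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    total = sum(math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))
                for a, b in zip(y_true, y_pred))
    return total / len(y_true)
```

A classifier built on the maximum correntropy criterion would choose its parameters to maximize `correntropy` between labels and predictions. Because a single outlier can lower the value by at most `1 / n`, the criterion is less sensitive to non-Gaussian noise than mean squared error.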


Author(s):  
Donald Wynn Jr.

This study examines the concept of an ecosystem as it originated in the field of ecology and as it is applied to open source software projects. Additionally, a framework for assessing the three dimensions of ecosystem health is defined and explained using examples from a specific open source ecosystem. The conceptual framework is explained in the context of a case study of a sponsored open source ecosystem. The framework and case study highlight a number of characteristics and aspects of these ecosystems that existing and potential members can evaluate to gauge the health and sustainability of open source projects and the products and services they produce.

