Estimation of Target Defect Prediction Coverage in Heterogeneous Cross Software Projects

Author(s):  
Rohit Vashisht ◽  
Syed Afzal Murtaza Rizvi

Heterogeneous cross-project defect prediction (HCPDP) is an evolving area within the quality assurance domain that aims to predict defects in a target project which has limited historical defect data and completely non-uniform software metrics, using a model built on another source project. The article discusses the problem of defect prediction coverage (DPC) for a particular source project group and proposes a novel two-phase model to address this issue in HCPDP. The study evaluates DPC on 13 benchmarked datasets drawn from three open-source software projects. One hundred percent DPC is achieved, with high defect prediction accuracy, for two project group pairs. The issue of partial DPC is found in the third prediction pair, and a new strategy is proposed in the study to overcome it. Furthermore, the paper compares HCPDP modeling with within-project defect prediction (WPDP), both empirically and theoretically, finding that the performance of WPDP is highly comparable to that of HCPDP and that the gradient boosting method performs best among the three classifiers.
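The cross-project setup reported above, where a classifier trained on one project's metrics predicts defects in another, can be sketched as follows. This is a minimal illustration, not the paper's model: the synthetic metric data, the labeling rule, and the four-feature layout are all assumptions, and only the choice of a gradient boosting classifier follows the abstract.

```python
# Minimal sketch of cross-project defect prediction with gradient boosting.
# The metric values and defect labels below are synthetic, for illustration only.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(42)

# Source project: rows are modules, columns are software metrics (e.g. size, complexity).
X_source = rng.normal(size=(200, 4))
# Mark a module defective when a noisy combination of its metrics is high (toy rule).
y_source = (X_source[:, 0] + 0.5 * X_source[:, 1]
            + rng.normal(scale=0.3, size=200) > 0).astype(int)

# Target project, assumed already mapped into the same metric space.
X_target = rng.normal(size=(50, 4))

clf = GradientBoostingClassifier(random_state=0)
clf.fit(X_source, y_source)       # train on the source project only
pred = clf.predict(X_target)      # predict defect-proneness for target modules
```

In a real HCPDP pipeline the hard part is the metric matching between heterogeneous source and target feature spaces, which this sketch sidesteps by assuming identical columns.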

2018 ◽  
Vol 2018 ◽  
pp. 1-13 ◽  
Author(s):  
Haijin Ji ◽  
Song Huang

Different data preprocessing methods and classifiers have previously been established and evaluated for cross-project software defect prediction (SDP). These approaches have produced reasonably acceptable prediction results for different software projects. However, to the best of our knowledge, few researchers have combined data preprocessing with building a robust classifier to improve prediction performance in SDP. This paper therefore presents a complete framework for predicting fault-prone software modules, consisting of instance filtering, feature selection, instance reduction, and a newly established classifier. Additionally, we find that the 21 main software metrics commonly follow a non-normal distribution, according to a Kolmogorov-Smirnov test. The new classifier is therefore built on the maximum correntropy criterion (MCC), which is well known for its effectiveness in handling non-Gaussian noise. To evaluate the framework, an experimental study is carefully designed using nine open-source software projects with 32 releases in total, obtained from the PROMISE data repository. Prediction accuracy is evaluated using the F-measure, and state-of-the-art cross-project defect prediction methods are included for comparison. All of the evidence from the experiments verifies the effectiveness and robustness of the new framework.
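The normality check described above can be reproduced in miniature with a one-sample Kolmogorov-Smirnov test. The lognormal sample below is a stand-in assumption for a real, right-skewed software metric such as lines of code; the standardization step and the 0.05 threshold are conventional choices, not taken from the paper.

```python
# Sketch: testing whether a software metric is normally distributed with a
# Kolmogorov-Smirnov test. The lognormal sample imitates a skewed size metric.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
metric = rng.lognormal(mean=2.0, sigma=1.0, size=500)  # heavily right-skewed

# Standardize, then compare against the standard normal distribution.
z = (metric - metric.mean()) / metric.std()
statistic, p_value = stats.kstest(z, "norm")

# A small p-value rejects normality, motivating a criterion robust to
# non-Gaussian noise such as maximum correntropy.
normality_rejected = p_value < 0.05
```

For genuinely normal data the same test would typically yield a large p-value, so this check is what separates metrics that need a non-Gaussian-robust classifier from those that do not.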


2020 ◽  
Vol 10 (13) ◽  
pp. 4624
Author(s):  
Mitja Gradišnik ◽  
Tina Beranič ◽  
Sašo Karakatič

Software maintenance is one of the key stages of the software lifecycle and includes a variety of activities that consume a significant portion of a software project's costs. Previous research suggests that future software maintainability can be predicted from various source code aspects, but most of it focuses on prediction from the present state of the code and ignores its history. While taking history into account in software maintainability prediction seems intuitive, it has not been tested empirically; doing so is the main goal of this paper. The paper empirically evaluates the contribution of historical measurements of the Chidamber & Kemerer (C&K) software metrics to software maintainability prediction models. Its main contribution is building prediction models with classification and regression tree and random forest learners in iterations, gradually adding historical measurement data extracted from previous releases. The maintainability prediction models were built on software metric measurements obtained from real-world open-source software projects. The analysis of the results shows that additional historical metric measurements contribute to maintainability prediction. The study also evaluates the contribution of individual C&K software metrics to the performance of the maintainability prediction models.
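The iterative scheme described above, widening the feature set with one more past release per iteration, can be sketched roughly as below. Everything concrete here is an assumption: the release count, the synthetic C&K-style metric matrices, the toy maintainability target, and the in-sample scoring used only to show the loop shape.

```python
# Sketch: adding historical metric measurements release by release and
# refitting a random forest maintainability model each time (synthetic data).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n_classes, n_metrics, n_releases = 120, 6, 4

# metrics[r] holds the metric values of every class in release r (oldest first).
metrics = [rng.normal(size=(n_classes, n_metrics)) for _ in range(n_releases)]
# Toy maintainability target driven by the latest release's first metric.
maintainability = -metrics[-1][:, 0] + rng.normal(scale=0.2, size=n_classes)

scores = []
for history in range(1, n_releases + 1):
    # Features: the current release plus (history - 1) previous releases.
    X = np.hstack(metrics[-history:])
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    model.fit(X, maintainability)
    scores.append(model.score(X, maintainability))  # in-sample R^2, illustration only
```

A faithful replication would of course score on held-out classes or releases rather than in-sample; the loop only illustrates how the feature matrix grows with history.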


Author(s):  
Vandana Singh ◽  
Brice Bongiovanni

This article presents the results of a research study on the experiences of women in open-source software communities. The lack of women in computing professions is a cause of social inequity, and in this research we develop a nuanced understanding of the experiences of women participating in open-source software. In-depth qualitative interviews were conducted with eleven women representing multiple countries and a variety of open-source software projects. The theory of individual differences in gender and information technology (IT) laid the foundation for data analysis and interpretation. The results demonstrate the varied experiences of women, the need for woman-to-woman mentoring, and the need for the presence and enforcement of codes of conduct in online communities. The women shared their experiences of working in a variety of roles and the importance of all of these roles in product development and maintenance. Their persistence in OSS communities despite a toxic masculine culture, and their interest in improving the environment for other women and marginalized newcomers, was evident from the interviews.


2019 ◽  
Vol 107 ◽  
pp. 125-136 ◽  
Author(s):  
Chao Liu ◽  
Dan Yang ◽  
Xin Xia ◽  
Meng Yan ◽  
Xiaohong Zhang

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Yiwen Zhong ◽  
Kun Song ◽  
ShengKai Lv ◽  
Peng He

Cross-project defect prediction (CPDP) is a mainstream method for estimating the most defect-prone components of software when historical data are limited. Several studies have investigated how software metrics are used and how modeling techniques influence prediction performance; however, the impact of metric diversity on the predictor remains unclear. This paper therefore assesses the impact of various metric sets on CPDP and investigates the feasibility of CPDP with hybrid metrics. Based on four types of software metrics, we examine the impact of various metric sets on CPDP in terms of F-measure and statistical tests, and then validate the superior performance of CPDP with hybrid metrics. Finally, we verify the feasibility of CPDP-OSS, built with three types of metrics (object-oriented, semantic, and structural), and challenge it against two current models. The experimental results suggest that different metric sets affect CPDP performance in significantly distinct ways, with semantic and structural metrics performing better. The trials also indicate that appropriately increasing metric diversity helps CPDP, as the improvement of CPDP-OSS reaches up to 53.8%. Finally, compared with the two baseline methods, TCA+ and TDSelector, the optimized CPDP model is viable in practice, with improvement rates of up to 50.6% and 25.7%, respectively.
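The F-measure used above to compare metric sets is the harmonic mean of precision and recall over the defect class. A minimal sketch, with toy labels chosen for illustration (the function and example data are not from the paper):

```python
# Sketch: F-measure over binary defect labels (1 = defective, 0 = clean).
def f_measure(actual, predicted):
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

actual    = [1, 1, 1, 0, 0, 0, 1, 0]
predicted = [1, 1, 0, 0, 1, 0, 1, 0]
score = f_measure(actual, predicted)  # precision 0.75, recall 0.75 -> F = 0.75
```

F-measure is preferred over raw accuracy in defect prediction because defective modules are usually a small minority, and accuracy rewards predicting everything as clean.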


2020 ◽  
Vol 21 (2) ◽  
pp. 206-214
Author(s):  
V. S. Tynchenko ◽  
I. A. Golovenok ◽  
V. E. Petrenko ◽  
A. V. Milov ◽  
...  

2020 ◽  
Author(s):  
Sonali Srivastava ◽  
Shikha Rani ◽  
Shailly Singh ◽  
Saurabh Singh ◽  
Rohit Vashisht

Author(s):  
Maria Ulan ◽  
Welf Löwe ◽  
Morgan Ericsson ◽  
Anna Wingkvist

A quality model is a conceptual decomposition of an abstract notion of quality into relevant, possibly conflicting characteristics and further into measurable metrics. For quality assessment and decision making, metric values are aggregated to characteristics and ultimately to quality scores. Aggregation has often been problematic, as quality models do not provide the semantics of aggregation, which makes it hard to formally reason about metrics, characteristics, and quality. We argue that aggregation needs to be interpretable and mathematically well defined in order to assess, compare, and improve quality. To address this challenge, we propose a probabilistic approach to aggregation and define quality scores based on joint distributions of absolute metric values. To evaluate the proposed approach and its implementation under realistic conditions, we conduct empirical studies on bug prediction for ca. 5000 software classes, the maintainability of ca. 15000 open-source software systems, and the information quality of ca. 100000 real-world technical documents. We found that our approach is feasible, accurate, and scalable in performance.
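One way to ground the idea of scores based on joint distributions is to score each unit by the empirical probability that another unit matches or beats it on every metric at once. This is a loose sketch of the general idea, not the authors' definition: the data, the "lower is better" orientation, and the dominance rule are all assumptions.

```python
# Sketch: an empirical joint-distribution quality score. A unit's score is the
# fraction of the population that is at least as good on all metrics
# simultaneously (lower metric value = better, by assumption).
import numpy as np

rng = np.random.default_rng(7)
population = rng.random((1000, 3))  # 1000 units, 3 absolute metric values each

def quality_score(unit, population):
    # Units dominating `unit` on every metric; a high score means many units
    # are at least as good, i.e. this unit's relative quality is low.
    dominated = np.all(population <= unit, axis=1)
    return dominated.mean()

scores = np.array([quality_score(u, population) for u in population])
```

Because the score is a probability over the observed joint distribution, it stays in [0, 1] and remains interpretable and comparable across metrics with different scales, which is the property the abstract argues ad hoc aggregations lack.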

