Defect Prediction Models
Recently Published Documents

TOTAL DOCUMENTS: 66 (FIVE YEARS: 29)
H-INDEX: 15 (FIVE YEARS: 4)

2022 ◽  
Vol 31 (1) ◽  
pp. 1-26
Author(s):  
Davide Falessi ◽  
Aalok Ahluwalia ◽  
Massimiliano Di Penta

Defect prediction models can be beneficial for prioritizing testing, analysis, or code review activities, and they have been the subject of substantial effort in academia as well as some applications in industrial contexts. A necessary precondition when creating a defect prediction model is the availability of defect data from the history of projects. If this data is noisy, the resulting defect prediction model may be unreliable. One cause of noise in defect datasets is the presence of "dormant defects," i.e., defects discovered several releases after their introduction. A dormant defect can cause a class to be labeled as defect-free while it is not; such a class is said to be "snoring." In this article, we investigate the impact of snoring on classifiers' accuracy and the effectiveness of a possible countermeasure, i.e., dropping too-recent data from the training set. We analyze the accuracy of 15 machine learning defect prediction classifiers on data from more than 4,000 defects and 600 releases of 19 open-source projects from the Apache ecosystem. Our results show that, on average across projects, (i) the presence of dormant defects decreases the recall of defect prediction classifiers, and (ii) removing from the training set the classes that in the last release are labeled as not defective significantly improves the accuracy of the classifiers. In summary, this article provides insights on how to create defect datasets that mitigate the negative effect of dormant defects on defect prediction.
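The countermeasure evaluated above can be sketched as follows. This is an illustrative sketch only, not the authors' implementation; the function name and the flat record layout (one row per class per release) are hypothetical:

```python
def drop_snoring_candidates(training_set, last_release):
    """Drop classes labeled not-defective in the most recent release of
    the training window. Their labels are the least trustworthy: a
    dormant defect may simply not have surfaced yet ("snoring")."""
    return [row for row in training_set
            if not (row["release"] == last_release
                    and not row["defective"])]

# Hypothetical training data: one record per class per release.
training = [
    {"class": "A", "release": 1, "defective": True},
    {"class": "B", "release": 2, "defective": False},  # possibly snoring
    {"class": "C", "release": 2, "defective": True},
]
cleaned = drop_snoring_candidates(training, last_release=2)
# Class B is dropped; confirmed-defective class C in release 2 is kept.
```

Note that only *not-defective* labels from the last release are dropped: a class labeled defective is confirmed by the bug report itself, whereas a defect-free label can still be overturned by a dormant defect.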


2022 ◽  
Vol 12 (1) ◽  
pp. 493
Author(s):  
Mahesha Pandit ◽  
Deepali Gupta ◽  
Divya Anand ◽  
Nitin Goyal ◽  
Hani Moaiteq Aljahdali ◽  
...  

Using artificial intelligence (AI) based software defect prediction (SDP) techniques in the software development process helps isolate defective software modules, count the number of software defects, and identify risky code changes. However, software development teams are often unaware of SDP and do not have easy access to relevant models and techniques. The major reason for this problem appears to be the fragmentation of SDP research and SDP practice. To unify SDP research and practice, this article introduces a cloud-based, global, unified AI framework for SDP called DePaaS (Defects Prediction as a Service). The article describes the usage context, use cases, and detailed architecture of DePaaS and presents the first response of industry practitioners to it. In a first-of-its-kind survey, the article captures practitioners' belief in SDP and in the ability of DePaaS to solve some of the known challenges in the field of software defect prediction. The article also provides a novel process for SDP, a detailed description of the structure and behaviour of the DePaaS architecture components, the six best SDP models offered by DePaaS, a description of the algorithms that recommend SDP models, feature sets, and tunable parameters, and a rich set of challenges involved in building, using, and sustaining DePaaS. With the contributions of this article, SDP research and practice could be unified, enabling the construction and use of more pragmatic defect prediction models and leading to an increase in the efficiency of software testing.


Author(s):  
Elisa Verna ◽  
Gianfranco Genta ◽  
Maurizio Galetto ◽  
Fiorenzo Franceschini

Abstract Typically, monitoring the quality characteristics of highly personalized products is a difficult task due to the lack of experimental data. This is the typical case of processes where the production volume continues to shrink due to the growing complexity and customization of products, thus requiring low-volume production. This paper presents a novel approach to statistically monitoring the defects-per-unit (DPU) of assembled products based on the use of defect prediction models. The innovative aspect of this DPU-chart is that, unlike conventional SPC charts, which require preliminary experimental data to estimate the control limits (phase I), it is constructed using a predictive model based on a priori knowledge of the DPU. This defect prediction model is based on the structural complexity of the assembled product. By avoiding phase I, the novel approach may be of interest to researchers and practitioners for speeding up the chart's construction phase, especially in low-volume production. The description of the method is supported by a real industrial case study in the electromechanical field.
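The core idea of deriving control limits from a predicted DPU, rather than from phase-I data, can be illustrated with textbook-style u-chart limits. This sketch assumes Poisson-distributed defect counts and standard three-sigma limits, which is the conventional SPC formulation; the paper's complexity-based prediction model and exact limit derivation may differ:

```python
import math

def dpu_chart_limits(predicted_dpu, sample_size=1):
    """Three-sigma control limits for a defects-per-unit chart,
    computed from an a-priori predicted DPU instead of phase-I data.
    Assumes defect counts are Poisson, so the per-unit variance equals
    the DPU divided by the sample size (standard u-chart form)."""
    center = predicted_dpu
    sigma = math.sqrt(predicted_dpu / sample_size)
    ucl = center + 3 * sigma
    lcl = max(0.0, center - 3 * sigma)  # a defect rate cannot go below 0
    return lcl, center, ucl

# Hypothetical values: a model predicts 0.25 DPU; samples of 4 units.
lcl, cl, ucl = dpu_chart_limits(predicted_dpu=0.25, sample_size=4)
# → LCL 0.0, CL 0.25, UCL 1.0
```

Because the limits come entirely from the predicted DPU, the chart can be in place from the very first assembled unit, which is precisely the advantage in low-volume production.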


2021 ◽  
Vol 24 (68) ◽  
pp. 72-88
Author(s):  
Mohammad Alshayeb ◽  
Mashaan A. Alshammari

The ongoing development of computer systems requires massive software projects. Running the components of these huge projects for testing purposes can be costly; therefore, parameter estimation can be used instead. Software defect prediction models are crucial for software quality assurance. This study investigates the impact of dataset size and feature selection algorithms on software defect prediction models. We use two approaches to build software defect prediction models: a statistical approach and a machine learning approach with support vector machines (SVMs). The fault prediction model was built on four datasets of different sizes, and four feature selection algorithms were used. We found that applying the SVM defect prediction model to datasets with a reduced set of metrics as features may enhance the accuracy of the fault prediction model; it also directs testing effort toward the most influential set of metrics. We also found that the running time of the SVM fault prediction model does not scale consistently with dataset size, so having fewer metrics does not guarantee a shorter execution time. From the experiments, we found that dataset size has a direct influence on the SVM fault prediction model; however, the reduced datasets performed the same as, or slightly worse than, the original datasets.
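The metric-reduction step the study relies on can be sketched with a simple filter-style selector: rank metric columns by their absolute correlation with the defect label and keep the top k. This stands in for the four (unnamed here) selection algorithms compared in the study and is not the authors' implementation; the data is hypothetical:

```python
import math

def pearson(x, y):
    """Plain Pearson correlation, stdlib only."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def select_top_metrics(rows, labels, k):
    """Filter-style feature selection: score each metric column by
    |correlation| with the defect label, keep the k best columns."""
    n_features = len(rows[0])
    scores = [(abs(pearson([r[j] for r in rows], labels)), j)
              for j in range(n_features)]
    scores.sort(reverse=True)
    return sorted(j for _, j in scores[:k])

# Hypothetical metrics: column 0 tracks the label, column 1 is noise.
rows = [(1, 5), (2, 3), (3, 8), (4, 1)]
labels = [0, 0, 1, 1]
kept = select_top_metrics(rows, labels, k=1)  # keeps column 0
```

The reduced column set would then be fed to the SVM (e.g., scikit-learn's `sklearn.svm.SVC`) in place of the full metric suite.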


2021 ◽  
Author(s):  
Elisa Verna ◽  
Gianfranco Genta ◽  
Maurizio Galetto ◽  
Fiorenzo Franceschini

Abstract Typically, monitoring the quality characteristics of highly personalized products is a difficult task due to the lack of experimental data. This is the typical case of processes where the production volume continues to shrink due to the growing complexity and customization of products, thus requiring low-volume production. This paper presents a novel approach to statistically monitoring the Defects Per Unit (DPU) of assembled products based on the use of defect prediction models. Unlike traditional control charts, which require preliminary experimental data to estimate the control limits (phase I), the proposed DPU-chart is constructed using a predictive model based on a priori knowledge of the DPU. This defect prediction model is built on the structural complexity of the assembled product. The novel approach may be of interest to researchers and practitioners for speeding up the construction of the chart, especially in low-volume production, where the amount of data is limited. The description of the method is supported by a real industrial case study in the electromechanical field.


Author(s):  
Elisa Verna ◽  
Gianfranco Genta ◽  
Maurizio Galetto ◽  
Fiorenzo Franceschini

Abstract Designing appropriate quality inspections in manufacturing processes has always been a challenge for maintaining competitiveness in the market. Recent studies have focused on the design of appropriate in-process inspection strategies for assembly processes based on probabilistic models. Despite this general interest, a practical tool for assessing the adequacy of alternative inspection strategies is still lacking. This paper proposes a general framework to assess the effectiveness and cost of inspection strategies. In detail, defect probabilities obtained from prediction models are combined with inspection variables to define a pair of indicators for developing an inspection strategy map. Such a map acts as an analysis tool, enabling positioning assessment and benchmarking of the strategies adopted by manufacturing companies, and also as a design tool for achieving desired targets. The approach can assist designers of manufacturing processes, particularly low-volume productions, in the early stages of inspection planning.
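The indicator pair behind such a map can be sketched as an (effectiveness, cost) point per strategy. Everything here is an illustrative assumption, not the paper's definitions: `p` is a workstation's defect probability from a prediction model, `e` the probability its inspection catches a present defect, and `c` its inspection cost.

```python
def strategy_indicators(stations):
    """One point on an inspection strategy map: expected number of
    escaping (undetected) defects per unit, and total inspection cost.
    A defect at a station escapes when it occurs (p) and the local
    inspection misses it (1 - e)."""
    escaping = sum(s["p"] * (1 - s["e"]) for s in stations)
    cost = sum(s["c"] for s in stations)
    return escaping, cost

# Hypothetical two-station assembly strategy.
stations = [
    {"p": 0.02, "e": 0.9, "c": 1.5},
    {"p": 0.05, "e": 0.8, "c": 0.7},
]
d_un, c_tot = strategy_indicators(stations)
```

Plotting each candidate strategy at its (escaping defects, cost) coordinates gives the map's positioning and benchmarking use: strategies closer to the origin dominate on both axes.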


2021 ◽  
Vol 10 (2) ◽  
pp. 1063-1070
Author(s):  
Ruchika Malhotra ◽  
Anjali Sharma

In prediction modeling, the choice of features selected from the original feature set is crucial for accuracy and model interpretability. Feature ranking techniques rank features by their importance, but there is no consensus on the number of features at which to cut off. It therefore becomes important to identify a threshold value or range at which redundant features can be removed. In this work, an empirical study is conducted to identify a threshold benchmark for feature ranking algorithms. Experiments are conducted on the Apache Click dataset with six popular ranker techniques and six machine learning techniques, to deduce a relationship between the total number of input features (N) and the threshold range. The area-under-the-curve analysis shows that approximately 33-50% of the features are necessary and sufficient to yield a reasonable performance measure, with a variance of 2%, in defect prediction models. Further, we find that log2(N) as the ranker threshold value represents the lower limit of the range.
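The reported relationship can be turned into a simple rule of thumb: after ranking, keep somewhere between log2(N) features (the lower limit found above) and about half of them (the upper end of the 33-50% range). The function below is an illustrative reading of that result, not the authors' procedure:

```python
import math

def ranker_threshold_range(n_features):
    """Suggested cut-off range for a ranked feature list of size N:
    lower limit log2(N), upper limit ~50% of N (the top of the
    33-50% band reported in the study)."""
    lower = max(1, round(math.log2(n_features)))
    upper = n_features // 2
    return lower, upper

# E.g., with 64 ranked features, keep between 6 and 32 of them.
lo, hi = ranker_threshold_range(64)
```

A practitioner would then sweep cut-offs within this range (rather than over all N values) when tuning a defect prediction model.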

