Software defect prediction using relational association rule mining

Software defect prediction, if effective, enables developers to distribute their testing efforts efficiently and focus on defect-prone modules. Testing all modules is very resource consuming when the defects lie in only a fraction of them. Information about the fault-proneness of classes and methods can be used to develop new strategies that help reduce overall development cost and increase customer satisfaction. Several machine learning strategies have been used in the recent past to identify defective modules. These models are built using publicly available historical software defect data sets. Most of the proposed techniques are not able to deal with the class imbalance problem efficiently. Therefore, it is necessary to develop a prediction model which consists of small, simple and comprehensible rules. Considering these facts, in this paper, the authors propose a novel defect prediction approach named GUHA based Classification Association Rule Mining algorithm (G-CARM), where “GUHA” stands for General Unary Hypothesis Automaton. The G-CARM approach is primarily based on Classification Association Rule Mining and deploys a two-stage process involving attribute discretization and rule generation using GUHA. GUHA is among the oldest methods of pattern mining, yet it is very powerful. The basic idea of the GUHA procedure is to mine the interesting attribute patterns that indicate defect proneness. The new method has been compared against five other models reported in recent literature, viz. Naive Bayes, Support Vector Machine, RIPPER, J48 and Nearest Neighbour classifier, using several measures, including AUC and probability of detection. The experimental results indicate that the prediction performance of the G-CARM approach is better than that of the other prediction approaches. The authors' approach achieved 76% mean recall and 83% mean precision for defective modules and 93% mean recall and 83% mean precision for non-defective modules on the CM1, KC1, KC2 and Eclipse data sets. Furthermore, the defect rule generation process often generates a large number of rules, which require considerable effort to use as a defect predictor; hence, a rule subset selection process is also proposed to select the best set of rules according to the requirements. Evaluation criteria for defect prediction, such as sensitivity, specificity and precision, often compete against each other. It is therefore important to use multi-objective optimization algorithms for selecting prediction rules. In this paper the authors report prediction rules that are Pareto efficient, in the sense that no further improvement in the rule set is possible without sacrificing some performance criterion. The Non-Dominated Sorting Genetic Algorithm has been used to find the Pareto front and the defect prediction rules.
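The notion of Pareto efficiency used in the abstract above can be illustrated with a minimal sketch. It is not the authors' Non-Dominated Sorting Genetic Algorithm: it only filters a fixed set of rules down to the non-dominated ones. The rule names and their (sensitivity, specificity, precision) scores are hypothetical values invented for illustration.

```python
from typing import Dict, List, Tuple

# (sensitivity, specificity, precision); higher is better for each criterion.
Scores = Tuple[float, float, float]


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is at least as good as b on every criterion and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(rules: Dict[str, Scores]) -> List[str]:
    """Return the names of the rules that no other rule dominates."""
    return [
        name
        for name, score in rules.items()
        if not any(
            dominates(other, score)
            for other_name, other in rules.items()
            if other_name != name
        )
    ]


# Hypothetical rule scores, not taken from the paper.
example_rules = {
    "rule_a": (0.80, 0.70, 0.75),
    "rule_b": (0.60, 0.90, 0.70),
    "rule_c": (0.70, 0.65, 0.70),  # worse than rule_a on every criterion
}
front = pareto_front(example_rules)
print(front)
```

Here `rule_a` and `rule_b` trade sensitivity against specificity, so neither can replace the other without giving up one criterion, while `rule_c` is dropped because `rule_a` is better on all three.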

INTEGRATING ACTION-BASED DEFECT PREDICTION TO PROVIDE RECOMMENDATIONS FOR DEFECT ACTION CORRECTION

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194013500022 ◽

2013 ◽

Vol 23 (02) ◽

pp. 147-172

Author(s):

CHING-PAO CHANG

Keyword(s):

Prediction Model ◽

Association Rule ◽

Association Rule Mining ◽

Software Process ◽

Negative Association ◽

Defect Prediction ◽

Rule Mining ◽

Mining Technique ◽

Software Defect ◽

Recommendations For Action

Reducing software defects is an essential activity for Software Process Improvement. The Action-Based Defect Prediction (ABDP) approach fragments the software process into actions, and builds software defect prediction models using data collected from the execution of actions and reported defects. Though the ABDP approach can be applied to predict possible defects in subsequent actions, the efficiency of corrections is dependent on the skill and knowledge of the stakeholders. To address this problem, this study proposes the Action Correction Recommendation (ACR) model to provide recommendations for action correction, using the Negative Association Rule mining technique. In addition to applying the association rule mining technique to build a High Defect Prediction Model (HDPM) to identify high defect action, the ACR builds a Low Defect Prediction Model (LDPM). For a submitted action, each HDPM rule used to predict the action as a high defect action and the LDPM rules are analyzed using negative association rule mining to spot the rule items with different characteristics in HDPM and LDPM rules. This information not only identifies the attributes required for corrections, but also provides a range (or a value) to facilitate the high defect action corrections. This study applies the ACR approach to a business software project to validate the efficiency of the proposed approach. The results show that the recommendations obtained can be applied to decrease software defect removal efforts.
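The idea of spotting rule items that differ between HDPM and LDPM rules can be sketched as follows. This is a plain dictionary comparison, not negative association rule mining as used in the study, and the attribute names and values are hypothetical.

```python
from typing import Dict, List

# A rule is modelled as a mapping from attribute name to attribute value (its rule items).
Rule = Dict[str, str]


def recommend_corrections(high_rule: Rule, low_rules: List[Rule]) -> Dict[str, List[str]]:
    """For each attribute of a high-defect rule, collect differing values seen in low-defect rules."""
    suggestions: Dict[str, List[str]] = {}
    for attribute, high_value in high_rule.items():
        for low_rule in low_rules:
            low_value = low_rule.get(attribute)
            # Only attributes present in a low-defect rule with a different value are candidates.
            if low_value is not None and low_value != high_value:
                values = suggestions.setdefault(attribute, [])
                if low_value not in values:
                    values.append(low_value)
    return suggestions


# Hypothetical rules, invented for illustration only.
high = {"effort": "low", "reviewer_experience": "junior"}
lows = [
    {"effort": "medium", "reviewer_experience": "junior"},
    {"effort": "high"},
]
suggestions = recommend_corrections(high, lows)
print(suggestions)
```

In this toy case only `effort` is flagged, with `medium` and `high` as candidate target values, because `reviewer_experience` takes the same value in the high-defect rule and in the low-defect rule that mentions it.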

Improved Hybrid Genetic Based Rule Mining Algorithm for Software Defect Prediction

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v7i4.11881195 ◽

2019 ◽

Vol 7 (4) ◽

pp. 1188-1195

Author(s):

S. Maheswari ◽

R. Ganesan ◽

K. Chitra

Keyword(s):

Defect Prediction ◽

Software Defect Prediction ◽

Rule Mining ◽

Software Defect ◽

Mining Algorithm

Software Defect Prediction Using Software Metrics with Naïve Bayes and Rule Mining Association Methods

2019 5th International Conference on Science and Technology (ICST) ◽

10.1109/icst47872.2019.9166448 ◽

2019 ◽

Author(s):

Fernando Maruli Tua ◽

Wikan Danar Sunindyo

Keyword(s):

Software Metrics ◽

Naïve Bayes ◽

Defect Prediction ◽

Software Defect Prediction ◽

Rule Mining ◽

Software Defect

Too trivial to test? An inverse view on defect prediction to identify methods with low fault risk

PeerJ Computer Science ◽

10.7717/peerj-cs.187 ◽

2019 ◽

Vol 5 ◽

pp. e187 ◽

Cited By ~ 1

Author(s):

Rainer Niedermayr ◽

Tobias Röhm ◽

Stefan Wagner

Keyword(s):

Empirical Study ◽

Association Rule ◽

Association Rule Mining ◽

Defect Prediction ◽

Efficient Allocation ◽

Rule Mining ◽

Scarce Resources ◽

Development Teams ◽

Code Metrics ◽

Cross Project

Background. Test resources are usually limited and therefore it is often not possible to completely test an application before a release. To cope with the problem of scarce resources, development teams can apply defect prediction to identify fault-prone code regions. However, defect prediction tends to low precision in cross-project prediction scenarios. Aims. We take an inverse view on defect prediction and aim to identify methods that can be deferred when testing because they contain hardly any faults due to their code being “trivial”. We expect that characteristics of such methods might be project-independent, so that our approach could improve cross-project predictions. Method. We compute code metrics and apply association rule mining to create rules for identifying methods with low fault risk (LFR). We conduct an empirical study to assess our approach with six Java open-source projects containing precise fault data at the method level. Results. Our results show that inverse defect prediction can identify approx. 32–44% of the methods of a project to have a LFR; on average, they are about six times less likely to contain a fault than other methods. In cross-project predictions with larger, more diversified training sets, identified methods are even 11 times less likely to contain a fault. Conclusions. Inverse defect prediction supports the efficient allocation of test resources by identifying methods that can be treated with less priority in testing activities and is well applicable in cross-project prediction scenarios.
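As a rough illustration of how an association rule for low fault risk might be scored, the sketch below computes the support and confidence of a single rule of the form "antecedent implies not faulty". The metrics (`sloc`, `has_branches`), the thresholds and the method records are hypothetical and are not the metrics or data used in the study.

```python
from typing import Callable, Dict, List, Tuple

# A method record holds code metrics plus a fault label.
Method = Dict[str, object]


def rule_stats(
    methods: List[Method], antecedent: Callable[[Method], bool]
) -> Tuple[float, float]:
    """Support and confidence of the rule: antecedent(method) => method is not faulty."""
    matched = [m for m in methods if antecedent(m)]
    hits = [m for m in matched if not m["faulty"]]
    support = len(hits) / len(methods) if methods else 0.0
    confidence = len(hits) / len(matched) if matched else 0.0
    return support, confidence


# Hypothetical method records, invented for illustration only.
methods = [
    {"sloc": 2, "has_branches": False, "faulty": False},
    {"sloc": 3, "has_branches": False, "faulty": False},
    {"sloc": 1, "has_branches": False, "faulty": True},
    {"sloc": 20, "has_branches": True, "faulty": True},
    {"sloc": 15, "has_branches": True, "faulty": False},
]


def is_trivial(m: Method) -> bool:
    """Hypothetical antecedent: very short methods without branching."""
    return m["sloc"] <= 3 and not m["has_branches"]


support, confidence = rule_stats(methods, is_trivial)
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

A rule would typically be kept only if both values exceed chosen minimum thresholds; the choice of thresholds is left open here.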

Adaptive PSO Based Association Rule Mining Technique for Software Defect Classification Using ANN

Procedia Computer Science ◽

10.1016/j.procs.2015.02.041 ◽

2015 ◽

Vol 46 ◽

pp. 432-442 ◽

Cited By ~ 7

Author(s):

B. Dhanalaxmi ◽

G. Apparao Naidu ◽

K. Anuradha

Keyword(s):

Association Rule ◽

Association Rule Mining ◽

Defect Classification ◽

Rule Mining ◽

Mining Technique ◽

Software Defect ◽

Adaptive PSO

Software Defect Prediction Based on Association Rule Classification

SSRN Electronic Journal ◽

10.2139/ssrn.1785381 ◽

2011 ◽

Cited By ~ 5

Author(s):

Ma Baojun ◽

Karel Dejaeger ◽

Jan Vanthienen ◽

Bart Baesens

Keyword(s):

Association Rule ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect

Too trivial to test? An inverse view on defect prediction to identify methods with low fault risk

10.7287/peerj.preprints.27304v1 ◽

2018 ◽

Author(s):

Rainer Niedermayr ◽

Tobias Röhm ◽

Stefan Wagner

Keyword(s):

Empirical Study ◽

Association Rule ◽

Association Rule Mining ◽

Defect Prediction ◽

Efficient Allocation ◽

Rule Mining ◽

Scarce Resources ◽

Development Teams ◽

Code Metrics ◽

Cross Project

Background. Test resources are usually limited and therefore it is often not possible to completely test an application before a release. To cope with the problem of scarce resources, development teams can apply defect prediction to identify fault-prone code regions. However, defect prediction tends to low precision in cross-project prediction scenarios. Aims. We take an inverse view on defect prediction and aim to identify methods that can be deferred when testing because they contain hardly any faults due to their code being "trivial". We expect that characteristics of such methods might be project-independent, so that our approach could improve cross-project predictions. Method. We compute code metrics and apply association rule mining to create rules for identifying methods with low fault risk. We conduct an empirical study to assess our approach with six Java open-source projects containing precise fault data at the method level. Results. Our results show that inverse defect prediction can identify approx. 32-44% of the methods of a project to have a low fault risk; on average, they are about six times less likely to contain a fault than other methods. In cross-project predictions with larger, more diversified training sets, identified methods are even eleven times less likely to contain a fault. Conclusions. Inverse defect prediction supports the efficient allocation of test resources by identifying methods that can be treated with less priority in testing activities and is well applicable in cross-project prediction scenarios.
