A novel software defect prediction based on atomic class-association rule mining

Software defect prediction, if effective, enables developers to distribute their testing effort efficiently and focus on defect-prone modules. Testing every module is very resource-consuming when the defects lie in only a fraction of them. Information about the fault-proneness of classes and methods can be used to develop new strategies that help reduce overall development cost and increase customer satisfaction. Several machine learning strategies have been used in the recent past to identify defective modules. These models are built using publicly available historical software defect data sets. Most of the proposed techniques are not able to deal with the class imbalance problem efficiently. It is therefore necessary to develop a prediction model that consists of small, simple and comprehensible rules. Considering these facts, in this paper the authors propose a novel defect prediction approach named GUHA based Classification Association Rule Mining algorithm (G-CARM), where “GUHA” stands for General Unary Hypothesis Automaton. The G-CARM approach is primarily based on Classification Association Rule Mining and deploys a two-stage process involving attribute discretization and rule generation using GUHA. GUHA is among the oldest methods of pattern mining, yet a very powerful one. The basic idea of the GUHA procedure is to mine interesting attribute patterns that indicate defect proneness. The new method has been compared against five other models reported in recent literature, viz. Naive Bayes, Support Vector Machine, RIPPER, J48 and Nearest Neighbour classifier, using several measures, including AUC and probability of detection. The experimental results indicate that the prediction performance of the G-CARM approach is better than that of the other prediction approaches.
The authors' approach achieved 76% mean recall and 83% mean precision for defective modules and 93% mean recall and 83% mean precision for non-defective modules on the CM1, KC1, KC2 and Eclipse data sets. Further, the defect rule generation process often generates a large number of rules, which require considerable effort to use as a defect predictor; hence, a rule subset selection process is also proposed to select the best set of rules according to the requirements. Evaluation criteria for defect prediction, such as sensitivity, specificity and precision, often compete against each other. It is therefore important to use multi-objective optimization algorithms for selecting prediction rules. In this paper the authors report prediction rules that are Pareto efficient in the sense that no further improvement in the rule set is possible without sacrificing some performance criterion. The Non-Dominated Sorting Genetic Algorithm has been used to find the Pareto front and the defect prediction rules.
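
As a rough illustration of what Pareto efficiency means for rule selection, the sketch below keeps only the candidate rules that no other rule dominates on (sensitivity, specificity, precision). The rule names and scores are invented for illustration and are not taken from the paper. A plain pairwise dominance check stands in here for the Non-Dominated Sorting Genetic Algorithm, which additionally evolves the candidate set.

```python
from typing import Dict, List, Tuple

# (sensitivity, specificity, precision); higher is better on every axis.
Scores = Tuple[float, float, float]


def dominates(a: Scores, b: Scores) -> bool:
    """True if a is at least as good as b on every criterion and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(rules: Dict[str, Scores]) -> List[str]:
    """Return the names of rules that are not dominated by any other rule."""
    return [
        name
        for name, scores in rules.items()
        if not any(
            dominates(other_scores, scores)
            for other, other_scores in rules.items()
            if other != name
        )
    ]


# Hypothetical candidate rules with made-up scores.
candidate_rules = {
    "r1": (0.80, 0.70, 0.75),
    "r2": (0.70, 0.90, 0.80),
    "r3": (0.65, 0.85, 0.78),  # worse than r2 on all three criteria
    "r4": (0.90, 0.60, 0.70),
}

front = pareto_front(candidate_rules)
print(front)
```

Each rule left on the front trades one criterion against another, so choosing among them depends on the requirements at hand, for example favouring sensitivity when missed defects are costly.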

INTEGRATING ACTION-BASED DEFECT PREDICTION TO PROVIDE RECOMMENDATIONS FOR DEFECT ACTION CORRECTION

International Journal of Software Engineering and Knowledge Engineering ◽ 10.1142/s0218194013500022 ◽ 2013 ◽ Vol 23 (02) ◽ pp. 147-172

Author(s): CHING-PAO CHANG

Keyword(s): Prediction Model ◽ Association Rule ◽ Association Rule Mining ◽ Software Process ◽ Negative Association ◽ Defect Prediction ◽ Rule Mining ◽ Mining Technique ◽ Software Defect ◽ Recommendations For Action

Reducing software defects is an essential activity for Software Process Improvement. The Action-Based Defect Prediction (ABDP) approach fragments the software process into actions and builds software defect prediction models using data collected from the execution of actions and from reported defects. Though the ABDP approach can be applied to predict possible defects in subsequent actions, the efficiency of corrections depends on the skill and knowledge of the stakeholders. To address this problem, this study proposes the Action Correction Recommendation (ACR) model, which provides recommendations for action correction using the Negative Association Rule mining technique. In addition to applying association rule mining to build a High Defect Prediction Model (HDPM) that identifies high-defect actions, the ACR builds a Low Defect Prediction Model (LDPM). For a submitted action, each HDPM rule that predicts the action as a high-defect action is analyzed together with the LDPM rules using negative association rule mining to spot the rule items whose characteristics differ between HDPM and LDPM rules. This information not only identifies the attributes requiring correction, but also provides a range (or a value) to guide the correction of high-defect actions. This study applies the ACR approach to a business software project to validate the efficiency of the proposed approach. The results show that the recommendations obtained can be applied to decrease software defect removal efforts.
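
The comparison of HDPM and LDPM rule items can be pictured with a much simpler stand-in. The sketch below assumes each rule is a mapping from attribute to value (or range) and reports attributes whose value in a high-defect rule differs from the value in some low-defect rule. It is a plain item-by-item comparison, not negative association rule mining, and the attribute names and values are hypothetical.

```python
from typing import Dict, List, Tuple

Rule = Dict[str, str]  # attribute -> value or range (hypothetical encoding)


def correction_hints(high_rule: Rule, low_rules: List[Rule]) -> List[Tuple[str, str, str]]:
    """Return (attribute, value in high-defect rule, value in low-defect rule) triples
    for attributes that appear in both rules with different values."""
    hints: List[Tuple[str, str, str]] = []
    for low_rule in low_rules:
        for attribute, high_value in high_rule.items():
            low_value = low_rule.get(attribute)
            if low_value is not None and low_value != high_value:
                hint = (attribute, high_value, low_value)
                if hint not in hints:  # avoid repeating the same suggestion
                    hints.append(hint)
    return hints


# Hypothetical rule that flagged a submitted action as high defect.
high_defect_rule = {"effort_hours": ">40", "action_type": "coding", "reviewer_count": "0"}

# Hypothetical low-defect rules for similar actions.
low_defect_rules = [
    {"action_type": "coding", "reviewer_count": ">=2"},
    {"effort_hours": "<=16", "action_type": "coding"},
]

hints = correction_hints(high_defect_rule, low_defect_rules)
for attribute, current, suggested in hints:
    print(f"{attribute}: currently {current}, low-defect rules suggest {suggested}")
```

Each hint names an attribute to correct and a target value or range, which mirrors the kind of recommendation the abstract describes.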

Analysis of Various Interestingness Measures in Class Association Rule Mining

SICE Journal of Control Measurement and System Integration ◽ 10.9746/jcmsi.4.295 ◽ 2011 ◽ Vol 4 (4) ◽ pp. 295-304 ◽ Cited By ~ 6

Author(s): Xianneng LI ◽ Shingo MABU ◽ Huiyu ZHOU ◽ Kaoru SHIMADA ◽ Kotaro HIRASAWA

Keyword(s): Association Rule ◽ Association Rule Mining ◽ Rule Mining ◽ Interestingness Measures ◽ Class Association Rule

Class Association Rule Mining with Multiple Imbalanced Attributes

AI 2007: Advances in Artificial Intelligence - Lecture Notes in Computer Science ◽ 10.1007/978-3-540-76928-6_100 ◽ 2007 ◽ pp. 827-831 ◽ Cited By ~ 2

Author(s): Huaifeng Zhang ◽ Yanchang Zhao ◽ Longbing Cao ◽ Chengqi Zhang

Keyword(s): Association Rule ◽ Association Rule Mining ◽ Rule Mining ◽ Class Association Rule

NEW DISCRETE CROW SEARCH ALGORITHM FOR CLASS ASSOCIATION RULE MINING

International Journal of Swarm Intelligence Research ◽ 10.4018/ijsir.2022010109 ◽ 2022 ◽ Vol 13 (1) ◽ pp. 0-0

Keyword(s): Association Rule ◽ Association Rule Mining ◽ Search Algorithm ◽ Classification Problem ◽ Discrete Version ◽ Classification Algorithms ◽ Associative Classification ◽ Rule Mining ◽ Rule Based ◽ Class Association Rule

Associative Classification (AC), or Class Association Rule (CAR) mining, is a very efficient method for the classification problem. It can build comprehensible classification models in the form of a list of simple IF-THEN classification rules from the available data. In this paper, the authors present a new and improved discrete version of the Crow Search Algorithm (CSA), called NDCSA-CAR, to mine Class Association Rules. The goal is to improve data classification accuracy and the simplicity of classifiers. The authors applied the proposed NDCSA-CAR algorithm to eleven benchmark datasets and compared its results with traditional algorithms and recent well-known rule-based classification algorithms. The experimental results show that the proposed algorithm outperformed the other rule-based approaches on all evaluated criteria.
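
To make the "list of simple IF-THEN classification rules" concrete, here is a minimal sketch of how such a rule list might be applied to a record: rules are tried in order, the first whose conditions all hold decides the class, and a default class covers records that match nothing. The rules, attributes and default are made up for illustration. The sketch shows only how a finished rule list classifies and says nothing about how NDCSA-CAR searches for the rules.

```python
from typing import Dict, List, Tuple

# A rule is (antecedent, class label): IF every attribute matches THEN predict the label.
Rule = Tuple[Dict[str, str], str]


def classify(record: Dict[str, str], rules: List[Rule], default: str) -> str:
    """Return the label of the first rule whose antecedent matches the record."""
    for antecedent, label in rules:
        if all(record.get(attr) == value for attr, value in antecedent.items()):
            return label
    return default  # no rule fired


# Hypothetical ordered rule list.
rules: List[Rule] = [
    ({"outlook": "sunny", "humidity": "high"}, "no"),
    ({"outlook": "overcast"}, "yes"),
]

prediction = classify({"outlook": "overcast", "humidity": "high"}, rules, default="yes")
print(prediction)
```

Because each prediction traces back to a single readable rule, shorter rule lists are easier to inspect, which is why the abstract treats classifier simplicity as a goal alongside accuracy.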

Rare Class Association Rule Mining with Multiple Imbalanced Attributes

Rare Association Rule Mining and Knowledge Discovery ◽ 10.4018/978-1-60566-754-6.ch005 ◽ 2010 ◽ pp. 66-75 ◽ Cited By ~ 2

Author(s): Huaifeng Zhang ◽ Yanchang Zhao ◽ Longbing Cao ◽ Chengqi Zhang ◽ Hans Bohlscheid

Keyword(s): Association Rule ◽ Association Rule Mining ◽ Uniform Space ◽ Rule Mining ◽ Target Class ◽ Left Hand ◽ The Social ◽ Rare Class ◽ Class Association Rule ◽ The Right

In this chapter, the authors propose a novel framework for rare class association rule mining. In each class association rule, the right-hand side is a target class while the left-hand side may contain one or more attributes. The algorithm focuses on multiple imbalanced attributes on the left-hand side. In the proposed framework, the rules with and without imbalanced attributes are processed in parallel. The rules without imbalanced attributes are mined through a standard algorithm, while the rules with imbalanced attributes are mined based on newly defined measurements. Through a simple transformation, these measurements can be placed in a uniform space so that only a few parameters need to be specified by the user. In the case study, the proposed algorithm is applied in the social security field. Although some attributes are severely imbalanced, rules with a minority of imbalanced attributes have been mined efficiently.
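
For orientation, the standard measures used by conventional class association rule mining can be sketched as follows: support is the fraction of records that satisfy both the left-hand side and the target class, and confidence is that count divided by the number of records satisfying the left-hand side. The records and attribute names below are invented, and the chapter's newly defined measurements for imbalanced attributes are not reproduced here.

```python
from typing import Dict, List, Tuple

Record = Tuple[Dict[str, str], str]  # (attribute values, class label)


def support_confidence(
    records: List[Record], antecedent: Dict[str, str], target: str
) -> Tuple[float, float]:
    """Compute standard support and confidence for the rule antecedent -> target."""
    matches_lhs = [
        label
        for attrs, label in records
        if all(attrs.get(a) == v for a, v in antecedent.items())
    ]
    both = sum(1 for label in matches_lhs if label == target)
    support = both / len(records) if records else 0.0
    confidence = both / len(matches_lhs) if matches_lhs else 0.0
    return support, confidence


# Hypothetical records: attribute values on the left, class label on the right.
records: List[Record] = [
    ({"benefit": "A", "region": "north"}, "debt"),
    ({"benefit": "A", "region": "south"}, "debt"),
    ({"benefit": "A", "region": "north"}, "no_debt"),
    ({"benefit": "B", "region": "north"}, "no_debt"),
    ({"benefit": "B", "region": "south"}, "no_debt"),
]

support, confidence = support_confidence(records, {"benefit": "A"}, "debt")
print(f"support={support:.2f}, confidence={confidence:.2f}")
```

With a rare attribute value on the left-hand side, support computed this way is small by construction, which is the kind of situation that motivates separate measurements for imbalanced attributes.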
