A Practical Software Quality Classification Model Using Genetic Programming

Advances in Machine Learning Applications in Software Engineering ◽

10.4018/978-1-59140-941-1.ch009 ◽

2011 ◽

pp. 208-236

Author(s):

Yi Liu ◽

Taghi M. Khoshgoftaar

Keyword(s):

Genetic Programming ◽

Software Quality ◽

Measurement Data ◽

Classification Model ◽

Quality Model ◽

Estimation Model ◽

Model Classification ◽

Quality Classification ◽

Industrial Software ◽

Program Modules

A software quality estimation model is an important tool for a software quality assurance initiative. Software quality classification models indicate which program modules are fault-prone (FP) and which are not fault-prone (NFP). Such models assume that enough resources are available to improve the quality of every module predicted as FP. A quality-based ranking of program modules, used in conjunction with a classification model, has practical benefits because priority can be given to the modules that are more fault-prone. However, traditional classification techniques cannot produce such a ranking. We present a novel software quality classification model based on multi-objective optimization with genetic programming (GP). More specifically, the GP-based model provides both a classification (FP or NFP) and a quality-based ranking of the program modules. The quality factor used to rank the modules is typically the number of faults or defects associated with a module. Genetic programming is ideally suited for optimizing multiple criteria simultaneously. In our study, three performance criteria are used to evolve a GP-based software quality model: classification performance, module ranking, and size of the GP tree. The third criterion addresses a commonly observed phenomenon in GP, namely bloating. The proposed model is investigated with case studies of software measurement data obtained from two industrial software systems.
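The three criteria named in the abstract above can be illustrated with a minimal Python sketch. The abstract does not say how each criterion is measured or how the three are combined, so everything here is an assumption made for illustration: the function name `evaluate`, the threshold on the model's output score, the use of "faults captured in the top fraction of ranked modules" as the ranking measure, and the lexicographic comparison of the resulting tuple.

```python
from typing import List, Tuple


def evaluate(scores: List[float], faults: List[int], tree_size: int,
             threshold: float = 0.0, top_fraction: float = 0.2
             ) -> Tuple[float, float, int]:
    """Return (misclassification rate, ranking loss, tree size); lower is better.

    scores    -- output of a candidate GP tree for each module (hypothetical)
    faults    -- observed number of faults for each module
    tree_size -- node count of the candidate tree, used to penalize bloat
    """
    n = len(scores)

    # Criterion 1: classification. A module is actually FP if it has at
    # least one fault (assumption) and predicted FP if its score exceeds
    # the threshold (assumption).
    actual_fp = [f >= 1 for f in faults]
    predicted_fp = [s > threshold for s in scores]
    errors = sum(a != p for a, p in zip(actual_fp, predicted_fp))
    misclassification_rate = errors / n

    # Criterion 2: ranking. Order modules by score and measure the share
    # of all faults that falls outside the top fraction of the ranking.
    k = max(1, int(round(top_fraction * n)))
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    captured = sum(faults[i] for i in order[:k])
    total = sum(faults)
    ranking_loss = 1.0 - (captured / total if total else 1.0)

    # Criterion 3: tree size, reported as-is so smaller trees win ties.
    return (misclassification_rate, ranking_loss, tree_size)


# Python compares tuples lexicographically, so the better of two
# candidates is simply the smaller tuple (an illustrative choice).
candidate_a = evaluate([3.0, -1.0, 0.5, -2.0, 2.0], [5, 0, 1, 0, 0], tree_size=7)
candidate_b = evaluate([3.0, -1.0, 0.5, -2.0, 2.0], [5, 0, 1, 0, 0], tree_size=15)
best = min(candidate_a, candidate_b)
```

With identical classification and ranking results, the smaller tree is preferred, which is one simple way a size criterion can discourage bloat.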

A Multi-Objective Software Quality Classification Model Using Genetic Programming

IEEE Transactions on Reliability ◽

10.1109/tr.2007.896763 ◽

2007 ◽

Vol 56 (2) ◽

pp. 237-245 ◽

Cited By ~ 25

Author(s):

Taghi M. Khoshgoftaar ◽

Yi Liu

Keyword(s):

Genetic Programming ◽

Software Quality ◽

Classification Model ◽

Multi Objective ◽

Quality Classification

A Practical Software Quality Classification Model Using Genetic Programming

Advances in Machine Learning Applications in Software Engineering ◽

10.4018/9781591409411.ch009 ◽

2011 ◽

Author(s):

Yi Liu ◽

Taghi M. Khoshgoftaar

Keyword(s):

Genetic Programming ◽

Software Quality ◽

Classification Model ◽

Quality Classification

Software quality classification model based on McCabe's complexity measure

Journal of Systems and Software ◽

10.1016/s0164-1212(97)00060-5 ◽

1997 ◽

Vol 38 (1) ◽

pp. 61-69 ◽

Cited By ~ 8

Author(s):

Ryouei Takahashi

Keyword(s):

Software Quality ◽

Complexity Measure ◽

Classification Model ◽

Model Based ◽

Quality Classification

Genetic programming model for software quality classification

Proceedings Sixth IEEE International Symposium on High Assurance Systems Engineering. Special Topic: Impact of Networking ◽

10.1109/hase.2001.966814 ◽

2002 ◽

Cited By ~ 14

Author(s):

Yi Liu ◽

T.M. Khoshgoftaar

Keyword(s):

Genetic Programming ◽

Software Quality ◽

Programming Model ◽

Quality Classification

A Rule-Based Software Quality Classification Model

International Journal of Reliability Quality and Safety Engineering ◽

10.1142/s0218539308003064 ◽

2008 ◽

Vol 15 (03) ◽

pp. 247-259

Author(s):

Taghi M. Khoshgoftaar ◽

Lofton A. Bullard ◽

Kehan Gao

Keyword(s):

Software Quality ◽

Rough Set Theory ◽

Classification Model ◽

Equal Frequency ◽

Rule Based ◽

Functional Aspects ◽

Quality Classification ◽

Proposed Model ◽

Software Modules

A rule-based classification model is presented to identify high-risk software modules. It uses rough set theory to reduce the number of attributes and the equal frequency binning algorithm to partition the values of the attributes. As a result, a set of conjoined Boolean predicates is formed. The model is inherently influenced by the practical needs of the system being modeled, thus allowing the analyst to determine which rules are used to classify modules as fault-prone or not fault-prone. The proposed model also enables the analyst to control the number of rules that constitute the model. Empirical validation of the model is accomplished through a case study of a large legacy telecommunications system. The ease of rule interpretation and the transparency of the functional aspects of the model are clearly demonstrated. It is concluded that the new model is effective in achieving software quality classification.
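As a rough illustration of the binning and rule-matching steps described in the abstract above, the following Python sketch partitions attribute values into equal-frequency bins and checks a module against a conjunction of bin predicates. It is a sketch under stated assumptions, not the authors' implementation: the attribute names (`loc`, `cyclomatic`), the helper names, and the rule itself are hypothetical, and the rough-set attribute reduction step is omitted entirely.

```python
from typing import Dict, List


def equal_frequency_cuts(values: List[float], n_bins: int) -> List[float]:
    """Return n_bins - 1 cut points so each bin holds roughly the same
    number of observed values."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]


def bin_index(value: float, cuts: List[float]) -> int:
    """Index of the bin a value falls into (0 is the lowest bin)."""
    return sum(value >= c for c in cuts)


def rule_matches(module: Dict[str, float], rule: Dict[str, int],
                 cuts_by_attr: Dict[str, List[float]]) -> bool:
    """A rule is a conjunction of predicates 'attribute falls in bin b'."""
    return all(bin_index(module[attr], cuts_by_attr[attr]) == b
               for attr, b in rule.items())


# Hypothetical training measurements for two attributes.
training = {
    "loc":        [10, 20, 30, 40, 50, 60, 70, 80, 90],
    "cyclomatic": [1, 2, 3, 4, 5, 6, 7, 8, 9],
}
cuts_by_attr = {attr: equal_frequency_cuts(vals, 3)
                for attr, vals in training.items()}

# Hypothetical rule: flag a module as fault-prone when both attributes
# fall in their highest bin.
high_risk_rule = {"loc": 2, "cyclomatic": 2}
is_fault_prone = rule_matches({"loc": 85, "cyclomatic": 8},
                              high_risk_rule, cuts_by_attr)
```

Because each rule is a readable conjunction over binned attributes, an analyst can inspect, keep, or drop individual rules, which matches the interpretability emphasized in the abstract.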

Building Decision Tree Software Quality Classification Models Using Genetic Programming

Genetic and Evolutionary Computation — GECCO 2003 - Lecture Notes in Computer Science ◽

10.1007/3-540-45110-2_75 ◽

2003 ◽

pp. 1808-1809 ◽

Cited By ~ 1

Author(s):

Yi Liu ◽

Taghi M. Khoshgoftaar

Keyword(s):

Decision Tree ◽

Genetic Programming ◽

Software Quality ◽

Classification Models ◽

Quality Classification

An Empirical Study of Feature Ranking Techniques for Software Quality Prediction

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194012400013 ◽

2012 ◽

Vol 22 (02) ◽

pp. 161-183 ◽

Cited By ~ 30

Author(s):

Taghi M. Khoshgoftaar ◽

Kehan Gao ◽

Amri Napolitano

Keyword(s):

Software Quality ◽

Performance Metrics ◽

Signal To Noise Ratio ◽

Statistical Tests ◽

Classification Model ◽

Feature Ranking ◽

Data Sets ◽

Software Metric ◽

Quality Program ◽

Program Modules

The primary goal of software quality engineering is to produce a high-quality software product through the use of specific techniques and processes. One strategy is to apply data mining techniques to software metric and defect data collected during the software development process in order to identify potentially low-quality program modules. In this paper, we investigate the use of feature selection in the context of software quality estimation (also referred to as software defect prediction), where a classification model is used to predict whether program modules (instances) are fault-prone or not-fault-prone. Seven filter-based feature ranking techniques are examined. Six of them are commonly used; the seventh, signal to noise ratio (SNR), is rarely employed. The objective of the paper is to compare these seven techniques on various software data sets and assess their effectiveness for software quality modeling. A case study is performed on 16 software data sets, and classification models are built with five different learners and evaluated with two performance metrics. Our experimental results are summarized based on statistical tests for significance. The main conclusion is that the SNR technique performs as well as the best performer of the six commonly used techniques.
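The abstract above does not define SNR. A common definition for filter-based ranking is the difference between the class means of a feature divided by the sum of the class standard deviations, and the Python sketch below assumes that definition. The use of the absolute difference, the population standard deviation, the handling of zero spread, and the feature names are all assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import Dict, List


def snr(values: List[float], labels: List[bool]) -> float:
    """Signal-to-noise ratio of one feature, assuming the definition
    |mean_FP - mean_NFP| / (std_FP + std_NFP)."""
    fp = [v for v, y in zip(values, labels) if y]
    nfp = [v for v, y in zip(values, labels) if not y]
    diff = abs(mean(fp) - mean(nfp))
    spread = pstdev(fp) + pstdev(nfp)
    if spread == 0:
        # No within-class variation: perfectly separating if means differ.
        return float("inf") if diff > 0 else 0.0
    return diff / spread


def rank_features(features: Dict[str, List[float]],
                  labels: List[bool]) -> List[str]:
    """Feature names ordered from highest to lowest SNR."""
    return sorted(features, key=lambda name: snr(features[name], labels),
                  reverse=True)


# Hypothetical metrics for four modules; True marks a fault-prone module.
labels = [False, False, True, True]
features = {
    "comments": [4, 6, 5, 7],
    "loc":      [1, 3, 10, 12],
}
ranking = rank_features(features, labels)
```

A filter-based ranker like this scores each feature independently of any learner, so the top-ranked features can be selected once and then passed to whichever classifier is used.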
