Reducing overfitting in genetic programming models for software quality classification

A software quality estimation model is an important tool for a software quality assurance initiative. Software quality classification models can be used to indicate which program modules are fault-prone (FP) and which are not fault-prone (NFP). Such models assume that enough resources are available for quality improvement of all the modules predicted as FP. In conjunction with a software quality classification model, a quality-based ranking of program modules has practical benefits, since priority can be given to modules that are more FP. However, such a ranking cannot be achieved by traditional classification techniques. We present a novel software quality classification model based on multi-objective optimization with genetic programming (GP). More specifically, the GP-based model provides both a classification (FP or NFP) and a quality-based ranking for the program modules. The quality factor used to rank the modules is typically the number of faults or defects associated with a module. Genetic programming is well suited for optimizing multiple criteria simultaneously. In our study, three performance criteria are used to evolve a GP-based software quality model: classification performance, module ranking, and size of the GP tree. The third criterion addresses a commonly observed phenomenon in GP, namely bloating. The proposed model is investigated with case studies of software measurement data obtained from two industrial software systems.
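The three criteria named in the abstract can be illustrated with a minimal sketch. The abstract does not say how each criterion is measured or how the criteria are combined, so everything below is an assumption made for illustration: the function names, the zero threshold on the GP tree output, the top-25% cutoff used for the ranking criterion, and the use of Pareto dominance to compare candidates. It is not the authors' method.

```python
from typing import Sequence, Tuple


def classification_error(scores: Sequence[float], faults: Sequence[int],
                         threshold: float = 0.0) -> float:
    """Fraction of misclassified modules.

    Assumption: a module is predicted FP when the GP tree output exceeds
    `threshold`, and is actually FP when it has at least one fault.
    """
    wrong = sum((s > threshold) != (f > 0) for s, f in zip(scores, faults))
    return wrong / len(scores)


def ranking_loss(scores: Sequence[float], faults: Sequence[int],
                 top_fraction: float = 0.25) -> float:
    """1 minus the share of faults captured in the top-ranked modules,
    relative to a perfect ranking by actual fault count (hypothetical measure).
    """
    k = max(1, int(len(scores) * top_fraction))
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    captured = sum(faults[i] for i in top)
    ideal = sum(sorted(faults, reverse=True)[:k])
    return 1.0 - captured / ideal if ideal else 0.0


def objectives(scores: Sequence[float], faults: Sequence[int],
               tree_size: int) -> Tuple[float, float, float]:
    """All three criteria as values to minimize; tree size penalizes bloat."""
    return (classification_error(scores, faults),
            ranking_loss(scores, faults),
            float(tree_size))


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Pareto dominance: a is no worse on every objective and better on one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


if __name__ == "__main__":
    # Toy data: actual fault counts and the outputs of two hypothetical trees.
    faults = [0, 3, 0, 5, 1, 0, 0, 2]
    scores = [-1.0, 2.0, -0.5, 4.0, 0.5, -2.0, -0.1, 1.0]
    small_tree = objectives(scores, faults, tree_size=15)
    large_tree = objectives(scores, faults, tree_size=40)
    print(small_tree, large_tree, dominates(small_tree, large_tree))
```

In this sketch, two trees with identical predictions differ only in size, so the smaller one dominates; this is one simple way a size objective can discourage bloat without a hand-tuned penalty weight.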

Building Decision Tree Software Quality Classification Models Using Genetic Programming

Genetic and Evolutionary Computation — GECCO 2003 - Lecture Notes in Computer Science ◽

10.1007/3-540-45110-2_75 ◽

2003 ◽

pp. 1808-1809 ◽

Cited By ~ 1

Author(s):

Yi Liu ◽

Taghi M. Khoshgoftaar

Keyword(s):

Decision Tree ◽

Genetic Programming ◽

Software Quality ◽

Classification Models ◽

Quality Classification

Tree-Based Software Quality Classification Using Genetic Programming

Series on Quality, Reliability and Engineering Statistics - Reliability Modeling, Analysis and Optimization ◽

10.1142/9789812707147_0010 ◽

2006 ◽

pp. 201-224

Author(s):

Taghi M. Khoshgoftaar ◽

Yi Liu ◽

Naeem Seliya

Keyword(s):

Genetic Programming ◽

Software Quality ◽

Quality Classification

Genetic programming-based decision trees for software quality classification

Proceedings. 15th IEEE International Conference on Tools with Artificial Intelligence ◽

10.1109/tai.2003.1250214 ◽

2004 ◽

Cited By ~ 20

Author(s):

T.M. Khoshgoftaar ◽

N. Seliya ◽

Yi Liu

Keyword(s):

Genetic Programming ◽

Decision Trees ◽

Software Quality ◽

Quality Classification

A Practical Software Quality Classification Model Using Genetic Programming

Advances in Machine Learning Applications in Software Engineering ◽

10.4018/9781591409411.ch009 ◽

2011 ◽

Author(s):

Yi Liu ◽

Taghi M. Khoshgoftaar

Keyword(s):

Genetic Programming ◽

Software Quality ◽

Classification Model ◽

Quality Classification

Software Defect Prediction Using Genetic Programming and Neural Networks

Deep Learning and Neural Networks ◽

10.4018/978-1-7998-0414-7.ch088 ◽

2020 ◽

pp. 1577-1597

Author(s):

Mohammed Akour ◽

Wasen Yahya Melhem

Keyword(s):

Neural Networks ◽

Genetic Programming ◽

Detailed Analysis ◽

Software Quality ◽

Defect Prediction ◽

Data Repository ◽

Software Defect Prediction ◽

Classification Methods ◽

Software Defect

This article describes how classification methods for software defect prediction are widely researched because of the need to increase software quality and decrease testing effort. However, past research on this issue has not shown any classifier to be superior to the others. Additionally, there is a lack of research on the effects and accuracy of genetic programming in software defect prediction. To address this problem, a comparative software defect prediction experiment between genetic programming and neural networks is performed on four datasets from the NASA Metrics Data repository. Generally, an interesting degree of accuracy is observed, which shows that metric-based classification is useful. Nevertheless, this article recommends genetic programming because of the detailed analysis it provides and because the method allows the impact of each attribute in the dataset to be viewed.
