Nonlinear Geometric Framework for Software Defect Prediction

Software is used in every walk of life, so its quality is essential. Software defect prediction (SDP) models help identify defect-prone modules using historical data, which in turn improves software quality. Historical data consists of modules, files, or classes labeled as buggy or clean. Because buggy artifacts are fewer than clean artifacts, the historical data is imbalanced. This uneven distribution makes it difficult for classification algorithms to build highly effective SDP models. The objective of this study is to propose a new nonlinear geometric framework based on SMOTE and ensemble learning to improve the performance of SDP models. The proposed framework, called SMEnsemble, combines the traditional SMOTE algorithm with a novel ensemble Support Vector Machine (SVM). SMOTE handles the class imbalance problem by generating synthetic instances of the minority class. Ensemble learning generates multiple classification models, from which the best-performing SDP model is selected. For experimentation, datasets from three different software repositories, containing both open-source and proprietary projects, are used. The results show that SMEnsemble performs better than traditional methods at identifying the minority class, i.e., buggy artifacts. The proposed model also performs better than the latest state-of-the-art SDP model, SMOTUNED. Compared with traditional methods, the proposed model is better able to handle imbalanced classes. Also, by carefully selecting the number of ensembles, high performance can be achieved in less time.
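
The synthetic-instance step that SMOTE performs can be illustrated with a minimal sketch. It is a generic SMOTE-style interpolation in plain Python, not the SMEnsemble implementation; the function name `smote`, the parameters `n_synthetic`, `k`, and `seed`, and the toy `buggy` data are all assumptions made for illustration.

```python
import math
import random


def smote(minority, n_synthetic, k=3, seed=0):
    """Generate synthetic minority samples by SMOTE-style interpolation.

    minority: list of feature vectors (lists of floats) from the buggy class.
    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority-class neighbours of the base sample (itself excluded).
        others = [m for j, m in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda m: math.dist(base, m))[:k]
        neighbour = rng.choice(neighbours)
        # The new point lies at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class: four "buggy" modules described by two metrics each.
buggy = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_points = smote(buggy, n_synthetic=6, k=2)
```

Because each synthetic point is interpolated between two real minority samples, it stays inside the region the minority class already occupies, which is why the method adds variety without simply duplicating rows.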

Software Defect Prediction Incremental Model using Ensemble Learning

International Journal of Performability Engineering ◽

10.23940/ijpe.20.11.p9.17711780 ◽

2020 ◽

Vol 16 (11) ◽

pp. 1771

Author(s):

Wang Shibo ◽

Li Yong ◽

Mi Wenbo ◽

Liu Ying

Keyword(s):

Ensemble Learning ◽

Defect Prediction ◽

Software Defect Prediction ◽

Incremental Model ◽

Software Defect

Software Defect Prediction and Localization with Attention-Based Models and Ensemble Learning

2020 27th Asia-Pacific Software Engineering Conference (APSEC) ◽

10.1109/apsec51365.2020.00016 ◽

2020 ◽

Author(s):

Tianhang Zhang ◽

Qingfeng Du ◽

Jincheng Xu ◽

Jiechu Li ◽

Xiaojun Li

Keyword(s):

Ensemble Learning ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect

Prediction of Software Defect Using Ensemble Learning Based Improved Sparrow Search Algorithm To Optimize Extreme Learning Machine

10.21203/rs.3.rs-1100298/v1 ◽

2021 ◽

Author(s):

Yu Tang ◽

Qi Dai ◽

Mengyuan Yang ◽

Lifang Chen

Keyword(s):

Extreme Learning Machine ◽

Ensemble Learning ◽

Learning Algorithm ◽

Search Algorithm ◽

Defect Prediction ◽

Software Defect Prediction ◽

Prediction Ability ◽

Software Defect ◽

Ensemble Learning Algorithm ◽

Learning Machine

In traditional ensemble learning algorithms for software defect prediction, the base predictor has too many parameters that are difficult to optimize, so the model's optimal performance cannot be obtained. This paper proposes an ensemble learning algorithm for software defect prediction that uses an improved sparrow search algorithm to optimize the extreme learning machine; the work is divided into three parts. First, the improved sparrow search algorithm (ISSA) is proposed to improve optimization ability and convergence speed, and its performance is tested on eight benchmark test functions. Second, ISSA is used to optimize the extreme learning machine (ISSA-ELM) to improve prediction ability. Finally, the optimized ensemble learning algorithm (ISSA-ELM-Bagging) is built with the Bagging algorithm, which improves the prediction performance of ELM on software defect datasets. Experiments are carried out on six groups of software defect datasets. The experimental results show that the ISSA-ELM-Bagging ensemble learning algorithm is significantly better than the other four comparison algorithms on the six evaluation indexes of Precision, Recall, F-measure, MCC, Accuracy, and G-mean, and that it has better stability and generalization ability.
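
The six evaluation indexes named above can all be derived from a binary confusion matrix. The sketch below uses the standard textbook definitions, with the buggy class treated as positive; the function name `defect_metrics` and the example counts are assumptions for illustration and are not results from the paper.

```python
import math


def defect_metrics(tp, fp, tn, fn):
    """Compute common defect-prediction metrics from confusion-matrix counts."""
    def ratio(num, den):
        # Return 0.0 instead of dividing by zero for degenerate matrices.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)        # true positive rate
    specificity = ratio(tn, tn + fp)   # true negative rate
    f_measure = ratio(2 * precision * recall, precision + recall)
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    g_mean = math.sqrt(recall * specificity)
    mcc = ratio(tp * tn - fp * fn,
                math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "mcc": mcc,
        "accuracy": accuracy,
        "g_mean": g_mean,
    }


# Made-up counts: 30 buggy modules found, 10 missed, 10 false alarms.
m = defect_metrics(tp=30, fp=10, tn=50, fn=10)
```

G-mean and MCC are usually reported alongside accuracy on imbalanced defect data because accuracy alone can look high for a model that rarely predicts the buggy class.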

Software Defect Prediction Based on Ensemble Learning

Proceedings of the 2019 2nd International Conference on Data Science and Information Technology ◽

10.1145/3352411.3352412 ◽

2019 ◽

Cited By ~ 4

Author(s):

Ran Li ◽

Lijuan Zhou ◽

Shudong Zhang ◽

Hui Liu ◽

Xiangyang Huang ◽

...

Keyword(s):

Ensemble Learning ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect

Handling Imbalanced Data using Ensemble Learning in Software Defect Prediction

2020 10th International Conference on Cloud Computing, Data Science & Engineering (Confluence) ◽

10.1109/confluence47617.2020.9058124 ◽

2020 ◽

Cited By ~ 1

Author(s):

Ruchika Malhotra ◽

Juhi Jain

Keyword(s):

Ensemble Learning ◽

Imbalanced Data ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect

Omni-Ensemble Learning (OEL): Utilizing Over-Bagging, Static and Dynamic Ensemble Selection Approaches for Software Defect Prediction

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213018500240 ◽

2018 ◽

Vol 27 (06) ◽

pp. 1850024 ◽

Cited By ~ 4

Author(s):

Reza Mousavi ◽

Mahdi Eftekhari ◽

Farhad Rahdari

Keyword(s):

Ensemble Learning ◽

Class Imbalance ◽

Defect Prediction ◽

Learning Approaches ◽

Software Defect Prediction ◽

Multiple Classifier Systems ◽

Classifier Systems ◽

Software Defect ◽

Ensemble Selection ◽

Dynamic Ensemble Selection

Machine learning methods are becoming increasingly important in software engineering, as they can improve quality and testing efficiency by constructing models that predict defects in software modules. Existing datasets for software defect prediction suffer from an imbalanced class distribution, which makes the learning problem harder. In this paper, we propose a novel approach that integrates Over-Bagging with static and dynamic ensemble selection strategies. Because the proposed method draws on most ensemble learning approaches, it is called Omni-Ensemble Learning (OEL). The approach exploits a new Over-Bagging method for class imbalance learning, in which the effect of three different methods of assigning weights to training samples is investigated. The proposed method first specifies the best classifiers, along with their combiner, for all test samples through a Genetic Algorithm as the static ensemble selection approach. Then, a subset of the selected classifiers is chosen for each test sample as the dynamic ensemble selection. Our experiments confirm that the proposed OEL provides better overall performance (in terms of G-mean, balance, and AUC measures) compared with six other related works and six multiple classifier systems over seven NASA datasets. Based on these experimental results, we generally recommend OEL for improving the performance of software defect prediction and similar problems.
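
The basic idea behind over-bagging can be sketched as follows: each bag is a bootstrap sample in which the minority class is oversampled with replacement until it matches the majority class. This is the generic scheme only; it does not reproduce the three sample-weighting methods or the ensemble selection stages of OEL, and the name `over_bagging_samples` and its parameters are hypothetical.

```python
import random


def over_bagging_samples(X, y, n_bags, minority_label=1, seed=0):
    """Build balanced bootstrap bags by oversampling the minority class."""
    rng = random.Random(seed)
    minority = [i for i, label in enumerate(y) if label == minority_label]
    majority = [i for i, label in enumerate(y) if label != minority_label]
    if not minority or not majority:
        raise ValueError("both classes must be present")
    bags = []
    for _ in range(n_bags):
        # Ordinary bootstrap of the majority class.
        maj_idx = rng.choices(majority, k=len(majority))
        # Minority class drawn with replacement up to the majority size.
        min_idx = rng.choices(minority, k=len(majority))
        idx = maj_idx + min_idx
        rng.shuffle(idx)
        bags.append(([X[i] for i in idx], [y[i] for i in idx]))
    return bags


# Toy data: eight clean modules (label 0) and two buggy modules (label 1).
X = [[float(i)] for i in range(10)]
y = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
bags = over_bagging_samples(X, y, n_bags=3)
```

One base classifier would then be trained per bag, and their predictions combined; how the classifiers are selected and combined is where static and dynamic ensemble selection come in.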

A Framework for Software Defect Prediction Using Feature Selection and Ensemble Learning Techniques

International Journal of Modern Education and Computer Science ◽

10.5815/ijmecs.2019.12.01 ◽

2019 ◽

Vol 11 (12) ◽

pp. 14-20

Author(s):

Faseeha Matloob ◽

Shabib Aftab ◽

Ahmed Iqbal

Keyword(s):

Feature Selection ◽

Ensemble Learning ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect ◽

Learning Techniques

Using Coding-Based Ensemble Learning to Improve Software Defect Prediction

IEEE Transactions on Systems Man and Cybernetics Part C (Applications and Reviews) ◽

10.1109/tsmcc.2012.2226152 ◽

2012 ◽

Vol 42 (6) ◽

pp. 1806-1817 ◽

Cited By ~ 88

Author(s):

Zhongbin Sun ◽

Qinbao Song ◽

Xiaoyan Zhu

Keyword(s):

Ensemble Learning ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect

Software defect prediction using ensemble learning on selected features

Information and Software Technology ◽

10.1016/j.infsof.2014.07.005 ◽

2015 ◽

Vol 58 ◽

pp. 388-402 ◽

Cited By ~ 122

Author(s):

Issam H. Laradji ◽

Mohammad Alshayeb ◽

Lahouari Ghouti

Keyword(s):

Ensemble Learning ◽

Defect Prediction ◽

Software Defect Prediction ◽

Software Defect

Research and Application of Software Defect Prediction based on BP-Migration learning

MATEC Web of Conferences ◽

10.1051/matecconf/201823203017 ◽

2018 ◽

Vol 232 ◽

pp. 03017

Author(s):

Jie Zhang ◽

Gang Wang ◽

Haobo Jiang ◽

Fangzheng Zhao ◽

Guilin Tian

Keyword(s):

Prediction Model ◽

Historical Data ◽

Defect Prediction ◽

Software Project ◽

Data Sets ◽

Software Defect Prediction ◽

Software Module ◽

Data Set ◽

Software Defect ◽

Project Data

Software defect prediction has been an important part of software engineering research since the 1970s. The technique analyzes the metrics and defect information of historical software modules to predict defects in new software modules. Currently, most software defect prediction models are built on a data set from a single software project: the training data sets used to construct the model and the test data sets used to validate it come from the same project. In practice, however, for new projects or projects with little historical data, traditional prediction methods show lower predictive performance. When the historical data is insufficient, the software defect prediction model cannot be fully trained, and it is difficult to achieve high prediction accuracy. In cross-project prediction, the problem faced is the difference in data distributions. To address these problems, this paper presents a software defect prediction model that combines migration learning (transfer learning) with a traditional software defect prediction model. The model uses existing project data sets to predict software defects across projects. The main work of this article includes: 1) Data preprocessing. This includes feature correlation analysis, noise reduction, and so on, which effectively avoids the interference of over-fitting and noisy data with the prediction results. 2) Migration learning. This step analyzes two different but related project data sets and reduces the impact of data distribution differences. 3) Artificial neural networks. To address the class imbalance of the data sets, an artificial neural network with dynamic selection of training samples is used to reduce the effect of the imbalance between positive and negative samples on the prediction results. The Relink and AEEEM data sets are used to evaluate performance in terms of F-measure, the ROC curve, and AUC. Experiments show that the model has high predictive performance.
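
For readers unfamiliar with the AUC measure mentioned above, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen buggy module receives a higher score than a randomly chosen clean one, with ties counted as one half. The function name `auc` and the labels and scores are made up for illustration; they are not data from the paper.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for buggy (positive), 0 for clean (negative).
    scores: predicted defect-proneness, higher means more likely buggy.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


# Made-up predictions for six modules.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.3, 0.6, 0.7, 0.2, 0.8]
value = auc(labels, scores)
```

This pairwise form is quadratic in the number of samples, which is fine for a small illustration; library implementations typically sort the scores instead.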
