Empirical Evaluation of Classifiers for Software Risk Management

Software development involves plenty of risks, and errors that exist in software modules represent a major kind of risk. Software defect prediction techniques and tools that identify software errors play a crucial role in software risk management. Among software defect prediction techniques, classification is a commonly used approach, and various types of classifiers have been applied to software defect prediction in recent years. Selecting an adequate classifier (or set of classifiers) to identify error-prone software modules is therefore an important task for software development organizations. There are many different measures for classifiers, and each measure is intended to assess a different aspect of a classifier. This paper develops a performance metric that combines various measures to evaluate the quality of classifiers for software defect prediction. The performance metric is analyzed experimentally using 13 classifiers on 11 public-domain software defect datasets. The results of the experiment indicate that support vector machines (SVM), the C4.5 algorithm, and the K-nearest-neighbor algorithm ranked as the top three classifiers.
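
The abstract does not say how the measures are combined. As an illustration only, the sketch below aggregates several measures by averaging each classifier's rank across them. This mean-rank scheme is an assumption standing in for the paper's metric, and the classifier names and scores are hypothetical.

```python
from statistics import mean


def rank_descending(scores):
    """Map each name to its rank (1 = highest score); tied names share the average rank."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of names tied with ordered[i]
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        average_rank = ((i + 1) + (j + 1)) / 2
        for name in ordered[i:j + 1]:
            ranks[name] = average_rank
        i = j + 1
    return ranks


def mean_rank(measures):
    """measures: {measure: {classifier: score}}; assumes higher is better for every measure."""
    per_measure = [rank_descending(scores) for scores in measures.values()]
    classifiers = list(per_measure[0])
    return {c: mean(r[c] for r in per_measure) for c in classifiers}


# Hypothetical scores, invented for illustration (not results from the paper)
example = {
    "accuracy": {"clf_a": 0.90, "clf_b": 0.85, "clf_c": 0.80},
    "auc": {"clf_a": 0.70, "clf_b": 0.95, "clf_c": 0.60},
    "f_measure": {"clf_a": 0.80, "clf_b": 0.75, "clf_c": 0.78},
}
overall = mean_rank(example)
best = min(overall, key=overall.get)  # lowest mean rank wins
```

A lower mean rank indicates a classifier that does well across measures rather than on a single one, which is the motivation the abstract gives for combining them.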

Towards Design and Feasibility Analysis of DePaaS: AI Based Global Unified Software Defect Prediction Framework

Applied Sciences ◽ 10.3390/app12010493 ◽ 2022 ◽ Vol 12 (1) ◽ pp. 493

Author(s): Mahesha Pandit ◽ Deepali Gupta ◽ Divya Anand ◽ Nitin Goyal ◽ Hani Moaiteq Aljahdali ◽ ...

Keyword(s): Software Development ◽ Prediction Models ◽ Easy Access ◽ Defect Prediction ◽ Software Defect Prediction ◽ Research And Practice ◽ Software Defect ◽ Software Modules ◽ Software Development Teams ◽ Defect Prediction Models

Using artificial intelligence (AI) based software defect prediction (SDP) techniques in the software development process helps isolate defective software modules, count the number of software defects, and identify risky code changes. However, software development teams are unaware of SDP and do not have easy access to relevant models and techniques. The major reason for this problem seems to be the fragmentation of SDP research and SDP practice. To unify SDP research and practice, this article introduces a cloud-based, global, unified AI framework for SDP called DePaaS (Defects Prediction as a Service). The article describes the usage context, use cases, and detailed architecture of DePaaS and presents the first response of industry practitioners to DePaaS. In a first-of-its-kind survey, the article captures practitioners' beliefs about SDP and about the ability of DePaaS to solve some of the known challenges in the field of software defect prediction. The article also provides a novel process for SDP, a detailed description of the structure and behaviour of the DePaaS architecture components, the six best SDP models offered by DePaaS, a description of algorithms that recommend SDP models, feature sets, and tunable parameters, and a rich set of challenges in building, using, and sustaining DePaaS. With the contributions of this article, SDP research and practice could be unified, enabling the building and use of more pragmatic defect prediction models and leading to an increase in the efficiency of software testing.

Statistical assessment of nonlinear manifold detection-based software defect prediction techniques

International Journal of Intelligent Systems Technologies and Applications ◽ 10.1504/ijista.2019.102667 ◽ 2019 ◽ Vol 18 (6) ◽ pp. 579 ◽ Cited By ~ 4

Author(s): Soumi Ghosh ◽ Ajay Rana ◽ Vineet Kansal

Keyword(s): Defect Prediction ◽ Software Defect Prediction ◽ Statistical Assessment ◽ Software Defect ◽ Prediction Techniques

Software Defect Prediction Based on Cost-Sensitive Dictionary Learning

International Journal of Software Engineering and Knowledge Engineering ◽ 10.1142/s0218194019500384 ◽ 2019 ◽ Vol 29 (09) ◽ pp. 1219-1243 ◽ Cited By ~ 1

Author(s): Hongyan Wan ◽ Guoqing Wu ◽ Mali Yu ◽ Mengting Yuan

Keyword(s): Sparse Representation ◽ Dictionary Learning ◽ Class Imbalance ◽ Imbalanced Data ◽ Prediction Method ◽ Elastic Net ◽ Defect Prediction ◽ Software Defect Prediction ◽ Software Defect ◽ Software Modules

Software defect prediction technology has been widely used to improve the quality of software systems. Most real software defect datasets have fewer defective modules than defect-free modules. Highly class-imbalanced data typically make accurate prediction difficult: the imbalance makes a prediction model prone to classifying a defective module as a defect-free one. Because different software modules are similar to one another, a module can be represented by sparse representation coefficients over a pre-defined dictionary consisting of historical software defect datasets. In this study, we use a dictionary learning method to predict software defects. We optimize the classifier parameters and the dictionary atoms iteratively to ensure that the extracted features (the sparse representation) are optimal for the trained classifier. We prove the optimality condition of the elastic net, which is used to solve for the sparse coding coefficients, and the regularity of the elastic net solution. Because misclassifying a defective module generally incurs a much higher cost than misclassifying a defect-free one, we take the different misclassification costs into account by increasing the penalty for misclassifying defective modules during dictionary learning, which inclines the classifier toward labeling a module as defective. Thus, we propose a cost-sensitive software defect prediction method using dictionary learning (CSDL). Experimental results on 10 class-imbalanced NASA datasets show that our method is more effective than several typical state-of-the-art defect prediction methods.
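
The cost-sensitive idea in this abstract can be illustrated independently of dictionary learning. The sketch below is not the CSDL method; it is a generic expected-cost decision rule, and the function names and cost values are assumptions chosen for illustration. It shows how a higher cost for missed defects lowers the probability at which a module is labeled defective.

```python
def expected_cost_label(p_defective, cost_fn, cost_fp):
    """Return 1 (defective) when predicting 'defect-free' has the higher expected cost.

    p_defective: estimated probability that the module is defective
    cost_fn: cost of missing a defective module (false negative), assumed value
    cost_fp: cost of flagging a defect-free module (false positive), assumed value
    """
    cost_if_predict_clean = p_defective * cost_fn
    cost_if_predict_defective = (1.0 - p_defective) * cost_fp
    return 1 if cost_if_predict_clean > cost_if_predict_defective else 0


def decision_threshold(cost_fn, cost_fp):
    """Probability above which 'defective' is the cheaper prediction."""
    return cost_fp / (cost_fp + cost_fn)


# With equal costs the threshold is 0.5; with a 5:1 cost ratio it drops to 1/6,
# so a module with p = 0.3 flips from 'defect-free' to 'defective'.
equal_costs = expected_cost_label(0.3, cost_fn=1.0, cost_fp=1.0)
skewed_costs = expected_cost_label(0.3, cost_fn=5.0, cost_fp=1.0)
```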

An Enhanced Evolutionary Software Defect Prediction Method Using Island Moth Flame Optimization

Mathematics ◽ 10.3390/math9151722 ◽ 2021 ◽ Vol 9 (15) ◽ pp. 1722

Author(s): Ruba Abu Khurma ◽ Hamad Alsawalqah ◽ Ibrahim Aljarah ◽ Mohamed Abd Elaziz ◽ Robertas Damaševičius

Keyword(s): Swarm Intelligence ◽ Optimization Problems ◽ Prediction Method ◽ Search Space ◽ Defect Prediction ◽ Support Vector ◽ Software Defect Prediction ◽ Running Time ◽ Software Defect ◽ Software Modules

Software defect prediction (SDP) is crucial in the early stages of defect-free software development, before testing operations take place. Effective SDP can help test managers locate defects and defect-prone software modules, which facilitates the optimal and economical allocation of limited software quality assurance resources. Feature selection (FS) is a complicated problem that exhaustive search cannot solve in polynomial time: for a dataset with N features, the complete search space has 2^N feature subsets, which means that an algorithm needs exponential running time to traverse all of them. Swarm intelligence algorithms have shown impressive performance in mitigating the FS problem and reducing the running time. The moth flame optimization (MFO) algorithm is a well-known swarm intelligence algorithm that has been used widely and has proven its capability in solving various optimization problems. An efficient binary variant of MFO (BMFO) is proposed in this paper by using the island BMFO (IsBMFO) model. IsBMFO divides the solutions in the population into a set of sub-populations named islands. Each island is treated independently using a variant of BMFO. To increase the diversification capability of the algorithm, a migration step is performed after a specific number of iterations to exchange solutions between islands. Twenty-one public software datasets are used to evaluate the proposed method. The results of the experiments show that FS using IsBMFO improves the classification results. IsBMFO followed by support vector machine (SVM) classification is the best of the compared models for the SDP problem, with an average G-mean of 78%.
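
Two quantities in this abstract are easy to make concrete. The sketch below enumerates every subset of a small feature list to show the 2^N growth, and computes the G-mean under its usual definition as the geometric mean of sensitivity and specificity. The feature names and confusion-matrix counts are hypothetical, and the code is not part of IsBMFO.

```python
from itertools import combinations
from math import sqrt


def all_feature_subsets(features):
    """Yield every subset of features, including the empty one (2**N in total)."""
    for size in range(len(features) + 1):
        yield from combinations(features, size)


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity.

    Assumes both classes are present, so neither denominator is zero.
    """
    sensitivity = tp / (tp + fn)  # recall on defective modules
    specificity = tn / (tn + fp)  # recall on defect-free modules
    return sqrt(sensitivity * specificity)


# Hypothetical metric names; four features already give 16 candidate subsets
features = ["loc", "cyclomatic", "fan_out", "depth"]
subset_count = sum(1 for _ in all_feature_subsets(features))

# Hypothetical confusion matrix: 10 defective and 90 defect-free modules
score = g_mean(tp=8, fn=2, tn=81, fp=9)
```

Because the G-mean multiplies the two per-class recalls, a model that ignores the minority (defective) class scores near zero even when its overall accuracy is high, which is why it suits imbalanced defect datasets.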

Software Defect Prediction for Healthcare Big Data: An Empirical Evaluation of Machine Learning Techniques

Journal of Healthcare Engineering ◽ 10.1155/2021/8899263 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-16

Author(s): Bilal Khan ◽ Rashid Naseem ◽ Muhammad Arif Shah ◽ Karzan Wakil ◽ Atif Khan ◽ ...

Keyword(s): Machine Learning ◽ Decision Tree ◽ Software Development ◽ Absolute Error ◽ Defect Prediction ◽ Support Vector ◽ Software Defect Prediction ◽ Software Defect ◽ Squared Error ◽ Average Accuracy

Software defect prediction (SDP) in the initial phase of the software development life cycle (SDLC) remains a critical and important task. SDP has been studied extensively over the last few decades because it helps assure the quality of software systems. Early prediction of defective artifacts can help the development team use existing resources more efficiently and effectively and deliver high-quality software products within a limited time. Previously, several researchers have developed models for defect prediction using machine learning (ML) and statistical techniques. ML methods are considered an effective and practical approach to pinpointing defective modules by mining hidden patterns among software metrics (attributes). ML techniques have also been used by several researchers on healthcare datasets. This study applies different ML techniques to software defect prediction using seven widely used datasets. The ML techniques include the multilayer perceptron (MLP), support vector machine (SVM), decision tree (J48), radial basis function (RBF), random forest (RF), hidden Markov model (HMM), credal decision tree (CDT), K-nearest neighbor (KNN), averaged one-dependence estimator (A1DE), and Naïve Bayes (NB). The performance of each technique is evaluated using different measures, for instance, relative absolute error (RAE), mean absolute error (MAE), root mean squared error (RMSE), root relative squared error (RRSE), recall, and accuracy. The overall results show that RF performs best, with 88.32% average accuracy and a rank value of 2.96; SVM achieves the second-best performance, with 87.99% average accuracy and a rank value of 3.83. CDT is placed third, with 87.88% average accuracy and a rank value of 3.62. These comprehensive results can be used as a reference point for new research in the SDP domain, so that any claim of improved prediction by a new technique or model can be benchmarked and verified.
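
The four error measures named in this abstract can be computed as in the sketch below. It follows the commonly used definitions, in which RAE and RRSE compare the model's error with that of a baseline that always predicts the mean of the actual values. Treating the targets as 0/1 labels and the predictions as probabilities is an assumption, and the sample values are hypothetical.

```python
from math import sqrt


def error_measures(actual, predicted):
    """Return MAE, RMSE, RAE and RRSE as fractions (multiply RAE/RRSE by 100 for percent).

    Assumes actual is non-empty and not constant, so the baseline error is non-zero.
    """
    n = len(actual)
    mean_actual = sum(actual) / n
    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    # Errors of the baseline that always predicts the mean of the actual values
    base_abs = sum(abs(a - mean_actual) for a in actual)
    base_sq = sum((a - mean_actual) ** 2 for a in actual)
    return {
        "MAE": abs_err / n,
        "RMSE": sqrt(sq_err / n),
        "RAE": abs_err / base_abs,
        "RRSE": sqrt(sq_err / base_sq),
    }


# Hypothetical labels (1 = defective) and predicted probabilities
actual = [1, 0, 0, 1]
predicted = [0.8, 0.1, 0.3, 0.6]
measures = error_measures(actual, predicted)
```

An RAE or RRSE of 1.0 means the model is no better than the mean baseline; values below 1.0 indicate an improvement over it.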

Empirical Assessment of Machine Learning Based Software Defect Prediction Techniques

International Journal of Artificial Intelligence Tools ◽ 10.1142/s0218213008003947 ◽ 2008 ◽ Vol 17 (02) ◽ pp. 389-400 ◽ Cited By ~ 41

Author(s): Venkata Udaya B. Challagulla ◽ Farokh B. Bastani ◽ I-Ling Yen ◽ Raymond A. Paul

Keyword(s): Machine Learning ◽ Feature Subset Selection ◽ Defect Prediction ◽ Data Sets ◽ Feature Subset ◽ Software Defect Prediction ◽ Software Defect ◽ Intelligent Software ◽ Instance Based Learning ◽ Prediction Techniques

Automated reliability assessment is essential for systems that entail dynamic adaptation based on runtime mission-specific requirements. One approach in this direction is to monitor and assess the system using machine learning-based software defect prediction techniques. Because of the dynamic nature of the software data collected, instance-based learning algorithms are proposed for this purpose. To evaluate the accuracy of these methods, the paper presents an empirical analysis of four different real-time software defect data sets using different predictor models. The results show that a combination of 1R and instance-based learning, together with the consistency-based subset evaluation technique, achieves accurate predictions more consistently than the other models. No direct relationship is observed between the skewness present in the data sets and the prediction accuracy of these models. Principal Component Analysis (PCA) does not show a consistent advantage in improving the accuracy of the predictions. While random reduction of attributes gave poor accuracy results, simple Feature Subset Selection methods performed better than PCA for most prediction models. Based on these results, the paper presents a high-level design of an Intelligent Software Defect Analysis tool (ISDAT) for dynamic monitoring and defect assessment of software modules.

Software defect prediction techniques using metrics based on neural network classifier

Cluster Computing ◽ 10.1007/s10586-018-1730-1 ◽ 2018 ◽ Vol 22 (S1) ◽ pp. 77-88 ◽ Cited By ~ 10

Author(s): R. Jayanthi ◽ Lilly Florence

Keyword(s): Neural Network ◽ Defect Prediction ◽ Software Defect Prediction ◽ Neural Network Classifier ◽ Software Defect ◽ Prediction Techniques

A Review on Software Defect Prediction Techniques Using Product Metrics

International Journal of Database Theory and Application ◽ 10.14257/ijdta.2017.10.1.15 ◽ 2017 ◽ Vol 10 (1) ◽ pp. 163-174

Author(s): Jayanthi R ◽ Lilly Florence ◽ Arti Arya

Keyword(s): Defect Prediction ◽ Software Defect Prediction ◽ Product Metrics ◽ Software Defect ◽ Prediction Techniques

Software Defect Prediction Using Heterogeneous Ensemble Classification Based on Segmented Patterns

Applied Sciences ◽ 10.3390/app10051745 ◽ 2020 ◽ Vol 10 (5) ◽ pp. 1745 ◽ Cited By ~ 4

Author(s): Hamad Alsawalqah ◽ Neveen Hijazi ◽ Mohammed Eshtay ◽ Hossam Faris ◽ Ahmed Al Radaideh ◽ ...

Keyword(s): Main Idea ◽ Ensemble Classification ◽ Defect Prediction ◽ Software Defect Prediction ◽ Ensemble Classifiers ◽ Software Developers ◽ Software Defect ◽ Software Modules ◽ Benchmark Datasets ◽ Heterogeneous Ensemble

Software defect prediction is a promising approach that aims to improve software quality and testing efficiency by identifying defect-prone software modules before the actual testing process begins. These prediction results help software developers allocate their limited resources effectively to the modules that are more prone to defects. In this paper, a hybrid heterogeneous ensemble approach is proposed for software defect prediction. Heterogeneous ensembles consist of a set of classifiers built on different base learning methods, each with its own strengths and weaknesses. The main idea of the proposed approach is to develop expert and robust heterogeneous classification models. Two versions of the proposed approach are developed and evaluated experimentally. The first is based on simple classifiers, and the second is based on ensemble ones. For evaluation, 21 publicly available benchmark datasets are selected to conduct the experiments and benchmark the proposed approach. The evaluation results show the superiority of the ensemble version over other well-regarded basic and ensemble classifiers.
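
The general notion of a heterogeneous ensemble can be sketched with plain majority voting over dissimilar base predictors. This is not the hybrid approach proposed in the paper: the three hand-written rules below are hypothetical stand-ins for trained classifiers of different types, and the metric names and thresholds are assumptions chosen for illustration.

```python
from collections import Counter


def majority_vote(classifiers, module):
    """Return the label predicted by most classifiers (use an odd count to avoid ties)."""
    votes = [classify(module) for classify in classifiers]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical base predictors; each looks at a different aspect of a module
def size_rule(module):
    return int(module["loc"] > 300)


def complexity_rule(module):
    return int(module["cyclomatic"] > 10)


def churn_rule(module):
    return int(module["churn"] > 50)


ensemble = [size_rule, complexity_rule, churn_rule]

# Hypothetical modules described by invented metric values
large_churny = {"loc": 500, "cyclomatic": 4, "churn": 80}
small_complex = {"loc": 100, "cyclomatic": 20, "churn": 5}

label_large = majority_vote(ensemble, large_churny)    # votes: 1, 0, 1
label_small = majority_vote(ensemble, small_complex)   # votes: 0, 1, 0
```

Voting lets the weaknesses of one predictor be outvoted by the others, which is the intuition behind combining base methods with different strengths.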

A Framework for Software Risk Management

Journal of Information Technology ◽ 10.1177/026839629601100402 ◽ 1996 ◽ Vol 11 (4) ◽ pp. 275-285 ◽ Cited By ~ 5

Author(s): Kalle Lyytinen ◽ Lars Mathiassen ◽ Janne Ropponen

Keyword(s): Risk Management ◽ Software Development ◽ Empirical Analysis ◽ Practical Tool ◽ Software Risk Management ◽ Software Risk ◽ Management Approaches ◽ Risk Managers ◽ Analytical Device

We present a simple but powerful framework for software risk management. The framework synthesizes, refines, and extends current approaches to managing software risks. We illustrate its usefulness through an empirical analysis of two software development episodes involving high risks. The framework can be used as an analytical device to evaluate and improve risk management approaches and as a practical tool to shape the attention and guide the actions of risk managers.
