Increasing the Accuracy of Software Fault Prediction using Majority Ranking Fuzzy Clustering

Although many software fault prediction models have been proposed, the area remains open, as there is still room for a stable and consistent model with better performance. In this paper, a new method is proposed to increase the accuracy of fault prediction based on the notion of fuzzy clustering and majority ranking. The authors investigated the effect of irrelevant and inconsistent modules on software fault prediction and tried to decrease it by designing a new framework in which all project modules are clustered. The results showed that fuzzy clustering could decrease the negative effect of irrelevant modules on prediction performance. Eight data sets from NASA and Turkish white-goods software are employed to evaluate the model. Performance evaluation in terms of false positive rate, false negative rate, and overall error showed the superiority of the model compared to other prediction models. The proposed majority ranking fuzzy clustering approach showed improvements of 3% to 18% in false negative rate and 1% to 4% in overall error compared with other available models (ACF and ACN) in more than half of the testing cases. According to the results, the system can be used to guide testing effort by identifying fault-prone modules, improving the quality of software development and software testing within a limited time and budget.
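The three evaluation measures named in this abstract can be illustrated with a minimal sketch. It assumes the usual confusion-matrix definitions (the abstract itself does not spell them out), and the class name, function names, and module counts below are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of modules by actual vs. predicted class (hypothetical container)."""
    tp: int  # faulty modules predicted as faulty
    fp: int  # fault-free modules predicted as faulty
    tn: int  # fault-free modules predicted as fault-free
    fn: int  # faulty modules predicted as fault-free


def false_positive_rate(c: ConfusionCounts) -> float:
    # Share of fault-free modules that were wrongly flagged.
    return c.fp / (c.fp + c.tn)


def false_negative_rate(c: ConfusionCounts) -> float:
    # Share of faulty modules that the model missed.
    return c.fn / (c.fn + c.tp)


def overall_error(c: ConfusionCounts) -> float:
    # Share of all modules that were misclassified.
    return (c.fp + c.fn) / (c.tp + c.fp + c.tn + c.fn)


# Illustrative counts only; they are not taken from the paper.
counts = ConfusionCounts(tp=30, fp=20, tn=140, fn=10)
fpr = false_positive_rate(counts)
fnr = false_negative_rate(counts)
err = overall_error(counts)
```

Under these assumed definitions, a lower false negative rate means fewer faulty modules escape testing, which is why the abstract reports it separately from overall error.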

Application of Artificial Immune Systems Paradigm for Developing Software Fault Prediction Models

Machine Learning ◽

10.4018/978-1-60960-818-7.ch302 ◽

2012 ◽

pp. 371-387 ◽

Cited By ~ 2

Author(s):

Cagatay Catal ◽

Soumya Banerjee

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

Software Metrics ◽

Prediction Models ◽

Artificial Immune Systems ◽

Fault Prediction ◽

Artificial Immune ◽

Software Fault Prediction ◽

Immune Systems ◽

Software Fault

Artificial Immune Systems, a biologically inspired computing paradigm like Artificial Neural Networks, Genetic Algorithms, and Swarm Intelligence, embody the principles and advantages of vertebrate immune systems. They have been applied to solve several complex problems in different areas such as data mining, computer security, robotics, aircraft control, scheduling, optimization, and pattern recognition. There is increasing interest in the use of this paradigm, and it is widely used in conjunction with other methods such as Artificial Neural Networks, Swarm Intelligence, and Fuzzy Logic. In this chapter, we demonstrate the procedure for applying this paradigm and its bio-inspired algorithms to develop software fault prediction models. The goal of fault prediction is to identify the modules that are likely to contain faults in the next release of a large software system. Software metrics and fault data belonging to a previous software version are used to build the model. Fault-prone modules of the next release are predicted by using this model and current software metrics. From a machine learning perspective, this type of modeling approach is called supervised learning. A sample fault dataset is used to show in detail how Artificial Immune Recognition Systems (AIRS) work.

Estimating Prevalence, False-Positive Rate, and False-Negative Rate with Use of Repeated Testing When True Responses Are Unknown

The American Journal of Human Genetics ◽

10.1086/521582 ◽

2007 ◽

Vol 81 (5) ◽

pp. 1111-1113

Author(s):

Johanna Jakobsdottir ◽

Daniel E. Weeks

Keyword(s):

False Positive ◽

False Positive Rate ◽

False Negative ◽

False Negative Rate ◽

Repeated Testing ◽

Negative Rate ◽

Positive Rate

Utility of ultrasonography in the diagnosis of autosomal dominant polycystic kidney disease in children.

Journal of the American Society of Nephrology ◽

10.1681/asn.v81105 ◽

1997 ◽

Vol 8 (1) ◽

pp. 105-110

Author(s):

P A Gabow ◽

W J Kimberling ◽

J D Strain ◽

M L Manco-Johnson ◽

A M Johnson

Keyword(s):

Kidney Disease ◽

Polycystic Kidney Disease ◽

Autosomal Dominant ◽

False Positive Rate ◽

False Negative ◽

False Negative Rate ◽

Polycystic Kidney ◽

Negative Rate ◽

Affected Parent ◽

Autosomal Dominant Polycystic Kidney

To determine the utility of ultrasonography (US) in diagnosing autosomal dominant polycystic kidney disease (ADPKD) in children, this study examined 106 children who were at 50% risk for the disease. The children underwent a history, physical examination, abdominal US, and gene linkage analysis (GLA) with tightly linked markers for ADPKD1 and ADPKD2 genes. Only ADPKD1 children were studied. A child was considered affected by US if any cysts were detected and affected by GLA if he or she shared the same haplotype as the affected parent. Forty-two children (40%) were considered to be unaffected by both GLA and US. Forty-eight children (45%) were considered affected by both modalities. Only two of these children had a single cyst. Fourteen children (13%) were considered affected by GLA with normal initial US. These children tended to have larger kidneys than children who were unaffected by GLA. Eight of these 14 children had subsequent positive ultrasonograms. Two children had a positive ultrasonogram with GLA showing them to be unaffected; in one of these children, a subsequent ultrasonogram was interpreted to be normal with a medullary pyramid. Thus, overall the false negative rate was 25%, and the false positive rate was 2%. The false negative rate was highest in the children who were 3 months to 5 years of age (38%). Clinicians must understand the utility of US in diagnosing ADPKD in at-risk children and must not interpret a normal study as absence of disease in this population.

Important Issues in Software Fault Prediction

Advances in Systems Analysis, Software Engineering, and High Performance Computing - Handbook of Research on Emerging Advancements and Technologies in Software Engineering ◽

10.4018/978-1-4666-6026-7.ch023 ◽

2014 ◽

pp. 510-539 ◽

Cited By ~ 1

Author(s):

Golnoush Abaei ◽

Ali Selamat

Keyword(s):

Software Quality ◽

Software Metrics ◽

Prediction Models ◽

Research Field ◽

Verification And Validation ◽

Fault Prediction ◽

Machine Learning Techniques ◽

Software Fault Prediction ◽

Learning Techniques ◽

Software Fault

Quality assurance tasks such as testing, verification and validation, fault tolerance, and fault prediction play a major role in software engineering activities. Fault prediction approaches are used when a software company needs to deliver a finished product but has limited time and budget for testing it. In such cases, identifying and testing the parts of the system that are more defect-prone is reasonable. In fact, prediction models are mainly used for improving software quality and exploiting available resources. Software fault prediction is studied in this chapter based on different criteria that matter in this research field. Usually, there are certain issues that need to be taken care of, such as different machine-learning techniques, artificial intelligence classifiers, the variety of software metrics, distinctive performance evaluation metrics, and some statistical analysis. In this chapter, the authors present a roadmap for researchers who are interested in working in this area. They illustrate problems along with objectives related to each mentioned criterion, which could assist researchers in building the finest software fault prediction model.

Ensemble Techniques-Based Software Fault Prediction in an Open-Source Project

Research Anthology on Usage and Development of Open Source Software ◽

10.4018/978-1-7998-9158-1.ch036 ◽

2021 ◽

pp. 693-709

Author(s):

Wasiur Rhmann ◽

Gufran Ahmad Ansari

Keyword(s):

Machine Learning ◽

Open Source ◽

Software Testing ◽

Prediction Models ◽

Fault Prediction ◽

Machine Learning Techniques ◽

Data Repository ◽

Software Fault Prediction ◽

Ensemble Models ◽

Software Fault

Software engineering repositories have attracted researchers seeking to mine useful information about the different quality attributes of software. These repositories have helped software professionals allocate resources efficiently across the software development life cycle. Software fault prediction is a quality assurance activity. In fault prediction, software faults are predicted before actual software testing. As exhaustive software testing is impossible, software fault prediction models can help with the proper allocation of testing resources. Various machine learning techniques have been applied to create software fault prediction models. In this study, ensemble models are used for software fault prediction. Change metrics-based data are collected for an open-source Android project from a Git repository, and code-based metrics data are obtained from the PROMISE data repository; the datasets kc1, kc2, cm1, and pc1 are used for the experiments. Results showed that ensemble models performed better than machine learning and hybrid search-based algorithms. The bagging ensemble was found to be more effective in the prediction of faults than soft and hard voting.
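The difference between the hard and soft voting schemes mentioned in this abstract can be sketched as follows. This is a generic illustration and not the study's implementation: the function names, the 0.5 threshold, and the three classifier outputs are assumptions, and bagging is not shown.

```python
from collections import Counter
from typing import Sequence


def hard_vote(labels: Sequence[int]) -> int:
    """Return the label predicted by the most base classifiers."""
    return Counter(labels).most_common(1)[0][0]


def soft_vote(fault_probabilities: Sequence[float], threshold: float = 0.5) -> int:
    """Average the predicted fault probabilities, then apply a threshold."""
    mean_probability = sum(fault_probabilities) / len(fault_probabilities)
    return 1 if mean_probability >= threshold else 0


# Hypothetical fault probabilities for one module from three base classifiers.
probabilities = [0.9, 0.4, 0.45]
# Each classifier's own hard label (1 = fault-prone, 0 = not fault-prone).
labels = [1 if p >= 0.5 else 0 for p in probabilities]

hard_result = hard_vote(labels)         # two of three classifiers say 0
soft_result = soft_vote(probabilities)  # mean is about 0.58, so 1
```

In this made-up case the two schemes disagree: one confident classifier outweighs two marginal ones under soft voting, while hard voting counts each classifier equally.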

High Resolution CT and Bronchography in the Assessment of Bronchiectasis

Acta Radiologica ◽

10.1177/028418519103200601 ◽

1991 ◽

Vol 32 (6) ◽

pp. 439-441 ◽

Cited By ~ 31

Author(s):

K. Young ◽

F. Aspestrand ◽

A. Kolbenstvedt

Keyword(s):

Retrospective Study ◽

High Resolution ◽

False Positive ◽

False Positive Rate ◽

False Negative ◽

False Negative Rate ◽

High Resolution Ct ◽

Negative Rate ◽

Different Types ◽

Positive Rate

To elucidate the reliability of CT in the assessment of bronchiectasis, a retrospective study of high resolution CT and bronchography was carried out. A segment by segment comparison of 259 segmental bronchi from 70 lobes of 27 lungs in 19 patients was performed using bronchography as standard. CT was positive in 87 of 89 segmental bronchi with bronchiectasis giving a false-negative rate of 2%. CT was negative in 169 of 170 segmental bronchi without bronchiectasis at bronchography, giving a false-positive rate of 1%. There was agreement between the two modalities in identifying the different types of bronchiectasis.
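The two rates reported in this abstract can be reproduced from the stated counts, assuming the usual definitions with bronchography as the reference standard (false negatives over all segments with bronchiectasis, false positives over all segments without it). Note that 89 + 170 = 259, matching the total number of segmental bronchi.

```latex
\begin{align*}
\text{false-negative rate} &= \frac{89 - 87}{89} = \frac{2}{89} \approx 0.022 \approx 2\% \\
\text{false-positive rate} &= \frac{170 - 169}{170} = \frac{1}{170} \approx 0.006 \approx 1\%
\end{align*}
```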

Investigating Associative Classification for Software Fault Prediction: An Experimental Perspective

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s021819401450003x ◽

2014 ◽

Vol 24 (01) ◽

pp. 61-90 ◽

Cited By ~ 12

Author(s):

Baojun Ma ◽

Huaping Zhang ◽

Guoqing Chen ◽

Yanping Zhao ◽

Bart Baesens

Keyword(s):

Prediction Models ◽

Prediction Performance ◽

Fault Prediction ◽

Machine Learning Techniques ◽

Classification Methods ◽

Associative Classification ◽

Production Environment ◽

Software Fault Prediction ◽

Software Fault ◽

Real World Datasets

It is a recurrent finding that software development is often troubled by considerable delays as well as budget overruns. Several solutions have been proposed in answer to this observation, software fault prediction being a prime example. Drawing upon machine learning techniques, software fault prediction tries to identify upfront the software modules that are most likely to contain faults, thereby streamlining testing efforts and improving overall software quality. When deploying fault prediction models in a production environment, both prediction performance and model comprehensibility are typically taken into consideration, although the latter is commonly overlooked in the academic literature. Many classification methods have been suggested to conduct fault prediction, yet associative classification methods remain uninvestigated in this context. This paper proposes an associative classification (AC)-based fault prediction method, building upon the CBA2 algorithm. In an empirical comparison on 12 real-world datasets, the AC-based classifier is shown to achieve a predictive performance competitive with that of models induced by five other tree/rule-based classification techniques. In addition, our findings highlight the comprehensibility of the AC-based models, which achieve similar prediction performance. Furthermore, the possibilities of cross-project prediction are investigated, strengthening earlier findings on the feasibility of such an approach when insufficient data on the target project is available.