Bayesian Software Prediction Models. Volume 1. An Imperfect Debugging Model for Reliability and other Quantitative Measures of Software Systems

The information about which modules of a future version of a software system will be defect-prone is a valuable planning aid for quality managers and testers. Defect prediction promises to indicate these defect-prone modules. In this chapter, building a defect prediction model from data is characterized as an instance of a data-mining task, and key questions and consequences arising when establishing defect prediction in a large software development project are discussed. Special emphasis is put on discussions on how to choose a learning algorithm, select features from different data sources, deal with noise and data quality issues, as well as model evaluation for evolving systems. These discussions are accompanied by insights and experiences gained by projects on data mining and defect prediction in the context of large software systems conducted by the authors over the last couple of years. One of these projects has been selected to serve as an illustrative use case throughout the chapter.

Download Full-text

METRIC SELECTION FOR SOFTWARE DEFECT PREDICTION

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194011005256 ◽

2011 ◽

Vol 21 (02) ◽

pp. 237-257 ◽

Cited By ~ 15

Author(s):

HUANJING WANG ◽

TAGHI M. KHOSHGOFTAAR ◽

JASON VAN HULSE ◽

KEHAN GAO

Keyword(s):

Empirical Study ◽

Prediction Models ◽

Characteristic Curve ◽

Life Cycles ◽

Defect Prediction ◽

Software Systems ◽

Software Projects ◽

Ensemble Technique ◽

Defect Prediction Models ◽

Software Practitioners

Real-world software systems are becoming larger, more complex, and much more unpredictable. Software systems face many risks in their life cycles. Software practitioners strive to improve software quality by constructing defect prediction models using metric (feature) selection techniques. Finding faulty components in a software system can lead to a more reliable final system and reduce development and maintenance costs. This paper presents an empirical study of six commonly used filter-based software metric rankers and our proposed ensemble technique using rank ordering of the features (mean or median), applied to three large software projects using five commonly used learners. The classification accuracy was evaluated in terms of the AUC (Area Under the ROC (Receiver Operating Characteristic) Curve) performance metric. Results demonstrate that the ensemble technique performed better overall than any individual ranker and also possessed better robustness. The empirical study also shows that variations among rankers, learners and software projects significantly impacted the classification outcomes, and that the ensemble method can smooth out performance.

Download Full-text

A Comparative Analysis of Filter-Based Feature Selection Methods for Software Fault Prediction

Research and Development on Information and Communication Technology ◽

10.32913/mic-ict-research-vn.v2021.n1.969 ◽

2021 ◽

pp. 1-7

Author(s):

Thị Minh Phương Hà ◽

Thi My Hanh Le ◽

Thanh Binh Nguyen

Keyword(s):

Feature Selection ◽

Prediction Models ◽

Information Gain ◽

Feature Selection Method ◽

Computation Time ◽

Fault Prediction ◽

Software Systems ◽

Selection Methods ◽

Software Fault Prediction

The rapid growth of data has become a huge challenge for software systems. The quality of fault predictionmodel depends on the quality of software dataset. High-dimensional data is the major problem that affects the performance of the fault prediction models. In order to deal with dimensionality problem, feature selection is proposed by various researchers. Feature selection method provides an effective solution by eliminating irrelevant and redundant features, reducing computation time and improving the accuracy of the machine learning model. In this study, we focus on research and synthesis of the Filter-based feature selection with several search methods and algorithms. In addition, five filter-based feature selection methods are analyzed using five different classifiers over datasets obtained from National Aeronautics and Space Administration (NASA) repository. The experimental results show that Chi-Square and Information Gain methods had the best influence on the results of predictive models over other filter ranking methods.

Download Full-text

Building Defect Prediction Models in Practice

Computer Systems and Software Engineering ◽

10.4018/978-1-5225-3923-0.ch014 ◽

2017 ◽

pp. 324-350

Author(s):

Rudolf Ramler ◽

Johannes Himmelbauer ◽

Thomas Natschläger

Keyword(s):

Data Mining ◽

Prediction Models ◽

Learning Algorithm ◽

Development Project ◽

Defect Prediction ◽

Software Systems ◽

Software Development Project ◽

Future Version ◽

Large Software ◽

Defect Prediction Models

The information about which modules of a future version of a software system will be defect-prone is a valuable planning aid for quality managers and testers. Defect prediction promises to indicate these defect-prone modules. In this chapter, building a defect prediction model from data is characterized as an instance of a data-mining task, and key questions and consequences arising when establishing defect prediction in a large software development project are discussed. Special emphasis is put on discussions on how to choose a learning algorithm, select features from different data sources, deal with noise and data quality issues, as well as model evaluation for evolving systems. These discussions are accompanied by insights and experiences gained by projects on data mining and defect prediction in the context of large software systems conducted by the authors over the last couple of years. One of these projects has been selected to serve as an illustrative use case throughout the chapter.

Download Full-text

Imperfect Debugging in Software Systems by SRGM with TEF

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.38805 ◽

2021 ◽

Vol 9 (11) ◽

pp. 423-429

Author(s):

K. Swetha

Keyword(s):

Optimal Policy ◽

Fault Identification ◽

Software Systems ◽

Important Process ◽

High Quality ◽

Imperfect Debugging ◽

Identification Rate

Abstract: In the Proposed work we are going to assimilate two important process called TEF and imperfect debugging in software systems for analyzing FDP and FCP. Byapplying the tools called debuggers we are going to identify the failures and going to correct them in order to attain the high quality reliability. As we know, testingeffort function is predicted during this time by allocating the resources which influences considerably only for the fault identification rate and also for the correction of such faults. Additionally, new faults may be included for evaluating as the feedback. In this technique, first it is proposed to demonstrate for the inclusion of TEF and fault introduction into FDP and later develop FCP as delayedFDP with a proper effort for correction. The FCP as well FCP as paired specific models which are extracted based on the basis of types of assumptions of introducing fault introduction as well as correction effort. In addition, the optimal policy of software releasefor different criteria with examples was also presentedin this work. Keywords: FDP, FCP, TEF, Fault

Download Full-text

Bayesian Software Prediction Models. Volume III. Availability Analysis of Software Systems Under Imperfect Maintenance.

10.21236/ada057872 ◽

1978 ◽

Author(s):

K. Okumoto ◽

Amrit L. Goel

Keyword(s):

Prediction Models ◽

Software Systems ◽

Imperfect Maintenance ◽

Availability Analysis

Download Full-text

Hybrid Representation to Locate Vulnerable Lines of Code

International Journal of Software Innovation ◽

10.4018/ijsi.292020 ◽

2022 ◽

Vol 10 (1) ◽

pp. 0-0

Keyword(s):

Machine Learning ◽

Deep Learning ◽

High Precision ◽

Software Metrics ◽

Prediction Models ◽

Software Systems ◽

Vulnerability Prediction ◽

Large Software

Locating vulnerable lines of code in large software systems needs huge efforts from human experts. This explains the high costs in terms of budget and time needed to correct vulnerabilities. To minimize these costs, automatic solutions of vulnerabilities prediction have been proposed. Existing machine learning (ML)-based solutions face difficulties in predicting vulnerabilities in coarse granularity and in defining suitable code features that limit their effectiveness. To addressee these limitations, in the present work, the authors propose an improved ML-based approach using slice-based code representation and the technique of TF-IDF to automatically extract effective features. The obtained results showed that combining these two techniques with ML techniques allows building effective vulnerability prediction models (VPMs) that locate vulnerabilities in a finer granularity and with excellent performances (high precision (>98%), low FNR (<2%) and low FPR (<3%) which outperforms software metrics and are equivalent to the best performing recent deep learning-based approaches.

Download Full-text

Towards more flexible management of software systems development using meta-models

Software Engineering Journal ◽

10.1049/sej.1995.0011 ◽

1995 ◽

Vol 10 (3) ◽

pp. 79 ◽

Cited By ~ 4

Author(s):

C.W. Dawson ◽

R.J. Dawson

Keyword(s):

Systems Development ◽

Software Systems ◽

Flexible Management

Download Full-text

4 Pattern Recognition for Healing Burns to Compute Evidence: Space Syntax and Machine Learning Analysis of Burns Center Karachi

Sir Syed Research Journal of Engineering & Technology ◽

10.33317/ssurj.v1i1.64 ◽

2013 ◽

Vol 1 (1) ◽

pp. 13

Author(s):

Javaria Manzoor Shaikh ◽

JaeSeung Park

Keyword(s):

Pattern Recognition ◽

Prediction Models ◽

Uv Light ◽

Healing Process ◽

Space Syntax ◽

Depth Map ◽

Routine Tasks ◽

Hyper Plane ◽

First Time ◽

Learning Analysis

Usually elongated hospitalization is experienced byBurn patients, and the precise forecast of the placement of patientaccording to the healing acceleration has significant consequenceon healthcare supply administration. Substantial amount ofevidence suggest that sun light is essential to burns healing andcould be exceptionally beneficial for burned patients andworkforce in healthcare building. Satisfactory UV sunlight isfundamental for a calculated amount of burn to heal; this delicaterather complex matrix is achieved by applying patternclassification for the first time on the space syntax map of the floorplan and Browder chart of the burned patient. On the basis of thedata determined from this specific healthcare learning technique,nurse can decide the location of the patient on the floor plan, hencepatient safety first is the priority in the routine tasks by staff inhealthcare settings. Whereas insufficient UV light and vitamin Dcan retard healing process, hence this experiment focuses onmachine learning design in which pattern recognition andtechnology supports patient safety as our primary goal. In thisexperiment we lowered the adverse events from 2012- 2013, andnearly missed errors and prevented medical deaths up to 50%lower, as compared to the data of 2005- 2012 before this techniquewas incorporated.In this research paper, three distinctive phases of clinicalsituations are considered—primarily: admission, secondly: acute,and tertiary: post-treatment according to the burn pattern andhealing rate—and be validated by capable AI- origin forecastingtechniques to hypothesis placement prediction models for eachclinical stage with varying percentage of burn i.e. superficialwound, partial thickness or full thickness deep burn. Conclusivelywe proved that the depth of burn is directly proportionate to thedepth of patient’s placement in terms of window distance. Ourfindings support the hypothesis that the windowed wall is mosthealing wall, here fundamental suggestion is support vectormachines: which is most advantageous hyper plane for linearlydivisible patterns for the burns depth as well as the depth map isused.

Download Full-text