METRIC SELECTION FOR SOFTWARE DEFECT PREDICTION

Author(s):  
HUANJING WANG ◽  
TAGHI M. KHOSHGOFTAAR ◽  
JASON VAN HULSE ◽  
KEHAN GAO

Real-world software systems are becoming larger, more complex, and much more unpredictable, and they face many risks throughout their life cycles. Software practitioners strive to improve software quality by constructing defect prediction models using metric (feature) selection techniques. Finding faulty components in a software system can lead to a more reliable final system and reduce development and maintenance costs. This paper presents an empirical study of six commonly used filter-based software metric rankers and our proposed ensemble technique, which is based on the mean or median rank ordering of the features, applied to three large software projects using five commonly used learners. Classification performance was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC). Results demonstrate that the ensemble technique performed better overall than any individual ranker and was also more robust. The empirical study also shows that variations among rankers, learners, and software projects significantly affected the classification outcomes, and that the ensemble method can smooth out performance.
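The rank-ordering ensemble mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the tie-breaking rule, the metric names, and the scores below are all hypothetical. It assumes only that each filter-based ranker assigns a score to every software metric and that the ensemble orders metrics by their mean or median rank across rankers.

```python
from statistics import mean, median


def ranks_from_scores(scores):
    """Convert a {metric: score} table into {metric: rank}, rank 1 = best.

    Assumption: higher score means more relevant; ties are broken by
    metric name so that the result is deterministic.
    """
    ordered = sorted(scores, key=lambda m: (-scores[m], m))
    return {m: position + 1 for position, m in enumerate(ordered)}


def ensemble_ranking(score_tables, combine=mean):
    """Order metrics by their combined (mean or median) rank across rankers."""
    per_ranker = [ranks_from_scores(table) for table in score_tables]
    metrics = per_ranker[0].keys()
    combined = {m: combine([ranks[m] for ranks in per_ranker]) for m in metrics}
    # A smaller combined rank is better; ties fall back to the metric name.
    return sorted(combined, key=lambda m: (combined[m], m))


# Hypothetical scores from three rankers over four software metrics.
ranker_a = {"loc": 0.9, "cyclomatic": 0.7, "fan_out": 0.2, "comments": 0.1}
ranker_b = {"loc": 0.5, "cyclomatic": 0.8, "fan_out": 0.3, "comments": 0.1}
ranker_c = {"loc": 0.6, "cyclomatic": 0.4, "fan_out": 0.5, "comments": 0.2}

tables = [ranker_a, ranker_b, ranker_c]
by_mean = ensemble_ranking(tables, combine=mean)
by_median = ensemble_ranking(tables, combine=median)
print(by_mean)
print(by_median)
```

In this toy input the three rankers disagree about the ordering of the middle metrics, yet the mean and median combinations agree on a single ordering, which is the kind of smoothing effect the abstract describes. A learner would then be trained on the top-ranked metrics from the combined list.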

Author(s):  
Jirayus Jiarpakdee ◽  
Chakkrit Tantithamthavorn ◽  
Hoa Khanh Dam ◽  
John Grundy

Author(s):  
Rudolf Ramler ◽  
Johannes Himmelbauer ◽  
Thomas Natschläger

Knowing which modules of a future version of a software system will be defect-prone is a valuable planning aid for quality managers and testers. Defect prediction promises to indicate these defect-prone modules. In this chapter, building a defect prediction model from data is characterized as an instance of a data-mining task, and key questions and consequences arising when establishing defect prediction in a large software development project are discussed. Special emphasis is placed on how to choose a learning algorithm, select features from different data sources, deal with noise and data quality issues, and evaluate models for evolving systems. These discussions are accompanied by insights and experiences gained from projects on data mining and defect prediction in the context of large software systems conducted by the authors over the last couple of years. One of these projects has been selected to serve as an illustrative use case throughout the chapter.


2020 ◽  
Vol 25 (6) ◽  
pp. 5047-5083
Author(s):  
Abdul Ali Bangash ◽  
Hareem Sahar ◽  
Abram Hindle ◽  
Karim Ali
