On the classification of bug reports to improve bug localization

Bug localization is one of the most expensive and time-consuming activities in software maintenance and evolution. To reduce developers' workload, numerous methods have been proposed to automate this process and narrow the set of files that must be reviewed. In this paper, we present a novel approach to localizing buggy source files that uses information from both the bug reports and the source files. We leverage the part-of-speech features of bug reports and the invocation relationships among source files. We also integrate an adaptive technique to further optimize the performance of the approach. The adaptive technique distinguishes between Top 1 and Top N recommendations for a given bug report and consists of two modules. One module maximizes the accuracy of the first recommended file, and the other aims to improve the accuracy of the recommended list of buggy files. We evaluate our approach on six large-scale open source projects, namely AspectJ, Eclipse, SWT, ZXing, Birt and Tomcat. Compared to previous work, empirical results show that our approach improves the overall prediction performance in all of these cases. In particular, our approach raises the Top 1 recommendation accuracy from 22.73% to 39.86% for AspectJ, from 24.36% to 30.76% for Eclipse, from 31.63% to 46.94% for SWT, from 40% to 55% for ZXing, from 7.97% to 21.99% for Birt, and from 33.37% to 38.90% for Tomcat.
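The Top 1 and Top N figures quoted above are usually computed as the fraction of bug reports for which at least one actually fixed file appears among the first N recommendations. The following is a minimal sketch under that assumption; the function name, the bug identifiers, and the file names are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set


def top_n_accuracy(rankings: Dict[str, List[str]],
                   fixed_files: Dict[str, Set[str]],
                   n: int) -> float:
    """Fraction of bug reports with at least one fixed file in the top n."""
    hits = sum(
        1 for bug_id, ranked in rankings.items()
        if fixed_files[bug_id] & set(ranked[:n])
    )
    return hits / len(rankings)


# Hypothetical ranked recommendations and ground truth for two bug reports.
rankings = {
    "bug-1": ["A.java", "B.java", "C.java"],
    "bug-2": ["D.java", "A.java", "E.java"],
}
fixed_files = {"bug-1": {"A.java"}, "bug-2": {"E.java"}}

top1 = top_n_accuracy(rankings, fixed_files, 1)  # only bug-1 is hit at rank 1
top3 = top_n_accuracy(rankings, fixed_files, 3)  # both reports are hit within 3
```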

On the Value of Bug Reports for Retrieval-Based Bug Localization

2018 IEEE International Conference on Software Maintenance and Evolution (ICSME) ◽

10.1109/icsme.2018.00048 ◽

2018 ◽

Author(s):

Dawn Lawrie ◽

Dave Binkley

Keyword(s):

Bug Localization ◽

Bug Reports

Adaptive Ranking Relevant Source Files for Bug Reports Using Genetic Algorithm

10.3233/faia210042 ◽

2021 ◽

Author(s):

Thi Mai Anh Bui ◽

Nhat Hai Nguyen

Keyword(s):

Genetic Algorithm ◽

Software Maintenance ◽

Large Scale ◽

Maintenance Phase ◽

Bug Localization ◽

Software Projects ◽

Source File ◽

Bug Reports ◽

Localization Model ◽

Bug Report

Precisely locating buggy files for a given bug report is a cumbersome and time-consuming task, particularly in a large-scale project with thousands of source files and bug reports. An efficient bug localization module is desirable to improve productivity in the software maintenance phase. Many previous approaches rank source files according to their relevance to a given bug report based on simple lexical matching scores. However, lexical mismatches between the natural language used in bug reports and the technical terms in source code can reduce a bug localization system's accuracy. Incorporating domain knowledge through features such as semantic similarity, the fixing frequency of a source file, the code change history, and similar bug reports is crucial to locating buggy files efficiently. In this paper, we propose a bug localization model, BugLocGA, that leverages both lexical and semantic information and explores the relation between a bug report and a source file through domain features. Given a bug report, we calculate a ranking score for each source file as a weighted sum of all features, where the weights are trained by a genetic algorithm to maximize the performance of the bug localization model on two evaluation metrics: mean reciprocal rank (MRR) and mean average precision (MAP). Empirical results on widely used open source software projects show that our model outperforms several state-of-the-art approaches by effectively recommending the relevant files where the bug should be fixed.
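As an illustration of the scoring and evaluation described above, the sketch below ranks files by a weighted sum of feature values and computes the per-report quantities behind MRR and MAP (reciprocal rank and average precision, which are then averaged over all bug reports). The feature values, weights, and function names are hypothetical; the genetic algorithm that would search for the weights is not shown, and nothing here is taken from the BugLocGA implementation.

```python
from typing import Dict, List, Sequence, Set


def rank_files(features: Dict[str, Sequence[float]],
               weights: Sequence[float]) -> List[str]:
    """Order files by the weighted sum of their feature values, best first."""
    scores = {
        name: sum(w * x for w, x in zip(weights, values))
        for name, values in features.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


def reciprocal_rank(ranked: List[str], relevant: Set[str]) -> float:
    """1 / position of the first relevant file, or 0 if none is ranked."""
    for position, name in enumerate(ranked, start=1):
        if name in relevant:
            return 1.0 / position
    return 0.0


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Mean of precision@k taken at each position k holding a relevant file."""
    hits, total = 0, 0.0
    for position, name in enumerate(ranked, start=1):
        if name in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0


# Hypothetical two-feature vectors (e.g. lexical and semantic similarity).
features = {
    "A.java": (0.2, 0.9),
    "B.java": (0.8, 0.1),
    "C.java": (0.5, 0.5),
}
weights = (0.3, 0.7)  # candidate weights a genetic algorithm might propose

ranked = rank_files(features, weights)
rr = reciprocal_rank(ranked, {"C.java"})
ap = average_precision(ranked, {"A.java", "B.java"})
```

A fitness function for the weight search could average these two quantities over a training set of bug reports; how the two metrics are combined is a design choice the abstract does not specify.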

On the relationship between bug reports and queries for text retrieval-based bug localization

Empirical Software Engineering ◽

10.1007/s10664-020-09823-w ◽

2020 ◽

Vol 25 (5) ◽

pp. 3086-3127

Author(s):

Chris Mills ◽

Esteban Parra ◽

Jevgenija Pantiuchina ◽

Gabriele Bavota ◽

Sonia Haiduc

Keyword(s):

Text Retrieval ◽

Bug Localization ◽

Bug Reports ◽

The Relationship

A Commit Messages-Based Bug Localization for Android Applications

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194019500207 ◽

2019 ◽

Vol 29 (04) ◽

pp. 457-487 ◽

Cited By ~ 1

Author(s):

Tao Zhang ◽

Wenjun Hu ◽

Xiapu Luo ◽

Xiaobo Ma

Keyword(s):

Open Source ◽

Software Maintenance ◽

State Of The Art ◽

Bug Localization ◽

Two Phase ◽

Android Apps ◽

Bug Reports ◽

Android Applications ◽

The Given ◽

General Method

Recently, there has been consistent growth in Android applications (apps). As a result, software maintenance for Android apps has become an essential task. The core of software maintenance is locating bugs in source files. Previous bug localization approaches mainly focus on open-source desktop software (e.g. Eclipse, Mozilla, GCC). Although a few studies locate bugs in Android apps, they are dedicated to a single app, ZXing, and do not develop a general method that takes into account the unique characteristics of Android apps' bug reports. Such characteristics include a smaller number of historical bug reports, insufficiently detailed descriptions, etc. These characteristics prevent existing localization approaches from being applied directly to Android apps, because the lack of information degrades the performance of approaches that rely on historical bug reports. Commit messages contain more informative data that can provide details of reported bugs. Therefore, in this paper, we propose a novel information retrieval-based approach that uses commit messages to locate new bugs in Android apps. This approach not only considers the structured textual similarity between the given bug and the candidate source files, but also computes the unstructured textual similarities between the new bug and the commit messages linked to the corresponding source files. According to experimental results on 10 popular open-source Android apps hosted on GitHub, our approach outperforms state-of-the-art bug localization methods, including BugLocator, BLUiR, and the two-phase model.
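The idea of scoring a source file by both its own text and the commit messages linked to it can be sketched as follows. This is a simplified illustration, not the paper's method: it uses plain bag-of-words cosine similarity for both parts (the paper's structured similarity is not reproduced), takes the best-matching commit message, and blends the two scores with a hypothetical weight `alpha`.

```python
import math
from collections import Counter
from typing import Iterable


def cosine(a: str, b: str) -> float:
    """Cosine similarity between whitespace-tokenized term-count vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(count * vb[term] for term, count in va.items())
    norm = (math.sqrt(sum(c * c for c in va.values()))
            * math.sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0


def combined_score(report: str, source_text: str,
                   commit_messages: Iterable[str],
                   alpha: float = 0.5) -> float:
    """Blend file similarity with the best commit-message similarity.

    alpha is an assumed mixing weight, not a value from the paper.
    """
    file_sim = cosine(report, source_text)
    commit_sim = max((cosine(report, m) for m in commit_messages), default=0.0)
    return alpha * file_sim + (1 - alpha) * commit_sim


# Hypothetical report, file text, and linked commit messages.
score = combined_score(
    "crash on scan",
    "class Scanner decode image",
    ["fix crash on scan", "update readme"],
)
```

In this toy case the file text shares no terms with the report, so the whole score comes from the commit message, which is the situation the abstract describes for sparsely documented Android bugs.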

An HMM-based approach for automatic detection and classification of duplicate bug reports

Information and Software Technology ◽

10.1016/j.infsof.2019.05.007 ◽

2019 ◽

Vol 113 ◽

pp. 98-109 ◽

Cited By ~ 11

Author(s):

Neda Ebrahimi ◽

Abdelaziz Trabelsi ◽

Md. Shariful Islam ◽

Abdelwahab Hamou-Lhadj ◽

Kobra Khanmohammadi

Keyword(s):

Automatic Detection ◽

Bug Reports ◽

Duplicate Bug Reports

Predicting Bug Priority Using Topic Modelling in Imbalanced Learning Environments

International Journal of Systems and Service-Oriented Engineering ◽

10.4018/ijssoe.2021010103 ◽

2021 ◽

Vol 11 (1) ◽

pp. 31-42

Author(s):

Jayalath Bandara Ekanayake

Keyword(s):

Logistic Regression ◽

Decision Tree ◽

Prediction Models ◽

Naïve Bayes ◽

Potential Candidate ◽

Priority Level ◽

Bug Reports ◽

Manual Classification

Manual classification of bug reports is time-consuming because the reports are received in large quantities. As an alternative, this project proposes automatic prediction models to classify bug reports. The topics, or candidate keywords, are mined from the developer description in bug reports using the RAKE algorithm and converted into attributes. These attributes, together with the target attribute (priority level), form the training datasets. Naïve Bayes, logistic regression, and decision tree learner algorithms are trained, and the prediction quality is measured using the area under the receiver operating characteristic curve (AUC), as AUC is not affected by bias in the datasets. The logistic regression model outperforms the other two models with an AUC of 0.86, whereas the naïve Bayes and the decision tree learner recorded AUCs of 0.79 and 0.81, respectively. The bugs can be classified without developer involvement, and logistic regression, like naïve Bayes, is a potential candidate for bug classification.
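For readers unfamiliar with the metric, AUC for a binary task equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that definition; the binary framing (for example, high priority versus the rest), the labels, and the scores are assumptions for illustration and are not data from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = high priority) and classifier scores.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
auc = roc_auc(labels, scores)  # 3 of the 4 positive/negative pairs are ordered correctly
```

Because the value depends only on how positives rank against negatives, it does not change when one class is much larger than the other, which is why it is often preferred over plain accuracy on imbalanced priority data.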

Automated Classification of Software Bug Reports

Proceedings of the 9th International Conference on Information Communication and Management - ICICM 2019 ◽

10.1145/3357419.3357424 ◽

2019 ◽

Author(s):

Ahmed Fawzi Otoom ◽

Sara Al-jdaeh ◽

Maen Hammad

Keyword(s):

Automated Classification ◽

Bug Reports ◽

Software Bug

Empirical Analysis and Automated Classification of Security Bug Reports

10.33915/etd.6843 ◽

2016 ◽

Author(s):

Jacob P. Tyo

Keyword(s):

Empirical Analysis ◽

Automated Classification ◽

Bug Reports

Bug Reports and Deep Learning Models

International Journal of Computer Science and Mobile Computing ◽

10.47760/ijcsmc.2021.v10i12.003 ◽

2021 ◽

Vol 10 (12) ◽

pp. 21-26

Author(s):

Som Gupta ◽

Sanjai Kumar Gupta

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Research Area ◽

Learning Approaches ◽

Bug Localization ◽

Future Directions ◽

Bug Reports ◽

Bug Report ◽

The Future

Deep learning is one of the emerging and trending research areas of machine learning across various domains. The paper describes the deep learning approaches applied to the domain of bug reports. It classifies the tasks performed when mining bug reports into Bug Report Classification, Bug Localization, Bug Report Summarization and Duplicate Bug Report Detection. The paper systematically discusses the deep learning approaches used for these tasks and the future directions in this field of research.

On the classification of bug reports to improve bug localization

Augmenting Bug Localization with Part-of-Speech and Invocation

On the Value of Bug Reports for Retrieval-Based Bug Localization

Adaptive Ranking Relevant Source Files for Bug Reports Using Genetic Algorithm

On the relationship between bug reports and queries for text retrieval-based bug localization

A Commit Messages-Based Bug Localization for Android Applications

An HMM-based approach for automatic detection and classification of duplicate bug reports

Predicting Bug Priority Using Topic Modelling in Imbalanced Learning Environments

Automated Classification of Software Bug Reports

Empirical Analysis and Automated Classification of Security Bug Reports

Bug Reports and Deep Learning Models
