On the classification of bug reports to improve bug localization

2021 ◽  
Author(s):  
Fan Fang ◽  
John Wu ◽  
Yanyan Li ◽  
Xin Ye ◽  
Wajdi Aljedaani ◽  
...  
Keyword(s):  
Author(s):  
Yu Zhou ◽  
Yanxiang Tong ◽  
Taolue Chen ◽  
Jin Han

Bug localization represents one of the most expensive, as well as time-consuming, activities during software maintenance and evolution. To alleviate the workload of developers, numerous methods have been proposed to automate this process and narrow down the scope of reviewing buggy files. In this paper, we present a novel buggy source-file localization approach, using the information from both the bug reports and the source files. We leverage the part-of-speech features of bug reports and the invocation relationship among source files. We also integrate an adaptive technique to further optimize the performance of the approach. The adaptive technique discriminates Top 1 and Top N recommendations for a given bug report and consists of two modules. One module is to maximize the accuracy of the first recommended file, and the other one aims at improving the accuracy of the fixed defect file list. We evaluate our approach on six large-scale open source projects, i.e. ASpectJ, Eclipse, SWT, Zxing, Birt and Tomcat. Compared to the previous work, empirical results show that our approach can improve the overall prediction performance in all of these cases. Particularly, in terms of the Top 1 recommendation accuracy, our approach achieves an enhancement from 22.73% to 39.86% for ASpectJ, from 24.36% to 30.76% for Eclipse, from 31.63% to 46.94% for SWT, from 40% to 55% for ZXing, from 7.97% to 21.99% for Birt, and from 33.37% to 38.90% for Tomcat.


2021 ◽  
Author(s):  
Thi Mai Anh Bui ◽  
Nhat Hai Nguyen

Precisely locating buggy files for a given bug report is a cumbersome and time-consuming task, particularly in a large-scale project with thousands of source files and bug reports. An efficient bug localization module is desirable to improve the productivity of the software maintenance phase. Many previous approaches rank source files according to their relevance to a given bug report based on simple lexical matching scores. However, the lexical mismatches between natural language expressions used to describe bug reports and technical terms of software source code might reduce the bug localization system’s accuracy. Incorporating domain knowledge through some features such as the semantic similarity, the fixing frequency of a source file, the code change history and similar bug reports is crucial to efficiently locating buggy files. In this paper, we propose a bug localization model, BugLocGA that leverages both lexical and semantic information as well as explores the relation between a bug report and a source file through some domain features. Given a bug report, we calculate the ranking score with every source files through a weighted sum of all features, where the weights are trained through a genetic algorithm with the aim of maximizing the performance of the bug localization model using two evaluation metrics: mean reciprocal rank (MRR) and mean average precision (MAP). The empirical results conducted on some widely-used open source software projects have showed that our model outperformed some state of the art approaches by effectively recommending relevant files where the bug should be fixed.


2020 ◽  
Vol 25 (5) ◽  
pp. 3086-3127
Author(s):  
Chris Mills ◽  
Esteban Parra ◽  
Jevgenija Pantiuchina ◽  
Gabriele Bavota ◽  
Sonia Haiduc

Author(s):  
Tao Zhang ◽  
Wenjun Hu ◽  
Xiapu Luo ◽  
Xiaobo Ma

Recently, there has been consistent growth in Android applications (apps). Under these circumstances, software maintenance for Android apps becomes an essential and important task. The core of software maintenance is to locate bugs in source files. Previous bug localization approaches mainly focus on open-source desktop software (e.g. Eclipse, Mozilla, GCC). Even though a few studies locate the bugs in the Android apps, they are dedicated to a special app named ZXing, without developing a general method to locate the bugs in Android apps by taking into account the unique characteristics of Android apps’ bug reports. Such characteristics include fewer number of historical bug reports, insufficient detailed description, etc. These characteristics hinder existing localization approaches from being directly delivered to Android apps, because lack of enough information degrades the performance of those localization approaches relying on historical bug reports. Commit messages include more informative data which can provide the details of reported bugs. Therefore, in this paper, we propose a novel information retrieval-based approach which utilizes commit messages to locate new bugs in Android apps. This approach not only considers the structured textual similarity between the given bug and the candidate source files, but also computes the unstructured textual similarities between the new bug and the commit messages linked to the corresponding source files. According to the experimental results on 10 popular open-source Android apps managed by GitHub, our approach outperforms the state-of-the-art bug localization methods that include BugLocator, BLUiR, and two-phase model.


2019 ◽  
Vol 113 ◽  
pp. 98-109 ◽  
Author(s):  
Neda Ebrahimi ◽  
Abdelaziz Trabelsi ◽  
Md. Shariful Islam ◽  
Abdelwahab Hamou-Lhadj ◽  
Kobra Khanmohammadi

Author(s):  
Jayalath Bandara Ekanayake

Manual classification of bug reports is time-consuming as the reports are received in large quantities. Alternatively, this project proposed automatic bug prediction models to classify the bug reports. The topics or the candidate keywords are mined from the developer description in bug reports using RAKE algorithm and converted into attributes. These attributes together with the target attribute—priority level—construct the training datasets. Naïve Bayes, logistic regression, and decision tree learner algorithms are trained, and the prediction quality was measured using area under recursive operative characteristics curves (AUC) as AUC does not consider the biasness in datasets. The logistics regression model outperforms the other two models providing the accuracy of 0.86 AUC whereas the naïve Bayes and the decision tree learner recorded 0.79 AUC and 0.81 AUC, respectively. The bugs can be classified without developer involvement and logistic regression is also a potential candidate as naïve Bayes for bug classification.


Author(s):  
Som Gupta ◽  
Sanjai Kumar Gupta

Deep Learning is one of the emerging and trending research area of machine learning in various domains. The paper describes the deep learning approaches applied to the domain of Bug Reports. The paper classifies the tasks being performed for mining of Bug Reports into Bug Report Classification, Bug Localization, Bug Report Summarization and Duplicate Bug Report Detection. The paper systematically discusses about the deep learning approaches being used for the mentioned tasks, and the future directions in this field of research.


Sign in / Sign up

Export Citation Format

Share Document