Mining Bug Report Repositories to Identify Significant Information for Software Bug Fixing

Author(s):  
Bancha Luaphol ◽  
Jantima Polpinij ◽  
Manasawee Kaenampornpan

Most studies of bug reports aim to automatically identify the information needed for software bug fixing. Unfortunately, such studies typically focus on a single issue, whereas more complete and comprehensive bug fixing would be facilitated by assessing multiple issues concurrently. This study addresses that challenge by presenting a method that identifies severe bug reports in a bug report repository and assembles their related bug reports to visualize the overall picture of a software problem domain. The proposed method is called “mining bug report repositories”. Two text mining techniques serve as its main mechanisms. First, classification is applied to identify severe bug reports (“bug severity classification”); then “threshold-based similarity analysis” is applied to assemble the bug reports related to each severe bug report. Our datasets come from three open-source projects, namely SeaMonkey, Firefox, and Core:Layout, downloaded from Bugzilla. Finally, the best models from the proposed method are selected and compared with two baseline methods. For identifying severe bug reports with classification, the results show that our method improved accuracy, F1, and AUC scores over the baseline by 11.39%, 11.63%, and 19%, respectively. For assembling related bug reports with threshold-based similarity analysis, our method improved precision and likelihood scores over the other baseline by 15.76% and 9.14%, respectively. This demonstrates that the proposed method may help increase the chance of fixing bugs completely.
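To make the idea of threshold-based similarity analysis concrete, here is a minimal sketch, not the authors' implementation. It assumes plain term-frequency vectors, cosine similarity, and a cutoff of 0.5; the abstract does not state the similarity measure or the threshold used, and the names `tokenize`, `cosine_similarity`, and `related_reports` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def related_reports(severe_report, candidates, threshold=0.5):
    """Return (report_id, score) pairs whose similarity meets the threshold.

    `candidates` maps a report id to its text. The threshold value is an
    assumption made for illustration only.
    """
    query = Counter(tokenize(severe_report))
    hits = []
    for report_id, text in candidates.items():
        score = cosine_similarity(query, Counter(tokenize(text)))
        if score >= threshold:
            hits.append((report_id, score))
    # Most similar reports first.
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Every report that clears the threshold is kept, rather than a fixed top-k, so the size of the assembled group depends on how many reports really resemble the severe one.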

2014 ◽  
Vol 12 (8) ◽  
pp. 3823-3828
Author(s):  
Madhu Kumari ◽  
Meera Sharma ◽  
Nikita Yadav

Prediction of the bug fix time in open source software is a challenging job. A software bug has many attributes that define its characteristics. Some attributes are filled in when the bug is reported and some when it is fixed. In this paper, 836 bug reports from two products of the Mozilla open source project, namely Thunderbird and Webtools, are considered. In these bug reports, we see no linear relationship among the bug attributes bug fix time, developers, cc count, and severity. This paper analyzes the interdependence among these attributes through graphical representation. The results are as follows. Case 1: 73% of the bugs reported for Webtools are fixed by 17% of developers, and 61% of the bugs are fixed by 14% of developers for Thunderbird. Case 2: We tried to find a relationship between the time taken to fix a bug and the developer who fixed it. We also observed significant variation in the bug fixing process: bugs may take from 1 day to 4 years to fix. Case 3: There is no linear relationship between cc count (i.e., the manpower involved in the bug fixing process) and bug fix time. Case 4: The maximum number of developers are involved in fixing bugs of the major severity class.
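The abstract examines linearity graphically. One numerical complement, shown here only as an illustration and not as the authors' method, is the Pearson correlation coefficient between two attributes such as cc count and bug fix time. A value near 0 is consistent with no linear relationship. The function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

A coefficient close to 0 rules out only a linear dependence; the attributes may still be related in a non-linear way, which is one reason to inspect plots as well.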


Author(s):  
B. Luaphol ◽  
J. Polpinij ◽  
M. Kaneampornpan

Bug reports contain essential information for fixing problems that occur in software. Many studies have proposed methods for the automatic analysis of bug reports. One issue that can affect the completion of software bug fixing is known as “bug dependency”. Although this problem has been mentioned by many researchers, most of them discussed related bugs without really dealing with the dependency issue in bug reports. One possible solution is to assemble all relevant/dependent bug reports together before the subsequent processing stages. This study presents a method of assembling dependent bug reports. The main mechanism is called “threshold-based similarity analysis”, and three similarity techniques, cosine similarity (CS), multi aspect TF (MATF), and BM25, are compared in terms of feedback, precision, and likelihood values. As BM25 with a threshold of 0.5 gives the best results, it was compared with the state-of-the-art method. The results show that our method increases precision and likelihood values by 12% and 12.4%, respectively. Therefore, our results can be used to encourage developers to recognize all dependent bugs in the same problem domain.
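A rough sketch of BM25 scoring combined with a threshold may help here. Raw BM25 scores are unbounded, and the abstract does not say how the 0.5 threshold is applied to them, so this sketch assumes the scores are divided by the highest score in the candidate set before thresholding. The parameters `k1=1.5` and `b=0.75` are common defaults, not values taken from the paper, and the function names are hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    if not docs_tokens:
        return []
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs_tokens for term in set(d))
    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        score = 0.0
        for term in set(query_tokens):
            if term not in tf:
                continue
            # Smoothed IDF variant that stays non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def dependent_reports(query_tokens, docs_tokens, threshold=0.5):
    """Indices of documents whose max-normalized BM25 score meets the threshold.

    Normalizing by the top score is an assumption made for this sketch.
    """
    scores = bm25_scores(query_tokens, docs_tokens)
    top = max(scores, default=0.0)
    if top == 0.0:
        return []
    return [i for i, s in enumerate(scores) if s / top >= threshold]
```

With max-normalization the best match always scores 1.0, so the threshold expresses how close a candidate must come to the best match rather than an absolute relevance level.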


Author(s):  
Tao Zhang ◽  
Geunseok Yang ◽  
Byungjeong Lee ◽  
Alvin T. S. Chan

An important part of software maintenance is bug report analysis during bug fixing, especially for large-scale software projects. Since bugs reported to the bug repository need to be fixed, triagers are responsible for identifying appropriate developers to execute the fix. Previous research focused on optimizing this process, for example through duplicate detection and developer recommendation to reduce the workload of triagers. However, few studies have analyzed developer roles (e.g., reporter and assignee) in the bug-fixing process. Therefore, in this paper, we perform an in-depth empirical study of the different roles that developers perform in bug resolution. By extracting the factors that affect bug resolution from the analysis results, we propose a novel bug triage algorithm to recommend the appropriate developers to fix a given bug. We implement the proposed recommendations on the Eclipse and Mozilla Firefox projects, with the results showing that the new bug triage algorithm can effectively recommend which experts should fix given bugs.


2021 ◽  
Vol 12 (1) ◽  
pp. 338
Author(s):  
Ömer Köksal ◽  
Bedir Tekinerdogan

Software bug report classification is a critical process to understand the nature, implications, and causes of software failures. Furthermore, classification enables a fast and appropriate reaction to software bugs. However, for large-scale projects, one must deal with a broad set of bugs of multiple types. In this context, manually classifying bugs becomes cumbersome and time-consuming. Although several studies have addressed automated bug classification using machine learning techniques, they have mainly focused on academic case studies, open-source software, and unilingual text input. This paper presents our automated bug classification approach applied and validated in an industrial case study. In contrast to earlier studies, our study is applied to a commercial software system based on unstructured bilingual bug reports written in English and Turkish. The presented approach adopts and integrates machine learning (ML), text mining, and natural language processing (NLP) techniques to support the classification of software bugs. Compared to manual classification, our results show that bug classification can be automated and even performs better than manual bug classification. Our study shows that the presented approach and the corresponding tools effectively reduce the manual classification time and effort.


Software defect prediction and assessment play a significant role in the software development process. Predicting software defects in the earlier stages increases software quality, reliability, and efficiency. Detecting and eliminating software defects has been the most expensive task during both development and maintenance, and as software demands increase and delivery time spans decrease, ensuring software quality becomes a challenge. However, due to inadequate testing, no software can claim to be free from errors. Bug repositories are used for storing and managing bugs in software projects. A bug in a repository is recorded as a bug report. When a tester finds a bug, the available information is entered into a defect tracking system. During its resolution process, a bug passes through various bug states. These defect tracking systems enable users to provide information about bugs while running the software. Severity prediction has recently gained a lot of attention in software maintenance: bugs with greater severity should be resolved before bugs with lower severity. In this paper, an evolutionary interactive scheme to evaluate bug reports and assess their severity is proposed. This paper presents a Software Bug Complexity Cluster (SBCC) using Self Organizing Maps (SOM). In the SBCC, a feature matrix is built using bug durations, and the complexities of software bugs are categorized into distinct clusters, including Blocker, Critical, Major, Trivial, and Minor, by specifying the negative impact of the defect using two different techniques, namely k-means and SOM. Bug duration, proximity error, and pre-defined distance functions are used to estimate the accuracy of the different bug complexities. Our systematic study found that SOM achieves greater performance and efficiency than k-means in terms of proximity error and fitness. The collected results showed better performance for the SBCC with respect to fitness and cluster proximity error.
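As an illustration of the k-means side of that comparison only (the SOM variant is not shown), the following sketch clusters scalar bug durations with Lloyd's algorithm. The deterministic initialization, the use of a single duration feature, and the name `kmeans_1d` are assumptions made for this example; the abstract does not describe the feature matrix or the initialization in detail.

```python
def kmeans_1d(values, k, iterations=100):
    """Lloyd's k-means on scalar values; returns (centroids, labels)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")
    ordered = sorted(values)
    # Deterministic initialization: evenly spaced picks from the sorted values.
    if k == 1:
        centroids = [ordered[len(ordered) // 2]]
    else:
        centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: abs(v - centroids[c])) for v in values
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, label in zip(values, labels) if label == c]
            new_centroids.append(
                sum(members) / len(members) if members else centroids[c]
            )
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels
```

Calling `kmeans_1d(durations, 5)` would produce five groups that could then be mapped to severity labels, though that mapping is a separate step not covered here.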


Author(s):  
Fouzi Harrag ◽  
Mokdad Khamliche

In software development, developers receive bug reports that describe software bugs. Developers find the cause of a bug by reviewing the code and reproducing the abnormal behavior, which can be tedious and time-consuming. Developers need an automated system that incorporates broad domain knowledge and recommends solutions for those bugs, rather than spending more manual effort on fixing the bugs or waiting on Q&A websites for other users to reply. Stack Overflow is a popular question-answer site focused on programming issues, so we can benefit from the knowledge available on this rich platform. This paper presents a survey covering the methods in the field of mining software repositories. We propose an architecture to build a recommender system using the learning-to-rank approach. Deep learning is used to construct a model that solves the learning-to-rank problem using Stack Overflow data. Text mining techniques are used to extract, evaluate, and recommend the answers most relevant to the solution of a given bug report.


Perception ◽  
1995 ◽  
Vol 24 (9) ◽  
pp. 995-1010 ◽  
Author(s):  
Emiel Reith ◽  
Chang Hong Liu

Adult subjects drew the visual projection of two models. One model was a trapezoid placed in the frontoparallel plane. The other was a tilted rectangle which displayed the same projective shape on a frontoparallel plane as the trapezoid. The drawing conditions were varied in two ways: the model remained available for inspection during the drawing task or it was masked after initial inspection; the subjects drew on paper placed flat on the table or on a vertical glass pane placed in front of the model (ie on a da Vinci window). The results were that (i) the projective shape of the frontoparallel trapezoid was reproduced accurately whereas that of the tilted rectangle was systematically distorted in the direction of its actual physical dimensions; (ii) when subjects drew on paper, the presence or absence of a view of the model made no difference to the amount of distortion; (iii) drawing on a da Vinci window improved accuracy even when the model was hidden. These findings provide information about the relative roles of object-centred knowledge, perceptual abilities, and depiction skills in drawing performance.


Author(s):  
Yu Zhou ◽  
Yanxiang Tong ◽  
Taolue Chen ◽  
Jin Han

Bug localization represents one of the most expensive, as well as time-consuming, activities during software maintenance and evolution. To alleviate the workload of developers, numerous methods have been proposed to automate this process and narrow down the scope of reviewing buggy files. In this paper, we present a novel buggy source-file localization approach, using the information from both the bug reports and the source files. We leverage the part-of-speech features of bug reports and the invocation relationship among source files. We also integrate an adaptive technique to further optimize the performance of the approach. The adaptive technique discriminates Top 1 and Top N recommendations for a given bug report and consists of two modules. One module maximizes the accuracy of the first recommended file, and the other aims at improving the accuracy of the fixed defect file list. We evaluate our approach on six large-scale open source projects, i.e., AspectJ, Eclipse, SWT, ZXing, Birt, and Tomcat. Compared to the previous work, empirical results show that our approach can improve the overall prediction performance in all of these cases. In particular, in terms of the Top 1 recommendation accuracy, our approach achieves an enhancement from 22.73% to 39.86% for AspectJ, from 24.36% to 30.76% for Eclipse, from 31.63% to 46.94% for SWT, from 40% to 55% for ZXing, from 7.97% to 21.99% for Birt, and from 33.37% to 38.90% for Tomcat.
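For readers unfamiliar with the Top 1 and Top N figures quoted above, the sketch below computes Top-N accuracy under a common definition, assumed here because the abstract does not spell it out: a bug report counts as a hit if at least one of its truly buggy files appears among the first N recommended files. The function and parameter names are hypothetical.

```python
def top_n_accuracy(rankings, buggy_files, n):
    """Fraction of bug reports with a truly buggy file in the top n results.

    `rankings` maps a bug id to its ranked list of recommended files and
    `buggy_files` maps a bug id to the set of files actually fixed.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not rankings:
        return 0.0
    hits = sum(
        1
        for bug_id, ranked in rankings.items()
        if set(ranked[:n]) & buggy_files.get(bug_id, set())
    )
    return hits / len(rankings)
```

Under this definition Top 1 accuracy is the strictest case: only the first recommendation counts, which is why the abstract treats it separately from the Top N list.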


2021 ◽  
pp. 1-14
Author(s):  
M. Amsaprabhaa ◽  
Y. Nancy Jane ◽  
H. Khanna Nehemiah

Due to the COVID-19 pandemic, countries across the globe have enforced lockdown restrictions that influence people’s socio-economic lifecycle. The objective of this paper is to predict the communal emotion of people from different locations during the COVID-19 lockdown. The proposed work aims to develop a deep spatio-temporal analysis framework of geo-tagged tweets to predict the emotions of different topics based on location. An optimized Latent Dirichlet Allocation (LDA) approach is presented for finding the optimal hyper-parameters using grid search. A multi-class emotion classification model is then built via a Recurrent Neural Network (RNN) to predict emotions for each topic based on location. The proposed work is evaluated on a dataset collected with the Twitter streaming API. The experimental results show that the presented LDA model using grid search, along with the RNN model for emotion classification, outperforms other state-of-the-art methods with an improved accuracy of 94.6%.
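Grid search itself is simple to sketch: every combination of candidate hyper-parameter values is scored and the best one is kept. The version below is generic and makes no claim about which LDA hyper-parameters or which quality score the authors used; `grid_search`, `param_grid`, and `score_fn` are hypothetical names, and in practice `score_fn` would train a topic model and return a quality measure.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Exhaustively score every parameter combination; higher is better.

    `param_grid` maps a parameter name to a list of candidate values and
    `score_fn` takes a dict of parameters and returns a number.
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

The cost grows with the product of the list lengths, so grids are usually kept coarse when each evaluation involves training a model.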

