Impact of Software Bug Report Preprocessing and Vectorization on Bug Assignment Accuracy

2021 ◽  
pp. 153-162
Author(s):  
Lukasz Chmielowski ◽  
Michal Kucharzak
Author(s):  
Bancha Luaphol ◽  
Jantima Polpinij ◽  
Manasawee Kaenampornpan

Most studies relating to bug reports aims to automatically identify necessary information from bug reports for software bug fixing. Unfortunately, the study of bug reports focuses only on one issue, but more complete and comprehensive software bug fixing would be facilitated by assessing multiple issues concurrently. This becomes a challenge in this study, where it aims to present a method of identifying bug reports at severe level from a bug report repository, together with assembling their related bug reports to visualize the overall picture of a software problem domain. The proposed method is called “mining bug report repositories”. Two techniques of text mining are applied as the main mechanisms in this method. First, classification is applied for identifying severe bug reports, called “bug severity classification”, while “threshold-based similarity analysis” is then applied to assemble bug reports that are related to a bug report at severe level. Our datasets are from three opensource namely SeaMonkey, Firefox, and Core:Layout downloaded from the Bugzilla. Finally, the best models from the proposed method are selected and compared with two baseline methods. For identifying severe bug reports using classification technique, the results show that our method improved accuracy, F1, and AUC scores over the baseline by 11.39, 11.63, and 19% respectively. Meanwhile, for assembling related bug reports using threshold-based similarity technique, the results show that our method improved precision, and likelihood scores over the other baseline by 15.76, and 9.14% respectively. This demonstrate that our proposed method may help increasing chance to fix bugs completely.


2021 ◽  
Author(s):  
Anuj Shastri ◽  
Naveen Saini ◽  
Sriparna Saha ◽  
Santosh Kumar Mishra

2014 ◽  
Vol 12 (8) ◽  
pp. 3823-3828
Author(s):  
Madhu Kumari ◽  
Meera Sharma ◽  
Nikita Yadav

Prediction of the bug fix time in open source softwares is a challenging job. A software bug consists of many attributes that define the characteristics of the bug. Some of the attributes get filled at the time of reporting and some are  at the time of bug fixing. In this paper, 836 bug reports of two products namely Thunderbird and Webtools of Mozilla open source project have been considered. In  bug report, we see that there is no linear relationship among the bug attributes namely bug fix time, developers, cc count and severity. This paper has analyzed the interdependence among these attributes through graphical representation.The results conclude that :Case 1. 73% of bugs reported for Webtools are fixed by 17% developers and 61% of bugs are fixed by 14% developers for Thundebird.Case 2. We tried to find a relationship between the time taken by a developer in fixing a bug and the corresponding developer. We also observed that there is a significant variation in bug fixing process, bugs may take 1 day to 4 years in fixing.Case 3. There is no linear relationship between cc count i.e. manpower involved in bug fixing process and bug fix time.Case 4. Maximum number of developers are involved in fixing bugs for major severity class.


2018 ◽  
Vol 9 (4) ◽  
pp. 20-46 ◽  
Author(s):  
Madhu Kumari ◽  
Meera Sharma ◽  
V. B. Singh

An accurate bug severity assessment is an important factor in bug fixing. Bugs are reported on the bug tracking system by different users with a fast speed. The size of software repositories is also increasing at an enormous rate. This increased size often has much uncertainty and irregularities. The factors that cause uncertainty are biases, noise and abnormality in data. The authors consider that software bug report phenomena on the bug tracking system keeps an irregular state. Without proper handling of these uncertainties and irregularities, the performance of learning strategies can be significantly reduced. To incorporate and consider these two phenomena, they have used entropy as an attribute to assess bug severity. The authors have predicted the bug severity by using machine learning techniques, namely KNN, J48, RF, RNG, NB, CNN and MLR. They have validated the classifiers using PITS, Eclipse and Mozilla projects. The results show that the proposed entropy-based approaches improves the performance as compared to the state of the art approach considered in this article.


2021 ◽  
Vol 12 (1) ◽  
pp. 338
Author(s):  
Ömer Köksal ◽  
Bedir Tekinerdogan

Software bug report classification is a critical process to understand the nature, implications, and causes of software failures. Furthermore, classification enables a fast and appropriate reaction to software bugs. However, for large-scale projects, one must deal with a broad set of bugs from multiple types. In this context, manually classifying bugs becomes cumbersome and time-consuming. Although several studies have addressed automated bug classification using machine learning techniques, they have mainly focused on academic case studies, open-source software, and unilingual text input. This paper presents our automated bug classification approach applied and validated in an industrial case study. In contrast to earlier studies, our study is applied to a commercial software system based on unstructured bilingual bug reports written in English and Turkish. The presented approach adopts and integrates machine learning (ML), text mining, and natural language processing (NLP) techniques to support the classification of software bugs. The approach has been applied within an industrial case study. Compared to manual classification, our results show that bug classification can be automated and even performs better than manual bug classification. Our study shows that the presented approach and the corresponding tools effectively reduce the manual classification time and effort.


The software defect prediction and assessment plays a significant role in the software development process. Predicting software defects in the earlier stages will increases the software quality, reliability and efficiency, the cost of detecting and eliminating software defects have been the most expensive task during both development and maintenance process, as software demands increase and delivery of the software span decreased, ensuring software quality becomes a challenge. However, due to inadequate testing, no software can pretend to be free from errors. Bug repositories are used for storing and managing bugs in software projects. A bug in the repositories is recorded as a bug report. When a bug is found by a tester its available information is entered in defect tracking systems. During its resolution process a bug enters into various bug states. These defect tracking systems enable user to give the information about the bugs while running the software. However, the severity prediction has recently gained a lot of attention in software maintenance. Bugs with greater severity should be resolved before bugs with lower severity. In this paper an evolutionary interactive scheme to evaluate bug reports and assess the severity is proposed. This paper presents a Software Bug Complexity Cluster (SBCC) using Self Organizing Maps. In this SBCC a feature matrix is built using bug durations and the complexities of software bugs are categorized into distinct clusters including Blocker, Critical, Major, Trivial and Minor by specifying negative impact of the defect using two different techniques, namely k-means and SOM. Bug duration, proximity error and pre-defined distance functions are used to estimate the accuracy of different bug complexities. Our systematic study found that SOM's proximity error and fitness have greater performance and efficiency than K-Means. The collected results showed better performance for the SBCC with respect to fitness and cluster proximity error.


Sign in / Sign up

Export Citation Format

Share Document