Mining Bug Report Repositories to Identify Significant Information for Software Bug Fixing

Applied Science and Engineering Progress ◽

10.14416/j.asep.2021.03.005 ◽

2021 ◽

Author(s):

Bancha Luaphol ◽

Jantima Polpinij ◽

Manasawee Kaenampornpan

Keyword(s):

The Other ◽

Problem Domain ◽

Significant Information ◽

Bug Reports ◽

Bug Fixing ◽

Classification Technique ◽

Bug Report ◽

Multiple Issues ◽

Improved Accuracy ◽

Software Bug

Most studies of bug reports aim to automatically identify the information needed for software bug fixing. Unfortunately, such studies typically focus on only one issue, whereas more complete and comprehensive bug fixing would be facilitated by assessing multiple issues concurrently. This study takes up that challenge: it presents a method for identifying severe bug reports in a bug report repository and assembling their related bug reports to visualize the overall picture of a software problem domain. The proposed method is called “mining bug report repositories”. Two text mining techniques serve as its main mechanisms. First, classification is applied to identify severe bug reports (“bug severity classification”); then “threshold-based similarity analysis” is applied to assemble the bug reports related to each severe bug report. The datasets come from three open-source projects, namely SeaMonkey, Firefox, and Core:Layout, downloaded from Bugzilla. Finally, the best models from the proposed method are selected and compared with two baseline methods. For identifying severe bug reports by classification, the results show that our method improved accuracy, F1, and AUC scores over the baseline by 11.39%, 11.63%, and 19%, respectively. For assembling related bug reports by threshold-based similarity, the results show that our method improved precision and likelihood scores over the other baseline by 15.76% and 9.14%, respectively. This demonstrates that the proposed method may help increase the chance of fixing bugs completely.
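The accuracy and F1 scores reported in this abstract follow the standard definitions for binary classification. The following is a minimal sketch of those definitions, assuming severe reports are the positive class. The function name `binary_metrics` and the sample labels are illustrative assumptions and are not taken from the paper.

```python
def binary_metrics(y_true, y_pred, positive="severe"):
    """Return accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical gold labels and classifier predictions for six bug reports.
y_true = ["severe", "severe", "non-severe", "non-severe", "severe", "non-severe"]
y_pred = ["severe", "non-severe", "non-severe", "severe", "severe", "non-severe"]
metrics = binary_metrics(y_true, y_pred)
```

AUC is omitted here because it requires ranked classifier scores rather than hard labels.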


UNDERSTANDING THE DEVELOPER PARTICIPATION IN BUG FIX PROCESS

INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY ◽

10.24297/ijct.v12i8.3000 ◽

2014 ◽

Vol 12 (8) ◽

pp. 3823-3828

Author(s):

Madhu Kumari ◽

Meera Sharma ◽

Nikita Yadav

Keyword(s):

Linear Relationship ◽

Open Source ◽

Significant Variation ◽

Open Source Project ◽

Bug Reports ◽

Bug Fixing ◽

Bug Report ◽

Severity Class ◽

Software Bug

Predicting bug fix time in open source software is a challenging task. A software bug has many attributes that define its characteristics. Some attributes are filled in at the time of reporting and some at the time of bug fixing. In this paper, 836 bug reports from two products of the Mozilla open source project, namely Thunderbird and Webtools, are considered. In the bug reports, we see no linear relationship among the bug attributes bug fix time, developers, cc count, and severity. This paper analyzes the interdependence among these attributes through graphical representation. The results are as follows. Case 1: 73% of the bugs reported for Webtools are fixed by 17% of developers, and 61% of the bugs for Thunderbird are fixed by 14% of developers. Case 2: We tried to find a relationship between the time taken to fix a bug and the developer who fixed it. We also observed significant variation in the bug fixing process: bugs may take from 1 day to 4 years to fix. Case 3: There is no linear relationship between cc count (i.e., the manpower involved in the bug fixing process) and bug fix time. Case 4: The maximum number of developers is involved in fixing bugs of the major severity class.
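Figures like those in Case 1 and Case 3 can be computed from per-bug records. The following is a minimal sketch with hypothetical data. The paper reports a graphical analysis, so the Pearson coefficient here is a swapped-in numeric check for a linear relationship and is not the authors' procedure. The function names and sample values are assumptions.

```python
import math
from collections import Counter


def top_developer_share(assignees, developer_fraction):
    """Share of bugs fixed by the most active `developer_fraction` of developers."""
    counts = Counter(assignees)
    if not counts:
        return 0.0
    # Number of developers in the "top" group, at least one.
    k = max(1, math.ceil(developer_fraction * len(counts)))
    fixed_by_top = sum(n for _, n in counts.most_common(k))
    return fixed_by_top / len(assignees)


def pearson(xs, ys):
    """Pearson correlation; values near 0 suggest no linear relationship."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y) if norm_x and norm_y else 0.0


# Hypothetical assignee of each fixed bug (one entry per bug).
fixed_by = ["ann"] * 6 + ["bob"] * 2 + ["cy", "dee"]
share = top_developer_share(fixed_by, 0.25)  # top 25% of 4 developers = 1 developer

# Hypothetical cc counts and fix times in days for five bugs.
cc_count = [1, 2, 3, 4, 5]
fix_days = [10, 400, 3, 90, 12]
r = pearson(cc_count, fix_days)
```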


Automatic dependent bug reports assembly for bug tracking systems by threshold-based similarity

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v23.i3.pp1620-1633 ◽

2021 ◽

Vol 23 (3) ◽

pp. 1620

Author(s):

B. Luaphol ◽

J. Polpinij ◽

M. Kaneampornpan

Keyword(s):

State Of The Art ◽

Cosine Similarity ◽

Similarity Analysis ◽

Tracking Systems ◽

Problem Domain ◽

Essential Information ◽

Bug Reports ◽

Bug Fixing ◽

Bug Tracking ◽

Software Bug

Bug reports contain essential information for fixing problems that occur in software. Many studies have proposed methods for the automatic analysis of bug reports. One issue that can affect the completion of software bug fixing is known as “bug dependency”. Although many researchers have mentioned this problem, most of them discussed related bugs without really dealing with the dependency issue in bug reports. One possible solution is to assemble all relevant or dependent bug reports together before the subsequent processing stages. This study presents a method for assembling dependent bug reports. The main mechanism is called “threshold-based similarity analysis”, and three similarity techniques, namely cosine similarity (CS), multi aspect TF (MATF), and BM25, are compared in terms of feedback, precision, and likelihood values. Because BM25 with a threshold of 0.5 gives the best results, it was compared with the state-of-the-art method. The results show that our method increases precision and likelihood values by 12% and 12.4%, respectively. Therefore, our results can be used to encourage developers to recognize all dependent bugs in the same problem domain.
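The idea of threshold-based similarity can be illustrated with the simplest of the three measures named above, cosine similarity over raw term frequencies. This is a minimal sketch. The tokenizer, the function names, and the sample reports are assumptions, and the abstract does not specify how BM25 scores are scaled against the 0.5 threshold, so BM25 is not reproduced here.

```python
import math
import re
from collections import Counter


def tf_vector(text):
    """Raw term-frequency vector over lowercase alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def assemble_related(query, reports, threshold=0.5):
    """Return (report_id, score) pairs scoring at or above the threshold, best first."""
    q = tf_vector(query)
    scored = [(rid, cosine(q, tf_vector(text))) for rid, text in reports.items()]
    kept = [(rid, score) for rid, score in scored if score >= threshold]
    return sorted(kept, key=lambda item: -item[1])


# Hypothetical bug report summaries keyed by report id.
reports = {
    "r1": "crash when opening print preview",
    "r2": "print preview crash on large page",
    "r3": "bookmark toolbar icons missing",
}
related = assemble_related("browser crash in print preview", reports)
```

Raising the threshold trades recall for precision: fewer reports are assembled, but those that remain share more vocabulary with the query report.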


Guiding Bug Triage through Developer Analysis in Bug Reports

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194016500170 ◽

2016 ◽

Vol 26 (03) ◽

pp. 405-431 ◽

Cited By ~ 1

Author(s):

Tao Zhang ◽

Geunseok Yang ◽

Byungjeong Lee ◽

Alvin T. S. Chan

Keyword(s):

Empirical Study ◽

Software Maintenance ◽

Large Scale ◽

Software Projects ◽

Bug Reports ◽

Bug Fixing ◽

Bug Report ◽

Report Analysis ◽

Mozilla Firefox ◽

Triage Algorithm

An important part of software maintenance is bug report analysis during bug fixing, especially for large-scale software projects. Since bugs reported to the bug repository need to be fixed, triagers are responsible for identifying appropriate developers to carry out the fix. Previous research focused on optimizing this process, for example through duplicate detection and developer recommendation, to reduce the workload of triagers. However, few studies have analyzed developer roles (e.g., reporter and assignee) in the bug-fixing process. Therefore, in this paper, we perform an in-depth empirical study of the different roles that developers play in bug resolution. By extracting the factors that affect bug resolution from the analysis results, we propose a novel bug triage algorithm to recommend the appropriate developers to fix a given bug. We evaluate the proposed recommendations on the Eclipse and Mozilla Firefox projects, with the results showing that the new bug triage algorithm can effectively recommend which experts should fix given bugs.


Automated Classification of Unstructured Bilingual Software Bug Reports: An Industrial Case Study Research

Applied Sciences ◽

10.3390/app12010338 ◽

2021 ◽

Vol 12 (1) ◽

pp. 338

Author(s):

Ömer Köksal ◽

Bedir Tekinerdogan

Keyword(s):

Machine Learning ◽

Industrial Case Study ◽

Software Bugs ◽

Text Input ◽

Bug Reports ◽

Bug Report ◽

Software Bug ◽

Manual Classification

Software bug report classification is a critical process to understand the nature, implications, and causes of software failures. Furthermore, classification enables a fast and appropriate reaction to software bugs. However, for large-scale projects, one must deal with a broad set of bugs of multiple types. In this context, manually classifying bugs becomes cumbersome and time-consuming. Although several studies have addressed automated bug classification using machine learning techniques, they have mainly focused on academic case studies, open-source software, and unilingual text input. This paper presents our automated bug classification approach, applied and validated in an industrial case study. In contrast to earlier studies, our study is applied to a commercial software system based on unstructured bilingual bug reports written in English and Turkish. The presented approach adopts and integrates machine learning (ML), text mining, and natural language processing (NLP) techniques to support the classification of software bugs. Compared to manual classification, our results show that bug classification can be automated and even performs better than manual bug classification. Our study shows that the presented approach and the corresponding tools effectively reduce the manual classification time and effort.


Assessment of Software Bug Complexity and Severity using Evolutionary SOM Scheme

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f9257.088619 ◽

2019 ◽

Vol 8 (6) ◽

pp. 3152-3158

Keyword(s):

Software Quality ◽

Software Maintenance ◽

Negative Impact ◽

Distance Functions ◽

Tracking Systems ◽

Software Defects ◽

Bug Reports ◽

Bug Report ◽

Severity Prediction ◽

Software Bug

Software defect prediction and assessment play a significant role in the software development process. Predicting software defects at earlier stages increases software quality, reliability, and efficiency. Detecting and eliminating software defects has been the most expensive task during both development and maintenance, and as software demands increase and delivery time spans decrease, ensuring software quality becomes a challenge. Moreover, because of inadequate testing, no software can claim to be free from errors. Bug repositories are used for storing and managing bugs in software projects. A bug in a repository is recorded as a bug report. When a tester finds a bug, the available information is entered in a defect tracking system. During its resolution, a bug passes through various bug states. Defect tracking systems enable users to provide information about bugs while running the software. Severity prediction has recently gained a lot of attention in software maintenance: bugs with greater severity should be resolved before bugs with lower severity. This paper proposes an evolutionary interactive scheme to evaluate bug reports and assess their severity. It presents a Software Bug Complexity Cluster (SBCC) using Self Organizing Maps (SOM). In SBCC, a feature matrix is built using bug durations, and the complexities of software bugs are categorized into distinct clusters, including Blocker, Critical, Major, Trivial, and Minor, by specifying the negative impact of the defect using two different techniques, namely k-means and SOM. Bug duration, proximity error, and pre-defined distance functions are used to estimate the accuracy of different bug complexities. Our systematic study found that SOM performs better than k-means in terms of proximity error and fitness. The collected results showed better performance for the SBCC with respect to fitness and cluster proximity error.
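Of the two clustering techniques named in this abstract, k-means is the simpler to illustrate. The following is a minimal sketch of Lloyd's k-means algorithm on a single feature, bug duration in days. The feature choice, the deterministic initialization, and the sample durations are assumptions; the paper's feature matrix, distance functions, and SOM variant are not reproduced.

```python
def kmeans_1d(values, k, iterations=100):
    """Lloyd's algorithm on scalars. Returns (centroids, clusters)."""
    data = sorted(values)
    if not 1 <= k <= len(data):
        raise ValueError("k must be between 1 and the number of values")

    # Deterministic start: k evenly spaced points from the sorted data.
    if k == 1:
        centroids = [data[len(data) // 2]]
    else:
        centroids = [data[(i * (len(data) - 1)) // (k - 1)] for i in range(k)]

    clusters = [[] for _ in range(k)]
    for _ in range(max(1, iterations)):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


# Hypothetical bug durations in days.
durations = [1, 2, 3, 30, 32, 35, 400, 420]
centroids, clusters = kmeans_1d(durations, k=3)
```

With these sample values the three clusters separate short-lived, medium, and long-lived bugs. Mapping clusters to severity labels such as Blocker or Minor would need additional information that this sketch does not model.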


Mining Stack Overflow: a Recommender Systems-Based Model

10.20944/preprints202008.0265.v2 ◽

2020 ◽

Author(s):

Fouzi Harrag ◽

Mokdad Khamliche

Keyword(s):

Deep Learning ◽

Domain Knowledge ◽

Learning To Rank ◽

Automated System ◽

Abnormal Behavior ◽

Large Domain ◽

Stack Overflow ◽

Bug Reports ◽

Bug Report ◽

Software Bug

In software development, developers receive bug reports that describe software bugs. Developers find the cause of a bug by reviewing the code and reproducing the abnormal behavior, which can be tedious and time-consuming. Developers need an automated system that incorporates extensive domain knowledge and recommends solutions for those bugs, rather than spending more manual effort fixing the bugs or waiting on Q&A websites for other users to reply. Stack Overflow is a popular question-and-answer site focused on programming issues, so we can benefit from the knowledge available on this rich platform. This paper presents a survey covering methods in the field of mining software repositories. We propose an architecture for building a recommender system using the learning-to-rank approach. Deep learning is used to construct a model that solves the learning-to-rank problem using Stack Overflow data. Text mining techniques are used to extract, evaluate, and recommend the answers most relevant to the solution of a given bug report.


Mining Stack Overflow: a Recommender Systems-Based Model

10.20944/preprints202008.0265.v1 ◽

2020 ◽

Author(s):

Fouzi Harrag ◽

Mokdad Khamliche

Keyword(s):

Deep Learning ◽

Domain Knowledge ◽

Learning To Rank ◽

Automated System ◽

Abnormal Behavior ◽

Large Domain ◽

Stack Overflow ◽

Bug Reports ◽

Bug Report ◽

Software Bug


What Hinders Accurate Depiction of Projective Shape?

Perception ◽

10.1068/p240995 ◽

1995 ◽

Vol 24 (9) ◽

pp. 995-1010 ◽

Cited By ~ 6

Author(s):

Emiel Reith ◽

Chang Hong Liu

Keyword(s):

Da Vinci ◽

The Other ◽

Frontoparallel Plane ◽

Glass Pane ◽

Drawing Task ◽

Visual Projection ◽

Vertical Glass ◽

Perceptual Abilities ◽

Improved Accuracy ◽

Projective Shape

Adult subjects drew the visual projection of two models. One model was a trapezoid placed in the frontoparallel plane. The other was a tilted rectangle which displayed the same projective shape on a frontoparallel plane as the trapezoid. The drawing conditions were varied in two ways: the model remained available for inspection during the drawing task or it was masked after initial inspection; the subjects drew on paper placed flat on the table or on a vertical glass pane placed in front of the model (i.e., on a da Vinci window). The results were that (i) the projective shape of the frontoparallel trapezoid was reproduced accurately whereas that of the tilted rectangle was systematically distorted in the direction of its actual physical dimensions; (ii) when subjects drew on paper, the presence or absence of a view of the model made no difference to the amount of distortion; (iii) drawing on a da Vinci window improved accuracy even when the model was hidden. These findings provide information about the relative roles of object-centred knowledge, perceptual abilities, and depiction skills in drawing performance.


Augmenting Bug Localization with Part-of-Speech and Invocation

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194017500346 ◽

2017 ◽

Vol 27 (06) ◽

pp. 925-949 ◽

Cited By ~ 5

Author(s):

Yu Zhou ◽

Yanxiang Tong ◽

Taolue Chen ◽

Jin Han

Keyword(s):

Software Maintenance ◽

Large Scale ◽

Bug Localization ◽

Bug Reports ◽

Part Of Speech ◽

Adaptive Technique ◽

Bug Report ◽

Software Maintenance And Evolution ◽

Speech Features ◽

Localization Approach

Bug localization represents one of the most expensive, as well as time-consuming, activities during software maintenance and evolution. To alleviate the workload of developers, numerous methods have been proposed to automate this process and narrow down the scope of reviewing buggy files. In this paper, we present a novel buggy source-file localization approach, using the information from both the bug reports and the source files. We leverage the part-of-speech features of bug reports and the invocation relationship among source files. We also integrate an adaptive technique to further optimize the performance of the approach. The adaptive technique discriminates between Top 1 and Top N recommendations for a given bug report and consists of two modules. One module maximizes the accuracy of the first recommended file, and the other aims at improving the accuracy of the fixed defect file list. We evaluate our approach on six large-scale open source projects, i.e., AspectJ, Eclipse, SWT, ZXing, Birt and Tomcat. Compared to previous work, empirical results show that our approach can improve the overall prediction performance in all of these cases. In particular, in terms of Top 1 recommendation accuracy, our approach achieves an improvement from 22.73% to 39.86% for AspectJ, from 24.36% to 30.76% for Eclipse, from 31.63% to 46.94% for SWT, from 40% to 55% for ZXing, from 7.97% to 21.99% for Birt, and from 33.37% to 38.90% for Tomcat.
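Top 1 and Top N accuracy, as commonly defined in bug localization work, is the fraction of bug reports for which at least one truly fixed file appears among the first N recommended files. The following is a minimal sketch under that assumed definition. The function name, bug identifiers, and file names are hypothetical and do not come from the evaluated projects.

```python
def top_n_accuracy(rankings, fixed_files, n):
    """Fraction of bug reports with a truly fixed file among the first n ranked files."""
    if not rankings:
        return 0.0
    hits = sum(
        1
        for bug_id, ranked in rankings.items()
        if set(ranked[:n]) & fixed_files.get(bug_id, set())
    )
    return hits / len(rankings)


# Hypothetical ranked recommendations per bug report, best candidate first.
rankings = {
    "bug-1": ["A.java", "B.java", "C.java"],
    "bug-2": ["D.java", "E.java", "F.java"],
    "bug-3": ["G.java", "H.java", "I.java"],
    "bug-4": ["J.java", "K.java", "L.java"],
}
# Hypothetical ground truth: files actually changed by each fix.
fixed_files = {
    "bug-1": {"A.java"},
    "bug-2": {"F.java"},
    "bug-3": {"X.java"},
    "bug-4": {"K.java", "J.java"},
}

top1 = top_n_accuracy(rankings, fixed_files, 1)
top3 = top_n_accuracy(rankings, fixed_files, 3)
```

Top N accuracy can never decrease as N grows, which is why Top 1 is the strictest of these measures.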


Deep spatio-temporal emotion analysis of geo-tagged tweets for predicting location based communal emotion during COVID-19 Lock-down

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-210544 ◽

2021 ◽

pp. 1-14

Author(s):

M. Amsaprabhaa ◽

Y. Nancy Jane ◽

H. Khanna Nehemiah

Keyword(s):

Latent Dirichlet Allocation ◽

Temporal Analysis ◽

The Other ◽

Classification Model ◽

Grid Search ◽

Analysis Framework ◽

Emotion Classification ◽

Spatio Temporal ◽

Improved Accuracy ◽

State Of Art

Due to the COVID-19 pandemic, countries across the globe have enforced lockdown restrictions that influence people’s socio-economic lives. The objective of this paper is to predict the communal emotion of people in different locations during the COVID-19 lockdown. The proposed work aims to develop a deep spatio-temporal analysis framework for geo-tagged tweets that predicts the emotions associated with different topics based on location. An optimized Latent Dirichlet Allocation (LDA) approach is presented that finds the optimal hyper-parameters using grid search. A multi-class emotion classification model is then built with a Recurrent Neural Network (RNN) to predict emotions for each topic based on location. The proposed work is evaluated on a dataset collected through the Twitter streaming API. The experimental results show that the presented LDA model using grid search, together with the RNN model for emotion classification, outperforms the other state-of-the-art methods with an improved accuracy of 94.6%.
