A Replication Package for It Takes Two to TANGO: Combining Visual and Textual Information for Detecting Duplicate Video-Based Bug Reports

During software maintenance, bug reports are widely used to improve a software project's quality. Developers often refer to bug reports stored in a repository when resolving bugs. However, this process often requires a developer to peruse a substantial amount of textual information in bug reports, which is lengthy and tedious. Automatic summarization of bug reports is one way to overcome this problem. Both supervised and unsupervised methods have been proposed for automatically generating bug report summaries. However, existing methods disregard the significance of duplicate bug reports when summarizing bug reports. In this study, we propose a PageRank-based Summarization Technique (PRST), which utilizes the textual information contained in bug reports and additional information in associated duplicate bug reports. PRST uses three variants of PageRank based on the Vector Space Model (VSM), Jaccard, and WordNet similarity metrics. These variants are used to calculate the textual similarity between sentences of the master bug reports and their duplicates. PRST further trains a regression model to predict the probability that a sentence belongs to the summary. Finally, we combine the PageRank and regression model scores to rank the sentences and produce the summary for the master bug reports. In addition, we construct two corpora of bug reports and duplicates, i.e., MBRC and OSCAR. Empirical results suggest that PRST outperforms the state-of-the-art method BRC in terms of Precision, Recall, F-score, and Pyramid Precision. Moreover, PRST with WordNet achieves better results than PRST with VSM or Jaccard.
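The sentence-ranking step described above can be illustrated with a minimal sketch. It assumes the sentences of a master report and its duplicates are pooled into one list, uses Jaccard similarity over lowercase word sets as edge weights, and runs a plain power-iteration PageRank. The function names, the tokenizer, and the damping factor of 0.85 are illustrative assumptions, not details from the abstract, and the regression step is omitted.

```python
import re


def tokenize(sentence):
    """Lowercase word set for a sentence (hypothetical, simplistic tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pagerank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with weighted PageRank over a Jaccard similarity graph."""
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Edge weight between two distinct sentences is their Jaccard similarity.
    weights = [
        [jaccard(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = 0.0
            for j in range(n):
                if out_sum[j] > 0:
                    rank += weights[j][i] / out_sum[j] * scores[j]
                else:
                    # Sentence j shares no words with any other: spread evenly.
                    rank += scores[j] / n
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores


if __name__ == "__main__":
    pooled = [
        "app crashes when opening settings",
        "the app crashes on opening the settings screen",
        "unrelated note about font color",
    ]
    for sentence, score in zip(pooled, pagerank_sentences(pooled)):
        print(f"{score:.3f}  {sentence}")
```

Sentences that overlap with many others receive higher scores. Under the approach in the abstract, such a score would then be combined with a regression model's probability before ranking; the abstract does not state how the two are combined.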

Severity Prediction for Bug Reports Using Multi-Aspect Features: A Deep Learning Approach

Mathematics ◽ 10.3390/math9141644 ◽ 2021 ◽ Vol 9 (14) ◽ pp. 1644

Author(s): Anh-Hien Dao ◽ Cheng-Zen Yang

Keyword(s): Deep Learning ◽ Matthews Correlation Coefficient ◽ State Of The Art ◽ Textual Information ◽ Learning Framework ◽ Bug Reports ◽ Average Accuracy ◽ Severity Prediction ◽ Quality Aspect ◽ Software Bug

The severity of software bug reports plays an important role in maintaining software quality. Many approaches have been proposed to predict the severity of bug reports using textual information. In this research, we propose a deep learning framework called MASP that uses convolutional neural networks (CNN) and the content-aspect, sentiment-aspect, quality-aspect, and reporter-aspect features of bug reports to improve prediction performance. We have performed experiments on datasets collected from Eclipse and Mozilla. The results show that the MASP model outperforms the state-of-the-art CNN model in terms of average Accuracy, Precision, Recall, F1-measure, and the Matthews Correlation Coefficient (MCC) by 1.83%, 0.46%, 3.23%, 1.72%, and 6.61%, respectively.
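For readers unfamiliar with the Matthews Correlation Coefficient reported above, the following is a minimal sketch of its standard binary form, (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The function name and the binary setting are assumptions for illustration; the abstract does not say whether MASP is evaluated with a binary or multi-class variant.

```python
import math


def mcc_binary(y_true, y_pred, positive=1):
    """Matthews Correlation Coefficient for binary labels.

    Returns 0.0 when the denominator is zero (a common convention).
    """
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical labels: 1 = severe, 0 = non-severe.
    print(mcc_binary([1, 1, 0, 0], [1, 0, 1, 0]))
```

Unlike accuracy, MCC uses all four confusion-matrix cells, so it stays informative when severity classes are imbalanced.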

The empirical derivation of equations for predicting subjective textual information.

PsycEXTRA Dataset ◽ 10.1037/e419942004-001 ◽ 1974

Author(s): Dan Kauffman ◽ Mike Johnson ◽ Gene Knight

Keyword(s): Textual Information

Method of semantic search and analysis of information

Informatization and communication ◽ 10.34219/2078-8320-2020-11-1-75-80 ◽ 2020 ◽ pp. 75-80

Author(s): A.L. Ogarok

Keyword(s): Computer Systems ◽ Linguistic Analysis ◽ Semantic Search ◽ Textual Information ◽ Formalized Description

The methodology of semantic search and analysis of information is considered. The results of the analysis of various approaches to solving the problem of a complete linguistic analysis of textual information in computer systems are presented. A formalized description of the method of semantic search and analysis of information is given.

Requirements and Their Impact Downstream: Improving Causal Analysis Processes Through Measurement and Analysis of Textual Information

10.21236/ada488178 ◽ 2008 ◽ Cited By ~ 1

Author(s): Ira A. Monarch ◽ Dennis R. Goldenson ◽ Lawrence T. Osiecki

Keyword(s): Textual Information ◽ Measurement And Analysis

Feature generation for textual information retrieval using world knowledge

ACM SIGIR Forum ◽ 10.1145/1328964.1328988 ◽ 2007 ◽ Vol 41 (2) ◽ pp. 123-123 ◽ Cited By ~ 9

Author(s): Evgeniy Gabrilovich

Keyword(s): Information Retrieval ◽ World Knowledge ◽ Feature Generation ◽ Textual Information

Shall I Work with Them? A Knowledge Graph-Based Approach for Predicting Future Research Collaborations

Entropy ◽ 10.3390/e23060664 ◽ 2021 ◽ Vol 23 (6) ◽ pp. 664

Author(s): Nikos Kanakaris ◽ Nikolaos Giarelis ◽ Ilias Siachos ◽ Nikos Karacapilidis

Keyword(s): Language Processing ◽ Scientific Knowledge ◽ Link Prediction ◽ Performance Metrics ◽ Future Research ◽ Knowledge Graph ◽ Prediction Problem ◽ Textual Information ◽ Research Collaborations ◽ Processing Techniques

We consider the prediction of future research collaborations as a link prediction problem applied to a scientific knowledge graph. To the best of our knowledge, this is the first work on the prediction of future research collaborations that combines the structural and textual information of a scientific knowledge graph through a purposeful integration of graph algorithms and natural language processing techniques. Our work: (i) investigates whether the integration of unstructured textual data into a single knowledge graph affects the performance of a link prediction model, (ii) studies the effect of previously proposed graph-kernel-based approaches on the performance of an ML model for the link prediction problem, and (iii) proposes a three-phase pipeline that enables the exploitation of structural and textual information, as well as of pre-trained word embeddings. We benchmark the proposed approach against classical link prediction algorithms using accuracy, recall, and precision as our performance metrics. Finally, we empirically test our approach through various feature combinations with respect to the link prediction problem. Our experiments with the new COVID-19 Open Research Dataset demonstrate a significant improvement in the aforementioned performance metrics in the prediction of future research collaborations.
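The abstract does not name the classical link prediction algorithms used as baselines. As an illustration only, the sketch below implements three widely used neighborhood heuristics (common neighbors, the Jaccard coefficient, and the Adamic-Adar index) on an undirected co-authorship graph stored as an adjacency dictionary. The graph, the author labels, and all function names are hypothetical.

```python
import math
from itertools import combinations


def build_graph(edges):
    """Undirected graph as a dict mapping each node to its neighbor set."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def common_neighbors(graph, u, v):
    """Number of neighbors shared by u and v."""
    return len(graph[u] & graph[v])


def jaccard_coefficient(graph, u, v):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = graph[u] | graph[v]
    return len(graph[u] & graph[v]) / len(union) if union else 0.0


def adamic_adar(graph, u, v):
    """Shared neighbors weighted by the inverse log of their degree."""
    return sum(
        1.0 / math.log(len(graph[w]))
        for w in graph[u] & graph[v]
        if len(graph[w]) > 1
    )


def rank_candidate_links(graph, score=jaccard_coefficient):
    """Rank all currently unlinked node pairs by the given score, best first."""
    candidates = [
        (u, v) for u, v in combinations(sorted(graph), 2) if v not in graph[u]
    ]
    return sorted(candidates, key=lambda pair: score(graph, *pair), reverse=True)


if __name__ == "__main__":
    # Hypothetical co-authorship edges between authors A..E.
    coauthors = build_graph(
        [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]
    )
    for pair in rank_candidate_links(coauthors):
        print(pair, round(jaccard_coefficient(coauthors, *pair), 3))
```

These heuristics use graph structure only; the approach in the abstract additionally draws on textual information and pre-trained word embeddings.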
