The unreasonable effectiveness of traditional information retrieval in crash report deduplication

10.7287/peerj.preprints.1705 ◽

2016 ◽

Author(s):

Joshua Charles Campbell ◽

Eddie Antonio Santos ◽

Abram Hindle

Keyword(s):

Information Retrieval ◽

Reporting System ◽

Software Systems ◽

Trade Off ◽

Unreasonable Effectiveness ◽

Using Data

Organizations like Mozilla, Microsoft, and Apple are flooded with thousands of automated crash reports per day. Although crash reports contain valuable information for debugging, there are often too many for developers to examine individually. Therefore, in industry, crash reports are often automatically grouped together in buckets. Ubuntu’s repository contains crashes from hundreds of software systems available with Ubuntu. A variety of crash report bucketing methods are evaluated using data collected by Ubuntu’s Apport automated crash reporting system. The trade-off between precision and recall of numerous scalable crash deduplication techniques is explored. A set of criteria that a crash deduplication method must meet is presented, and several methods that meet these criteria are evaluated on a new dataset. The evaluations presented in this paper show that off-the-shelf information retrieval techniques, which were not designed to be used with crash reports, outperform other techniques that are specifically designed for the task of crash bucketing at realistic industrial scales. This research indicates that automated crash bucketing still has a lot of room for improvement, especially in terms of identifier tokenization.
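To make the idea of bucketing with off-the-shelf information retrieval concrete, the sketch below groups crash reports by TF-IDF cosine similarity. It is an illustration only, not the method evaluated in the paper: the tokenizer, the smoothed IDF variant, the greedy first-member comparison, the function names, and the 0.5 threshold are all assumptions chosen for brevity.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split a crash report into lowercase alphanumeric/underscore tokens."""
    return re.findall(r"[a-z0-9_]+", text.lower())


def tfidf_vectors(token_lists):
    """Return one sparse TF-IDF vector (a dict) per tokenized report."""
    n = len(token_lists)
    # Document frequency: number of reports containing each token.
    df = Counter(tok for toks in token_lists for tok in set(toks))
    vectors = []
    for toks in token_lists:
        tf = Counter(toks)
        # Smoothed IDF (one common variant; an assumption of this sketch).
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1.0) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def bucket_reports(reports, threshold=0.5):
    """Greedily assign each report to the most similar existing bucket.

    Each bucket is represented by its first member. A report opens a new
    bucket when no representative reaches the (hypothetical) threshold.
    Returns a list of buckets, each a list of report indices.
    """
    vectors = tfidf_vectors([tokenize(r) for r in reports])
    buckets = []
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for bucket in buckets:
            sim = cosine(vec, vectors[bucket[0]])
            if sim >= best_sim:
                best, best_sim = bucket, sim
        if best is None:
            buckets.append([i])
        else:
            best.append(i)
    return buckets


# Made-up example reports, not taken from the Ubuntu dataset.
reports = [
    "SIGSEGV in gtk_widget_show libgtk g_main_loop_run",
    "SIGSEGV in gtk_widget_show libgtk g_main_context_dispatch",
    "SIGABRT in malloc_printerr libc free",
]
buckets = bucket_reports(reports)
```

Raising the threshold tends to split buckets (favoring precision), while lowering it merges them (favoring recall), which is the kind of trade-off the abstract refers to.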


Predicting the cost-quality trade-off for information retrieval queries

Proceedings of the tenth international conference on Information and knowledge management - CIKM'01 ◽

10.1145/502585.502621 ◽

2001 ◽

Cited By ~ 5

Author(s):

Henk Ernst Blok ◽

Djoerd Hiemstra ◽

Sunil Choenni ◽

Franciska de Jong ◽

Henk M. Blanken ◽

...

Keyword(s):

Information Retrieval ◽

Trade Off ◽

The Cost


Metropolitan governance structure and growth–inequality dynamics in the United States

Environment and Planning A Economy and Space ◽

10.1177/0308518x18810002 ◽

2018 ◽

Vol 51 (3) ◽

pp. 598-616 ◽

Cited By ~ 1

Author(s):

Jaewoo Cho ◽

Jae Hong Kim ◽

Yonsu Kim

Keyword(s):

Economic Growth ◽

Metropolitan Areas ◽

Governance Structure ◽

The United States ◽

Governance Structures ◽

Least Squares Regression ◽

Metropolitan Governance ◽

Trade Off ◽

Using Data ◽

Nonlinear Fashion

While much scholarly attention has been paid to ways in which metropolitan areas are politically structured and operated to achieve the dual goal of economic growth and equality, relatively little is known about the complex relationship between metropolitan governance structures and growth–inequality dynamics. This study investigates how and to what extent metropolitan governance structures shape regional economic growth and inequality trajectories using data for 267 US metropolitan areas from 1990 to 2010. Findings from a two-stage least squares regression analysis suggest that economic growth is associated with governance structures in a nonlinear fashion, with relatively more rapid growth rates in both highly centralized and highly decentralized metropolitan areas. However, these regions are also found to experience a larger increase in income inequality, indicating an important trade-off to be considered carefully in exploring ways to reform existing governance settings. These findings further suggest that the so-called growth–inequality trade-off may exist not only in their direct interactions but also through their connections via governance or other variables.


PCN67 Time to Onset Analysis of Sildenafil Associated Malignant Melanoma Using DATA from FDA Adverse Event Reporting System (FAERS) Database

Value in Health Regional Issues ◽

10.1016/j.vhri.2020.07.117 ◽

2020 ◽

Vol 22 ◽

pp. S17-S18

Author(s):

P. Dsouza ◽

J. Yalamanchili ◽

S. K Viswam ◽

N. Ravindra Reddy ◽

F. Mazhar

Keyword(s):

Adverse Event ◽

Malignant Melanoma ◽

Reporting System ◽

Adverse Event Reporting System ◽

Adverse Event Reporting ◽

Event Reporting ◽

Using Data ◽

Time To Onset


Thrombocytopenia with Tedizolid and Linezolid

Antimicrobial Agents and Chemotherapy ◽

10.1128/aac.01453-17 ◽

2017 ◽

Vol 62 (1) ◽

Cited By ~ 15

Author(s):

Erica Yookyung Lee ◽

Aisling R. Caffrey

Keyword(s):

Adverse Event ◽

Confidence Interval ◽

Drug Administration ◽

Odds Ratio ◽

Reporting System ◽

Adverse Event Reporting System ◽

Adverse Event Reporting ◽

Reporting Odds Ratio ◽

Event Reporting ◽

Using Data

Several studies have suggested the risk of thrombocytopenia with tedizolid, a second-in-class oxazolidinone antibiotic (approved June 2014), is less than that observed with linezolid (first-in-class oxazolidinone). Using data from the Food and Drug Administration adverse event reporting system (July 2014 through December 2016), we observed significantly increased risks of thrombocytopenia of similar magnitudes with both antibiotics: linezolid reporting odds ratio [ROR], 37.9 (95% confidence interval [CI], 20.78 to 69.17); tedizolid ROR, 34.0 (95% CI, 4.67 to 247.30).
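For readers unfamiliar with the reporting odds ratio, the sketch below shows the standard 2x2 calculation with a Wald-type 95% confidence interval on the log scale. The function name and the counts are hypothetical; they are not the study's data, and the abstract does not state which interval method the authors used.

```python
import math


def reporting_odds_ratio(a, b, c, d, z=1.96):
    """Return (ROR, lower, upper) from a 2x2 table of report counts.

    a: reports with the drug and the event
    b: reports with the drug, without the event
    c: reports with other drugs and the event
    d: reports with other drugs, without the event
    z: normal quantile for the interval (1.96 for roughly 95%)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    ror = (a * d) / (b * c)
    # Standard error of ln(ROR) under the usual large-sample approximation.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se)
    upper = math.exp(math.log(ror) + z * se)
    return ror, lower, upper


# Hypothetical counts for illustration only.
ror, lower, upper = reporting_odds_ratio(20, 80, 10, 90)
```

Because the standard error grows as any cell count shrinks, a drug with few reports gets a wide interval, which is consistent with the much wider interval reported for tedizolid than for linezolid.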


ALBIS

Sociotechnical Enterprise Information Systems Design and Integration ◽

10.4018/978-1-4666-3664-4.ch012 ◽

2013 ◽

pp. 188-206

Author(s):

Lerina Aversano ◽

Carmine Grasso ◽

Maria Tortorella

Keyword(s):

Information Retrieval ◽

Software Maintenance ◽

Business Processes ◽

Semantic Analysis ◽

Software Systems ◽

Complex Task ◽

Process Performance ◽

Definition Of ◽

Support Software

Evaluating the level of alignment between a business process and its supporting software systems is a critical concern for an organization: the higher the alignment, the better the process performance. Monitoring the alignment requires characterizing all the items it involves and defining measures for evaluating it. This is a complex task, and automatic tools that support evaluation and evolution activities can be valuable. This chapter presents the ALBIS Environment (Aligning Business Processes and Information Systems), designed to support software maintenance tasks. In particular, the proposed environment allows modeling and tracing between business and software entities and measuring their degree of alignment. An information retrieval approach is embedded in ALBIS, based on two processing phases: syntactic analysis and semantic analysis. The usefulness of the environment is discussed through two case studies.


Distributed Systems

Cryptographic Primitives in Blockchain Technology ◽

10.1093/oso/9780198862840.003.0005 ◽

2020 ◽

pp. 143-198

Author(s):

Andreas Bolfing

Keyword(s):

Distributed Systems ◽

Fault Tolerant ◽

Deterministic Algorithm ◽

Impossibility Result ◽

Software Systems ◽

Trade Off ◽

Byzantine Failures ◽

The Individual ◽

Special Case

Chapter 5 considers distributed systems by their properties. The first section studies the classification of software systems, which are usually divided into centralized, decentralized, and distributed systems. It studies the differences between these three major approaches, showing that the classification is multidimensional rather than linear. The most important case is distributed systems, which enable computational tasks to be spread across several autonomous, independently acting computational entities. A very important result for this case is the CAP theorem, which considers the trade-off between consistency, availability, and partition tolerance. The last section deals with the possibility of reaching consensus in distributed systems, discussing how fault-tolerant consensus mechanisms enable mutual agreement among the individual entities in the presence of failures. One very special case, so-called Byzantine failures, is discussed in great detail. The main result is the so-called FLP Impossibility Result, which states that there is no deterministic algorithm that guarantees a solution to the consensus problem in the asynchronous case. The chapter concludes by considering practical solutions that circumvent the impossibility result in order to reach consensus.
