Using Multiple Query Aspects to Build Test Collections without Human Relevance Judgments

Author(s):  
Miles Efron
2020
Author(s):  
Jimmy Chen ◽  
William R. Hersh

Abstract
The COVID-19 pandemic has resulted in a rapidly growing quantity of scientific publications from journal articles, preprints, and other sources. The TREC-COVID Challenge was created to evaluate information retrieval (IR) methods and systems for this quickly expanding corpus. Based on the COVID-19 Open Research Dataset (CORD-19), several dozen research teams participated across the five rounds of the TREC-COVID Challenge. While previous work has compared IR techniques used on other test collections, no studies have analyzed the methods used by participants in the TREC-COVID Challenge. We manually reviewed team run reports from Rounds 2 and 5, extracted features from the documented methodologies, and used univariate and multivariate regression-based analyses to identify features associated with higher retrieval performance. We observed that fine-tuning datasets with relevance judgments, MS-MARCO, and CORD-19 document vectors was associated with improved performance in Round 2 but not in Round 5. Though the relatively decreased heterogeneity of runs in Round 5 may explain the lack of significance in that round, fine-tuning has been found to improve search performance in previous challenge evaluations by improving a system's ability to map relevant queries and phrases to documents. Furthermore, term expansion was associated with improved system performance, and use of the narrative field in the TREC-COVID topics was associated with decreased system performance in both rounds. These findings emphasize the need for clear queries in search. While our study has some limitations in its generalizability and the scope of techniques analyzed, we identified IR techniques that may be useful in building search systems for COVID-19 using the TREC-COVID test collections.
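The regression analysis described in the abstract can be illustrated with a small sketch. The snippet below (Python, not taken from the paper) regresses a retrieval metric against binary features extracted from run reports, first one feature at a time (univariate) and then jointly (multivariate); the feature names, metric values, and data are hypothetical.

    # Minimal sketch of the feature-vs-performance regression described in the
    # abstract; not the authors' code. Feature names and values are hypothetical.
    import pandas as pd
    import statsmodels.api as sm

    # One row per submitted run: a retrieval metric plus binary indicators of
    # whether the run report documented a given technique.
    runs = pd.DataFrame({
        "ndcg_10":           [0.61, 0.55, 0.72, 0.48, 0.66, 0.59, 0.70, 0.52],
        "finetuned_msmarco": [1, 0, 1, 0, 1, 0, 1, 0],
        "term_expansion":    [1, 1, 1, 0, 0, 1, 1, 0],
        "used_narrative":    [0, 1, 0, 1, 0, 1, 0, 1],
    })

    y = runs["ndcg_10"]
    features = ["finetuned_msmarco", "term_expansion", "used_narrative"]

    # Univariate analysis: regress the metric on each feature separately.
    for feat in features:
        fit = sm.OLS(y, sm.add_constant(runs[[feat]])).fit()
        print(f"{feat}: coef={fit.params[feat]:+.3f}, p={fit.pvalues[feat]:.3f}")

    # Multivariate analysis: all features in one model, so each coefficient is
    # adjusted for the presence of the other techniques.
    print(sm.OLS(y, sm.add_constant(runs[features])).fit().summary())

In this framing, a positive and significant coefficient corresponds to the kind of association the study reports for term expansion, while a negative coefficient corresponds to the decrease observed when the narrative field was used.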


Author(s):  
Xiaolu Lu ◽  
Oren Kurland ◽  
J. Shane Culpepper ◽  
Nick Craswell ◽  
Ofri Rom

2019
Vol 53 (2)
pp. 108-118
Author(s):  
Martin Braschler ◽  
Linda Cappellato ◽  
Fabio Crestani ◽  
Nicola Ferro ◽  
Gundula Heinatz Bürki ◽  
...  

This is a report on the tenth edition of the Conference and Labs of the Evaluation Forum (CLEF 2019), held from September 9 to 12, 2019, in Lugano, Switzerland. CLEF was a four-day event combining a Conference and an Evaluation Forum. The Conference featured keynotes by Bruce Croft, Yair Neuman, and Miguel Martínez, along with presentations of peer-reviewed research papers covering a wide range of topics, in addition to many posters. The Evaluation Forum consisted of nine Labs: CENTRE, CheckThat, eHealth, eRisk, ImageCLEF, LifeCLEF, PAN, PIR-CLEF, and ProtestNews, addressing a wide range of tasks, media, languages, and ways to go beyond standard test collections. CLEF 2019 marked the 20th anniversary of CLEF, which was celebrated with a dedicated session and a book on the lessons learnt in twenty years of evaluation activities and the future perspectives for CLEF. CLEF 2019 also introduced the Industry Days to further extend the reach and impact of CLEF.


Author(s):  
Lukas Gienapp ◽  
Maik Fröbe ◽  
Matthias Hagen ◽  
Martin Potthast
