Methods for ranking information retrieval systems without relevance judgments

Author(s):  
Shengli Wu ◽  
Fabio Crestani
2020 ◽  
Vol 28 (3) ◽  
pp. 148-168
Author(s):  
Jin Zhang ◽  
Yuehua Zhao ◽  
Xin Cai ◽  
Taowen Le ◽  
Wei Fei ◽  
...  

Relevance judgment plays an extremely significant role in information retrieval. This study investigates the differences between American users and Chinese users in relevance judgment during the information retrieval process. A total of 384 sets of relevance scores, with 50 scores in each set, were collected from 16 American users and 16 Chinese users as they judged retrieval records from two major search engines based on 24 predefined search tasks from 4 domain categories. Statistical analyses reveal significant differences between American assessors and Chinese assessors in relevance judgments. Significant gender differences also appear within both the American and the Chinese assessor groups. The study also reveals significant interactions among cultures, genders, and subject categories. These findings can enhance the understanding of cultural impact on information retrieval and can assist in the design of effective cross-language information retrieval systems.
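The abstract does not specify which statistical tests were used. As an illustration only, a group-mean comparison of relevance scores could be sketched with a Welch's t statistic; the data and the 0-4 score scale below are hypothetical, not from the study.

```python
import statistics

def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    ma, mb = statistics.mean(a), statistics.mean(b)
    va, vb = statistics.variance(a), statistics.variance(b)  # sample variance
    se = (va / len(a) + vb / len(b)) ** 0.5
    return (ma - mb) / se

# Hypothetical relevance scores (0-4 scale) from two assessor groups
group_a = [3, 4, 2, 4, 3, 3, 4, 2]
group_b = [2, 1, 3, 2, 2, 1, 2, 3]
print(round(welch_t(group_a, group_b), 2))  # ≈ 2.83
```

A large |t| would then be compared against the t distribution to decide significance; the study's actual analysis (including the reported interaction effects) would require a factorial design such as ANOVA.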


Author(s):  
Theresa Dirndorfer Anderson

This chapter uses a study of human assessments of relevance to demonstrate how individual relevance judgments and retrieval practices embody collaborative elements that contribute to the overall progress of that person’s individual work. After discussing key themes of the conceptual framework, the chapter will discuss two case studies that serve as powerful illustrations of these themes for researchers and practitioners alike. These case studies, outcomes of a two-year ethnographic exploration of research practices, illustrate the theoretical position presented in part one of the chapter, providing lessons for the ways that people work with information systems to generate knowledge and the conditions that will support these practices. The chapter shows that collaboration does not have to be explicit to influence searcher behavior. It seeks to present both a theoretical framework and case studies that can be applied to the design, development and evaluation of collaborative information retrieval systems.


2014 ◽  
Vol 2014 ◽  
pp. 1-13 ◽  
Author(s):  
Parnia Samimi ◽  
Sri Devi Ravana

Test collections are used to evaluate information retrieval systems in laboratory-based evaluation experiments. In a classic setting, generating relevance judgments involves human assessors and is a costly and time-consuming task. Researchers and practitioners are still challenged to perform reliable and low-cost evaluations of retrieval systems. Crowdsourcing, as a novel method of data acquisition, is broadly used in many research fields. It has proven to be an inexpensive and quick solution, as well as a reliable alternative, for creating relevance judgments. One application of crowdsourcing in IR is judging the relevance of query-document pairs. For a crowdsourcing experiment to succeed, the relevance judgment tasks should be designed carefully, with an emphasis on quality control. This paper explores the factors that influence the accuracy of relevance judgments produced by workers and how to improve the reliability of judgments in crowdsourcing experiments.
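A common quality-control baseline in such experiments is aggregating redundant worker labels by majority vote. The sketch below is a minimal illustration of that general idea, not the paper's specific method; the query/document identifiers and labels are hypothetical.

```python
from collections import Counter

def aggregate_majority(labels):
    """Majority vote over per-worker binary relevance labels for one
    query-document pair. Ties default to non-relevant (0), a conservative choice."""
    counts = Counter(labels)
    return 1 if counts[1] > counts[0] else 0

# Hypothetical worker judgments: (query, document) -> labels from independent workers
judgments = {
    ("q1", "d3"): [1, 1, 0],
    ("q1", "d7"): [0, 1, 0, 0],
    ("q2", "d1"): [1, 0],          # tie -> 0
}
consensus = {pair: aggregate_majority(v) for pair, v in judgments.items()}
print(consensus)
```

In practice this baseline is usually combined with further controls the paper discusses, such as screening workers against gold-standard questions.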


2019 ◽  
Vol 71 (1) ◽  
pp. 2-17
Author(s):  
Prabha Rajagopal ◽  
Sri Devi Ravana ◽  
Yun Sing Koh ◽  
Vimala Balakrishnan

Purpose
Effort, in addition to relevance, is a major factor in the satisfaction and utility of a document to the actual user. The purpose of this paper is to propose a method of generating relevance judgments that incorporate effort without involving human judges. The study then determines the variation in system rankings caused by low-effort relevance judgments when evaluating retrieval systems at different depths of evaluation.
Design/methodology/approach
Effort-based relevance judgments are generated using a proposed boxplot approach over simple document features, HTML features, and readability features. The boxplot approach is a simple yet repeatable way of classifying documents' effort while ensuring that outlier scores do not skew the grading of the entire set of documents.
Findings
Evaluating retrieval systems with low-effort relevance judgments has a stronger influence at shallow depths of evaluation than at deeper depths. The difference in system rankings is shown to be due to low-effort documents rather than the number of relevant documents.
Originality/value
It is therefore crucial to evaluate retrieval systems at shallow depths using low-effort relevance judgments.
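The paper's boxplot approach is not specified in detail in the abstract, but the general idea of banding effort scores by quartiles while clamping outliers with IQR fences can be sketched as follows. The feature scores, grade names, and the particular quartile convention are assumptions for illustration.

```python
def quartiles(xs):
    """Q1, median, Q3 via linear interpolation (conventions for hinges vary)."""
    s = sorted(xs)
    def q(p):
        k = (len(s) - 1) * p
        lo, hi = int(k), min(int(k) + 1, len(s) - 1)
        return s[lo] + (s[hi] - s[lo]) * (k - lo)
    return q(0.25), q(0.5), q(0.75)

def effort_grade(score, q1, q3):
    """Grade a document's effort score by quartile band; 1.5*IQR fences clamp
    extreme scores so outliers do not skew the banding of the whole set."""
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    score = min(max(score, lo_fence), hi_fence)  # clamp outliers
    if score <= q1:
        return "low"
    if score <= q3:
        return "medium"
    return "high"

# Hypothetical per-document effort scores (e.g. derived from readability features)
scores = [12, 15, 14, 30, 11, 13, 90, 16, 14, 12]
q1, _, q3 = quartiles(scores)
grades = [effort_grade(s, q1, q3) for s in scores]
```

The clamping step is what makes the grading robust: the extreme score 90 is pulled back to the upper fence before banding, so it lands in the "high" band without stretching the bands for every other document.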


1967 ◽  
Vol 06 (02) ◽  
pp. 45-51 ◽  
Author(s):  
A. Kent ◽  
J. Belzer ◽  
M. Kuhfeerst ◽  
E. D. Dym ◽  
D. L. Shirey ◽  
...  

An experiment is described which attempts to derive quantitative indicators of the potential relevance predictability of the intermediate stimuli used to represent documents in information retrieval systems. Since the decision to peruse an entire document is often predicated on examining one »level of processing« of the document (e.g., the citation and/or abstract), it became interesting to analyze the properties of what constitutes »relevance«. Prior to such an analysis, however, an even more elementary step had to be taken, namely, determining what portions of a document should be examined.

The ability of intermediate response products (IRPs), functioning as cues to the information content of full documents, to predict the relevance determinations subsequently made on those documents by motivated users of information retrieval systems was evaluated under controlled experimental conditions. The hypothesis that other intermediate response products (selected extracts from the document, i.e., first paragraph, last paragraph, and the combination of first and last paragraphs) might be as representative of the full document as the traditional IRPs (citation and abstract) was tested systematically. The results showed that:

1. there is no significant difference among the several IRP treatment groups in the number of cue evaluations of relevancy that match the subsequent user relevancy decision on the document;
2. first and last paragraph combinations consistently predicted relevancy to a higher degree than the other IRPs;
3. abstracts were undistinguished as predictors; and
4. the apparent high predictability rating for citations was not substantive.

Some of these results are quite different from what would be expected from previous work with unmotivated subjects.
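The core quantity compared across IRP treatment groups, the fraction of cue-based relevance calls that match the user's final judgment on the full document, can be sketched as a simple agreement rate. The judgment vectors below are hypothetical, not the experiment's data.

```python
def match_rate(cue_judgments, full_judgments):
    """Fraction of cue-based relevance calls (from an IRP such as the
    first+last paragraph extract) that match the user's subsequent
    judgment on the full document."""
    assert len(cue_judgments) == len(full_judgments)
    matches = sum(c == f for c, f in zip(cue_judgments, full_judgments))
    return matches / len(full_judgments)

# Hypothetical binary judgments (1 = relevant) for one IRP treatment group
cue  = [1, 0, 1, 1, 0, 1, 0, 0]
full = [1, 0, 1, 0, 0, 1, 1, 0]
print(match_rate(cue, full))  # 6 of 8 agree -> 0.75
```

Comparing this rate across treatment groups (citation, abstract, first paragraph, last paragraph, first+last) is what finding 1 above refers to when it reports no significant difference.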


2005 ◽  
Vol 14 (5) ◽  
pp. 335-346
Author(s):  
Carlos Benito Amat

Libri ◽  
2020 ◽  
Vol 70 (3) ◽  
pp. 227-237
Author(s):  
Mahdi Zeynali-Tazehkandi ◽  
Mohsen Nowkarizi

Evaluation of information retrieval systems is a fundamental topic in Library and Information Science. The aim of this paper is to connect the system-oriented and the user-oriented approaches to the relevant philosophical schools. A review of the related literature shows that the evaluation of information retrieval systems is successful when it benefits from both the system-oriented and the user-oriented approaches (a composite). The system-oriented approach is rooted in Parmenides' philosophy of stability (the immovable), which Plato accepts and attributes to the world of forms; the user-oriented approach is rooted in Heraclitus' philosophy of flux (motion), which Plato relegates to the tangible world. Using Plato's theory is thus a comprehensive approach to understanding the concept of relevance. Theoretical and philosophical foundations determine the type of research methods and techniques; therefore, Plato's dialectical method is an appropriate composite method for evaluating information retrieval systems.

