relevance assessments
Recently Published Documents

TOTAL DOCUMENTS: 33 (five years: 3)
H-INDEX: 9 (five years: 1)

2020 ◽ Vol 54 (2) ◽ pp. 1-2
Author(s): Dan Li

The availability of test collections in the Cranfield paradigm has significantly benefited the development of models, methods, and tools in information retrieval. Such test collections typically consist of a set of topics, a document collection, and a set of relevance assessments. Constructing these test collections requires effort on several fronts, such as topic selection, document selection, relevance assessment, and relevance label aggregation. The work in this thesis provides a fundamental way of constructing and utilizing test collections in information retrieval in an effective, efficient, and reliable manner. To that end, we have focused on four aspects.

We first study the document selection issue when building test collections. We devise an active sampling method for efficient large-scale evaluation [Li and Kanoulas, 2017]. Unlike past sampling-based approaches, we account for the fact that some systems are of higher quality than others, and we design the sampling distribution to over-sample documents from these systems. At the same time, the estimated evaluation measures remain unbiased, and the assessments can be used to evaluate new, novel systems without introducing any systematic error.

A natural further step is determining when to stop the document selection and assessment procedure. This is an important but understudied problem in the construction of test collections. We treat both the gain of identifying relevant documents and the cost of assessing documents as optimization goals. We handle the problem under the continuous active learning framework by jointly training a ranking model to rank documents and estimating the total number of relevant documents in the collection using a "greedy" sampling method [Li and Kanoulas, 2020]. The next stage of constructing a test collection is assessing relevance.
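The unbiasedness claim above can be illustrated with a small importance-sampling sketch. The particular sampling distribution below (quality-weighted reciprocal ranks) and all names are illustrative assumptions, not the distribution from [Li and Kanoulas, 2017]; the point is only that reweighting each assessed document by the inverse of its inclusion probability keeps the estimate of the total number of relevant documents unbiased, even when documents from high-quality systems are deliberately over-sampled.

```python
import random

def sampling_distribution(rankings, system_weights):
    # rankings: {system: [doc ids in rank order]}
    # system_weights: {system: estimated system quality}
    # Give each document probability mass proportional to the
    # quality-weighted reciprocal of its ranks, so documents ranked
    # highly by strong systems are over-sampled (a hypothetical
    # choice for illustration).
    mass = {}
    for system, ranked in rankings.items():
        w = system_weights[system]
        for rank, doc in enumerate(ranked, start=1):
            mass[doc] = mass.get(doc, 0.0) + w / rank
    total = sum(mass.values())
    return {doc: m / total for doc, m in mass.items()}

def estimate_total_relevant(probs, assess, k, rng=random):
    # Sample k documents with replacement and apply an
    # importance-weighted (Horvitz-Thompson-style) estimator:
    # each assessed relevance value is divided by the document's
    # sampling probability, so E[estimate] equals the true total
    # number of relevant documents regardless of the skew in probs.
    docs = list(probs)
    weights = [probs[d] for d in docs]
    sample = rng.choices(docs, weights=weights, k=k)
    return sum(assess(d) / probs[d] for d in sample) / k
```

With two relevant documents in a five-document pool, the estimate concentrates around 2 as the sample grows, even though the sampling distribution is far from uniform.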
We study how to denoise relevance assessments by aggregating annotations from multiple crowd sources to obtain high-quality relevance assessments. This helps to boost the quality of relevance assessments acquired through crowdsourcing. Our model, CrowdGP, assumes a Gaussian process prior on query-document pairs to model their correlation. It shows good performance in terms of inferring true relevance labels. Moreover, it can predict relevance labels for new tasks that have no crowd annotations, which is a new functionality of CrowdGP. Ablation studies demonstrate that its effectiveness is attributable to the modelling of task correlation based on the auxiliary information of tasks and the prior relevance information of documents to queries.

After a test collection is constructed, it can be used either to evaluate retrieval systems or to train a ranking model. We propose to use it to optimize the configuration of retrieval systems. We use a Bayesian optimization approach to model the effect of a δ-step in the configuration space on the effectiveness of the retrieval system, suggesting different similarity functions (covariance functions) for continuous and categorical values, and we examine their ability to effectively and efficiently guide the search in the configuration space [Li and Kanoulas, 2018].

Beyond the algorithmic and empirical contributions, work done as part of this thesis also contributed to the research community through the CLEF Technology Assisted Reviews in Empirical Medicine tracks in 2017, 2018, and 2019 [Kanoulas et al., 2017, 2018, 2019].

Awarded by: University of Amsterdam, Amsterdam, The Netherlands.
Supervised by: Evangelos Kanoulas.
Available at: https://dare.uva.nl/search?identifier=3438a2b6-9271-4f2c-add5-3c811cc48d42.
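As a rough illustration of how a Gaussian process prior lets correlated tasks share information, and hence lets a model predict labels for tasks with no annotations at all, here is a minimal GP regression sketch over task feature vectors. The RBF kernel, the feature representation, and the plain-regression setup are simplifying assumptions for illustration; CrowdGP itself additionally models crowd annotator behaviour on top of such a prior.

```python
import numpy as np

def rbf_kernel(X1, X2, length_scale=1.0, variance=1.0):
    # Squared-exponential covariance: tasks with similar feature
    # vectors (e.g. query-document representations) get high covariance.
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / length_scale ** 2)

def gp_predict(X_train, y_train, X_new, noise=0.1):
    # GP posterior mean: a prediction for a new task is a weighted
    # combination of the observed labels, with weights driven by the
    # task correlations -- so a task with no annotations still
    # inherits a label estimate from its correlated neighbours.
    K = rbf_kernel(X_train, X_train) + noise * np.eye(len(X_train))
    K_star = rbf_kernel(X_new, X_train)
    return K_star @ np.linalg.solve(K, y_train)
```

For instance, if two annotated tasks near feature value 0 carry label 1 and one near 5 carries label 0, an unannotated task at 0.05 is predicted close to 1 and one at 5.1 close to 0.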


2018 ◽
Author(s): James Grimmelmann

98 Minnesota Law Review 868 (2014)

Academic and regulatory debates about Google are dominated by two opposing theories of what search engines are and how law should treat them. Some describe search engines as passive, neutral conduits for websites’ speech; others describe them as active, opinionated editors: speakers in their own right. The conduit and editor theories give dramatically different policy prescriptions in areas ranging from antitrust to copyright. But they both systematically discount search users’ agency, regarding users merely as passive audiences.

A better theory is that search engines are not primarily conduits or editors, but advisors. They help users achieve their diverse and individualized information goals by sorting through the unimaginable scale and chaos of the Internet. Search users are active listeners, affirmatively seeking out the speech they wish to receive. Search engine law can help them by ensuring two things: access to high-quality search engines, and loyalty from those search engines.

The advisor theory yields fresh insights into long-running disputes about Google. It suggests, for example, a new approach to deciding when Google should be liable for giving a website the “wrong” ranking. Users’ goals are too subjective for there to be an absolute standard of correct and incorrect rankings; different search engines necessarily assess relevance differently. But users are also entitled to complain when a search engine deliberately misleads them about its own relevance assessments. The result is a sensible, workable compromise between the conduit and editor theories.


2017 ◽ Vol 73 (6) ◽ pp. 1209-1227
Author(s): Kaitlin Light Costello

Purpose
The purpose of this paper is to introduce the concept of social relevance assessments, which are judgments made by individuals when they seek out information within virtual social worlds such as online support groups (OSGs).

Design/methodology/approach
Constructivist grounded theory was employed to examine the phenomenon of information exchange in OSGs for chronic kidney disease. In-depth interviews were conducted with 12 participants, and their posts in three OSGs were also harvested. Data were analyzed using inductive content analysis and the constant comparative method. Theoretical sampling was conducted until saturation was reached. Member checking, peer debriefing, and triangulation were used to verify results.

Findings
There are two levels of relevance assessment that occur when people seek out information in OSGs. First, participants evaluate the OSG to determine whether or not the group is an appropriate place for information exchange about kidney disease. Second, participants evaluate individual users on the OSG to see if they are appropriate people with whom to exchange information. This often takes the form of similarity assessment, whereby people try to determine whether or not they are similar to specific individuals on the forums. They use a variety of heuristics to assess similarity as part of this process.

Originality/value
This paper extends the author’s understanding of relevance in information science in two fundamental ways. Within the context of social information exchange, relevance is socially constructed and is based on social characteristics, such as age, shared beliefs, and experience. Moreover, relevance is assessed both when participants seek out information and when they disclose information, suggesting that the conception of relevance as a process that occurs primarily during information seeking is limited.

