Video Test Collection with Graded Relevance Assessments

Author(s):  
Weng Qiying ◽  
Martin Halvey ◽  
Robert Villa
2020 ◽  
Vol 54 (2) ◽  
pp. 1-2
Author(s):  
Dan Li

The availability of test collections in the Cranfield paradigm has significantly benefited the development of models, methods and tools in information retrieval. Such test collections typically consist of a set of topics, a document collection and a set of relevance assessments. Constructing these test collections requires effort on several fronts, such as topic selection, document selection, relevance assessment, and relevance label aggregation. The work in this thesis provides a fundamental way of constructing and utilizing test collections in information retrieval in an effective, efficient and reliable manner. To that end, we have focused on four aspects. We first study the document selection issue when building test collections. We devise an active sampling method for efficient large-scale evaluation [Li and Kanoulas, 2017]. Unlike past sampling-based approaches, we account for the fact that some systems are of higher quality than others, and we design the sampling distribution to over-sample documents from these systems. At the same time, the estimated evaluation measures are unbiased, and the assessments can be used to evaluate new systems without introducing any systematic error. A natural further step is determining when to stop the document selection and assessment procedure. This is an important but understudied problem in the construction of test collections. We consider both the gain of identifying relevant documents and the cost of assessing documents as the optimization goals. We handle the problem under the continuous active learning framework by jointly training a ranking model to rank documents and estimating the total number of relevant documents in the collection using a "greedy" sampling method [Li and Kanoulas, 2020]. The next stage of constructing a test collection is assessing relevance.
We study how to denoise relevance assessments by aggregating labels from multiple crowd annotation sources to obtain high-quality relevance assessments. This helps to boost the quality of relevance assessments acquired through crowdsourcing. We assume a Gaussian process prior on query-document pairs to model their correlation. The proposed model, CrowdGP, performs well at inferring true relevance labels. It also allows predicting relevance labels for new tasks that have no crowd annotations, which is a new functionality. Ablation studies demonstrate that the effectiveness is attributable to the modelling of task correlation based on the auxiliary information of tasks and the prior relevance information of documents to queries. After a test collection is constructed, it can be used either to evaluate retrieval systems or to train a ranking model. We propose to use it to optimize the configuration of retrieval systems. We use a Bayesian optimization approach to model the effect of a δ-step in the configuration space on the effectiveness of the retrieval system, suggesting the use of different similarity functions (covariance functions) for continuous and categorical values, and examine their ability to effectively and efficiently guide the search in the configuration space [Li and Kanoulas, 2018]. Beyond the algorithmic and empirical contributions, work done as part of this thesis also contributed to the research community through the CLEF Technology Assisted Reviews in Empirical Medicine Tracks in 2017, 2018, and 2019 [Kanoulas et al., 2017, 2018, 2019]. Awarded by: University of Amsterdam, Amsterdam, The Netherlands. Supervised by: Evangelos Kanoulas. Available at: https://dare.uva.nl/search?identifier=3438a2b6-9271-4f2c-add5-3c811cc48d42.
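The claim that evaluation estimates can stay unbiased even when documents are over-sampled from stronger systems can be illustrated with a small sketch. It assumes documents are drawn with replacement from a known, non-uniform distribution and uses a Hansen-Hurwitz style estimator; this is a generic textbook construction chosen for illustration, not necessarily the estimator used in the thesis, and all function and variable names are hypothetical.

```python
import random


def draw_sample(probs, n, rng):
    """Draw n document ids with replacement, following the per-draw probabilities."""
    docs = list(probs)
    weights = [probs[d] for d in docs]
    return rng.choices(docs, weights=weights, k=n)


def estimate_relevant_total(sampled, probs, judgments):
    """Hansen-Hurwitz style estimate of the number of relevant documents.

    sampled:   document ids drawn with replacement
    probs:     doc id -> probability of being drawn on a single draw
    judgments: doc id -> 1 if judged relevant, else 0

    Each judgment is divided by its draw probability, so documents that were
    over-sampled count for proportionally less and the estimate stays unbiased.
    """
    return sum(judgments[d] / probs[d] for d in sampled) / len(sampled)


# Hypothetical toy pool: document "a" is deliberately over-sampled.
probs = {"a": 0.5, "b": 0.3, "c": 0.2}
judgments = {"a": 1, "b": 0, "c": 1}

sample = draw_sample(probs, n=1000, rng=random.Random(0))
estimate = estimate_relevant_total(sample, probs, judgments)
```

With enough draws the estimate concentrates around the true count of relevant documents (two in this toy pool), regardless of how skewed the sampling distribution is, provided every document has a non-zero draw probability.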


2007 ◽  
Vol 41 (2) ◽  
pp. 42-45 ◽  
Author(s):  
Peter Bailey ◽  
Nick Craswell ◽  
Ian Soboroff ◽  
Arjen P. de Vries

Author(s):  
Cathal Gurrin ◽  
Klaus Schoeffmann ◽  
Hideo Joho ◽  
Bernd Munzer ◽  
Rami Albatal ◽  
...  

2009 ◽  
pp. 3040-3041
Author(s):  
Ben Carterette

2021 ◽  
Author(s):  
Arabzadehghahyazi Negar

Pre-retrieval Query Performance Prediction (QPP) methods are oblivious to the performance of the retrieval model, as they predict query difficulty prior to observing the set of documents retrieved for the query. Among pre-retrieval query performance predictors, specificity-based metrics investigate how corpus, query and corpus-query level statistics can be used to predict the performance of the query. In this thesis, we explore how neural embeddings can be utilized to define corpus-independent and semantics-aware specificity metrics. Our metrics are based on the intuition that a term that is closely surrounded by other terms in the embedding space is more likely to be specific, while a term surrounded by less closely related terms is more likely to be generic. On this basis, we leverage geometric properties between embedded terms to define four groups of metrics: (1) neighborhood-based, (2) graph-based, (3) cluster-based and (4) vector-based metrics. Moreover, we employ learning-to-rank techniques to analyze the importance of individual specificity metrics. To evaluate the proposed metrics, we have curated and publicly shared a test collection of term specificity measurements defined based on the Wikipedia category hierarchy and the DMOZ taxonomy. We report on our extensive experiments on the effectiveness of our metrics through metric comparison, an ablation study and comparison against state-of-the-art baselines. We show that our proposed set of pre-retrieval QPP metrics, based on the properties of pre-trained neural embeddings, is more effective for performance prediction than state-of-the-art methods. We report our findings based on the Robust04, ClueWeb09 and Gov2 corpora and their associated TREC topics.
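The intuition behind the neighborhood-based group can be sketched in a few lines. The example below scores a term by the mean cosine similarity to its k nearest neighbors in the embedding space, so a tightly surrounded term scores higher. This is one plausible reading of "closely surrounded", not the thesis's exact metric definition; the function names and the two-dimensional toy vectors are made up for illustration and do not come from a real embedding model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def neighborhood_specificity(term, embeddings, k=2):
    """Mean cosine similarity between a term and its k nearest neighbors.

    A higher value means the term sits in a dense region of the space,
    which the abstract's intuition associates with a more specific term.
    """
    sims = sorted(
        (cosine(embeddings[term], vec)
         for other, vec in embeddings.items() if other != term),
        reverse=True,
    )
    top = sims[:k]
    return sum(top) / len(top)


# Hypothetical toy vectors: three near-identical terms and one isolated term.
TOY_EMBEDDINGS = {
    "espresso": [1.0, 0.1],
    "latte": [0.9, 0.2],
    "cappuccino": [0.95, 0.15],
    "thing": [0.0, 1.0],
}

specific_score = neighborhood_specificity("espresso", TOY_EMBEDDINGS)
generic_score = neighborhood_specificity("thing", TOY_EMBEDDINGS)
```

In this toy space "espresso" has close neighbors and receives a higher score than the isolated "thing". A query-level predictor would then aggregate such term scores over the query terms, for example by taking their mean or maximum.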


Author(s):  
Shaghayegh Bahramiabdolmalaki ◽  
Alireza Homayouni ◽  
Masoud Aliyali

Introduction: Psychosomatic experts have tried to associate mental disorders with physical illnesses. The vulnerability of different parts of the body is thought to depend on fundamental differences between individuals. One of the methods that seems to affect the psychological problems of asthma patients is acceptance and commitment therapy. Therefore, the aim of this study was to evaluate the effectiveness of acceptance- and commitment-based therapy on resilience, psychological well-being, and life expectancy in asthmatic patients. Methods: This quasi-experimental pre-test and post-test study was conducted on 30 asthmatic patients who met the inclusion criteria and were randomly assigned to the experimental (n = 15) and control (n = 15) groups. Acceptance and commitment therapy was delivered to the experimental group in 8 sessions of 60 minutes, based on the treatment package of Hayes et al.; no intervention was performed on the control group. All participants took part in the pre-test and post-test. Data collection tools included the Connor-Davidson Resilience Questionnaire, Schneider Life Expectancy, and Ryff Psychological Well-being. Results: The results showed a significant difference in the components of resilience, psychological well-being, and life expectancy in asthmatic patients before and after the experiment (p < 0.05). In other words, acceptance- and commitment-based therapy had a positive effect on resilience, psychological well-being and life expectancy in asthmatic patients, and these components increased in patients. Conclusion: Findings showed that acceptance- and commitment-based therapy was effective in improving the resilience, psychological well-being, and life expectancy of asthmatic patients. It is suggested that this treatment be used in conjunction with drug therapy to improve the psychological symptoms of asthmatic patients.

