Ranking retrieval systems using pseudo relevance judgments

2015 ◽  
Vol 67 (6) ◽  
pp. 700-714 ◽  
Author(s):  
Sri Devi Ravana ◽  
Prabha Rajagopal ◽  
Vimala Balakrishnan

Purpose – In a system-based evaluation approach, replicating the web would require large test collections, and having human assessors judge the relevance of every document for each topic is infeasible. Because so many documents require judgment, disagreements among human assessors can also introduce errors. The paper aims to discuss these issues. Design/methodology/approach – This study explores exponential variation and document ranking methods that generate a reliable set of relevance judgments (pseudo relevance judgments) to reduce human effort. These methods cope with the large number of documents requiring judgment while avoiding the errors caused by human disagreement during the judgment process. The study uses two key factors to build the alternative methods: the number of occurrences of each document per topic across all system runs, and the document rankings. Findings – The effectiveness of the proposed methods is evaluated using the correlation coefficient between two rankings of the systems by mean average precision: one obtained with the original Text REtrieval Conference (TREC) relevance judgments and one obtained with the pseudo relevance judgments. The results suggest that the proposed document ranking method with a pool depth of 100 could be a reliable alternative that reduces the human effort and disagreement errors involved in generating TREC-like relevance judgments. Originality/value – The simple methods proposed in this study improve the correlation coefficient when generating alternative relevance judgments without human assessors, while contributing to information retrieval evaluation.
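The evaluation loop described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the vote-threshold rule in `pseudo_qrels` is a simplified stand-in for the exponential variation and document ranking methods, and the function names, the `min_votes` parameter, the toy runs and the choice of Kendall's tau as the correlation coefficient are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def pseudo_qrels(runs, depth=100, min_votes=2):
    """Build pseudo relevance judgments from system runs.

    runs: {system: {topic: [doc_id, ...]}} with documents in ranked order.
    ASSUMPTION: a document is treated as relevant when at least `min_votes`
    systems retrieve it within the top `depth` results for the topic.
    """
    votes = {}
    for rankings in runs.values():
        for topic, ranking in rankings.items():
            votes.setdefault(topic, Counter()).update(set(ranking[:depth]))
    return {t: {d for d, n in c.items() if n >= min_votes} for t, c in votes.items()}


def average_precision(ranking, relevant):
    """Average precision of one ranked list against a set of relevant docs."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(run, qrels):
    """Mean of per-topic average precision over the topics in qrels."""
    return sum(average_precision(run.get(t, []), rel) for t, rel in qrels.items()) / len(qrels)


def kendall_tau(x, y):
    """Kendall tau-a; tied pairs count as neither concordant nor discordant."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length score lists with at least 2 items")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (len(x) * (len(x) - 1) / 2)


# Hypothetical toy data: three systems, one topic.
runs = {
    "sysA": {"t1": ["d1", "d2", "d3"]},
    "sysB": {"t1": ["d2", "d4", "d1"]},
    "sysC": {"t1": ["d5", "d6", "d1"]},
}
human = {"t1": {"d1", "d3"}}  # stand-in for the original judgments
pseudo = pseudo_qrels(runs, depth=3, min_votes=2)

systems = sorted(runs)
map_human = [mean_average_precision(runs[s], human) for s in systems]
map_pseudo = [mean_average_precision(runs[s], pseudo) for s in systems]
print(kendall_tau(map_human, map_pseudo))
```

A correlation close to 1 means the pseudo judgments order the systems much as the human judgments do, which is the property the abstract reports on.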

2019 ◽  
Vol 71 (1) ◽  
pp. 2-17
Author(s):  
Prabha Rajagopal ◽  
Sri Devi Ravana ◽  
Yun Sing Koh ◽  
Vimala Balakrishnan

Purpose In addition to relevance, effort is a major factor in the satisfaction and utility a document offers to the actual user. The purpose of this paper is to propose a method for generating relevance judgments that incorporate effort without the involvement of human judges. The study then determines how system rankings vary when low effort relevance judgments are used to evaluate retrieval systems at different depths of evaluation. Design/methodology/approach Effort-based relevance judgments are generated using a proposed boxplot approach for simple document features, HTML features and readability features. The boxplot approach is a simple yet repeatable way of classifying documents’ effort while ensuring outlier scores do not skew the grading of the entire set of documents. Findings Evaluating retrieval systems with low effort relevance judgments has a stronger influence at shallow depths of evaluation than at deeper depths. The study shows that the difference in the system rankings is due to low effort documents and not the number of relevant documents. Originality/value Hence, it is crucial to evaluate retrieval systems at shallow depth using low effort relevance judgments.
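The abstract does not spell out the boxplot approach, so the following is only a sketch of one plausible reading: grade each document by the quartile band its feature score falls into, and flag scores beyond the usual 1.5 × IQR fences as outliers. The three grade labels, the use of word count as the feature and the assumption that a larger score means more effort are all hypothetical.

```python
import statistics


def boxplot_grades(scores):
    """Grade each score as low/medium/high effort using boxplot quartiles.

    ASSUMPTION: scores at or below Q1 are "low" effort, scores above Q3 are
    "high" effort, and the rest are "medium". Because the cut points are
    quartiles, one extreme score does not shift the grades of the others.
    """
    q1, _median, q3 = statistics.quantiles(scores, n=4)
    iqr = q3 - q1
    lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    graded = []
    for score in scores:
        if score <= q1:
            grade = "low"
        elif score <= q3:
            grade = "medium"
        else:
            grade = "high"
        is_outlier = score < lower_fence or score > upper_fence
        graded.append((score, grade, is_outlier))
    return graded


# Hypothetical word counts for eight documents; the last one is extreme.
word_counts = [100, 120, 150, 200, 250, 300, 400, 10000]
for score, grade, is_outlier in boxplot_grades(word_counts):
    print(score, grade, "outlier" if is_outlier else "")
```

With min-max scaling, the single 10000-word document would compress every other score towards zero; quartile cut points avoid that, which matches the robustness to outliers that the abstract claims for the boxplot approach.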


2020 ◽  
Vol 28 (3) ◽  
pp. 148-168
Author(s):  
Jin Zhang ◽  
Yuehua Zhao ◽  
Xin Cai ◽  
Taowen Le ◽  
Wei Fei ◽  
...  

Relevance judgment plays an extremely significant role in information retrieval. This study investigates the differences between American users and Chinese users in relevance judgment during the information retrieval process. A total of 384 sets of relevance scores, with 50 scores in each set, were collected from 16 American users and 16 Chinese users as they judged retrieval records from two major search engines based on 24 predefined search tasks from 4 domain categories. Statistical analyses reveal significant differences between American assessors and Chinese assessors in relevance judgments. Significant gender differences also appear within both the American and the Chinese assessor groups. The study also reveals significant interactions among cultures, genders, and subject categories. These findings can enhance the understanding of cultural impact on information retrieval and can assist in the design of effective cross-language information retrieval systems.


2011 ◽  
Vol 67 (2) ◽  
pp. 264-278 ◽  
Author(s):  
Heting Chu

Purpose This study intends to identify factors that affect relevance judgment of retrieved information as part of the 2007 TREC Legal track interactive task. Design/methodology/approach Data were gathered and analyzed from the participants of the 2007 TREC Legal track interactive task using a questionnaire which includes not only a list of 80 relevance factors identified in prior research, but also a space for expressing their thoughts on relevance judgment in the process. Findings This study finds that topicality remains a primary criterion, out of various options, for determining relevance, while specificity of the search request, task, or retrieved results also helps greatly in relevance judgment. Research limitations/implications Relevance research should focus on the topicality and specificity of what is being evaluated and should be conducted in real environments. Practical implications If multiple relevance factors are presented to assessors, the total number in a list should be below ten to take account of the limited processing capacity of human beings' short-term memory. Otherwise, the assessors might either completely ignore or inadequately consider some of the relevance factors when making judgment decisions. Originality/value This study presents a method for reducing the artificiality of relevance research design, an apparent limitation in many related studies. Specifically, relevance judgment was made in this research as part of the 2007 TREC Legal track interactive task rather than a study devised for the sake of it. The assessors also served as searchers so that their searching experience would facilitate their subsequent relevance judgments.


2014 ◽  
Vol 2014 ◽  
pp. 1-13 ◽  
Author(s):  
Parnia Samimi ◽  
Sri Devi Ravana

Test collections are used to evaluate information retrieval systems in laboratory-based evaluation experiments. In a classic setting, generating relevance judgments involves human assessors and is a costly and time-consuming task. Researchers and practitioners are still challenged to perform reliable and low-cost evaluation of retrieval systems. Crowdsourcing, as a novel method of data acquisition, is broadly used in many research fields. It has been shown that crowdsourcing is an inexpensive and quick solution as well as a reliable alternative for creating relevance judgments. One of the crowdsourcing applications in IR is judging the relevance of query-document pairs. For a crowdsourcing experiment to succeed, the relevance judgment tasks should be designed carefully, with an emphasis on quality control. This paper explores the factors that influence the accuracy of relevance judgments made by workers and how to improve the reliability of judgments in crowdsourcing experiments.


2019 ◽  
Vol 23 (4) ◽  
pp. 340-354
Author(s):  
Asim Kumar Roy Choudhury ◽  
Biswajit Naskar

Purpose This paper aims to compare visual (Munsell) and instrumental (CIELAB) attributes of SCOTDIC colour standards. Design/methodology/approach SCOTDIC cotton and polyester standards of defined hue, value and chroma were subjected to spectrophotometric assessment to find the corresponding instrumental parameters. The visual and instrumental parameters were compared. Findings The correlation between SCOTDIC value and CIELAB lightness is quite high. The correlation between SCOTDIC hue and CIELAB hue angle and the correlation between SCOTDIC chroma and CIELAB chroma were only moderate because the CIELAB chroma varied widely at higher chroma. When the standards of SCOTDIC hues having erratic hue angles at the two extremes are excluded, the correlation coefficients between SCOTDIC hue and CIELAB hue angle become high. Research limitations/implications The psychophysical data (visual) are difficult to match with physical data (instrumental). Originality/value The object of the present research is to study and compare visual (Munsell) and instrumental (CIELAB) colorimetric parameters. The Munsell scale is physically exemplified by SCOTDIC fabric samples available in two sets, namely, cotton and polyester sets.


2017 ◽  
Vol 35 (2) ◽  
pp. 180-191 ◽  
Author(s):  
Felix Septianto ◽  
Bambang Soegianto

Purpose Although previous research has established that moral emotion, moral judgment, and moral identity influence consumer intention to engage in prosocial behavior (e.g. donating, volunteering) under some circumstances, these factors, in reality, can concurrently influence the judgment process. Therefore, it is important to gain a more nuanced understanding of how combinations of these factors can lead to a high intention to engage in prosocial behavior. The paper aims to discuss these issues. Design/methodology/approach This research employs fuzzy-set qualitative comparative analysis to explore different configurations of moral emotion, judgment, and identity that lead to a high consumer intention to engage in prosocial behavior. Findings Findings indicate four configurations of moral emotion, moral judgment, and moral identity that lead to a high intention to engage in prosocial behavior. Research limitations/implications This research focuses on the case of a hospital in Indonesia; thus, it is important not to overgeneralize the findings. Nonetheless, from a methodological standpoint, there is an opportunity to extend the examination to other service and cultural contexts. Practical implications The findings of this research can help the hospital to develop effective combinations of advertising and marketing strategies to promote prosocial behavior among its customers. Originality/value This paper provides the first empirical evidence on the existence of multiple pathways of moral emotion, judgment, and identity that lead to a high consumer intention to engage in prosocial behavior. The implications of this research also highlight the importance of cultural context in understanding consumer behavior.


2006 ◽  
Vol 27 (6/7) ◽  
pp. 505-514 ◽  
Author(s):  
Jie Huang ◽  
Katherine Wong

Purpose From the cataloging librarians' point of view, this paper aims to present how technical services, especially the cataloging department, can play important roles in the improvement of user services. Design/methodology/approach The paper examines the practices of the University of Oklahoma Libraries. Findings The paper identifies several aspects in which technical services can enhance the quality of user services, especially in the cataloging department. A library's online catalog becomes the first point of access to the library's information resources. Its quality can be improved and enriched in many ways to raise users’ satisfaction. Aside from the improvement in technical aspects, efforts should also be made to promote collaboration between technical and public services so as to ensure efficient processing of materials and to meet the needs of library users. Originality/value The value of the paper is in showing that the quality of an online catalog and the cooperation between public and technical services are two of the key factors in achieving high quality of user services.


2016 ◽  
Vol 29 (6) ◽  
pp. 1100-1131 ◽  
Author(s):  
Lee D. Parker ◽  
Deryl Northcott

Purpose – The purpose of this paper is to identify and articulate concepts and approaches to qualitative generalisation that will offer qualitative accounting researchers avenues for enhancing and justifying the general applicability of their research findings and conclusions. Design/methodology/approach – The study and arguments draw from multidisciplinary approaches to this issue. The analysis and theorising are based on published qualitative research literatures from the fields of education, health sciences, sociology, information systems, management and marketing, as well as accounting. Findings – The paper develops two overarching generalisation concepts for application by qualitative accounting researchers. These are built upon a number of qualitative generalisation concepts that have emerged in the multidisciplinary literatures. It also articulates strategies for enhancing the generalisability of qualitative accounting research findings. Research limitations/implications – The paper provides qualitative accounting researchers with understandings, arguments and justifications for the generalisability of their research and the related potential for wider accounting and societal contributions. It also articulates the key factors that affect the quality of research generalisation that qualitative researchers can offer. Originality/value – This paper presents the most comprehensively sourced and developed approach to the concepts, strategies and unique deliverables of qualitative generalising hitherto available in the accounting research literature.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
James R. Barth ◽  
Yanfei Sun ◽  
Shen Zhang

Purpose The exact criteria used by state governors for choosing opportunity zones (OZs) are not publicly available. This paper aims to examine whether state governors selected the most distressed communities, or those with the highest proportions of minorities, as OZs. Design/methodology/approach This paper compares the distressed communities chosen as OZs in states throughout the country to an equal number of eligible distressed communities that were not selected. Moreover, this paper uses regression analysis to determine whether the poverty rate, median family income, population, percentage of the population that is minority and percentage of the population that is African American are significant explanatory factors in the choice of OZs. Findings After describing the tax incentives for investing in OZs, this paper documents that governors did not select many of the most distressed communities, or those with high proportions of minorities, in their individual states. Originality/value This paper describes in some detail the way in which investors may generate tax benefits by investing in eligible property or businesses in OZs. It also examines the extent to which the degree of poverty and the percentage of the population that is minority (and African American) were key factors in the selection of OZs. It raises the issue that the chosen communities are not necessarily those most in need of more investment or those heavily populated by minorities, particularly African Americans.


2019 ◽  
Vol 13 (2) ◽  
pp. 342-362
Author(s):  
Xiaodong Yuan ◽  
Xiaotao Li

Purpose The purpose of this paper is to explore how an organization can combine different types of open innovations and which key factors may influence the combination of different open innovations. Design/methodology/approach The basic methodology of this paper is longitudinal inductive analysis within the conceptual framework of open innovation proposed by Dahlander and Gann (2010). In this case study of Xiaomi Tech Inc., the open innovation combination is investigated by examining 25 new products created between August 2010 and December 2016 in terms of four general types: acquiring, sourcing, selling and revealing open innovation. Findings In practice, the combination of different types of open innovations can be realized. A firm may combine different open innovations at three levels: a single product level, a related product cluster level and a company level. In addition, different open innovations can be combined in diverse modes. The purpose of combining different types of open innovations is to overcome the disadvantages of each type and to exploit the advantages of all the different types. Many factors may affect a firm’s choice of how to combine open innovations. At different development stages, a firm may set and implement a corresponding strategic direction based on its innovation capacity and internal resources. For a given strategy, the firm needs to create profits and manage intellectual property in the implementation of open innovations. These factors interact with each other rather than operating in isolation. Originality/value The findings of this paper are helpful for better understanding how and why an organization can combine different types of open innovations. From a managerial point of view, an organization may combine different types of open innovations to leverage the advantages and avoid the disadvantages of each type of open innovation. An appropriate combination of different open innovations can effectively improve new product development.

