A Semantic Framework for Evaluating Topical Search Methods

2011 ◽  
Vol 14 (1) ◽  
Author(s):  
Rocío L. Cecchini ◽  
Carlos M. Lorenzetti ◽  
Ana G. Maguitman ◽  
Filippo Menczer

The absence of reliable and efficient techniques to evaluate information retrieval systems has become a bottleneck in the development of novel retrieval methods. In traditional approaches, users or hired evaluators provide manual assessments of relevance. However, these approaches are neither efficient nor reliable, since they do not scale with the complexity and heterogeneity of available digital information. Automatic approaches, on the other hand, could be efficient but disregard semantic data, which is usually important for assessing the actual performance of the evaluated methods. This article proposes using topic ontologies and semantic similarity data derived from these ontologies to implement an automatic semantic evaluation framework for information retrieval systems. The use of semantic similarity data makes it possible to capture the notion of partial relevance, generalizing traditional evaluation metrics and giving rise to novel performance measures such as semantic precision and semantic harmonic mean. The validity of the approach is supported by user studies, and the application of the proposed framework is illustrated with the evaluation of topical retrieval systems. The evaluated systems include a baseline, a supervised version of the Bo1 query refinement method, and two multi-objective evolutionary algorithms for context-based retrieval. Finally, we discuss the advantages of applying evaluation metrics that account for semantic similarity data and partial relevance over existing metrics based on the notion of total relevance.
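The abstract names semantic precision and semantic harmonic mean but does not give their formulas. The sketch below shows one plausible reading, assuming that each document carries a similarity score in [0, 1] with respect to the target topic and that partial relevance is captured by summing those scores instead of counting binary hits. The function names, the `similarity_to_topic` mapping and the toy scores are all hypothetical and are not taken from the article.

```python
def semantic_precision(retrieved, similarity_to_topic):
    """Mean similarity of the retrieved documents to the target topic.

    Assumption: documents missing from the mapping have similarity 0.
    """
    docs = set(retrieved)
    if not docs:
        return 0.0
    return sum(similarity_to_topic.get(d, 0.0) for d in docs) / len(docs)


def semantic_recall(retrieved, similarity_to_topic):
    """Share of the total available similarity mass that was retrieved."""
    total = sum(similarity_to_topic.values())
    if total == 0:
        return 0.0
    docs = set(retrieved)
    return sum(similarity_to_topic.get(d, 0.0) for d in docs) / total


def harmonic_mean(p, r):
    """Harmonic mean of two scores; 0 when both are 0."""
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


# Hypothetical similarity scores of four documents to one topic.
similarity_to_topic = {"a": 1.0, "b": 0.5, "c": 0.0, "d": 0.5}
retrieved = ["a", "b"]

sp = semantic_precision(retrieved, similarity_to_topic)
sr = semantic_recall(retrieved, similarity_to_topic)
print(sp, sr, harmonic_mean(sp, sr))
```

Under these assumptions, restricting the similarity scores to 0 and 1 reduces both measures to classical precision and recall, which is one way to read the claim that the semantic measures generalize the traditional ones.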

2018 ◽  
Vol 36 (1) ◽  
pp. 55-70 ◽  
Author(s):  
Sanjeev K. Sunny ◽  
Mallikarjun Angadi

Purpose: The purpose of this study is to carry out a systematic literature review for an evidence-based assessment of the effectiveness of thesauri in digital information retrieval systems. It also aims to identify the evaluation methods, evaluation measures and data collection tools that may be used in evaluating digital information retrieval systems. Design/methodology/approach: A systematic literature review (SLR) of 344 publications from LISA and 238 from Scopus was carried out to identify evaluation studies for analysis, and 15 evaluation studies were analyzed. Findings: This study presents evidence for the effectiveness of thesauri in digital information retrieval systems. Various methods for evaluating digital information systems have been identified, along with a wide range of evaluation measures and data collection tools. Research limitations/implications: The study was limited to literature published in English and indexed in LISA and Scopus. The evaluation methods, evaluation measures and data collection tools identified in this study may be used to design more cognizant evaluation studies for digital information retrieval systems. Practical implications: The findings have significant implications for the administrators of any type of digital information retrieval system in making more informed decisions about implementing thesauri in resource description and access to digital collections. Originality/value: This study extends our knowledge of the potential of thesauri in digital information retrieval systems. It also provides cues for designing more cognizant evaluation studies for digital information systems.


2022 ◽  
Vol 59 (1) ◽  
pp. 102747
Author(s):  
Peng Zhang ◽  
Hui Gao ◽  
Zeting Hu ◽  
Meng Yang ◽  
Dawei Song ◽  
...  

Author(s):  
A. V. Kulikova

The author continues the study initially presented in the article “The possibilities of information search in electronic platforms of Russian libraries” (A. V. Kulikova. The possibilities of information search in electronic platforms of Russian libraries // The Journal of Encyclopaedic Studies. – 2019. – No 2. – P. 30–52). She demonstrates methods for business information search related to local encyclopaedic book publications and identifies principles for finding recent publications promptly and satisfying user demands most effectively. The success of a bibliographic search depends on how well the user understands the system. Optimal query formulation saves time and excludes information noise. The key characteristics of library digital information retrieval systems are discussed. The computer systems of 113 regional libraries were analyzed within the study. The following automated library information systems (ALIS) were tested objectively: IRBIS, RUSLAN, OPAC-Global, Foliant, MacWeb. The author does not intend to advertise or disparage any ALIS. Her main goal is to identify convenient and fast retrieval methods that use existing functionality.


2021 ◽  
Vol 28 (1) ◽  
pp. 37-48
Author(s):  
Thoriq Tri Prabowo

Today's digital library is a necessity. A system that provides all-digital information and services requires that every aspect of it be accessible effectively. In the context of information retrieval in digital libraries, information retrieval systems are important instruments. The system becomes a link between relevant information and its users. Evaluating the information retrieval system to determine its effectiveness is important to ensure that users receive good retrieval services. Recall and precision are widely used measures of the effectiveness of information retrieval systems. This study aims to determine the effectiveness of the ISI Yogyakarta digital library retrieval system based on recall and precision. It will help librarians understand the effectiveness of the information retrieval system and the accuracy of their indexing. This research uses an experimental method with a quantitative approach. The researcher purposively chose a sample of search keywords and then tested them by searching on the portal http://digilib.isi.ac.id/. The data obtained were analyzed using the recall and precision formulas. The subject tested in this study was interior design. The precision measurement for 10 keywords on the subject of interior design yielded 92.37%, while the recall measurement yielded 80.79%. Precision is thus higher than recall, and the results indicate that the information retrieval system of ISI Yogyakarta’s digital library is quite effective.
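The recall and precision measures used in this kind of study can be sketched as follows. This is a minimal illustration of the standard set-based definitions (precision = relevant retrieved / retrieved, recall = relevant retrieved / relevant), averaged over queries. The keywords and document identifiers are hypothetical placeholders, not data from the study.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query using set-based definitions."""
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def average_precision_recall(runs):
    """Average precision and recall over several queries.

    `runs` maps each query to a (retrieved, relevant) pair.
    """
    if not runs:
        return 0.0, 0.0
    scores = [precision_recall(ret, rel) for ret, rel in runs.values()]
    mean_p = sum(p for p, _ in scores) / len(scores)
    mean_r = sum(r for _, r in scores) / len(scores)
    return mean_p, mean_r


# Hypothetical judgments for two search keywords.
runs = {
    "keyword-1": (["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"]),
    "keyword-2": (["d6", "d7"], ["d6", "d7"]),
}
print(average_precision_recall(runs))
```

Averaging per-query scores, as done here, is only one convention; pooling hits across all queries before dividing gives different numbers, and the abstract does not say which was used.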


Author(s):  
Ana Gabriela Maguitman ◽  
Carlos M. Lorenzetti ◽  
Rocío L. Cecchini

Performance evaluation plays a crucial role in the development and improvement of search systems in general and context-based systems in particular. In order to evaluate search systems, test collections are needed. These test collections typically involve a corpus of documents, a set of queries and a series of relevance assessments. In traditional approaches, users or hired evaluators provide manual assessments of relevance. However, this is difficult and expensive, and does not scale with the complexity and heterogeneity of available digital information. This chapter proposes a semantic evaluation framework that takes advantage of topic ontologies and semantic similarity data derived from these ontologies. The structure and content of the Open Directory Project topic ontology are used to derive semantic relations among a massive number of topics and to implement classical and ad hoc retrieval performance evaluation metrics. In addition, this chapter describes an incremental method for context-based retrieval, which is based on the notions of topic descriptors and topic discriminators. The incremental context-based retrieval method is used to illustrate the application of the proposed semantic evaluation framework. Finally, the chapter discusses the advantages of applying the proposed framework.


1967 ◽  
Vol 06 (02) ◽  
pp. 45-51 ◽  
Author(s):  
A. Kent ◽  
J. Belzer ◽  
M. Kuhfeerst ◽  
E. D. Dym ◽  
D. L. Shirey ◽  
...  

An experiment is described which attempts to derive quantitative indicators regarding the potential relevance predictability of the intermediate stimuli used to represent documents in information retrieval systems. In effect, since the decision to peruse an entire document is often predicated upon the examination of one "level of processing" of the document (e.g., the citation and/or abstract), it became interesting to analyze the properties of what constitutes "relevance". However, prior to such an analysis, an even more elementary step had to be made, namely, to determine what portions of a document should be examined.

An evaluation of the ability of intermediate response products (IRPs), functioning as cues to the information content of full documents, to predict the relevance determination that would be subsequently made on these documents by motivated users of information retrieval systems, was made under controlled experimental conditions. The hypothesis that there might be other intermediate response products (selected extracts from the document, i.e., first paragraph, last paragraph, and the combination of first and last paragraph), that would be as representative of the full document as the traditional IRPs (citation and abstract) was tested systematically. The results showed that:

1. there is no significant difference among the several IRP treatment groups on the number of cue evaluations of relevancy which match the subsequent user relevancy decision on the document;
2. first and last paragraph combinations have consistently predicted relevancy to a higher degree than the other IRPs;
3. abstracts were undistinguished as predictors; and
4. the apparent high predictability rating for citations was not substantive.

Some of these results are quite different than would be expected from previous work with unmotivated subjects.

