PSWG: An automatic stop-word list generator for Persian information retrieval systems based on similarity function & POS information

Author(s):  
Mohammad-Ali Yaghoub-Zadeh-Fard ◽  
Behrouz Minaei-Bidgoli ◽  
Saeed Rahmani ◽  
Saeed Shahrivari
Author(s):  
Veronica dos Santos ◽  
Sérgio Lifschitz

Information retrieval systems usually employ syntactic search techniques that match a set of keywords against the indexed content to retrieve results. Pure keyword-based matching, however, fails to capture the user's search intention and context, and it suffers from natural language ambiguity and vocabulary mismatch. Given this scenario, the hypothesis is that using embeddings in a semantic search approach will make search results more meaningful. Embeddings help minimize problems arising from terminology and context mismatch. This work proposes a semantic similarity function to support semantic search based on hyper-relational knowledge graphs. The function uses embeddings to find the most similar nodes that satisfy a user query.
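The abstract does not specify the similarity function itself, so the following is only a minimal sketch of the general idea: rank graph nodes by cosine similarity between a query embedding and node embeddings. The names `cosine` and `most_similar_nodes`, and the toy three-dimensional vectors, are assumptions made for illustration and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar_nodes(query_vec, node_embeddings, k=2):
    """Return the k (node, score) pairs with the highest cosine similarity to the query."""
    scored = [(node, cosine(query_vec, vec)) for node, vec in node_embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical, hand-made embeddings; a real system would learn these from the graph.
node_embeddings = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.85, 0.15, 0.05],
    "banana": [0.0, 0.2, 0.95],
}
query_vec = [1.0, 0.0, 0.0]
top = most_similar_nodes(query_vec, node_embeddings, k=2)
```

In this toy setting the query lands near "car" and "automobile" even though the two node labels share no keyword, which is the kind of vocabulary mismatch the abstract says embeddings are meant to reduce.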


1967 ◽  
Vol 06 (02) ◽  
pp. 45-51 ◽  
Author(s):  
A. Kent ◽  
J. Belzer ◽  
M. Kuhfeerst ◽  
E. D. Dym ◽  
D. L. Shirey ◽  
...  

An experiment is described which attempts to derive quantitative indicators regarding the potential relevance predictability of the intermediate stimuli used to represent documents in information retrieval systems. In effect, since the decision to peruse an entire document is often predicated upon the examination of one »level of processing« of the document (e.g., the citation and/or abstract), it became interesting to analyze the properties of what constitutes »relevance«. However, prior to such an analysis, an even more elementary step had to be made, namely, to determine what portions of a document should be examined.

An evaluation of the ability of intermediate response products (IRPs), functioning as cues to the information content of full documents, to predict the relevance determination that would be subsequently made on these documents by motivated users of information retrieval systems, was made under controlled experimental conditions. The hypothesis that there might be other intermediate response products (selected extracts from the document, i.e., first paragraph, last paragraph, and the combination of first and last paragraph), that would be as representative of the full document as the traditional IRPs (citation and abstract) was tested systematically. The results showed that:

1. there is no significant difference among the several IRP treatment groups on the number of cue evaluations of relevancy which match the subsequent user relevancy decision on the document;
2. first and last paragraph combinations have consistently predicted relevancy to a higher degree than the other IRPs;
3. abstracts were undistinguished as predictors; and
4. the apparent high predictability rating for citations was not substantive.

Some of these results are quite different than would be expected from previous work with unmotivated subjects.


2005 ◽  
Vol 14 (5) ◽  
pp. 335-346
Author(s):  
Carlos Benito Amat

Libri ◽  
2020 ◽  
Vol 70 (3) ◽  
pp. 227-237
Author(s):  
Mahdi Zeynali-Tazehkandi ◽  
Mohsen Nowkarizi

Evaluation of information retrieval systems is a fundamental topic in Library and Information Science. The aim of this paper is to connect the system-oriented and the user-oriented approaches to relevant philosophical schools. By reviewing the related literature, it was found that the evaluation of information retrieval systems is successful if it benefits from both system-oriented and user-oriented approaches (composite). The system-oriented approach is rooted in Parmenides’ philosophy of stability (immovable) which Plato accepts and attributes to the world of forms; the user-oriented approach is rooted in Heraclitus’ flux philosophy (motion) which Plato defers and attributes to the tangible world. Thus, using Plato’s theory is a comprehensive approach for recognizing the concept of relevance. The theoretical and philosophical foundations determine the type of research methods and techniques. Therefore, Plato’s dialectical method is an appropriate composite method for evaluating information retrieval systems.


1985 ◽  
Vol 8 (2) ◽  
pp. 253-267
Author(s):  
S.K.M. Wong ◽  
Wojciech Ziarko

In information retrieval, it is common to model index terms and documents as vectors in a suitably defined vector space. The main difficulty with this approach is that the explicit representation of term vectors is not known a priori. For this reason, the vector space model adopted by Salton for the SMART system treats the terms as a set of orthogonal vectors. In such a model it is often necessary to adopt a separate, corrective procedure to take into account the correlations between terms. In this paper, we propose a systematic method (the generalized vector space model) to compute term correlations directly from the automatic indexing scheme. We also demonstrate how such correlations can be included with minimal modification in existing vector-based information retrieval systems.
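To make the contrast between orthogonal and correlated terms concrete, here is a simplified sketch. It does not reproduce the construction used in the paper: as an assumption for illustration, term correlations are estimated as the cosine between columns of a small document-term matrix, and the similarity of a document d and a query q is computed as d^T G q, where G is the term correlation matrix. The function names and the toy matrix are hypothetical.

```python
import math


def term_correlations(doc_term):
    """Estimate a term-term correlation matrix as the cosine between term columns.

    doc_term is a list of rows (documents), each a list of term weights.
    This co-occurrence estimate is an illustrative assumption, not the paper's method.
    """
    n_terms = len(doc_term[0])
    cols = [[row[j] for row in doc_term] for j in range(n_terms)]

    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv) if nu and nv else 0.0

    return [[cos(cols[i], cols[j]) for j in range(n_terms)] for i in range(n_terms)]


def similarity(d, q, G):
    """Bilinear similarity d^T G q; with G = identity this is the plain dot product."""
    return sum(d[i] * G[i][j] * q[j] for i in range(len(d)) for j in range(len(q)))


# Toy corpus over the terms ["car", "automobile", "banana"] (hypothetical weights).
doc_term = [
    [2, 1, 0],
    [1, 1, 0],
    [0, 0, 3],
]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # orthogonal-terms assumption
G = term_correlations(doc_term)

d = [1, 0, 0]  # document mentioning only "car"
q = [0, 1, 0]  # query mentioning only "automobile"
orthogonal_score = similarity(d, q, identity)
correlated_score = similarity(d, q, G)
```

Under the orthogonality assumption the document and query share no term and score zero; with the estimated correlations the score becomes positive because "car" and "automobile" co-occur in the corpus. Only the matrix G changes between the two calls, which mirrors the abstract's point that correlations can be added with little modification to a vector-based system.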

