On the limitations of document ranking algorithms in information retrieval

In recent years, entity-based ranking models have led to notable advances in information retrieval research. Compared with traditional retrieval models, entity-based representations enable a better understanding of queries and documents. However, existing entity-based models neglect how important each entity is within a document. This paper explores the effect of that importance. Specifically, a dataset analysis is conducted, which verifies the correlation between the importance of entities in a document and document ranking. The paper then enhances two entity-based models, a toy model and the Explicit Semantic Ranking (ESR) model, by taking the importance of entities into account. In contrast to the existing models, the enhanced models weight entities according to their importance. Experimental results show that the enhanced toy model and ESR outperform their respective baselines by as much as 4.57% and 2.74% on NDCG@20. Further experiments reveal that the advantage of the enhanced models is more evident on long queries and on queries where ESR fails, confirming the effectiveness of taking the importance of entities into account.
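For readers unfamiliar with the metric quoted in this abstract, NDCG@20 can be sketched as follows. This is a minimal illustration and not the paper's evaluation code: the function names `dcg_at_k` and `ndcg_at_k` are hypothetical, the gain is assumed to be linear in the graded relevance label (some evaluations use 2^rel - 1 instead), and the ideal ranking is assumed to be a reordering of the same judged list.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k graded relevance labels."""
    # Rank 1 gets discount log2(2) = 1, rank 2 gets log2(3), and so on.
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances, k=20):
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    # Assumption: the ideal ranking is the same labels sorted in descending order.
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

A ranking that already lists documents in descending order of relevance scores 1.0, and any other ordering of the same labels scores lower.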

When time meets information retrieval: Past proposals, current plans and future trends

Journal of Information Science ◽ 10.1177/0165551515607277 ◽ 2016 ◽ Vol 42 (6) ◽ pp. 725-747 ◽ Cited By ~ 4

Author(s): Bilel Moulahi ◽ Lynda Tamine ◽ Sadok Ben Yahia

Keyword(s): Information Retrieval ◽ Web Search ◽ Specific Information ◽ Time Dimension ◽ Retrieval Models ◽ Document Ranking ◽ Common Information ◽ Retrieval Effectiveness ◽ Ranking Models ◽ Tremendous Amount

With the advent of Web search and the large amount of data published on the Web, a tremendous number of documents have become strongly time-dependent. Accordingly, the time dimension has been extensively exploited as an important relevance criterion to improve the retrieval effectiveness of document ranking models. This has driven growing research interest in temporal information retrieval, which has given rise to several temporal search applications. In this article, we provide a critical overview of time-aware information retrieval models. We focus specifically on the use of timeliness and its impact on the overall value of relevance as well as on retrieval effectiveness. First, we motivate the importance of temporal signals, when combined with other relevance features, in accounting for document relevance. Then, we review the relevant studies at the crossroads of information retrieval and time according to three common information retrieval aspects: the query level, the document content level and the document ranking model level. We organize the related temporal approaches around specific information retrieval tasks and, for each task, emphasize the importance of results presentation, particularly timelines, to the end user. We also report a set of relevant research trends and avenues that can be explored in the future.

Fast document ranking for large scale information retrieval

Lecture Notes in Computer Science - Applications of Databases ◽ 10.1007/3-540-58183-9_53 ◽ 1994 ◽ pp. 253-266 ◽ Cited By ~ 1

Author(s): Michael Persin ◽ Justin Zobel ◽ Ron Sacks-Davis

Keyword(s): Information Retrieval ◽ Large Scale ◽ Document Ranking

Analysis of Vector Space Method in Information Retrieval for Smart Answering System

Journal of Computational and Theoretical Nanoscience ◽ 10.1166/jctn.2020.9099 ◽ 2020 ◽ Vol 17 (9) ◽ pp. 4468-4472

Author(s): Deepa Yogish ◽ T. N. Manjunath ◽ Ravindra S. Hegadi

Keyword(s): Information Retrieval ◽ Vector Space ◽ Vector Space Model ◽ Query Term ◽ Frequency Method ◽ Document Ranking ◽ User Intent ◽ Space Model ◽ Relevant Document ◽ User Query

On the internet, search plays a vital role in retrieving relevant answers to user-specific queries. One of the most promising applications of natural language processing and information retrieval is the question answering system, which directly provides an accurate answer instead of a set of documents. The main objective of information retrieval is to retrieve relevant documents from the huge volume of data on the internet using an appropriate model. Many models have been proposed for the retrieval process, such as the Boolean, vector space and probabilistic models. The vector space model (VSM) is regarded here as the best method for document ranking in information retrieval, with an efficient document representation that combines simplicity and clarity. VSM uses a similarity function to measure the match between documents and user intent, and orders documents by score from largest to smallest. Documents and queries are weighted using the term frequency and inverse document frequency (TF-IDF) method. To retrieve the documents most relevant to the user query, a cosine similarity score is computed between every document and the query. Documents with higher similarity scores are considered more relevant to the query and are ranked by these scores. This paper surveys different information retrieval techniques and argues that the vector space model offers a realistic compromise in IR processing: it supports a weighting scheme that ranks the set of documents in order of relevance to the user query.
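The ranking procedure this abstract describes (TF-IDF weighting followed by cosine similarity) can be sketched roughly as below. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names `tf_idf_vectors`, `cosine` and `rank` are hypothetical, tokenization is a plain lowercase whitespace split, and the IDF variant `log(N / df) + 1` is one common choice among several.

```python
import math
from collections import Counter


def tf_idf_vectors(tokenized_docs):
    """Return one sparse TF-IDF vector (dict) per document, plus the IDF table."""
    n_docs = len(tokenized_docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in tokenized_docs for term in set(doc))
    # Assumed IDF variant; the +1 keeps terms that occur in every document.
    idf = {term: math.log(n_docs / count) + 1.0 for term, count in df.items()}
    vectors = [
        {term: tf * idf[term] for term, tf in Counter(doc).items()}
        for doc in tokenized_docs
    ]
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return (score, doc_index) pairs sorted from most to least similar."""
    vectors, idf = tf_idf_vectors([d.lower().split() for d in docs])
    # Query terms that never occur in the collection are ignored.
    q_counts = Counter(t for t in query.lower().split() if t in idf)
    q_vec = {term: tf * idf[term] for term, tf in q_counts.items()}
    scores = [(cosine(q_vec, vec), i) for i, vec in enumerate(vectors)]
    return sorted(scores, key=lambda pair: (-pair[0], pair[1]))
```

Calling `rank("vector space", docs)` on a small list of strings returns the documents that share terms with the query first, with documents sharing no terms scoring 0.0.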
