When time meets information retrieval: Past proposals, current plans and future trends

2016 ◽  
Vol 42 (6) ◽  
pp. 725-747 ◽  
Author(s):  
Bilel Moulahi ◽  
Lynda Tamine ◽  
Sadok Ben Yahia

With the advent of Web search and the large amount of data published on the Web, a tremendous number of documents have become strongly time-dependent. Accordingly, the time dimension has been extensively exploited as an important relevance criterion for improving the retrieval effectiveness of document ranking models. Research interest in temporal information retrieval has therefore grown, giving rise to several temporal search applications. In this article, we provide a critical overview of time-aware information retrieval models. We focus specifically on the use of timeliness and its impact on overall relevance as well as on retrieval effectiveness. First, we motivate the importance of temporal signals, when combined with other relevance features, in accounting for document relevance. Then, we review the relevant studies at the crossroads of information retrieval and time according to three common information retrieval aspects: the query level, the document content level and the document ranking model level. We organize the temporal approaches around specific information retrieval tasks and, for each task, emphasize the importance of results presentation, particularly timelines, to the end user. We also report a set of research trends and avenues that can be explored in the future.

Information ◽  
2019 ◽  
Vol 10 (2) ◽  
pp. 39
Author(s):  
Zhenyang Li ◽  
Guangluan Xu ◽  
Xiao Liang ◽  
Feng Li ◽  
Lei Wang ◽  
...  

In recent years, entity-based ranking models have led to exciting breakthroughs in information retrieval research. Compared with traditional retrieval models, entity-based representation enables a better understanding of queries and documents. However, the existing entity-based models neglect the importance of entities in a document. This paper explores the effects of the importance of entities in a document. Specifically, a dataset analysis is conducted that verifies the correlation between the importance of entities in a document and document ranking. Then, this paper enhances two entity-based models, a toy model and the Explicit Semantic Ranking model (ESR), by considering the importance of entities. In contrast to the existing models, the enhanced models assign entity weights according to their importance. Experimental results show that the enhanced toy model and ESR can outperform the two baselines by as much as 4.57% and 2.74% on NDCG@20 respectively. Further experiments reveal that the strength of the enhanced models is more evident on long queries and on the queries where ESR fails, confirming the effectiveness of taking the importance of entities into account.
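The results above are reported on NDCG@20. As a reminder of what that metric measures, here is a minimal sketch of NDCG@k. It assumes the common exponential-gain, log2-discount formulation; the abstract does not say which variant the authors used, and the function names are illustrative only.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k graded relevance labels.

    Assumption: gain is 2**rel - 1 and the discount is log2(rank + 1),
    with ranks starting at 1 (one common variant among several).
    """
    return sum(
        (2 ** rel - 1) / math.log2(position + 2)  # position is 0-based
        for position, rel in enumerate(relevances[:k])
    )


def ndcg_at_k(relevances, k=20):
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Graded labels of the documents in the order a ranker returned them.
perfect = ndcg_at_k([3, 2, 1, 0])   # already in ideal order
swapped = ndcg_at_k([0, 1], k=2)    # the relevant document is ranked second
```

A ranking that is already in ideal order scores 1.0, and pushing a relevant document further down lowers the score.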


1998 ◽  
Vol 13 (3) ◽  
pp. 263-295 ◽  
Author(s):  
MOUNIA LALMAS ◽  
PETER D. BRUZA

Information retrieval is the science concerned with the efficient and effective storage of information for later retrieval and use by interested parties. During the last forty years, a plethora of information retrieval models and their variations have emerged. Logic-based models were launched to provide a rich and uniform representation of information and its semantics with the aim of improving information retrieval effectiveness. This approach was first advanced in 1986 by Van Rijsbergen with the so-called logical uncertainty principle. Since then, various logic-based models have been developed. This paper presents an introduction to and a survey of the use of logic for information retrieval modelling.


Author(s):  
Nikolaos Korfiatis ◽  
Miguel-Ángel Sicilia ◽  
Claudia Hess ◽  
Klaus Stein ◽  
Christoph Schlieder

This chapter discusses the integration of information from two sources, a social network and a document reference network, to enhance reference-based search engine rankings. In particular, current models of information retrieval are blind to the social context that surrounds information resources and thus do not consider the trustworthiness of authors when presenting query results to users. Following this point, we elaborate on the basic intuitions that highlight the contribution of the social context (as can be mined from social network positions, for instance) to the improvement of the rankings provided by reference-based search engines. A review of ranking models in web search engine retrieval, along with social network metrics of importance such as prestige and centrality, is provided as background. Recent research models that utilize both contexts are then presented, along with a case study on the internet-based encyclopedia Wikipedia based on the social network metrics.


2017 ◽  
Vol 14 (1) ◽  
pp. 123-151 ◽  
Author(s):  
Yanshan Wang ◽  
In-Chan Choi ◽  
Hongfang Liu

A generalized ensemble model (gEnM) for document ranking is proposed in this paper. The gEnM linearly combines document retrieval models and tries to retrieve relevant documents at high positions. To obtain the optimal linear combination of multiple document retrieval models, or rankers, an optimization program is formulated that directly maximizes the mean average precision. Both supervised and unsupervised learning algorithms are presented to solve this program. For the supervised scheme, two approaches are considered based on the data setting, namely the batch and online settings. In the batch setting, we propose a revised Newton's algorithm, gEnM.BAT, which approximates the derivative and Hessian matrix. In the online setting, we advocate a stochastic gradient descent (SGD) based algorithm, gEnM.ON. As for the unsupervised scheme, an unsupervised ensemble model (UnsEnM) that iteratively co-learns from each constituent ranker is presented. An experimental study on benchmark data sets verifies the effectiveness of the proposed algorithms. Therefore, with appropriate algorithms, the gEnM is a viable option in diverse practical information retrieval applications.
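To make the objective concrete, the following is a minimal sketch of a linear combination of rankers evaluated by average precision for a single query. It illustrates only the quantity being maximized; the paper's optimizers (gEnM.BAT, gEnM.ON, UnsEnM) are not reproduced, and all function names, scores and weights here are hypothetical.

```python
def combine_scores(ranker_scores, weights):
    """Weighted sum of per-ranker scores.

    ranker_scores: list of dicts mapping doc_id -> score, one dict per ranker.
    Assumption: a document missing from a ranker's output scores 0.0 there.
    """
    docs = set().union(*ranker_scores)
    return {
        doc: sum(w * scores.get(doc, 0.0) for w, scores in zip(weights, ranker_scores))
        for doc in docs
    }


def rank(scores):
    """Doc ids sorted by descending score; ties are broken by doc id."""
    return sorted(scores, key=lambda doc: (-scores[doc], doc))


def average_precision(ranked_docs, relevant):
    """Mean of precision@rank taken at each rank holding a relevant document."""
    hits, total = 0, 0.0
    for position, doc in enumerate(ranked_docs, start=1):
        if doc in relevant:
            hits += 1
            total += hits / position
    return total / len(relevant) if relevant else 0.0


# Two hypothetical rankers scoring the same three documents.
ranker_a = {"d1": 0.9, "d2": 0.1, "d3": 0.5}
ranker_b = {"d1": 0.2, "d2": 0.8, "d3": 0.4}
relevant = {"d2", "d3"}

ap_only_a = average_precision(rank(combine_scores([ranker_a, ranker_b], [1.0, 0.0])), relevant)
ap_only_b = average_precision(rank(combine_scores([ranker_a, ranker_b], [0.0, 1.0])), relevant)
```

Averaging this value over a set of queries gives the mean average precision; searching for the weight vector that maximizes it is the step the paper's algorithms address.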


Author(s):  
Ndengabaganizi Tonny James ◽  
Rajkumar Kannan

People have long recognized the importance of archiving and finding information. With the advent of computers, it became possible to store large amounts of information, and finding useful information in such collections became a necessity. Over the last forty years, Information Retrieval (IR) has matured considerably. Several IR systems are used on an everyday basis by a wide variety of users. IR is generally concerned with searching and retrieving knowledge-based information from databases. In this paper, we discuss the various models and techniques for information retrieval and provide an overview of traditional IR models.


2021 ◽  
Vol 55 (1) ◽  
pp. 1-2
Author(s):  
Bhaskar Mitra

Neural networks with deep architectures have demonstrated significant performance improvements in computer vision, speech recognition, and natural language processing. The challenges in information retrieval (IR), however, are different from these other application areas. A common form of IR involves ranking documents, or short passages, in response to keyword-based queries. Effective IR systems must deal with the query-document vocabulary mismatch problem by modeling relationships between different query and document terms and how they indicate relevance. Models should also consider lexical matches when the query contains rare terms not seen during training, such as a person's name or a product model number, and should avoid retrieving semantically related but irrelevant results. In many real-life IR tasks, retrieval involves extremely large collections, such as the document index of a commercial Web search engine, containing billions of documents. Efficient IR methods should take advantage of specialized IR data structures, such as the inverted index, to retrieve efficiently from large collections. Given an information need, the IR system also mediates how much exposure an information artifact receives by deciding whether it should be displayed, and where it should be positioned, among other results. Exposure-aware IR systems may optimize for additional objectives besides relevance, such as parity of exposure for retrieved items and content publishers. In this thesis, we present novel neural architectures and methods motivated by the specific needs and challenges of IR tasks. We ground our contributions with a detailed survey of the growing body of neural IR literature [Mitra and Craswell, 2018].
Our key contribution towards improving the effectiveness of deep ranking models is developing the Duet principle [Mitra et al., 2017] which emphasizes the importance of incorporating evidence based on both patterns of exact term matches and similarities between learned latent representations of query and document. To efficiently retrieve from large collections, we develop a framework to incorporate query term independence [Mitra et al., 2019] into any arbitrary deep model that enables large-scale precomputation and the use of inverted index for fast retrieval. In the context of stochastic ranking, we further develop optimization strategies for exposure-based objectives [Diaz et al., 2020]. Finally, this dissertation also summarizes our contributions towards benchmarking neural IR models in the presence of large training datasets [Craswell et al., 2019] and explores the application of neural methods to other IR tasks, such as query auto-completion.
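The query term independence idea can be illustrated with a small sketch: if a document's score is assumed to be a sum of per-term contributions, those contributions can be computed offline and stored in an inverted index, so that query time reduces to summing postings. The per-term scores below are made-up numbers standing in for the output of some scoring model; the function names are hypothetical and this is not the thesis's implementation.

```python
from collections import defaultdict


def build_index(term_doc_scores):
    """Group precomputed (term, doc_id) -> score entries into posting lists."""
    index = defaultdict(list)
    for (term, doc_id), score in term_doc_scores.items():
        index[term].append((doc_id, score))
    return index


def retrieve(index, query_terms, top_k=10):
    """Score documents as the sum of their per-term scores for the query terms.

    Only the posting lists of the query terms are touched, which is what
    makes retrieval from a large collection feasible.
    """
    totals = defaultdict(float)
    for term in query_terms:
        for doc_id, score in index.get(term, []):
            totals[doc_id] += score
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))[:top_k]


# Hypothetical precomputed term-document scores.
precomputed = {
    ("neural", "d1"): 1.5,
    ("ranking", "d1"): 0.5,
    ("ranking", "d2"): 1.0,
}
index = build_index(precomputed)
results = retrieve(index, ["neural", "ranking"])
```

The design trade-off is that the model can no longer score interactions between query terms, in exchange for moving the expensive computation offline.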

