Exploring the Importance of Entities in Semantic Ranking

In recent years, entity-based ranking models have led to exciting breakthroughs in information retrieval research. Compared with traditional retrieval models, entity-based representation enables a better understanding of queries and documents. However, existing entity-based models neglect the importance of entities in a document, and this paper explores the effects of that importance. Specifically, a dataset analysis is conducted, which verifies the correlation between the importance of entities in a document and document ranking. The paper then enhances two entity-based models, a toy model and the Explicit Semantic Ranking model (ESR), by considering the importance of entities. In contrast to the existing models, the enhanced models assign weights to entities according to their importance. Experimental results show that the enhanced toy model and the enhanced ESR outperform their respective baselines by as much as 4.57% and 2.74% on NDCG@20. Further experiments reveal that the strength of the enhanced models is more evident on long queries and on queries where ESR fails, confirming the effectiveness of taking the importance of entities into account.
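
For readers unfamiliar with the NDCG@20 metric reported above, the following is a minimal sketch of how it is commonly computed. It assumes the linear-gain variant (gain equal to the graded relevance label) and assumes that the input list holds the labels of all judged documents for a query, in ranked order; the abstract does not state which NDCG variant the authors used, and the function names are illustrative only.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k graded relevance labels."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances, k=20):
    """DCG of the given ranking divided by the DCG of the ideal ordering.

    Assumption: `relevances` lists the labels of every judged document for
    the query, in the order the system ranked them.
    """
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

A ranking that already lists documents in decreasing relevance scores 1.0, and any ranking that places a relevant document lower scores less.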

When time meets information retrieval: Past proposals, current plans and future trends

Journal of Information Science, 2016, Vol 42 (6), pp. 725-747
DOI: 10.1177/0165551515607277
Cited By ~ 4

Author(s): Bilel Moulahi, Lynda Tamine, Sadok Ben Yahia

Keyword(s): Information Retrieval, Web Search, Specific Information, Time Dimension, Retrieval Models, Document Ranking, Common Information, Retrieval Effectiveness, Ranking Models, Tremendous Amount

With the advent of Web search and the large amount of data published on the Web, a tremendous number of documents have become strongly time-dependent. In this respect, the time dimension has been extensively exploited as a highly important relevance criterion to improve the retrieval effectiveness of document ranking models. As a result, there is compelling research interest in temporal information retrieval, which has given rise to several temporal search applications. In this article, we provide a thorough overview of time-aware information retrieval models. We focus specifically on the use of timeliness and its impact on the global value of relevance as well as on retrieval effectiveness. First, we motivate the importance of temporal signals, when combined with other relevance features, in accounting for document relevance. Then, we review the relevant studies standing at the crossroads of information retrieval and time according to three common information retrieval aspects: the query level, the document content level and the document ranking model level. We organize the related temporal approaches around specific information retrieval tasks and, for each task, we emphasize the importance of results presentation, particularly timelines, to the end user. We also report a set of relevant research trends and avenues that can be explored in the future.

A personalized search using a semantic distance measure in a graph-based ranking model

Journal of Information Science, 2011, Vol 37 (6), pp. 614-636
DOI: 10.1177/0165551511420220
Cited By ~ 10

Author(s): Mariam Daoud, Lynda Tamine, Mohand Boughanem

Keyword(s): Distance Measure, Similarity Measures, User Profile, Semantic Distance, User Interest, Document Ranking, Search Results, Ranking Models, Ranking Model, Extended Graph

The goal of search personalization is to tailor search results to individual users by taking into account their profiles, which include their particular interests and preferences. As these interests are multiple and change over time, personalization becomes effective when the search process takes into account the current user interest. This article presents a search personalization approach that models a semantic user profile and focuses on a personalized document ranking model based on an extended graph-based distance measure. Documents and user profiles are both represented by graphs of concepts drawn from a predefined web ontology, namely the Open Directory Project (ODP) directory. Personalization is then based on reordering the search results of related queries according to a graph-based document ranking model. This model uses a graph-based distance measure combining the minimum common supergraph and the maximum common subgraph between the document and the user profile graphs. We extend this measure in order to take into account a semantic recovery at exact and approximate concept-level matching. Experimental results show the effectiveness of our personalized graph-based ranking model compared with Yahoo and with different personalized ranking models that use classical graph-based measures or vector-space similarity measures.
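
To make the idea of a distance built from the maximum common subgraph and the minimum common supergraph concrete, here is a minimal sketch. It is not the authors' extended measure, whose formula the abstract does not give. It assumes that concept nodes carry unique labels, so that the maximum common subgraph reduces to the intersection of the two graphs and the minimum common supergraph to their union, and it assumes graph size is counted as nodes plus edges; the function names and the `(nodes, edges)` representation are hypothetical.

```python
def graph_size(nodes, edges):
    """Size of a graph, counted here as number of nodes plus number of edges."""
    return len(nodes) + len(edges)


def concept_graph_distance(doc, profile):
    """Distance in [0, 1] between two uniquely labeled concept graphs.

    Each graph is a (nodes, edges) pair: a set of concept labels and a set of
    (label, label) tuples. Under the unique-label assumption, set intersection
    stands in for the maximum common subgraph and set union for the minimum
    common supergraph.
    """
    doc_nodes, doc_edges = doc
    prof_nodes, prof_edges = profile
    common = graph_size(doc_nodes & prof_nodes, doc_edges & prof_edges)
    union = graph_size(doc_nodes | prof_nodes, doc_edges | prof_edges)
    return 1.0 - common / union if union else 0.0
```

Under this sketch, documents whose concept graphs lie closer to the profile graph would be moved up when reordering results; approximate concept matching, which the article adds, is not modeled here.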

Generalized ensemble model for document ranking in information retrieval

Computer Science and Information Systems, 2017, Vol 14 (1), pp. 123-151
DOI: 10.2298/csis160229042w
Cited By ~ 2

Author(s): Yanshan Wang, In-Chan Choi, Hongfang Liu

Keyword(s): Information Retrieval, Hessian Matrix, Document Retrieval, Stochastic Gradient Descent, Data Sets, Ensemble Model, Retrieval Models, Document Ranking, Optimal Linear, Online Setting

A generalized ensemble model (gEnM) for document ranking is proposed in this paper. The gEnM linearly combines document retrieval models and tries to retrieve relevant documents at high positions. In order to obtain the optimal linear combination of multiple document retrieval models, or rankers, an optimization program is formulated that directly maximizes the mean average precision. Both supervised and unsupervised learning algorithms are presented to solve this program. For the supervised scheme, two approaches are considered based on the data setting, namely the batch and online settings. In the batch setting, we propose a revised Newton's algorithm, gEnM.BAT, which approximates the derivative and the Hessian matrix. In the online setting, we advocate a stochastic gradient descent (SGD) based algorithm, gEnM.ON. As for the unsupervised scheme, we present an unsupervised ensemble model (UnsEnM) that iteratively co-learns from each constituent ranker. An experimental study on benchmark data sets verifies the effectiveness of the proposed algorithms. Therefore, with appropriate algorithms, the gEnM is a viable option in diverse practical information retrieval applications.
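
The objective described above, the mean average precision of a linear combination of rankers, can be sketched as follows. This only evaluates the objective for a given weight vector; it does not reproduce gEnM.BAT, gEnM.ON or UnsEnM, whose derivative and Hessian approximations are not described in the abstract. The function names and the data layout (one score list per ranker, documents identified by index) are assumptions made for illustration.

```python
def ensemble_scores(score_lists, weights):
    """Weighted sum of the per-ranker scores for each document.

    `score_lists[r][d]` is the score ranker r gives document d.
    """
    return [
        sum(w * s for w, s in zip(weights, doc_scores))
        for doc_scores in zip(*score_lists)
    ]


def average_precision(scores, relevant):
    """Average precision of the ranking induced by `scores`.

    `relevant` is the set of indices of the relevant documents.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits, total = 0, 0.0
    for rank, doc in enumerate(order, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(queries, weights):
    """Mean of the per-query average precision for one weight vector.

    `queries` is a list of (score_lists, relevant) pairs.
    """
    aps = [
        average_precision(ensemble_scores(score_lists, weights), relevant)
        for score_lists, relevant in queries
    ]
    return sum(aps) / len(aps) if aps else 0.0
```

Because average precision depends on the weights only through the induced ordering, it is piecewise constant in them, which is presumably why the paper resorts to approximating the derivative and the Hessian.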

A Survey on Information Retrieval Models, Techniques and Applications

International Journal of Advanced Research in Computer Science and Software Engineering, 2017, Vol 7 (7), pp. 16
DOI: 10.23956/ijarcsse.v7i7.90
Cited By ~ 1

Author(s): Ndengabaganizi Tonny James, Rajkumar Kannan

Keyword(s): Information Retrieval, Retrieval Models, Knowledge Based, Long Time

People have long realized the importance of archiving and finding information. With the advent of computers, it became possible to store large amounts of information, and finding useful information in such collections became a necessity. Over the last forty years, Information Retrieval (IR) has matured considerably, and several IR systems are used on an everyday basis by a wide variety of users. IR is generally concerned with searching for and retrieving knowledge-based information from databases. In this paper, we discuss the various models and techniques for information retrieval. We also provide an overview of traditional IR models.
