Result merging methods in distributed information retrieval with overlapping databases

2007, Vol 10 (3), pp. 297-319
Author(s): Shengli Wu, Sally McClean
Author(s): Benjamin Ghansah, Sheng Li Wu, Nathaniel Ekow Ghansah

When the top-ranked documents from several information sources are merged into a single ranked list, they may all cover the same piece of relevant information and so fail to satisfy different user needs. Result diversification (RD) addresses this problem by diversifying results to cover more information needs. RD has recently attracted much attention as a means of increasing user satisfaction in general-purpose search engines, and many approaches to the diversification problem have been proposed in the literature. However, no concrete study of search result diversification has been done in a Distributed Information Retrieval (DIR) setting. In this paper, we survey and classify existing approaches and propose a theoretical framework that aims at improving diversification at the result merging phase of a DIR environment.


Author(s): Benjamin Ghansah, Sheng Li Wu

Unlike centralized search, where Web sites are crawled and indexed, Distributed Information Retrieval (DIR), also known as Federated Search, is a powerful way to search multiple databases comprehensively, simultaneously, and in real time. DIR has several advantages over centralized search environments, notably: (1) the diversity of resources that are made available; (2) improved scalability and reduced server load and network traffic; and (3) access to the hidden or deep Web. A DIR system has three major phases or tasks: (i) resource description or collection representation, (ii) resource selection, and (iii) result merging. This paper provides a comprehensive review of these phases and of some current strategies recommended for improving the implementation of a DIR system.
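
The three phases can be illustrated with a small sketch. It is not a method from any of the papers listed here. The collections, the function names (`describe`, `select`, `search`, `merge`, `federated_search`), and the term-overlap scoring are all assumptions chosen for brevity. The merging step sums per-resource normalized scores, in the spirit of CombSum-style data fusion, so a document returned by several overlapping databases is ranked higher. It also assumes that overlapping databases share document identifiers, which real systems often cannot rely on.

```python
from collections import defaultdict

# Hypothetical toy collections: each maps a document id to its text.
COLLECTIONS = {
    "db_a": {"d1": "federated search merges ranked results",
             "d2": "crawling and indexing web sites"},
    "db_b": {"d1": "federated search merges ranked results",  # overlaps with db_a
             "d3": "resource selection for federated search"},
    "db_c": {"d4": "cooking recipes for pasta"},
}

def describe(collection):
    """Phase (i): describe a resource by its term frequencies."""
    counts = defaultdict(int)
    for text in collection.values():
        for term in text.split():
            counts[term] += 1
    return dict(counts)

def select(descriptions, query, k=2):
    """Phase (ii): keep up to k resources whose descriptions match the query."""
    terms = query.split()
    scores = {name: sum(desc.get(t, 0) for t in terms)
              for name, desc in descriptions.items()}
    ranked = sorted(scores, key=lambda n: (-scores[n], n))
    return [n for n in ranked[:k] if scores[n] > 0]

def search(collection, query):
    """Local retrieval: score a document by how many query terms it contains."""
    terms = set(query.split())
    hits = {doc_id: len(terms & set(text.split()))
            for doc_id, text in collection.items()}
    return {d: s for d, s in hits.items() if s > 0}

def merge(result_lists):
    """Phase (iii): normalize scores per resource, then sum them per document."""
    merged = defaultdict(float)
    for results in result_lists:
        top = max(results.values(), default=0)
        for doc_id, score in results.items():
            merged[doc_id] += score / top
    return sorted(merged, key=lambda d: (-merged[d], d))

def federated_search(query, k=2):
    """Run the three phases end to end over the toy collections."""
    descriptions = {name: describe(c) for name, c in COLLECTIONS.items()}
    chosen = select(descriptions, query, k)
    return merge([search(COLLECTIONS[name], query) for name in chosen])
```

Calling `federated_search("federated search")` selects `db_a` and `db_b` and returns one merged list in which the shared document `d1` appears once. A diversification step would then have to operate on this merged list or inside `merge` itself.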

