Analysis of retrieval models for cross language information retrieval

Author(s):  
Dasu Ujjwal ◽  
Prakhar Rastogi ◽  
Siril Siddhartha
Author(s):  
María-Dolores Olvera-Lobo

The Web stands today as the world's largest source of public information. Its magnitude can also be perceived as a drawback, however: users now routinely face the problem of retrieving documents that may be written in any language through queries expressed in a single source language. And although Information Retrieval (IR) depends on the availability of digital collections, this key aspect is no longer the only concern. It is time for the multicultural society of the Internet to make use of new technologies such as Cross-Language Information Retrieval (CLIR). Whereas classical IR is a field that embraces retrieval models, evaluation, query languages and document indexing involving “small” collections of documents, modern IR tends to focus on Internet search engines, mark-up languages, multimedia contents, the distribution of collections, user interaction and multilingual systems. Thus, CLIR may border on work in the following fields: information retrieval, natural language processing, machine translation and abstracting, speech processing, the interpretation of document images, and human-computer interaction. “Given a query in any medium and any language, select relevant items from a multilingual multimedia collection which can be in any medium and any language, and present them in the style or order most likely to be useful to the querier, with identical or near identical objects in different media or languages appropriately identified” (Hull & Oard, 1997). This sentence sums up the main objective of CLIR, which was acknowledged as an independent research subfield roughly a decade ago; at present, a number of international CLIR conferences take place around the world. The most important of these are TREC (Text REtrieval Conference) in the US; NTCIR (NII-NACSIS Test Collection for IR Systems) in Asia; and CLEF (Cross-Language Evaluation Forum) in Europe.
This chapter attempts to characterize Cross-Language Information Retrieval as a domain, with special attention to the Web as a resource for multilingual research. The authors also present their views on some major directions for future CLIR research.

