Inferential language models for information retrieval

2006 ◽ Vol 5 (4) ◽ pp. 296-322
Author(s): Jian-Yun Nie ◽ Guihong Cao ◽ Jing Bai

2012 ◽ pp. 138-173
Author(s): Edmond Lassalle ◽ Emmanuel Lassalle

Robertson and Spärck Jones pioneered experimental probabilistic models (the Binary Independence Model), combining a typology that generalizes the Boolean model, frequency counts used to compute elementary term weights, and the combination of these weights into a global probabilistic estimate. However, this model did not account for dependencies between indexing terms. An extension to mixture models (e.g., using a 2-Poisson law) made it possible to capture these dependencies from a macroscopic point of view (BM25), along with shallow linguistic processing of co-references. Newer approaches (language models, for example "bag of words" models, probabilistic dependencies between queries and documents, and consequently Bayesian inference using conjugate Dirichlet priors) provided new solutions for document structuring (categorization) and for index smoothing. To date, however, the main issues in these probabilistic models have been addressed from a formal point of view only, so linguistic properties are neglected in the indexing language. The authors examine how linguistic and semantic modeling can be integrated into indexing languages, and set up a hybrid model that makes it possible to deal with different information retrieval problems in a unified way.
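The abstract names BM25 weighting and language models smoothed with a conjugate Dirichlet prior without reproducing the formulas. As a rough illustration only (a sketch of the standard textbook definitions, not code or values from the chapter itself), the following Python snippet shows an Okapi BM25 score and a Dirichlet-smoothed query log-likelihood; the parameter defaults k1=1.2, b=0.75 and mu=2000 are conventional choices, and all function and variable names here are introduced for the example.

import math
from collections import Counter

def bm25_score(query_terms, doc_terms, doc_freq, num_docs, avg_doc_len,
               k1=1.2, b=0.75):
    """Okapi BM25 score of one document for a query (both given as token lists)."""
    tf = Counter(doc_terms)
    doc_len = len(doc_terms)
    score = 0.0
    for term in set(query_terms):
        if term not in tf:
            continue
        # Robertson/Spärck Jones-style inverse document frequency
        idf = math.log(1.0 + (num_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
        # Saturating term frequency with document-length normalization
        norm_tf = (tf[term] * (k1 + 1)) / (tf[term] + k1 * (1 - b + b * doc_len / avg_doc_len))
        score += idf * norm_tf
    return score

def dirichlet_lm_score(query_terms, doc_terms, collection_tf, collection_len, mu=2000.0):
    """Query log-likelihood under a unigram document language model
    smoothed with a conjugate Dirichlet prior (collection model as the prior)."""
    tf = Counter(doc_terms)
    doc_len = len(doc_terms)
    log_likelihood = 0.0
    for term in query_terms:
        p_coll = collection_tf.get(term, 0) / collection_len
        if p_coll == 0.0:
            continue  # term unseen in the whole collection: skip it
        p_smoothed = (tf.get(term, 0) + mu * p_coll) / (doc_len + mu)
        log_likelihood += math.log(p_smoothed)
    return log_likelihood

if __name__ == "__main__":
    # Tiny made-up corpus, purely to show how the two scores are computed.
    docs = [["language", "models", "for", "retrieval"],
            ["boolean", "models", "of", "retrieval"]]
    df = Counter(t for d in docs for t in set(d))       # document frequencies
    coll = Counter(t for d in docs for t in d)          # collection term counts
    avg_len = sum(len(d) for d in docs) / len(docs)
    query = ["language", "retrieval"]
    for d in docs:
        print(bm25_score(query, d, df, len(docs), avg_len),
              dirichlet_lm_score(query, d, coll, sum(coll.values())))

Both functions produce document-ranking scores and illustrate the shift the abstract describes, from term-independence weighting schemes toward smoothed language models; neither incorporates the linguistic and semantic modeling that the chapter's hybrid model addresses.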

