Lexical Co-Occurrence and Contextual Window-Based Approach with Semantic Similarity for Query Expansion

2017 ◽  
Vol 13 (3) ◽  
pp. 57-78 ◽  
Author(s):  
Jagendra Singh ◽  
Rakesh Kumar

Query expansion (QE) is an effective method for improving the performance of information retrieval systems. In this work, we address the limitations of the pseudo-feedback based QE approach and propose a hybrid approach that improves feedback-based QE by combining corpus-based information, contextual information, and semantic knowledge about the query terms. First, the paper explores different corpus-based lexical co-occurrence approaches for selecting an optimal combination of query terms from a pool of terms obtained using pseudo-feedback based QE. Next, we explore a semantic similarity approach based on word2vec for ranking the QE terms obtained from the top pseudo-feedback documents. Finally, we combine co-occurrence statistics, contextual window statistics, and semantic similarity to select the best expansion terms for query reformulation. The experiments were performed on the FIRE ad-hoc and TREC-3 benchmark datasets. The experimental results show a significant improvement over the baseline method.
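The combination described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the window size, the linear weight `alpha`, and the toy vectors standing in for word2vec embeddings are all assumptions made for illustration. Candidate terms from the pseudo-feedback documents are scored by how often they fall inside a context window around a query term, and that score is mixed with their average cosine similarity to the query terms.

```python
import math
from collections import Counter


def window_cooccurrence(docs, query_terms, window=2):
    """Count candidate terms that occur within `window` tokens of a query term."""
    counts = Counter()
    qset = set(query_terms)
    for tokens in docs:
        for i, tok in enumerate(tokens):
            if tok not in qset:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i and tokens[j] not in qset:
                    counts[tokens[j]] += 1
    return counts


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def rank_expansion_terms(docs, query_terms, vectors, window=2, alpha=0.5):
    """Rank candidates by a linear mix of window co-occurrence and semantic similarity.

    `alpha` is a hypothetical mixing weight; `vectors` stands in for word2vec embeddings.
    """
    cooc = window_cooccurrence(docs, query_terms, window)
    max_count = max(cooc.values(), default=1)
    scores = {}
    for term, count in cooc.items():
        sims = [cosine(vectors[term], vectors[q])
                for q in query_terms if term in vectors and q in vectors]
        semantic = sum(sims) / len(sims) if sims else 0.0
        scores[term] = alpha * count / max_count + (1 - alpha) * semantic
    return sorted(scores, key=scores.get, reverse=True)


# Toy pseudo-feedback documents and toy 2-d vectors (illustrative only).
feedback_docs = [["query", "expansion", "improves", "retrieval"],
                 ["retrieval", "uses", "expansion"]]
toy_vectors = {"retrieval": [1.0, 0.0], "expansion": [1.0, 0.0],
               "improves": [0.0, 1.0], "uses": [0.0, 1.0]}
ranked = rank_expansion_terms(feedback_docs, ["retrieval"], toy_vectors)
print(ranked[0])
```

A linear mix is only one way to combine the two signals; the abstract does not say which combination the authors use.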



Author(s):  
Jiangning Wu ◽  
Hiroki Tanioka ◽  
Shizhu Wang ◽  
Donghua Pan ◽  
Kenichi Yamamoto ◽  
...  

Author(s):  
Bilel Elayeb ◽  
Ibrahim Bounhas ◽  
Oussama Ben Khiroun ◽  
Fabrice Evrard ◽  
Narjès Bellamine-BenSaoud

This paper presents a new possibilistic information retrieval system using semantic query expansion. The work concerns query expansion strategies based on external linguistic resources; in this case, the authors exploit the French dictionary “Le Grand Robert”. First, they model the dictionary as a graph and compute similarities between query terms by exploiting the circuits in the graph. Second, possibility theory is used by taking advantage of a double relevance measure (possibility and necessity) between the articles of the dictionary and the query terms. Third, these two approaches are combined using two different aggregation methods. The authors also build on an existing approach for reweighting query terms in the possibilistic matching model to improve the expansion process. To assess and compare the approaches, the authors performed experiments on the standard ‘LeMonde94’ test collection.


2018 ◽  
Author(s):  
Fabiano Tavares Da Silva ◽  
José Everardo Bessa Maia

This article presents Luppar, an information retrieval tool for closed collections of documents that uses a local distributional semantic model associated with each corpus. The system performs automatic query expansion using a combination of the distributional semantic model and local context analysis, and it supports relevance feedback. The system was evaluated on databases from different domains and achieved results equal to or better than those published in the literature.


2015 ◽  
Vol 6 (3) ◽  
Author(s):  
Resti Ludviani ◽  
Khadijah F. Hayati ◽  
Agus Zainal Arifin ◽  
Diana Purwitasari

Abstract. Selecting appropriate terms for expanding a query is very important in query expansion. Therefore, term selection optimization is added to improve query expansion performance in a document retrieval system. This study proposes a new approach named Term Relatedness to Query-Entropy based (TRQE) to optimize weights in query expansion by considering semantic and statistical aspects from the relevance evaluation of pseudo feedback, in order to improve document retrieval performance. The proposed method has three main modules: relevance feedback, pseudo feedback, and document retrieval. TRQE is implemented in the pseudo feedback module to optimize term weighting in query expansion. The evaluation shows that TRQE retrieves documents with a best result of 100% precision and 22.22% recall. TRQE for weighting optimization of query expansion is shown to improve document retrieval.
Keywords: TRQE, query expansion, term weighting, term relatedness to query, relevance feedback
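To make the reported figures concrete, the sketch below computes precision and recall for a single query. The document identifiers and the counts (2 retrieved documents, 9 relevant documents) are hypothetical, chosen only because they reproduce a 100% precision and 22.22% recall pair; the abstract does not state the actual counts.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query from retrieved and relevant doc ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical judgments: 9 relevant documents, 2 retrieved, both relevant.
relevant_docs = {f"d{i}" for i in range(1, 10)}
retrieved_docs = ["d1", "d2"]
p, r = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={p:.2%} recall={r:.2%}")  # precision=100.00% recall=22.22%
```

A result of this shape means every returned document was relevant but most relevant documents were not returned, so the two numbers are best read together.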


2017 ◽  
Vol 7 (3) ◽  
pp. 38-61 ◽  
Author(s):  
Ameni Yengui ◽  
Mahmoud Neji

In this article, the authors introduce their OSSVIRI information retrieval system, which is composed of three modules. In the analysis module, they propose a statistical technique that exploits word frequency to extract simple, compound and specific terms from the documents. In the indexing module, the authors use an ontology to associate the terms with their concepts, retrieve the relations between them and disambiguate the concepts to improve the semantic content of the documents. The concepts and relations are represented as a conceptual graph. In the search module, the authors propose a technique for reformulating users' queries based on external resources and user profiles, and a matching technique based on the combined expansion of the queries and the documents, guided by the context of the information need and the document contents. The system is validated using information retrieval metrics and comparisons with an existing statistical approach. The authors show that their approach achieves good results.

