Enhanced word embedding similarity measures using fuzzy rules for query expansion

Applying Similarity Measures to Improve Query Expansion

Iraqi Journal of Science ◽

10.24996/ijs.2021.62.6.31 ◽

2021 ◽

pp. 2053-2063

Author(s):

Wajih A. Ghani A. Hussain

Keyword(s):

Information Technologies ◽

Query Expansion ◽

Similarity Measures ◽

Relevant Information ◽

Jaccard Similarity ◽

Average Precision ◽

Average Recall ◽

Retrieval Efficiency ◽

Data Source ◽

F Measure

The rapid evolution of information technologies, especially in the last few decades, has increased the volume of data on the World Wide Web, which is still growing significantly. Retrieving relevant information from the Internet or any other data source with a query of only a few words has become a major challenge. To overcome this, query expansion (QE) plays an important role in improving information retrieval (IR): the user's original query is rewritten as a new query by appending related terms of the same importance. One of the problems of query expansion is choosing suitable terms. This problem leads to another challenge: how to retrieve the important documents with high precision, high recall, and high F-measure. In this paper, we address this problem by applying different similarity measures together with English WordNet. The results show that, with a suitable selection method, English WordNet can be used to improve retrieval efficiency. The proposed approach extracts the terms from all documents and from the query, then applies the following steps: preprocessing, expanding the query based on English WordNet, selecting the best terms, weighting the terms, and finally using cosine similarity and Jaccard similarity to obtain the relevant documents. The experiments were carried out on the DUC2002 dataset, which contains 559 documents distributed over several categories. The average precision of cosine similarity (for random queries) was 100%, whereas that of Jaccard similarity was 84.4%; the average recall was 86.8% for cosine and 73.4% for Jaccard; and the average F-measure was 92% for cosine and 76% for Jaccard.
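As a rough illustration of the two measures named in the abstract, the sketch below scores a document against a query with cosine and Jaccard similarity and then computes precision, recall, and F-measure for a retrieved set. It is not the authors' implementation: the WordNet expansion and term-selection steps are omitted, raw term counts stand in for the paper's unspecified term weighting, and all function names are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(query_terms, doc_terms):
    """Cosine similarity between two bags of terms.

    Assumption: raw term counts are used as weights; the abstract does
    not say which weighting scheme the paper applies.
    """
    q, d = Counter(query_terms), Counter(doc_terms)
    dot = sum(q[t] * d[t] for t in q.keys() & d.keys())
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    norm_d = math.sqrt(sum(v * v for v in d.values()))
    return dot / (norm_q * norm_d) if norm_q and norm_d else 0.0


def jaccard_similarity(query_terms, doc_terms):
    """Jaccard similarity: shared distinct terms over all distinct terms."""
    q, d = set(query_terms), set(doc_terms)
    union = q | d
    return len(q & d) / len(union) if union else 0.0


def precision_recall_f(retrieved, relevant):
    """Precision, recall, and balanced F-measure for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    query = ["query", "expansion", "terms"]
    doc = ["query", "expansion", "improves", "retrieval"]
    print(cosine_similarity(query, doc))
    print(jaccard_similarity(query, doc))
    print(precision_recall_f(["d1", "d2"], ["d1", "d2", "d3"]))
```

Jaccard looks only at which distinct terms overlap, while cosine also reflects how often each term occurs, so the two measures can rank the same documents differently.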

Word Embedding-Based Topic Similarity Measures

Natural Language Processing and Information Systems - Lecture Notes in Computer Science ◽

10.1007/978-3-030-80599-9_4 ◽

2021 ◽

pp. 33-45

Author(s):

Silvia Terragni ◽

Elisabetta Fersini ◽

Enza Messina

Keyword(s):

Similarity Measures ◽

Word Embedding

Merchandise Recommendation for Retail Events with Word Embedding Weighted Tf-idf and Dynamic Query Expansion

The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval - SIGIR '18 ◽

10.1145/3209978.3210202 ◽

2018 ◽

Cited By ~ 1

Author(s):

Ted Tao Yuan ◽

Zezhong Zhang

Keyword(s):

Query Expansion ◽

Word Embedding ◽

Dynamic Query

Query Expansion for Sentence Retrieval Using Pseudo Relevance Feedback and Word Embedding

Lecture Notes in Computer Science - Experimental IR Meets Multilinguality, Multimodality, and Interaction ◽

10.1007/978-3-319-65813-1_8 ◽

2017 ◽

pp. 97-103 ◽

Cited By ~ 1

Author(s):

Piyush Arora ◽

Jennifer Foster ◽

Gareth J. F. Jones

Keyword(s):

Relevance Feedback ◽

Query Expansion ◽

Word Embedding ◽

Sentence Retrieval ◽

Pseudo Relevance Feedback

A Novel Approach to Query Expansion based on Semantic Similarity Measures

Proceedings of 4th International Conference on Data Management Technologies and Applications ◽

10.5220/0005579703440353 ◽

2015 ◽

Author(s):

Flora Amato ◽

Aniello De Santo ◽

Francesco Gargiulo ◽

Vincenzo Moscato ◽

Fabio Persia ◽

...

Keyword(s):

Semantic Similarity ◽

Query Expansion ◽

Similarity Measures ◽

Novel Approach

Enhancing Query Expansion Method Using Word Embedding

2019 IEEE 9th International Conference on System Engineering and Technology (ICSET) ◽

10.1109/icsengt.2019.8906317 ◽

2019 ◽

Author(s):

Nuhu Yusuf ◽

Mohd Amin Mohd Yunus ◽

Norfaradilla Wahid ◽

Noorhaniza Wahid ◽

Nazri Mohd Nawi ◽

...

Keyword(s):

Query Expansion ◽

Expansion Method ◽

Word Embedding

Query Expansion menggunakan Word Embedding dan Pseudo Relevance Feedback

Register Jurnal Ilmiah Teknologi Sistem Informasi ◽

10.26594/register.v5i1.1385 ◽

2019 ◽

Vol 5 (1) ◽

pp. 47 ◽

Cited By ~ 1

Author(s):

Evan Tanuwijaya ◽

Safri Adam ◽

Mohammad Fatoni Anggris ◽

Agus Zainal Arifin

Keyword(s):

Relevance Feedback ◽

Query Expansion ◽

Relevant Information ◽

Word Embedding ◽

Natural Languages ◽

F Measure ◽

Pseudo Relevance Feedback ◽

Better Than

Keywords are the most important element in searching for information, and using the right keywords yields relevant information. When keywords are used as a query, users write in natural language, so the query may contain words that do not appear in the answer documents prepared by the system. The system cannot directly process the natural language entered by the user, so a step is needed that expands each word the user enters; this is known as query expansion (QE). The QE method in this study uses word embedding, because word embedding can supply words that frequently co-occur with the words in the query. The output of the word embedding is used as input to pseudo relevance feedback, where it is enriched based on the existing answer documents. The QE method was implemented and tested in a chatbot application. In the test, the chatbot with QE achieved recall, precision, and F-measure values of 100%, 70%, and 82.35%, respectively. These results are 1.49% higher than those of an earlier chatbot without QE, which achieved an accuracy of only 68.51%. Based on these measurements, QE using word embedding and pseudo relevance feedback allows the chatbot to handle ambiguous, natural-language user queries and return relevant answers.
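The reported F-measure can be checked against the reported precision and recall. The worked equation below assumes the standard balanced F-measure (the harmonic mean of precision P and recall R); the abstract does not state which variant was used.

```latex
F = \frac{2PR}{P + R}
  = \frac{2 \times 0.70 \times 1.00}{0.70 + 1.00}
  = \frac{1.40}{1.70}
  \approx 0.8235
```

Under that assumption, the result matches the 82.35% given in the abstract.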

Smart combination of web measures for solving semantic similarity problems

10.31219/osf.io/g2pwx ◽

2017 ◽

Author(s):

Jorge Martinez‐Gil ◽

José F. Aldana‐Montes

Keyword(s):

Data Integration ◽

Semantic Similarity ◽

Query Expansion ◽

Similarity Measures ◽

Text Clustering ◽

The Past

Semantic similarity measures are very important in many computer‐related fields. Previous work on applications such as data integration, query expansion, tag refactoring, and text clustering has relied on semantic similarity measures. Despite their usefulness in these applications, measuring the similarity between two text expressions remains a key challenge. This paper aims to address this issue.
