Incorporating Semantic Word Representations into Query Expansion for Microblog Information Retrieval

2019 ◽  
Vol 48 (4) ◽  
pp. 626-636
Author(s):  
Bo Xu ◽  
Hongfei Lin ◽  
Yuan Lin ◽  
Kan Xu ◽  
Lin Wang ◽  
...  

Microblog information retrieval has attracted considerable research attention as a way to capture desired information from daily communication on social networks. Because microblog content is often non-standardized and flexible, and includes many popular Internet expressions, retrieval accuracy on microblogs still has much room for improvement. To enhance microblog information retrieval, we propose a novel query expansion method that enriches user queries with semantic word representations. In our method, we use a neural network model to map each word in the corpus to a low-dimensional vector representation. The mapped word vectors support algebraic vector addition, and the vector obtained by adding two word vectors can express some common attributes of the two words. Accordingly, we represent the keywords in a user query as vectors, sum all the keyword vectors, and use the resulting query vector to select the expansion words. In addition, we combine the traditional pseudo-relevance feedback query expansion method with the proposed method. Experimental results show that the proposed method is effective and reduces noise in the expanded query, which improves the accuracy of microblog retrieval.
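The vector-sum step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy embeddings, the cosine-similarity ranking, and the names `EMBEDDINGS`, `cosine` and `expand_query` are assumptions made for illustration, and a real system would use vectors learned by a neural model over the microblog corpus.

```python
import math

# Hypothetical 3-dimensional embeddings with made-up values. In practice
# these would be learned from the corpus by a neural network model.
EMBEDDINGS = {
    "flu":      [0.9, 0.1, 0.0],
    "vaccine":  [0.8, 0.3, 0.1],
    "shot":     [0.7, 0.4, 0.1],
    "fever":    [0.9, 0.0, 0.2],
    "football": [0.0, 0.1, 0.9],
    "concert":  [0.1, 0.9, 0.3],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_query(keywords, embeddings, k=2):
    """Sum the keyword vectors and return the k nearest non-query words."""
    known = [w for w in keywords if w in embeddings]
    if not known:
        return []
    dim = len(embeddings[known[0]])
    # Query vector = element-wise sum of all keyword vectors.
    query_vec = [sum(embeddings[w][i] for w in known) for i in range(dim)]
    # Rank every other vocabulary word by similarity to the query vector
    # (cosine similarity is an assumed choice of ranking function).
    candidates = [
        (cosine(query_vec, vec), word)
        for word, vec in embeddings.items()
        if word not in keywords
    ]
    candidates.sort(key=lambda pair: (-pair[0], pair[1]))
    return [word for _, word in candidates[:k]]


expansion = expand_query(["flu", "vaccine"], EMBEDDINGS, k=2)
print(expansion)
```

With these toy vectors, the words closest to the summed query vector are the health-related ones, while unrelated words such as "football" are left out of the expanded query.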

2016 ◽  
Vol 40 (7) ◽  
pp. 1054-1070 ◽  
Author(s):  
Shihchieh Chou ◽  
Zhangting Dai

Purpose: Conventional studies mainly classify a term’s appearance in the retrieved documents as either relevant or irrelevant for application. The purpose of this paper is to differentiate the term’s appearances in the retrieved documents in more detailed situations to generate relevance information and demonstrate the applicability of the derived information in combination with current methods of query expansion.
Design/methodology/approach: A method was designed first to utilize the derived information owing to term appearance differentiation within a conventional query expansion approach that has been proven as an effective technology in the enhancement of information retrieval. Then, an information retrieval system was developed to demonstrate the realization and sustain the study of the method. Formal tests were conducted to examine the distinguishing capability of the proposed information utilized in the method.
Findings: The experimental results show that substantial differences in performances can be achieved between the proposed method and the conventional query expansion method alone.
Practical implications: Since the proposed information resides at the bottom of the information hierarchy of relevance feedback, any technology regarding the application of relevance feedback information could consider the utilization of this piece of information.
Originality/value: The importance of the study is the disclosure of the applicability of the proposed information beyond current usage of term appearances in relevant/irrelevant documents and the initiation of a query expansion technology in the application of this information.


Author(s):  
FENG ZHAO ◽  
FEI FANG ◽  
FENGWEI YAN ◽  
HAI JIN ◽  
QIN ZHANG

The performance of information retrieval (IR) systems relies heavily on textual keywords and retrieved documents. Inaccurate and incomplete retrieval results are often caused by query drift and by ignoring the semantic relationships among terms. Query expansion approaches attempt to incorporate expansion terms into the original query, such as unexplored words coming from pseudo-relevance feedback (PRF) or relevance feedback documents, or semantic words extracted from an external corpus. In this paper, a semantic analysis-based query expansion method for information retrieval that uses WordNet and Wikipedia as corpora is proposed. We derive semantically related words from human knowledge repositories such as WordNet and Wikipedia, and combine them with words filtered by semantic mining from PRF documents. Our approach automatically generates a new semantic-based query from the original IR query. Experimental results on TREC datasets and the Google search engine show that the proposed method significantly improves information retrieval performance over previous results.


2021 ◽  
pp. 1-11
Author(s):  
Zhinan Gou ◽  
Yan Li

With the development of Web 2.0 communities, information retrieval based on collaborative tagging systems has been widely applied. However, users often issue brief queries with only one or two keywords, which leads to problems such as inaccurate query words, information overload and information disorientation. Query expansion addresses this issue by reformulating each search query with additional words. After analyzing the limitations of existing query expansion methods in folksonomy, this paper proposes a novel query expansion method, based on a user profile and a topic model, for search in folksonomy. Specifically, the topic model is first constructed using a variational autoencoder with Word2Vec. Then, query expansion is conducted using the user profile and the topic model. Finally, the proposed method is evaluated on a real dataset. Evaluation results show that the proposed method outperforms the baseline methods.


2015 ◽  
Vol 5 (4) ◽  
pp. 31-45 ◽  
Author(s):  
Jagendra Singh ◽  
Aditi Sharan

Pseudo-relevance feedback (PRF) is a relevance feedback approach to query expansion that treats the top-ranked retrieved documents as relevance feedback. In this paper, the authors focus on the limitations of co-occurrence-based and PRF-based query expansion, and propose a hybrid method that improves PRF-based query expansion by combining query term co-occurrence with the contextual information of query terms, both drawn from the corpus of top feedback documents retrieved in the first pass. First, the paper suggests a query term co-occurrence approach, based on the top retrieved feedback documents, to select an optimal combination of query terms from a pool of terms obtained using PRF-based query expansion. Second, a contextual window-based approach is used to select terms related to the query context from the top feedback documents. Third, the baseline, co-occurrence and contextual window-based approaches are compared using different performance evaluation metrics. The experiments were performed on benchmark data, and the results show significant improvement over the baseline approach.
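The first two steps of this pipeline (a first-pass retrieval followed by co-occurrence-based term selection) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' hybrid method: the term-count ranking, the document-level co-occurrence score, and the names `tokenize`, `first_pass` and `prf_expand` are hypothetical, and the contextual window step is not covered.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (no stemming or stop-word removal)."""
    return text.lower().split()


def first_pass(query_terms, docs, top_k):
    """Rank documents by query-term occurrences and keep the top_k matches."""
    scored = []
    for idx, doc in enumerate(docs):
        tokens = tokenize(doc)
        score = sum(tokens.count(term) for term in query_terms)
        scored.append((score, idx))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [docs[idx] for score, idx in scored[:top_k] if score > 0]


def prf_expand(query, docs, top_k=2, n_terms=2):
    """Expand a query with terms that co-occur with it in feedback documents."""
    query_terms = tokenize(query)
    # The top-ranked documents are assumed relevant (pseudo-relevance feedback).
    feedback = first_pass(query_terms, docs, top_k)
    cooccurrence = Counter()
    for doc in feedback:
        tokens = set(tokenize(doc))
        matched = sum(1 for term in query_terms if term in tokens)
        # Credit each candidate term once per query term sharing the document.
        for token in tokens:
            if token not in query_terms:
                cooccurrence[token] += matched
    ranked = sorted(cooccurrence.items(), key=lambda pair: (-pair[1], pair[0]))
    return query_terms + [term for term, _ in ranked[:n_terms]]


DOCS = [
    "query expansion with relevance feedback",
    "relevance feedback improves query expansion",
    "football scores from last night",
    "expansion of the universe",
]

expanded = prf_expand("query expansion", DOCS, top_k=2, n_terms=2)
print(expanded)
```

Because the feedback set here is limited to the two best-matching documents, the off-topic "expansion of the universe" document contributes no expansion terms, which hints at why the choice of feedback documents matters for avoiding query drift.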


2013 ◽  
Vol 791-793 ◽  
pp. 1593-1596
Author(s):  
Min Juan Zhong

Although pseudo-relevance feedback is an effective query expansion method, query drift away from the topic occurs frequently. Therefore, the first important problem is how to identify relevant documents in the top retrieved set so as to form a good feedback source. In this paper, an effective method for identifying XML feedback documents is proposed, in which a two-stage ranking model is presented to find the relevant XML documents. The experimental results show that the proposed method is reasonable and that the quality of the feedback source is ensured.



Author(s):  
Aicha Ghoulam ◽  
Fatiha Barigou ◽  
Ghalem Belalem ◽  
Farid Meziane

Many user queries contain references to named entities, and this is particularly true in the medical field. Doctors express their information needs using medical entities because these are information-rich elements that help target relevant documents. At the same time, many resources, such as clinical reports (medical texts written by doctors), have been recognized as large containers of medical entities and the relationships between them. In this article, the authors present a query expansion method that uses medical entities and their semantic relations in the query context, based on an external resource in OWL. The goal of this method is to evaluate the effectiveness of an information retrieval system in supporting doctors' easy access to relevant information. Experiments on a collection of real clinical reports show that the approach yields interesting improvements in precision, recall and MAP in medical information retrieval.

