Using UMLS-based Re-Weighting Terms as a Query Expansion Strategy

Author(s):  
Weizhong Zhu ◽  
Xuheng Xu ◽  
Xiaohua Hu ◽  
Il-Yeol Song ◽  
R.B. Allen
Author(s):  
Qian Gao ◽  
Young Im Cho ◽  

This paper proposes a multi-agent query refinement approach that provides personalized query expansion for academic paper retrieval in a Big Data environment. First, we use Hadoop as a platform to develop a formalized model that represents different types of large data caches, so that Big Data can be analyzed and processed efficiently. Second, we use a client agent to verify user identities and to monitor whether a device is ready to run a query expansion task. We then use a query expansion agent to determine the domain of the initial query by applying a knowledge-based query expansion strategy. The agent also takes users’ interests into account, according to the intelligent devices they use, by applying a user-device-based query expansion strategy and a weighted query expansion strategy to obtain the optimized query expansion set. We compare our method with the conceptual retrieval method and with two other lexical methods for query expansion, and we show that our method achieves better average recall and average precision.


2018 ◽  
Vol 54 (1) ◽  
pp. 1-13 ◽  
Author(s):  
Francis C. Fernández-Reyes ◽  
Jorge Hermosillo-Valadez ◽  
Manuel Montes-y-Gómez

2006 ◽  
Vol 13 (1) ◽  
pp. 75-90
Author(s):  
BOLETTE SANDFORD PEDERSEN

In this paper we focus on a specific query expansion topic, namely search on Danish compounds and expansion to some of their synonymous phrases. Compounds constitute a specific issue in search, in particular in languages where they are written as one word, as is the case for Danish and the other Scandinavian languages. For such languages, expanding the query compound into separate lemmas is a way of finding the frequently occurring alternative phrases in which the content of a compound can also be expressed. However, the number of irrelevant hits is generally very high when using this expansion strategy. The aim of this paper is therefore to examine how we can obtain better search results on split compounds, partly by looking at the internal structure of the original compound, partly by analyzing the context in which the split compound occurs. In this context, we pursue two hypotheses: (1) that some categories of compounds are more likely to have synonymous ‘split’ counterparts than others; and (2) that search results where both search words (obtained by splitting the compound) occur in the same noun phrase are more likely to contain a phrase synonymous with the original compound query. The search results from 410 enhanced compound queries are used as a test bed for our experiments. On these search results, we perform a shallow linguistic analysis and introduce a new, linguistically based threshold for retrieved hits. The results obtained with this strategy demonstrate that compound splitting, combined with a shallow linguistic analysis focusing on the argument structure of the compound head and on the recognition of NPs, can improve search by substantially reducing the number of irrelevant hits.
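The second hypothesis above can be illustrated with a rough sketch. The paper uses noun phrase recognition; the code below only approximates "same noun phrase" by requiring both split parts to occur within a few tokens of each other, which is an assumption made for brevity. The function names, the `window` parameter, and the Danish example hits are hypothetical and not taken from the paper.

```python
import string

def parts_cooccur(text, parts, window=4):
    """Return True if both split parts occur within `window` tokens of each other.

    Token proximity is only a crude stand-in for real noun phrase chunking.
    """
    first_part, second_part = parts
    # Lowercase and strip surrounding punctuation so "Kræft," matches "kræft".
    tokens = [tok.strip(string.punctuation) for tok in text.lower().split()]
    first_positions = [i for i, tok in enumerate(tokens) if tok == first_part]
    second_positions = [i for i, tok in enumerate(tokens) if tok == second_part]
    return any(abs(i - j) <= window
               for i in first_positions
               for j in second_positions)

def filter_hits(hits, parts, window=4):
    """Keep only the hits in which the two split parts appear close together."""
    return [hit for hit in hits if parts_cooccur(hit, parts, window)]

# Invented example: the compound "kræftbehandling" split into its two lemmas.
HITS = [
    "Behandling af kræft er ofte langvarig.",
    "Kræft er en alvorlig sygdom, og mange patienter venter længe på behandling.",
]
RELEVANT = filter_hits(HITS, ("kræft", "behandling"))
```

Here only the first hit is retained, because the two lemmas sit two tokens apart, whereas in the second hit they are separated by most of the sentence. A real implementation would replace the window test with a chunker that identifies noun phrase boundaries.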


2021 ◽  
pp. 016555152110406
Author(s):  
Yasir Hadi Farhan ◽  
Shahrul Azman Mohd Noah ◽  
Masnizah Mohd ◽  
Jaffar Atwan

One of the main issues associated with search engines is the query–document vocabulary mismatch problem, a long-standing problem in Information Retrieval (IR). This problem occurs when a user query does not match the content of stored documents, and it affects most search tasks. Automatic query expansion (AQE) is one of the most common approaches used to address this problem. Various AQE techniques have been proposed; these mainly involve finding synonyms or related words for the query terms. Word embedding (WE) is one of the methods currently receiving significant attention. Most existing AQE techniques focus on expanding the individual query terms rather than the entire query during the expansion process, and this can lead to query drift if poor expansion terms are selected. In this article, we introduce Deep Averaging Networks (DANs), an architecture that feeds the average of the WE vectors produced by the Word2Vec toolkit for the terms in a query through several linear neural network layers. This average vector is assumed to represent the meaning of the query as a whole and can be used to find expansion terms that are relevant to the complete query. We explore the potential of DANs for AQE in Arabic document retrieval. We experiment with using DANs for AQE in the classic probabilistic BM25 model as well as in two recent expansion strategies: the Embedding-Based Query Expansion approach (EQE1) and the Prospect-Guided Query Expansion Strategy (V2Q). Although DANs did not improve all outcomes when used in the BM25 model, they outperformed all baselines when incorporated into the EQE1 and V2Q expansion strategies.
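The idea of expanding the query as a whole, instead of term by term, can be sketched as follows. This is a minimal illustration, not the authors' model: it covers only the averaging step and a cosine-similarity lookup, and it leaves out the learned linear layers of a DAN. The toy vectors, the English vocabulary, and the function names are invented for the example; a real system would load Word2Vec vectors trained on the target collection.

```python
import numpy as np

# Toy 3-dimensional embeddings, invented purely for illustration.
EMBEDDINGS = {
    "heart": np.array([0.9, 0.1, 0.0]),
    "attack": np.array([0.7, 0.3, 0.1]),
    "cardiac": np.array([0.85, 0.2, 0.05]),
    "infarction": np.array([0.8, 0.25, 0.05]),
    "banana": np.array([0.0, 0.1, 0.9]),
}

def average_query_vector(terms, embeddings):
    """Average the vectors of all query terms that have an embedding."""
    vectors = [embeddings[term] for term in terms if term in embeddings]
    if not vectors:
        raise ValueError("no query term has an embedding")
    return np.mean(vectors, axis=0)

def expand_query(terms, embeddings, k=2):
    """Return the k vocabulary words closest to the averaged query vector."""
    query_vec = average_query_vector(terms, embeddings)
    query_norm = np.linalg.norm(query_vec)
    scored = []
    for word, vec in embeddings.items():
        if word in terms:
            continue  # never propose a term that is already in the query
        similarity = float(np.dot(query_vec, vec)
                           / (query_norm * np.linalg.norm(vec)))
        scored.append((word, similarity))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in scored[:k]]

EXPANSION = expand_query(["heart", "attack"], EMBEDDINGS, k=2)
```

Because candidates are scored against the averaged vector, a word is proposed only if it is close to the query as a whole, which is the property the abstract relies on to limit query drift.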


Author(s):  
Qinyuan Xiang ◽  
Weijiang Li ◽  
Hui Deng ◽  
Feng Wang

2019 ◽  
Author(s):  
KAIKAI MA ◽  
Peng Li ◽  
John Xin ◽  
Yongwei Chen ◽  
Zhijie Chen ◽  
...  

Creating crystalline porous materials with large pores is typically challenging due to undesired interpenetration, staggered stacking, or weakened framework stability. Here, we report a pore size expansion strategy by self-recognizing π-π stacking interactions in a series of two-dimensional (2D) hydrogen-bonded organic frameworks (HOFs), HOF-10x (x=0,1,2), self-assembled from pyrene-based tectons with systematic elongation of π-conjugated molecular arms. This strategy successfully avoids interpenetration or staggered stacking and expands the pore size of HOF materials to access mesoporous HOF-102, which features a surface area of ~2,500 m²/g and the largest pore volume (1.3 cm³/g) to date among all reported HOFs. More importantly, HOF-102 shows significantly enhanced thermal and chemical stability, as evidenced by powder X-ray diffraction and N₂ isotherms after treatments in challenging conditions. Such stability enables the adsorption of dyes and cytochrome c from aqueous media by HOF-102 and affords a processible HOF-102/fiber composite for the efficient photochemical detoxification of a mustard gas simulant.

