Modeling and mining term association for improving biomedical information retrieval performance

2012 ◽  
Vol 13 (Suppl 9) ◽  
pp. S2 ◽  
Author(s):  
Qinmin Hu ◽  
Jimmy Huang ◽  
Xiaohua Hu
2019 ◽  
Vol 20 (S16) ◽  
Author(s):  
Bo Xu ◽  
Hongfei Lin ◽  
Liang Yang ◽  
Kan Xu ◽  
Yijia Zhang ◽  
...  

Abstract. Background: The number of biomedical research articles has increased exponentially in recent years with the advancement of biomedicine. This growth makes it difficult for researchers to obtain the information they need. Information retrieval technologies seek to tackle the problem. However, information needs cannot be completely satisfied by directly applying existing information retrieval techniques. Therefore, biomedical information retrieval not only focuses on the relevance of search results, but also aims to promote the completeness of the results, which is referred to as diversity-oriented retrieval. Results: We address the diversity-oriented biomedical retrieval task using a supervised term ranking model. The model is learned through a supervised query expansion process for term refinement. Based on the model, the most relevant and diversified terms are selected to enrich the original query. The expanded query is then fed into a second retrieval to improve the relevance and diversity of search results. To this end, we propose three diversity-oriented optimization strategies in our model: a diversified term labeling strategy, biomedical resource-based term features, and a diversity-oriented group sampling learning method. Experimental results on TREC Genomics collections demonstrate the effectiveness of the proposed model in improving the relevance and the diversity of search results. Conclusions: The three proposed strategies jointly contribute to the improvement of biomedical retrieval performance. Our model yields more relevant and diversified results than the state-of-the-art baseline models. Moreover, our method provides a general framework for improving biomedical retrieval performance, and can be used as the basis for future work.
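The expansion step described in this abstract (rank candidate terms, then add the best ones to the original query before a second retrieval) can be illustrated with a minimal sketch. The function name `expand_query`, the parameter `k`, and the example scores are hypothetical; in particular, the scores stand in for the output of the learned term ranking model, which the abstract does not specify in detail.

```python
def expand_query(query_terms, candidate_scores, k=3):
    """Append the k highest-scoring candidate terms not already in the query.

    candidate_scores maps each candidate term to a score; here the scores are
    placeholders for whatever a trained term ranking model would produce.
    """
    seen = set(query_terms)
    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(candidate_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    expansion = [term for term, _ in ranked if term not in seen][:k]
    return list(query_terms) + expansion


# Hypothetical example: the expanded query would be sent to a second retrieval.
expanded = expand_query(
    ["brca1", "cancer"],
    {"tumor": 0.9, "cancer": 0.95, "mutation": 0.7, "gene": 0.4},
    k=2,
)
print(expanded)
```

Terms already present in the query are skipped, so a high-scoring duplicate such as "cancer" in the example does not use up an expansion slot.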


2018 ◽  
Vol 15 (6) ◽  
pp. 1797-1809 ◽  
Author(s):  
Bo Xu ◽  
Hongfei Lin ◽  
Yuan Lin ◽  
Yunlong Ma ◽  
Liang Yang ◽  
...  

Patents are critical intellectual assets for any competitive business. With ever-increasing patent filings, effective prior art search has become an important task in patent retrieval, which is a subfield of information retrieval (IR). The goal of prior art search is to find and rank documents related to a query patent. Query formulation is a key step in prior art search, in which patent structure is exploited to generate queries from the various fields available in the patent text. Because a patent can span multiple technical domains, this work argues that technical domain and patent structure have a combined effect on the effectiveness of patent retrieval. The study uses International Patent Classification (IPC) codes to categorize query patents into eight technical domains and explores eighteen different combinations of patent fields to generate search queries. A total of 144 retrieval experiments were carried out using the BM25 ranking algorithm. Retrieval performance is evaluated in terms of recall over the top 1000 retrieved records. Empirical results support our assumption. A two-way analysis of variance is also conducted to validate the hypotheses. The findings of this work may help patent information retrieval professionals develop domain-specific patent retrieval systems that exploit the patent structure.
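The experimental setup in this abstract (eight domains crossed with eighteen field combinations, BM25 ranking, recall over the top 1000 records) can be sketched as follows. This is a minimal illustration under stated assumptions: it uses one common Okapi BM25 variant with conventional default parameters `k1=1.2` and `b=0.75`, which the abstract does not report, and the function names and placeholder grid labels are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def recall_at_k(ranked_ids, relevant_ids, k=1000):
    """Fraction of relevant documents found in the top k of a ranking."""
    if not relevant_ids:
        return 0.0
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)


# Placeholder labels only: 8 domains x 18 field combinations = 144 runs.
experiment_grid = list(product(range(8), range(18)))
print(len(experiment_grid))
```

Each cell of the grid would correspond to one retrieval run whose ranking is scored with `recall_at_k`; the per-cell recall values are the kind of data a two-way analysis of variance over domain and field combination would take as input.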


1988 ◽  
Vol 32 (5) ◽  
pp. 301-305
Author(s):  
Robert D. Peters ◽  
Gloria T. Yastrop ◽  
Deborah A. Boehm-Davis

This research examined the effects of two cognitive individual differences (perceptual speed and spatial scanning) on information retrieval performance under two matched and two mismatched database format/query conditions. A graphic and a tabular form of an airline database were constructed, along with questions that required users to search through the database to determine the correct response. Two types of questions were designed: graphic and tabular. The data indicate that users are faster when the format of the information in the database matches the type of information needed to answer the question and that cognitive individual differences are differentially predictive of performance in the matched and mismatched conditions. Recommendations for database design are presented.


2015 ◽  
Vol 39 (1) ◽  
pp. 81-103
Author(s):  
Tho Thanh Quan ◽  
Xuan H. Luong ◽  
Thanh C. Nguyen ◽  
Hui Siu Cheung

Purpose – Most digital libraries (DLs) are now available online. They also support the Z39.50 standard protocol, which allows computer-based systems to effectively retrieve information stored in the DLs. The major difficulty lies in the inconsistency between the database schemas of multiple DLs. The purpose of this paper is to present a system known as Argumentation-based Digital Library Search (ADLSearch), which facilitates information retrieval across multiple DLs. Design/methodology/approach – The proposed approach uses argumentation theory to reconcile the outputs of multiple schema matching algorithms. In addition, a distributed architecture is proposed for the ADLSearch system for information retrieval from multiple DLs. Findings – Initial performance results are promising. First, schema matching can improve the retrieval performance on DLs, as compared to the baseline technique. Second, argumentation-based retrieval can yield better matching accuracy and retrieval efficiency than individual schema matching algorithms. Research limitations/implications – The work discussed in this paper has been implemented as a prototype supporting scholarly retrieval from about 800 DLs around the world. However, due to the complexity of the argumentation algorithm, the process of adding new DLs to the system cannot be performed in real time. Originality/value – In this paper, an argumentation-based approach is proposed for reconciling the conflicts between multiple schema matching algorithms in the context of information retrieval from multiple DLs. Moreover, the proposed approach can also be applied to similar applications that require automatic mapping between multiple database schemas.
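To make the reconciliation problem concrete, the sketch below combines conflicting field mappings from several schema matchers. It deliberately substitutes simple majority voting for the argumentation-based method the abstract describes, whose details are not given here; the function name and the example field names are hypothetical.

```python
from collections import Counter


def reconcile_by_vote(matcher_outputs):
    """For each source field, keep the target field proposed by most matchers.

    This is plain majority voting, a stand-in for illustration only, not the
    argumentation-based reconciliation used by ADLSearch.
    """
    votes = {}
    for mapping in matcher_outputs:
        for source_field, target_field in mapping.items():
            votes.setdefault(source_field, Counter())[target_field] += 1
    reconciled = {}
    for source_field, counter in votes.items():
        # Sort by descending vote count, then by name, so ties are deterministic.
        best = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))[0][0]
        reconciled[source_field] = best
    return reconciled


# Hypothetical outputs from three schema matching algorithms.
outputs = [
    {"title": "dc_title", "author": "dc_creator"},
    {"title": "dc_title", "author": "dc_contributor"},
    {"title": "dc_subject", "author": "dc_creator"},
]
print(reconcile_by_vote(outputs))
```

Voting treats every matcher as equally reliable, whereas an argumentation-based approach can weigh the reasons behind each proposed mapping.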

