Modeling and mining term association for improving biomedical information retrieval performance

Abstract Background The number of biomedical research articles has increased exponentially with the advancement of biomedicine in recent years, making it difficult for researchers to find the information they need. Information retrieval technologies seek to tackle this problem. However, information needs cannot be completely satisfied by directly applying existing information retrieval techniques. Biomedical information retrieval therefore focuses not only on the relevance of search results but also on their completeness, which is referred to as diversity-oriented retrieval. Results We address the diversity-oriented biomedical retrieval task using a supervised term ranking model. The model is learned through a supervised query expansion process for term refinement. Based on the model, the most relevant and diversified terms are selected to enrich the original query. The expanded query is then fed into a second retrieval to improve the relevance and diversity of search results. To this end, we propose three diversity-oriented optimization strategies in our model: a diversified term labeling strategy, biomedical resource-based term features, and a diversity-oriented group sampling learning method. Experimental results on TREC Genomics collections demonstrate the effectiveness of the proposed model in improving the relevance and the diversity of search results. Conclusions The three proposed strategies jointly contribute to the improvement of biomedical retrieval performance. Our model yields more relevant and diversified results than the state-of-the-art baseline models. Moreover, our method provides a general framework for improving biomedical retrieval performance and can be used as the basis for future work.
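The idea of choosing expansion terms that are both relevant and diversified can be illustrated with a minimal sketch. This is not the supervised term ranking model described in the abstract: it is a simple greedy heuristic, and the names `select_expansion_terms`, `expand_query`, the per-term relevance scores, the aspect labels and the `trade_off` weight are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Set


def select_expansion_terms(
    relevance: Dict[str, float],
    aspects: Dict[str, Set[str]],
    k: int,
    trade_off: float = 0.5,
) -> List[str]:
    """Greedily pick up to k terms, balancing relevance against new aspects.

    relevance: hypothetical relevance score per candidate term.
    aspects:   hypothetical set of topical aspects each term covers.
    trade_off: weight on relevance; (1 - trade_off) goes to aspect novelty.
    """
    selected: List[str] = []
    covered: Set[str] = set()
    candidates = set(relevance)

    while candidates and len(selected) < k:
        def gain(term: str) -> float:
            # Reward only aspects that no selected term has covered yet.
            new_aspects = aspects.get(term, set()) - covered
            return trade_off * relevance[term] + (1 - trade_off) * len(new_aspects)

        # Sorting first makes tie-breaking deterministic (alphabetical).
        best = max(sorted(candidates), key=gain)
        selected.append(best)
        covered |= aspects.get(best, set())
        candidates.remove(best)

    return selected


def expand_query(query: str, terms: List[str]) -> str:
    """Append the selected terms to the original query string."""
    return " ".join([query] + terms)
```

With this heuristic, a highly relevant but redundant term (one whose aspects are already covered) is passed over in favor of a less relevant term that covers something new. The expanded query would then be submitted to a second retrieval round.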

Boosting Biomedical Information Retrieval Performance through Citation Graph: An Empirical Study

Advances in Knowledge Discovery and Data Mining - Lecture Notes in Computer Science ◽

10.1007/978-3-642-01307-2_100 ◽

2009 ◽

pp. 949-956 ◽

Cited By ~ 4

Author(s):

Xiaoshi Yin ◽

Xiangji Huang ◽

Qinmin Hu ◽

Zhoujun Li

Keyword(s):

Information Retrieval ◽

Empirical Study ◽

Retrieval Performance ◽

Biomedical Information Retrieval ◽

Citation Graph

JASIST special issue on biomedical information retrieval

Journal of the Association for Information Science and Technology ◽

10.1002/asi.23991 ◽

2017 ◽

Vol 69 (3) ◽

pp. 500-500

Keyword(s):

Information Retrieval ◽

Special Issue ◽

Biomedical Information Retrieval

Improve Biomedical Information Retrieval Using Modified Learning to Rank Methods

IEEE/ACM Transactions on Computational Biology and Bioinformatics ◽

10.1109/tcbb.2016.2578337 ◽

2018 ◽

Vol 15 (6) ◽

pp. 1797-1809 ◽

Cited By ~ 6

Author(s):

Bo Xu ◽

Hongfei Lin ◽

Yuan Lin ◽

Yunlong Ma ◽

Liang Yang ◽

...

Keyword(s):

Information Retrieval ◽

Learning To Rank ◽

Biomedical Information Retrieval

Retrieval performance in Ferret: a conceptual information retrieval system

Proceedings of the 14th annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '91 ◽

10.1145/122860.122896 ◽

1991 ◽

Cited By ~ 12

Author(s):

Michael L. Mauldin

Keyword(s):

Information Retrieval ◽

Retrieval System ◽

Information Retrieval System ◽

Retrieval Performance ◽

Conceptual Information

Supporting BioMedical Information Retrieval: The BioTracer Approach

Transactions on Large-Scale Data- and Knowledge-Centered Systems IV - Lecture Notes in Computer Science ◽

10.1007/978-3-642-23740-9_4 ◽

2011 ◽

pp. 73-94 ◽

Cited By ~ 3

Author(s):

Heri Ramampiaro ◽

Chen Li

Keyword(s):

Information Retrieval ◽

Biomedical Information Retrieval

Effect of Technical Domains and Patent Structure on Patent Information Retrieval

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.a1922.109119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 6067-6074

Keyword(s):

Information Retrieval ◽

Retrieval Performance ◽

Patent Retrieval ◽

Patent Classification ◽

Domain Specific ◽

Patent Information ◽

Prior Art ◽

Retrieval Systems ◽

International Patent ◽

Intellectual Assets

Patents are critical intellectual assets for any competitive business. With ever-increasing patent filings, effective prior art search has become an important task in patent retrieval, which is a subfield of information retrieval (IR). The goal of prior art search is to find and rank documents related to a query patent. Query formulation is a key step in prior art search, in which patent structure is exploited to generate queries from the various fields available in the patent text. As a patent encodes multiple technical domains, this work argues that technical domains and patent structure have a combined effect on the effectiveness of patent retrieval. The study uses International Patent Classification (IPC) codes to categorize query patents into eight technical domains and also explores eighteen different combinations of patent fields to generate search queries. A total of 144 retrieval experiments have been carried out using the BM25 ranking algorithm. Retrieval performance is evaluated in terms of recall over the top 1000 records. Empirical results support our assumption. A two-way analysis of variance is also conducted to validate the hypotheses. The findings of this work may help patent information retrieval professionals develop domain-specific patent retrieval systems that exploit the patent structure.
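The evaluation setup above (BM25 ranking scored by recall over the top k records) can be sketched as follows. This is an illustrative sketch only: it uses a common Okapi BM25 variant with a smoothed IDF, and the parameter defaults `k1 = 1.2` and `b = 0.75`, the function names and the tokenized toy inputs are assumptions, not details taken from the study. The collection is assumed to be non-empty.

```python
import math
from collections import Counter
from typing import Dict, List, Set


def bm25_scores(
    query: List[str],
    docs: Dict[str, List[str]],
    k1: float = 1.2,
    b: float = 0.75,
) -> Dict[str, float]:
    """Score every document against the query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(tokens) for tokens in docs.values()) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(t for tokens in docs.values() for t in set(tokens))

    scores: Dict[str, float] = {}
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            # Smoothed IDF (the +1 inside the log keeps it non-negative).
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores[doc_id] = score
    return scores


def recall_at_k(ranked: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)
```

In the study's design, each of the 8 technical domains is crossed with each of the 18 field combinations, which gives the 8 × 18 = 144 experiments; a function like `recall_at_k` with `k = 1000` would be computed once per cell of that grid.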

Multiversion Information Retrieval: Performance Evaluation Of Neural Networks VS. Dempster-Shafer Model

Intelligent Systems Third Golden West International Conference ◽

10.1007/978-94-011-7108-3_56 ◽

1995 ◽

pp. 537-545 ◽

Cited By ~ 1

Author(s):

G. V. Meghabghab ◽

D. B. Meghabghab

Keyword(s):

Neural Networks ◽

Information Retrieval ◽

Performance Evaluation ◽

Retrieval Performance

Predicting Information Retrieval Performance

Proceedings of the Human Factors Society Annual Meeting ◽

10.1177/154193128803200513 ◽

1988 ◽

Vol 32 (5) ◽

pp. 301-305

Author(s):

Robert D. Peters ◽

Gloria T. Yastrop ◽

Deborah A. Boehm-Davis

Keyword(s):

Information Retrieval ◽

Individual Differences ◽

Correct Response ◽

Database Design ◽

Tabular Form ◽

Perceptual Speed ◽

Retrieval Performance

This research examined the effects of two different cognitive individual differences (perceptual speed and spatial scanning) on information retrieval performance under two matched and two mismatched database format/query conditions. A graphic and a tabular form of an airline database were constructed, along with questions that required users to search through the database to determine the correct response. Two types of questions were designed: graphic and tabular. The data indicate that users are faster when the format of the information in the database matches the type of information needed to answer the question, and that cognitive individual differences are differentially predictive of performance in the matched and mismatched conditions. Recommendations for database design are presented.

Argumentation-based schema matching for multiple digital libraries

Online Information Review ◽

10.1108/oir-02-2014-0023 ◽

2015 ◽

Vol 39 (1) ◽

pp. 81-103

Author(s):

Tho Thanh Quan ◽

Xuan H. Luong ◽

Thanh C. Nguyen ◽

Hui Siu Cheung

Keyword(s):

Information Retrieval ◽

Digital Libraries ◽

Design Methodology ◽

Schema Matching ◽

Retrieval Performance ◽

Content Type ◽

Retrieval Efficiency ◽

Automatic Mapping ◽

Computer Based ◽

Performance Results

Purpose – Most digital libraries (DLs) are now available online. They also provide the Z39.50 standard protocol, which allows computer-based systems to effectively retrieve information stored in the DLs. The major difficulty lies in the inconsistency between the database schemas of multiple DLs. The purpose of this paper is to present a system known as Argumentation-based Digital Library Search (ADLSearch), which facilitates information retrieval across multiple DLs. Design/methodology/approach – The proposed approach uses argumentation theory to reconcile the results of multiple schema matching algorithms. In addition, a distributed architecture is proposed for the ADLSearch system for information retrieval from multiple DLs. Findings – Initial performance results are promising. First, schema matching can improve the retrieval performance on DLs, as compared to the baseline technique. Second, argumentation-based retrieval can yield better matching accuracy and retrieval efficiency than individual schema matching algorithms. Research limitations/implications – The work discussed in this paper has been implemented as a prototype supporting scholarly retrieval from about 800 DLs around the world. However, due to the complexity of the argumentation algorithm, new DLs cannot be added to the system in real time. Originality/value – In this paper, an argumentation-based approach is proposed for reconciling the conflicts between multiple schema matching algorithms in the context of information retrieval from multiple DLs. Moreover, the proposed approach can also be applied to similar applications that require automatic mapping between multiple database schemas.
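The problem of reconciling conflicting outputs from several schema matching algorithms can be made concrete with a small sketch. Note that this uses plain majority voting as a much simpler stand-in; it is not the argumentation-based method of ADLSearch. The function name `reconcile_matches` and the field labels used with it are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional


def reconcile_matches(proposals: List[Dict[str, str]]) -> Dict[str, Optional[str]]:
    """Combine field mappings proposed by several matchers by majority vote.

    proposals: one dict per matcher, mapping a source field to a target field.
    Returns, for each source field, the target that a strict majority of all
    matchers agree on, or None when no target reaches a majority.
    """
    fields = sorted({field for proposal in proposals for field in proposal})
    reconciled: Dict[str, Optional[str]] = {}
    for field in fields:
        # Count how many matchers proposed each target for this field.
        votes = Counter(p[field] for p in proposals if field in p)
        target, count = votes.most_common(1)[0]
        reconciled[field] = target if count > len(proposals) / 2 else None
    return reconciled
```

A voting scheme like this treats every matcher as equally reliable and leaves contested fields unmapped, which is the kind of conflict an argumentation-based approach aims to resolve in a more principled way.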
