Sub-document level information retrieval

2013 ◽  
Vol 47 (1) ◽  
pp. 65-66
Author(s):  
Sukomal Pal

2011 ◽  
pp. 118-146 ◽  
Author(s):  
Syed Sibte Raza Abidi

This chapter introduces intelligent information personalization as an approach to personalizing web-based information retrieval experiences according to an individual's interests, needs and goals. We present intelligent techniques that dynamically compose new personalized information by adapting existing web-based information in line with a dynamic user model, whilst simultaneously addressing linguistic, factual and functional requirements. The chapter highlights the different facets, tasks and issues concerning intelligent information personalization to guide researchers in designing intelligent information personalization applications. It presents intelligent methods that address information personalization at the content level, as opposed to traditional approaches that focus on interface-level information personalization. To assist researchers in designing such applications, we present our information personalization framework, named AdWISE (Adaptive Web-mediated Information and Services Environment), to demonstrate how to systematically integrate various intelligent methods to achieve information personalization. We conclude with a commentary on the future outlook for intelligent information personalization.


Electronics ◽  
2021 ◽  
Vol 10 (7) ◽  
pp. 824
Author(s):  
Peng Wang ◽  
Zhenkai Deng ◽  
Ruilong Cui

Extracting financial events from large numbers of financial announcements is very important for helping investors make the right decisions. It remains challenging, however, because event arguments are often scattered across multiple sentences of a financial announcement, while most existing event extraction models work only in sentence-level scenarios. To address this problem, this paper proposes a relation-aware Transformer-based Document-level Joint Event Extraction model (TDJEE), which encodes relations between words into the context and leverages a modified Transformer to capture document-level information to fill event arguments. Meanwhile, the absence of labeled data in the financial domain can make models' extraction results unstable, which is known as the cold start problem. Furthermore, a Fonduer-based knowledge base combined with the distant supervision method is proposed to simplify event labeling and provide a high-quality labeled corpus for model training and evaluation. Experimental results on real-world Chinese financial announcements show that, compared with other models, TDJEE achieves competitive results and can effectively extract event arguments across multiple sentences.


2020 ◽  
Author(s):  
Sarthak Jain ◽  
Madeleine van Zuylen ◽  
Hannaneh Hajishirzi ◽  
Iz Beltagy

2021 ◽  
Vol 58 (4) ◽  
pp. 102563
Author(s):  
Klim Zaporojets ◽  
Johannes Deleu ◽  
Chris Develder ◽  
Thomas Demeester

2017 ◽  
Vol 68 (11) ◽  
pp. 2636-2648 ◽  
Author(s):  
Stephen Wu ◽  
Sijia Liu ◽  
Yanshan Wang ◽  
Tamara Timmons ◽  
Harsha Uppili ◽  
...  

JAMIA Open ◽  
2020 ◽  
Vol 3 (3) ◽  
pp. 395-404 ◽  
Author(s):  
Steven R Chamberlin ◽  
Steven D Bedrick ◽  
Aaron M Cohen ◽  
Yanshan Wang ◽  
Andrew Wen ◽  
...  

Abstract

Objective: Growing numbers of academic medical centers offer patient cohort discovery tools to their researchers, yet the performance of systems for this use case is not well understood. The objective of this research was to assess patient-level information retrieval methods using electronic health records for different types of cohort definition retrieval.

Materials and Methods: We developed a test collection consisting of about 100 000 patient records and 56 test topics that characterized patient cohort requests for various clinical studies. Automated information retrieval tasks using word-based approaches were performed, varying 4 different parameters for a total of 48 permutations, with performance measured using B-Pref. We subsequently created structured Boolean queries for the 56 topics for performance comparisons. In addition, we performed a more detailed analysis of 10 topics.

Results: The best-performing word-based automated query parameter settings achieved a mean B-Pref of 0.167 across all 56 topics. The way a topic was structured (topic representation) had the largest impact on performance. Performance not only varied widely across topics, but there was also a large variance in sensitivity to parameter settings across the topics. Structured queries generally performed better than automated queries on measures of recall and precision but were still not able to recall all relevant patients found by the automated queries.

Conclusion: While word-based automated methods of cohort retrieval offer an attractive solution to the labor-intensive nature of this task currently used at many medical centers, we generally found suboptimal performance in those approaches, with better performance obtained from structured Boolean queries. Future work will focus on using the test collection to develop and evaluate new approaches to query structure, weighting algorithms, and application of semantic methods.
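
For readers unfamiliar with B-Pref, the measure reported in the abstract above, the following is a minimal sketch of one commonly used formulation, in which unjudged records are ignored and each retrieved relevant record is penalized by the number of judged non-relevant records ranked above it. The abstract does not say which evaluation tool or exact variant the authors used, so the function name `bpref`, its arguments, and the toy patient identifiers are illustrative assumptions only.

```python
def bpref(ranking, relevant, nonrelevant):
    """Sketch of B-Pref for one topic (assumed formulation, not the authors' code).

    ranking     -- list of record ids in ranked order (hypothetical input format)
    relevant    -- set of ids judged relevant
    nonrelevant -- set of ids judged non-relevant
    Records in neither set are unjudged and are skipped.
    """
    num_rel = len(relevant)
    if num_rel == 0:
        return 0.0
    denom = min(num_rel, len(nonrelevant))
    nonrel_seen = 0  # judged non-relevant records ranked so far
    total = 0.0
    for rec in ranking:
        if rec in relevant:
            if nonrel_seen == 0 or denom == 0:
                total += 1.0
            else:
                # Cap the penalty so each contribution stays within [0, 1].
                total += 1.0 - min(nonrel_seen, num_rel) / denom
        elif rec in nonrelevant:
            nonrel_seen += 1
    # Relevant records that were never retrieved contribute nothing.
    return total / num_rel


# Toy usage with made-up identifiers: "x" is unjudged, "p7" is never retrieved.
score = bpref(["p1", "x", "p9", "p2", "p3"],
              relevant={"p1", "p3", "p7"},
              nonrelevant={"p9", "p2"})
print(round(score, 3))
```

A mean B-Pref, such as the figure quoted in the Results, would then be the average of this per-topic score over all topics.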

