Leveraging medical context to recommend semantically similar terms for chart reviews

Abstract

Background: Information retrieval (IR) helps clinicians answer questions posed to large collections of electronic medical records (EMRs), such as how best to identify a patient’s cancer stage. One of the more promising approaches to IR for EMRs is to expand a keyword query with similar terms (e.g., augmenting cancer with mets). However, chart review tasks vary so widely that fixed sets of similar terms are insufficient. Current language models, such as Bidirectional Encoder Representations from Transformers (BERT) embeddings, do not capture the full non-textual context of a task. In this study, we present new methods that provide similar terms dynamically by adjusting to the context of the chart review task.

Methods: We introduce a medical-context vector space in which each word is represented by a vector that captures the word’s usage in different medical contexts (e.g., how frequently cancer is used when ordering a prescription versus describing family history), beyond the context learned from the surrounding text. These vectors are transformed into a vector space for customizing the set of similar terms selected for different chart review tasks. We evaluate the vector space model on multiple chart review tasks, in which supervised machine learning models learn to predict the preferred terms of clinically knowledgeable reviewers. To quantify the usefulness of the predicted similar terms relative to a baseline of standard word2vec embeddings, we measure (1) the prediction performance of the medical-context vector space model using the area under the receiver operating characteristic curve (AUROC) and (2) the labeling effort required to train the models.

Results: The medical-context vector space outperformed the baseline word2vec embeddings in all three chart review tasks, with an average AUROC of 0.80 versus 0.66, respectively. Additionally, the medical-context vector space significantly reduced the number of labels required to learn and predict the preferred similar terms of reviewers. Specifically, the labeling effort was reduced to 10% of the entire dataset in all three tasks.

Conclusions: The set of preferred similar terms that are relevant to a chart review task can be learned by leveraging the medical context of the task.
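The AUROC figures reported above can be read as the probability that a model scores a randomly chosen preferred term higher than a randomly chosen non-preferred term. The following is a minimal sketch of that pairwise definition, not the study's evaluation code; the function name `auroc` and the toy labels and scores are assumptions made for illustration.

```python
def auroc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = reviewer preferred the candidate term, 0 = did not.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.8]
print(auroc(labels, scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which gives a sense of the gap between 0.66 and 0.80.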


Chinese Web Page Classification Based on Vector Space Model

Advanced Materials Research ◽ 10.4028/www.scientific.net/amr.846-847.1801 ◽ 2013 ◽ Vol 846-847 ◽ pp. 1801-1804

Author(s): Li Wei ◽ Ling Zhang ◽ Hua Mei Li ◽ Xiao Zhou Chen

Keyword(s): Machine Learning ◽ Data Mining ◽ Vector Space ◽ Vector Space Model ◽ Research Area ◽ Supervised Machine Learning ◽ Web Page ◽ Web Page Classification ◽ Space Model ◽ Page Classification

Chinese web page classification is an active research area in data mining. This paper proposes a Chinese web page classification algorithm based on the vector space model. The algorithm uses supervised machine learning to implement a web page classifier. It combines text frequency with feature extraction methods and improves the traditional TF-IDF weighting formula. The results show that the classifier is feasible and effective.
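For reference, the traditional TF-IDF weighting that the abstract says was improved can be sketched as follows. This shows only the textbook formula (term frequency times the log of inverse document frequency), not the paper's improved variant, which the abstract does not specify. The function name `tfidf_vectors` and the toy documents are assumptions, and the input is assumed to be already segmented into tokens.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        vectors.append(
            {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return vectors


# Hypothetical pre-segmented pages.
pages = [["sports", "news", "sports"], ["finance", "news"]]
for vec in tfidf_vectors(pages):
    print(vec)
```

A term that occurs in every page, such as "news" here, receives a weight of zero, so it contributes nothing to distinguishing one class from another.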


Extended Vector Space Model with Semantic Relatedness on Java Archive Search Engine

Jurnal Teknik Informatika dan Sistem Informasi ◽ 10.28932/jutisi.v1i2.372 ◽ 2015 ◽ Vol 1 (2) ◽ Cited By ~ 2

Author(s): Oscar Karnalim

Keyword(s): Vector Space ◽ Search Engine ◽ Vector Space Model ◽ Semantic Relatedness ◽ Space Model


Aplikasi Deteksi Kemiripan Tugas Paper

Matrik Jurnal Manajemen Teknik Informatika dan Rekayasa Komputer ◽ 10.30812/matrik.v15i2.39 ◽ 2017 ◽ Vol 15 (2) ◽ pp. 5

Author(s): Anthony Anggrawan ◽ Azhari

Keyword(s): Information Retrieval ◽ Vector Space ◽ Vector Space Model ◽ Mean Average Precision ◽ Average Precision ◽ Information Searching ◽ Space Model ◽ Model Method

Information retrieval is the task of searching for documents that satisfy a user’s information need as expressed in a query. This research uses the Vector Space Model to determine the similarity percentage between student assignments. The application is implemented in PHP with a MySQL database. Results are presented as a ranking of documents by similarity to the query, with a mean average precision of 0.874. This value indicates how closely the application agrees with the experts’ assessment; it was obtained from an evaluation of 5 queries against 25 sample documents. The time needed to compute similarity grows with the number of assignments submitted.
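Mean average precision, the metric reported above, can be illustrated with a short sketch. This is not the paper's PHP implementation; the function names and the toy relevance flags are assumptions. The sketch also assumes that every relevant document appears somewhere in the ranking, so average precision is divided by the number of relevant documents retrieved.

```python
def average_precision(relevance):
    """relevance: 0/1 flags for one query's results, in ranked order."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(runs):
    """runs: one ranked relevance list per query."""
    return sum(average_precision(r) for r in runs) / len(runs)


# Hypothetical expert judgments for two queries.
runs = [[1, 0, 1], [0, 1]]
print(mean_average_precision(runs))
```

Relevant documents that are ranked early raise the score, so the metric rewards rankings that agree with the experts at the top of the list.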


Aplikasi Rekomendasi Buku Pada Katalog Perpustakaan Universitas Multimedia Nusantara Menggunakan Vector Space Model

Jurnal ULTIMATICS ◽ 10.31937/ti.v9i2.639 ◽ 2018 ◽ Vol 9 (2) ◽ pp. 97-105

Author(s): Richard Firdaus Oeyliawan ◽ Dennis Gunawan

Keyword(s): Vector Space ◽ Vector Space Model ◽ Vector Model ◽ Library Management ◽ Space Model ◽ Library Management System ◽ Index Terms ◽ Library Catalogue ◽ Language Sample ◽ F Measure

A library is a facility that provides information and knowledge resources and supports readers academically. Because libraries hold a huge number of books, readers often have difficulty finding the ones they need. Universitas Multimedia Nusantara uses the Senayan Library Management System (SLiMS) as its library catalogue. SLiMS has many features that help readers, but it lacks a recommendation feature that helps readers find books relevant to a specific book they have chosen. The application was developed using the Vector Space Model to represent documents as vectors. Recommendations are based on the similarity of the books’ descriptions. In the testing phase, using a one-language sample of relevant books, the F-Measure was 55% with a cosine similarity threshold of 0.1. The book descriptions and the variety of languages affect the F-Measure obtained. Index Terms: Book Recommendation, Porter Stemmer, SLiMS Universitas Multimedia Nusantara, TF-IDF, Vector Space Model
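The recommendation step described above, cosine similarity with a cut-off followed by an F-Measure evaluation, might look roughly like the sketch below. It is an illustration under assumptions, not the application's code: the function names, the toy term weights, and the choice of `>=` for the threshold comparison are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(target, catalogue, threshold=0.1):
    """Titles whose description vector is at least `threshold` similar to the target."""
    return [title for title, vec in catalogue.items() if cosine(target, vec) >= threshold]


def f_measure(recommended, relevant):
    """Harmonic mean of precision and recall over two collections of titles."""
    recommended, relevant = set(recommended), set(relevant)
    true_pos = len(recommended & relevant)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(recommended)
    recall = true_pos / len(relevant)
    return 2 * precision * recall / (precision + recall)


# Hypothetical description vectors (weights would normally come from TF-IDF).
chosen_book = {"python": 1.0, "data": 1.0}
catalogue = {
    "A": {"python": 1.0},
    "B": {"cooking": 1.0},
    "C": {"data": 0.1, "history": 1.0},
}
picks = recommend(chosen_book, catalogue)
print(picks, f_measure(picks, ["A", "C"]))
```

A low threshold such as 0.1 admits weakly related books, which tends to raise recall at the cost of precision; the F-Measure balances the two.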


First Movers and Follow-on Invention: Evidence from a Vector Space Model of Invention

SSRN Electronic Journal ◽ 10.2139/ssrn.3354530 ◽ 2019 ◽ Cited By ~ 1

Author(s): Kenneth A. Younge ◽ Jeffrey M. Kuhn

Keyword(s): Vector Space ◽ Vector Space Model ◽ Space Model


Topic detections in Arabic Dark websites using improved Vector Space Model

2012 4th Conference on Data Mining and Optimization (DMO) ◽ 10.1109/dmo.2012.6329790 ◽ 2012 ◽ Cited By ~ 9

Author(s): Hanan M. Alghamdi ◽ Ali Selamat

Keyword(s): Vector Space ◽ Vector Space Model ◽ Space Model


A relational vector-space model of information retrieval adapted to images

ACM SIGIR Forum ◽ 10.1145/1067268.1067292 ◽ 2005 ◽ Vol 39 (1) ◽ pp. 62-62

Author(s): Jean Martinet

Keyword(s): Information Retrieval ◽ Vector Space ◽ Vector Space Model ◽ Space Model


On Generalized Vector Space Model in Information Retrieval

Fundamenta Informaticae ◽ 10.3233/fi-1985-8207 ◽ 1985 ◽ Vol 8 (2) ◽ pp. 253-267

Author(s): S.K.M. Wong ◽ Wojciech Ziarko

Keyword(s): Information Retrieval ◽ Vector Space ◽ A Priori ◽ Vector Space Model ◽ Smart System ◽ Space Model ◽ Retrieval Systems ◽ Information Retrieval Systems ◽ Index Terms ◽ Minimal Modification

In information retrieval, it is common to model index terms and documents as vectors in a suitably defined vector space. The main difficulty with this approach is that the explicit representation of term vectors is not known a priori. For this reason, the vector space model adopted by Salton for the SMART system treats the terms as a set of orthogonal vectors. In such a model it is often necessary to adopt a separate, corrective procedure to take into account the correlations between terms. In this paper, we propose a systematic method (the generalized vector space model) to compute term correlations directly from the automatic indexing scheme. We also demonstrate how such correlations can be included with minimal modification in existing vector-based information retrieval systems.
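The effect of dropping the orthogonality assumption can be shown with a small sketch. It scores a query against a document as q^T G d, where G is a term-by-term correlation matrix. As a loudly labeled simplification, G is estimated here from the cosine between term columns of a document-term matrix; this stands in for, and is not, the construction proposed in the paper. All names and the toy matrix are assumptions.

```python
import math


def term_correlations(doc_term):
    """doc_term: rows are documents, columns are terms.
    Returns G, where G[i][j] is the cosine between term columns i and j."""
    n_terms = len(doc_term[0])
    cols = [[row[j] for row in doc_term] for j in range(n_terms)]
    norms = [math.sqrt(sum(x * x for x in col)) for col in cols]
    G = [[0.0] * n_terms for _ in range(n_terms)]
    for i in range(n_terms):
        for j in range(n_terms):
            if norms[i] and norms[j]:
                dot = sum(a * b for a, b in zip(cols[i], cols[j]))
                G[i][j] = dot / (norms[i] * norms[j])
    return G


def correlated_score(query, doc, G):
    """Unnormalized q^T G d; with G equal to the identity this is the plain dot product."""
    n = len(query)
    return sum(query[i] * G[i][j] * doc[j] for i in range(n) for j in range(n))


# Hypothetical vocabulary: [car, automobile, banana].
doc_term = [
    [1, 1, 0],  # mentions both car and automobile
    [0, 1, 0],  # mentions only automobile
    [0, 0, 1],  # mentions only banana
]
G = term_correlations(doc_term)
query = [1, 0, 0]  # "car"
print(correlated_score(query, doc_term[1], G))  # nonzero despite no shared term
print(correlated_score(query, doc_term[2], G))
```

Under orthogonal term vectors, the query "car" has zero similarity with the document that mentions only "automobile". With the correlation matrix in place, the existing dot-product machinery needs only the extra multiplication by G, which matches the abstract's point about minimal modification.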
