An empirical evaluation of supervised learning approaches in assigning diagnosis codes to electronic medical records

2015 ◽  
Vol 65 (2) ◽  
pp. 155-166 ◽  
Author(s):  
Ramakanth Kavuluru ◽  
Anthony Rios ◽  
Yuan Lu
Author(s):  
David Liebovitz

Electronic medical records provide potential benefits and also drawbacks. Potential benefits include increased patient safety and efficiency. Potential drawbacks include newly introduced errors and diminished workflow efficiency. In the patient safety context, medication errors account for significant patient harm. Electronic prescribing (e-prescribing) offers the promise of automated drug interaction and dosage verification. In addition, the process of enabling e-prescriptions also provides access to an often unrecognized benefit, that of viewing the dispensed medication history. This information is often critical to understanding patient symptoms. Obtaining significant value from electronic medical records requires use of standardized terminology for both targeted decision support and population-based management. Further, generating documentation for a billable encounter requires usage of proper codes. The emergence of International Classification of Diseases (ICD)-10 holds promise in facilitating identification of a more precise patient code while also presenting drawbacks given its complexity. This article will focus on elements of e-prescribing and use of structured chart content, including diagnosis codes as they relate to physician office practices.


2018 ◽  
Vol 2018 ◽  
pp. 1-8
Author(s):  
Saroochi Agarwal ◽  
Duc T. Nguyen ◽  
Justin D. Lew ◽  
Brenda Campbell ◽  
Edward A. Graviss

Background. The QuantiFERON Gold In-Tube (QFT-G) assay is used to identify individuals with tuberculosis infection and gives quantitative and qualitative results including positive, negative, or indeterminate results (that cannot be interpreted clinically). Several factors, including immunosuppression and preanalytical factors, have been suggested to be significantly associated with indeterminate QFT-G results. An online education program was designed and implemented to reduce the rate of indeterminate QFT-G test results at Houston Methodist Hospital (HMH). Methods. Data from patients’ electronic medical records having indeterminate QFT-G results between 01/2015 and 05/2016 at HMH in Houston, TX, were administratively extracted for (1) medical unit where QFT-G phlebotomy was performed, (2) demographics, and (3) ICD-9/10 diagnosis codes. Unit nurses identified with high proportions of indeterminate QFT-G results were emailed a link to an online pretest educational program with a QFT-G blood collection and handling presentation, and a posttest assessment. Results. Of the 332 nurses emailed, 94 (28.4%) voluntarily completed both tests within the 6-month time allotted. The nurses that completed the education program had a significantly higher posteducation test score than on the pretest (70.2% versus 55.3%, p<0.001, effect size=0.82). Improved posttest score was seen in 67.0% of participants. No reduction in the proportion of indeterminate test results was seen overall at HMH in the 6 months after education. Conclusions. A targeted education program was able to successfully increase nurses’ knowledge of blood collection and handling procedures for the QFT-G test, but no association was found between the improvement of posttest score and indeterminate QFT-G test results.


2021 ◽  
Vol 21 (S2) ◽  
Author(s):  
Ni Wang ◽  
Yanqun Huang ◽  
Honglei Liu ◽  
Zhiqiang Zhang ◽  
Lan Wei ◽  
...  

Background. A new learning-based patient similarity measurement was proposed to measure patients’ similarity for heterogeneous electronic medical record (EMR) data. Methods. We first calculated feature-level similarities according to the features’ attributes. A domain expert provided patient similarity scores of 30 randomly selected patients. These similarity scores and the feature-level similarities for the 30 patients comprised the labeled sample set, which was used by the semi-supervised learning algorithm to learn the patient-level similarities for all patients. Then we used the k-nearest neighbor (kNN) classifier to predict four liver conditions. The predictive performances were compared in four different situations. We also compared the performances of personalized kNN models and other machine learning models. We assessed the predictive performances by the area under the receiver operating characteristic curve (AUC), F1-score, and cross-entropy (CE) loss. Results. As the size of the random training samples increased, the kNN models using the learned patient similarity to select near neighbors consistently outperformed those using the Euclidean distance (all P values < 0.001). The kNN models using the learned patient similarity to identify the top k nearest neighbors from the random training samples also had a higher best performance (AUC: 0.95 vs. 0.89, F1-score: 0.84 vs. 0.67, and CE loss: 1.22 vs. 1.82) than those using the Euclidean distance. As the size of the similar training samples (the most similar samples as determined by the learned patient similarity) increased, the performance of kNN models using the simple Euclidean distance to select the near neighbors degraded gradually. When the roles of the Euclidean distance and the learned patient similarity in selecting the near neighbors and the similar training samples were exchanged, the performance of the kNN models gradually increased. These two kinds of kNN models had the same best performance: AUC 0.95, F1-score 0.84, and CE loss 1.22. Among the four reference models, the highest AUC and F1-score were 0.94 and 0.80, respectively, both lower than those of the simple and similarity-based kNN models. Conclusions. This learning-based method opened an opportunity for similarity measurement based on heterogeneous EMR data and supported the secondary use of EMR data.
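The comparison in the abstract above, kNN with a learned patient similarity versus kNN with the Euclidean distance, can be illustrated with a minimal sketch. It is not the authors' implementation: `weighted_similarity` is a hypothetical stand-in for a learned patient-level similarity (a weighted sum of per-feature similarities with hand-set weights), and the toy patients and labels are invented for illustration only.

```python
import math
from collections import Counter


def euclidean_similarity(a, b):
    """Negative Euclidean distance, so that a larger value means more similar."""
    return -math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def weighted_similarity(weights):
    """Hypothetical stand-in for a learned similarity.

    Each feature contributes 1 / (1 + |difference|), scaled by its weight.
    In the abstract the patient-level similarity is learned from expert
    scores; here the weights are simply supplied by the caller.
    """
    def sim(a, b):
        return sum(w / (1.0 + abs(x - y)) for w, x, y in zip(weights, a, b))
    return sim


def knn_predict(query, train_x, train_y, k, similarity):
    """Majority vote among the k training samples most similar to the query."""
    ranked = sorted(
        range(len(train_x)),
        key=lambda i: similarity(query, train_x[i]),
        reverse=True,
    )
    votes = Counter(train_y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data (invented): feature 0 is informative, feature 1 is large-scale noise.
train_x = [(1.0, 100.0), (1.2, 0.0), (5.0, 95.0), (5.2, 5.0)]
train_y = ["A", "A", "B", "B"]
query = (1.1, 95.0)

by_distance = knn_predict(query, train_x, train_y, 1, euclidean_similarity)
by_learned = knn_predict(query, train_x, train_y, 1, weighted_similarity((1.0, 0.0)))
print(by_distance, by_learned)
```

In this toy setting the Euclidean neighbor is dominated by the noisy feature, while the weighted similarity ignores it; that is the intuition behind selecting neighbors with a similarity fitted to the data rather than a raw distance.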


Author(s):  
Xiangrui Cai ◽  
Jinyang Gao ◽  
Kee Yuan Ngiam ◽  
Beng Chin Ooi ◽  
Ying Zhang ◽  
...  

Embeddings of medical concepts such as medication, procedure and diagnosis codes in Electronic Medical Records (EMRs) are central to healthcare analytics. Previous work on medical concept embedding treats medical concepts as words and EMRs as documents. Such models, however, miss the temporal nature of EMR data. On the one hand, two consecutive medical concepts are not necessarily close in time, but the correlation between them can be revealed by the time gap. On the other hand, the temporal scopes of medical concepts often vary greatly (e.g., common cold and diabetes). In this paper, we propose to incorporate temporal information when embedding medical codes. Based on the Continuous Bag-of-Words model, we employ the attention mechanism to learn a "soft" time-aware context window for each medical concept. Experiments on public and proprietary datasets through clustering and nearest neighbour search tasks demonstrate the effectiveness of our model, showing that it outperforms five state-of-the-art baselines.
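The idea of a "soft" time-aware context window can be sketched as follows. The abstract does not give the attention formula, so the scoring function here (a softmax over negative time gaps divided by a per-concept `scope`) is an assumption chosen for illustration, and `soft_window_weights` and `context_vector` are hypothetical names. Plain Continuous Bag-of-Words would average the context embeddings uniformly; the sketch replaces that average with time-dependent weights.

```python
import math


def soft_window_weights(time_gaps, scope):
    """Attention weights over context concepts, based on their time gaps.

    Assumed scoring: score = -|gap| / scope, followed by a softmax.
    A small scope concentrates weight on temporally close concepts;
    a large scope spreads weight over long gaps.
    """
    scores = [-abs(gap) / scope for gap in time_gaps]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_vector(context_embeddings, time_gaps, scope):
    """Attention-weighted average of the context embeddings."""
    weights = soft_window_weights(time_gaps, scope)
    dim = len(context_embeddings[0])
    return [
        sum(w * emb[d] for w, emb in zip(weights, context_embeddings))
        for d in range(dim)
    ]


# Two context concepts recorded 1 day and 30 days from the target concept.
gaps_in_days = [1.0, 30.0]
narrow = soft_window_weights(gaps_in_days, scope=2.0)    # short-lived concept
wide = soft_window_weights(gaps_in_days, scope=365.0)    # long-lived concept
print(narrow, wide)
```

With a narrow scope nearly all weight falls on the concept one day away, while a wide scope weights both concepts almost equally, which mirrors the contrast the abstract draws between a common cold and diabetes. In a trained model the scope (or the attention parameters it stands in for) would be learned per concept rather than set by hand.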


2019 ◽  
Author(s):  
Ying Shen ◽  
Buzhou Tang ◽  
Yaliang Li ◽  
Nan Du

BACKGROUND Severity classification of diseases and symptoms in electronic medical records (EMRs) is very important in medicine and the life sciences, as it makes medical documents easier for physicians to understand. However, existing methods perform symptom name recognition and severity assessment separately, which requires very large amounts of expert time and effort and neglects the rich correlations in information between the tasks. OBJECTIVE The task of predicting symptom name and severity simultaneously from informative but noisy EMRs is important yet challenging in practice. There is a strong motivation to develop new methods that can effectively perform these two tasks. METHODS In this paper, we explore multi-task learning approaches to integrate symptom name recognition and severity assessment in a unified framework, motivated by the fact that these two tasks can benefit each other. To learn the correlation between these two tasks, we propose a novel cluster-based knowledge-aware learning scheme to reduce semantic ambiguity for name recognition and enrich sentence representation learning for severity assessment. RESULTS Symptom classification emerges from the cooperation of several machine learning modes and from the ontology we have developed and released. The experiments performed on a synthetic dataset demonstrate the effectiveness of the proposed method and the improved performance of both tasks. We also consider a practical testbed application (symptom severity assessment and diagnosis inference) to test and validate our method and assess its impact in real-world clinical settings. CONCLUSIONS Our proposed model can provide symptom knowledge and implications for clinicians and patients as a reference and has remarkable applicability and generality, outperforming competitors and defining the state-of-the-art. The gastrointestinal ontology and severity assessment corpus are accessible via: https://github.com/shenyingpku/MTL CLINICALTRIAL N/A


2014 ◽  
Author(s):  
C. McKenna ◽  
B. Gaines ◽  
C. Hatfield ◽  
S. Helman ◽  
L. Meyer ◽  
...  

Diabetes ◽  
2020 ◽  
Vol 69 (Supplement 1) ◽  
pp. 908-P
Author(s):  
SOSTENES MISTRO ◽  
THALITA V.O. AGUIAR ◽  
VANESSA V. CERQUEIRA ◽  
KELLE O. SILVA ◽  
JOSÉ A. LOUZADO ◽  
...  
