Multi-layer Representation Learning and Its Application to Electronic Health Records

Author(s):  
Shan Yang ◽  
Xiangwei Zheng ◽  
Cun Ji ◽  
Xuanchi Chen
2020 ◽  
Vol 3 (1) ◽  
Author(s):  
Isotta Landi ◽  
Benjamin S. Glicksberg ◽  
Hao-Chih Lee ◽  
Sarah Cherng ◽  
Giulia Landi ◽  
...  

Author(s):  
Tong Ruan ◽  
Liqi Lei ◽  
Yangming Zhou ◽  
Jie Zhai ◽  
Le Zhang ◽  
...  

Abstract

Background: Electronic health records (EHRs) offer opportunities to improve patient care and facilitate clinical research. However, applying EHRs poses many challenges, including temporality, high dimensionality, sparseness, noise, random error and systematic bias. In particular, traditional machine learning methods struggle to exploit temporal information, even though the sequential information in EHRs is very useful.

Method: In this paper, we propose a general-purpose patient representation learning approach to summarize sequential EHRs. Specifically, a recurrent neural network based denoising autoencoder (RNN-DAE) is employed to encode the in-hospital records of each patient into a low-dimensional dense vector.

Results: Based on EHR data collected from Shuguang Hospital affiliated to Shanghai University of Traditional Chinese Medicine, we experimentally evaluate our proposed RNN-DAE method on both a mortality prediction task and a comorbidity prediction task. Extensive experimental results show that our proposed RNN-DAE method outperforms existing methods. In addition, we apply the "Deep Feature" representation produced by our proposed RNN-DAE method to track similar patients with t-SNE, which also yields some interesting observations.

Conclusion: We propose an effective unsupervised RNN-DAE method to summarize patient sequential information in EHR data. Our proposed RNN-DAE method is useful for both mortality prediction and comorbidity prediction.
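The idea of a recurrent denoising autoencoder described in the abstract above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the masking noise, the plain tanh RNN cells, the layer sizes, the squared-error loss and the synthetic patient are all assumptions made for illustration, and the weights are left untrained.

```python
import numpy as np

def corrupt(visits, drop_prob, rng):
    """Masking noise: zero each feature independently with probability drop_prob."""
    keep = rng.random(visits.shape) >= drop_prob
    return visits * keep

def rnn_encode(visits, W_in, W_rec, b):
    """Run a plain tanh RNN over the visit sequence and return the last hidden state."""
    h = np.zeros(W_rec.shape[0])
    for x in visits:
        h = np.tanh(W_in @ x + W_rec @ h + b)
    return h

def rnn_decode(h, n_steps, W_dec, W_out, b_dec):
    """Unroll a tanh RNN from the patient vector and emit one sigmoid output per step."""
    outputs = []
    for _ in range(n_steps):
        h = np.tanh(W_dec @ h + b_dec)
        outputs.append(1.0 / (1.0 + np.exp(-(W_out @ h))))
    return np.stack(outputs)

rng = np.random.default_rng(0)
n_visits, n_features, n_hidden = 5, 12, 4

# Hypothetical patient: 5 visits, each a sparse binary vector over 12 codes.
visits = (rng.random((n_visits, n_features)) < 0.3).astype(float)

# Randomly initialised, untrained weights (the training loop is omitted).
W_in = rng.normal(scale=0.1, size=(n_hidden, n_features))
W_rec = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
W_dec = rng.normal(scale=0.1, size=(n_hidden, n_hidden))
W_out = rng.normal(scale=0.1, size=(n_features, n_hidden))
b = np.zeros(n_hidden)
b_dec = np.zeros(n_hidden)

# Encode the corrupted sequence into one low-dimensional dense vector.
noisy = corrupt(visits, drop_prob=0.2, rng=rng)
patient_vector = rnn_encode(noisy, W_in, W_rec, b)

# Denoising objective: reconstruct the clean sequence from the noisy encoding.
reconstruction = rnn_decode(patient_vector, n_visits, W_dec, W_out, b_dec)
loss = np.mean((reconstruction - visits) ** 2)
```

After training, `patient_vector` would play the role of the fixed-length patient representation that downstream tasks consume; here it only demonstrates the data flow.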


2021 ◽  
Author(s):  
Kyunghoon Hur ◽  
Jiyoung Lee ◽  
Jungwoo Oh ◽  
Wesley Price ◽  
Young-Hak Kim ◽  
...  

BACKGROUND: The substantial increase in the use of Electronic Health Records (EHRs) has opened new frontiers for predictive healthcare. However, while EHR systems are nearly ubiquitous, they lack a unified code system for representing medical concepts. The heterogeneous formats of EHRs present a substantial barrier to training and deploying state-of-the-art deep learning models at scale.

OBJECTIVE: The aim of this study is to suggest a novel text embedding approach to overcome the heterogeneity of EHR structure across different EHR systems.

METHODS: We introduce Description-based Embedding (DescEmb), a code-agnostic, description-based representation learning framework for predictive modeling on EHRs. DescEmb takes advantage of the flexibility of neural language understanding models while maintaining a neutral approach that can be combined with prior frameworks for task-specific representation learning or predictive modeling.

RESULTS: Based on five prediction tasks with two heterogeneous EHR datasets, DescEmb achieves comparable or superior performance to the traditional code-based embedding approach, especially under zero-shot and few-shot transfer learning scenarios. We also demonstrate that DescEmb enables us to train a single model on a pooled dataset from heterogeneous EHR systems and achieve the same, if not better, performance than training separate models for each EHR system.

CONCLUSIONS: Based on these promising results, we believe the description-based embedding approach to EHRs will open a new direction for large-scale predictive modeling in healthcare.
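The contrast between code-based and description-based embedding in the abstract above can be sketched as follows. This is only an illustration of the general idea, not the DescEmb model: a hashed bag-of-words encoder stands in for the neural text encoder, and the code identifiers `SYS_A_001` and `SYS_B_977`, the descriptions and the embedding size are hypothetical.

```python
import hashlib
import numpy as np

DIM = 16  # assumed embedding size

def embed_description(text, dim=DIM):
    """Stand-in text encoder: average of deterministic per-word random vectors."""
    words = text.lower().split()
    vec = np.zeros(dim)
    for word in words:
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        seed = int.from_bytes(digest[:4], "big")
        vec += np.random.default_rng(seed).normal(size=dim)
    return vec / max(len(words), 1)

# Two hypothetical EHR systems that name the same concept with different codes.
system_a = {"SYS_A_001": "Heart failure, unspecified"}
system_b = {"SYS_B_977": "heart failure, unspecified"}

# Code-based embedding: each system has its own independently initialised
# lookup table, so the two codes start out with unrelated vectors.
code_vec_a = np.random.default_rng(1).normal(size=DIM)
code_vec_b = np.random.default_rng(2).normal(size=DIM)

# Description-based embedding: the vector depends only on the text, so the
# same concept gets the same vector regardless of the code system.
desc_vec_a = embed_description(system_a["SYS_A_001"])
desc_vec_b = embed_description(system_b["SYS_B_977"])
```

Because the description vectors coincide across systems, a model trained on one system's records can in principle be applied to, or pooled with, the other's without a code mapping step.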


2020 ◽  
Vol 34 (01) ◽  
pp. 606-613
Author(s):  
Edward Choi ◽  
Zhen Xu ◽  
Yujia Li ◽  
Michael Dusenberry ◽  
Gerardo Flores ◽  
...  

Effective modeling of electronic health records (EHR) is rapidly becoming an important topic in both academia and industry. A recent study showed that using the graphical structure underlying EHR data (e.g., the relationship between diagnoses and treatments) improves the performance of prediction tasks such as heart failure prediction. However, EHR data do not always contain complete structure information. Moreover, in claims data, structure information is completely unavailable to begin with. Under such circumstances, can we still do better than treating EHR data as a flat-structured bag of features? In this paper, we study the possibility of jointly learning the hidden structure of EHR data while performing supervised prediction tasks on it. Specifically, we argue that the Transformer is a suitable base model for learning the hidden EHR structure, and we propose the Graph Convolutional Transformer, which uses data statistics to guide the structure learning process. The proposed model consistently outperformed previous approaches on both synthetic data and publicly available EHR data across various prediction tasks, such as graph reconstruction and readmission prediction, indicating that it can serve as an effective general-purpose representation learning algorithm for EHR data.
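One plausible reading of "uses data statistics to guide the structure learning process" is that attention weights between medical codes are pulled toward co-occurrence statistics computed from the data. The NumPy sketch below illustrates that reading only; it is not the Graph Convolutional Transformer as published. The smoothed co-occurrence prior, the single untrained attention head, the KL guidance term and the synthetic encounter matrix are all assumptions.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
n_encounters, n_codes, d = 200, 6, 8

# Hypothetical binary encounter-by-code matrix (1 = code recorded in encounter).
X = (rng.random((n_encounters, n_codes)) < 0.4).astype(float)

# Data statistic: add-one smoothed co-occurrence counts, row-normalised so that
# row i is a distribution over the codes that appear together with code i.
cooc = X.T @ X + 1.0
prior = cooc / cooc.sum(axis=1, keepdims=True)

# Single-head self-attention over untrained code embeddings.
E = rng.normal(size=(n_codes, d))
W_q = rng.normal(scale=0.1, size=(d, d))
W_k = rng.normal(scale=0.1, size=(d, d))
attn = softmax((E @ W_q) @ (E @ W_k).T / np.sqrt(d))

# Guidance term: mean row-wise KL(prior || attn). Adding this to the supervised
# loss would push the learned attention toward the observed statistics.
kl_guidance = np.mean(np.sum(prior * (np.log(prior) - np.log(attn)), axis=1))
```

In a trained model this term would be weighted against the prediction loss, so the attention can still depart from the raw statistics where that helps the task.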

