Representing Evidence from Biomedical Literature for Clinical Decision Support: Challenges on Semantic Computing and Biomedicine

Author(s): William Hsu
2021
Author(s): Sajit Kumar, Alicia Nanelia Tan Li Shi, Ragunathan Mariappan, Adithya Rajagopal, Vaibhav Rajan

BACKGROUND Patient representation learning aims to learn features, also called representations, from input sources automatically, often in an unsupervised manner, for use in predictive models. This obviates the need for cumbersome, time- and resource-intensive manual feature engineering, especially from unstructured data such as text, images or graphs. Most previous techniques have used neural-network-based autoencoders to learn patient representations, primarily from clinical notes in Electronic Medical Records (EMR). Knowledge Graphs (KG), with clinical entities as nodes and their relations as edges, can be extracted automatically from biomedical literature and provide complementary information to EMR data that have been found to provide valuable predictive signals.
OBJECTIVE We evaluate the efficacy of Collective Matrix Factorization (CMF), both classical variants and a recent neural architecture called Deep CMF (DCMF), in integrating heterogeneous data sources from EMR and KG to obtain patient representations for clinical decision support tasks.
METHODS Using a recent formulation of obtaining graph representations through matrix factorization, within the context of CMF, we infuse auxiliary information during patient representation learning. We also extend the DCMF architecture to create a task-specific end-to-end model that simultaneously learns effective patient representations and predicts outcomes. We compare the efficacy of such a model to that of first learning unsupervised representations and then independently learning a predictive model. We evaluate patient representation learning using CMF-based methods and autoencoders for two clinical decision support tasks on a large EMR dataset.
RESULTS Our experiments show that DCMF provides a seamless way to integrate multiple sources of data to obtain patient representations, in both unsupervised and supervised settings. Its performance in single-source settings is comparable to that of previous autoencoder-based representation learning methods. When DCMF is used to obtain representations from a combination of EMR and KG, where most previous autoencoder-based methods cannot be used directly, its performance is superior to that of previous non-neural methods for CMF. Infusing information from KGs into patient representations using DCMF was found to improve downstream predictive performance.
CONCLUSIONS Our experiments indicate that DCMF is a versatile model that can be used to obtain representations from single and multiple data sources, and to combine information from EMR data and Knowledge Graphs. Further, DCMF can be used to learn representations in both supervised and unsupervised settings. Thus, DCMF offers an effective way of integrating heterogeneous data sources and infusing auxiliary knowledge into patient representations.
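To make the idea of sharing factors across data sources concrete, the following is a minimal sketch of classical collective matrix factorization, not the DCMF architecture evaluated in the abstract. It assumes two toy matrices that are not from the paper: `X` (patients by clinical concepts, standing in for EMR data) and `Y` (clinical concepts by KG entities, standing in for a knowledge graph). The function name `collective_mf`, the squared-error loss, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

def collective_mf(X, Y, rank=3, lr=0.05, reg=0.01, epochs=300, seed=0):
    """Jointly factorize X ~ U V^T and Y ~ V W^T with a shared concept factor V.

    X: patients x concepts (hypothetical EMR matrix)
    Y: concepts x entities (hypothetical KG matrix)
    Returns the factors and the reconstruction loss recorded at each epoch.
    """
    rng = np.random.default_rng(seed)
    n_patients, n_concepts = X.shape
    assert Y.shape[0] == n_concepts, "X and Y must share the concept dimension"
    n_entities = Y.shape[1]

    # Small random initialization of the three factor matrices.
    U = 0.1 * rng.standard_normal((n_patients, rank))   # patient representations
    V = 0.1 * rng.standard_normal((n_concepts, rank))   # shared concept factors
    W = 0.1 * rng.standard_normal((n_entities, rank))   # KG entity factors

    losses = []
    for _ in range(epochs):
        err_x = U @ V.T - X   # reconstruction error on the EMR matrix
        err_y = V @ W.T - Y   # reconstruction error on the KG matrix
        losses.append(float((err_x ** 2).sum() + (err_y ** 2).sum()))

        # Gradients of 0.5 * squared error plus L2 regularization.
        grad_U = err_x @ V + reg * U
        grad_V = err_x.T @ U + err_y @ W + reg * V   # V receives signal from both sources
        grad_W = err_y.T @ V + reg * W

        U -= lr * grad_U
        V -= lr * grad_V
        W -= lr * grad_W
    return U, V, W, losses

# Toy binary data: 6 patients, 5 concepts, 4 KG entities (all synthetic).
data_rng = np.random.default_rng(42)
X = (data_rng.random((6, 5)) < 0.5).astype(float)
Y = (data_rng.random((5, 4)) < 0.5).astype(float)

U, V, W, losses = collective_mf(X, Y)
print("patient representation shape:", U.shape)
print("loss decreased:", losses[-1] < losses[0])
```

Because `V` appears in both reconstructions, information in `Y` influences the patient factors `U` indirectly, which is the sense in which auxiliary knowledge is infused in this simplified setting. The rows of `U` could then be passed to any downstream classifier.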


2017, Vol 1 (1), pp. 49-60
Author(s): Danchen Zhang, Daqing He

Abstract: With the vast amount of biomedical literature available online, doctors can consult the literature before making clinical decisions, but they face the daunting task of finding needles in haystacks. In this situation, it would be of great use to doctors if an effective clinical decision support system were available to generate accurate queries and return a manageable number of highly useful articles. Existing studies have shown the usefulness of patients' diagnosis information in supporting effective retrieval of relevant literature, but such diagnosis information is often missing. Furthermore, existing diagnosis prediction systems mainly focus on predicting a small range of diseases from well-formatted features, and it remains a great challenge to perform large-scale automatic diagnosis prediction based on noisy patient medical records. In this paper, we propose automatic diagnosis prediction methods for enhancing retrieval in a clinical decision support system, where the prediction is based on evidence automatically collected from publicly accessible online knowledge bases such as Wikipedia and the Semantic MEDLINE Database (SemMedDB). The assumption is that relevant diseases and their corresponding symptoms co-occur more frequently in these knowledge bases. Our methods use a Markov Random Field (MRF) model to identify diagnosis candidates in the knowledge bases, and their performance was evaluated using test collections from the Clinical Decision Support (CDS) track in TREC 2014, 2015, and 2016. The results show that our methods can automatically predict diagnoses with about 75% accuracy, and that such predictions can significantly improve the retrieval of related biomedical literature. Our methods generate retrieval results comparable to those of state-of-the-art methods, which use much more complicated techniques and some manually crafted medical knowledge. One possible direction for future work is to apply these methods in collaboration with practicing doctors. Note: a portion of this work was published at iConference 2017 as a poster, which won the best poster award. This paper greatly expands the research scope of that poster.
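The co-occurrence assumption stated in the abstract can be illustrated with a very small sketch. This is a plain counting heuristic, not the Markov Random Field model the authors use, and the knowledge-base entries, term lists, and function name `rank_diagnoses` are all hypothetical.

```python
from collections import Counter

def rank_diagnoses(symptoms, kb_docs, diseases):
    """Rank candidate diseases by how often they co-occur with the given symptoms.

    symptoms: set of symptom terms extracted from a patient record
    kb_docs:  iterable of term sets, one per knowledge-base entry (hypothetical)
    diseases: list of candidate disease names
    Returns (disease, score) pairs with score > 0, best first.
    """
    symptoms = set(symptoms)
    scores = Counter()
    for doc in kb_docs:
        terms = set(doc)
        matched = len(terms & symptoms)   # symptoms mentioned in this entry
        if matched == 0:
            continue
        for disease in diseases:
            if disease in terms:
                scores[disease] += matched
    # Sort by descending score, then by name so ties are deterministic.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Toy knowledge base: each entry is the set of terms found in one article.
kb_docs = [
    {"influenza", "fever", "cough", "myalgia"},
    {"influenza", "fever", "headache"},
    {"migraine", "headache", "nausea"},
    {"pneumonia", "fever", "cough", "dyspnea"},
]
diseases = ["influenza", "migraine", "pneumonia"]

ranked = rank_diagnoses({"fever", "cough"}, kb_docs, diseases)
print(ranked)
```

In a retrieval setting, the top-ranked candidates could be appended to the query as expansion terms. The scoring here ignores term frequency, negation, and dependencies between terms, all of which a richer model would need to handle.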


2013, Vol 46 (2), pp. 52
Author(s): CHRISTOPHER NOTTE, NEIL SKOLNIK

1993, Vol 32 (01), pp. 12-13
Author(s): M. A. Musen

Abstract: Response to Heathfield HA, Wyatt J. Philosophies for the design and development of clinical decision-support systems. Meth Inform Med 1993; 32: 1-8.


2006, Vol 45 (05), pp. 523-527
Author(s): A. Abu-Hanna, B. Nannings

Summary
Objectives: Decision Support Telemedicine Systems (DSTSs) are at the intersection of two disciplines: telemedicine and clinical decision support systems (CDSSs). The objective of this paper is to provide a set of characterizing properties for DSTSs. This characterizing property set (CPS) can be used for typing, classifying and clustering DSTSs.
Methods: We performed a systematic keyword-based literature search to identify candidate characterizing properties. We selected a subset of candidates and refined them by assessing their potential in order to obtain the CPS.
Results: The CPS consists of 14 properties, which can be used for the uniform description and typing of applications of DSTSs. The properties are grouped in three categories that we refer to as the problem dimension, process dimension, and system dimension. We provide CPS instantiations for three prototypical applications.
Conclusions: The CPS includes important properties for typing DSTSs, focusing on aspects of communication for the telemedicine part and on aspects of decision-making for the CDSS part. The CPS provides users with tools for uniformly describing DSTSs.

