Patient Representation Learning from Heterogeneous Data Sources and Knowledge Graphs using Deep Collective Matrix Factorization: Evaluation Study (Preprint)

2021 ◽  
Author(s):  
Sajit Kumar ◽  
Alicia Nanelia Tan Li Shi ◽  
Ragunathan Mariappan ◽  
Adithya Rajagopal ◽  
Vaibhav Rajan

BACKGROUND Patient Representation Learning aims to learn features, also called representations, from input sources automatically, often in an unsupervised manner, for use in predictive models. This obviates the need for cumbersome, time- and resource-intensive manual feature engineering, especially from unstructured data such as text, images or graphs. Most previous techniques have used neural network based autoencoders to learn patient representations, primarily from clinical notes in Electronic Medical Records (EMR). Knowledge Graphs (KG), with clinical entities as nodes and their relations as edges, can be extracted automatically from biomedical literature, and provide complementary information to EMR data that have been found to provide valuable predictive signals. OBJECTIVE We evaluate the efficacy of Collective Matrix Factorization (CMF) - both classical variants and a recent neural architecture called Deep CMF (DCMF) - in integrating heterogeneous data sources from EMR and KG to obtain patient representations for Clinical Decision Support Tasks. METHODS Using a recent formulation of obtaining graph representations through matrix factorization, within the context of CMF, we infuse auxiliary information during patient representation learning. We also extend the DCMF architecture to create a task-specific end-to-end model that learns to simultaneously find effective patient representations and predict. We compare the efficacy of such a model to that of first learning unsupervised representations and then independently learning a predictive model. We evaluate patient representation learning using CMF-based methods and autoencoders for two clinical decision support tasks on a large EMR dataset. RESULTS Our experiments show that DCMF provides a seamless way to integrate multiple sources of data to obtain patient representations, both in unsupervised and supervised settings. 
Its performance in single-source settings is comparable to that of previous autoencoder-based representation learning methods. When DCMF is used to obtain representations from a combination of EMR and KG, where most previous autoencoder-based methods cannot be used directly, its performance is superior to that of previous non-neural methods for CMF. Infusing information from KGs into patient representations using DCMF was found to improve downstream predictive performance. CONCLUSIONS Our experiments indicate that DCMF is a versatile model that can be used to obtain representations from single and multiple data sources, and to combine information from EMR data and Knowledge Graphs. Further, DCMF can be used to learn representations in both supervised and unsupervised settings. Thus, DCMF offers an effective way of integrating heterogeneous data sources and infusing auxiliary knowledge into patient representations.
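The idea of collective matrix factorization referred to above can be illustrated with a minimal sketch. It is not the authors' DCMF architecture: it is a classical, linear CMF fitted by plain gradient descent, and the matrix names (`X` for a patient-by-concept EMR matrix, `G` for a concept-by-concept KG adjacency matrix), the toy data, and all hyperparameters are assumptions made for illustration only. The point it shows is that the concept factors `C` are shared by both reconstructions, which is how graph information can influence the patient representations `P`.

```python
import numpy as np

def cmf(X, G, rank=4, lr=0.01, reg=0.1, steps=2000, seed=0):
    """Jointly factorize X ~ P @ C.T and G ~ C @ K.T with shared factors C.

    X: patients x concepts (hypothetical EMR matrix)
    G: concepts x concepts (hypothetical KG adjacency matrix)
    """
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((X.shape[0], rank))  # patient factors
    C = 0.1 * rng.standard_normal((X.shape[1], rank))  # shared concept factors
    K = 0.1 * rng.standard_normal((G.shape[1], rank))  # KG context factors
    for _ in range(steps):
        err_x = P @ C.T - X  # residual of the EMR reconstruction
        err_g = C @ K.T - G  # residual of the KG reconstruction
        grad_p = err_x @ C + reg * P
        grad_c = err_x.T @ P + err_g @ K + reg * C  # C receives both signals
        grad_k = err_g.T @ C + reg * K
        P -= lr * grad_p
        C -= lr * grad_c
        K -= lr * grad_k
    return P, C, K

def recon_error(X, G, P, C, K):
    """Sum of squared reconstruction errors over both matrices."""
    return float(np.sum((P @ C.T - X) ** 2) + np.sum((C @ K.T - G) ** 2))

# Toy data (randomly generated, not from any real EMR or KG).
rng = np.random.default_rng(1)
X = (rng.random((20, 6)) < 0.4).astype(float)
A = (rng.random((6, 6)) < 0.3).astype(float)
G = np.maximum(A, A.T)  # symmetric adjacency

P, C, K = cmf(X, G)
baseline = float(np.sum(X ** 2) + np.sum(G ** 2))  # error of an all-zero fit
fitted = recon_error(X, G, P, C, K)
```

Each row of `P` could then serve as a patient representation for a downstream classifier. A neural variant would replace the linear factors with learned encoders; that step is not shown here.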

To keep pace with developments in medical informatics, health and medical data are being collected continually. However, owing to the diversity of its categories and sources, medical data has become so complicated in many hospitals that it now needs a Clinical Decision Support (CDS) system for its management. To effectively utilize the accumulating health data, we propose a CDS framework that can integrate heterogeneous health data from different sources, such as laboratory test results, basic information of patients, and health records, into a consolidated representation of features of all patients. Using the electronic health medical data so created, multi-label classification was employed to recommend a list of diseases and thus assist physicians in diagnosing or treating their patients' health issues more efficiently. Once the physician diagnoses the disease of a patient, the next step is to consider the likely complications of that disease, which can lead to more diseases.


2018 ◽  
Vol 22 (6) ◽  
pp. 1824-1833 ◽  
Author(s):  
Mengxing Huang ◽  
Huirui Han ◽  
Hao Wang ◽  
Lefei Li ◽  
Yu Zhang ◽  
...  

2018 ◽  
Vol 27 (01) ◽  
pp. 016-024 ◽  
Author(s):  
Prabhu Shankar ◽  
Nick Anderson

Introduction: Clinical decision support science is expanding to include integration from broader and more varied data sources, diverse platforms and delivery modalities, and is responding to emerging regulatory guidelines and increased interest from industry. Objective: Evaluate key advances and challenges of accessing, sharing, and managing data from multiple sources for development and implementation of Clinical Decision Support (CDS) systems in 2016-2017. Methods: Assessment of literature and scientific conference proceedings, current and pending policy development, and review of commercial applications nationally and internationally. Results: CDS research is approaching multiple landmark points driven by commercialization interests, emerging regulatory policy, and increased public awareness. However, the availability of patient-related “Big Data” sources from genomics and mobile health, expanded privacy considerations, applications of service-based computational techniques and tools, the emergence of “app” ecosystems, and evolving patient-centric approaches reflect the distributed, complex, and uneven maturity of the CDS landscape. Nonetheless, the field of CDS is yet to mature. The lack of standards and CDS-specific policies from regulatory bodies that address the privacy and safety concerns of data and knowledge sharing to support CDS development may continue to slow down the broad CDS adoption within and across institutions. Conclusion: Partnerships with Electronic Health Record and commercial CDS vendors, policy makers, standards development agencies, clinicians, and patients are needed to see CDS deployed in the evolving learning health system.


2017 ◽  
Vol 1 (1) ◽  
pp. 49-60 ◽  
Author(s):  
Danchen Zhang ◽  
Daqing He

Abstract With the vast amount of biomedical literature available online, doctors have the benefit of consulting the literature before making clinical decisions, but they face the daunting task of finding needles in haystacks. In this situation, it would be of great use to doctors if an effective clinical decision support system were available to generate accurate queries and return a manageable number of highly useful articles. Existing studies have shown the usefulness of patients' diagnosis information in supporting effective retrieval of relevant literature, but such diagnosis information is missing in most cases. Furthermore, existing diagnosis prediction systems mainly focus on predicting a small range of diseases with well-formatted features, and it is still a great challenge to perform large-scale automatic diagnosis prediction based on noisy medical records of the patient. In this paper, we propose automatic diagnosis prediction methods for enhancing retrieval in a clinical decision support system, where the prediction is based on evidence automatically collected from publicly accessible online knowledge bases such as Wikipedia and the Semantic MEDLINE Database (SemMedDB). The assumption is that relevant diseases and their corresponding symptoms co-occur more frequently in these knowledge bases. Our methods use a Markov Random Field (MRF) model to identify diagnosis candidates in the knowledge bases, and their performance was evaluated using test collections from the Clinical Decision Support (CDS) track in TREC 2014, 2015, and 2016. The results show that our methods can automatically predict diagnoses with about 75% accuracy, and that such predictions can significantly improve the retrieval of related biomedical literature. Our methods generate retrieval results comparable to those of state-of-the-art methods, which rely on much more complicated techniques and some manually crafted medical knowledge. One direction for future work is to apply these methods in collaboration with real doctors. Notes: a portion of this work was published in iConference 2017 as a poster, which won the best poster award. This paper greatly expands the research scope over that poster.
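The co-occurrence assumption stated in this abstract can be illustrated with a deliberately simple sketch. It is not the MRF model the authors describe: it only sums symptom-disease co-occurrence counts and ranks diseases by the total. The function name `rank_diagnoses`, the dictionary layout, and every count below are hypothetical and made up for illustration; they are not taken from Wikipedia, SemMedDB, or the paper.

```python
from collections import Counter

def rank_diagnoses(symptoms, cooccurrence):
    """Rank candidate diseases by summed co-occurrence with the given symptoms.

    cooccurrence maps symptom -> {disease: count}; unknown symptoms are skipped.
    """
    scores = Counter()
    for symptom in symptoms:
        for disease, count in cooccurrence.get(symptom, {}).items():
            scores[disease] += count
    # most_common() orders candidates from highest to lowest total count
    return [disease for disease, _ in scores.most_common()]

# Made-up counts standing in for statistics mined from a knowledge base.
toy_counts = {
    "fever": {"influenza": 30, "malaria": 12},
    "cough": {"influenza": 25, "bronchitis": 18},
    "chills": {"malaria": 20, "influenza": 5},
}

candidates = rank_diagnoses(["fever", "cough"], toy_counts)
print(candidates)  # ['influenza', 'bronchitis', 'malaria']
```

The top-ranked candidates could then be appended to a literature search query, which is the role diagnosis prediction plays in the retrieval setting described above.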


Author(s):  
Enayat Rajabi ◽  
Kobra Etminani

The decisions derived from AI-based clinical decision support systems should be explainable and transparent so that the healthcare professionals can understand the rationale behind the predictions. To improve the explanations, knowledge graphs are a well-suited choice to be integrated into eXplainable AI. In this paper, we introduce a knowledge graph-based explainable framework for AI-based clinical decision support systems to increase their level of explainability.


2021 ◽  
Vol 7 (2) ◽  
pp. 223-226
Author(s):  
Jan Gaebel ◽  
Johannes Keller ◽  
Daniel Schneider ◽  
Adrian Lindenmeyer ◽  
Thomas Neumuth ◽  
...  

Abstract To overcome the obstacles and complexity of decision making in clinical oncology, we propose an integrated clinical decision support approach: the Digital Twin. We analyse the reasons for frustration in applying clinical decision support and provide a multi-levelled approach to implementing a flexible system to support and strengthen clinical decisions. Describing medical patterns and contexts with the Resource Description Framework (RDF) allows for a standardised way of connecting medical knowledge and processing modules. Flexible web-based interfaces integrate a multitude of heterogeneous data processing systems, either to make clinical data available altogether or to provide calculations and assessments. Transition of the Digital Twin to clinical practice promises effective assistance and safer clinical decisions.

