Mining heterogeneous clinical notes by multi-modal latent topic model

Latent knowledge can be extracted from the electronic notes that are recorded during patient encounters with the health system. Using these clinical notes to decipher a patient’s underlying comorbidities, symptom burdens, and treatment courses is an ongoing challenge. Latent topic models, an efficient class of Bayesian methods, can treat each patient’s clinical notes as “documents” and the words in the notes as “tokens”. However, standard latent topic models assume that all of the notes follow the same topic distribution, regardless of the type of note or the domain expertise of the author (such as doctors or nurses). We propose a novel application of latent topic modeling, using a multi-note topic model (MNTM) to jointly infer distinct topic distributions for notes of different types. We applied our model to clinical notes from the MIMIC-III dataset to infer distinct topic distributions over the physician and nursing note types. Based on manual assessments made by clinicians, we observed a significant improvement in topic interpretability using MNTM over the baseline single-note topic models that ignore the note types. Moreover, MNTM led to significantly higher prediction accuracy for prolonged mechanical ventilation and mortality using only the first 48 hours of patient data. By correlating the patients’ topic mixtures with hospital mortality and prolonged mechanical ventilation, we identified several diagnostic topics that are associated with poor outcomes. Because of its elegant and intuitive formulation, we envision a broad application of our approach in mining multi-modality text-based healthcare information that goes beyond clinical notes. Code is available at https://github.com/li-lab-mcgill/heterogeneous_ehr.
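The idea of type-specific topic distributions can be illustrated with a small generative simulation. This is only a sketch under stated assumptions, not the authors' implementation: it assumes each patient has a single topic mixture shared across note types, while each note type (for example physician or nursing) has its own topic-word distributions over its own vocabulary. The function name `sample_patient_notes`, the hyperparameters `alpha` and `beta`, and the vocabulary sizes are hypothetical.

```python
import numpy as np


def sample_patient_notes(n_topics, vocab_sizes, tokens_per_type,
                         alpha=0.5, beta=0.1, seed=0):
    """Simulate one patient's notes under an assumed multi-note topic model.

    vocab_sizes and tokens_per_type map a note type to its vocabulary size
    and to the number of tokens to draw for that type.
    """
    rng = np.random.default_rng(seed)

    # Assumption: one topic-word matrix per note type, shape (n_topics, vocab).
    phi = {
        note_type: rng.dirichlet(np.full(size, beta), size=n_topics)
        for note_type, size in vocab_sizes.items()
    }

    # Assumption: a single patient-level topic mixture shared by all note types.
    theta = rng.dirichlet(np.full(n_topics, alpha))

    notes = {}
    for note_type, n_tokens in tokens_per_type.items():
        # Draw a topic for every token, then a word from that topic's
        # distribution for this particular note type.
        topics = rng.choice(n_topics, size=n_tokens, p=theta)
        notes[note_type] = np.array([
            rng.choice(vocab_sizes[note_type], p=phi[note_type][k])
            for k in topics
        ])
    return theta, phi, notes


theta, phi, notes = sample_patient_notes(
    n_topics=3,
    vocab_sizes={"physician": 20, "nursing": 15},
    tokens_per_type={"physician": 50, "nursing": 40},
)
```

A single-note baseline would correspond to using one shared `phi` for every note type; inference would reverse this process to recover `theta` and `phi` from observed notes.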

Author–Subject–Topic model for reviewer recommendation

Journal of Information Science ◽

10.1177/0165551518806116 ◽

2018 ◽

Vol 45 (4) ◽

pp. 554-570 ◽

Cited By ~ 1

Author(s):

Jian Jin ◽

Qian Geng ◽

Haikun Mou ◽

Chong Chen

Keyword(s):

Information System ◽

Topic Model ◽

Academic Library ◽

Topic Models ◽

Interdisciplinary Studies ◽

Distribution Analysis ◽

Topic Distribution ◽

Research Domains

Interdisciplinary studies are becoming increasingly popular, and the research domains of many experts are becoming more diverse. This makes it difficult to recommend experts to review interdisciplinary submissions. In this study, an Author–Subject–Topic (AST) model is proposed in two versions. In the model, reviewers’ subject information is embedded to analyse the topic distributions of submissions and of reviewers’ publications. The major difference between the AST and Author–Topic models lies in the introduction of a ‘Subject’ layer, which supervises the generation of hierarchical topics and allows subjects to be shared among authors. To evaluate the performance of the AST model, papers in Information System and Management (a typical interdisciplinary domain) in a well-known Chinese academic library are investigated. Comparative experiments show the effectiveness of the AST model in topic distribution analysis and reviewer recommendation for interdisciplinary studies.

It all starts with entities: A Salient entity topic model

Natural Language Engineering ◽

10.1017/s1351324919000585 ◽

2019 ◽

Vol 26 (5) ◽

pp. 531-549

Author(s):

Chuan Wu ◽

Evangelos Kanoulas ◽

Maarten de Rijke

Keyword(s):

Topic Model ◽

State Of The Art ◽

Topic Models ◽

Generation Process ◽

Qualitative And Quantitative Analysis ◽

Generative Process ◽

Qualitative And Quantitative ◽

Proposed Model ◽

Topic Distribution ◽

Document Generation

Entities play an essential role in understanding textual documents, regardless of whether the documents are short, such as tweets, or long, such as news articles. In short textual documents, all entities mentioned are usually considered equally important because of the limited amount of information. In long textual documents, however, not all entities are equally important: some are salient and others are not. Traditional entity topic models (ETMs) focus on ways to incorporate entity information into topic models to better explain the generative process of documents. However, entities are usually treated equally, without considering whether they are salient or not. In this work, we propose a novel ETM, Salient Entity Topic Model, to take salient entities into consideration in the document generation process. In particular, we model salient entities as a source of topics used to generate words in documents, in addition to the topic distribution of documents used in traditional topic models. Qualitative and quantitative analysis is performed on the proposed model. Application to entity salience detection demonstrates the effectiveness of our model compared to the state-of-the-art topic model baselines.

Graph Neural Collaborative Topic Model for Citation Recommendation

ACM Transactions on Information Systems ◽

10.1145/3473973 ◽

2022 ◽

Vol 40 (3) ◽

pp. 1-30

Author(s):

Qianqian Xie ◽

Yutao Zhu ◽

Jimin Huang ◽

Pan Du ◽

Jian-Yun Nie

Keyword(s):

Topic Model ◽

Topic Models ◽

Joint Modeling ◽

Research Problem ◽

High Order ◽

Semantic Structure ◽

Critical Research ◽

First Order ◽

Latent Topic ◽

Graph Neural Networks

Due to the overload of published scientific articles, automatically recommending the most relevant citations for a given article has long been a critical research problem. Relational topic models (RTMs) have shown promise in citation prediction via joint modeling of document contents and citations. However, existing RTMs can only capture pairwise or direct (first-order) citation relationships among documents. Indirect (high-order) citation links have been explored in graph neural network–based methods, but these methods suffer from the well-known explainability problem. In this article, we propose a model called Graph Neural Collaborative Topic Model that takes advantage of both relational topic models and graph neural networks to capture high-order citation relationships while offering higher explainability through its latent topic semantic structure. Experiments on three real-world citation datasets show that our model outperforms several competitive baseline methods on citation recommendation. In addition, we show that our approach can learn better topics than the existing approaches, and that the recommendation results can be well explained by the underlying topics.

Circulating IL-6 mediates lung injury via CXCL1 production after acute kidney injury in mice

AJP Renal Physiology ◽

10.1152/ajprenal.00025.2012 ◽

2012 ◽

Vol 303 (6) ◽

pp. F864-F872 ◽

Cited By ~ 66

Author(s):

Nilesh Ahuja ◽

Ana Andres-Hernando ◽

Christopher Altmann ◽

Rhea Bhargava ◽

Jasna Bacalja ◽

...

Keyword(s):

Mechanical Ventilation ◽

Acute Kidney Injury ◽

Lung Injury ◽

Prolonged Mechanical Ventilation ◽

Kidney Injury ◽

Neutrophil Infiltration ◽

Bilateral Nephrectomy ◽

Poor Outcomes ◽

Deficient Mice

Serum IL-6 is increased in patients with acute kidney injury (AKI) and is associated with prolonged mechanical ventilation and increased mortality. Inhibition of IL-6 in mice with AKI reduces lung injury associated with a reduction in the chemokine CXCL1 and lung neutrophils. Whether circulating IL-6 or locally produced lung IL-6 mediates lung injury after AKI is unknown. We hypothesized that circulating IL-6 mediates lung injury after AKI by increasing lung endothelial CXCL1 production and subsequent neutrophil infiltration. To test the role of circulating IL-6 in AKI-mediated lung injury, recombinant murine IL-6 was administered to IL-6-deficient mice. To test the role of CXCL1 in AKI-mediated lung injury, CXCL1 was inhibited by use of CXCR2-deficient mice and anti-CXCL1 antibodies in mice with ischemic AKI or bilateral nephrectomy. Injection of recombinant IL-6 to IL-6-deficient mice with AKI increased lung CXCL1 and lung neutrophils. Lung endothelial CXCL1 was increased after AKI. CXCR2-deficient and CXCL1 antibody-treated mice with ischemic AKI or bilateral nephrectomy had reduced lung neutrophil content. In summary, we demonstrate for the first time that circulating IL-6 is a mediator of lung inflammation and injury after AKI. Since serum IL-6 is increased in patients with either AKI or acute lung injury and predicts prolonged mechanical ventilation and increased mortality in both conditions, our data suggest that serum IL-6 is not simply a biomarker of poor outcomes but a pathogenic mediator of lung injury.

Clinical XLNet: Modeling Sequential Clinical Notes and Predicting Prolonged Mechanical Ventilation

10.18653/v1/2020.clinicalnlp-1.11 ◽

2020 ◽

Author(s):

Kexin Huang ◽

Abhishek Singh ◽

Sitong Chen ◽

Edward Moseley ◽

Chih-Ying Deng ◽

...

Keyword(s):

Mechanical Ventilation ◽

Prolonged Mechanical Ventilation ◽

Clinical Notes

Semi-supervised Max-margin Topic Model with Manifold Posterior Regularization

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/259 ◽

2017 ◽

Cited By ~ 2

Author(s):

Wenbo Hu ◽

Jun Zhu ◽

Hang Su ◽

Jingwei Zhuo ◽

Bo Zhang

Keyword(s):

Supervised Learning ◽

Topic Model ◽

Topic Models ◽

Stochastic Gradient ◽

Mcmc Method ◽

Tight Coupling ◽

Label Information ◽

Latent Topic ◽

Latent Topics ◽

Qualitative Performance

Supervised topic models leverage label information to learn discriminative latent topic representations. As collecting a fully labeled dataset is often time-consuming, semi-supervised learning is of high interest. In this paper, we present an effective semi-supervised max-margin topic model by naturally introducing manifold posterior regularization to a regularized Bayesian topic model, named LapMedLDA. The model jointly learns latent topics and a related classifier with only a small fraction of labeled documents. To perform the approximate inference, we derive an efficient stochastic gradient MCMC method. Unlike the previous semi-supervised topic models, our model adopts a tight coupling between the generative topic model and the discriminative classifier. Extensive experiments demonstrate that such tight coupling brings significant benefits in quantitative and qualitative performance.
