Development of a Consumer Health Vocabulary by Mining Health Forum Texts Based on Word Embedding: Semiautomatic Approach (Preprint)

2018 ◽  
Author(s):  
Gen Gu ◽  
Xingting Zhang ◽  
Xingeng Zhu ◽  
Zhe Jian ◽  
Ken Chen ◽  
...  

BACKGROUND The vocabulary gap between consumers and professionals in the medical domain hinders information seeking and communication. Consumer health vocabularies have been developed to aid such informatics applications. This purpose is best served if the vocabulary evolves with consumers’ language.
OBJECTIVE Our objective is to develop a method for identifying and adding new terms to consumer health vocabularies, so that they can keep up with constantly evolving medical knowledge and language use.
METHODS In this paper, we propose a consumer health term–finding framework based on a distributed word vector space model. We first learned word vectors from a large-scale text corpus without supervision and then fine-tuned these vectors with a supervised method that uses existing consumer health vocabularies. With the fine-tuned word vector space, we identified pairs of professional terms and their consumer variants by their semantic distance in the vector space. A subsequent manual review of the extracted and labeled pairs of entities was conducted to validate the results generated by the proposed approach. The results were evaluated using mean reciprocal rank (MRR).
RESULTS Manual evaluation showed that it is feasible to identify alternative medical concepts by using professional or consumer concepts as queries in the word vector space without fine tuning, but the results are more promising in the final fine-tuned word vector space. The MRR values indicated that, on average, a professional or consumer concept is about the 14th closest term to its counterpart in the word vector space without fine tuning and about the 8th closest in the final fine-tuned word vector space. Furthermore, the results demonstrate that our method can collect abbreviations and common typos frequently used by consumers.
CONCLUSIONS By integrating a large amount of text information and existing consumer health vocabularies, our method outperformed several baseline ranking methods and is effective for generating a list of candidate terms for human review during consumer health vocabulary development.
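
The evaluation described in this abstract can be illustrated with a small sketch: rank every other term by cosine similarity to a query term, find the position of the known counterpart, and average the reciprocal ranks. The toy vectors, the term list, and the function names below are assumptions made for illustration; they are not taken from the paper, and the paper's own fine-tuning step is not reproduced here. By definition an MRR lies between 0 and 1.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_of_counterpart(query, target, vectors):
    """1-based rank of `target` among all other terms, sorted by similarity to `query`."""
    candidates = [term for term in vectors if term != query]
    candidates.sort(key=lambda term: cosine(vectors[query], vectors[term]), reverse=True)
    return candidates.index(target) + 1


def mean_reciprocal_rank(pairs, vectors):
    """Average of 1/rank over (query, counterpart) pairs."""
    return sum(1.0 / rank_of_counterpart(q, t, vectors) for q, t in pairs) / len(pairs)


# Hypothetical 3-dimensional embeddings, for illustration only.
vectors = {
    "myocardial infarction": [0.9, 0.1, 0.0],
    "heart attack": [0.8, 0.2, 0.1],
    "hypertension": [0.1, 0.9, 0.0],
    "high blood pressure": [0.7, 0.6, 0.1],
    "blood pressure cuff": [0.1, 0.95, 0.1],
    "headache": [0.0, 0.1, 0.9],
}

# Professional term paired with its consumer variant.
pairs = [
    ("myocardial infarction", "heart attack"),
    ("hypertension", "high blood pressure"),
]

mrr = mean_reciprocal_rank(pairs, vectors)
print(f"MRR = {mrr:.2f}")
```

In this toy space the first counterpart is ranked 1st and the second is ranked 2nd, so the MRR is the mean of 1 and 1/2.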

2022 ◽  
Author(s):  
Jakob Nikolas Kather ◽  
Narmin Ghaffari Laleh ◽  
Sebastian Foersch ◽  
Daniel Truhn

The text-guided diffusion model GLIDE (Guided Language to Image Diffusion for Generation and Editing) is the state of the art in text-to-image generative artificial intelligence (AI). GLIDE has rich representations, but medical applications of this model have not been systematically explored. If GLIDE had useful medical knowledge, it could be used for medical image analysis tasks, a domain in which AI systems are still highly engineered towards a single use-case. Here we show that the publicly available GLIDE model has reasonably strong representations of key topics in cancer research and oncology, in particular the general style of histopathology images and multiple facets of diseases, pathological processes and laboratory assays. However, GLIDE seems to lack useful representations of the style and content of radiology data. Our findings demonstrate that domain-agnostic generative AI models can learn relevant medical concepts without explicit training. Thus, GLIDE and similar models might be useful for medical image processing tasks in the future, particularly with additional domain-specific fine-tuning.


Author(s):  
Ke Wang ◽  
Xuyan Chen ◽  
Ning Chen ◽  
Ting Chen

Automatic diagnosis based on clinical notes is critical, especially in the emergency department, where a fast and professional result is vital for assuring proper and timely treatment. Previous works formalize this task as plain text classification and fail to utilize the medically significant tree structure of the International Classification of Diseases (ICD) coding system. In addition, external medical knowledge has rarely been used in earlier work; we incorporate it by extracting relevant materials from Wikipedia or Baidupedia. In this paper, we propose a knowledge-based tree decoding model (K-BTD), whose inference procedure is a top-down decoding process from the root node to the leaf nodes. The stepwise inference procedure enables the model to give support for its decision at each step, which visualizes the diagnosis procedure and adds to the interpretability of the final predictions. Experiments on real-world data from the emergency department of a large-scale hospital indicate that the proposed model outperforms all baselines in both micro-F1 and macro-F1 and reduces the semantic distance dramatically.
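
The top-down decoding idea can be sketched as a greedy walk over a code tree that records the score behind each step. This is a minimal illustration under stated assumptions, not the K-BTD model itself: the two-level tree fragment, the keyword sets standing in for external knowledge, and the overlap-count scorer are all hypothetical, whereas the model in the abstract is learned from data.

```python
def decode_top_down(tree, score, root="ROOT"):
    """Greedy root-to-leaf decoding; returns the chosen (node, score) at each step."""
    path = []
    node = root
    while tree.get(node):  # stop when the current node has no children
        scored = {child: score(node, child) for child in tree[node]}
        best = max(scored, key=scored.get)
        path.append((best, scored[best]))
        node = best
    return path


def make_keyword_scorer(note, keywords):
    """Hypothetical scorer: number of note tokens found in a node's keyword set."""
    tokens = set(note.lower().split())

    def score(parent, child):
        return len(tokens & keywords.get(child, set()))

    return score


# Small fragment of an ICD-10-style hierarchy (chapter range, then category).
tree = {
    "ROOT": ["I00-I99", "J00-J99"],
    "I00-I99": ["I21", "I50"],
    "J00-J99": ["J18", "J45"],
}

# Illustrative keyword sets standing in for knowledge extracted per node.
keywords = {
    "I00-I99": {"chest", "heart", "pain"},
    "J00-J99": {"cough", "breath", "wheeze"},
    "I21": {"chest", "pain", "arm"},
    "I50": {"edema", "fatigue"},
    "J18": {"cough", "fever"},
    "J45": {"wheeze", "breath"},
}

note = "chest pain radiating to left arm"
path = decode_top_down(tree, make_keyword_scorer(note, keywords))
for code, value in path:
    print(code, value)
```

Because every step keeps its own score, the returned path doubles as a simple explanation of how the final leaf was reached, which is the interpretability benefit the abstract describes.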


Author(s):  
Xiaoyi Chen ◽  
Carole Faviez ◽  
Marc Vincent ◽  
Nicolas Garcelon ◽  
Sophie Saunier ◽  
...  

Identifying patients with similar clinical profiles and deriving insights from the records and outcomes of those patients can support fast and precise diagnosis and other clinical decisions for rare diseases. Similarity methods must take into account both the semantic relations between medical concepts and the differing relevance of the medical concepts present in patients’ medical records. In this paper, we introduce the methods developed in the context of rare disease screening and diagnosis from a clinical data warehouse, using medical concept embedding and adjusted aggregations. Our methods provided better preliminary results than baseline methods, with a significant improvement in precision among the top-ranked similar patients, which is encouraging for further fine-tuning and application to a large-scale dataset for identifying new or candidate patients.
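
One way to combine concept embeddings with per-concept relevance is a weighted best-match aggregation, sketched below. The abstract does not specify its aggregation, so this particular scheme, the 2-dimensional embeddings, the relevance weights, and all names are assumptions chosen only to make the idea concrete.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def directed_similarity(source, target, emb, weight):
    """Weighted mean, over source concepts, of the best cosine match in target."""
    total = sum(weight[c] for c in source)
    matched = sum(
        weight[c] * max(cosine(emb[c], emb[d]) for d in target) for c in source
    )
    return matched / total


def patient_similarity(a, b, emb, weight):
    """Symmetric similarity: average of the two directed scores."""
    return 0.5 * (
        directed_similarity(a, b, emb, weight) + directed_similarity(b, a, emb, weight)
    )


# Hypothetical concept embeddings and relevance weights (higher = more specific).
emb = {
    "polyuria": [1.0, 0.0],
    "polydipsia": [0.9, 0.2],
    "renal cyst": [0.8, 0.5],
    "cough": [0.0, 1.0],
    "fever": [0.1, 0.9],
}
weight = {"polyuria": 2.0, "polydipsia": 2.0, "renal cyst": 3.0, "cough": 0.5, "fever": 0.5}

index_patient = ["polyuria", "renal cyst", "fever"]
similar_patient = ["polydipsia", "renal cyst"]
dissimilar_patient = ["cough", "fever"]

sim_close = patient_similarity(index_patient, similar_patient, emb, weight)
sim_far = patient_similarity(index_patient, dissimilar_patient, emb, weight)
print(f"similar: {sim_close:.3f}, dissimilar: {sim_far:.3f}")
```

The embedding lets related but non-identical concepts (here polyuria and polydipsia) contribute to the score, while the weights keep common, unspecific findings such as fever from dominating it.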


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Alisa M. Loosen ◽  
Vasilisa Skvortsova ◽  
Tobias U. Hauser

Abstract Increased mental-health symptoms as a reaction to stressful life events, such as the Covid-19 pandemic, are common. Critically, successful adaptation helps to reduce such symptoms to baseline, preventing long-term psychiatric disorders. It is thus important to understand whether and which psychiatric symptoms show transient elevations, and which persist long-term and become chronically heightened. At particular risk for the latter trajectory are symptom dimensions directly affected by the pandemic, such as obsessive–compulsive (OC) symptoms. In this longitudinal large-scale study (N = 406), we assessed how OC, anxiety and depression symptoms changed throughout the first pandemic wave in a sample of the general UK public. We further examined how these symptoms affected pandemic-related information seeking and adherence to governmental guidelines. We show that scores in all psychiatric domains were initially elevated, but showed distinct longitudinal change patterns. Depression scores decreased, and anxiety plateaued during the first pandemic wave, while OC symptoms further increased, even after the easing of Covid-19 restrictions. These OC symptoms were directly linked to Covid-related information seeking, which gave rise to higher adherence to government guidelines. The increase in OC symptoms in this non-clinical sample shows that the domain is disproportionately affected by the pandemic. We discuss the long-term impact of the Covid-19 pandemic on public mental health, which calls for continued close observation of symptom development.


Author(s):  
Junshu Wang ◽  
Guoming Zhang ◽  
Wei Wang ◽  
Ka Zhang ◽  
Yehua Sheng

Abstract With the rapid development of hospital informatization and Internet medical services in recent years, most hospitals have launched online appointment registration systems to eliminate patient queues and improve the efficiency of medical services. However, most patients lack professional medical knowledge and do not know which department to choose when registering. To guide patients in seeking medical care and registering effectively, we propose CIDRS, an intelligent self-diagnosis and department recommendation framework based on Chinese medical Bidirectional Encoder Representations from Transformers (BERT) in a cloud computing environment. We also established a Chinese BERT model (CHMBERT) trained on a large-scale Chinese medical text corpus. This model was used to optimize the self-diagnosis and department recommendation tasks. To address the limited computing power of terminals, we deployed the proposed framework in a cloud computing environment based on container and micro-service technologies. Real-world medical datasets from hospitals were used in the experiments, and the results showed that the proposed model outperformed traditional deep learning models and other pre-trained language models.


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Fuyong Xing ◽  
Yuanpu Xie ◽  
Xiaoshuang Shi ◽  
Pingjun Chen ◽  
Zizhao Zhang ◽  
...  

Abstract
Background: Nucleus or cell detection is a fundamental task in microscopy image analysis and supports many other quantitative studies such as object counting, segmentation, tracking, etc. Deep neural networks are emerging as a powerful tool for biomedical image computing; in particular, convolutional neural networks have been widely applied to nucleus/cell detection in microscopy images. However, almost all models are tailored for specific datasets and their applicability to other microscopy image data remains unknown. Some existing studies casually learn and evaluate deep neural networks on multiple microscopy datasets, but there are still several critical, open questions to be addressed.
Results: We analyze the applicability of deep models specifically for nucleus detection across a wide variety of microscopy image data. More specifically, we present a fully convolutional network-based regression model and extensively evaluate it on large-scale digital pathology and microscopy image datasets, which cover 23 organs (or cancer diseases) and come from multiple institutions. We demonstrate that, for a specific target dataset, training with images from the same types of organs is usually necessary for nucleus detection. Although the images can be visually similar due to the same staining technique and imaging protocol, deep models learned with images from different organs might not deliver desirable results and would require fine-tuning to be on a par with those trained with target data. We also observe that training with a mixture of target and other/non-target data does not always yield higher nucleus detection accuracy, and proper data manipulation during model training might be required to achieve good performance.
Conclusions: We conduct a systematic case study on deep models for nucleus detection in a wide variety of microscopy images, aiming to address several important but previously understudied questions. We present and extensively evaluate an end-to-end, pixel-to-pixel fully convolutional regression network and report a few significant findings, some of which might not have been reported in previous studies. The model performance analysis and observations should be helpful for nucleus detection in microscopy images.


Author(s):  
David Mendonça ◽  
William A. Wallace ◽  
Barbara Cutler ◽  
James Brooks

Abstract Large-scale disasters can produce profound disruptions in the fabric of interdependent critical infrastructure systems such as water, telecommunications and electric power. The work of post-disaster infrastructure restoration typically requires information sharing and close collaboration across these sectors; yet – due to a number of factors – the means to investigate decision making phenomena associated with these activities are limited. This paper motivates and describes the design and implementation of a computer-based synthetic environment for investigating collaborative information seeking in the performance of a (simulated) infrastructure restoration task. The main contributions of this work are twofold. First, it develops a set of theoretically grounded measures of collaborative information seeking processes and embeds them within a computer-based system. Second, it suggests how these data may be organized and modeled to yield insights into information seeking processes in the performance of a complex, collaborative task. The paper concludes with a discussion of implications of this work for practice and for future research.


2013 ◽  
Vol 07 (04) ◽  
pp. 377-405 ◽  
Author(s):  
TRAVIS GOODWIN ◽  
SANDA M. HARABAGIU

The introduction of electronic medical records (EMRs) enabled access to unprecedented volumes of clinical data, in both structured and unstructured formats. A significant amount of this clinical data is expressed within the narrative portion of the EMRs, requiring natural language processing techniques to unlock the medical knowledge referred to by physicians. This knowledge, derived from the practice of medical care, complements medical knowledge already encoded in various structured biomedical ontologies. Moreover, the clinical knowledge derived from EMRs also exhibits relational information between medical concepts, derived from the cohesion property of clinical text, an attractive attribute that is currently missing from the vast biomedical knowledge bases. In this paper, we describe an automatic method of generating a graph of clinically related medical concepts by considering the belief values associated with those concepts. The belief value is an expression of the clinician's assertion that the concept is qualified as present, absent, suggested, hypothetical, ongoing, etc. Because the method detailed in this paper takes into account the hedging used by physicians when authoring EMRs, the resulting graph encodes qualified medical knowledge wherein each medical concept has an associated assertion (or belief value), and such qualified medical concepts are connected by relations of different strengths, derived from the clinical contexts in which the concepts are used. We discuss the construction of a qualified medical knowledge graph (QMKG) and treat it as a BigData problem, addressed by using MapReduce to derive the weighted edges of the graph. To assess the value of the QMKG, we demonstrate its use for retrieving patient cohorts by enabling query expansion, which produces greatly enhanced results compared with state-of-the-art methods.
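
The MapReduce formulation of edge weighting can be sketched in plain Python as a map step that emits one count per co-occurring pair of qualified concepts and a reduce step that sums the counts per edge. This is a single-process illustration under stated assumptions: the records are invented, the function names are hypothetical, and using a raw co-occurrence count as the edge weight is a simplification, since the abstract does not state how relation strength is computed.

```python
from collections import defaultdict
from itertools import combinations


def map_record(record):
    """Map step: emit ((node_a, node_b), 1) for each unordered pair in one record.

    A node is a qualified concept, i.e. a (concept, assertion) tuple.
    """
    nodes = sorted(set(record))  # sort so each edge has one canonical orientation
    for left, right in combinations(nodes, 2):
        yield (left, right), 1


def reduce_counts(emitted):
    """Reduce step: sum the emitted counts for each edge key."""
    weights = defaultdict(int)
    for edge, count in emitted:
        weights[edge] += count
    return dict(weights)


# Invented records: each lists the qualified concepts found in one EMR narrative.
records = [
    [("chest pain", "present"), ("myocardial infarction", "suggested")],
    [("chest pain", "present"), ("myocardial infarction", "suggested"), ("fever", "absent")],
    [("fever", "present"), ("pneumonia", "hypothetical")],
]

emitted = (pair for record in records for pair in map_record(record))
edges = reduce_counts(emitted)

for (left, right), weight in sorted(edges.items()):
    print(left, "--", right, "weight", weight)
```

Keeping the assertion inside the node key means that ("fever", "absent") and ("fever", "present") are distinct nodes, which mirrors the abstract's point that each concept in the graph carries its belief value.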

