Numerical Feature Transformation-based Sequence Generation Model for Multi-disease Diagnosis

Author(s):  
Ming Yuan ◽  
Jiangtao Ren

The goal of computer-aided diagnosis is to predict a patient’s diseases from the patient’s clinical data. Deep learning offers new tools for clinical diagnosis. In this paper, we propose a new sequence generation model for multi-disease diagnosis prediction based on numerical feature transformation. Our model takes both the patient’s laboratory test results and clinical text as input to predict the diseases the patient may have. Guided by medical knowledge, the model transforms numerical features into descriptive text features, thereby enriching the semantic information of the clinical text. In addition, the model uses attention-based sequence generation to diagnose multiple diseases and to better exploit the correlations between them. We evaluate the model on a real-world dataset of respiratory diseases; experimental results show that its accuracy reaches 42.75% and its [Formula: see text] score reaches 65.65%, outperforming many other methods. The model is suitable for the accurate diagnosis of multiple diseases.
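The numerical feature transformation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the reference ranges, the test names, and the functions `describe` and `enrich_text` are hypothetical, and the three-level wording (low, normal, high) is an assumption about what a descriptive text feature might look like.

```python
from typing import Dict, Tuple

# Hypothetical reference ranges (low, high). Illustrative only, not clinical guidance.
REFERENCE_RANGES: Dict[str, Tuple[float, float]] = {
    "white blood cell count": (4.0, 10.0),
    "hemoglobin": (120.0, 160.0),
}


def describe(name: str, value: float) -> str:
    """Turn one numeric lab result into a short descriptive phrase."""
    low, high = REFERENCE_RANGES[name]
    if value < low:
        level = "low"
    elif value > high:
        level = "high"
    else:
        level = "normal"
    return f"{name} is {level}"


def enrich_text(clinical_text: str, labs: Dict[str, float]) -> str:
    """Append descriptions of known lab results to the clinical text."""
    descriptions = [describe(n, v) for n, v in labs.items() if n in REFERENCE_RANGES]
    if not descriptions:
        return clinical_text
    return clinical_text.rstrip() + " " + "; ".join(descriptions) + "."


note = enrich_text("Cough for 3 days.", {"white blood cell count": 12.5})
print(note)
```

The enriched string could then be fed to any text encoder alongside the original note, so that the numeric result contributes words the encoder can relate to the rest of the text.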

2013 ◽  
Vol 07 (04) ◽  
pp. 377-405 ◽  
Author(s):  
TRAVIS GOODWIN ◽  
SANDA M. HARABAGIU

The introduction of electronic medical records (EMRs) enabled access to unprecedented volumes of clinical data, in both structured and unstructured formats. A significant amount of this clinical data is expressed within the narrative portion of the EMRs, requiring natural language processing techniques to unlock the medical knowledge referred to by physicians. This knowledge, derived from the practice of medical care, complements medical knowledge already encoded in various structured biomedical ontologies. Moreover, the clinical knowledge derived from EMRs also exhibits relational information between medical concepts, derived from the cohesion property of clinical text, which is an attractive attribute that is currently missing from the vast biomedical knowledge bases. In this paper, we describe an automatic method of generating a graph of clinically related medical concepts by considering the belief values associated with those concepts. The belief value is an expression of the clinician's assertion that the concept is qualified as present, absent, suggested, hypothetical, ongoing, etc. Because the method detailed in this paper takes into account the hedging used by physicians when authoring EMRs, the resulting graph encodes qualified medical knowledge wherein each medical concept has an associated assertion (or belief value) and such qualified medical concepts are spanned by relations of different strengths, derived from the clinical contexts in which concepts are used. We discuss the construction of a qualified medical knowledge graph (QMKG) and treat it as a Big Data problem, using MapReduce to derive the weighted edges of the graph. To assess the value of the QMKG, we demonstrate its use for retrieving patient cohorts through query expansion, which produces greatly enhanced results compared with state-of-the-art methods.
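As a rough illustration of deriving weighted edges between qualified concepts in a map and reduce style, the following Python sketch runs both phases in a single process. It is an assumption-laden toy: the edge weight here is a plain co-occurrence count within a record, whereas the abstract does not specify how relation strength is computed, and the names `map_record` and `reduce_counts` are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, Iterator, List, Tuple

Qualified = Tuple[str, str]        # (concept, assertion), e.g. ("fever", "present")
Edge = Tuple[Qualified, Qualified]


def map_record(record: List[Qualified]) -> Iterator[Tuple[Edge, int]]:
    """Map step: emit a count of 1 for each unordered pair of distinct
    qualified concepts that occur in the same record."""
    for a, b in combinations(sorted(set(record)), 2):
        yield (a, b), 1


def reduce_counts(pairs: Iterable[Tuple[Edge, int]]) -> Dict[Edge, int]:
    """Reduce step: sum the counts emitted for each edge."""
    weights: Dict[Edge, int] = defaultdict(int)
    for edge, count in pairs:
        weights[edge] += count
    return dict(weights)


records = [
    [("pneumonia", "present"), ("fever", "present")],
    [("pneumonia", "present"), ("fever", "present"), ("cough", "absent")],
]
emitted = (kv for record in records for kv in map_record(record))
graph = reduce_counts(emitted)
for edge, weight in sorted(graph.items()):
    print(edge, weight)
```

Keeping the assertion inside the node key means that ("cough", "absent") and ("cough", "present") are separate nodes, which is one simple way to keep the qualification attached to each concept.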


2019 ◽  
Vol 7 ◽  
pp. 661-676 ◽  
Author(s):  
Jiatao Gu ◽  
Qi Liu ◽  
Kyunghyun Cho

Conventional neural autoregressive decoding commonly assumes a fixed left-to-right generation order, which may be sub-optimal. In this work, we propose a novel decoding algorithm, InDIGO, which supports flexible sequence generation in arbitrary orders through insertion operations. We extend Transformer, a state-of-the-art sequence generation model, to efficiently implement the proposed approach, enabling it to be trained with either a pre-defined generation order or adaptive orders obtained from beam search. Experiments on four real-world tasks, including word order recovery, machine translation, image captioning, and code generation, demonstrate that our algorithm can generate sequences following arbitrary orders, while achieving competitive or even better performance compared with conventional left-to-right generation. The generated sequences show that InDIGO adopts adaptive generation orders based on input information.
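The idea of building a sequence through insertion operations can be shown with a small Python sketch. This is only the mechanics of insertion, not the InDIGO model: a real decoder predicts each token and its position, whereas here the steps are given by hand, and the absolute slot index used by the hypothetical `apply_insertions` function is a simplifying assumption.

```python
from typing import List, Tuple


def apply_insertions(steps: List[Tuple[str, int]]) -> List[str]:
    """Build a sequence by inserting each token at the given slot.

    A slot of k places the token before the current element k;
    a slot equal to the current length appends at the end.
    """
    seq: List[str] = []
    for token, slot in steps:
        if not 0 <= slot <= len(seq):
            raise ValueError(f"slot {slot} out of range for length {len(seq)}")
        seq.insert(slot, token)
    return seq


# Two different generation orders that yield the same final sentence.
left_to_right = [("the", 0), ("cat", 1), ("sat", 2), ("down", 3)]
verb_first = [("sat", 0), ("cat", 0), ("the", 0), ("down", 3)]

print(apply_insertions(left_to_right))
print(apply_insertions(verb_first))
```

Left-to-right decoding is the special case in which every step appends at the end, which is why an insertion-based decoder can in principle recover it as one of many possible orders.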


2019 ◽  
Vol 4 (4) ◽  
pp. 142 ◽  
Author(s):  
Junior Mudji ◽  
Jonathan Benhamou ◽  
Erick Mwamba-Miaka ◽  
Christian Burri ◽  
Johannes Blum

Human African Trypanosomiasis (HAT) is a neglected disease caused by the protozoan parasite Trypanosoma brucei and transmitted by tsetse flies that progresses in two phases. Symptoms in the first phase include fever, headaches, pruritus, lymphadenopathy, and in certain cases, hepato- and splenomegaly. Neurological disorders such as sleep disorder, aggressive behavior, logorrhea, psychotic reactions, and mood changes are signs of the second stage of the disease. Diagnosis follows complex algorithms, including serological testing and microscopy. Our case report illustrates the course of events of a 41-year-old woman with sleep disorder, among other neurological symptoms, whose diagnosis was made seven months after the onset of symptoms. The patient had consulted two different hospitals in Kinshasa and was on the verge of being discharged from a third due to negative laboratory test results. This case report highlights the challenges that may arise when a disease is on the verge of eradication.


2011 ◽  
Vol 57 (8) ◽  
pp. 1108-1117 ◽  
Author(s):  
W Greg Miller ◽  
Gary L Myers ◽  
Mary Lou Gantzer ◽  
Stephen E Kahn ◽  
E Ralf Schönbrunner ◽  
...  

Results between different clinical laboratory measurement procedures (CLMP) should be equivalent, within clinically meaningful limits, to enable optimal use of clinical guidelines for disease diagnosis and patient management. When laboratory test results are neither standardized nor harmonized, a different numeric result may be obtained for the same clinical sample. Unfortunately, some guidelines are based on test results from a specific laboratory measurement procedure without consideration of the possibility or likelihood of differences between various procedures. When this happens, aggregation of data from different clinical research investigations and development of appropriate clinical practice guidelines will be flawed. A lack of recognition that results are neither standardized nor harmonized may lead to erroneous clinical, financial, regulatory, or technical decisions. Standardization of CLMPs has been accomplished for several measurands for which primary (pure substance) reference materials exist and/or reference measurement procedures (RMPs) have been developed. However, the harmonization of clinical laboratory procedures for measurands that do not have RMPs has been problematic owing to inadequate definition of the measurand, inadequate analytical specificity for the measurand, inadequate attention to the commutability of reference materials, and lack of a systematic approach for harmonization. To address these problems, an infrastructure must be developed to enable a systematic approach for identification and prioritization of measurands to be harmonized on the basis of clinical importance and technical feasibility, and for management of the technical implementation of a harmonization process for a specific measurand.


Author(s):  
Hung D. Nguyen ◽  
Tru H. Cao

Electronic medical records (EMRs) have emerged as an important source of data for research in medicine and information technology, as they contain much valuable human medical knowledge in healthcare and patient treatment. This paper tackles the problem of coreference resolution in Vietnamese EMRs. Unlike in English ones, in Vietnamese clinical texts, verbs are often used to describe disease symptoms. So we first define rules to annotate verbs as mentions and consider coreference between verbs and other noun or adjective mentions possible. Then we propose a support vector machine classifier on bag-of-words vector representation of mentions that takes into account the special characteristics of the Vietnamese language to resolve their coreference. The achieved F1 score on our dataset of real Vietnamese EMRs provided by a hospital in Ho Chi Minh City is 91.4%. To the best of our knowledge, this is the first research work in coreference resolution on Vietnamese clinical texts.
Keywords: Clinical text, support vector machine, bag-of-words vector, lexical similarity, unrestricted coreference


2019 ◽  
Vol 131 ◽  
pp. 01118
Author(s):  
Fan Tongke

To address the problem of disease diagnosis for large-scale crops, this paper combines machine vision and deep learning to propose a disease recognition algorithm built on an LM_BP neural network. Images of multiple crop leaves are collected and cut with image cutting technology, and the data are obtained by the color distance feature extraction method. The data are input into the disease recognition model, the feature weights are set, and the model is trained repeatedly to obtain accurate results. Experiments on corn disease show that the model is simple and easy to implement and that the data are highly reliable.


2020 ◽  
Vol 103 ◽  
pp. 101772 ◽  
Author(s):  
Jingchi Jiang ◽  
Huanzheng Wang ◽  
Jing Xie ◽  
Xitong Guo ◽  
Yi Guan ◽  
...  

2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Lu Zhou ◽  
Shuangqiao Liu ◽  
Caiyan Li ◽  
Yuemeng Sun ◽  
Yizhuo Zhang ◽  
...  

Background. The modernization of traditional Chinese medicine (TCM) demands systematic data mining using medical records. However, this process is hindered by the fact that many TCM symptoms have the same meaning but different literal expressions (i.e., TCM synonymous symptoms). This problem can be solved by using natural language processing algorithms to construct a high-quality TCM symptom normalization model for normalizing TCM synonymous symptoms to unified literal expressions. Methods. Four types of TCM symptom normalization models, based on natural language processing, were constructed to find a high-quality one: (1) a text sequence generation model based on a bidirectional long short-term memory (Bi-LSTM) neural network with an encoder-decoder structure; (2) a text classification model based on a Bi-LSTM neural network and sigmoid function; (3) a text sequence generation model based on bidirectional encoder representation from transformers (BERT) with sequence-to-sequence training method of unified language model (BERT-UniLM); (4) a text classification model based on BERT and sigmoid function (BERT-Classification). The performance of the models was compared using four metrics: accuracy, recall, precision, and F1-score. Results. The BERT-Classification model outperformed the models based on Bi-LSTM and BERT-UniLM with respect to the four metrics. Conclusions. The BERT-Classification model has superior performance in normalizing expressions of TCM synonymous symptoms.
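For the classification variants that apply a sigmoid function to each label, the four reported metrics can be computed as in the Python sketch below. The abstract does not say how the metrics are averaged or how accuracy is defined, so micro-averaging, exact-match accuracy, the 0.5 threshold, and the names `predict_labels` and `micro_metrics` are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Set


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits: Sequence[float], threshold: float = 0.5) -> Set[int]:
    """Select every label whose sigmoid probability reaches the threshold."""
    return {i for i, z in enumerate(logits) if sigmoid(z) >= threshold}


def micro_metrics(gold: List[Set[int]], pred: List[Set[int]]) -> Dict[str, float]:
    """Micro-averaged precision, recall, F1, plus exact-match accuracy."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))
    fp = sum(len(p - g) for g, p in zip(gold, pred))
    fn = sum(len(g - p) for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


gold = [{0, 1}, {2}]
logits = [[2.0, -1.0, -3.0], [-2.0, 0.5, 1.5]]
pred = [predict_labels(row) for row in logits]
scores = micro_metrics(gold, pred)
print(pred, scores)
```

In this toy example the first sample misses one gold label and the second adds one spurious label, so no sample matches exactly even though two of the three predicted labels are correct.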

