Multitask Recalibrated Aggregation Network for Medical Code Prediction

Abstract: Medical coding translates professionally written medical reports into standardized codes, which is an essential part of medical information systems and health insurance reimbursement. Manual coding by trained human coders is time-consuming and error-prone. Thus, automated coding algorithms have been developed, building especially on recent advances in machine learning and deep neural networks. To address the challenges of encoding lengthy and noisy clinical documents and capturing code associations, we propose a multitask recalibrated aggregation network. In particular, multitask learning shares information across different coding schemes and captures the dependencies between different medical codes. Feature recalibration and aggregation in shared modules enhance representation learning for lengthy notes. Experiments with the real-world MIMIC-III dataset show significantly improved predictive performance.


Creating Resources for Marking Diagnoses in Electronic Health Reports in Serbian

IJEEC - INTERNATIONAL JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTING ◽

10.7251/ijeec2001018m ◽

2020 ◽

Vol 4 (1) ◽

Author(s):

Ulfeta Marovac ◽

Aldina Avdić ◽

Dragan Janković ◽

Sead Marovac

Keyword(s):

Medical Information ◽

Free Text ◽

Lexical Resources ◽

Natural Languages ◽

Medical Information Systems ◽

Structural Part ◽

Diagnosis Codes ◽

Icd 10 ◽

Medical Reports ◽

Extract Information

Thanks to medical information systems, many medical reports are collected in electronic form every day. Apart from the fields with a fixed set of allowed input values (the structural part), part of each report consists of free, unstructured text. This text contains a more detailed description of the patient's condition that could not be expressed in the structural part; symptoms, results of laboratory analyses, accompanying diagnoses, etc. can often be found in it. Due to a lack of time, doctors often write these descriptions in non-standard ways, using their own abbreviations and synonyms, and the descriptions often contain typos. All this makes it difficult to extract information from documents specific to the medical domain. This paper presents the creation of medical lexical resources for the automatic labeling of diagnosis terms in medical reports. Automatic marking of free text requires natural language processing methods as well as appropriate lexical resources. As there are no publicly available medical lexical resources for the Serbian language, nor a corpus of medical reports, the contribution of this paper is the construction of such resources for the automatic marking of diagnoses. Using the proposed resources, diagnosis codes, Latin terms, and Serbian terms specific to certain ICD-10 diagnoses can be mapped with precision of 83.47%, 86.86% and 78.29%, respectively.


Converging Semantic Knowledge and Deep Learning for Medical Coding

International Journal of Privacy and Health Information Management ◽

10.4018/ijphim.2019070103 ◽

2019 ◽

Vol 7 (2) ◽

pp. 33-52

Author(s):

Nuria Garcia-Santa ◽

Beatriz San Miguel ◽

Takanori Ugai

Keyword(s):

Deep Learning ◽

International Classification Of Diseases ◽

Semantic Knowledge ◽

Clinical Notes ◽

Medical Coding ◽

Advantages And Disadvantages ◽

Classification Of Diseases ◽

Medical Reports ◽

Mimic Iii

Medical coding assigns codes from medical classifications, such as the International Classification of Diseases (ICD), to clinical notes, which are medical reports about patients' conditions written by healthcare professionals in natural language. These texts may include medical terms that define diagnoses, symptoms, drugs, treatments, etc., and their spontaneous language is challenging for automatic processing. Medical coding is usually performed manually by human medical coders, which makes it time-consuming and prone to errors. This research aims to develop new approaches that combine deep learning elements with traditional technologies. A semantic-based proposal supported by a proprietary knowledge graph (KG), neural network implementations, and an ensemble model for medical coding are presented. A comparative discussion of the proposals analyses the advantages and disadvantages of each. To evaluate the approaches, two main corpora have been used: MIMIC-III and private de-identified clinical notes.


Medical information systems

AccessScience ◽

10.1036/1097-8542.412875 ◽

2015 ◽

Keyword(s):

Information Systems ◽

Medical Information ◽

Medical Information Systems


A Comprehensive Model for Medical Information Processing

Methods of Information in Medicine ◽

10.1055/s-0038-1635429 ◽

1983 ◽

Vol 22 (03) ◽

pp. 124-130 ◽

Cited By ~ 11

Author(s):

J. H. Bemmel

Keyword(s):

Intensive Care ◽

Information Processing ◽

Medical Information ◽

Computer Applications ◽

Human Interaction ◽

Comprehensive Model ◽

Computers In Medicine ◽

Medical Information Systems ◽

Research And Education ◽

The Many

At first sight, the many applications of computers in medicine, from payroll and registration systems to computerized tomography, intensive care and diagnostics, make a rather chaotic impression. The purpose of this article is to propose a scheme or working model for putting medical information systems in order. The model comprises six "levels of complexity", running parallel to dependence on human interaction. Several examples are treated to illustrate the scheme. The reason why certain computer applications are more frequently used than others is analyzed. It must be stressed that the differences in complexity and dependence on human involvement are not accidental but fundamental. This has consequences for research and education, which are also discussed.


Health informatics. Information security management for remote maintenance of medical devices and medical information systems

10.3403/30203776u ◽

2015 ◽

Keyword(s):

Information Systems ◽

Information Security ◽

Medical Devices ◽

Health Informatics ◽

Medical Information ◽

Security Management ◽

Information Security Management ◽

Medical Information Systems ◽

Remote Maintenance


Digging for the truth: the case for active annotation in evaluating the credibility of online medical information (Preprint)

10.2196/preprints.25920 ◽

2020 ◽

Author(s):

Mikołaj Morzy ◽

Bartłomiej Balcerzak ◽

Adam Wierzbicki

Keyword(s):

Machine Learning ◽

Medical Information ◽

Representation Learning ◽

Training Dataset ◽

Highly Qualified ◽

Human In The Loop ◽

Annotation Process ◽

Comprehensive Framework ◽

Online Sources ◽

The Web

BACKGROUND With the rapidly accelerating spread of false medical information on the Web, establishing the credibility of online sources of medical information has become a pressing necessity. The sheer number of websites offering questionable medical information, presented as reliable and actionable suggestions with possibly harmful effects, poses an additional requirement for potential solutions: they have to scale to the size of the problem. Machine learning is one such solution which, when properly deployed, can be an effective tool in fighting medical disinformation on the Web. OBJECTIVE We present a comprehensive framework for designing and curating machine learning training datasets for online medical information credibility assessment. We show how the annotation process should be constructed and what pitfalls should be avoided. Our main objective is to provide researchers from the medical and computer science communities with guidelines on how to construct datasets for machine learning models for various areas of medical information wars. METHODS The key component of our approach is the active annotation process. We begin by outlining the annotation protocol for the curation of a high-quality training dataset, which can then be augmented and rapidly extended by employing the human-in-the-loop paradigm in machine learning training. To circumvent the cold start problem of insufficient gold standard annotations, we propose a pre-processing pipeline consisting of representation learning, clustering, and re-ranking of sentences to accelerate the training process and optimize the human resources involved in the annotation.
RESULTS We collect over 10,000 annotations of sentences related to selected subjects (psychiatry, cholesterol, autism, antibiotics, vaccines, steroids, birth methods, food allergy testing) for less than $7,000, employing 9 highly qualified annotators (certified medical professionals), and we release this dataset to the general public. We develop an active annotation framework for more efficient annotation of non-credible medical statements. The results of the qualitative analysis support our claims of the efficacy of the presented method. CONCLUSIONS A set of very diverse incentives is driving the widespread dissemination of medical disinformation on the Web. An effective strategy for countering this spread is to use machine learning to automatically establish the credibility of online medical information. This, however, requires a thoughtful design of the training pipeline. In this paper we present a comprehensive framework of active annotation. In addition, we publish a large curated dataset of medical statements labelled as credible, non-credible, or neutral.
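
The pre-processing pipeline named in METHODS (representation, clustering, re-ranking) might be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: bag-of-words counts stand in for learned representations, the greedy threshold clustering and the round-robin re-ranking are simple substitutes chosen for brevity, and every function name and the `threshold` parameter are hypothetical.

```python
from collections import Counter
from math import sqrt


def embed(sentence):
    """Bag-of-words counts; an assumed stand-in for a learned representation."""
    return Counter(sentence.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster(sentences, threshold=0.5):
    """Greedy clustering: a sentence joins the first cluster whose seed is similar enough."""
    clusters = []  # list of (seed_vector, [sentence indices])
    for i, sentence in enumerate(sentences):
        vec = embed(sentence)
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(i)
                break
        else:
            # No sufficiently similar cluster found: start a new one.
            clusters.append((vec, [i]))
    return [members for _, members in clusters]


def rerank(sentences, threshold=0.5):
    """Round-robin over clusters so the first items shown to annotators are diverse."""
    groups = cluster(sentences, threshold)
    order = []
    depth = 0
    while len(order) < len(sentences):
        for group in groups:
            if depth < len(group):
                order.append(group[depth])
        depth += 1
    return [sentences[i] for i in order]


if __name__ == "__main__":
    pool = [
        "vaccines cause autism",
        "vaccines cause severe autism",
        "antibiotics cure viral flu",
    ]
    for sentence in rerank(pool):
        print(sentence)
```

In this toy pool the two near-duplicate vaccine sentences fall into one cluster, so the antibiotics sentence is moved ahead of the second vaccine sentence; the intent is that a limited annotation budget covers more topics early.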
