Generalized Radiograph Representation Learning via Cross-supervision between Images and Free-text Radiology Reports

Pre-training lays the foundation for recent successes in radiograph analysis supported by deep learning. It learns transferable image representations by conducting large-scale fully-supervised or self-supervised learning on a source domain. However, supervised pre-training requires a complex and labor intensive two-stage human-assisted annotation process while self-supervised learning cannot compete with the supervised paradigm. To tackle these issues, we propose a cross-supervised methodology named REviewing FreE-text Reports for Supervision (REFERS), which acquires free supervision signals from original radiology reports accompanying the radiographs. The proposed approach employs a vision transformer and is designed to learn joint representations from multiple views within every patient study. REFERS outperforms its transfer learning and self-supervised learning counterparts on 4 well-known X-ray datasets under extremely limited supervision. Moreover, REFERS even surpasses methods based on a source domain of radiographs with human-assisted structured labels. Thus REFERS has the potential to replace canonical pre-training methodologies.

Download Full-text

The Effect of Image Resolution on Automated Classification of Chest X-rays

10.1101/2021.07.30.21261225 ◽

2021 ◽

Author(s):

Md Inzamam Ul Haque ◽

Abhishek K Dubey ◽

Jacob D Hinkle

Keyword(s):

High Resolution ◽

Automated Analysis ◽

Image Resolution ◽

Free Text ◽

X Rays ◽

Training Time ◽

X Ray ◽

Radiology Reports ◽

Chest X Ray ◽

High Resolution Images

Deep learning models have received much attention lately for their ability to achieve expert-level performance on the accurate automated analysis of chest X-rays. Although publicly available chest X-ray datasets include high resolution images, most models are trained on reduced size images due to limitations on GPU memory and training time. As compute capability continues to advance, it will become feasible to train large convolutional neural networks on high-resolution images. This study is based on the publicly available MIMIC-CXR-JPG dataset, comprising 377,110 high resolution chest X-ray images, and provided with 14 labels to the corresponding free-text radiology reports. We find, interestingly, that tasks that require a large receptive field are better suited to downscaled input images, and we verify this qualitatively by inspecting effective receptive fields and class activation maps of trained models. Finally, we show that stacking an ensemble across resolutions outperforms each individual learner at all input resolutions while providing interpretable scale weights, suggesting that multi-scale features are crucially important to information extraction from high-resolution chest X-rays.

Download Full-text

Development of a System that Generates Structured Reports for Chest X-ray Radiography

Methods of Information in Medicine ◽

10.3414/me09-01-0014 ◽

2010 ◽

Vol 49 (04) ◽

pp. 360-370 ◽

Cited By ~ 4

Author(s):

Y. Matsumura ◽

N. Mihara ◽

Y. Kawakami ◽

K. Sasai ◽

H. Takeda ◽

...

Keyword(s):

Narrative Form ◽

Free Text ◽

X Rays ◽

Sentence Structure ◽

X Ray ◽

Radiology Reports ◽

Chest X Ray ◽

Structured Reports ◽

Description Framework ◽

Resource Description

Summary Objectives: Radiology reports are typically made in narrative form; this is a barrier to the implementation of advanced applications for data analysis or a decision support. We developed a system that generates structured reports for chest x-ray radiography. Methods: Based on analyzing existing reports, we determined the fundamental sentence structure of findings as compositions of procedure, region, finding, and diagnosis. We categorized the observation objects into lung, mediastinum, bone, soft tissue, and pleura and chest wall. The terms of region, finding, and diagnosis were associated with each other. We expressed the terms and the relations between the terms using a resource description framework (RDF) and developed a reporting system based on it. The system shows a list of terms in each category, and modifiers can be entered using templates that are linked to each term. This system guides users to select terms by highlighting associated terms. Fifty chest x-rays with abnormal findings were interpreted by five radiologists and reports were made either by the system or by the free-text method. Results: The system decreased the time needed to make a report by 12.5% compared with the free-text method, and the sentences generated by the system were well concordant with those made by free-text method (F-measure = 90%). The results of the questionnaire showed that our system is applicable to radiology reports of chest x-rays in daily clinical practice. Conclusions: The method of generating structured reports for chest x-rays was feasible, because it generated almost concordant reports in shorter time compared with the free-text method.

Download Full-text

Cross-modal Common Representation Learning by Hybrid Transfer Network

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/263 ◽

2017 ◽

Cited By ~ 20

Author(s):

Xin Huang ◽

Yuxin Peng ◽

Mingkuan Yuan

Keyword(s):

Domain Knowledge ◽

Large Scale ◽

Representation Learning ◽

Training Data ◽

Target Domain ◽

Retrieval Task ◽

Source Domain ◽

Useful Knowledge ◽

Semantic Correlation ◽

Common Representation

DNN-based cross-modal retrieval is a research hotspot to retrieve across different modalities as image and text, but existing methods often face the challenge of insufficient cross-modal training data. In single-modal scenario, similar problem is usually relieved by transferring knowledge from large-scale auxiliary datasets (as ImageNet). Knowledge from such single-modal datasets is also very useful for cross-modal retrieval, which can provide rich general semantic information that can be shared across different modalities. However, it is challenging to transfer useful knowledge from single-modal (as image) source domain to cross-modal (as image/text) target domain. Knowledge in source domain cannot be directly transferred to both two different modalities in target domain, and the inherent cross-modal correlation contained in target domain provides key hints for cross-modal retrieval which should be preserved during transfer process. This paper proposes Cross-modal Hybrid Transfer Network (CHTN) with two subnetworks: Modal-sharing transfer subnetwork utilizes the modality in both source and target domains as a bridge, for transferring knowledge to both two modalities simultaneously; Layer-sharing correlation subnetwork preserves the inherent cross-modal semantic correlation to further adapt to cross-modal retrieval task. Cross-modal data can be converted to common representation by CHTN for retrieval, and comprehensive experiment on 3 datasets shows its effectiveness.

Download Full-text

VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation

10.18653/v1/2021.acl-long.80 ◽

2021 ◽

Author(s):

Changhan Wang ◽

Morgane Riviere ◽

Ann Lee ◽

Anne Wu ◽

Chaitanya Talnikar ◽

...

Keyword(s):

Supervised Learning ◽

Large Scale ◽

Representation Learning ◽

Speech Corpus

Download Full-text

Finding relevant free-text radiology reports at scale with IBM Watson Content Analytics: a feasibility study in the UK NHS

Journal of Biomedical Semantics ◽

10.1186/s13326-019-0213-5 ◽

2019 ◽

Vol 10 (S1) ◽

Author(s):

Alicja Piotrkowicz ◽

Owen Johnson ◽

Geoff Hall

Keyword(s):

Feasibility Study ◽

Large Scale ◽

Information Overload ◽

Point Of Care ◽

Safety Net ◽

Free Text ◽

Text Analytics ◽

Clinical Safety ◽

Radiology Reports ◽

Ibm Watson

Abstract Background Significant amounts of health data are stored as free-text within clinical reports, letters, discharge summaries and notes. Busy clinicians have limited time to read such large amounts of free-text and are at risk of information overload and consequently missing information vital to patient care. Automatically identifying relevant information at the point of care has the potential to reduce these risks but represents a considerable research challenge. One software solution that has been proposed in industry is the IBM Watson analytics suite which includes rule-based analytics capable of processing large document collections at scale. Results In this paper we present an overview of IBM Watson Content Analytics and a feasibility study using Content Analytics with a large-scale corpus of clinical free-text reports within a UK National Health Service (NHS) context. We created dictionaries and rules for identifying positive incidence of hydronephrosis and brain metastasis from 5.6 m radiology reports and were able to achieve 94% precision, 95% recall and 89% precision, 94% recall respectively on a sample of manually annotated reports. With minor changes for US English we applied the same rule set to an open access corpus of 0.5 m radiology reports from a US hospital and achieved 93% precision, 94% recall and 84% precision, 88% recall respectively. Conclusions We were able to implement IBM Watson within a UK NHS context and demonstrate effective results that could provide clinicians with an automatic safety net which highlights clinically important information within free-text documents. Our results suggest that currently available technologies such as IBM Watson Content Analytics already have the potential to address information overload and improve clinical safety and that solutions developed in one hospital and country may be transportable to different hospitals and countries. Our study was limited to exploring technical aspects of the feasibility of one industry solution and we recognise that healthcare text analytics research is a fast-moving field. That said, we believe our study suggests that text analytics is sufficiently advanced to be implemented within industry solutions that can improve clinical safety.

Download Full-text

A Retrieval System for a Library of Pathology Reports, Slides and Kodachromes

Methods of Information in Medicine ◽

10.1055/s-0038-1636075 ◽

1972 ◽

Vol 11 (03) ◽

pp. 152-162 ◽

Cited By ~ 6

Author(s):

P. GAYNON ◽

R. L. WONG

Keyword(s):

Medical Records ◽

Operative Procedure ◽

Clinical Findings ◽

Free Text ◽

X Ray ◽

Inverted File ◽

Natural Language Parsing ◽

On Line ◽

Pathology Reports ◽

Frequency Counts

With the objective of providing easier access to pathology specimens, slides and kodachromes with linkage to x-ray and the remainder of the patient’s medical records, an automated natural language parsing routine, based on dictionary look-up, was written for Surgical Pathology document-pairs, each consisting of a Request for Examination (authored by clinicians) and its corresponding report (authored by pathologists). These documents were input to the system in free-text English without manual editing or coding.Two types of indices were prepared. The first was an »inverted« file, available for on-line retrieval, for display of the content of the document-pairs, frequency counts of cases or listing of cases in table format. Retrievable items are patient’s and specimen’s identification data, date of operation, name of clinician and pathologist, etc. The English content of the operative procedure, clinical findings and pathologic diagnoses can be retrieved through logical combination of key words. The second type of index was a catalog. Three catalog files — »operation«, »clinical«, and »pathology« — were prepared by alphabetization of lines formed by the rotation of phrases, headed by keywords. These keywords were automatically selected and standardized by the parsing routine and the phrases were extracted from each sentence of each input document. Over 2,500 document-pairs have been entered and are currently being utilized for purpose of medical education.

Download Full-text

Efficient Image Retrieval approach for Large-scale Chest X Ray data using Hand-Crafted Features and Machine Learning Algorithms

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v6i11.890896 ◽

2018 ◽

Vol 6 (11) ◽

pp. 890-896

Author(s):

Irene Getzi S ◽

D. Christopher Durairaj ◽

V Joseph Raj

Keyword(s):

Machine Learning ◽

Image Retrieval ◽

Large Scale ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

X Ray ◽

Chest X Ray

Download Full-text

A Drug Target Interaction Prediction Based on LINE-RF Learning

Current Bioinformatics ◽

10.2174/1574893615666191227092453 ◽

2020 ◽

Vol 15 (7) ◽

pp. 750-757

Author(s):

Jihong Wang ◽

Yue Shi ◽

Xiaodan Wang ◽

Huiyou Chang

Keyword(s):

Network Topology ◽

Drug Target ◽

Large Scale ◽

Representation Learning ◽

New Drugs ◽

Combination Method ◽

Learning Methods ◽

Network Representation ◽

On Line ◽

Clinical Experiments

Background: At present, using computer methods to predict drug-target interactions (DTIs) is a very important step in the discovery of new drugs and drug relocation processes. The potential DTIs identified by machine learning methods can provide guidance in biochemical or clinical experiments. Objective: The goal of this article is to combine the latest network representation learning methods for drug-target prediction research, improve model prediction capabilities, and promote new drug development. Methods: We use large-scale information network embedding (LINE) method to extract network topology features of drugs, targets, diseases, etc., integrate features obtained from heterogeneous networks, construct binary classification samples, and use random forest (RF) method to predict DTIs. Results: The experiments in this paper compare the common classifiers of RF, LR, and SVM, as well as the typical network representation learning methods of LINE, Node2Vec, and DeepWalk. It can be seen that the combined method LINE-RF achieves the best results, reaching an AUC of 0.9349 and an AUPR of 0.9016. Conclusion: The learning method based on LINE network can effectively learn drugs, targets, diseases and other hidden features from the network topology. The combination of features learned through multiple networks can enhance the expression ability. RF is an effective method of supervised learning. Therefore, the Line-RF combination method is a widely applicable method.

Download Full-text

Developing a RadLex-based Named Entity Recognition Tool for Mining Textual Radiology Reports (Preprint)

10.2196/preprints.25378 ◽

2020 ◽

Author(s):

Shintaro Tsuji ◽

Andrew Wen ◽

Naoki Takahashi ◽

Hongjian Zhang ◽

Katsuhiko Ogasawara ◽

...

Keyword(s):

Named Entity Recognition ◽

Noun Phrases ◽

General Purpose ◽

Entity Recognition ◽

Free Text ◽

Clinical Text ◽

Named Entity ◽

Radiology Reports ◽

Two Measures ◽

F Measure

BACKGROUND Named entity recognition (NER) plays an important role in extracting the features of descriptions for mining free-text radiology reports. However, the performance of existing NER tools is limited because the number of entities depends on its dictionary lookup. Especially, the recognition of compound terms is very complicated because there are a variety of patterns. OBJECTIVE The objective of the study is to develop and evaluate a NER tool concerned with compound terms using the RadLex for mining free-text radiology reports. METHODS We leveraged the clinical Text Analysis and Knowledge Extraction System (cTAKES) to develop customized pipelines using both RadLex and SentiWordNet (a general-purpose dictionary, GPD). We manually annotated 400 of radiology reports for compound terms (Cts) in noun phrases and used them as the gold standard for the performance evaluation (precision, recall, and F-measure). Additionally, we also created a compound-term-enhanced dictionary (CtED) by analyzing false negatives (FNs) and false positives (FPs), and applied it for another 100 radiology reports for validation. We also evaluated the stem terms of compound terms, through defining two measures: an occurrence ratio (OR) and a matching ratio (MR). RESULTS The F-measure of the cTAKES+RadLex+GPD was 32.2% (Precision 92.1%, Recall 19.6%) and that of combined the CtED was 67.1% (Precision 98.1%, Recall 51.0%). The OR indicated that stem terms of “effusion”, "node", "tube", and "disease" were used frequently, but it still lacks capturing Cts. The MR showed that 71.9% of stem terms matched with that of ontologies and RadLex improved about 22% of the MR from the cTAKES default dictionary. The OR and MR revealed that the characteristics of stem terms would have the potential to help generate synonymous phrases using ontologies. CONCLUSIONS We developed a RadLex-based customized pipeline for parsing radiology reports and demonstrated that CtED and stem term analysis has the potential to improve dictionary-based NER performance toward expanding vocabularies.

Download Full-text

Digging for the truth: the case for active annotation in evaluating the credibility of online medical information (Preprint)

10.2196/preprints.25920 ◽

2020 ◽

Author(s):

Mikołaj Morzy ◽

Bartłomiej Balcerzak ◽

Adam Wierzbicki ◽

Adam Wierzbicki

Keyword(s):

Machine Learning ◽

Medical Information ◽

Representation Learning ◽

Training Dataset ◽

Highly Qualified ◽

Human In The Loop ◽

Annotation Process ◽

Comprehensive Framework ◽

Online Sources ◽

The Web

BACKGROUND With the rapidly accelerating spread of dissemination of false medical information on the Web, the task of establishing the credibility of online sources of medical information becomes a pressing necessity. The sheer number of websites offering questionable medical information presented as reliable and actionable suggestions with possibly harmful effects poses an additional requirement for potential solutions, as they have to scale to the size of the problem. Machine learning is one such solution which, when properly deployed, can be an effective tool in fighting medical disinformation on the Web. OBJECTIVE We present a comprehensive framework for designing and curating of machine learning training datasets for online medical information credibility assessment. We show how the annotation process should be constructed and what pitfalls should be avoided. Our main objective is to provide researchers from medical and computer science communities with guidelines on how to construct datasets for machine learning models for various areas of medical information wars. METHODS The key component of our approach is the active annotation process. We begin by outlining the annotation protocol for the curation of high-quality training dataset, which then can be augmented and rapidly extended by employing the human-in-the-loop paradigm to machine learning training. To circumvent the cold start problem of insufficient gold standard annotations, we propose a pre-processing pipeline consisting of representation learning, clustering, and re-ranking of sentences for the acceleration of the training process and the optimization of human resources involved in the annotation. RESULTS We collect over 10 000 annotations of sentences related to selected subjects (psychiatry, cholesterol, autism, antibiotics, vaccines, steroids, birth methods, food allergy testing) for less than $7 000 employing 9 highly qualified annotators (certified medical professionals) and we release this dataset to the general public. We develop an active annotation framework for more efficient annotation of non-credible medical statements. The results of the qualitative analysis support our claims of the efficacy of the presented method. CONCLUSIONS A set of very diverse incentives is driving the widespread dissemination of medical disinformation on the Web. An effective strategy of countering this spread is to use machine learning for automatically establishing the credibility of online medical information. This, however, requires a thoughtful design of the training pipeline. In this paper we present a comprehensive framework of active annotation. In addition, we publish a large curated dataset of medical statements labelled as credible, non-credible, or neutral.

Download Full-text