CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison

Author(s):  
Jeremy Irvin ◽  
Pranav Rajpurkar ◽  
Michael Ko ◽  
Yifan Yu ◽  
Silviana Ciurea-Ilcus ◽  
...  

Large, labeled datasets have driven deep learning methods to achieve expert-level performance on a variety of medical imaging tasks. We present CheXpert, a large dataset that contains 224,316 chest radiographs of 65,240 patients. We design a labeler to automatically detect the presence of 14 observations in radiology reports, capturing the uncertainties inherent in radiograph interpretation. We investigate different approaches to using the uncertainty labels for training convolutional neural networks that output the probability of these observations given the available frontal and lateral radiographs. On a validation set of 200 chest radiographic studies manually annotated by 3 board-certified radiologists, we find that different uncertainty approaches are useful for different pathologies. We then evaluate our best model on a test set composed of 500 chest radiographic studies annotated by a consensus of 5 board-certified radiologists, and compare its performance to that of 3 additional radiologists in the detection of 5 selected pathologies. On Cardiomegaly, Edema, and Pleural Effusion, the model's ROC and PR curves lie above all 3 radiologist operating points. We release the dataset to the public as a standard benchmark for evaluating the performance of chest radiograph interpretation models.
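
For readers who want a concrete picture of how uncertainty labels can enter training, below is a minimal PyTorch-style sketch of the "U-Ignore" policy, in which uncertain labels are masked out of a binary cross-entropy loss over the 14 observations. The label encoding (1 positive, 0 negative, -1 uncertain) and all tensor shapes are assumptions for illustration, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F

def masked_bce_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Binary cross-entropy over 14 observations, ignoring uncertain labels.

    labels: tensor of shape (batch, 14) with 1.0 = positive, 0.0 = negative,
            -1.0 = uncertain (the 'U-Ignore' policy masks these out).
    """
    mask = (labels >= 0).float()           # 1 where the label is certain
    targets = labels.clamp(min=0.0)        # map -1 to a dummy 0 target
    loss = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    return (loss * mask).sum() / mask.sum().clamp(min=1.0)

# The 'U-Zeros' / 'U-Ones' variants instead replace -1 with 0.0 or 1.0
# before computing the loss, with no mask.
```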

Diagnostics ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. 1943
Author(s):  
Diego R. Cervera ◽  
Luke Smith ◽  
Luis Diaz-Santana ◽  
Meenakshi Kumar ◽  
Rajiv Raman ◽  
...  

The aim of this study was to develop and validate a deep learning-based system to detect diabetic peripheral neuropathy (DN) from retinal colour images in people with diabetes. Retinal images from 1561 people with diabetes were used to predict DN, diagnosed by vibration perception threshold. A total of 189 had diabetic retinopathy (DR), 276 had DN, and 43 had both DR and DN. 90% of the images were used for training and validation and 10% for testing. Deep neural networks, including SqueezeNet, Inception, and DenseNet, were utilized, and the architectures were tested with and without pre-trained weights. Random transforms of the images were applied during training. The algorithm was trained and tested using three sets of data: all retinal images, images without DR, and images with DR. The area under the ROC curve (AUC) was used to evaluate performance. The AUC for predicting DN on the whole cohort was 0.8013 (±0.0257) on the validation set and 0.7097 (±0.0031) on the test set. The AUC increased to 0.8673 (±0.0088) in the presence of DR. Retinal images can thus be used to identify individuals with DN, providing an opportunity to educate patients about their DN status when they attend DR screening.
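
As an illustration of testing an architecture with and without pre-trained weights, the sketch below builds a DenseNet-based binary DN classifier with torchvision. The choice of DenseNet-121, the single-logit head, and the augmentation parameters are assumptions for illustration, not the authors' exact configuration.

```python
import torch.nn as nn
from torchvision import models, transforms

def build_dn_classifier(pretrained: bool = True) -> nn.Module:
    """DenseNet backbone with a single-logit head for DN vs. no DN."""
    weights = models.DenseNet121_Weights.IMAGENET1K_V1 if pretrained else None
    model = models.densenet121(weights=weights)
    model.classifier = nn.Linear(model.classifier.in_features, 1)
    return model

# Random transforms during training, as mentioned in the abstract
# (the specific parameters here are illustrative):
train_tf = transforms.Compose([
    transforms.RandomRotation(15),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
```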


2020 ◽  
Vol 9 (1) ◽  
pp. 7-10
Author(s):  
Hendry Fonda

ABSTRACT Riau batik has been known since the 18th century and was used by royal families. Riau batik is made using a stamp dipped in dye and then printed on fabric, usually silk. Over the course of its development, and in comparison with Javanese batik, Riau batik has been accepted by the public only slowly. Convolutional Neural Networks (CNN) combine artificial neural networks with deep learning methods. A CNN consists of one or more convolutional layers, often each followed by a subsampling layer, and then one or more fully connected layers as in a standard neural network. In this process, the CNN is trained and tested on Riau batik so that a collection of batik models, classified by the characteristic features of Riau batik, is obtained, allowing images to be determined as Riau batik or non-Riau batik. Classification using CNN distinguishes Riau batik from non-Riau batik with an accuracy of 65%. This accuracy of 65% is largely because many motifs are shared between Riau batik and other batik, with the difference lying mainly in the colours absorbed by Riau batik. Keywords: Batik; Batik Riau; CNN; Image; Deep Learning
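
The abstract's description of a CNN (convolutional layers, each often followed by a subsampling layer, then fully connected layers) corresponds to the standard pattern sketched below; all channel counts, the 224x224 input size, and the two-class head are illustrative assumptions rather than the paper's actual architecture.

```python
import torch.nn as nn

class BatikCNN(nn.Module):
    """Conv + subsampling (pooling) blocks followed by fully connected
    layers, matching the standard CNN layout described in the abstract."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                            # subsampling layer
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 56 * 56, 128), nn.ReLU(),
            nn.Linear(128, 2),              # Riau batik vs. non-Riau batik
        )

    def forward(self, x):                   # x: (batch, 3, 224, 224)
        return self.classifier(self.features(x))
```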


Circulation ◽  
2020 ◽  
Vol 142 (Suppl_3) ◽  
Author(s):  
John Pfeifer ◽  
Sushravya M Raghunath ◽  
Alvaro Ulloa ◽  
Arun Nemani ◽  
Tanner Carbonati ◽  
...  

Background: Atrial fibrillation (AF) is associated with stroke, especially when AF goes undetected. Deep neural networks (DNN) can predict incident AF from a 12-lead resting ECG. We hypothesize that use of a DNN to predict new-onset AF from an ECG may identify patients at risk of sustaining a potentially preventable AF-related stroke. Methods: We trained a DNN model to predict new-onset AF using 382,604 ECGs acquired prior to 2010. We then evaluated model performance on a test set of ECGs from 2010 through 2014 linked to patients in an institutional stroke registry. There were 181,969 patients in the test set with at least one ECG and no prior history of AF. Of those patients, 3,497 (1.9%) had a stroke following an ECG that did not show AF. Within the set of patients with stroke, 375 had the stroke within 3 years of the ECG and were diagnosed with new AF between -3 and 365 days of the stroke. We considered these potentially preventable AF-related strokes. We report the sensitivity and positive predictive value (PPV) of the model for appropriately risk-stratifying these 375 patients who sustained a potentially preventable AF-related stroke. Results: We used F-beta scores to identify different risk prediction thresholds (operating points) for the model. Operating points chosen by the F0.5, F1, and F2 scores identified 4, 12, and 21% of the population, respectively, as high risk for the development of AF within 1 year (Figure 1). Screening 1, 4, 12, and 21% of the overall population resulted in PPVs of 28, 21, 15, and 12%, respectively, for identification of new-onset AF within one year. Using those same thresholds yielded sensitivities of 4, 17, 45, and 62% for identifying potentially preventable AF-related strokes. The different risk prediction thresholds resulted in a low number needed to screen (120-162) to detect one potentially preventable AF-related stroke at 3 years. Conclusions: Use of a deep learning model to predict new-onset AF may identify patients at high risk of sustaining a potentially preventable AF-related stroke.
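
The F-beta score used to choose operating points is F_beta = (1 + beta^2) * precision * recall / (beta^2 * precision + recall), so beta < 1 favors precision and beta > 1 favors recall. Below is a hedged sketch of threshold selection by F-beta using scikit-learn; the threshold grid is an illustrative assumption, not the study's procedure.

```python
import numpy as np
from sklearn.metrics import fbeta_score

def pick_operating_point(y_true: np.ndarray, y_prob: np.ndarray,
                         beta: float) -> float:
    """Choose the probability threshold that maximizes the F-beta score.

    beta = 0.5 favors precision (flags fewer patients as high risk);
    beta = 2 favors recall (flags more patients as high risk)."""
    thresholds = np.linspace(0.01, 0.99, 99)
    scores = [fbeta_score(y_true, y_prob >= t, beta=beta) for t in thresholds]
    return float(thresholds[int(np.argmax(scores))])

# Thresholds chosen with beta = 0.5, 1, and 2 would flag progressively
# larger fractions of the population as high risk for incident AF.
```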


Sensors ◽  
2020 ◽  
Vol 20 (22) ◽  
pp. 6666
Author(s):  
Kamil Książek ◽  
Michał Romaszewski ◽  
Przemysław Głomb ◽  
Bartosz Grabowski ◽  
Michał Cholewa

In recent years, growing interest in deep learning neural networks has raised the question of how they can be used for effective processing of the high-dimensional datasets produced by hyperspectral imaging (HSI). HSI, traditionally viewed as being within the scope of remote sensing, is also used in non-invasive substance classification. One area of potential application is forensic science, where substance classification at the scene is important. An example problem from that area, blood stain classification, serves as a case study for the evaluation of methods that process hyperspectral data. To investigate deep learning classification performance for this problem, we performed experiments on a dataset that had not previously been tested with this kind of model. The dataset consists of several images with blood and blood-like substances such as ketchup, tomato concentrate, and artificial blood. To test both the classic approach to hyperspectral classification and a more realistic, application-oriented scenario, we prepared two different sets of experiments. In the first, Hyperspectral Transductive Classification (HTC), both the training and the test set come from the same image. In the second, Hyperspectral Inductive Classification (HIC), the test set is derived from a different image, which is more challenging for classifiers but more useful from the point of view of forensic investigators. We conducted the study using several architectures: 1D, 2D, and 3D convolutional neural networks (CNN), a recurrent neural network (RNN), and a multilayer perceptron (MLP). The performance of the models was compared with baseline results from a Support Vector Machine (SVM). We also present a model evaluation method, based on t-SNE and confusion matrix analysis, that allows us to detect and eliminate some cases of model undertraining. Our results show that in the transductive case, all models, including the MLP and the SVM, have comparable performance, with no clear advantage for the deep learning models. The Overall Accuracy across all models is 98-100% for the easier image set and 74-94% for the more difficult one. In the more challenging inductive case, however, selected deep learning architectures offer a significant advantage; their best Overall Accuracy is in the range of 57-71%, improving on the baseline set by the non-deep models by up to 9 percentage points. We present a detailed analysis of the results and a discussion, including a summary of conclusions for each tested architecture. An analysis of per-class errors shows that the score for each class is highly model-dependent. Considering this, and the fact that the best-performing models come from two different architecture families (3D CNN and RNN), our results suggest that tailoring deep neural network architectures to hyperspectral data is still an open problem.
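
As one example of the architecture families compared, the sketch below shows a minimal 1D CNN that classifies a single hyperspectral pixel from its spectrum. The band count, channel sizes, and class count are assumptions for illustration, not the paper's exact models.

```python
import torch.nn as nn

class Spectral1DCNN(nn.Module):
    """1D CNN classifying one hyperspectral pixel from its spectrum."""
    def __init__(self, n_bands: int = 128, n_classes: int = 8):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(32, n_classes),   # blood, ketchup, artificial blood, ...
        )

    def forward(self, x):               # x: (batch, 1, n_bands)
        return self.net(x)
```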


Blood ◽  
2019 ◽  
Vol 134 (Supplement_1) ◽  
pp. 2084-2084 ◽  
Author(s):  
Ta-Chuan Yu ◽  
Wen-Chien Chou ◽  
Chao-Yuan Yeh ◽  
Cheng-Kun Yang ◽  
Sheng-Chuan Huang ◽  
...  

Purpose Differential counting of blood cells is the basis of diagnostic hematology. In many circumstances, identification of cells in bone marrow smears is the gold standard for diagnosis. At present, methods for automatic differential counting of peripheral blood are readily available commercially. However, morphological assessment and differential counting of bone marrow smears are still performed manually, a procedure that is tedious, time-consuming, and laden with high inter-operator variation. In recent years, deep neural networks have proven useful in many medical image recognition tasks, such as diagnosis of diabetic retinopathy and detection of cancer metastasis in lymph nodes. However, there has been no published work on using deep neural networks for complete differential counting of entire bone marrow smears. In this work, we present the results of using a deep convolutional neural network for automatic differential counting of bone marrow nucleated cells. Materials & Methods Bone marrow smears from patients with either benign or malignant disorders at National Taiwan University Hospital were collected for this study. The smears were stained with Liu's stain, a modified Romanowsky stain. Digital images of the smears were taken using a 1000x oil-immersion lens and a 20 MP color CCD camera on a single microscope with standard illumination and white-balance settings. The contour of each nucleated cell was manually outlined. The cells were then divided into a training/validation set and a test set. Each cell was classified into 1 of 11 categories (blast, promyelocyte, neutrophilic myelocyte, neutrophilic metamyelocyte, neutrophil, eosinophil and precursors, basophil, monocyte and precursors, lymphocyte, erythroid lineage cells, and invalid cell). In the training/validation set, the classification of each cell was annotated once by an experienced medical technician or hematologist. The annotated dataset was used to train a Path-Aggregation Network for the instance segmentation task. In the test set, cell classification was annotated by three medical technicians or hematologists; only cells with over 2/3 consensus were regarded as valid. After the neural network model was fully trained, its ability to classify and detect bone marrow nucleated cells was evaluated in terms of precision, recall, and accuracy. During model training, we used group normalization and a stochastic gradient descent optimizer. Random noise, Gaussian blur, rotation, contrast shift, and color shift were used for data augmentation. Results Digital images of 150 bone marrow aspirate smears were taken for this study, including 61 cases of acute leukemia, 39 of lymphoma, 2 of myelodysplastic syndrome (MDS), 2 of myeloproliferative neoplasm (MPN), 10 of MDS/MPN, 12 of multiple myeloma, 4 of hemolytic anemia, 9 of aplastic anemia, 8 of infectious etiology, and 3 of solid cancers. The final data contained 5927 images and 187,730 nucleated bone marrow cells, divided into 2 sets: 5630 images containing 170,966 cells as the training/validation set, and 297 images containing 16,764 cells as the test set. Among the 16,764 cells annotated in the test set, 15,676 cells (93.6%) reached over 2/3 consensus. The trained neural network achieved 0.832 recall and 0.736 precision for the cell detection task, 0.79 mean intersection over union (IoU) for the cell segmentation task, and a mean average precision of 0.659 and accuracy of 0.801 for cell classification.
For individual cell categories, the model performed best on "erythroid-lineage-cells" (0.971 recall, 0.935 precision) and worst on "monocyte-and-precursors" (0.825 recall, 0.337 precision). Conclusions We have created the largest and most comprehensive annotated bone marrow smear image dataset for deep neural network training. Compared with previous work, our approach is more practical for clinical application because it can take in an entire field of a smear and generate differential counts without any other preprocessing steps. Current results are highly encouraging. With continued expansion of the dataset, our model should become more precise and clinically useful. Disclosures Yeh: aether AI: Other: CEO and co-founder. Yang: aether AI: Employment. Tien: Novartis: Honoraria; Daiichi Sankyo: Honoraria; Celgene: Research Funding; Roche: Honoraria; Johnson & Johnson: Honoraria; Alexion: Honoraria; BMS: Honoraria; Roche: Research Funding; Celgene: Honoraria; Pfizer: Honoraria; Abbvie: Honoraria. Hsu: aether AI: Employment.
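
The segmentation and detection metrics reported above rest on mask intersection over union (IoU). Below is a minimal sketch of per-cell IoU, with the detection-matching rule summarized in comments; the 0.5 matching threshold is a common convention assumed here, not taken from the paper.

```python
import numpy as np

def mask_iou(pred: np.ndarray, target: np.ndarray) -> float:
    """Intersection over union between two boolean cell masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(pred, target).sum() / union)

# A detection counts as a true positive when its best IoU against a
# still-unmatched ground-truth cell exceeds a threshold (commonly 0.5);
# precision = TP / (TP + FP) and recall = TP / (TP + FN) then follow
# from that matching.
```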


BMC Genomics ◽  
2020 ◽  
Vol 21 (S11) ◽  
Author(s):  
Chen Li ◽  
Jiaxing Chen ◽  
Shuai Cheng Li

Abstract Background Horizontal Gene Transfer (HGT) refers to the sharing of genetic material between distant species that are not in a parent-offspring relationship. HGT insertion sites are important for understanding HGT mechanisms. Recent studies of the main agents of HGT, such as transposons and plasmids, demonstrate that insertion sites usually carry specific sequence features. This motivated us to find a method to infer HGT insertion sites from sequence features. Results In this paper, we propose a deep residual network, DeepHGT, to recognize HGT insertion sites. To train DeepHGT, we extracted about 1.55 million sequence segments as training instances from 262 metagenomic samples, with a ratio between positive and negative instances of about 1:1. These segments were randomly partitioned into three subsets: 80% as the training set, 10% as the validation set, and the remaining 10% as the test set. The training loss of DeepHGT was 0.4163 and the validation loss was 0.423. On the test set, DeepHGT achieved an area under the curve (AUC) of 0.8782. Furthermore, to evaluate the generalization of DeepHGT, we constructed an independent test set containing 689,312 sequence segments from another 147 gut metagenomic samples. DeepHGT achieved an AUC of 0.8428, approaching the previous test AUC. As a comparison, the gradient boosting classifier model implemented in PyFeat achieved AUC values of 0.694 and 0.686 on the above two test sets, respectively. DeepHGT could also learn discriminant sequence features; for example, it learned a sequence pattern of palindromic subsequences as a significant (P-value = 0.0182) local feature. Hence, DeepHGT is a reliable model for recognizing HGT insertion sites. Conclusion DeepHGT is the first deep learning model that can accurately recognize HGT insertion sites on genomes from sequence patterns alone.
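
DeepHGT is described as a deep residual network over sequence segments. The sketch below shows the generic 1D residual block such a model builds on, assuming one-hot DNA input (4 channels: A, C, G, T); the channel count is illustrative, not DeepHGT's actual configuration.

```python
import torch.nn as nn

class ResBlock1D(nn.Module):
    """Residual block over a 1D sequence representation, e.g. one-hot
    DNA projected to `channels` feature maps by an initial convolution."""
    def __init__(self, channels: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(channels), nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(channels),
        )
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(x + self.body(x))   # skip connection: x + F(x)
```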


2021 ◽  
Vol 10 (8) ◽  
pp. 1772
Author(s):  
Hyun-Doo Moon ◽  
Han-Gyeol Choi ◽  
Kyong-Joon Lee ◽  
Dong-Jun Choi ◽  
Hyun-Jin Yoo ◽  
...  

A weight-bearing whole-leg radiograph (WLR) is essential to assess lower limb alignment measures such as the weight-bearing line (WBL) ratio. The purpose of this study was to develop a deep learning (DL) model that predicts the WBL ratio from a standing knee anteroposterior (AP) radiograph alone. A total of 3997 paired knee AP radiographs and WLRs were used. The WBL ratio was used for labeling and for analysis of prediction accuracy. The WBL ratio was divided into seven categories (0, 0.1, 0.2, 0.3, 0.4, 0.5, and 0.6). After training, the performance of the DL model was evaluated. Final performance was evaluated on a test set of 386 subjects. The cumulative score (CS), the proportion of predictions within an error range of 0.1, was used as the performance measure, with the operating point set at the maximum CS in the validation set (95% CI, 0.924-0.970). On the test set, the mean absolute error was 0.054 (95% CI, 0.048-0.061) and the CS was 0.951 (95% CI, 0.924-0.970). The developed DL algorithm could predict the WBL ratio from a standing knee AP radiograph alone, with accuracy comparable to the degree to which a primary physician can assess alignment. It can serve as the basis for an automated lower limb alignment assessment tool that can be used easily and cost-effectively in primary clinics.
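
A worked example of the two reported metrics: the cumulative score (CS) within an error range of 0.1 is the fraction of predictions whose absolute error is at most 0.1, computed alongside the mean absolute error. The WBL-ratio values below are hypothetical.

```python
import numpy as np

def cumulative_score(y_true, y_pred, tol: float = 0.1) -> float:
    """Fraction of predictions whose absolute error is within `tol`."""
    err = np.abs(np.asarray(y_true) - np.asarray(y_pred))
    return float((err <= tol).mean())

# Hypothetical WBL-ratio predictions on four knees:
y_true = np.array([0.4, 0.5, 0.2, 0.6])
y_pred = np.array([0.45, 0.48, 0.35, 0.58])
print(cumulative_score(y_true, y_pred))   # 0.75: one error (0.15) exceeds 0.1
print(np.abs(y_true - y_pred).mean())     # mean absolute error: approx. 0.06
```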


2018 ◽  
Author(s):  
Karim Rajaei ◽  
Yalda Mohsenzadeh ◽  
Reza Ebrahimpour ◽  
Seyed-Mahdi Khaligh-Razavi

Abstract Core object recognition, the ability to rapidly recognize objects despite variations in their appearance, is largely solved through feedforward processing of visual information. Deep neural networks have been shown to achieve human-level performance on these tasks and to explain primate brain representations. On the other hand, object recognition under more challenging conditions (i.e., beyond the core recognition problem) is less well characterized. One such example is object recognition under occlusion. It is unclear to what extent feedforward and recurrent processes contribute to object recognition under occlusion. Furthermore, we do not know whether conventional deep neural networks, such as AlexNet, which have been shown to be successful in solving core object recognition, can perform similarly well on problems that go beyond core recognition. Here, we characterize the neural dynamics of object recognition under occlusion using magnetoencephalography (MEG) while participants were presented with images of objects at various levels of occlusion. We provide evidence from multivariate analysis of MEG data, behavioral data, and computational modelling, demonstrating an essential role for recurrent processes in object recognition under occlusion. Furthermore, the computational model with local recurrent connections used here suggests a mechanistic explanation of how the human brain might solve this problem. Author Summary In recent years, deep-learning-based computer vision algorithms have achieved human-level performance on several object recognition tasks. This has also contributed to our understanding of how the brain may solve these recognition tasks. However, object recognition under more challenging conditions, such as occlusion, is less well characterized, and the temporal dynamics of object recognition under occlusion are largely unknown in the human brain. Furthermore, we do not know whether previously successful deep-learning algorithms can similarly achieve human-level performance on these more challenging object recognition tasks. By linking brain data with behavior and computational modeling, we characterized the temporal dynamics of object recognition under occlusion and proposed a computational mechanism that explains both the behavioral and the neural data in humans. This provides a plausible mechanistic explanation for how the brain might solve object recognition under more challenging conditions.
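
One common form of the multivariate MEG analysis mentioned above is time-resolved decoding, where a classifier is trained at each time point on the sensor pattern across trials. The sketch below uses scikit-learn; the data layout and the classifier choice are assumptions, not the authors' exact pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def decode_over_time(X: np.ndarray, y: np.ndarray, cv: int = 5) -> np.ndarray:
    """Time-resolved decoding of MEG data.

    X: array of shape (trials, sensors, timepoints); y: condition labels.
    Returns the mean cross-validated accuracy at each time point."""
    n_time = X.shape[2]
    accuracy = np.zeros(n_time)
    for t in range(n_time):
        clf = LogisticRegression(max_iter=1000)
        accuracy[t] = cross_val_score(clf, X[:, :, t], y, cv=cv).mean()
    return accuracy
```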


2018 ◽  
pp. 1-8 ◽  
Author(s):  
Okyaz Eminaga ◽  
Nurettin Eminaga ◽  
Axel Semjonow ◽  
Bernhard Breil

Purpose The recognition of cystoscopic findings remains challenging for young colleagues and depends on the examiner's skills. Computer-aided diagnosis tools using feature extraction and deep learning show promise as instruments for diagnostic classification. Materials and Methods Our study considered 479 patient cases representing 44 urologic findings. Image color was linearly normalized and equalized by applying contrast-limited adaptive histogram equalization. Because these findings can be viewed via cystoscopy from every possible angle and side, we generated images rotated in 10-degree increments and flipped vertically or horizontally, which resulted in 18,681 images. After image preprocessing, we developed deep convolutional neural network (CNN) models (ResNet50, VGG-19, VGG-16, InceptionV3, and Xception) and evaluated them using F1 scores. Furthermore, we proposed two CNN concepts: 90%-previous-layer filter size and harmonic-series filter size. A training set (60%), a validation set (10%), and a test set (30%) were randomly generated from the study dataset. All models were trained on the training set, validated on the validation set, and evaluated on the test set. Results The Xception-based model achieved the highest F1 score (99.52%), followed by models based on ResNet50 (99.48%) and the harmonic-series concept (99.45%). All images with cancer lesions were correctly identified by these models. Focusing on the images misclassified by the best-performing model, 7.86% of images showing bladder stones with an indwelling catheter and 1.43% of images showing bladder diverticula were falsely classified. Conclusion The results of this study show the potential of deep learning for the diagnostic classification of cystoscopic images. Future work will focus on integrating artificial intelligence-aided cystoscopy into clinical routines and possibly expanding it to other clinical endoscopy applications.
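
A minimal sketch of the described augmentation (rotations in 10-degree increments plus vertical and horizontal flips) using Pillow; the file naming and output format are assumptions for illustration.

```python
from pathlib import Path
from PIL import Image

def augment(image_path: str, out_dir: str) -> None:
    """Rotate in 10-degree steps and flip, as described for the cystoscopy set."""
    img = Image.open(image_path)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    stem = Path(image_path).stem
    for angle in range(0, 360, 10):
        rotated = img.rotate(angle)
        rotated.save(out / f"{stem}_rot{angle}.png")
        rotated.transpose(Image.Transpose.FLIP_LEFT_RIGHT).save(
            out / f"{stem}_rot{angle}_h.png")
        rotated.transpose(Image.Transpose.FLIP_TOP_BOTTOM).save(
            out / f"{stem}_rot{angle}_v.png")
```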


2020 ◽  
Author(s):  
Wenzhong Liu

Abstract Fruit classification is conducive to improving self-checkout and packaging systems. Convolutional neural networks automatically extract features by directly processing original images, which has attracted extensive attention from researchers in fruit classification. However, because of the similarity of fruit colors, it is difficult to achieve high recognition accuracy. In the present study, a deep learning network, Interfruit, was built to classify various types of fruit images. A fruit dataset covering 40 categories was also constructed to train the network model and to assess its performance. According to the evaluation results, the overall accuracy of Interfruit reached 93.17% on the test set, which was superior to that of several advanced methods. These findings indicate that the classification system Interfruit recognizes fruits with high accuracy and has broad application prospects.
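
As a sketch of how overall test-set accuracy for a 40-class model like Interfruit might be measured, the snippet below evaluates top-1 accuracy over an image folder laid out one directory per class; the directory layout, image size, and batch size are assumptions, not details from the paper.

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

def overall_accuracy(model: torch.nn.Module, test_dir: str,
                     device: str = "cpu") -> float:
    """Top-1 accuracy over a test set laid out as one folder per class."""
    tf = transforms.Compose([transforms.Resize((224, 224)),
                             transforms.ToTensor()])
    loader = DataLoader(datasets.ImageFolder(test_dir, transform=tf),
                        batch_size=32)
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images.to(device)).argmax(dim=1)
            correct += (preds == labels.to(device)).sum().item()
            total += labels.size(0)
    return correct / total
```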

