Deep Learning-Based Breast Cancer Diagnosis at Ultrasound: Initial Application of Weakly-Supervised Algorithm Without Image Annotation Original Research

Abstract Conventional deep learning (DL) algorithms require full supervision in the form of region of interest (ROI) annotation, which is laborious and often biased. We aimed to develop a weakly-supervised DL algorithm that diagnoses breast cancer at ultrasound (US) without image annotation. Weakly-supervised DL algorithms were implemented with three networks (VGG16, ResNet34, and GoogLeNet) and trained using 1000 unannotated US images (500 benign and 500 malignant masses). Two sets of 200 images (100 benign and 100 malignant masses) were used as the internal and external validation sets. For comparison with fully-supervised algorithms, ROI annotation was performed manually and automatically. Diagnostic performances were calculated as the area under the receiver operating characteristic curve (AUC). Using the class activation map, we determined how accurately the weakly-supervised DL algorithms localized the breast masses. For internal validation sets, the weakly-supervised DL algorithms achieved excellent diagnostic performances, with AUC values of 0.92–0.96, which were not statistically different (all Ps > 0.05) from those of fully-supervised DL algorithms with either manual or automated ROI annotation (AUC, 0.92–0.96). For external validation sets, the weakly-supervised DL algorithms achieved AUC values of 0.86–0.90, which were either not statistically different from (Ps > 0.05) or higher than (P = 0.04, VGG16 with automated ROI annotation) those of fully-supervised DL algorithms (AUC, 0.84–0.92). In internal and external validation sets, weakly-supervised algorithms could localize 100% of malignant masses, except for ResNet34 (98%). The weakly-supervised DL algorithms developed in the present study were feasible for US diagnosis of breast cancer, with good performance in both localization and differential diagnosis.
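The abstract reports diagnostic performance as AUC but does not say how it was computed. As a rough illustration only, the sketch below uses the rank-based definition of AUC: the probability that a randomly chosen malignant case receives a higher score than a randomly chosen benign case, with ties counted as one half. The function name `auc_from_scores` and the toy labels and scores are assumptions for this example, not data from the study.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for malignant, 0 for benign. Ties between scores count 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data for illustration only (not from the study).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(auc_from_scores(labels, scores))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for validation sets of a few hundred images; a sorting-based variant would be the usual choice for larger sets.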

Weakly-supervised deep learning for ultrasound diagnosis of breast cancer

Scientific Reports ◽

10.1038/s41598-021-03806-7 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Jaeil Kim ◽

Hye Jung Kim ◽

Chanho Kim ◽

Jin Hwa Lee ◽

Keum Won Kim ◽

...

Keyword(s):

Breast Cancer ◽

Deep Learning ◽

Operating Characteristic ◽

Image Annotation ◽

Characteristic Curve ◽

External Validation ◽

Region Of Interest ◽

Internal Validation ◽

Breast Masses ◽

Weakly Supervised

Abstract Conventional deep learning (DL) algorithms require full supervision in the form of region of interest (ROI) annotation, which is laborious and often biased. We aimed to develop a weakly-supervised DL algorithm that diagnoses breast cancer at ultrasound (US) without image annotation. Weakly-supervised DL algorithms were implemented with three networks (VGG16, ResNet34, and GoogLeNet) and trained using 1000 unannotated US images (500 benign and 500 malignant masses). Two sets of 200 images (100 benign and 100 malignant masses) were used as the internal and external validation sets. For comparison with fully-supervised algorithms, ROI annotation was performed manually and automatically. Diagnostic performances were calculated as the area under the receiver operating characteristic curve (AUC). Using the class activation map, we determined how accurately the weakly-supervised DL algorithms localized the breast masses. For internal validation sets, the weakly-supervised DL algorithms achieved excellent diagnostic performances, with AUC values of 0.92–0.96, which were not statistically different (all Ps > 0.05) from those of fully-supervised DL algorithms with either manual or automated ROI annotation (AUC, 0.92–0.96). For external validation sets, the weakly-supervised DL algorithms achieved AUC values of 0.86–0.90, which were either not statistically different from (Ps > 0.05) or higher than (P = 0.04, VGG16 with automated ROI annotation) those of fully-supervised DL algorithms (AUC, 0.84–0.92). In internal and external validation sets, weakly-supervised algorithms could localize 100% of malignant masses, except for ResNet34 (98%). The weakly-supervised DL algorithms developed in the present study were feasible for US diagnosis of breast cancer, with good performance in both localization and differential diagnosis.
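The abstract states that class activation maps were used to check localization, without giving the criterion. A class activation map is commonly formed as a weighted sum of the final convolutional feature maps, using the classifier weights of the class of interest. The sketch below illustrates that idea on tiny nested lists. The hit rule in `peak_inside_box` (the strongest activation must fall inside a reference box) is a hypothetical criterion chosen for this example, and all names and values are assumptions rather than the study's method.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each H x W) with the K weights of one class."""
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for fmap, weight in zip(feature_maps, class_weights):
        for r in range(height):
            for c in range(width):
                cam[r][c] += weight * fmap[r][c]
    return cam


def peak_location(cam):
    """(row, col) of the strongest activation in the map."""
    best = max((value, r, c) for r, row in enumerate(cam) for c, value in enumerate(row))
    return best[1], best[2]


def peak_inside_box(cam, box):
    """Hypothetical hit rule: box = (row_min, col_min, row_max, col_max), inclusive."""
    r, c = peak_location(cam)
    row_min, col_min, row_max, col_max = box
    return row_min <= r <= row_max and col_min <= c <= col_max


# Toy 3 x 3 feature maps and class weights (illustrative values only).
feature_maps = [
    [[0, 0, 0], [0, 1, 0], [0, 0, 0]],
    [[1, 0, 0], [0, 0, 0], [0, 0, 0]],
]
malignant_weights = [2.0, 0.5]
cam = class_activation_map(feature_maps, malignant_weights)
print(peak_location(cam), peak_inside_box(cam, (1, 1, 2, 2)))
```

In practice the map would be upsampled to the input image size before being compared with a lesion outline; that step is omitted here to keep the sketch short.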

Automated Breast Cancer Diagnosis Using Deep Learning and Region of Interest Detection (BC-DROID)

Proceedings of the 8th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics - ACM-BCB '17 ◽

10.1145/3107411.3107484 ◽

2017 ◽

Cited By ~ 13

Author(s):

Richard Platania ◽

Shayan Shams ◽

Seungwon Yang ◽

Jian Zhang ◽

Kisung Lee ◽

...

Keyword(s):

Breast Cancer ◽

Deep Learning ◽

Cancer Diagnosis ◽

Region Of Interest ◽

Breast Cancer Diagnosis ◽

Interest Detection

Deep Learning Model for Screening Sepsis Using Electrocardiography

10.21203/rs.3.rs-186976/v1 ◽

2021 ◽

Author(s):

Joon-myoung Kwon ◽

Ye Rang Lee ◽

Min-Seung Jung ◽

Yoon-Ji Lee ◽

Yong-Yeon Jo ◽

...

Keyword(s):

Septic Shock ◽

Deep Learning ◽

Characteristic Curve ◽

External Validation ◽

Medical Emergency ◽

Validation Dataset ◽

Internal Validation ◽

Significant Difference ◽

Life Threatening ◽

Sepsis And Septic Shock

Abstract Background: Sepsis is a life-threatening organ dysfunction and a major healthcare burden worldwide. Although sepsis is a medical emergency that requires immediate management, it is difficult to screen for its occurrence. In this study, we propose a deep learning-based model (DLM) for screening sepsis using electrocardiography (ECG). Methods: This retrospective cohort study included 46,017 patients who were admitted to two hospitals. A total of 1,548 and 639 patients had sepsis and septic shock, respectively. The DLM was developed using 73,727 ECGs of 18,142 patients, and internal validation was conducted using 7,774 ECGs of 7,774 patients. Furthermore, we conducted an external validation with 20,101 ECGs of 20,101 patients from another hospital to verify the applicability of the DLM across centers. Results: During the internal and external validation, the area under the receiver operating characteristic curve (AUC) of the DLM using 12-lead ECG for screening sepsis was 0.901 (95% confidence interval 0.882–0.920) and 0.863 (0.846–0.879), respectively. During internal and external validation, the AUC of the DLM for detecting septic shock was 0.906 (95% CI = 0.877–0.936) and 0.899 (95% CI = 0.872–0.925), respectively. The AUC of the DLM for detecting sepsis using 6-lead and single-lead ECGs was 0.845–0.882. A sensitivity map showed that the QRS complex and T wave were associated with sepsis. Subgroup analysis was conducted using ECGs from 4,609 patients who were admitted with infectious disease, and the AUC of the DLM for predicting in-hospital mortality was 0.817 (0.793–0.840). There was a significant difference in the prediction score of the DLM using ECG according to the presence of infection in the validation dataset (0.277 vs 0.574, p < 0.001), including severe acute respiratory syndrome coronavirus 2 (0.260 vs 0.725, p = 0.018). Conclusions: The DLM demonstrated reasonable performance for screening sepsis using 12-, 6-, and single-lead ECG. The results suggest that sepsis can be screened using not only conventional ECG devices, but also diverse life-type ECG machines employing the DLM, thereby preventing irreversible disease progression and mortality.

A new deep learning algorithm of 12-lead electrocardiogram for identifying atrial fibrillation during sinus rhythm

Scientific Reports ◽

10.1038/s41598-021-92172-5 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Yong-Soo Baek ◽

Sang-Chul Lee ◽

Wonik Choi ◽

Dae-Hyeok Kim

Keyword(s):

Atrial Fibrillation ◽

Deep Learning ◽

Sinus Rhythm ◽

Learning Algorithm ◽

Normal Sinus Rhythm ◽

Characteristic Curve ◽

External Validation ◽

Digital Data ◽

Internal Validation ◽

Optimal Interval

Abstract Atrial fibrillation (AF) is the most prevalent arrhythmia and is associated with increased morbidity and mortality. Its early detection is challenging because of the low detection yield of conventional methods. We aimed to develop a deep learning-based algorithm to identify AF during normal sinus rhythm (NSR) using 12-lead electrocardiogram (ECG) findings. We developed a new deep neural network to detect subtle differences in paroxysmal AF (PAF) during NSR using digital data from standard 12-lead ECGs. Raw digital data of 2,412 12-lead ECGs were analyzed. The artificial intelligence (AI) model showed that the optimal interval to detect subtle changes in PAF was within 0.24 s before the QRS complex in the 12-lead ECG. We allocated the enrolled ECGs to the training, internal validation, and testing datasets in a 7:1:2 ratio. Regarding AF identification, the AI-based algorithm showed the following values in the internal and external validation datasets: area under the receiver operating characteristic curve, 0.79 and 0.75; recall, 82% and 77%; specificity, 78% and 72%; F1 score, 75% and 74%; and overall accuracy, 72.8% and 71.2%, respectively. The deep learning-based algorithm using 12-lead ECG demonstrated high accuracy for detecting AF during NSR.
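The 7:1:2 allocation mentioned in the abstract can be illustrated with a short shuffle-and-slice sketch. The abstract does not say whether the split was made per ECG or per patient, nor how rounding was handled, so the per-item split, the integer rounding, the seed, and the function name `split_7_1_2` below are all assumptions.

```python
import random


def split_7_1_2(items, seed=0):
    """Shuffle items and slice them into training, validation, and test parts (7:1:2)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    n = len(shuffled)
    n_train = n * 7 // 10   # integer arithmetic avoids floating-point rounding surprises
    n_val = n // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test part
    return train, val, test


train, val, test = split_7_1_2(range(2412))
print(len(train), len(val), len(test))
```

With 2,412 items this rounding scheme yields parts of 1,688, 241, and 483, which need not match the counts actually used in the paper. If several ECGs come from the same patient, splitting by patient identifier rather than by ECG is the usual way to avoid leakage between parts.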

Predicting Sex from Retinal Fundus Photographs Using Automated Deep Learning

10.21203/rs.3.rs-402433/v1 ◽

2021 ◽

Author(s):

Edward Korot ◽

Nikolas Pontikos ◽

Xiaoxuan Liu ◽

Siegfried K Wagner ◽

Livia Faes ◽

...

Keyword(s):

Deep Learning ◽

Characteristic Curve ◽

External Validation ◽

Model Development ◽

Model Performance ◽

Validation Dataset ◽

Internal Validation ◽

Fundus Photographs ◽

The UK ◽

Retinal Fundus

Abstract Deep learning may transform health care, but model development has largely been dependent on availability of advanced technical expertise. Herein we present the development of a deep learning model by clinicians without coding, which predicts reported sex from retinal fundus photographs. A model was trained on 84,743 retinal fundus photos from the UK Biobank dataset. External validation was performed on 252 fundus photos from a tertiary ophthalmic referral center. For internal validation, the area under the receiver operating characteristic curve (AUROC) of the code-free deep learning (CFDL) model was 0.93. Sensitivity, specificity, positive predictive value (PPV) and accuracy (ACC) were 88.8%, 83.6%, 87.3% and 86.5%, respectively, for internal validation and 83.9%, 72.2%, 78.2% and 78.6%, respectively, for external validation. Clinicians are currently unaware of distinct retinal feature variations between males and females, highlighting the importance of model explainability for this task. The model performed significantly worse when foveal pathology was present in the external validation dataset (ACC 69.4%, compared to 85.4% in healthy eyes; OR (95% CI): 0.36 (0.19, 0.70), p = 0.0022), suggesting the fovea is a salient region for model performance. Automated machine learning (AutoML) may enable clinician-driven automated discovery of novel insights and disease biomarkers.
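Sensitivity, specificity, PPV, and accuracy all follow from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the counts passed in are made-up values for illustration and are not taken from the study, and the function name `diagnostic_metrics` is an assumption.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),          # true positive rate (recall)
        "specificity": tn / (tn + fp),          # true negative rate
        "ppv": tp / (tp + fp),                  # positive predictive value (precision)
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Illustrative counts only (not from the study).
metrics = diagnostic_metrics(tp=80, fp=20, tn=70, fn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

PPV depends on the prevalence of the positive class in the evaluated sample, so it is not directly comparable between validation sets with different class balance.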

Deep-learning model for screening sepsis using electrocardiography

Scandinavian Journal of Trauma Resuscitation and Emergency Medicine ◽

10.1186/s13049-021-00953-8 ◽

2021 ◽

Vol 29 (1) ◽

Author(s):

Joon-myoung Kwon ◽

Ye Rang Lee ◽

Min-Seung Jung ◽

Yoon-Ji Lee ◽

Yong-Yeon Jo ◽

...

Keyword(s):

Septic Shock ◽

Deep Learning ◽

Confidence Interval ◽

Characteristic Curve ◽

External Validation ◽

Medical Emergency ◽

Validation Dataset ◽

Internal Validation ◽

Significant Difference ◽

Sepsis And Septic Shock

Abstract Background: Sepsis is a life-threatening organ dysfunction and a major healthcare burden worldwide. Although sepsis is a medical emergency that requires immediate management, screening for the occurrence of sepsis is difficult. Herein, we propose a deep learning-based model (DLM) for screening sepsis using electrocardiography (ECG). Methods: This retrospective cohort study included 46,017 patients who were admitted to two hospitals. A total of 1,548 and 639 patients had sepsis and septic shock, respectively. The DLM was developed using 73,727 ECGs from 18,142 patients, and internal validation was conducted using 7,774 ECGs from 7,774 patients. Furthermore, we conducted an external validation with 20,101 ECGs from 20,101 patients from another hospital to verify the applicability of the DLM across centers. Results: During the internal and external validations, the area under the receiver operating characteristic curve (AUC) of the DLM using 12-lead ECG was 0.901 (95% confidence interval (CI), 0.882–0.920) and 0.863 (0.846–0.879), respectively, for screening sepsis and 0.906 (95% CI, 0.877–0.936) and 0.899 (95% CI, 0.872–0.925), respectively, for detecting septic shock. The AUC of the DLM for detecting sepsis using 6-lead and single-lead ECGs was 0.845–0.882. A sensitivity map revealed that the QRS complex and T waves were associated with sepsis. Subgroup analysis was conducted using ECGs from 4,609 patients who were admitted with an infectious disease, and the AUC of the DLM for predicting in-hospital mortality was 0.817 (0.793–0.840). There was a significant difference in the prediction score of the DLM using ECG according to the presence of infection in the validation dataset (0.277 vs. 0.574, p < 0.001), including severe acute respiratory syndrome coronavirus 2 (0.260 vs. 0.725, p = 0.018). Conclusions: The DLM delivered reasonable performance for sepsis screening using 12-, 6-, and single-lead ECGs. The results suggest that sepsis can be screened using not only conventional ECG devices but also diverse life-type ECG machines employing the DLM, thereby preventing irreversible disease progression and mortality.
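The abstract reports each AUC with a 95% confidence interval but does not state how the intervals were obtained. One common option is a percentile bootstrap, sketched below under that assumption. The helper names, the number of resamples, the seed, and the toy data are all illustrative and do not come from the study.

```python
import random


def auc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs ordered correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC; labels must be 0 or 1."""
    n = len(labels)
    if not 0 < sum(labels) < n:
        raise ValueError("both classes must be present")
    rng = random.Random(seed)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data for illustration only.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.6, 0.4, 0.5, 0.3, 0.2, 0.1]
print(auc(labels, scores), bootstrap_auc_ci(labels, scores, n_boot=500))
```

Analytic alternatives such as the DeLong method are also widely used for AUC intervals; the bootstrap is shown here only because it is short to write out.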

Deep Learning Model for Detection of Hypoalbuminemia Using Electrocardiography

10.20944/preprints202101.0408.v1 ◽

2021 ◽

Author(s):

Joon-myoung Kwon ◽

Soo Youn Lee ◽

Yoon-Ji Lee ◽

Yong-Yeon Jo ◽

Min-Seung Jung ◽

...

Keyword(s):

Deep Learning ◽

Characteristic Curve ◽

External Validation ◽

Patient Data ◽

Albumin Concentration ◽

Historical Cohort ◽

Internal Validation ◽

Patient Deterioration ◽

Risk Patients ◽

Deep Learning Model

Background: Albumin has a pivotal role in the homeostasis of osmotic pressure and is associated with cardiovascular, nephrotic, hepatic, and nutritional diseases. The detection of hypoalbuminemia is a cornerstone for diagnosis of hidden diseases and patient deterioration. We developed and validated a deep-learning-based model (DLM) for detection of hypoalbuminemia using electrocardiography (ECG). Methods: This historical cohort study included data from consecutive patients from two hospitals. The patient data in one hospital were divided into development (82,499 ECGs from 54,248 patients) and internal validation (20,664 ECGs from 20,664 patients) datasets, whereas the patient data in the other hospital were included in only an external validation (37,421 ECGs from 37,421 patients) dataset. A DLM was developed using a 12-lead ECG signal, age, and sex from the development dataset. The endpoint was hypoalbuminemia, defined by serum albumin concentration below 3.5 g/dL. Results: During the internal and external validations, the areas under the receiver operating characteristic curve of the DLM for the detection of hypoalbuminemia were 0.887 (0.877–0.897) and 0.888 (0.880–0.896), respectively. Among the 27,400 individuals without hypoalbuminemia at the initial laboratory exam, those identified by the DLM as higher-risk patients had a significantly higher chance of developing hypoalbuminemia than those in the low-risk group (7.09% vs. 1.01%, p < 0.001) during 24 months. The sensitivity map showed that the DLM focused on the T wave and QRS complex for the detection of hypoalbuminemia. Conclusions: The DLM exhibited a high accuracy for hypoalbuminemia detection and prediction using 12-, 6-, and single-lead ECGs.

Predicting sex from retinal fundus photographs using automated deep learning

Scientific Reports ◽

10.1038/s41598-021-89743-x ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Edward Korot ◽

Nikolas Pontikos ◽

Xiaoxuan Liu ◽

Siegfried K. Wagner ◽

Livia Faes ◽

...

Keyword(s):

Deep Learning ◽

Characteristic Curve ◽

External Validation ◽

Model Development ◽

Model Performance ◽

Validation Dataset ◽

Internal Validation ◽

Fundus Photographs ◽

The UK ◽

Retinal Fundus

Abstract Deep learning may transform health care, but model development has largely been dependent on availability of advanced technical expertise. Herein we present the development of a deep learning model by clinicians without coding, which predicts reported sex from retinal fundus photographs. A model was trained on 84,743 retinal fundus photos from the UK Biobank dataset. External validation was performed on 252 fundus photos from a tertiary ophthalmic referral center. For internal validation, the area under the receiver operating characteristic curve (AUROC) of the code-free deep learning (CFDL) model was 0.93. Sensitivity, specificity, positive predictive value (PPV) and accuracy (ACC) were 88.8%, 83.6%, 87.3% and 86.5%, respectively, for internal validation and 83.9%, 72.2%, 78.2% and 78.6%, respectively, for external validation. Clinicians are currently unaware of distinct retinal feature variations between males and females, highlighting the importance of model explainability for this task. The model performed significantly worse when foveal pathology was present in the external validation dataset (ACC 69.4%, compared to 85.4% in healthy eyes; OR (95% CI): 0.36 (0.19, 0.70), p = 0.0022), suggesting the fovea is a salient region for model performance. Automated machine learning (AutoML) may enable clinician-driven automated discovery of novel insights and disease biomarkers.
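The abstract reports an odds ratio with a 95% confidence interval for correct prediction in eyes with versus without foveal pathology, but not the method behind it. The sketch below shows one textbook approach for a 2 x 2 table, using the log odds ratio and its standard error (the Woolf method). Whether the authors used this method is unknown, and the counts below are invented for illustration.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf-type confidence interval for a 2 x 2 table.

    a, b: correct / incorrect predictions in group 1 (e.g. pathology present)
    c, d: correct / incorrect predictions in group 2 (e.g. healthy)
    z:    normal quantile, 1.96 for an approximate 95% interval
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this method")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Invented counts for illustration only (not the study's table).
print(odds_ratio_ci(a=20, b=10, c=40, d=5))
```

An odds ratio below 1 with an interval that excludes 1 indicates lower odds of a correct prediction in the first group, which is how the reported 0.36 (0.19, 0.70) reads.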

Development and Verification of a Deep Learning Algorithm to Evaluate Small-Bowel Preparation Quality

Diagnostics ◽

10.3390/diagnostics11061127 ◽

2021 ◽

Vol 11 (6) ◽

pp. 1127

Author(s):

Ji Hyung Nam ◽

Dong Jun Oh ◽

Sumin Lee ◽

Hyun Joo Song ◽

Yun Jeong Lim

Keyword(s):

Deep Learning ◽

Small Bowel ◽

Scoring System ◽

Operating Characteristic ◽

Clinical Evidence ◽

Learning Algorithm ◽

Characteristic Curve ◽

External Validation ◽

Test Results ◽

Deep Learning Algorithm

Capsule endoscopy (CE) quality control requires an objective scoring system to evaluate the preparation of the small bowel (SB). We propose a deep learning algorithm to calculate SB cleansing scores and verify the algorithm’s performance. A 5-point scoring system based on clarity of mucosal visualization was used to develop the deep learning algorithm (400,000 frames; 280,000 for training and 120,000 for testing). External validation was performed using additional CE cases (n = 50), and average cleansing scores (1.0 to 5.0) calculated using the algorithm were compared to clinical grades (A to C) assigned by clinicians. Test results obtained using 120,000 frames exhibited 93% accuracy. The separate CE cases exhibited substantial agreement between the deep learning algorithm scores and clinicians’ assessments (Cohen’s kappa: 0.672). In the external validation, the cleansing score decreased with worsening clinical grade (scores of 3.9, 3.2, and 2.5 for grades A, B, and C, respectively, p < 0.001). Receiver operating characteristic curve analysis revealed that a cleansing score cut-off of 2.95 indicated clinically adequate preparation. This algorithm provides an objective and automated cleansing score for evaluating SB preparation for CE. The results of this study will serve as clinical evidence supporting the practical use of deep learning algorithms for evaluating SB preparation quality.
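Cohen's kappa measures agreement between two raters after correcting for the agreement expected by chance. The sketch below implements the unweighted form for categorical labels. The abstract does not say how continuous cleansing scores were mapped onto the clinicians' A to C grades, or whether a weighted kappa was used, so the pre-graded toy labels and the unweighted form are assumptions.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length sequences of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of the two raters' marginal proportions per category.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Toy grades for illustration only.
algorithm_grades = ["A", "A", "B", "B", "C", "C"]
clinician_grades = ["A", "A", "B", "C", "C", "C"]
print(cohens_kappa(algorithm_grades, clinician_grades))
```

Because the grades A to C are ordered, a weighted kappa that penalizes an A versus C disagreement more than an A versus B one would also be a reasonable choice.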

Development and External Validation of Deep-Learning-Based Tumor Grading Models in Soft-Tissue Sarcoma Patients Using MR Imaging

Cancers ◽

10.3390/cancers13122866 ◽

2021 ◽

Vol 13 (12) ◽

pp. 2866

Author(s):

Fernando Navarro ◽

Hendrik Dapper ◽

Rebecca Asadpour ◽

Carolin Knebel ◽

Matthew B. Spraker ◽

...

Keyword(s):

Deep Learning ◽

Soft Tissue ◽

MR Imaging ◽

Characteristic Curve ◽

External Validation ◽

Soft Tissue Sarcomas ◽

Treatment Decision ◽

Receiver Operator Characteristic Curve ◽

Low Grade ◽

Tumor Grading

Background: In patients with soft-tissue sarcomas (STS), tumor grading is a decisive factor in determining the best treatment. Tumor grading is obtained by pathological work-up after focal biopsies. Deep learning (DL)-based imaging analysis may offer an alternative way to characterize STS tissue. In this work, we sought to non-invasively differentiate tumor grading into low-grade (G1) and high-grade (G2/G3) STS using DL techniques based on MR imaging. Methods: Contrast-enhanced T1-weighted fat-saturated (T1FSGd) MRI sequences and fat-saturated T2-weighted (T2FS) sequences were collected from two independent retrospective cohorts (training: 148 patients, testing: 158 patients). Tumor grading was determined following the French Federation of Cancer Centers Sarcoma Group in pre-therapeutic biopsies. DL models were developed using transfer learning based on the DenseNet 161 architecture. Results: The T1FSGd and T2FS-based DL models achieved area under the receiver operator characteristic curve (AUC) values of 0.75 and 0.76 on the test cohort, respectively. T1FSGd achieved the best F1-score of all models (0.90). The T2FS-based DL model was able to significantly risk-stratify for overall survival. Attention maps revealed relevant features within the tumor volume and in border regions. Conclusions: MRI-based DL models are capable of predicting tumor grading with good reproducibility in external validation.
