Testing a Deep Learning Algorithm for Detection of Diabetic Retinopathy in a Spanish Diabetic Population and with MESSIDOR Database

Diagnostics ◽  
2021 ◽  
Vol 11 (8) ◽  
pp. 1385
Author(s):  
Marc Baget-Bernaldiz ◽  
Pedro Romero-Aroca ◽  
Esther Santos-Blanco ◽  
Raul Navarro-Gil ◽  
Aida Valls ◽  
...  

Background: The aim of the present study was to test our deep learning algorithm (DLA) in reading retinographies. Methods: We tested our DLA, built on convolutional neural networks, on 14,186 retinographies from our population and 1,200 images extracted from the MESSIDOR database. The retinal images were graded both by the DLA and independently by four retina specialists. Results of the DLA were assessed in terms of accuracy (ACC), sensitivity (S), specificity (SP), positive predictive value (PPV), negative predictive value (NPV), and area under the receiver operating characteristic curve (AUC), distinguishing between identification of any type of DR (any DR) and referable DR (RDR). Results: When testing the DLA for identifying any DR in our population, the results were: ACC = 99.75, S = 97.92, SP = 99.91, PPV = 98.92, NPV = 99.82, and AUC = 0.983. When detecting RDR, the results were: ACC = 99.66, S = 96.7, SP = 99.92, PPV = 99.07, NPV = 99.71, and AUC = 0.988. When testing the DLA for identifying any DR with MESSIDOR, the results were: ACC = 94.79, S = 97.32, SP = 94.57, PPV = 60.93, NPV = 99.75, and AUC = 0.959. When detecting RDR, the results were: ACC = 98.78, S = 94.64, SP = 99.14, PPV = 90.54, NPV = 99.53, and AUC = 0.968. Conclusions: Our DLA performed well, both in detecting any DR and in classifying eyes with RDR, in a sample of retinographies from type 2 DM patients in our population and in the MESSIDOR database.
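The abstract's metrics follow standard definitions; as an illustrative sketch (not the authors' code), ACC, S, SP, PPV, NPV, and AUC for a binary DR task could be computed from labels and predicted probabilities like this:

```python
# Illustrative sketch (not the authors' implementation): computing the
# reported metrics from binary labels and predicted probabilities.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

def dr_metrics(y_true, y_prob, threshold=0.5):
    """Return ACC, S, SP, PPV, NPV (in %) and AUC for a binary DR task."""
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "ACC": 100 * (tp + tn) / (tp + tn + fp + fn),
        "S":   100 * tp / (tp + fn),           # sensitivity (recall)
        "SP":  100 * tn / (tn + fp),           # specificity
        "PPV": 100 * tp / (tp + fp),
        "NPV": 100 * tn / (tn + fn),
        "AUC": roc_auc_score(y_true, y_prob),  # threshold-free
    }
```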

Diagnostics ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 1127
Author(s):  
Ji Hyung Nam ◽  
Dong Jun Oh ◽  
Sumin Lee ◽  
Hyun Joo Song ◽  
Yun Jeong Lim

Capsule endoscopy (CE) quality control requires an objective scoring system to evaluate the preparation of the small bowel (SB). We propose a deep learning algorithm to calculate SB cleansing scores and verify the algorithm’s performance. A 5-point scoring system based on clarity of mucosal visualization was used to develop the deep learning algorithm (400,000 frames; 280,000 for training and 120,000 for testing). External validation was performed using additional CE cases (n = 50), and average cleansing scores (1.0 to 5.0) calculated using the algorithm were compared to clinical grades (A to C) assigned by clinicians. Test results obtained using 120,000 frames exhibited 93% accuracy. The separate CE case exhibited substantial agreement between the deep learning algorithm scores and clinicians’ assessments (Cohen’s kappa: 0.672). In the external validation, the cleansing score decreased with worsening clinical grade (scores of 3.9, 3.2, and 2.5 for grades A, B, and C, respectively, p < 0.001). Receiver operating characteristic curve analysis revealed that a cleansing score cut-off of 2.95 indicated clinically adequate preparation. This algorithm provides an objective and automated cleansing score for evaluating SB preparation for CE. The results of this study will serve as clinical evidence supporting the practical use of deep learning algorithms for evaluating SB preparation quality.
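As a hedged sketch of how a case-level cleansing score might be derived (the per-frame aggregation and function names are assumptions, not the published implementation), per-frame outputs on the 5-point scale can be averaged and compared against the reported 2.95 cut-off:

```python
# Illustrative sketch under stated assumptions: per-frame class probabilities
# (5-point scale) are averaged into a case-level cleansing score, then the
# reported 2.95 cut-off flags clinically adequate preparation.
import numpy as np

def case_cleansing_score(frame_probs):
    """frame_probs: (n_frames, 5) softmax outputs for scores 1..5."""
    frame_scores = np.asarray(frame_probs) @ np.arange(1, 6)  # expected score per frame
    return float(frame_scores.mean())                          # case-level average (1.0-5.0)

def is_adequate(score, cutoff=2.95):
    """Cut-off taken from the abstract's ROC analysis."""
    return score >= cutoff
```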


2021 ◽  
Author(s):  
Jae-Seung Yun ◽  
Jaesik Kim ◽  
Sang-Hyuk Jung ◽  
Seon-Ah Cha ◽  
Seung-Hyun Ko ◽  
...  

Objective: We aimed to develop and evaluate a non-invasive deep learning algorithm for screening type 2 diabetes in UK Biobank participants using retinal images. Research Design and Methods: The deep learning model for prediction of type 2 diabetes was trained on retinal images from 50,077 UK Biobank participants and tested on 12,185 participants. We evaluated its performance in terms of predicting traditional risk factors (TRFs) and genetic risk for diabetes. Next, we compared the performance of three models in predicting type 2 diabetes using 1) an image-only deep learning algorithm, 2) TRFs, and 3) the combination of the algorithm and TRFs. Assessing net reclassification improvement (NRI) allowed quantification of the improvement afforded by adding the algorithm to the TRF model. Results: When predicting TRFs with the deep learning algorithm, the areas under the curve (AUCs) obtained with the validation set for age, sex, and HbA1c status were 0.931 (0.928-0.934), 0.933 (0.929-0.936), and 0.734 (0.715-0.752), respectively. When predicting type 2 diabetes, the AUC of the composite logistic model using non-invasive TRFs was 0.810 (0.790-0.830), and that of the deep learning model using only fundus images was 0.731 (0.707-0.756). Upon addition of TRFs to the deep learning algorithm, discriminative performance improved to 0.844 (0.826-0.861). The addition of the algorithm to the TRF model improved risk stratification with an overall NRI of 50.8%. Conclusions: Our results demonstrate that this deep learning algorithm can be a useful tool for stratifying individuals at high risk of type 2 diabetes in the general population.
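One way to quantify the reported reclassification gain is the category-free (continuous) NRI; the sketch below is an assumption about the exact NRI variant used and is not the authors' code:

```python
# Illustrative sketch (assumption: continuous/category-free NRI) comparing a
# TRF-only model with a TRF + deep-learning-score model.
import numpy as np

def continuous_nri(y, p_old, p_new):
    """Category-free NRI: net fraction of events whose predicted risk rises
    plus net fraction of non-events whose predicted risk falls."""
    y, p_old, p_new = map(np.asarray, (y, p_old, p_new))
    up, down = p_new > p_old, p_new < p_old
    nri_events = up[y == 1].mean() - down[y == 1].mean()
    nri_nonevents = down[y == 0].mean() - up[y == 0].mean()
    return nri_events + nri_nonevents
```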


Author(s):  
Chuansheng Zheng ◽  
Xianbo Deng ◽  
Qiang Fu ◽  
Qiang Zhou ◽  
Jiapei Feng ◽  
...  

Abstract Accurate and rapid diagnosis of COVID-19 suspected cases plays a crucial role in timely quarantine and medical treatment. Developing a deep learning-based model for automatic COVID-19 detection on chest CT is helpful to counter the outbreak of SARS-CoV-2. A weakly-supervised deep learning-based software system was developed using 3D CT volumes to detect COVID-19. For each patient, the lung region was segmented using a pre-trained UNet; the segmented 3D lung region was then fed into a 3D deep neural network to predict the probability of COVID-19 infection. 499 CT volumes collected from Dec. 13, 2019, to Jan. 23, 2020, were used for training and 131 CT volumes collected from Jan. 24, 2020, to Feb. 6, 2020, were used for testing. The deep learning algorithm obtained 0.959 ROC AUC and 0.976 PR AUC. There was an operating point with 0.907 sensitivity and 0.911 specificity on the ROC curve. When using a probability threshold of 0.5 to classify COVID-positive and COVID-negative, the algorithm obtained an accuracy of 0.901, a positive predictive value of 0.840, and a very high negative predictive value of 0.982. The algorithm took only 1.93 seconds to process a single patient’s CT volume using a dedicated GPU. Our weakly-supervised deep learning model can accurately predict the probability of COVID-19 infection in chest CT volumes without the need for annotating lesions for training. The easily-trained and high-performance deep learning algorithm provides a fast way to identify COVID-19 patients, which is beneficial to control the outbreak of SARS-CoV-2. The developed deep learning software is available at https://github.com/sydney0zq/covid-19-detection.
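A minimal inference sketch of the described pipeline is shown below; `unet` and `classifier3d` are placeholder models (assumptions), not the released implementation at the linked repository:

```python
# Minimal inference sketch (assumptions: `unet` is a pre-trained lung
# segmentation model and `classifier3d` a 3D CNN returning a COVID-19 logit;
# neither is the authors' released code).
import torch

@torch.no_grad()
def predict_covid_probability(ct_volume, unet, classifier3d):
    """ct_volume: float tensor of shape (1, 1, D, H, W), already normalised."""
    lung_mask = (unet(ct_volume) > 0.5).float()   # segment the lung region
    lung_only = ct_volume * lung_mask             # mask out non-lung voxels
    logit = classifier3d(lung_only)               # 3D network on the masked volume
    return torch.sigmoid(logit).item()            # probability of COVID-19 infection
```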


2019 ◽  
Vol 37 (15_suppl) ◽  
pp. e16572-e16572
Author(s):  
Alexa Meyer ◽  
Nancy Stambler ◽  
Karl Sjöstrand ◽  
Jens Richter ◽  
Mohamad Allaf ◽  
...  

e16572 Background: Previous work has shown that the degree of expression of prostate-specific membrane antigen (PSMA) correlates with prostate cancer (PCa) grade and stage. We evaluated the additive value of a deep learning algorithm (PSMA-AI) applied to imaging with a PSMA-targeted small-molecule SPECT/CT agent (99mTc-MIP-1404) to identify men with low-risk PCa who are potential active surveillance candidates. Methods: A secondary analysis of a phase III trial (NCT02615067) of men with PCa who underwent 99mTc-MIP-1404 SPECT/CT was conducted. Patients with a biopsy Gleason score (GS) of ≤6, clinical stage ≤T2, and prostate-specific antigen (PSA) < 10 ng/mL who underwent radical prostatectomy (RP) following SPECT/CT were included in the present analysis. SPECT/CT images were retrospectively analyzed by PSMA-AI, which was developed and locked prior to analysis. PSMA-AI calculated the uptake of 99mTc-MIP-1404 relative to a background reference (TBR). An automated TBR of 14 was used as the threshold for PSMA-AI calls of positive disease. Multivariable logistic regression analysis was used to develop a base model for identifying men with occult GS ≥7 PCa in the RP specimen. This model included PSA density, % positive biopsy cores, and clinical stage. The diagnostic performance of this model was then compared to a second model that incorporated PSMA-AI calls. Results: In total, 87 patients enrolled in the original trial contributed to the analysis. The base model indicated that PSA density and % positive cores were significantly associated with occult GS ≥7 PCa (p < 0.05), but clinical stage was not (p = 0.23). The predictive ability of the model resulted in an area under the curve (AUC) of 0.73. Upon adding PSMA-AI calls, the AUC increased to 0.77. PSMA-AI calls (p = 0.045), pre-surgery PSA density (p = 0.019), and % positive cores (p < 0.004) remained statistically significant. PSMA-AI calls increased the positive predictive value from 70% to 77% and the negative predictive value from 57% to 74%. Conclusions: The addition of PSMA-AI calls demonstrated a significant improvement over known predictors for identifying men with occult GS ≥7 PCa, who are inappropriate candidates for active surveillance. Clinical trial information: NCT02615067.
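As an illustration only (column names and model setup are assumptions, not the trial's statistical code), the base and PSMA-AI-augmented logistic models could be compared by AUC as follows:

```python
# Illustrative sketch: base logistic model vs. the same model augmented with
# the binary PSMA-AI call, compared by AUC. Column names are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def compare_models(df: pd.DataFrame):
    """df columns (assumed): psa_density, pct_pos_cores, clinical_stage
    (numerically encoded), psma_ai_call (0/1), occult_gs7 (outcome, 0/1)."""
    y = df["occult_gs7"]
    base_X = df[["psa_density", "pct_pos_cores", "clinical_stage"]]
    full_X = df[["psa_density", "pct_pos_cores", "clinical_stage", "psma_ai_call"]]
    base = LogisticRegression(max_iter=1000).fit(base_X, y)
    full = LogisticRegression(max_iter=1000).fit(full_X, y)
    return (roc_auc_score(y, base.predict_proba(base_X)[:, 1]),
            roc_auc_score(y, full.predict_proba(full_X)[:, 1]))
```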


2020 ◽  
Vol 4 (12) ◽  
pp. 1197-1207
Author(s):  
Wanshan Ning ◽  
Shijun Lei ◽  
Jingjing Yang ◽  
Yukun Cao ◽  
Peiran Jiang ◽  
...  

Abstract Data from patients with coronavirus disease 2019 (COVID-19) are essential for guiding clinical decision making, for furthering the understanding of this viral disease, and for diagnostic modelling. Here, we describe an open resource containing data from 1,521 patients with pneumonia (including COVID-19 pneumonia) consisting of chest computed tomography (CT) images, 130 clinical features (from a range of biochemical and cellular analyses of blood and urine samples) and laboratory-confirmed severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) clinical status. We show the utility of the database for prediction of COVID-19 morbidity and mortality outcomes using a deep learning algorithm trained with data from 1,170 patients and 19,685 manually labelled CT slices. In an independent validation cohort of 351 patients, the algorithm discriminated between negative, mild and severe cases with areas under the receiver operating characteristic curve of 0.944, 0.860 and 0.884, respectively. The open database may have further uses in the diagnosis and management of patients with COVID-19.
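The class-wise AUCs reported for negative, mild, and severe cases can be reproduced, in principle, with one-vs-rest ROC analysis; the sketch below assumes a 0/1/2 label encoding and is not the authors' evaluation code:

```python
# Illustrative sketch: per-class (one-vs-rest) ROC AUCs for the negative /
# mild / severe outcome (assumed encoding: 0=negative, 1=mild, 2=severe).
import numpy as np
from sklearn.metrics import roc_auc_score

def per_class_auc(y_true, y_prob):
    """y_true: (n,) integer labels; y_prob: (n, 3) predicted class probabilities."""
    y_true, y_prob = np.asarray(y_true), np.asarray(y_prob)
    return {cls: roc_auc_score((y_true == idx).astype(int), y_prob[:, idx])
            for idx, cls in enumerate(["negative", "mild", "severe"])}
```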


PLoS ONE ◽  
2021 ◽  
Vol 16 (2) ◽  
pp. e0246472
Author(s):  
Eun Young Kim ◽  
Young Jae Kim ◽  
Won-Jun Choi ◽  
Gi Pyo Lee ◽  
Ye Ra Choi ◽  
...  

Purpose This study evaluated the performance of a commercially available deep-learning algorithm (DLA) (Insight CXR, Lunit, Seoul, South Korea) for detecting referable thoracic abnormalities on chest X-ray (CXR) using a consecutively collected multicenter health screening cohort. Methods and materials A consecutive health screening cohort of participants who underwent both CXR and chest computed tomography (CT) within 1 month was retrospectively collected from three institutions’ health care clinics (n = 5,887). Referable thoracic abnormalities were defined as any radiologic findings requiring further diagnostic evaluation or management, including the DLA-target lesions of nodule/mass, consolidation, or pneumothorax. We evaluated the diagnostic performance of the DLA for referable thoracic abnormalities using the area under the receiver operating characteristic (ROC) curve (AUC), sensitivity, and specificity, using ground truth based on chest CT (CT-GT). In addition, for CT-GT-positive cases, three radiologists independently read the CXRs, and abnormalities called by more than two radiologists (clear visible) or by at least one radiologist (visible) were defined as CXR-GTs (clear visible CXR-GT and visible CXR-GT, respectively) to evaluate the performance of the DLA. Results Among 5,887 subjects (4,329 males; mean age 54±11 years), referable thoracic abnormalities were found in 618 (10.5%) based on CT-GT. DLA-target lesions were observed in 223 (4.0%), nodule/mass in 202 (3.4%), consolidation in 31 (0.5%), pneumothorax in 1 (<0.1%), and DLA-non-target lesions in 409 (6.9%). For referable thoracic abnormalities based on CT-GT, the DLA showed an AUC of 0.771 (95% confidence interval [CI], 0.751–0.791), a sensitivity of 69.6%, and a specificity of 74.0%. Based on CXR-GT, the prevalence of referable thoracic abnormalities decreased, with visible and clear visible abnormalities found in 405 (6.9%) and 227 (3.9%) cases, respectively. The performance of the DLA increased significantly when using CXR-GTs, with an AUC of 0.839 (95% CI, 0.829–0.848), a sensitivity of 82.7%, and a specificity of 73.2% based on visible CXR-GT, and an AUC of 0.872 (95% CI, 0.863–0.880; P <0.001 for the AUC comparison of CT-GT vs. clear visible CXR-GT), a sensitivity of 83.3%, and a specificity of 78.8% based on clear visible CXR-GT. Conclusion The DLA provided fair-to-good stand-alone performance for the detection of referable thoracic abnormalities in a multicenter consecutive health screening cohort. The DLA showed varied performance according to the different methods of ground truth.
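As a hedged sketch of scoring the same DLA output against the three ground-truth definitions (the data layout, the zero-filling of reads for CT-negative cases, and the reader-count threshold are assumptions):

```python
# Illustrative sketch, not the study code: derive visible and clear visible
# CXR ground truths from three radiologist reads and score the same DLA
# output against each ground-truth definition.
import numpy as np
from sklearn.metrics import roc_auc_score

def evaluate_against_gts(dla_score, ct_gt, reads, clear_min_readers=2):
    """dla_score: (n,) DLA probabilities; ct_gt: (n,) 0/1 CT-based labels;
    reads: (n, 3) 0/1 radiologist calls (assumed zero for CT-negative cases)."""
    reads = np.asarray(reads)
    visible_gt = (reads.sum(axis=1) >= 1).astype(int)                  # at least one reader
    clear_visible_gt = (reads.sum(axis=1) >= clear_min_readers).astype(int)  # threshold is an assumption
    return {name: roc_auc_score(gt, dla_score)
            for name, gt in [("CT-GT", ct_gt),
                             ("visible CXR-GT", visible_gt),
                             ("clear visible CXR-GT", clear_visible_gt)]}
```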


2020 ◽  
Vol 41 (46) ◽  
pp. 4400-4411 ◽  
Author(s):  
Shen Lin ◽  
Zhigang Li ◽  
Bowen Fu ◽  
Sipeng Chen ◽  
Xi Li ◽  
...  

Abstract Aims Facial features have been associated with an increased risk of coronary artery disease (CAD). We developed and validated a deep learning algorithm for detecting CAD based on facial photos. Methods and results We conducted a multicentre cross-sectional study of patients undergoing coronary angiography or computed tomography angiography at nine Chinese sites to train and validate a deep convolutional neural network for the detection of CAD (at least one ≥50% stenosis) from patient facial photos. Between July 2017 and March 2019, 5796 patients from eight sites were consecutively enrolled and randomly divided into training (90%, n = 5216) and validation (10%, n = 580) groups for algorithm development. Between April 2019 and July 2019, 1013 patients from nine sites were enrolled in the test group for algorithm testing. Sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) were calculated using radiologist diagnosis as the reference standard. Using an operating cut point with high sensitivity, the CAD detection algorithm had a sensitivity of 0.80 and a specificity of 0.54 in the test group; the AUC was 0.730 (95% confidence interval, 0.699–0.761). The AUC for the algorithm was higher than that for the Diamond–Forrester model (0.730 vs. 0.623, P < 0.001) and the CAD consortium clinical score (0.730 vs. 0.652, P < 0.001). Conclusion Our results suggest that a deep learning algorithm based on facial photos can assist in CAD detection in this Chinese cohort. This technique may hold promise for pre-test CAD probability assessment in outpatient clinics or CAD screening in the community. Further studies to develop a clinically available tool are warranted.
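Selecting the high-sensitivity operating cut point described above can be sketched as follows (the target sensitivity of 0.80 mirrors the reported figure, but the selection procedure itself is an assumption):

```python
# Illustrative sketch: pick the operating threshold on a validation ROC curve
# that first reaches the target sensitivity, then report its specificity.
import numpy as np
from sklearn.metrics import roc_curve

def high_sensitivity_threshold(y_val, p_val, target_sensitivity=0.80):
    fpr, tpr, thresholds = roc_curve(y_val, p_val)
    idx = np.argmax(tpr >= target_sensitivity)        # first point meeting the target
    return thresholds[idx], tpr[idx], 1 - fpr[idx]    # threshold, sensitivity, specificity
```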


2021 ◽  
Vol 8 ◽  
Author(s):  
Yueye Wang ◽  
Danli Shi ◽  
Zachary Tan ◽  
Yong Niu ◽  
Yu Jiang ◽  
...  

Purpose: To assess the accuracy and efficacy of a semi-automated deep learning algorithm (DLA) assisted approach to detect vision-threatening diabetic retinopathy (DR). Methods: We developed a two-step semi-automated DLA-assisted approach to grade fundus photographs for vision-threatening referable DR. Study images were obtained from the Lingtou Cohort Study, and captured at participant enrollment in 2009–2010 (“baseline images”) and annual follow-up between 2011 and 2017. To begin, a validated DLA automatically graded baseline images for referable DR and classified them as positive, negative, or ungradable. Subsequently, each positive image, all other available images from patients who had a positive image, and a 5% random sample of all negative images were selected and regraded by trained human graders. A reference standard diagnosis was assigned once all graders achieved consistent grading outcomes or with a senior ophthalmologist's final diagnosis. The semi-automated DLA-assisted approach combined initial DLA screening and subsequent human grading for images identified as high-risk. This approach was further validated within the follow-up image datasets and its time and economic costs evaluated against fully human grading. Results: For evaluation of baseline images, a total of 33,115 images were included and automatically graded by the DLA. 2,604 images (480 positive results, 624 available other images from participants with a positive result, and 1,500 random negative samples) were selected and regraded by graders. The DLA achieved an area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and accuracy of 0.953, 0.970, 0.879, and 88.6%, respectively. In further validation within the follow-up image datasets, a total of 88,363 images were graded using this semi-automated approach and human grading was performed on 8,975 selected images. The DLA achieved an AUC, sensitivity, and specificity of 0.914, 0.852, and 0.853, respectively. Compared against fully human grading, the semi-automated DLA-assisted approach achieved an estimated 75.6% time and 90.1% economic cost saving. Conclusions: The DLA described in this study was able to achieve high accuracy, sensitivity, and specificity in grading fundus images for referable DR. Validated against long-term follow-up datasets, a semi-automated DLA-assisted approach was able to accurately identify suspect cases, and minimize misdiagnosis whilst balancing safety, time, and economic cost.
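The two-step triage can be sketched as below; the dataframe columns and function name are assumptions for illustration, not the study's code:

```python
# Illustrative sketch of the two-step triage: the DLA grades every image,
# then human graders review all positives, every other image from those
# participants, and a 5% random sample of negatives.
import pandas as pd

def select_for_human_grading(df: pd.DataFrame, neg_fraction=0.05, seed=0):
    """df columns (assumed): image_id, participant_id, dla_grade in
    {'positive', 'negative', 'ungradable'}."""
    positive = df[df.dla_grade == "positive"]
    same_participants = df[df.participant_id.isin(positive.participant_id)]
    negatives = df[df.dla_grade == "negative"]
    neg_sample = negatives.sample(frac=neg_fraction, random_state=seed)
    return pd.concat([same_participants, neg_sample]).drop_duplicates("image_id")
```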


Author(s):  
Supreeth P. Shashikumar ◽  
Gabriel Wardi ◽  
Paulina Paul ◽  
Morgan Carlile ◽  
...  

ABSTRACT IMPORTANCE: Objective and early identification of hospitalized patients, and particularly those with novel coronavirus disease 2019 (COVID-19), who may require mechanical ventilation is of great importance and may aid in delivering timely treatment. OBJECTIVE: To develop, externally validate, and prospectively test a transparent deep learning algorithm for predicting, 24 hours in advance, the need for mechanical ventilation in hospitalized patients and those with COVID-19. DESIGN: Observational cohort study. SETTING: Two academic medical centers from January 01, 2016 to December 31, 2019 (retrospective cohorts) and February 10, 2020 to May 4, 2020 (prospective cohorts). PARTICIPANTS: Over 31,000 admissions to the intensive care units (ICUs) at two hospitals. Additionally, 777 patients with COVID-19 were used for prospective validation. Patients who were placed on mechanical ventilation within four hours of their admission were excluded. MAIN OUTCOME(S) AND MEASURE(S): Electronic health record (EHR) data were extracted on an hourly basis, and a set of 40 features were calculated and passed to an interpretable deep learning algorithm to predict the future need for mechanical ventilation 24 hours in advance. Additionally, commonly used clinical criteria (based on heart rate, oxygen saturation, respiratory rate, FiO2, and pH) were used to assess the future need for mechanical ventilation. Performance of the algorithms was evaluated using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and positive predictive value. RESULTS: After applying exclusion criteria, the external validation cohort included 3,888 general ICU patients and 402 COVID-19 patients. The performance of the model (AUC) with a 24-hour prediction horizon at the validation site was 0.882 for the general ICU population and 0.918 for patients with COVID-19. In comparison, commonly used clinical criteria and the ROX score achieved AUCs in the range of 0.773–0.782 and 0.768–0.810 for the general ICU population and patients with COVID-19, respectively. CONCLUSIONS AND RELEVANCE: A generalizable and transparent deep learning algorithm improves on traditional clinical criteria in predicting the need for mechanical ventilation in hospitalized patients, including those with COVID-19. Such an algorithm may help clinicians optimize the timing of tracheal intubation, better allocate mechanical ventilation resources and staff, and improve patient care.
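A minimal sketch of framing the 24-hour-ahead prediction target from hourly EHR data is shown below (column names are assumptions; regular hourly sampling without gaps is also assumed):

```python
# Illustrative sketch, not the published pipeline: hourly EHR features with
# the outcome label shifted so each row predicts mechanical ventilation
# status 24 hours in the future.
import pandas as pd

def build_prediction_frame(hourly: pd.DataFrame, horizon_hours=24):
    """hourly: one row per patient-hour with columns 'patient_id', 'hour',
    'on_vent' (0/1) and ~40 feature columns; hours assumed gap-free."""
    hourly = hourly.sort_values(["patient_id", "hour"])
    # target = ventilation status `horizon_hours` ahead for the same patient
    hourly["target"] = (hourly.groupby("patient_id")["on_vent"]
                              .shift(-horizon_hours))
    return hourly.dropna(subset=["target"])
```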


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Ryoungwoo Jang ◽  
Jae Ho Choi ◽  
Namkug Kim ◽  
Jae Suk Chang ◽  
Pil Whan Yoon ◽  
...  

Abstract Despite being the gold standard for the diagnosis of osteoporosis, dual-energy X-ray absorptiometry (DXA) cannot be widely used as a screening tool for osteoporosis. This study aimed to predict osteoporosis from simple hip radiographs using a deep learning algorithm. A total of 1001 proximal femur DXA datasets, each matched with a same-side cropped simple hip radiograph, were collected from female patients aged ≥ 55 years. Of these, 504 patients had osteoporosis (T-score ≤ −2.5), and 497 patients did not. The 1001 images were randomly divided into three sets: 800 images for training, 100 images for validation, and 101 images for testing. Based on VGG16 equipped with a nonlocal neural network, we developed a deep neural network (DNN) model. We calculated the confusion matrix and evaluated the accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), and we drew the receiver operating characteristic (ROC) curve. A gradient-based class activation map (Grad-CAM) overlapping the original image was also used to visualize model performance. Additionally, we performed external validation using 117 datasets. Our final DNN model showed an overall accuracy of 81.2%, a sensitivity of 91.1%, and a specificity of 68.9%. The PPV was 78.5%, and the NPV was 86.1%. The area under the ROC curve was 0.867, indicating reasonable performance for screening osteoporosis by simple hip radiography. The external validation set confirmed the model's performance, with an overall accuracy of 71.8% and an AUC of 0.700. All Grad-CAM results from both internal and external validation sets appropriately matched the proximal femur cortex and trabecular patterns of the radiographs. The DNN model could be considered a useful screening tool for easy prediction of osteoporosis in a real-world clinical setting.
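A minimal model sketch is given below; it uses a plain torchvision VGG16 backbone with a binary head and omits the nonlocal block, so it is an assumption-laden illustration rather than the authors' architecture:

```python
# Minimal model sketch (not the authors' released code): a torchvision VGG16
# backbone with a binary osteoporosis head; the nonlocal block described in
# the abstract is omitted here for brevity.
import torch
import torch.nn as nn
from torchvision import models

class HipOsteoporosisNet(nn.Module):
    def __init__(self):
        super().__init__()
        # ImageNet initialisation of the backbone is an assumption
        self.backbone = models.vgg16(weights="IMAGENET1K_V1").features
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.head = nn.Linear(512, 1)   # probability that T-score <= -2.5

    def forward(self, x):               # x: (batch, 3, H, W) cropped hip radiographs
        feats = self.pool(self.backbone(x)).flatten(1)
        return torch.sigmoid(self.head(feats))
```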

