A scalable physician-level deep learning algorithm detects universal trauma on pelvic radiographs

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Chi-Tung Cheng ◽  
Yirui Wang ◽  
Huan-Wu Chen ◽  
Po-Meng Hsiao ◽  
Chun-Nan Yeh ◽  
...  

Abstract: Pelvic radiographs (PXRs) are essential for detecting proximal femur and pelvis injuries in trauma patients and are a key component of the trauma survey. None of the currently available algorithms can accurately detect all kinds of trauma-related radiographic findings on PXRs. Here, we show that a universal algorithm can detect most types of trauma-related radiographic findings on PXRs. We develop a multiscale deep learning algorithm called PelviXNet, trained with 5204 PXRs with weakly supervised point annotation. PelviXNet yields an area under the receiver operating characteristic curve (AUROC) of 0.973 (95% CI, 0.960–0.983) and an area under the precision-recall curve (AUPRC) of 0.963 (95% CI, 0.948–0.974) in the clinical population test set of 1888 PXRs. The accuracy, sensitivity, and specificity at the cutoff value are 0.924 (95% CI, 0.912–0.936), 0.908 (95% CI, 0.885–0.908), and 0.932 (95% CI, 0.919–0.946), respectively. PelviXNet demonstrates performance comparable with that of radiologists and orthopedic surgeons in detecting pelvic and hip fractures.
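The headline metrics above, an AUROC with a percentile-bootstrap 95% CI, can be sketched in plain Python: the AUROC equals the Mann-Whitney probability that a randomly chosen positive case outscores a randomly chosen negative one. The `labels`/`scores` arrays in any usage are hypothetical stand-ins, not PelviXNet outputs.

```python
import random

def auroc(labels, scores):
    """Mann-Whitney U estimate of the ROC AUC: the probability that a
    randomly chosen positive scores higher than a random negative
    (ties counted as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUROC, resampling cases with
    replacement and discarding resamples that lose a class."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        ss = [scores[i] for i in idx]
        if 0 < sum(ys) < n:  # need both classes present
            stats.append(auroc(ys, ss))
    stats.sort()
    lo = stats[int(alpha / 2 * len(stats))]
    hi = stats[int((1 - alpha / 2) * len(stats)) - 1]
    return lo, hi
```

The same bootstrap-over-cases recipe applies to the AUPRC and the accuracy/sensitivity/specificity CIs quoted in the abstract.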

Diagnostics ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 1127
Author(s):  
Ji Hyung Nam ◽  
Dong Jun Oh ◽  
Sumin Lee ◽  
Hyun Joo Song ◽  
Yun Jeong Lim

Capsule endoscopy (CE) quality control requires an objective scoring system to evaluate the preparation of the small bowel (SB). We propose a deep learning algorithm to calculate SB cleansing scores and verify the algorithm's performance. A 5-point scoring system based on clarity of mucosal visualization was used to develop the deep learning algorithm (400,000 frames; 280,000 for training and 120,000 for testing). External validation was performed using additional CE cases (n = 50), and average cleansing scores (1.0 to 5.0) calculated by the algorithm were compared to clinical grades (A to C) assigned by clinicians. Test results obtained using the 120,000 frames exhibited 93% accuracy. The separate CE cases showed substantial agreement between the deep learning algorithm's scores and the clinicians' assessments (Cohen's kappa: 0.672). In the external validation, the cleansing score decreased with worsening clinical grade (scores of 3.9, 3.2, and 2.5 for grades A, B, and C, respectively; p < 0.001). Receiver operating characteristic curve analysis revealed that a cleansing score cut-off of 2.95 indicated clinically adequate preparation. This algorithm provides an objective and automated cleansing score for evaluating SB preparation for CE. The results of this study will serve as clinical evidence supporting the practical use of deep learning algorithms for evaluating SB preparation quality.
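The agreement statistic quoted above (Cohen's kappa of 0.672) is the standard chance-corrected agreement between two raters assigning categorical labels to the same items. A minimal sketch, with hypothetical ratings rather than the study's grades:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: (observed - expected) / (1 - expected),
    where 'expected' is the agreement two independent raters with these
    marginal label frequencies would reach by chance."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    cats = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] / n * freq_b[c] / n for c in cats)
    return (observed - expected) / (1 - expected)
```

Values around 0.61–0.80 are conventionally read as "substantial" agreement, which is how the abstract characterizes 0.672.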


Author(s):  
Chuansheng Zheng ◽  
Xianbo Deng ◽  
Qiang Fu ◽  
Qiang Zhou ◽  
Jiapei Feng ◽  
...  

Abstract: Accurate and rapid diagnosis of COVID-19 suspected cases plays a crucial role in timely quarantine and medical treatment. Developing a deep learning-based model for automatic COVID-19 detection on chest CT is helpful to counter the outbreak of SARS-CoV-2. A weakly supervised deep learning-based software system was developed using 3D CT volumes to detect COVID-19. For each patient, the lung region was segmented using a pre-trained UNet; the segmented 3D lung region was then fed into a 3D deep neural network to predict the probability of COVID-19 infection. 499 CT volumes collected from Dec. 13, 2019, to Jan. 23, 2020, were used for training, and 131 CT volumes collected from Jan. 24, 2020, to Feb. 6, 2020, were used for testing. The deep learning algorithm obtained an ROC AUC of 0.959 and a PR AUC of 0.976. The ROC curve contained an operating point with 0.907 sensitivity and 0.911 specificity. When using a probability threshold of 0.5 to classify COVID-positive and COVID-negative, the algorithm obtained an accuracy of 0.901, a positive predictive value of 0.840, and a very high negative predictive value of 0.982. The algorithm took only 1.93 seconds to process a single patient's CT volume using a dedicated GPU. Our weakly supervised deep learning model can accurately predict the COVID-19 infection probability in chest CT volumes without the need to annotate lesions for training. The easily trained, high-performance deep learning algorithm provides a fast way to identify COVID-19 patients, which is beneficial for controlling the outbreak of SARS-CoV-2. The developed deep learning software is available at https://github.com/sydney0zq/covid-19-detection.
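The PPV and NPV reported at the 0.5 threshold are not independent numbers: given sensitivity, specificity, and the prevalence of COVID-19 in the test set, they follow from Bayes' rule. A small sketch (the prevalence passed in is illustrative, not the study's actual test-set prevalence):

```python
def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV from test characteristics and disease prevalence.
    Works on expected fractions of the population in each confusion cell."""
    tp = sensitivity * prevalence              # true positives
    fp = (1 - specificity) * (1 - prevalence)  # false positives
    tn = specificity * (1 - prevalence)        # true negatives
    fn = (1 - sensitivity) * prevalence        # false negatives
    return tp / (tp + fp), tn / (tn + fn)
```

This is why the abstract's NPV (0.982) can exceed its PPV (0.840) at the same threshold: NPV rises as prevalence falls, even with fixed sensitivity and specificity.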


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Liding Yao ◽  
Xiaojun Guan ◽  
Xiaowei Song ◽  
Yanbin Tan ◽  
Chun Wang ◽  
...  

Abstract: Rib fracture detection is time-consuming and demanding work for radiologists. This study aimed to introduce a novel deep learning-based rib fracture detection system that can help radiologists diagnose rib fractures on chest computed tomography (CT) images conveniently and accurately. A total of 1707 patients from a single center were included in this study. We developed the rib fracture detection system on chest CT using a three-step algorithm. According to examination time, 1507, 100, and 100 patients were allocated to the training set, the validation set, and the testing set, respectively. Free-response ROC (FROC) analysis was performed to evaluate the sensitivity and false positives of the deep learning algorithm. Precision, recall, F1-score, negative predictive value (NPV), and diagnosis time were selected as evaluation metrics to compare the diagnostic efficiency of this system with that of radiologists. The radiologist-only study was used as a benchmark, and the radiologist-model collaboration study was evaluated to assess the model's clinical applicability. A total of 50,170,399 blocks (fracture blocks, 91,574; normal blocks, 50,078,825) were labelled for training. The F1-score of the Rib Fracture Detection System was 0.890, and its precision, recall, and NPV were 0.869, 0.913, and 0.969, respectively. By interacting with this detection system, the F1-scores of the junior and the experienced radiologists improved from 0.796 to 0.925 and from 0.889 to 0.970, respectively; their recall scores increased from 0.693 to 0.920 and from 0.853 to 0.972, respectively. On average, radiologists' diagnosis time was reduced by 65.3 s when assisted by the detection system.
The constructed Rib Fracture Detection System performs comparably with experienced radiologists and is readily able to automatically detect rib fractures in the clinical setting with high efficacy, which could reduce diagnosis time and radiologists' workload in clinical practice.
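Since the F1-score is the harmonic mean of precision and recall, the system's reported 0.890 can be checked directly against its quoted precision (0.869) and recall (0.913):

```python
def f1_score(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Consistency check against the reported system metrics:
# precision 0.869 and recall 0.913 give F1 of approximately 0.890.
system_f1 = f1_score(0.869, 0.913)
```

The same identity lets a reader sanity-check the radiologists' before/after F1 figures once their individual precision and recall values are known.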


2020 ◽  
pp. 1-17
Author(s):  
Yanhong Yang ◽  
Fleming Y.M. Lure ◽  
Hengyuan Miao ◽  
Ziqi Zhang ◽  
Stefan Jaeger ◽  
...  

Background: Accurate and rapid diagnosis of coronavirus disease (COVID-19) is crucial for timely quarantine and treatment. Purpose: In this study, a deep learning-based AI model using the ResUNet network was developed to evaluate the performance of radiologists, with and without AI assistance, in distinguishing COVID-19 pneumonia patients from those with other pulmonary infections on CT scans. Methods: For model development and validation, a total of 694 cases with 111,066 CT slices were retrospectively collected as training data and independent test data. Among them, 118 were confirmed COVID-19 pneumonia cases and 576 were cases of other pulmonary infections (e.g., tuberculosis, common pneumonia, and non-COVID-19 viral pneumonia). The cases were divided into training and testing datasets. The independent test was performed by evaluating and comparing the performance of three radiologists with different years of practice experience in distinguishing COVID-19 pneumonia cases with and without AI assistance. Results: Our final model achieved an overall test accuracy of 0.914 with an area under the receiver operating characteristic (ROC) curve (AUC) of 0.903; the sensitivity and specificity were 0.918 and 0.909, respectively. With AI assistance, the radiologists' performance in distinguishing COVID-19 from other pulmonary infections improved, yielding better average accuracy and sensitivity, from 0.941 to 0.951 and from 0.895 to 0.942, respectively, compared with radiologists working without AI assistance. Conclusion: The deep learning-based AI model developed in this study successfully improved radiologists' performance in distinguishing COVID-19 from other pulmonary infections on chest CT images.


2020 ◽  
Vol 4 (12) ◽  
pp. 1197-1207
Author(s):  
Wanshan Ning ◽  
Shijun Lei ◽  
Jingjing Yang ◽  
Yukun Cao ◽  
Peiran Jiang ◽  
...  

Abstract: Data from patients with coronavirus disease 2019 (COVID-19) are essential for guiding clinical decision making, for furthering the understanding of this viral disease, and for diagnostic modelling. Here, we describe an open resource containing data from 1,521 patients with pneumonia (including COVID-19 pneumonia) consisting of chest computed tomography (CT) images, 130 clinical features (from a range of biochemical and cellular analyses of blood and urine samples) and laboratory-confirmed severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) clinical status. We show the utility of the database for prediction of COVID-19 morbidity and mortality outcomes using a deep learning algorithm trained with data from 1,170 patients and 19,685 manually labelled CT slices. In an independent validation cohort of 351 patients, the algorithm discriminated between negative, mild and severe cases with areas under the receiver operating characteristic curve of 0.944, 0.860 and 0.884, respectively. The open database may have further uses in the diagnosis and management of patients with COVID-19.


2020 ◽  
Vol 41 (46) ◽  
pp. 4400-4411 ◽  
Author(s):  
Shen Lin ◽  
Zhigang Li ◽  
Bowen Fu ◽  
Sipeng Chen ◽  
Xi Li ◽  
...  

Abstract: Aims: Facial features have been associated with an increased risk of coronary artery disease (CAD). We developed and validated a deep learning algorithm for detecting CAD based on facial photos. Methods and results: We conducted a multicentre cross-sectional study of patients undergoing coronary angiography or computed tomography angiography at nine Chinese sites to train and validate a deep convolutional neural network for the detection of CAD (at least one ≥50% stenosis) from patient facial photos. Between July 2017 and March 2019, 5796 patients from eight sites were consecutively enrolled and randomly divided into training (90%, n = 5216) and validation (10%, n = 580) groups for algorithm development. Between April 2019 and July 2019, 1013 patients from nine sites were enrolled in the test group for algorithm testing. Sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) were calculated using the radiologist diagnosis as the reference standard. Using an operating cut point with high sensitivity, the CAD detection algorithm had a sensitivity of 0.80 and a specificity of 0.54 in the test group; the AUC was 0.730 (95% confidence interval, 0.699–0.761). The AUC for the algorithm was higher than that for the Diamond–Forrester model (0.730 vs. 0.623, P < 0.001) and the CAD consortium clinical score (0.730 vs. 0.652, P < 0.001). Conclusion: Our results suggest that a deep learning algorithm based on facial photos can assist in CAD detection in this Chinese cohort. The technique may hold promise for pre-test CAD probability assessment in outpatient clinics or for CAD screening in the community. Further studies to develop a clinically applicable tool are warranted.
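The "operating cut point with high sensitivity" mentioned above corresponds to choosing a score threshold on the validation ROC. One common recipe, sketched here with hypothetical labels and scores (the study does not state its exact selection rule), is to maximize specificity subject to a sensitivity floor:

```python
def pick_threshold(labels, scores, min_sensitivity=0.80):
    """Scan candidate thresholds (the observed scores) and return the
    (threshold, specificity, sensitivity) triple that maximizes
    specificity subject to sensitivity >= min_sensitivity.
    A case is called positive when its score >= threshold."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    best = None
    for t in sorted(set(scores)):
        sens = sum(s >= t for s in pos) / len(pos)
        spec = sum(s < t for s in neg) / len(neg)
        if sens >= min_sensitivity and (best is None or spec > best[1]):
            best = (t, spec, sens)
    return best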


2021 ◽  
Vol 8 ◽  
Author(s):  
Yueye Wang ◽  
Danli Shi ◽  
Zachary Tan ◽  
Yong Niu ◽  
Yu Jiang ◽  
...  

Purpose: To assess the accuracy and efficacy of a semi-automated deep learning algorithm (DLA) assisted approach to detect vision-threatening diabetic retinopathy (DR).Methods: We developed a two-step semi-automated DLA-assisted approach to grade fundus photographs for vision-threatening referable DR. Study images were obtained from the Lingtou Cohort Study, and captured at participant enrollment in 2009–2010 (“baseline images”) and annual follow-up between 2011 and 2017. To begin, a validated DLA automatically graded baseline images for referable DR and classified them as positive, negative, or ungradable. Following, each positive image, all other available images from patients who had a positive image, and a 5% random sample of all negative images were selected and regraded by trained human graders. A reference standard diagnosis was assigned once all graders achieved consistent grading outcomes or with a senior ophthalmologist's final diagnosis. The semi-automated DLA assisted approach combined initial DLA screening and subsequent human grading for images identified as high-risk. This approach was further validated within the follow-up image datasets and its time and economic costs evaluated against fully human grading.Results: For evaluation of baseline images, a total of 33,115 images were included and automatically graded by the DLA. 2,604 images (480 positive results, 624 available other images from participants with a positive result, and 1500 random negative samples) were selected and regraded by graders. The DLA achieved an area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and accuracy of 0.953, 0.970, 0.879, and 88.6%, respectively. In further validation within the follow-up image datasets, a total of 88,363 images were graded using this semi-automated approach and human grading was performed on 8975 selected images. The DLA achieved an AUC, sensitivity, and specificity of 0.914, 0.852, 0.853, respectively. 
Compared against fully human grading, the semi-automated DLA-assisted approach achieved an estimated 75.6% time and 90.1% economic cost saving.Conclusions: The DLA described in this study was able to achieve high accuracy, sensitivity, and specificity in grading fundus images for referable DR. Validated against long-term follow-up datasets, a semi-automated DLA-assisted approach was able to accurately identify suspect cases, and minimize misdiagnosis whilst balancing safety, time, and economic cost.
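The time and cost savings stem from humans regrading only the DLA-flagged subset (8,975 of 88,363 follow-up images). A back-of-envelope sketch; the per-image grading times in any call are stated assumptions, not figures from the study:

```python
def grading_savings(total_images, human_graded_images,
                    human_secs_per_image, dla_secs_per_image):
    """Fraction of grading time saved by the semi-automated approach:
    the DLA screens every image, and humans regrade only the flagged
    subset, versus humans grading everything."""
    full_human = total_images * human_secs_per_image
    semi = (total_images * dla_secs_per_image
            + human_graded_images * human_secs_per_image)
    return 1 - semi / full_human
```

With zero DLA overhead this reduces to the fraction of images spared human review (about 90% here); the abstract's lower 75.6% time saving reflects the real overhead of running and supervising the DLA step.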


Author(s):  
Supreeth P. Shashikumar ◽  
Gabriel Wardi ◽  
Paulina Paul ◽  
Morgan Carlile ◽  
...  

Abstract: Importance: Objective and early identification of hospitalized patients, particularly those with novel coronavirus disease 2019 (COVID-19), who may require mechanical ventilation is of great importance and may aid in delivering timely treatment. Objective: To develop, externally validate, and prospectively test a transparent deep learning algorithm for predicting, 24 hours in advance, the need for mechanical ventilation in hospitalized patients, including those with COVID-19. Design: Observational cohort study. Setting: Two academic medical centers from January 1, 2016 to December 31, 2019 (retrospective cohorts) and February 10, 2020 to May 4, 2020 (prospective cohorts). Participants: Over 31,000 admissions to the intensive care units (ICUs) at two hospitals; additionally, 777 patients with COVID-19 were used for prospective validation. Patients who were placed on mechanical ventilation within four hours of admission were excluded. Main Outcomes and Measures: Electronic health record (EHR) data were extracted on an hourly basis, and a set of 40 features was calculated and passed to an interpretable deep learning algorithm to predict the need for mechanical ventilation 24 hours in advance. Additionally, commonly used clinical criteria (based on heart rate, oxygen saturation, respiratory rate, FiO2, and pH) were used to assess the future need for mechanical ventilation. Performance of the algorithms was evaluated using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and positive predictive value. Results: After applying exclusion criteria, the external validation cohort included 3,888 general ICU patients and 402 COVID-19 patients. The AUC of the model with a 24-hour prediction horizon at the validation site was 0.882 for the general ICU population and 0.918 for patients with COVID-19.
In comparison, commonly used clinical criteria and the ROX score achieved AUCs in the range of 0.773–0.782 and 0.768–0.810 for the general ICU population and patients with COVID-19, respectively. Conclusions and Relevance: A generalizable and transparent deep learning algorithm improves on traditional clinical criteria in predicting the need for mechanical ventilation in hospitalized patients, including those with COVID-19. Such an algorithm may help clinicians optimize the timing of tracheal intubation, better allocate mechanical ventilation resources and staff, and improve patient care.
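A 24-hour prediction horizon implies that each hourly timestep is labeled by whether intubation occurs within the following 24 hours. A minimal sketch of that labeling scheme; dropping timesteps at or after the event is an assumption of this sketch, not a restatement of the authors' exact protocol:

```python
def lead_time_labels(event_hours, n_hours, horizon=24):
    """For each hourly timestep t in [0, n_hours), label 1 if the event
    (e.g. intubation) occurs within the next `horizon` hours, else 0.
    Timesteps at or after the first event are excluded, since there is
    nothing left to predict once the event has happened."""
    labels = []
    first = min(event_hours) if event_hours else None
    for t in range(n_hours):
        if first is not None and t >= first:
            break  # no prediction targets after the event
        labels.append(1 if first is not None and t >= first - horizon else 0)
    return labels
```

For a patient intubated at hour 30, hours 6–29 become positive training targets and hours 0–5 negative ones, which is what lets the model raise an alarm a full day ahead.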


Author(s):  
Da-Wei Chang ◽  
Chin-Sheng Lin ◽  
Tien-Ping Tsao ◽  
Chia-Cheng Lee ◽  
Jiann-Torng Chen ◽  
...  

Although digoxin is important in heart rate control, its utilization is declining because of its narrow therapeutic window. Misdiagnosis or delayed diagnosis of digoxin toxicity is common, owing to a lack of awareness and the time-consuming laboratory work involved. Electrocardiography (ECG) may be able to detect potential digoxin toxicity based on characteristic presentations. Our study attempted to develop a deep learning model to detect digoxin toxicity based on ECG manifestations. This study included 61 ECGs from patients with digoxin toxicity and 177,066 ECGs from patients in the emergency room from November 2011 to February 2019. The deep learning algorithm was trained using approximately 80% of the ECGs; the remaining 20% were used to validate the performance of the artificial intelligence (AI) system and to conduct a human-machine competition. The area under the receiver operating characteristic curve (AUC), sensitivity, and specificity were used to compare ECG interpretation between humans and our deep learning system. The AUCs of our deep learning system for identifying digoxin toxicity were 0.912 and 0.929 in the validation cohort and the human-machine competition, respectively, with a sensitivity of 84.6% and a specificity of 94.6%. Interestingly, the deep learning system using only lead I (AUC = 0.960) performed no worse than the system using the complete 12 leads (AUC = 0.912). Stratified analysis showed that our deep learning system was more applicable to patients with heart failure (HF) and without atrial fibrillation (AF) than to those without HF or with AF. Our ECG-based deep learning system provides a high-accuracy, economical, rapid, and accessible way to detect digoxin toxicity, and could be applied as a promising decision-support system for diagnosing digoxin toxicity in clinical practice.


Diagnostics ◽  
2020 ◽  
Vol 10 (6) ◽  
pp. 430 ◽  
Author(s):  
Michael G. Endres ◽  
Florian Hillen ◽  
Marios Salloumis ◽  
Ahmad R. Sedaghat ◽  
Stefan M. Niehues ◽  
...  

Periapical radiolucencies, which can be detected on panoramic radiographs, are one of the most common radiographic findings in dentistry and have a differential diagnosis including infections, granulomas, cysts, and tumors. In this study, we investigate how well 24 oral and maxillofacial (OMF) surgeons assess the presence of periapical lucencies on panoramic radiographs, and we compare these findings to the performance of a predictive deep learning algorithm that we developed using a curated data set of 2902 de-identified panoramic radiographs. The mean diagnostic positive predictive value (PPV) of the OMF surgeons, based on their assessment of the panoramic radiographs, was 0.69 (±0.13), indicating that on average 31% of their positive diagnoses were false positives. The mean diagnostic true positive rate (TPR) was 0.51 (±0.14), indicating that on average 49% of all radiolucencies were missed. We demonstrate that the deep learning algorithm achieves better performance than 14 of the 24 OMF surgeons in the cohort, exhibiting an average precision of 0.60 (±0.04) and an F1 score of 0.58 (±0.04), corresponding to a PPV of 0.67 (±0.05) and a TPR of 0.51 (±0.05). The algorithm, trained on limited data and evaluated on clinically validated ground truth, has the potential to assist OMF surgeons in detecting periapical lucencies on panoramic radiographs.

