External Validation Study On The Value of Deep Learning Algorithm For The Prediction of Hematoma Expansion From Noncontrast CT Scans

Author(s):  
Dong Chuang Guo ◽  
Jun Gu ◽  
Jian He ◽  
Hai Rui Chu ◽  
Na Dong ◽  
...  

Abstract Background: Hematoma expansion is an independent predictor of patient outcome and mortality, and its early diagnosis is crucial for selecting clinical treatment options. This study aims to explore the value of a deep learning algorithm for predicting hematoma expansion from noncontrast computed tomography (NCCT) scans through external validation. Methods: NCCT images of 102 patients with hypertensive intracerebral hemorrhage (HICH) diagnosed in our hospital were retrospectively reviewed. The initial CT scan images were evaluated by commercial artificial intelligence (AI) software based on a deep learning algorithm and by radiologists, respectively, to predict hematoma expansion, and the corresponding sensitivity and specificity of the two groups were calculated and compared. Pair-wise comparisons were conducted among the gold-standard hematoma expansion diagnosis time, the AI software diagnosis time, and the doctors' reading time. Results: Among the 102 HICH patients, the sensitivity, specificity, and accuracy of predicting hematoma expansion in the AI group were higher than those in the doctor group (80.0% vs. 66.7%, 73.6% vs. 58.3%, and 75.5% vs. 60.8%, respectively), with statistically significant differences (p < 0.05). The AI diagnosis time (2.8 ± 0.3 s) and the doctors' diagnosis time (11.7 ± 0.3 s) were both significantly shorter than the gold-standard diagnosis time (14.5 ± 8.8 h) (p < 0.05), and the AI diagnosis time was significantly shorter than that of the doctors (p < 0.05). Conclusions: A deep learning algorithm could effectively predict hematoma expansion at an early stage from the initial CT scan images of HICH patients after onset, with high sensitivity and specificity and a greatly shortened diagnosis time, providing a new, accurate, easy-to-use, and fast method for the early prediction of hematoma expansion.
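The per-group sensitivity, specificity, and accuracy reported above follow directly from a 2×2 confusion matrix. A minimal sketch; the counts below are not given in the abstract, only reconstructed so that they reproduce the reported AI-group percentages among 102 patients:

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                # true-positive rate
    specificity = tn / (tn + fp)                # true-negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # overall agreement
    return sensitivity, specificity, accuracy

# Illustrative counts consistent with the AI group (80.0% / 73.6% / 75.5%):
sens, spec, acc = diagnostic_metrics(tp=24, fn=6, tn=53, fp=19)
```

The same function applied to the doctor group's counts would yield the 66.7% / 58.3% / 60.8% figures quoted for comparison.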

Diagnostics ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 1127
Author(s):  
Ji Hyung Nam ◽  
Dong Jun Oh ◽  
Sumin Lee ◽  
Hyun Joo Song ◽  
Yun Jeong Lim

Capsule endoscopy (CE) quality control requires an objective scoring system to evaluate the preparation of the small bowel (SB). We propose a deep learning algorithm to calculate SB cleansing scores and verify the algorithm’s performance. A 5-point scoring system based on clarity of mucosal visualization was used to develop the deep learning algorithm (400,000 frames; 280,000 for training and 120,000 for testing). External validation was performed using additional CE cases (n = 50), and average cleansing scores (1.0 to 5.0) calculated using the algorithm were compared to clinical grades (A to C) assigned by clinicians. Test results obtained using 120,000 frames exhibited 93% accuracy. The separate CE cases exhibited substantial agreement between the deep learning algorithm scores and clinicians’ assessments (Cohen’s kappa: 0.672). In the external validation, the cleansing score decreased with worsening clinical grade (scores of 3.9, 3.2, and 2.5 for grades A, B, and C, respectively, p < 0.001). Receiver operating characteristic curve analysis revealed that a cleansing score cut-off of 2.95 indicated clinically adequate preparation. This algorithm provides an objective and automated cleansing score for evaluating SB preparation for CE. The results of this study will serve as clinical evidence supporting the practical use of deep learning algorithms for evaluating SB preparation quality.
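The algorithm–clinician agreement above is summarized with Cohen's kappa, which corrects raw agreement for agreement expected by chance. A minimal stdlib-only sketch of the statistic; the label lists in the usage comment are hypothetical, not study data:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items.

    kappa = (p_observed - p_expected) / (1 - p_expected), where p_expected
    is the chance-agreement probability from the raters' marginal label
    frequencies.
    """
    n = len(rater_a)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    p_expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical grades: perfect agreement gives kappa = 1.0,
# chance-level agreement gives kappa = 0.0.
kappa = cohens_kappa(["A", "A", "B", "C"], ["A", "A", "B", "C"])
```

On the conventional Landis–Koch scale, the study's kappa of 0.672 falls in the "substantial agreement" band (0.61–0.80), which is what the abstract reports.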


Cancers ◽  
2021 ◽  
Vol 13 (4) ◽  
pp. 652 ◽  
Author(s):  
Carlo Augusto Mallio ◽  
Andrea Napolitano ◽  
Gennaro Castiello ◽  
Francesco Maria Giordano ◽  
Pasquale D'Alessio ◽  
...  

Background: Coronavirus disease 2019 (COVID-19) pneumonia and immune checkpoint inhibitor (ICI) therapy-related pneumonitis share common features. The aim of this study was to determine on chest computed tomography (CT) images whether a deep convolutional neural network algorithm is able to solve the challenge of differential diagnosis between COVID-19 pneumonia and ICI therapy-related pneumonitis. Methods: We enrolled three groups: a pneumonia-free group (n = 30), a COVID-19 group (n = 34), and a group of patients with ICI therapy-related pneumonitis (n = 21). Computed tomography images were analyzed with an artificial intelligence (AI) algorithm based on a deep convolutional neural network structure. Statistical analysis included the Mann–Whitney U test (significance threshold at p < 0.05) and the receiver operating characteristic curve (ROC curve). Results: The algorithm showed low specificity in distinguishing COVID-19 from ICI therapy-related pneumonitis (sensitivity 97.1%, specificity 14.3%, area under the curve (AUC) = 0.62). ICI therapy-related pneumonitis was identified by the AI when compared to pneumonia-free controls (sensitivity = 85.7%, specificity 100%, AUC = 0.97). Conclusions: The deep learning algorithm is not able to distinguish between COVID-19 pneumonia and ICI therapy-related pneumonitis. Awareness must be increased among clinicians about imaging similarities between COVID-19 and ICI therapy-related pneumonitis. ICI therapy-related pneumonitis can be applied as a challenge population for cross-validation to test the robustness of AI models used to analyze interstitial pneumonias of variable etiology.
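The AUC values above can be read through the Mann–Whitney relation that also underlies the study's U test: the AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case (counting ties as half). A minimal sketch with hypothetical scores:

```python
def auc_mann_whitney(pos_scores, neg_scores):
    """AUC via the Mann-Whitney relation: P(pos > neg) + 0.5 * P(pos == neg)."""
    wins = ties = 0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1
            elif p == q:
                ties += 1
    return (wins + 0.5 * ties) / (len(pos_scores) * len(neg_scores))

# Perfectly separated hypothetical scores give AUC = 1.0;
# indistinguishable scores give AUC = 0.5, i.e., chance level.
auc = auc_mann_whitney([0.9, 0.8], [0.1, 0.2])
```

By this reading, the study's AUC of 0.62 for COVID-19 versus ICI pneumonitis is close to chance, while 0.97 versus pneumonia-free controls indicates near-perfect separation.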


2021 ◽  
Vol 8 ◽  
Author(s):  
Castela Forte ◽  
Andrei Voinea ◽  
Malina Chichirau ◽  
Galiya Yeshmagambetova ◽  
Lea M. Albrecht ◽  
...  

Background: The inclusion of facial and bodily cues (clinical gestalt) in machine learning (ML) models improves the assessment of patients' health status, as shown in genetic syndromes and acute coronary syndrome. It is unknown whether the inclusion of clinical gestalt improves ML-based classification of acutely ill patients. As in previous research on ML analysis of medical images, simulated or augmented data may be used to assess the usability of clinical gestalt. Objective: To assess whether a deep learning algorithm trained on a dataset of simulated and augmented facial photographs reflecting acutely ill patients can distinguish between healthy and lipopolysaccharide (LPS)-infused, acutely ill individuals. Methods: Photographs from twenty-six volunteers whose facial features were manipulated to resemble a state of acute illness were used to extract features of illness and to generate a synthetic dataset of acutely ill photographs, using a neural transfer convolutional neural network (NT-CNN) for data augmentation. Four distinct CNNs were then trained on different parts of the facial photographs and concatenated into one final, stacked CNN that classified individuals as healthy or acutely ill. Finally, the stacked CNN was validated in an external dataset of volunteers injected with LPS. Results: In the external validation set, the four individual feature models distinguished acutely ill patients with sensitivities ranging from 10.5% (95% CI, 1.3–33.1%, for the skin model) to 89.4% (66.9–98.7%, for the nose model). Specificity ranged from 42.1% (20.3–66.5%) for the nose model to 94.7% (73.9–99.9%) for the skin model. The stacked model combining all four facial features achieved an area under the receiver operating characteristic curve (AUROC) of 0.67 (0.62–0.71) and distinguished acutely ill patients with a sensitivity of 100% (82.35–100.00%) and a specificity of 42.11% (20.25–66.50%). Conclusion: A deep learning algorithm trained on a synthetic, augmented dataset of facial photographs distinguished between healthy and simulated acutely ill individuals, demonstrating that synthetically generated data can be used to develop algorithms for health conditions in which large datasets are difficult to obtain. These results support the potential of facial feature analysis algorithms to support the diagnosis of acute illness.
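The asymmetric 82.35–100.00% interval quoted for the 100% sensitivity is the signature of an exact (Clopper–Pearson) confidence interval with zero observed failures, in which case the lower bound has a simple closed form. A sketch under the assumption of 19 LPS-infused positives, an inferred, illustrative count chosen only because it reproduces the reported interval:

```python
def exact_ci_all_detected(n, alpha=0.05):
    """Clopper-Pearson CI for a proportion of n successes out of n trials.

    With zero failures, the exact lower bound reduces to (alpha/2) ** (1/n)
    and the upper bound is 1.0.
    """
    return (alpha / 2) ** (1.0 / n), 1.0

# Assuming n = 19 positives, the lower bound comes out near 0.8235,
# matching the 82.35-100.00% interval quoted above.
lower, upper = exact_ci_all_detected(19)
```

The general Clopper–Pearson interval for k of n successes requires beta-distribution quantiles (e.g., `scipy.stats.beta.ppf`); only the all-detected boundary case shown here collapses to a closed form.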


2021 ◽  
Author(s):  
Donghwan Yun ◽  
Semin Cho ◽  
Yong Chul Kim ◽  
Dong Ki Kim ◽  
Kook-Hwan Oh ◽  
...  

BACKGROUND Precise prediction of contrast media-induced acute kidney injury (CIAKI) is an important issue because of its relationship with worse outcomes. OBJECTIVE Herein, we examined whether a deep learning algorithm could predict the risk of intravenous CIAKI better than other machine learning and logistic regression models in patients undergoing computed tomography. METHODS A total of 14,185 cases that received intravenous contrast media for computed tomography under the preventive and monitoring facility in Seoul National University Hospital were reviewed. CIAKI was defined as an increase in serum creatinine ≥0.3 mg/dl within 2 days and/or ≥50% within 7 days. Using both time-varying and time-invariant features, machine learning models, such as the recurrent neural network (RNN), light gradient boosting machine, extreme gradient boosting, random forest, decision tree, support vector machine, k-nearest neighbors, and logistic regression, were developed using a training set, and their performance was compared using the area under the receiver operating characteristic curve (AUROC) in a test set. RESULTS CIAKI developed in 261 cases (1.8%). The RNN model had the highest AUROC value of 0.755 (0.708–0.802) for predicting CIAKI, which was superior to those obtained from other machine learning models. When CIAKI was alternatively defined as an increase in serum creatinine ≥0.5 mg/dl and/or ≥25% within 3 days, the highest performance was again achieved by the RNN model, with an AUROC of 0.716 (0.664–0.768). In the feature ranking analysis, albumin level was the factor contributing most to RNN performance, followed by time-varying kidney function. CONCLUSIONS Application of a deep learning algorithm improves the predictability of intravenous CIAKI after computed tomography, representing a basis for future clinical alarming and preventive systems.
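The study's primary CIAKI label (serum creatinine rise ≥0.3 mg/dl within 2 days and/or ≥50% within 7 days) can be expressed as a simple rule over follow-up measurements. A minimal sketch; the `(day, creatinine)` data layout is a hypothetical convenience, not the study's actual schema:

```python
def is_ciaki(baseline_cr, followups):
    """Flag CIAKI per the study's primary definition.

    baseline_cr: pre-contrast serum creatinine in mg/dl.
    followups:   iterable of (days_after_contrast, creatinine_mg_dl) pairs.
    Positive if creatinine rises >= 0.3 mg/dl within 2 days and/or
    >= 50% over baseline within 7 days.
    """
    for day, cr in followups:
        if day <= 2 and cr - baseline_cr >= 0.3:
            return True   # absolute-rise criterion
        if day <= 7 and cr >= 1.5 * baseline_cr:
            return True   # relative-rise criterion
    return False
```

The sensitivity-analysis definition (≥0.5 mg/dl and/or ≥25% within 3 days) would be the same rule with the thresholds and windows swapped.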


2020 ◽  
pp. 2003061
Author(s):  
Ju Gang Nam ◽  
Minchul Kim ◽  
Jongchan Park ◽  
Eui Jin Hwang ◽  
Jong Hyuk Lee ◽  
...  

We aimed to develop a deep learning algorithm detecting 10 common abnormalities (DLAD-10) on chest radiographs and to evaluate its impact on diagnostic accuracy, timeliness of reporting, and workflow efficacy. DLAD-10 was trained with 146 717 radiographs from 108 053 patients using a ResNet34-based neural network with lesion-specific channels for 10 common radiologic abnormalities (pneumothorax, mediastinal widening, pneumoperitoneum, nodule/mass, consolidation, pleural effusion, linear atelectasis, fibrosis, calcification, and cardiomegaly). For external validation, the performance of DLAD-10 on a same-day CT-confirmed dataset (normal:abnormal, 53:147) and an open-source dataset (PadChest; normal:abnormal, 339:334) was compared to that of three radiologists. Separate simulated reading tests were conducted on another dataset adjusted to real-world disease prevalence in the emergency department, consisting of four critical, 52 urgent, and 146 non-urgent cases. Six radiologists participated in the simulated reading sessions with and without DLAD-10. DLAD-10 exhibited areas under the receiver-operating characteristic curves (AUROCs) of 0.895–1.00 in the CT-confirmed dataset and 0.913–0.997 in the PadChest dataset. DLAD-10 correctly classified significantly more critical abnormalities (95.0% [57/60]) than pooled radiologists (84.4% [152/180]; p=0.01). In simulated reading tests for emergency department patients, pooled readers detected significantly more critical (70.8% [17/24] versus 29.2% [7/24]; p=0.006) and urgent (82.7% [258/312] versus 78.2% [244/312]; p=0.04) abnormalities when aided by DLAD-10. DLAD-10 assistance shortened the mean time-to-report for critical and urgent radiographs (640.5±466.3 versus 3371.0±1352.5 s and 1840.3±1141.1 versus 2127.1±1468.2 s, respectively; p<0.01) and reduced the mean interpretation time (20.5±22.8 versus 23.5±23.7 s; p<0.001). DLAD-10 showed excellent performance, improving radiologists' performance and shortening the reporting time for critical and urgent cases.


2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Yoichiro Yamamoto ◽  
Toyonori Tsuzuki ◽  
Jun Akatsuka ◽  
Masao Ueki ◽  
Hiromu Morikawa ◽  
...  

Abstract Deep learning algorithms have been successfully used in medical image classification. In the next stage, the technology of acquiring explainable knowledge from medical images is highly desired. Here we show that a deep learning algorithm enables automated acquisition of explainable features from diagnostic annotation-free histopathology images. We compare the prediction accuracy of prostate cancer recurrence using our algorithm-generated features with that of diagnosis by expert pathologists using established criteria on 13,188 whole-mount pathology images consisting of over 86 billion image patches. Our method not only reveals findings established by humans but also features that have not been recognized, showing higher accuracy than humans in prognostic prediction. Combining both our algorithm-generated features and human-established criteria predicts recurrence more accurately than using either method alone. We confirm the robustness of our method using external validation datasets including 2276 pathology images. This study opens up fields of machine learning analysis for discovering uncharted knowledge.


Author(s):  
Sarah Eskreis-Winkler ◽  
Natsuko Onishi ◽  
Katja Pinker ◽  
Jeffrey S Reiner ◽  
Jennifer Kaplan ◽  
...  

Abstract Objective To investigate the feasibility of using deep learning to identify tumor-containing axial slices on breast MRI images. Methods This IRB-approved retrospective study included consecutive patients with operable invasive breast cancer undergoing pretreatment breast MRI between January 1, 2014, and December 31, 2017. Axial tumor-containing slices from the first postcontrast phase were extracted. Each axial image was subdivided into two subimages: one of the ipsilateral cancer-containing breast and one of the contralateral healthy breast. Cases were randomly divided into training, validation, and testing sets. A convolutional neural network was trained to classify subimages into “cancer” and “no cancer” categories. Accuracy, sensitivity, and specificity of the classification system were determined using pathology as the reference standard. A two-reader study was performed to measure the time savings of the deep learning algorithm using descriptive statistics. Results Two hundred and seventy-three patients with unilateral breast cancer met study criteria. On the held-out test set, accuracy of the deep learning system for tumor detection was 92.8% (648/706; 95% confidence interval: 89.7%–93.8%). Sensitivity and specificity were 89.5% and 94.3%, respectively. Readers spent 3 to 45 seconds to scroll to the tumor-containing slices without use of the deep learning algorithm. Conclusion In breast MR exams containing breast cancer, deep learning can be used to identify the tumor-containing slices. This technology may be integrated into the picture archiving and communication system to bypass scrolling when viewing stacked images, which can be helpful during nonsystematic image viewing, such as during interdisciplinary tumor board meetings.

