Deep learning to diagnose pouch of Douglas obliteration with ultrasound sliding sign

Author(s):  
Gabriel Maicas ◽  
Mathew Leonardi ◽  
Jodie Avery ◽  
Catrina Panuccio ◽  
Gustavo Carneiro ◽  
...  

Objectives: Pouch of Douglas (POD) obliteration is a severe consequence of inflammation in the pelvis, often seen in patients with endometriosis. The sliding sign is a dynamic transvaginal ultrasound (TVS) test that can diagnose POD obliteration. We aimed to develop a deep learning (DL) model to automatically classify the state of the POD using recorded videos depicting the sliding sign test. Methods: Expert sonologists performed, interpreted, and recorded videos of consecutive patients from September 2018 to April 2020. The sliding sign was classified as positive (i.e. normal) or negative (i.e. POD obliteration). A DL model based on a temporal residual network was prospectively trained with a dataset of TVS videos. The model was tested on an independent test set, and its diagnostic accuracy (area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, and positive and negative predictive value (PPV/NPV)) was compared to the reference standard sonologist classification (positive or negative sliding sign). Results: A positive sliding sign was depicted in 646/749 (86.2%) videos, whereas 103/749 (13.8%) videos depicted a negative sliding sign. The dataset was split into training (414 videos), validation (139), and testing (196) sets, maintaining similar positive/negative proportions. When applied to the test dataset using a threshold of 0.9, the model achieved an AUC of 96.5% (95% CI, 90.8-100.0%), an accuracy of 88.8% (95% CI, 83.5-92.8%), a sensitivity of 88.6% (95% CI, 83.0-92.9%), a specificity of 90.0% (95% CI, 68.3-98.8%), a PPV of 98.7% (95% CI, 95.4-99.7%), and an NPV of 47.7% (95% CI, 36.8-58.2%). Conclusions: We have developed an accurate DL model for the prediction of the TVS-based sliding sign classification.
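The abstract above reports a high specificity together with a low NPV, which can happen when one class is rare in the test set. A minimal sketch of how these metrics derive from confusion-matrix counts follows. The function name `diagnostic_metrics` and the counts are hypothetical: the counts sum to 196 and were chosen only to give values in a similar range to the abstract; they are not reported in the source.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Return standard diagnostic accuracy metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # true positives among all actual positives
        "specificity": tn / (tn + fp),   # true negatives among all actual negatives
        "ppv": tp / (tp + fp),           # correct among predicted positives
        "npv": tn / (tn + fn),           # correct among predicted negatives
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts (an assumption, not taken from the abstract).
# Here "positive" means a positive sliding sign, the majority class.
metrics = diagnostic_metrics(tp=156, fn=20, tn=18, fp=2)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With only 20 actual negatives in this illustration, the 20 false negatives outnumber the 18 true negatives, so the NPV falls below 50% even though the specificity is 90%.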

Life ◽  
2021 ◽  
Vol 11 (3) ◽  
pp. 200
Author(s):  
Yu-Hsuan Li ◽  
Wayne Huey-Herng Sheu ◽  
Chien-Chih Chou ◽  
Chun-Hsien Lin ◽  
Yuan-Shao Cheng ◽  
...  

Deep learning-based software has been developed to assist physicians with diagnosis; however, its clinical application is still under investigation. We integrated deep-learning-based software for diabetic retinopathy (DR) grading into the clinical workflow of an endocrinology department where endocrinologists grade retinal images, and evaluated the influence of its implementation. A total of 1432 images from 716 patients and 1400 images from 700 patients were collected before and after implementation, respectively. Using the grading by ophthalmologists as the reference standard, the sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) to detect referable DR (RDR) were 0.91 (0.87–0.96), 0.90 (0.87–0.92), and 0.90 (0.87–0.93) at the image level; and 0.91 (0.81–0.97), 0.84 (0.80–0.87), and 0.87 (0.83–0.91) at the patient level. The monthly RDR rate dropped from 55.1% to 43.0% after implementation. The monthly percentage of gradings finished within the allotted time increased from 66.8% to 77.6%. There was a wide range of agreement values between the software and endocrinologists after implementation (kappa values of 0.17–0.65). In conclusion, we observed the clinical influence of deep-learning-based software on graders without the retinal subspecialty. However, validation using images from local datasets is recommended before clinical implementation.


2020 ◽  
pp. bjophthalmol-2020-317825
Author(s):  
Yonghao Li ◽  
Weibo Feng ◽  
Xiujuan Zhao ◽  
Bingqian Liu ◽  
Yan Zhang ◽  
...  

Background/aims: To apply deep learning technology to develop an artificial intelligence (AI) system that can identify vision-threatening conditions in high myopia patients based on optical coherence tomography (OCT) macular images. Methods: In this cross-sectional, prospective study, a total of 5505 qualified OCT macular images obtained from 1048 high myopia patients admitted to Zhongshan Ophthalmic Centre (ZOC) from 2012 to 2017 were selected for the development of the AI system. The independent test dataset included 412 images obtained from 91 high myopia patients recruited at ZOC from January 2019 to May 2019. We adopted the InceptionResnetV2 architecture to train four independent convolutional neural network (CNN) models to identify the following four vision-threatening conditions in high myopia: retinoschisis, macular hole, retinal detachment and pathological myopic choroidal neovascularisation. Focal Loss was used to address class imbalance, and optimal operating thresholds were determined according to the Youden Index. Results: In the independent test dataset, the areas under the receiver operating characteristic curves were high for all conditions (0.961 to 0.999). Our AI system achieved sensitivities equal to or even better than those of retina specialists as well as high specificities (greater than 90%). Moreover, our AI system provided a transparent and interpretable diagnosis with heatmaps. Conclusions: We used OCT macular images for the development of CNN models to identify vision-threatening conditions in high myopia patients. Our models achieved reliable sensitivities and high specificities, comparable to those of retina specialists, and may be applied for large-scale high myopia screening and patient follow-up.
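The abstract above states that operating thresholds were chosen with the Youden Index, J = sensitivity + specificity - 1. A minimal sketch of that selection rule is given below. The helper `youden_threshold`, the toy scores and labels, and the convention that a sample is called positive when its score is at or above the threshold are all assumptions for illustration, not details from the study.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximising J = sensitivity + specificity - 1.

    A sample is called positive when score >= threshold.
    On ties in J, the lowest threshold is kept.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_threshold, best_j = None, -1.0
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
        j = tp / positives + tn / negatives - 1
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Toy model outputs (hypothetical), with 1 = condition present.
scores = [0.05, 0.20, 0.35, 0.40, 0.60, 0.70, 0.85, 0.95]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
threshold, j = youden_threshold(scores, labels)
print(f"threshold={threshold}, J={j}")
```

Each observed score is tried as a candidate cut-off, which is equivalent to walking along the empirical ROC curve and picking the point farthest above the chance diagonal.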


Diagnostics ◽  
2021 ◽  
Vol 11 (6) ◽  
pp. 1127
Author(s):  
Ji Hyung Nam ◽  
Dong Jun Oh ◽  
Sumin Lee ◽  
Hyun Joo Song ◽  
Yun Jeong Lim

Capsule endoscopy (CE) quality control requires an objective scoring system to evaluate the preparation of the small bowel (SB). We propose a deep learning algorithm to calculate SB cleansing scores and verify the algorithm’s performance. A 5-point scoring system based on clarity of mucosal visualization was used to develop the deep learning algorithm (400,000 frames; 280,000 for training and 120,000 for testing). External validation was performed using additional CE cases (n = 50), and average cleansing scores (1.0 to 5.0) calculated using the algorithm were compared to clinical grades (A to C) assigned by clinicians. Test results obtained using 120,000 frames exhibited 93% accuracy. The separate CE case exhibited substantial agreement between the deep learning algorithm scores and clinicians’ assessments (Cohen’s kappa: 0.672). In the external validation, the cleansing score decreased with worsening clinical grade (scores of 3.9, 3.2, and 2.5 for grades A, B, and C, respectively, p < 0.001). Receiver operating characteristic curve analysis revealed that a cleansing score cut-off of 2.95 indicated clinically adequate preparation. This algorithm provides an objective and automated cleansing score for evaluating SB preparation for CE. The results of this study will serve as clinical evidence supporting the practical use of deep learning algorithms for evaluating SB preparation quality.
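Agreement between the algorithm and clinicians is summarised above with Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. A minimal sketch of the unweighted statistic follows. The function name `cohens_kappa` and the two rating lists are hypothetical and do not reproduce the study data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    n = len(rater_a)
    # Proportion of items on which both raters chose the same category.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical ratings of 50 cases by an algorithm and a clinician.
algorithm = ["adequate"] * 25 + ["inadequate"] * 25
clinician = (["adequate"] * 20 + ["inadequate"] * 5
             + ["adequate"] * 10 + ["inadequate"] * 15)
kappa = cohens_kappa(algorithm, clinician)
print(f"kappa = {kappa:.3f}")
```

In this toy example the raters agree on 70% of cases, but 50% agreement would be expected by chance, giving a kappa of about 0.4.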


2020 ◽  
pp. 028418512097362
Author(s):  
Xiefeng Yang ◽  
Yu Lin ◽  
Zhen Xing ◽  
Dejun She ◽  
Yan Su ◽  
...  

Background: Isocitrate dehydrogenase (IDH)-mutant lower-grade gliomas (LGGs) are further classified into two classes: with and without 1p/19q codeletion. IDH-mutant and 1p/19q codeleted LGGs have better prognosis compared with IDH-mutant and 1p/19q non-codeleted LGGs. Purpose: To evaluate conventional magnetic resonance imaging (cMRI), diffusion-weighted imaging (DWI), susceptibility-weighted imaging (SWI), and dynamic susceptibility contrast perfusion-weighted imaging (DSC-PWI) for predicting 1p/19q codeletion status of IDH-mutant LGGs. Material and Methods: We retrospectively reviewed cMRI, DWI, SWI, and DSC-PWI in 142 cases of IDH-mutant LGGs with known 1p/19q codeletion status. Features of cMRI, relative ADC (rADC), intratumoral susceptibility signals (ITSSs), and the value of relative cerebral blood volume (rCBV) were compared between IDH-mutant LGGs with and without 1p/19q codeletion. Receiver operating characteristic curve and logistic regression were used to determine diagnostic performances. Results: IDH-mutant and 1p/19q non-codeleted LGGs tended to present with the T2/FLAIR mismatch sign and distinct borders (P < 0.001 and P = 0.038, respectively). Parameters of rADC, ITSSs, and rCBVmax were significantly different between the 1p/19q codeleted and 1p/19q non-codeleted groups (P < 0.001, P = 0.017, and P < 0.001, respectively). A combination of cMRI, SWI, DWI, and DSC-PWI for predicting 1p/19q codeletion status in IDH-mutant LGGs resulted in a sensitivity, specificity, positive predictive value, negative predictive value, and an AUC of 80.36%, 78.57%, 83.30%, 75.00%, and 0.88, respectively. Conclusion: 1p/19q codeletion status of IDH-mutant LGGs can be stratified using cMRI and advanced MRI techniques, including DWI, SWI, and DSC-PWI. A combination of cMRI, rADC, ITSSs, and rCBVmax may improve the diagnostic performance for predicting 1p/19q codeletion status.


Author(s):  
Michael Michail ◽  
Abdul-Rahman Ihdayhid ◽  
Andrea Comella ◽  
Udit Thakur ◽  
James D. Cameron ◽  
...  

Background: Coronary artery disease is common in patients with severe aortic stenosis. Computed tomography-derived fractional flow reserve (CT-FFR) is a clinically used modality for assessing coronary artery disease; however, its use has not been validated in patients with severe aortic stenosis. This study assesses the safety, feasibility, and validity of CT-FFR in patients with severe aortic stenosis. Methods: Prospectively recruited patients underwent standard-protocol invasive FFR and coronary CT angiography (CTA). CTA images were analyzed by a central core laboratory (HeartFlow, Inc) for independent evaluation of CT-FFR. CT-FFR data were compared with FFR (ischemia defined as FFR ≤0.80). Results: Forty-two patients (68 vessels) underwent FFR and CTA; 39 patients (92.3%) and 60 vessels (88.2%) had interpretable CTA enabling CT-FFR computation. Mean age was 76.2±6.7 years (71.8% male). No patients incurred complications relating to premedication, CTA, or FFR protocol. Mean FFR and CT-FFR were 0.83±0.10 and 0.77±0.14, respectively. CT calcium score was 1373.3±1392.9 Agatston units. On per vessel analysis, there was positive correlation between FFR and CT-FFR (Pearson correlation coefficient, R = 0.64, P < 0.0001). Sensitivity, specificity, positive predictive value, and negative predictive value were 73.9%, 78.4%, 68.0%, and 82.9%, respectively, with 76.7% diagnostic accuracy. The area under the receiver-operating characteristic curve for CT-FFR was 0.83 (0.72–0.93, P < 0.0001), which was higher than that of CTA and quantitative coronary angiography (P = 0.01 and P < 0.001, respectively). Bland-Altman plot showed mean bias between FFR and CT-FFR as 0.059±0.110. On per patient analysis, the sensitivity, specificity, positive predictive value, and negative predictive value were 76.5%, 77.3%, 72.2%, and 81.0%, with 76.9% diagnostic accuracy. The per patient area under the receiver-operating characteristic curve was 0.81 (0.67–0.95, P < 0.0001). Conclusions: CT-FFR is safe and feasible in patients with severe aortic stenosis. Our data suggest that the diagnostic accuracy of CT-FFR in this cohort potentially enables its use in clinical practice and provides the foundation for future research into the use of CT-FFR for coronary evaluation pre-aortic valve replacement.
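The Bland-Altman result quoted above summarises agreement between two measurement methods as the mean of their paired differences (the bias) and its standard deviation. A minimal sketch of that calculation follows. The function name `bland_altman`, the sign convention (reference minus test), the conventional 1.96 SD limits of agreement, and the paired values are assumptions for illustration; none are taken from the study.

```python
from statistics import mean, stdev


def bland_altman(reference, test):
    """Return (bias, sd, (lower, upper)) for paired measurements.

    bias is the mean of reference - test; the limits of agreement are
    bias +/- 1.96 sample standard deviations of the differences.
    """
    differences = [r - t for r, t in zip(reference, test)]
    bias = mean(differences)
    sd = stdev(differences)
    return bias, sd, (bias - 1.96 * sd, bias + 1.96 * sd)


# Hypothetical paired per-vessel values (invasive FFR vs CT-FFR).
ffr = [0.90, 0.85, 0.78, 0.92, 0.70]
ct_ffr = [0.85, 0.80, 0.70, 0.90, 0.60]
bias, sd, (lower, upper) = bland_altman(ffr, ct_ffr)
print(f"bias={bias:.3f}, sd={sd:.3f}, limits=({lower:.3f}, {upper:.3f})")
```

A positive bias under this convention means the test method reads lower than the reference on average.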


2020 ◽  
Vol 8 ◽  
pp. 205031212096646
Author(s):  
Achara Tongpoo ◽  
Pimjai Niparuck ◽  
Charuwan Sriapha ◽  
Winai Wananukul ◽  
Satariya Trakulsrichai

Objectives: Green pit viper (GPV) envenomation causes consumptive coagulopathy mainly by thrombin-like enzymes. Fibrinogen levels are generally investigated to help evaluate systemic envenomation. However, tests of fibrinogen levels may not be available in every hospital. This study aimed to determine the sensitivity, specificity and accuracy of a range of coagulation tests (20 minute whole blood clotting test (20WBCT), prothrombin time, international normalized ratio and thrombin time (TT)), compared with two gold standards, in patients with GPV bite. Methods: This was a pilot study in which we retrospectively reviewed the hospital records, including 65 fibrinogen level results, of 24 GPV (Trimeresurus albolabris or macrops) bite patients visiting Ramathibodi Hospital, Thailand, during 2013–2017. Fibrinogen levels <164 and <100 mg/dL were used as the standard cut-off points, or gold standards, for abnormally low and critical levels, respectively. Results: Most patients were male. All had local effects. For fibrinogen levels <164 and <100 mg/dL, prolonged TT had the highest sensitivity of 57.1% and 82.4%; the negative predictive value of 74.5% and 93.6%; the accuracy of 81.0% and 92.1%; and the area under a receiver operating characteristic curve of 0.762 and 0.873, respectively. For fibrinogen levels <164 mg/dL, unclotted 20WBCT and prolonged TT both had the highest specificity and positive predictive value (100% for each). For fibrinogen levels <100 mg/dL, unclotted 20WBCT had the highest specificity and positive predictive value (both 100%), while prolonged TT had a specificity and positive predictive value of 95.7% and 87.5%, respectively. One patient developed isolated thrombocytopenia without hypofibrinogenemia and coagulopathy. Conclusions: Among four coagulation tests, TT was the most sensitive and accurate test to indicate hypofibrinogenemia in GPV bite patients. When fibrinogen levels are unavailable, thrombin time might be measured to help evaluate patients’ fibrinogen status. Isolated thrombocytopenia could occur in GPV envenomation.


2020 ◽  
Vol 10 (4) ◽  
pp. 211
Author(s):  
Yong Joon Suh ◽  
Jaewon Jung ◽  
Bum-Joo Cho

Mammography plays an important role in screening breast cancer among females, and artificial intelligence has enabled the automated detection of diseases on medical images. This study aimed to develop a deep learning model detecting breast cancer in digital mammograms of various densities and to evaluate the model performance compared to previous studies. From 1501 subjects who underwent digital mammography between February 2007 and May 2015, craniocaudal and mediolateral view mammograms were included and concatenated for each breast, ultimately producing 3002 merged images. Two convolutional neural networks were trained to detect any malignant lesion on the merged images. The performances were tested using 301 merged images from 284 subjects and compared to a meta-analysis including 12 previous deep learning studies. The mean area under the receiver-operating characteristic curve (AUC) for detecting breast cancer in each merged mammogram was 0.952 ± 0.005 by DenseNet-169 and 0.954 ± 0.020 by EfficientNet-B5, respectively. The performance for malignancy detection decreased as breast density increased (density A, mean AUC = 0.984 vs. density D, mean AUC = 0.902 by DenseNet-169). When patients’ age was used as a covariate for malignancy detection, the performance showed little change (mean AUC, 0.953 ± 0.005). The mean sensitivity and specificity of the DenseNet-169 (87 and 88%, respectively) surpassed the mean values (81 and 82%, respectively) obtained in a meta-analysis. Deep learning would work efficiently in screening breast cancer in digital mammograms of various densities, which could be maximized in breasts with lower parenchyma density.


2020 ◽  
Vol 34 (7) ◽  
pp. 717-730
Author(s):  
Matthew C. Robinson ◽  
Robert C. Glen ◽  
Alpha A. Lee

Machine learning methods may have the potential to significantly accelerate drug discovery. However, the increasing rate of new methodological approaches being published in the literature raises the fundamental question of how models should be benchmarked and validated. We reanalyze the data generated by a recently published large-scale comparison of machine learning models for bioactivity prediction and arrive at a somewhat different conclusion. We show that the performance of support vector machines is competitive with that of deep learning methods. Additionally, using a series of numerical experiments, we question the relevance of area under the receiver operating characteristic curve as a metric in virtual screening. We further suggest that area under the precision–recall curve should be used in conjunction with the receiver operating characteristic curve. Our numerical experiments also highlight challenges in estimating the uncertainty in model performance via scaffold-split nested cross validation.
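The abstract above suggests pairing ROC AUC with the area under the precision–recall curve. A minimal sketch of why the two can diverge when actives are rare is given below. The function names `roc_auc` and `average_precision`, and the toy ranking, are hypothetical; average precision is used here as one common step-wise estimate of precision–recall area, which is an assumption rather than the authors' stated method.

```python
def roc_auc(scores, labels):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(scores, labels):
    """Mean of the precision values measured at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy screen (hypothetical): 2 actives among 12 compounds, ranked 2nd and 5th.
scores = [0.95, 0.90, 0.85, 0.80, 0.75, 0.70, 0.65, 0.60, 0.55, 0.50, 0.45, 0.40]
labels = [0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
auc = roc_auc(scores, labels)
ap = average_precision(scores, labels)
print(f"ROC AUC = {auc:.2f}, average precision = {ap:.2f}")
```

In this toy ranking the ROC AUC is 0.80 while the average precision is 0.45: a few inactives ranked above the actives barely change the ROC AUC when inactives are plentiful, but they halve the precision among the top-ranked compounds.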


F1000Research ◽  
2021 ◽  
Vol 9 ◽  
pp. 1244
Author(s):  
Phornwipa Panta ◽  
Win Techakehakij

Background: Screening for albuminuria is generally recommended among patients with hypertension. While the urine dipstick is commonly used for screening urine albumin, there is little evidence about its diagnostic accuracy among these patients in Thailand. This study aimed to assess the diagnostic accuracy of a dipstick in Thai hypertensive patients for detecting albuminuria. Methods: This study collected the data of 3,067 hypertensive patients at Lampang Hospital, Thailand, during 2018, who had urine dipstick and urine albumin-to-creatinine ratio (ACR) results from a random single spot urine sample examined on the same day at least once. For ACR, a reference standard of ≥ 30 mg/g was applied to indicate the presence of albuminuria. Results: The sensitivity, specificity, positive predictive value (PPV), and negative predictive value of the trace result from dipsticks were 53.6%, 94.5%, 86.5%, and 75.5%, respectively. The area under the receiver operating characteristic curve of the dipstick was 0.748. Conclusion: The dipstick should not be recommended for mass screening of albuminuria among hypertensive patients because of its low sensitivity. Given its high PPV, a trace result on the dipstick may be used to indicate the presence of albuminuria.


Medicina ◽  
2021 ◽  
Vol 57 (11) ◽  
pp. 1148
Author(s):  
Marie Takahashi ◽  
Tomoyuki Fujioka ◽  
Toshihiro Horii ◽  
Koichiro Kimura ◽  
Mizuki Kimura ◽  
...  

Background and Objectives: This study aimed to investigate whether predictive indicators for the deterioration of respiratory status can be derived from the deep learning data analysis of initial chest computed tomography (CT) scans of patients with coronavirus disease 2019 (COVID-19). Materials and Methods: Out of 117 CT scans of 75 patients with COVID-19 admitted to our hospital between April and June 2020, we retrospectively analyzed 79 CT scans that had a definite time of onset and were performed prior to any medication intervention. Patients were grouped according to the presence or absence of increased oxygen demand after CT scan. Quantitative volume data of lung opacity were measured automatically using a deep learning-based image analysis system. The sensitivity, specificity, and area under the receiver operating characteristic curve (AUC) of the opacity volume data were calculated to evaluate the accuracy of the system in predicting the deterioration of respiratory status. Results: All 79 CT scans were included (median age, 62 years (interquartile range, 46–77 years); 56 (70.9%) were male). The volume of opacity was significantly higher for the increased oxygen demand group than for the nonincreased oxygen demand group (585.3 vs. 132.8 mL, p < 0.001). The sensitivity, specificity, and AUC were 76.5%, 68.2%, and 0.737, respectively, in the prediction of increased oxygen demand. Conclusion: Deep learning-based quantitative analysis of the affected lung volume in the initial CT scans of patients with COVID-19 can predict the deterioration of respiratory status to improve treatment and resource management.

