Accuracy of convolutional neural network-based artificial intelligence in diagnosis of gastrointestinal lesions based on endoscopic images: A systematic review and meta-analysis

2020 ◽  
Vol 08 (11) ◽  
pp. E1584-E1594
Author(s):  
Babu P. Mohan ◽  
Shahab R. Khan ◽  
Lena L. Kassab ◽  
Suresh Ponnada ◽  
Parambir S. Dulai ◽  
...  

Abstract Background and study aims Recently, a growing body of evidence has been amassed on evaluation of artificial intelligence (AI) known as deep learning in computer-aided diagnosis of gastrointestinal lesions by means of convolutional neural networks (CNN). We conducted this meta-analysis to study pooled rates of performance for CNN-based AI in diagnosis of gastrointestinal neoplasia from endoscopic images. Methods Multiple databases were searched (from inception to November 2019) and studies that reported on the performance of AI by means of CNN in the diagnosis of gastrointestinal tumors were selected. A random effects model was used and pooled accuracy, sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were calculated. Pooled rates were categorized based on the gastrointestinal location of lesion (esophagus, stomach and colorectum). Results Nineteen studies were included in our final analysis. The pooled accuracy of CNN in esophageal neoplasia was 87.2 % (76–93.6) and NPV was 92.1 % (85.9–95.7); the accuracy in lesions of stomach was 85.8 % (79.8–90.3) and NPV was 92.1 % (85.9–95.7); and in colorectal neoplasia the accuracy was 89.9 % (82–94.7) and NPV was 94.3 % (86.4–97.7). Conclusions Based on our meta-analysis, CNN-based AI achieved high accuracy in diagnosis of lesions in esophagus, stomach, and colorectum.
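The abstract states that a random effects model was used but does not name the estimator. As a minimal sketch only, the following pools proportions on the logit scale with the DerSimonian-Laird estimator, one common choice. The function name `pool_random_effects` and the per-study counts in the usage lines are hypothetical and are not taken from the included studies.

```python
import math


def pool_random_effects(events, totals):
    """Pool proportions with a DerSimonian-Laird random effects model.

    events[i] / totals[i] is the proportion observed in study i
    (for example, correctly classified images / all images).
    Returns (pooled estimate, lower 95% bound, upper 95% bound).
    """
    effects, variances = [], []
    for e, n in zip(events, totals):
        # Logit transform with a 0.5 continuity correction.
        e_adj, f_adj = e + 0.5, n - e + 0.5
        effects.append(math.log(e_adj / f_adj))
        variances.append(1 / e_adj + 1 / f_adj)

    # Fixed-effect (inverse variance) weights and pooled mean.
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)

    # Cochran's Q and the between-study variance tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0

    # Random effects weights add tau^2 to each within-study variance.
    w_star = [1 / (v + tau2) for v in variances]
    mu = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))

    def inv_logit(x):
        return 1 / (1 + math.exp(-x))

    return inv_logit(mu), inv_logit(mu - 1.96 * se), inv_logit(mu + 1.96 * se)


# Hypothetical counts for three studies, for illustration only.
estimate, lower, upper = pool_random_effects([45, 88, 130], [50, 100, 160])
print(f"pooled accuracy {estimate:.1%} ({lower:.1%} to {upper:.1%})")
```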

Endoscopy ◽  
2020 ◽  
Author(s):  
Alanna Ebigbo ◽  
Robert Mendel ◽  
Tobias Rückert ◽  
Laurin Schuster ◽  
Andreas Probst ◽  
...  

Background and aims: The accurate differentiation between T1a and T1b Barrett’s cancer has both therapeutic and prognostic implications but is challenging even for experienced physicians. We trained an Artificial Intelligence (AI) system on the basis of deep artificial neural networks (deep learning) to differentiate between T1a and T1b Barrett’s cancer white-light images. Methods: Endoscopic images from three tertiary care centres in Germany were collected retrospectively. A deep learning system was trained and tested using the principles of cross-validation. A total of 230 white-light endoscopic images (108 T1a and 122 T1b) were evaluated with the AI-system. For comparison, the images were also classified by experts specialized in endoscopic diagnosis and treatment of Barrett’s cancer. Results: The sensitivity, specificity, F1 and accuracy of the AI-system in the differentiation between T1a and T1b cancer lesions were 0.77, 0.64, 0.73 and 0.71, respectively. There was no statistically significant difference between the performance of the AI-system and that of human experts, whose sensitivity, specificity, F1 and accuracy were 0.63, 0.78, 0.67 and 0.70, respectively. Conclusion: This pilot study demonstrates the first multicenter application of an AI-based system in the prediction of submucosal invasion in endoscopic images of Barrett’s cancer. AI scored equal to international experts in the field, but more work is necessary to improve the system and apply it to video sequences and in a real-life setting. Nevertheless, the correct prediction of submucosal invasion in Barrett’s cancer remains challenging for both experts and AI.


2020 ◽  
Vol 7 (1) ◽  
pp. e000355 ◽  
Author(s):  
Rohit Hariharan ◽  
Mark Jenkins

Background: Circulating tumour DNA from colorectal cancer (CRC) is a biomarker for early detection of the disease and therefore potentially useful for screening. One such biomarker is the methylated SEPT9 (mSEPT9) gene, which occurs during CRC tumourigenesis. This systematic review and meta-analysis aims to establish the sensitivity, specificity and accuracy of mSEPT9 tests for the early diagnosis of CRC. Methods: A systematic search of the relevant literature was conducted using Medline and Embase databases. Data were extracted from the eligible studies and analysed to estimate pooled sensitivity, specificity and diagnostic test accuracy. Results: Based on 19 studies, the pooled estimates (and 95% CIs) for mSEPT9 to detect CRC were: sensitivity 69% (62–75); specificity 92% (89–95); positive likelihood ratio 9.1 (6.1–13.8); negative likelihood ratio 0.34 (0.27–0.42); diagnostic OR 27 (15–48) and area under the curve 0.89 (0.86–0.91). The test has a positive predictive value of 2.6% and negative predictive value of 99.9% in an average risk population (0.3% CRC prevalence), and 9.5% (positive predictive value) and 99.6% (negative predictive value) in a high-risk population (1.2% CRC prevalence). Conclusion: The mSEPT9 test has high specificity and moderate sensitivity for CRC and is therefore a potential alternative screening method for those declining faecal immunochemical test for occult blood (FIT) or other screening modalities. However, it is limited by its poor diagnostic performance for precancerous lesions (advanced adenomas and polyps) and its relatively high costs, and little is known about its acceptability to those declining to use the FIT.
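The dependence of predictive values on prevalence reported in this abstract follows from Bayes' theorem. A minimal sketch, using the pooled sensitivity (69%) and specificity (92%) and the two prevalences quoted above; the function name `predictive_values` is illustrative, and small differences from the published figures (for example about 2.5% rather than 2.6% for the average-risk PPV) are to be expected because the inputs here are rounded.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    # Expected fractions of the screened population in each cell.
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Prevalences are the ones stated in the abstract.
for label, prevalence in [("average risk", 0.003), ("high risk", 0.012)]:
    ppv, npv = predictive_values(0.69, 0.92, prevalence)
    print(f"{label}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

Because false positives scale with the large disease-free fraction, even a specificity of 92% yields a low PPV when prevalence is well below 1%.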


2021 ◽  
Author(s):  
Johnson Thomas ◽  
Tracy Haertling

Abstract Background: Current classification systems for thyroid nodules are very subjective. Artificial intelligence (AI) algorithms have been used to decrease subjectivity in medical image interpretation. One in two women over the age of 50 may have a thyroid nodule, and at present the only way to exclude malignancy is through invasive procedures. Hence, there exists a need for noninvasive objective classification of thyroid nodules. Some cancers have a benign appearance on ultrasonogram. Hence, we decided to create an image similarity algorithm rather than an image classification algorithm. Methods: Ultrasound images of thyroid nodules from patients who underwent either biopsy or thyroid surgery from February of 2012 through February of 2017 in our institution were used to create AI models. Nodules were excluded if there was no definitive diagnosis of benignity or malignancy. 482 nodules met the inclusion criteria and all available images from these nodules were used to create the AI models. Later, these AI models were used to test 103 thyroid nodules which underwent biopsy or surgery from March of 2017 through July of 2018. Results: Negative predictive value of the image similarity model was 93.2%. Sensitivity, specificity, positive predictive value and accuracy of the model were 87.8%, 78.5%, 65.9% and 81.5%, respectively. Conclusion: When compared to published results of the ACR TIRADS and ATA classification systems, our image similarity model had comparable negative predictive value with better sensitivity, specificity and positive predictive value. By using image similarity AI models, we can eliminate subjectivity and decrease the number of unnecessary biopsies. Using an image similarity AI model, we were able to create an explainable AI model which increases physicians’ confidence in the predictions.


2020 ◽  
Author(s):  
Hairu Li ◽  
Yulan Peng ◽  
Yan Wang ◽  
Hong Ai ◽  
Xiaodong Zhou ◽  
...  

Abstract Background: An accurate diagnosis for high-suspicion nodules based on the 2015 American Thyroid Association (ATA) guidelines would reduce unnecessary invasive examinations. Elastography is a useful tool for discriminating benign and malignant thyroid nodules. The aim of this study is to investigate the diagnostic efficiency of elastography for high-suspicion thyroid nodules based on the 2015 ATA guidelines in the Chinese population. Methods: Thyroid nodules with high-suspicion characteristics based on the 2015 ATA guidelines were subjected to conventional ultrasound (US) and ultrasound strain elastography (USE) examinations at 12 hospitals from 4 geographic regions across China. Cytology/histology of thyroid nodules was used as a reference method. Receiver operating characteristic (ROC) curves were plotted to evaluate the diagnostic performance of the elasticity score (ES) and strain ratio (SR). Logistic regression analysis was used to determine the predictors of malignancy. Results: Overall, a total of 1445 thyroid nodules (834 malignant, 611 benign) from 12 centers were included in the final analysis. The areas under the curve of the ES and SR were 0.828 and 0.732, respectively. The sensitivity, specificity, accuracy, positive predictive value (PPV) and negative predictive value (NPV) of the ES were 92.4%, 60.7%, 79.0%, 76.3% and 85.5%, respectively, and those of the SR were 81.1%, 50.1%, 68.9%, 65.9% and 67.9%, respectively. The combination of the Thyroid Imaging Reporting and Data System (TI-RADS) and ES led to a significant increase in the sensitivity and NPV (97.1% and 91.9%, respectively) compared with the TI-RADS alone. Logistic regression analysis showed that microcalcifications (OR=5.290), taller than wide (OR=12.710), irregular margins (OR=10.117), extrathyroidal extension (ETE; OR=6.412), the ES (OR=3.741) and the SR (OR=1.083) were independent predictors of malignant thyroid nodules. The sensitivity, specificity, accuracy, PPV and NPV of the ES were all higher in nodules ≥1 cm than in those <1 cm (95.0% vs 90.4%, 68.8% vs 56.8%, 85.9% vs 74.4%, 85.2% vs 69.9%, and 87.8% vs 84.2%, respectively). Conclusions: Elastography combined with the ES is a valuable tool for the assessment of high-suspicion thyroid nodules based on the 2015 ATA guidelines, especially in nodules ≥1 cm.
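The accuracy, PPV and NPV reported for the ES can be cross-checked against the stated sample (834 malignant, 611 benign nodules) and the reported sensitivity and specificity. The sketch below rebuilds an approximate 2x2 table by rounding to whole nodules; the resulting cell counts are an inference for illustration, not figures given in the abstract, and `counts_from_rates` is a hypothetical helper name.

```python
def counts_from_rates(n_positive, n_negative, sensitivity, specificity):
    """Approximate (TP, FP, FN, TN) from group sizes and reported rates."""
    tp = round(sensitivity * n_positive)   # malignant nodules flagged
    tn = round(specificity * n_negative)   # benign nodules cleared
    fn = n_positive - tp
    fp = n_negative - tn
    return tp, fp, fn, tn


# Group sizes and ES sensitivity/specificity as stated in the abstract.
tp, fp, fn, tn = counts_from_rates(834, 611, 0.924, 0.607)

accuracy = (tp + tn) / (tp + fp + fn + tn)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)
print(f"accuracy {accuracy:.1%}, PPV {ppv:.1%}, NPV {npv:.1%}")
```

Under this reconstruction the three derived values round to the figures reported for the ES (79.0%, 76.3% and 85.5%), which suggests the published metrics are internally consistent.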


2020 ◽  
Author(s):  
Hairu Li ◽  
Yulan Peng ◽  
Yan Wang ◽  
Hong Ai ◽  
Xiaodong Zhou ◽  
...  

Abstract Background: An accurate diagnosis for high-suspicion nodules based on the 2015 American Thyroid Association (ATA) guidelines would reduce unnecessary invasive examinations. Elastography is a useful tool for discriminating benign and malignant thyroid nodules. Aims: To investigate the diagnostic efficiency of elastography for high-suspicion thyroid nodules based on the 2015 ATA guidelines in the Chinese population. Materials and Methods: Thyroid nodules with high-suspicion characteristics based on the 2015 ATA guidelines were subjected to conventional ultrasound (US) and ultrasound strain elastography (USE) examinations at 12 hospitals from 4 geographic regions across China. Cytology/histology of thyroid nodules was used as a reference method. Receiver operating characteristic (ROC) curves were plotted to evaluate the diagnostic performance of the elasticity score (ES) and strain ratio (SR). Logistic regression analysis was used to determine the predictors of malignancy. Results: Overall, a total of 1445 thyroid nodules (834 malignant, 611 benign) from 12 centers were included in the final analysis. The areas under the curve of the ES and SR were 0.828 and 0.732, respectively. The sensitivity, specificity, accuracy, positive predictive value (PPV) and negative predictive value (NPV) of the ES were 92.4%, 60.7%, 79.0%, 76.3% and 85.5%, respectively, and those of the SR were 81.1%, 50.1%, 68.9%, 65.9% and 67.9%, respectively. The combination of the Thyroid Imaging Reporting and Data System (TI-RADS) and ES led to a significant increase in the sensitivity and NPV (97.1% and 91.9%, respectively) compared with the TI-RADS alone. Logistic regression analysis showed that microcalcifications (OR=5.290), taller than wide (OR=12.710), irregular margins (OR=10.117), extrathyroidal extension (ETE; OR=6.412), the ES (OR=3.741) and the SR (OR=1.083) were independent predictors of malignant thyroid nodules. The sensitivity, specificity, accuracy, PPV and NPV of the ES were all higher in nodules ≥1 cm than in those <1 cm (95.0% vs 90.4%, 68.8% vs 56.8%, 85.9% vs 74.4%, 85.2% vs 69.9%, and 87.8% vs 84.2%, respectively). Conclusions: Elastography combined with the ES is a valuable tool for the assessment of high-suspicion thyroid nodules based on the 2015 ATA guidelines, especially in nodules ≥1 cm.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Pakanat Decharatanachart ◽  
Roongruedee Chaiteerakij ◽  
Thodsawit Tiyarattanachai ◽  
Sombat Treeprasertsuk

Abstract Background The gold standard for the diagnosis of liver fibrosis and nonalcoholic fatty liver disease (NAFLD) is liver biopsy. Various noninvasive modalities, e.g., ultrasonography, elastography and clinical predictive scores, have been used as alternatives to liver biopsy, with limited performance. Recently, artificial intelligence (AI) models have been developed and integrated into noninvasive diagnostic tools to improve their performance. Methods We systematically searched for studies on AI-assisted diagnosis of liver fibrosis and NAFLD on MEDLINE, Scopus, Web of Science and Google Scholar. The pooled sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and diagnostic odds ratio (DOR) with their 95% confidence intervals (95% CIs) were calculated using a random effects model. A summary receiver operating characteristic curve and the area under the curve were generated to determine the diagnostic accuracy of the AI-assisted system. Subgroup analyses by diagnostic modalities, population and AI classifiers were performed. Results We included 19 studies reporting the performances of AI-assisted ultrasonography, elastography, computed tomography, magnetic resonance imaging and clinical parameters for the diagnosis of liver fibrosis and steatosis. For the diagnosis of liver fibrosis, the pooled sensitivity, specificity, PPV, NPV and DOR were 0.78 (0.71–0.85), 0.89 (0.81–0.94), 0.72 (0.58–0.83), 0.92 (0.88–0.94) and 31.58 (11.84–84.25), respectively, for cirrhosis; 0.86 (0.80–0.90), 0.87 (0.80–0.92), 0.85 (0.75–0.91), 0.88 (0.82–0.92) and 37.79 (16.01–89.19), respectively, for advanced fibrosis; and 0.86 (0.78–0.92), 0.81 (0.77–0.84), 0.88 (0.80–0.93), 0.77 (0.58–0.89) and 26.79 (14.47–49.62), respectively, for significant fibrosis. Subgroup analyses showed significant differences in performance for the diagnosis of fibrosis among different modalities. The pooled sensitivity, specificity, PPV, NPV and DOR were 0.97 (0.76–1.00), 0.91 (0.78–0.97), 0.95 (0.87–0.98), 0.93 (0.80–0.98) and 191.52 (38.82–944.81), respectively, for the diagnosis of liver steatosis. Conclusions AI-assisted systems have promising potential for the diagnosis of liver fibrosis and NAFLD. Validations of their performances are warranted before implementing these AI-assisted systems in clinical practice. Trial registration: The protocol was registered with PROSPERO (CRD42020183295).
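For readers less familiar with the diagnostic odds ratio, it is the ratio of the positive to the negative likelihood ratio. The sketch below computes it from a single sensitivity/specificity pair, using the pooled cirrhosis values quoted above as inputs; the function name is illustrative. The result is not expected to match the reported pooled DOR of 31.58, because in a meta-analysis the DOR is usually pooled across studies separately rather than derived from the pooled sensitivity and specificity.

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """DOR = LR+ / LR-, equivalently (TP * TN) / (FP * FN)."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive / lr_negative


# Pooled cirrhosis sensitivity and specificity as quoted in the abstract.
dor = diagnostic_odds_ratio(0.78, 0.89)
print(f"DOR from pooled point estimates: {dor:.1f}")
```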


2020 ◽  
Vol 08 (11) ◽  
pp. E1611-E1622
Author(s):  
Saurabh Chandan ◽  
Babu P. Mohan ◽  
Shahab R. Khan ◽  
Andrew Ofosu ◽  
Amaninder S. Dhaliwal ◽  
...  

Abstract Background and study aims Endoscopic ultrasound (EUS)-guided fine-needle aspiration (EUS-FNA) has limitations of inadequate sampling and false-negative results for malignancy. It has been performed using conventional smear (CS) cytology with rapid on-site evaluation (ROSE) with reasonable diagnostic accuracy. An alternative to ROSE is liquid-based cytology (LBC). Commonly used LBC techniques include precipitation-based (SurePath™) and filtration-based (ThinPrep®, CellPrep®). Data regarding the diagnostic efficacy of LBC compared with CS are limited. Methods Multiple databases were searched through March 2020 to identify studies reporting diagnostic yield of EUS-guided CS and LBC in pancreatic lesions. Pooled diagnostic odds and rates of performance for the cytologic diagnoses of benign, suspicious, and malignant lesions were calculated. Diagnostic efficacy was evaluated by pooled rates of accuracy, sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV). Results Nine studies with a total of 1308 patients were included in our final analysis. Pooled diagnostic odds of CS cytology were 1.69 (CI 1.02–2.79) and 0.39 (CI 0.19–0.8) for malignant lesions when compared to filtration-based and precipitation-based LBC techniques, respectively. For CS, precipitation-based and filtration-based LBC, pooled diagnostic accuracy was 79.7 %, 85.2 %, 77.3 %, sensitivity was 79.2 %, 83.6 %, 68.3 %, and specificity was 99.4 %, 99.5 %, 99.5 %, respectively. Conclusions The precipitation-based LBC technique (SurePath™) had superior diagnostic odds for malignant pancreatic lesions compared with CS cytology in the absence of ROSE. It showed superior accuracy and sensitivity, but comparable specificity and PPV. Diagnostic odds of CS cytology in the absence of ROSE were superior to the filtration-based LBC technique (ThinPrep®, CellPrep®) for diagnosing malignant pancreatic lesions.


2021 ◽  
pp. 193229682199937
Author(s):  
Nikita Mokhashi ◽  
Julia Grachevskaya ◽  
Lorrie Cheng ◽  
Daohai Yu ◽  
Xiaoning Lu ◽  
...  

Introduction: Artificial intelligence (AI) diabetic retinopathy (DR) software has the potential to decrease time spent by clinicians on image interpretation and expand the scope of DR screening. We performed a retrospective review to compare Eyenuk’s EyeArt software (Woodland Hills, CA) to Temple Ophthalmology optometry grading using the International Classification of Diabetic Retinopathy scale. Methods: Two hundred and sixty consecutive diabetic patients from the Temple Faculty Practice Internal Medicine clinic underwent 2-field retinal imaging. Classifications of the images by the software and optometrist were analyzed using sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and McNemar’s test. Ungradable images were analyzed to identify relationships with HbA1c, age, and ethnicity. Disagreements and a sample of 20% of agreements were adjudicated by a retina specialist. Results: On patient-level comparison, sensitivity for the software was 100%, while specificity was 77.78%. PPV was 19.15%, and NPV was 100%. The 38 disagreements between software and optometrist occurred when the optometrist classified a patient’s images as non-referable while the software classified them as referable. Of these disagreements, a retina specialist agreed with the optometrist 57.9% of the time (22/38). Of the agreements, the retina specialist agreed with both the program and the optometrist 96.7% of the time (28/29). There was a significant difference in numbers of ungradable photos in older patients (≥60) vs younger patients (<60) (p=0.003). Conclusions: The AI program showed high sensitivity with acceptable specificity for a screening algorithm. The high NPV indicates that the software is unlikely to miss DR but may refer patients unnecessarily.
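The abstract gives percentages and the number of disagreements (38) but not the full 2x2 table. As an illustration only, the sketch below uses hypothetical cell counts chosen to be consistent with the reported figures (9 true positives, 38 false positives, 0 false negatives, 133 true negatives, with the optometrist as reference) and applies an exact McNemar test to the discordant pairs. The counts are an inference, not data from the paper, and the abstract does not report the McNemar result.

```python
from math import comb

# Hypothetical counts consistent with the reported percentages;
# they are NOT stated in the abstract.
tp, fp, fn, tn = 9, 38, 0, 133

sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
ppv = tp / (tp + fp)
npv = tn / (tn + fn)
print(f"sensitivity {sensitivity:.2%}, specificity {specificity:.2%}")
print(f"PPV {ppv:.2%}, NPV {npv:.2%}")


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts."""
    n = b + c
    k = min(b, c)
    # Binomial(n, 0.5) probability of a split at least as extreme as k.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Discordant pairs: software referable / optometrist non-referable (fp)
# versus software non-referable / optometrist referable (fn).
p_value = mcnemar_exact(fp, fn)
print(f"exact McNemar p-value: {p_value:.2e}")
```

With all discordant pairs falling on one side, the test indicates that the software refers systematically more often than the optometrist, which matches the abstract's conclusion that it may refer patients unnecessarily.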

