Prediction of visceral pleural invasion in lung cancer on CT: deep learning model achieves a radiologist-level performance with adaptive sensitivity and specificity to clinical needs

Author(s):  
Hyewon Choi ◽  
Hyungjin Kim ◽  
Wonju Hong ◽  
Jongsoo Park ◽  
Eui Jin Hwang ◽  
...

Author(s):  
Hsu-Heng Yen ◽  
Ping-Yu Wu ◽  
Pei-Yuan Su ◽  
Chia-Wei Yang ◽  
Yang-Yuan Chen ◽  
...  

Abstract
Purpose: Management of peptic ulcer bleeding is clinically challenging, and accurate characterization of the bleeding during endoscopy is key for endoscopic therapy. This study aimed to assess whether a deep learning model can aid in the classification of bleeding peptic ulcer disease.
Methods: Endoscopic still images of patients (n = 1694) with peptic ulcer bleeding over the last 5 years were retrieved and reviewed. Overall, 2289 images were collected for deep learning model training, and 449 images were set aside for the performance test. Two expert endoscopists classified the images into different classes based on their appearance. Four deep learning models (MobileNet V2, VGG16, Inception V4, and ResNet50), pre-trained on ImageNet, were proposed and trained with the established convolutional neural network algorithm. The endoscopists and the trained deep learning models were compared on the 449 test images to evaluate model performance.
Results: The performance of the four deep learning models was compared first; MobileNet V2 performed best among the proposed models and was chosen for further comparison with the diagnostic results of one senior and one novice endoscopist. The sensitivity and specificity for predicting "normal" lesions were acceptable in both the 3-class and 4-class classifications: 94.83% and 92.36% for the 3-class category, and 95.40% and 92.70% for the 4-class category, respectively. On the test dataset, the interobserver agreement between the model and the senior endoscopist was moderate to substantial. The model's accuracy in determining whether endoscopic therapy was required, and whether high-risk endoscopic therapy was required, was higher than that of the novice endoscopist.
Conclusions: In this study, the deep learning model performed better than inexperienced endoscopists. Further improvement of the model may aid clinical decision-making in practice, especially for trainee endoscopists.
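As a concrete illustration of the transfer-learning setup described above, the following is a minimal PyTorch/torchvision sketch of fine-tuning an ImageNet-pretrained MobileNet V2 for a 3- or 4-class endoscopic image task. The class count, frozen backbone, optimizer, and learning rate are illustrative assumptions rather than the authors' configuration, and the `weights=` argument assumes torchvision 0.13 or newer.

```python
# Minimal transfer-learning sketch (assumptions: class count, frozen layers,
# optimizer settings); not the authors' exact configuration.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 4  # e.g. the 4-class appearance scheme; use 3 for the 3-class task

# Load MobileNet V2 pre-trained on ImageNet and replace the classification head.
model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.IMAGENET1K_V1)
in_features = model.classifier[1].in_features
model.classifier[1] = nn.Linear(in_features, NUM_CLASSES)

# Optionally freeze the convolutional backbone and fine-tune only the head.
for param in model.features.parameters():
    param.requires_grad = False

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(
    filter(lambda p: p.requires_grad, model.parameters()), lr=1e-4
)

def train_step(images, labels):
    """One optimisation step on a batch of endoscopic still images."""
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Comparing backbones, as the study does, amounts to swapping the model constructor and its classification head while keeping the same training loop.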


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. 8536-8536
Author(s):  
Gouji Toyokawa ◽  
Fahdi Kanavati ◽  
Seiya Momosaki ◽  
Kengo Tateishi ◽  
Hiroaki Takeoka ◽  
...  

8536 Background: Lung cancer is the leading cause of cancer-related death in many countries, and its prognosis remains unsatisfactory. Since treatment approaches differ substantially by subtype, such as adenocarcinoma (ADC), squamous cell carcinoma (SCC) and small cell lung cancer (SCLC), an accurate histopathological diagnosis is of great importance. However, if the specimen is composed solely of poorly differentiated cancer cells, distinguishing between histological subtypes can be difficult. The present study developed a deep learning model to classify lung cancer subtypes from whole slide images (WSIs) of transbronchial lung biopsy (TBLB) specimens, in particular with the aim of using this model to evaluate a challenging test set of indeterminate cases.
Methods: Our deep learning model consisted of two separately trained components: a convolutional neural network tile classifier and a recurrent neural network tile aggregator for the WSI diagnosis. We used a training set of 638 WSIs of TBLB specimens to train the model to classify lung cancer subtypes (ADC, SCC and SCLC) and non-neoplastic lesions. The training set consisted of 593 WSIs for which the diagnosis had been determined by pathologists based on visual inspection of Hematoxylin-Eosin (HE) slides and of 45 WSIs of indeterminate cases (64 ADCs and 19 SCCs). We then evaluated the model using five independent test sets. For each test set, we computed the receiver operating characteristic (ROC) area under the curve (AUC).
Results: We applied the model to an indeterminate test set of WSIs obtained from TBLB specimens that pathologists had not been able to diagnose conclusively from the HE-stained specimens alone. Overall, the model achieved ROC AUCs of 0.993 (confidence interval [CI] 0.971-1.0) and 0.996 (CI 0.981-1.0) for ADC and SCC, respectively. We further evaluated the model using five independent test sets consisting of both TBLB and surgically resected lung specimens (a combined total of 2490 WSIs) and obtained highly promising results, with ROC AUCs ranging from 0.94 to 0.99.
Conclusions: In this study, we demonstrated that a deep learning model can be trained to predict lung cancer subtypes in indeterminate TBLB specimens. These extremely promising results show that, if deployed in clinical practice, a deep learning model capable of aiding pathologists in diagnosing indeterminate cases would be highly beneficial, as it would allow a diagnosis to be obtained sooner and reduce the costs of further investigations.
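For readers who want a concrete picture of the two-stage architecture described in the Methods (a CNN tile classifier whose outputs are aggregated by a recurrent network into a slide-level diagnosis), here is a minimal PyTorch sketch. The ResNet-18 backbone, GRU aggregator, hidden sizes, and class labels are illustrative assumptions; the abstract does not specify the authors' exact architecture.

```python
# Two-stage WSI sketch: per-tile CNN features, then an RNN over the tile
# sequence to produce one slide-level prediction. Illustrative only.
import torch
import torch.nn as nn
from torchvision import models

CLASSES = ["ADC", "SCC", "SCLC", "non-neoplastic"]

class TileClassifier(nn.Module):
    """Stage 1: per-tile feature extractor / classifier."""
    def __init__(self, num_classes=len(CLASSES)):
        super().__init__()
        backbone = models.resnet18(weights=None)  # any CNN backbone would do
        self.features = nn.Sequential(*list(backbone.children())[:-1])
        self.head = nn.Linear(backbone.fc.in_features, num_classes)

    def forward(self, tiles):                 # tiles: (n_tiles, 3, H, W)
        feats = self.features(tiles).flatten(1)
        return feats, self.head(feats)        # per-tile features and logits

class SlideAggregator(nn.Module):
    """Stage 2: RNN that aggregates the tile sequence into one WSI diagnosis."""
    def __init__(self, feat_dim=512, hidden=128, num_classes=len(CLASSES)):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, num_classes)

    def forward(self, tile_feats):            # tile_feats: (1, n_tiles, feat_dim)
        _, h_n = self.rnn(tile_feats)
        return self.classifier(h_n[-1])       # slide-level logits

tile_model, slide_model = TileClassifier(), SlideAggregator()
tiles = torch.randn(32, 3, 224, 224)          # 32 tiles cropped from one WSI
feats, _ = tile_model(tiles)
slide_logits = slide_model(feats.unsqueeze(0))
print(CLASSES[slide_logits.argmax(dim=1).item()])
```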


2021 ◽  
Vol 32 ◽  
pp. S926-S927
Author(s):  
G. Toyokawa ◽  
Y. Yamada ◽  
N. Haratake ◽  
Y. Shiraishi ◽  
T. Takenaka ◽  
...  

2020 ◽  
Vol 11 (12) ◽  
pp. 3615-3622 ◽  
Author(s):  
Lei Cong ◽  
Wanbing Feng ◽  
Zhigang Yao ◽  
Xiaoming Zhou ◽  
Wei Xiao

2021 ◽  
Vol 11 ◽  
Author(s):  
Xianyu Zhang ◽  
Hui Li ◽  
Chaoyun Wang ◽  
Wen Cheng ◽  
Yuntao Zhu ◽  
...  

Background: Breast ultrasound is the first choice for breast tumor diagnosis in China, but the Breast Imaging Reporting and Data System (BI-RADS) categorization routinely used in the clinic often leads to unnecessary biopsy, and radiologists cannot predict molecular subtypes, which carry important pathological information that can guide clinical treatment.
Materials and Methods: This retrospective study collected breast ultrasound images from two hospitals and, after strict selection, formed training, test and external test sets of 2,822, 707, and 210 ultrasound images, respectively. An optimized deep learning model (DLM) was constructed with the training set, and its performance was verified in both the test set and the external test set. Diagnostic results were compared with the BI-RADS categorization determined by radiologists. Breast cancers were divided into molecular subtypes according to hormone receptor (HR) and human epidermal growth factor receptor 2 (HER2) expression, and the ability of the DLM to predict molecular subtypes was assessed in the test set.
Results: In the test set, with pathological results as the gold standard, the accuracy, sensitivity and specificity of the BI-RADS categorization were 85.6%, 98.7%, and 63.1%, respectively. On the same set, the DLM achieved an accuracy, sensitivity, and specificity of 89.7%, 91.3%, and 86.9%, respectively. The area under the curve (AUC) was 0.96 for the test set and 0.90 for the external test set. In BI-RADS 4a patients, the diagnostic accuracy of the DLM was 92.86%; approximately 70.76% of these cases were judged to be benign tumors, so unnecessary biopsy could theoretically be reduced by 67.86%, although the false negative rate was 10.4%. The DLM also predicted the molecular subtypes of breast cancer well, with AUCs of 0.864, 0.811, and 0.837 for the triple-negative, HER2 (+) and HR (+) subtypes, respectively.
Conclusion: This study showed that the DLM was highly accurate in recognizing breast tumors from ultrasound images. Thus, the DLM can greatly reduce the incidence of unnecessary biopsy, especially for patients with BI-RADS 4a lesions. In addition, the model's ability to predict molecular subtypes was satisfactory, which has specific clinical application value.
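The following is a minimal sketch, assuming scikit-learn and malignancy probabilities from the DLM, of how accuracy, sensitivity, specificity, and AUC figures like those above would be computed against pathology as the gold standard; the threshold and toy arrays are placeholders, not study data.

```python
# Binary evaluation against pathology: accuracy, sensitivity, specificity, AUC.
import numpy as np
from sklearn.metrics import roc_auc_score, confusion_matrix

def evaluate(y_true, y_prob, threshold=0.5):
    """y_true: 1 = malignant (pathology); y_prob: model malignancy probability."""
    y_pred = (np.asarray(y_prob) >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "auc": roc_auc_score(y_true, y_prob),
    }

# Toy example; in the study these would be the test-set labels and DLM scores.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.8, 0.6, 0.4, 0.1, 0.7, 0.3]
print(evaluate(y_true, y_prob))
```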


2022 ◽  
Vol 8 ◽  
Author(s):  
Yan Yi ◽  
Li Mao ◽  
Cheng Wang ◽  
Yubo Guo ◽  
Xiao Luo ◽  
...  

Background: The identification of aortic dissection (AD) at baseline plays a crucial role in clinical practice. Non-contrast CT scans are widely available, convenient, and easy to perform, but the detection of AD on non-contrast CT by radiologists currently lacks sensitivity and is suboptimal.
Methods: A total of 452 patients who underwent aortic CT angiography (CTA) were enrolled retrospectively from two medical centers in China to form the internal cohort (341 patients: 139 with AD, 202 without AD) and the external testing cohort (111 patients: 46 with AD, 65 without AD). The internal cohort was divided into a training cohort (n = 238), a validation cohort (n = 35), and an internal testing cohort (n = 68). Morphological characteristics were extracted from the aortic segmentation. A deep-integrated model based on the Gaussian Naive Bayes algorithm was built to differentiate AD from non-AD by combining the three-dimensional (3D) deep-learning model score with the morphological characteristics. The areas under the receiver operating characteristic curve (AUCs), accuracy, sensitivity, and specificity were used to evaluate model performance, and the proposed model was compared with the subjective assessment of radiologists.
Results: After combining all the morphological characteristics, the proposed deep-integrated model significantly outperformed the 3D deep-learning model alone (AUC: 0.948 vs. 0.803 in the internal testing cohort and 0.969 vs. 0.814 in the external testing cohort; both p < 0.05). The accuracy, sensitivity, and specificity of the model reached 0.897, 0.862, and 0.923 in the internal testing cohort and 0.730, 0.978, and 0.554 in the external testing cohort, respectively. The accuracy for AD detection showed no significant difference between the model and the radiologists (p > 0.05).
Conclusion: The proposed model performed well for AD detection on non-contrast CT scans, enabling earlier diagnosis and prompt treatment.
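The "deep-integrated model" described above combines a 3D deep-learning score with morphological characteristics in a Gaussian Naive Bayes classifier. The scikit-learn sketch below illustrates that combination under assumed feature names and toy values; it is not the authors' implementation.

```python
# Combining a deep-learning score with morphological features via GaussianNB.
# Feature columns and values are illustrative assumptions.
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Each row: [3D DL model score, max aortic diameter (mm), aortic volume (cm^3)]
X_train = np.array([
    [0.91, 52.0, 310.0],
    [0.12, 34.0, 180.0],
    [0.78, 47.0, 290.0],
    [0.05, 30.0, 160.0],
])
y_train = np.array([1, 0, 1, 0])   # 1 = aortic dissection, 0 = non-AD

clf = GaussianNB().fit(X_train, y_train)

X_test = np.array([[0.85, 50.0, 300.0], [0.20, 33.0, 175.0]])
ad_prob = clf.predict_proba(X_test)[:, 1]   # probability of dissection
print(ad_prob)
```

A simple probabilistic combiner like this keeps the deep-learning score as one feature among several, which is one plausible reading of why the integrated model outperformed the 3D model alone.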


2019 ◽  
Vol 40 (Supplement_1) ◽  
Author(s):  
C Galloway ◽  
D Treiman ◽  
J Shreibati ◽  
M Schram ◽  
Z Karbaschi ◽  
...  

Abstract
Background: Electrocardiographic predictors of atrial fibrillation (AF) from a non-AF ECG, such as P-wave abnormalities and supraventricular ectopy, have been extensively documented. However, risk prediction tools for AF use little, if any, of the wealth of information available from the ECG. Better AF prediction from the ECG may improve the efficiency of screening and the performance of AF risk tools. Deep learning methods have the potential to extract a virtually unlimited number of features from the ECG to improve prediction of AF.
Purpose: We hypothesize that a deep learning model can identify patterns predictive of AF during normal sinus rhythm. To test this hypothesis, we trained and tested a neural network to predict AF from normal sinus rhythm ambulatory ECG data.
Methods: We trained a deep convolutional neural network to detect features of AF that are present in single-lead ECGs with normal sinus rhythm, recorded using a Food and Drug Administration (FDA)-cleared, smartphone-enabled device. A cohort of 27,526 patients with at least 50 ECGs recorded between January 7, 2013, and September 19, 2018, and the FDA-cleared automated findings of Normal and Atrial Fibrillation associated with those ECGs, was used for model development. Specifically, we trained the deep learning model on 1,984,581 Normal ECGs from 19,267 patients with either 1) only Normal ECG recordings or 2) at least 30% of ECGs showing AF. Of the 27,526 patients, an internal set of 8,259 patients with 841,776 Normal ECGs was held out for testing (validation).
Results: Among the 8,259 patients in the test set, 3,467 had at least 30% of their ECGs with an automated finding of AF. When the deep learning model was run on the 841,776 Normal ECGs, it was able to predict whether the ECG was from a patient with no AF or with 30% or more AF, with an area under the curve (AUC) of 0.80. At an operating point with equal sensitivity and specificity, both were 73.1%. At an operating point with high specificity (90.0%), the model's sensitivity was 48.0%. When the model was applied to a randomly selected, broader cohort of 15,000 patients (at least 50 ECGs recorded, any amount of AF), a positive, non-linear relationship between neural network output and AF burden per patient was observed (Figure: Model Output vs. AF Burden per Patient).
Conclusions: A deep learning model was able to predict AF from ECGs in normal sinus rhythm that were recorded on a smartphone-enabled device. The use of deep learning, if prospectively validated, may facilitate AF screening in patients with paroxysmal disease or warn patients who are at high risk of developing AF.
Acknowledgement/Funding: AliveCor
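The two operating points quoted in the Results (equal sensitivity and specificity, and a fixed 90% specificity) can be read off the ROC curve. Below is a minimal scikit-learn sketch of that selection, with placeholder scores and labels standing in for the per-ECG model outputs and patient-level AF status; it is an illustration, not the study code.

```python
# Choosing operating points on an ROC curve: equal sens/spec, and high-spec.
import numpy as np
from sklearn.metrics import roc_curve

def operating_points(y_true, y_score, target_specificity=0.90):
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    specificity = 1.0 - fpr

    # Operating point 1: sensitivity approximately equal to specificity.
    eq_idx = np.argmin(np.abs(tpr - specificity))
    # Operating point 2: highest sensitivity with specificity >= target.
    ok = specificity >= target_specificity
    hi_idx = np.flatnonzero(ok)[np.argmax(tpr[ok])]

    return {
        "equal": (thresholds[eq_idx], tpr[eq_idx], specificity[eq_idx]),
        "high_spec": (thresholds[hi_idx], tpr[hi_idx], specificity[hi_idx]),
    }

# Toy data standing in for the held-out Normal-rhythm ECG scores and labels.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, 1000)
y_score = np.clip(y_true * 0.3 + rng.normal(0.4, 0.25, 1000), 0, 1)
print(operating_points(y_true, y_score))
```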


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Fahdi Kanavati ◽  
Gouji Toyokawa ◽  
Seiya Momosaki ◽  
Hiroaki Takeoka ◽  
Masaki Okamoto ◽  
...  

Abstract
The differentiation between major histological types of lung cancer, such as adenocarcinoma (ADC), squamous cell carcinoma (SCC), and small-cell lung cancer (SCLC), is of crucial importance for determining the optimum cancer treatment. Hematoxylin and Eosin (H&E)-stained slides of small transbronchial lung biopsies (TBLB) are one of the primary sources for making a diagnosis; however, a subset of cases present a challenge for pathologists to diagnose from H&E-stained slides alone, and these either require further immunohistochemistry or are deferred to surgical resection for definitive diagnosis. We trained a deep learning model to classify H&E-stained whole slide images (WSIs) of TBLB specimens into ADC, SCC, SCLC, and non-neoplastic using a training set of 579 WSIs. The trained model was capable of classifying an independent test set of 83 challenging indeterminate cases with a receiver operating characteristic area under the curve (AUC) of 0.99. We further evaluated the model on four independent test sets, one TBLB and three surgical, with a combined total of 2407 WSIs, demonstrating highly promising results with AUCs ranging from 0.94 to 0.99.
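Confidence intervals for slide-level AUCs, such as those reported in the related conference abstract above, are commonly estimated by bootstrap resampling of the test slides. The following NumPy/scikit-learn sketch shows one such estimate under that assumption; the resampling scheme and toy data are illustrative, not the authors' published procedure.

```python
# Bootstrap confidence interval for a slide-level ROC AUC (illustrative).
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auc_ci(y_true, y_score, n_boot=2000, alpha=0.05, seed=0):
    y_true, y_score = np.asarray(y_true), np.asarray(y_score)
    rng = np.random.default_rng(seed)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y_true), len(y_true))  # resample slides
        if len(np.unique(y_true[idx])) < 2:              # need both classes
            continue
        aucs.append(roc_auc_score(y_true[idx], y_score[idx]))
    lo, hi = np.quantile(aucs, [alpha / 2, 1 - alpha / 2])
    return roc_auc_score(y_true, y_score), (lo, hi)

# Toy slide-level labels (1 = ADC) and model scores for illustration.
y_true = [1, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
y_score = [0.95, 0.9, 0.7, 0.2, 0.4, 0.85, 0.1, 0.6, 0.3, 0.15, 0.8, 0.35]
print(bootstrap_auc_ci(y_true, y_score, n_boot=500))
```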


BMC Medicine ◽  
2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Huan Yang ◽  
Lili Chen ◽  
Zhiqiang Cheng ◽  
Minglei Yang ◽  
Jianbo Wang ◽  
...  

Abstract
Background: Targeted therapy and immunotherapy place higher demands on accurate lung cancer classification, as well as on discriminating benign from malignant disease. Digital whole slide images (WSIs) have enabled the transition from traditional histopathology to computational approaches, spurring a wave of deep learning methods for histopathological analysis. We aimed to explore the potential of deep learning models in identifying lung cancer subtypes and cancer mimics from WSIs.
Methods: We initially obtained 741 WSIs from the First Affiliated Hospital of Sun Yat-sen University (SYSUFH) for deep learning model development, optimization, and verification. An additional 318 WSIs from SYSUFH, 212 from Shenzhen People's Hospital, and 422 from The Cancer Genome Atlas were collected for multi-centre verification. EfficientNet-B5- and ResNet-50-based deep learning methods were developed and compared using the metrics of recall, precision, F1-score, and areas under the curve (AUCs). A threshold-based, tumour-first aggregation approach was proposed and implemented for label inference on WSIs with complex tissue components. Four pathologists of different levels from SYSUFH reviewed all the testing slides blindly, and their diagnoses were used for quantitative comparison with the best-performing deep learning model.
Results: We developed the first deep learning-based six-type classifier for histopathological WSI classification of lung adenocarcinoma, lung squamous cell carcinoma, small cell lung carcinoma, pulmonary tuberculosis, organizing pneumonia, and normal lung. The EfficientNet-B5-based model outperformed ResNet-50 and was selected as the backbone of the classifier. Tested on 1067 slides from four cohorts at different medical centres, it achieved AUCs of 0.970, 0.918, 0.963, and 0.978, respectively. The classifier showed high consistency with the ground truth and with attending pathologists, with intraclass correlation coefficients above 0.873.
Conclusions: Multi-cohort testing demonstrated that our six-type classifier achieved performance consistent with, and comparable to, experienced pathologists and showed advantages over other existing computational methods. Visualization of the prediction heatmap improved model interpretability intuitively. The classifier with the threshold-based, tumour-first label inference method exhibited excellent accuracy and feasibility in classifying lung cancers and easily confused non-neoplastic tissues, indicating that deep learning can resolve complex multi-class tissue classification in real-world histopathological scenarios.
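The "threshold-based, tumour-first aggregation" for slide-level label inference can be pictured as follows: if any tumour class accounts for more than a threshold fraction of patches, the dominant tumour class wins even when benign tissue is the overall majority. The NumPy sketch below implements that rule under assumed class names and an assumed threshold value; it is a simplified reading of the approach, not the authors' published specification.

```python
# Threshold-based, tumour-first slide-label inference (simplified, illustrative).
import numpy as np

CLASSES = ["lung_adenocarcinoma", "lung_squamous_cell_carcinoma",
           "small_cell_lung_carcinoma", "pulmonary_tuberculosis",
           "organizing_pneumonia", "normal_lung"]
TUMOUR = {"lung_adenocarcinoma", "lung_squamous_cell_carcinoma",
          "small_cell_lung_carcinoma"}

def infer_slide_label(patch_probs, tumour_threshold=0.1):
    """patch_probs: (n_patches, n_classes) softmax outputs for one WSI."""
    patch_labels = np.argmax(patch_probs, axis=1)
    fractions = np.bincount(patch_labels, minlength=len(CLASSES)) / len(patch_labels)

    # Tumour-first: if any tumour class covers enough patches, report the
    # dominant tumour class even when benign tissue is the overall majority.
    tumour_idx = [i for i, c in enumerate(CLASSES) if c in TUMOUR]
    if fractions[tumour_idx].max() >= tumour_threshold:
        return CLASSES[tumour_idx[int(np.argmax(fractions[tumour_idx]))]]
    return CLASSES[int(np.argmax(fractions))]

# Toy slide: mostly benign patches with a small adenocarcinoma focus.
rng = np.random.default_rng(1)
probs = rng.dirichlet(np.ones(len(CLASSES)) * 0.3, size=200)
probs[:25, 0] += 5.0                        # boost adenocarcinoma on 25/200 patches
probs /= probs.sum(axis=1, keepdims=True)
print(infer_slide_label(probs))
```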

