Deep learning to predict subtypes of poorly differentiated lung cancer from biopsy whole slide images.

2021 ◽  
Vol 39 (15_suppl) ◽  
pp. 8536-8536
Author(s):  
Gouji Toyokawa ◽  
Fahdi Kanavati ◽  
Seiya Momosaki ◽  
Kengo Tateishi ◽  
Hiroaki Takeoka ◽  
...  

Background: Lung cancer is the leading cause of cancer-related death in many countries, and its prognosis remains unsatisfactory. Since treatment approaches differ substantially by subtype, such as adenocarcinoma (ADC), squamous cell carcinoma (SCC) and small cell lung cancer (SCLC), an accurate histopathological diagnosis is of great importance. However, if the specimen is composed solely of poorly differentiated cancer cells, distinguishing between histological subtypes can be difficult. The present study developed a deep learning model to classify lung cancer subtypes from whole slide images (WSIs) of transbronchial lung biopsy (TBLB) specimens, in particular with the aim of evaluating the model on a challenging test set of indeterminate cases. Methods: Our deep learning model consisted of two separately trained components: a convolutional neural network (CNN) tile classifier and a recurrent neural network (RNN) tile aggregator for the WSI diagnosis. We used a training set of 638 WSIs of TBLB specimens to train the model to classify lung cancer subtypes (ADC, SCC and SCLC) and non-neoplastic lesions. The training set consisted of 593 WSIs for which the diagnosis had been determined by pathologists based on visual inspection of Hematoxylin-Eosin (HE) slides and 45 WSIs of indeterminate cases. We then evaluated the model using five independent test sets. For each test set, we computed the receiver operating characteristic (ROC) area under the curve (AUC). Results: We applied the model to an indeterminate test set of WSIs (64 ADCs and 19 SCCs) obtained from TBLB specimens that pathologists had not been able to diagnose conclusively from the HE-stained specimens alone. Overall, the model achieved ROC AUCs of 0.993 (confidence interval [CI] 0.971-1.0) and 0.996 (CI 0.981-1.0) for ADC and SCC, respectively. 
We further evaluated the model using five independent test sets consisting of both TBLB and surgically resected lung specimens (a combined total of 2490 WSIs) and obtained highly promising results, with ROC AUCs ranging from 0.94 to 0.99. Conclusions: In this study, we demonstrated that a deep learning model can be trained to predict lung cancer subtypes in indeterminate TBLB specimens. If deployed in clinical practice, a model capable of aiding pathologists in diagnosing indeterminate cases would be highly beneficial, allowing a diagnosis to be reached sooner and reducing the costs of further investigations.
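The two-stage design described in the Methods (a CNN tile classifier feeding an RNN aggregator for the slide-level call) can be sketched as follows. This is a minimal NumPy stand-in, not the authors' implementation: the class list order, tile size, and random untrained weights are illustrative assumptions.

```python
# Two-stage WSI pipeline sketch: per-tile classifier -> RNN aggregator.
# All weights here are random stand-ins for the trained networks.
import numpy as np

rng = np.random.default_rng(0)
CLASSES = ["ADC", "SCC", "SCLC", "non-neoplastic"]
W_CNN = rng.standard_normal((3, len(CLASSES)))   # frozen stand-in weights

def tile_classifier(tile):
    """Stage 1 stand-in: the trained CNN maps each tile to per-class logits;
    here a fixed linear map over mean RGB features plays that role."""
    return tile.mean(axis=(0, 1)) @ W_CNN        # (4,) logits per tile

def rnn_aggregator(tile_logits, hidden_dim=8):
    """Stage 2 stand-in: an (untrained) Elman RNN scans the sequence of
    tile logits and emits slide-level class probabilities."""
    Wx = rng.standard_normal((len(CLASSES), hidden_dim)) * 0.1
    Wh = rng.standard_normal((hidden_dim, hidden_dim)) * 0.1
    Wo = rng.standard_normal((hidden_dim, len(CLASSES)))
    h = np.zeros(hidden_dim)
    for logits in tile_logits:
        h = np.tanh(logits @ Wx + h @ Wh)        # recurrent state update
    scores = h @ Wo
    e = np.exp(scores - scores.max())
    return e / e.sum()                           # softmax over classes

tiles = rng.random((16, 32, 32, 3))              # fake WSI: 16 RGB tiles
slide_probs = rnn_aggregator([tile_classifier(t) for t in tiles])
print(CLASSES[int(slide_probs.argmax())])
```

The key design point the abstract emphasizes is that the two components are trained separately: the tile classifier never sees slide-level labels at inference time, and the aggregator consumes only the tile-level outputs.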

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Fahdi Kanavati ◽  
Gouji Toyokawa ◽  
Seiya Momosaki ◽  
Hiroaki Takeoka ◽  
Masaki Okamoto ◽  
...  

Abstract The differentiation between major histological types of lung cancer, such as adenocarcinoma (ADC), squamous cell carcinoma (SCC), and small-cell lung cancer (SCLC), is of crucial importance for determining the optimum cancer treatment. Hematoxylin and Eosin (H&E)-stained slides of small transbronchial lung biopsies (TBLB) are one of the primary sources for making a diagnosis; however, a subset of cases is challenging for pathologists to diagnose from H&E-stained slides alone, and these either require further immunohistochemistry or are deferred to surgical resection for definitive diagnosis. We trained a deep learning model to classify H&E-stained whole slide images (WSIs) of TBLB specimens into ADC, SCC, SCLC, and non-neoplastic using a training set of 579 WSIs. The trained model was capable of classifying an independent test set of 83 challenging indeterminate cases with a receiver operating characteristic area under the curve (ROC AUC) of 0.99. We further evaluated the model on four independent test sets (one TBLB and three surgical, with a combined total of 2407 WSIs), demonstrating highly promising results with AUCs ranging from 0.94 to 0.99.
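Each study in this collection reports per-class ROC AUCs computed one-vs-rest. The metric has a simple pairwise interpretation worth keeping in mind when reading these figures: it is the probability that a randomly chosen positive slide is scored above a randomly chosen negative one. A minimal sketch with made-up scores (the slide probabilities below are hypothetical):

```python
# ROC AUC via the pairwise (Mann-Whitney) formulation; ties count 0.5.
def roc_auc(y_true, y_score):
    pos = [s for y, s in zip(y_true, y_score) if y]
    neg = [s for y, s in zip(y_true, y_score) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical ADC-vs-rest probabilities for six slides:
print(roc_auc([1, 1, 1, 0, 0, 0], [0.95, 0.80, 0.40, 0.55, 0.20, 0.10]))
```

Production code would typically use a rank-based implementation (e.g. scikit-learn's `roc_auc_score`) rather than this O(n²) pairwise loop, but both give identical values.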


BMC Medicine ◽  
2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Huan Yang ◽  
Lili Chen ◽  
Zhiqiang Cheng ◽  
Minglei Yang ◽  
Jianbo Wang ◽  
...  

Abstract Background Targeted therapy and immunotherapy place higher demands on accurate lung cancer classification, as well as on discriminating benign from malignant disease. Digital whole slide images (WSIs) have driven the transition from traditional histopathology to computational approaches, spurring a surge of deep learning methods for histopathological analysis. We aimed to explore the potential of deep learning models for identifying lung cancer subtypes and cancer mimics from WSIs. Methods We initially obtained 741 WSIs from the First Affiliated Hospital of Sun Yat-sen University (SYSUFH) for deep learning model development, optimization, and verification. An additional 318 WSIs from SYSUFH, 212 from Shenzhen People's Hospital, and 422 from The Cancer Genome Atlas were collected for multi-centre verification. EfficientNet-B5- and ResNet-50-based deep learning methods were developed and compared using recall, precision, F1-score, and areas under the curve (AUCs). A threshold-based, tumour-first aggregation approach was proposed and implemented to infer labels for WSIs with complex tissue components. Four pathologists of different experience levels from SYSUFH blindly reviewed all the testing slides, and their diagnoses were used for quantitative comparison with the best-performing deep learning model. Results We developed the first deep learning-based six-type classifier for histopathological WSI classification of lung adenocarcinoma, lung squamous cell carcinoma, small cell lung carcinoma, pulmonary tuberculosis, organizing pneumonia, and normal lung. The EfficientNet-B5-based model outperformed ResNet-50 and was selected as the classifier backbone. Tested on 1067 slides from four cohorts of different medical centres, it achieved AUCs of 0.970, 0.918, 0.963, and 0.978, respectively. 
The classifier showed high consistency with the ground truth and the attending pathologists, with intraclass correlation coefficients above 0.873. Conclusions Multi-cohort testing demonstrated that our six-type classifier achieved performance consistent with, and comparable to, that of experienced pathologists, with advantages over other existing computational methods. Visualization of prediction heatmaps made the model more intuitively interpretable. The classifier, combined with the threshold-based tumour-first label inferencing method, exhibited excellent accuracy and feasibility in classifying lung cancers and commonly confused non-neoplastic tissues, indicating that deep learning can resolve the complex multi-class tissue classification that real-world histopathological scenarios demand.
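The abstract describes the tumour-first label inferencing rule only at a high level: tumour classes take priority over benign classes when a slide contains mixed tissue. A plausible minimal sketch of such a rule follows; the class names, threshold value, and tie-breaking order are illustrative assumptions, not the paper's actual parameters.

```python
# Sketch of a threshold-based, tumour-first slide-label rule over tile
# predictions. Class names and the 5% tile-fraction threshold are assumptions.
from collections import Counter

TUMOUR = {"LUAD", "LUSC", "SCLC"}       # tumour classes take priority

def slide_label(tile_labels, tumour_frac=0.05):
    """If any tumour class covers at least `tumour_frac` of the tiles,
    report the most frequent such class; otherwise report the most
    frequent (benign) class overall."""
    counts = Counter(tile_labels)
    n = len(tile_labels)
    tumour_hits = {c: k for c, k in counts.items()
                   if c in TUMOUR and k / n >= tumour_frac}
    if tumour_hits:
        return max(tumour_hits, key=tumour_hits.get)
    return counts.most_common(1)[0][0]

print(slide_label(["Normal"] * 90 + ["LUAD"] * 10))  # → LUAD
```

The point of a tumour-first rule is that a slide dominated by normal tissue but containing a small focus of carcinoma should still be labelled as carcinoma, which a plain majority vote over tiles would miss.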


Cancers ◽  
2021 ◽  
Vol 14 (1) ◽  
pp. 12
Author(s):  
Jose M. Castillo T. ◽  
Muhammad Arif ◽  
Martijn P. A. Starmans ◽  
Wiro J. Niessen ◽  
Chris H. Bangma ◽  
...  

The computer-aided analysis of prostate multiparametric MRI (mpMRI) could improve significant-prostate-cancer (PCa) detection. Various deep-learning- and radiomics-based methods for significant-PCa segmentation or classification have been reported in the literature. To assess the generalizability of these methods, evaluation on various external data sets is crucial. While deep-learning and radiomics approaches have been compared on a single-centre data set, a comparison of both approaches on data sets from different centres and different scanners is lacking. The goal of this study was to compare the performance of a deep-learning model with that of a radiomics model for significant-PCa diagnosis across patient cohorts. We included data from two consecutive patient cohorts from our own centre (n = 371 patients) and two external sets, one a publicly available patient cohort (n = 195 patients) and the other containing data from patients from two hospitals (n = 79 patients). For all patients, the mpMRI scans, radiologist tumor delineations and pathology reports were collected. During training, one of our patient cohorts (n = 271 patients) was used for both deep-learning- and radiomics-model development, and the three remaining cohorts (n = 374 patients) were kept as unseen test sets. Model performance was assessed by the area under the receiver-operating-characteristic curve (AUC). Whereas internal cross-validation showed a higher AUC for the deep-learning approach, the radiomics model obtained AUCs of 0.88, 0.91 and 0.65 on the independent test sets, compared to AUCs of 0.70, 0.73 and 0.44 for the deep-learning model. 
Our radiomics model, based on delineated regions, thus proved a more accurate tool for significant-PCa classification on the three unseen test sets than the fully automated deep-learning model.


2021 ◽  
Vol 32 ◽  
pp. S926-S927
Author(s):  
G. Toyokawa ◽  
Y. Yamada ◽  
N. Haratake ◽  
Y. Shiraishi ◽  
T. Takenaka ◽  
...  

2020 ◽  
Vol 11 (12) ◽  
pp. 3615-3622 ◽  
Author(s):  
Lei Cong ◽  
Wanbing Feng ◽  
Zhigang Yao ◽  
Xiaoming Zhou ◽  
Wei Xiao

Sensors ◽  
2020 ◽  
Vol 20 (9) ◽  
pp. 2556
Author(s):  
Liyang Wang ◽  
Yao Mu ◽  
Jing Zhao ◽  
Xiaoya Wang ◽  
Huilian Che

The clinical symptoms of prediabetes are mild and easy to overlook, but prediabetes may develop into diabetes if early intervention is not performed. In this study, a deep learning model, referred to as IGRNet, is developed to effectively detect and diagnose prediabetes in a non-invasive, real-time manner using a 12-lead electrocardiogram (ECG) lasting 5 s. After searching for an appropriate activation function, we compared two mainstream deep neural networks (AlexNet and GoogLeNet) and three traditional machine learning algorithms to verify the superiority of our method. The diagnostic accuracy of IGRNet is 0.781, and the area under the receiver operating characteristic curve (AUC) is 0.777, after testing on the independent test set including a mixed group. Furthermore, the accuracy and AUC are 0.856 and 0.825, respectively, on the normal-weight-range test set. The experimental results indicate that IGRNet diagnoses prediabetes with high accuracy using ECGs, outperforming other existing machine learning methods; this suggests its potential for application in clinical practice as a non-invasive prediabetes diagnosis technology.


2021 ◽  
Vol 11 ◽  
Author(s):  
Xianyu Zhang ◽  
Hui Li ◽  
Chaoyun Wang ◽  
Wen Cheng ◽  
Yuntao Zhu ◽  
...  

Background: Breast ultrasound is the first choice for breast tumor diagnosis in China, but the Breast Imaging Reporting and Data System (BI-RADS) categorization routinely used in the clinic often leads to unnecessary biopsy, and radiologists cannot predict molecular subtypes, which carry important pathological information that can guide clinical treatment. Materials and Methods: This retrospective study collected breast ultrasound images from two hospitals and, after strict selection, formed training, test and external test sets of 2,822, 707, and 210 ultrasound images, respectively. An optimized deep learning model (DLM) was constructed with the training set, and its performance was verified on both the test set and the external test set. Diagnostic results were compared with the BI-RADS categorization determined by radiologists. We divided breast cancer into molecular subtypes according to hormone receptor (HR) and human epidermal growth factor receptor 2 (HER2) expression, and the ability of the DLM to predict molecular subtypes was confirmed in the test set. Results: In the test set, with pathological results as the gold standard, the accuracy, sensitivity and specificity were 85.6, 98.7, and 63.1%, respectively, according to the BI-RADS categorization. The same set achieved an accuracy, sensitivity, and specificity of 89.7, 91.3, and 86.9%, respectively, when using the DLM. The area under the curve (AUC) was 0.96 for the test set and 0.90 for the external test set. The diagnostic accuracy with the DLM was 92.86% in BI-RADS 4a patients; approximately 70.76% of these cases were judged as benign tumors, theoretically reducing unnecessary biopsy by 67.86%. However, the false negative rate was 10.4%. The DLM also showed a good prediction effect for the molecular subtypes of breast cancer, 
with AUCs of 0.864, 0.811, and 0.837 for the triple-negative, HER2 (+) and HR (+) subtype predictions, respectively. Conclusion: This study showed that the DLM was highly accurate in recognizing breast tumors from ultrasound images and can thus greatly reduce the incidence of unnecessary biopsy, especially for patients with BI-RADS 4a. In addition, the predictive ability of this model for molecular subtypes was satisfactory, which has specific clinical application value.
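The accuracy, sensitivity, and specificity figures quoted in these abstracts follow directly from a binary confusion matrix against the pathological gold standard. A minimal helper, with hypothetical counts (the actual per-cell counts are not given in the abstract):

```python
# Standard binary classification metrics from confusion-matrix counts.
def binary_metrics(tp, fp, tn, fn):
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # recall on malignant cases
        "specificity": tn / (tn + fp),   # recall on benign cases
    }

m = binary_metrics(tp=91, fp=13, tn=87, fn=9)   # hypothetical counts
print(m)
```

The trade-off visible in the abstract (BI-RADS: very high sensitivity, low specificity; DLM: both balanced) is exactly a shift along this confusion-matrix trade-off, which is why the two operating points can share similar accuracy while behaving very differently on benign cases.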


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Jae Won Choi ◽  
Yeon Jin Cho ◽  
Ji Young Ha ◽  
Seul Bi Lee ◽  
Seunghyun Lee ◽  
...  

Abstract This study aimed to evaluate a deep learning model for generating synthetic contrast-enhanced CT (sCECT) from non-contrast chest CT (NCCT). We collected three separate data sets: a development set (n = 25) for model training and tuning, test set 1 (n = 25) for technical evaluation, and test set 2 (n = 12) for clinical utility evaluation. In test set 1, image similarity metrics were calculated. In test set 2, the lesion contrast-to-noise ratio of the mediastinal lymph nodes was measured, and an observer study was conducted to compare lesion conspicuity. Comparisons were performed using the paired t-test or Wilcoxon signed-rank test. In test set 1, sCECT showed a lower mean absolute error (41.72 vs 48.74; P < .001), higher peak signal-to-noise ratio (17.44 vs 15.97; P < .001), higher multiscale structural similarity index measurement (0.84 vs 0.81; P < .001), and lower learned perceptual image patch similarity metric (0.14 vs 0.15; P < .001) than NCCT. In test set 2, the contrast-to-noise ratio of the mediastinal lymph nodes was higher in the sCECT group than in the NCCT group (6.15 ± 5.18 vs 0.74 ± 0.69; P < .001). For all reviewers, the observer study showed higher lesion conspicuity in NCCT with sCECT than in NCCT alone (P ≤ .001). Synthetic CECT generated from NCCT improves the depiction of mediastinal lymph nodes.
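The lesion contrast-to-noise ratio (CNR) reported above is a region-of-interest statistic. One common definition, used here as an assumption since the abstract does not spell out its formula, is the absolute difference of the mean intensities of the lesion and background ROIs divided by the background standard deviation:

```python
# CNR sketch: |mean(lesion) - mean(background)| / std(background).
# The ROI shapes and Hounsfield-unit distributions below are made up.
import numpy as np

def cnr(lesion_roi, background_roi):
    lesion = np.asarray(lesion_roi, dtype=float)
    bg = np.asarray(background_roi, dtype=float)
    return abs(lesion.mean() - bg.mean()) / bg.std()

rng = np.random.default_rng(1)
node = rng.normal(120, 10, size=(9, 9))   # enhanced lymph node, HU values
bg = rng.normal(60, 10, size=(30, 30))    # mediastinal background, HU values
print(round(float(cnr(node, bg)), 2))
```

Under this reading, the jump from 0.74 to 6.15 means the synthetic contrast raised the lymph nodes several background standard deviations above the mediastinum, which is what drove the higher reviewer conspicuity scores.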


Author(s):  
Chi-Chih Wang ◽  
Yu-Ching Chiu ◽  
Wei-Liang Chen ◽  
Tzu-Wei Yang ◽  
Ming-Chang Tsai ◽  
...  

Gastroesophageal reflux disease (GERD) is a common disease with high prevalence, and its endoscopic severity can be evaluated using the Los Angeles classification (LA grade). This paper proposes a deep learning model (GERD-VGGNet) that employs convolutional neural networks for the automatic classification and interpretation of routine GERD LA grades. The proposed model employs a data augmentation technique, a two-stage no-freezing fine-tuning policy, and an early stopping criterion, and as a result exhibits high generalizability. A dataset of images from 464 patients was used for model training and validation, and an additional 32 patients served as a test set to evaluate the accuracy of both the model and our trainees. Experimental results demonstrate that the best model on the development set exhibited an overall accuracy of 99.2% (grade A–B), 100% (grade C–D), and 100% (normal group) using narrow-band imaging (NBI) endoscopy. On the test set, the proposed model achieved an accuracy of 87.9%, significantly higher than that of the trainees (75.0% and 65.6%). The proposed GERD-VGGNet model can assist in the automatic classification of GERD in conventional and NBI environments and thereby increase the accuracy of interpretation by inexperienced endoscopists.
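Of the three training techniques named above, the early stopping criterion is the most mechanical. A framework-free sketch of the usual patience-based variant follows; the patience value and the loss stream are illustrative assumptions, as the paper's exact criterion is not given in the abstract:

```python
# Patience-based early stopping: stop once the best validation loss has
# gone `patience` consecutive epochs without improving.
def early_stop_epoch(val_losses, patience=3):
    """Return the epoch index at which training would stop."""
    best, best_epoch = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch = loss, epoch       # new best checkpoint
        elif epoch - best_epoch >= patience:
            return epoch                         # patience exhausted
    return len(val_losses) - 1                   # ran to completion

print(early_stop_epoch([0.9, 0.7, 0.6, 0.65, 0.64, 0.66, 0.7]))  # → 5
```

In practice the model weights from the best epoch (here epoch 2) would be restored, which is how early stopping limits overfitting on a small endoscopy dataset like this one.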

