Lung Cancer and Granuloma Identification Using a Deep Learning Model to Extract 3-Dimensional Radiomics Features in CT Imaging

Author(s):  
Xiaofeng Lin ◽  
Han Jiao ◽  
Zhiyong Pang ◽  
Huai Chen ◽  
Weijie Wu ◽  
...  
2021 ◽  
Vol 39 (15_suppl) ◽  
pp. 8536-8536
Author(s):  
Gouji Toyokawa ◽  
Fahdi Kanavati ◽  
Seiya Momosaki ◽  
Kengo Tateishi ◽  
Hiroaki Takeoka ◽  
...  

8536 Background: Lung cancer is the leading cause of cancer-related death in many countries, and its prognosis remains unsatisfactory. Since treatment approaches differ substantially based on the subtype, such as adenocarcinoma (ADC), squamous cell carcinoma (SCC) and small cell lung cancer (SCLC), an accurate histopathological diagnosis is of great importance. However, if the specimen is composed solely of poorly differentiated cancer cells, distinguishing between histological subtypes can be difficult. The present study developed a deep learning model to classify lung cancer subtypes from whole slide images (WSIs) of transbronchial lung biopsy (TBLB) specimens, in particular with the aim of using this model to evaluate a challenging test set of indeterminate cases. Methods: Our deep learning model consisted of two separately trained components: a convolutional neural network tile classifier and a recurrent neural network tile aggregator for the WSI diagnosis. We used a training set consisting of 638 WSIs of TBLB specimens to train the model to classify lung cancer subtypes (ADC, SCC and SCLC) and non-neoplastic lesions. The training set consisted of 593 WSIs for which the diagnosis had been determined by pathologists based on the visual inspection of Hematoxylin-Eosin (HE) slides and of 45 WSIs of indeterminate cases (64 ADCs and 19 SCCs). We then evaluated the model using five independent test sets. For each test set, we computed the receiver operating characteristic (ROC) area under the curve (AUC). Results: We applied the model to an indeterminate test set of WSIs obtained from TBLB specimens that pathologists had not been able to conclusively diagnose by examining the HE-stained specimens alone. Overall, the model achieved ROC AUCs of 0.993 (confidence interval [CI] 0.971-1.0) and 0.996 (CI 0.981-1.0) for ADC and SCC, respectively. We further evaluated the model using five independent test sets consisting of both TBLB and surgically resected lung specimens (combined total of 2490 WSIs) and obtained ROC AUCs ranging from 0.94 to 0.99. Conclusions: In this study, we demonstrated that a deep learning model could be trained to predict lung cancer subtypes in indeterminate TBLB specimens. These results suggest that, if deployed in clinical practice, such a model could aid pathologists in diagnosing indeterminate cases, allowing a diagnosis to be obtained sooner and reducing the costs of further investigations.
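The per-subtype ROC AUC reported in this abstract can be computed one-vs-rest from slide-level scores. The following is a minimal sketch rather than the authors' evaluation code: the function name `roc_auc` and the example scores are illustrative assumptions. It uses the rank (Mann-Whitney) formulation, in which the AUC is the probability that a randomly chosen positive slide receives a higher score than a randomly chosen negative slide, with ties counted as one half.

```python
def roc_auc(labels, scores):
    """One-vs-rest ROC AUC from binary labels (1 = target subtype) and scores.

    Hypothetical helper: computes the fraction of (positive, negative) pairs
    in which the positive slide has the higher score; ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative slide")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up example: 1 = ADC, 0 = any other class; scores are P(ADC).
    labels = [1, 0, 1, 0]
    scores = [0.9, 0.8, 0.3, 0.1]
    print(roc_auc(labels, scores))
```

The pairwise loop is quadratic in the number of slides, which is acceptable for test sets of a few thousand WSIs; a sort-based rank computation would be the usual choice for larger sets.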


2021 ◽  
Vol 32 ◽  
pp. S926-S927
Author(s):  
G. Toyokawa ◽  
Y. Yamada ◽  
N. Haratake ◽  
Y. Shiraishi ◽  
T. Takenaka ◽  
...  

2020 ◽  
Vol 11 (12) ◽  
pp. 3615-3622 ◽  
Author(s):  
Lei Cong ◽  
Wanbing Feng ◽  
Zhigang Yao ◽  
Xiaoming Zhou ◽  
Wei Xiao

2020 ◽  
Vol 3 (1) ◽  
Author(s):  
Shih-Cheng Huang ◽  
Tanay Kothari ◽  
Imon Banerjee ◽  
Chris Chute ◽  
Robyn L. Ball ◽  
...  

2019 ◽  
Author(s):  
Shih-Cheng Huang ◽  
Tanay Kothari ◽  
Imon Banerjee ◽  
Chris Chute ◽  
Robyn L. Ball ◽  
...  

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Fahdi Kanavati ◽  
Gouji Toyokawa ◽  
Seiya Momosaki ◽  
Hiroaki Takeoka ◽  
Masaki Okamoto ◽  
...  

Abstract. The differentiation between major histological types of lung cancer, such as adenocarcinoma (ADC), squamous cell carcinoma (SCC), and small-cell lung cancer (SCLC), is of crucial importance for determining optimum cancer treatment. Hematoxylin and Eosin (H&E)-stained slides of small transbronchial lung biopsy (TBLB) specimens are one of the primary sources for making a diagnosis; however, a subset of cases is difficult for pathologists to diagnose from H&E-stained slides alone, and these either require further immunohistochemistry or are deferred to surgical resection for definitive diagnosis. We trained a deep learning model to classify H&E-stained whole slide images (WSIs) of TBLB specimens into ADC, SCC, SCLC, and non-neoplastic using a training set of 579 WSIs. The trained model classified an independent test set of 83 challenging indeterminate cases with a receiver operating characteristic area under the curve (AUC) of 0.99. We further evaluated the model on four independent test sets (one TBLB and three surgical, with a combined total of 2407 WSIs), obtaining AUCs ranging from 0.94 to 0.99.


BMC Medicine ◽  
2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Huan Yang ◽  
Lili Chen ◽  
Zhiqiang Cheng ◽  
Minglei Yang ◽  
Jianbo Wang ◽  
...  

Abstract. Background: Targeted therapy and immunotherapy place higher demands on accurate lung cancer classification and on discrimination between benign and malignant disease. Digital whole slide images (WSIs) have enabled the transition from traditional histopathology to computational approaches, prompting a surge of interest in deep learning methods for histopathological analysis. We aimed to explore the potential of deep learning models for identifying lung cancer subtypes and cancer mimics from WSIs. Methods: We initially obtained 741 WSIs from the First Affiliated Hospital of Sun Yat-sen University (SYSUFH) for deep learning model development, optimization, and verification. An additional 318 WSIs from SYSUFH, 212 from Shenzhen People’s Hospital, and 422 from The Cancer Genome Atlas were collected for multi-centre verification. EfficientNet-B5- and ResNet-50-based deep learning methods were developed and compared using recall, precision, F1-score, and area under the curve (AUC). A threshold-based tumour-first aggregation approach was proposed and implemented for inferring the labels of WSIs with complex tissue components. Four pathologists with different levels of experience from SYSUFH blindly reviewed all the testing slides, and their diagnoses were used for quantitative comparison with the best-performing deep learning model. Results: We developed the first deep learning-based six-type classifier for histopathological WSI classification of lung adenocarcinoma, lung squamous cell carcinoma, small cell lung carcinoma, pulmonary tuberculosis, organizing pneumonia, and normal lung. The EfficientNet-B5-based model outperformed ResNet-50 and was selected as the backbone of the classifier. Tested on 1067 slides from four cohorts from different medical centres, the classifier achieved AUCs of 0.970, 0.918, 0.963, and 0.978, respectively. It showed high consistency with the ground truth and with attending pathologists, with intraclass correlation coefficients above 0.873. Conclusions: Multi-cohort testing demonstrated that our six-type classifier achieved consistent performance comparable to that of experienced pathologists and gained advantages over other existing computational methods. Visualization of the prediction heatmap improved model interpretability. The classifier with the threshold-based tumour-first label inferencing method exhibited excellent accuracy and feasibility in classifying lung cancers and easily confused nonneoplastic tissues, indicating that deep learning can resolve complex multi-class tissue classification that conforms to real-world histopathological scenarios.
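The abstract names a threshold-based tumour-first aggregation step but does not give its rule or threshold, so the following is only one plausible reading, sketched under stated assumptions: per-tile class predictions are already available, a tumour class wins whenever it covers at least a chosen fraction of tiles, and the slide otherwise takes its most frequent non-tumour class. The class identifiers, the function name `infer_slide_label`, and the default threshold of 0.1 are all hypothetical.

```python
from collections import Counter

# Assumed short names for the six classes listed in the abstract.
TUMOUR = ("LUAD", "LUSC", "SCLC")   # adenocarcinoma, squamous, small cell
NON_TUMOUR = ("PTB", "OP", "NL")    # tuberculosis, organizing pneumonia, normal


def infer_slide_label(tile_labels, tumour_threshold=0.1):
    """Infer a slide-level label from per-tile predicted labels.

    Hypothetical rule (not taken from the paper): tumour classes are checked
    first, and the most frequent tumour class covering at least
    `tumour_threshold` of the tiles is returned. If none qualifies, the most
    frequent non-tumour class is returned.
    """
    if not tile_labels:
        raise ValueError("slide has no tiles")
    counts = Counter(tile_labels)
    total = len(tile_labels)
    # Tumour-first: keep tumour classes whose tile fraction meets the threshold.
    candidates = [c for c in TUMOUR if counts[c] / total >= tumour_threshold]
    if candidates:
        return max(candidates, key=lambda c: counts[c])
    present = [c for c in NON_TUMOUR if counts[c] > 0]
    if present:
        return max(present, key=lambda c: counts[c])
    # Only sub-threshold tumour tiles exist: fall back to the commonest label.
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    tiles = ["NL"] * 80 + ["LUAD"] * 15 + ["OP"] * 5
    print(infer_slide_label(tiles))
```

Checking tumour classes before non-tumour ones reflects the intuition that a small malignant region should not be outvoted by surrounding benign tissue; the threshold guards against a handful of misclassified tiles flipping the slide label.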


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. 1556-1556
Author(s):  
Alexander S. Rich ◽  
Barry Leybovich ◽  
Melissa Estevez ◽  
Jamie Irvine ◽  
Nisha Singh ◽  
...  

1556 Background: Identifying patients with a particular cancer and determining the date of that diagnosis from EHR data is important for selecting real-world research cohorts and conducting downstream analyses. However, cancer diagnoses and their dates are often not accurately recorded in the EHR in a structured form. We developed a unified deep learning model for identifying patients with NSCLC and their initial and advanced diagnosis date(s). Methods: The study used a cohort of 52,834 patients with lung cancer ICD codes from the nationwide deidentified Flatiron Health EHR-derived database. For all patients in the cohort, abstractors used an in-house technology-enabled platform to identify an NSCLC diagnosis, advanced disease, and relevant diagnosis date(s) via chart review. Advanced NSCLC was defined as stage IIIB or IV disease at diagnosis or early-stage disease that recurred or progressed. The deep learning model was trained on 38,517 patients, with a separate 14,317-patient test cohort. The model input was a set of sentences containing keywords related to (a)NSCLC, extracted from a patient’s EHR documents. Each sentence was associated with a date, using the document timestamp or, if present, a date mentioned explicitly in the sentence. The sentences were processed by a GRU network, followed by an attentional network that integrated across sentences, outputting a prediction of whether the patient had been diagnosed with (a)NSCLC and, if so, the diagnosis date(s). We measured sensitivity and positive predictive value (PPV) of extracting the presence of initial and advanced diagnoses in the test cohort. Among patients with both model-extracted and abstracted diagnosis dates, we also measured 30-day accuracy, defined as the proportion of patients for whom the dates match to within 30 days. Real-world overall survival (rwOS) for patients abstracted vs. model-extracted as advanced was calculated using Kaplan-Meier methods (index date: abstracted vs. model-extracted advanced diagnosis date). Results: Results in the Table show the sensitivity, PPV, and accuracy of the model-extracted diagnoses and dates. RwOS was similar using model-extracted aNSCLC diagnosis dates (median = 13.7 months) versus abstracted diagnosis dates (median = 13.3 months), with a difference of 0.4 months (95% CI = [0.0, 0.8]). Conclusions: Initial and advanced diagnosis of NSCLC and dates of diagnosis can be accurately extracted from unstructured clinical text using a deep learning algorithm. This can further enable the use of EHR data for research on real-world treatment patterns and outcomes analysis, and for other applications such as clinical trial matching. Future work should aim to understand the impact of model errors on downstream analyses. [Table: see text]
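The three evaluation metrics named in this abstract (sensitivity, PPV, and 30-day accuracy) can be illustrated with a short sketch. This is not the study's code: the function names, the dictionary-based inputs keyed by patient ID, and the example data are assumptions, and treating a difference of exactly 30 days as a match is one reading of "within 30 days".

```python
from datetime import date


def sensitivity_ppv(abstracted, predicted):
    """Sensitivity and PPV of model-extracted diagnoses against chart review.

    Both arguments map patient ID -> bool (diagnosis present). Hypothetical
    helper; patients missing from `predicted` are treated as negative.
    """
    tp = sum(1 for pid, truth in abstracted.items()
             if truth and predicted.get(pid, False))
    fn = sum(1 for pid, truth in abstracted.items()
             if truth and not predicted.get(pid, False))
    fp = sum(1 for pid, pred in predicted.items()
             if pred and not abstracted.get(pid, False))
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    return sensitivity, ppv


def accuracy_within_days(abstracted_dates, model_dates, window_days=30):
    """Share of patients whose two dates differ by at most `window_days`.

    Only patients with both an abstracted and a model-extracted date are
    counted, as in the abstract's definition of 30-day accuracy.
    """
    shared = [pid for pid in abstracted_dates if pid in model_dates]
    if not shared:
        raise ValueError("no patients with both dates")
    hits = sum(
        1 for pid in shared
        if abs((abstracted_dates[pid] - model_dates[pid]).days) <= window_days
    )
    return hits / len(shared)


if __name__ == "__main__":
    truth = {"a": True, "b": True, "c": False, "d": False}
    pred = {"a": True, "b": False, "c": True, "d": False}
    print(sensitivity_ppv(truth, pred))

    abstracted = {"a": date(2020, 1, 1), "b": date(2020, 3, 1)}
    extracted = {"a": date(2020, 1, 20), "b": date(2020, 5, 1)}
    print(accuracy_within_days(abstracted, extracted))
```

Restricting the date comparison to patients who have both dates keeps the accuracy figure separate from detection errors, which sensitivity and PPV already capture.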

