An annotation-free whole-slide training approach to pathological classification of lung cancer types using deep learning

2021
Vol 12 (1)
Author(s):
Chi-Long Chen
Chi-Chung Chen
Wei-Hsiang Yu
Szu-Hua Chen
Yu-Chan Chang
...

Abstract: Deep learning for digital pathology is hindered by the extremely high spatial resolution of whole-slide images (WSIs). Most studies have employed patch-based methods, which often require detailed annotation of image patches. This typically involves laborious free-hand contouring on WSIs. To alleviate the burden of such contouring and obtain the benefits of scaling up training with numerous WSIs, we develop a method for training neural networks on entire WSIs using only slide-level diagnoses. Our method leverages the unified memory mechanism to overcome the memory constraint of compute accelerators. Experiments conducted on a data set of 9662 lung cancer WSIs reveal that the proposed method achieves areas under the receiver operating characteristic curve of 0.9594 and 0.9414 for adenocarcinoma and squamous cell carcinoma classification on the testing set, respectively. Furthermore, the method demonstrates higher classification performance than multiple-instance learning as well as strong localization results for small lesions through class activation mapping.
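
The lesion localization mentioned above relies on class activation mapping (CAM), which projects the classifier's final-layer weights back onto the convolutional feature maps. A minimal PyTorch sketch of CAM follows; the ResNet-18 backbone and the small 512 x 512 input tile are illustrative stand-ins, not the authors' whole-slide network or their unified-memory training setup.

```python
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=None)  # hypothetical backbone, not the authors' network
model.eval()

features = {}
def hook(module, inp, out):
    # Capture the last convolutional feature maps during the forward pass.
    features["maps"] = out
model.layer4.register_forward_hook(hook)

x = torch.randn(1, 3, 512, 512)  # a tile standing in for a (much larger) whole-slide input
with torch.no_grad():
    logits = model(x)

class_idx = logits.argmax(dim=1).item()
fc_weights = model.fc.weight[class_idx]            # (C,) weights of the predicted class
fmap = features["maps"][0]                         # (C, H, W) feature maps
cam = torch.einsum("c,chw->hw", fc_weights, fmap)  # class-weighted sum over channels
cam = F.relu(cam)
cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
cam = F.interpolate(cam[None, None], size=x.shape[-2:], mode="bilinear",
                    align_corners=False)[0, 0]     # upsample heatmap to input resolution
```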

2021
Author(s):
Mizuho Mori
Yoshiko Ariji
Motoki Fukuda
Tomoya Kitano
Takuma Funakoshi
...

Abstract: Objectives: The aim of the present study was to create and test an automatic system for assessing the technical quality of positioning in periapical radiography of the maxillary canines using deep learning classification and segmentation techniques. Methods: We created and tested two deep learning systems using 500 periapical radiographs (250 each of good- and bad-quality images). We assigned 350, 70, and 80 images as the training, validation, and test datasets, respectively. The learning model of system 1 was created with only the classification process, whereas system 2 consisted of both the segmentation and classification models. In each model, 500 epochs of training were performed using AlexNet and U-Net for classification and segmentation, respectively. The segmentation results were evaluated by the intersection over union method, with values of 0.6 or more considered a success. The classification results were compared between the two systems. Results: The segmentation performance of system 2 yielded recall, precision, and F-measure values of 0.937, 0.961, and 0.949, respectively. System 2 showed better classification performance values than those obtained by system 1. The area under the receiver operating characteristic curve values differed significantly between system 1 (0.649) and system 2 (0.927). Conclusions: The deep learning systems we created appeared to have potential benefits for evaluating the technical positioning quality of periapical radiographs through the use of segmentation and classification functions.
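
The segmentation success criterion above is an intersection-over-union (IoU) of at least 0.6 between the predicted and reference regions. A minimal sketch of that check on binary masks follows; the array contents and the per-image application of the 0.6 cut-off are illustrative assumptions, not the study's code.

```python
import numpy as np

def iou(pred_mask: np.ndarray, true_mask: np.ndarray) -> float:
    """Intersection over union between two binary masks of equal shape."""
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    intersection = np.logical_and(pred, true).sum()
    union = np.logical_or(pred, true).sum()
    return intersection / union if union > 0 else 1.0  # both empty counts as a match

# Example: a segmentation counts as a success when IoU >= 0.6 (toy masks, not real data).
pred = np.zeros((64, 64), dtype=bool); pred[10:40, 10:40] = True
true = np.zeros((64, 64), dtype=bool); true[15:45, 15:45] = True
score = iou(pred, true)
print(score, score >= 0.6)
```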


Cancers
2021
Vol 14 (1)
pp. 12
Author(s):
Jose M. Castillo T.
Muhammad Arif
Martijn P. A. Starmans
Wiro J. Niessen
Chris H. Bangma
...

The computer-aided analysis of prostate multiparametric MRI (mpMRI) could improve significant-prostate-cancer (PCa) detection. Various deep-learning- and radiomics-based methods for significant-PCa segmentation or classification have been reported in the literature. To assess the generalizability of these methods' performance, evaluation on various external data sets is crucial. While deep-learning and radiomics approaches have been compared on the same single-center data set, a comparison of the two approaches on data sets from different centers and different scanners is lacking. The goal of this study was to compare the performance of a deep-learning model with that of a radiomics model for significant-PCa diagnosis across various patient cohorts. We included data from two consecutive patient cohorts from our own center (n = 371 patients) and two external sets, of which one was a publicly available patient cohort (n = 195 patients) and the other contained data from patients from two hospitals (n = 79 patients). For all patients, the mpMRI scans, radiologist tumor delineations, and pathology reports were collected. During training, one of our patient cohorts (n = 271 patients) was used for both the deep-learning- and radiomics-model development, and the three remaining cohorts (n = 374 patients) were kept as unseen test sets. The performance of the models was assessed in terms of the area under the receiver-operating-characteristic curve (AUC). Whereas internal cross-validation showed a higher AUC for the deep-learning approach, the radiomics model obtained AUCs of 0.88, 0.91 and 0.65 on the independent test sets, compared to AUCs of 0.70, 0.73 and 0.44 for the deep-learning model. Our radiomics model, which was based on delineated regions, thus proved a more accurate tool for significant-PCa classification in the three unseen test sets than the fully automated deep-learning model.
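
The central point of the comparison above is that both models are judged by AUC on cohorts never seen during training. A minimal scikit-learn sketch of that evaluation pattern is shown below; the logistic-regression-on-radiomics-features stand-in, the feature dimensionality, and the array names are illustrative assumptions, not the study's actual radiomics pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Stand-ins: rows are lesions, columns are precomputed radiomics features.
X_train, y_train = rng.normal(size=(271, 50)), rng.integers(0, 2, 271)  # development cohort
X_ext1, y_ext1 = rng.normal(size=(100, 50)), rng.integers(0, 2, 100)    # unseen cohorts
X_ext2, y_ext2 = rng.normal(size=(195, 50)), rng.integers(0, 2, 195)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)  # fit only on the development cohort

for name, (X, y) in {"external 1": (X_ext1, y_ext1), "external 2": (X_ext2, y_ext2)}.items():
    auc = roc_auc_score(y, model.predict_proba(X)[:, 1])  # AUC on an unseen test set
    print(f"{name}: AUC = {auc:.2f}")
```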


Heart
2018
Vol 104 (23)
pp. 1921-1928
Author(s):
Ming-Zher Poh
Yukkee Cheung Poh
Pak-Hei Chan
Chun-Ka Wong
Louise Pun
...

Objective: To evaluate the diagnostic performance of a deep learning system for automated detection of atrial fibrillation (AF) in photoplethysmographic (PPG) pulse waveforms. Methods: We trained a deep convolutional neural network (DCNN) to detect AF in 17 s PPG waveforms using a training data set of 149 048 PPG waveforms constructed from several publicly available PPG databases. The DCNN was validated using an independent test data set of 3039 smartphone-acquired PPG waveforms from adults at high risk of AF at a general outpatient clinic against ECG tracings reviewed by two cardiologists. Six established AF detectors based on handcrafted features were evaluated on the same test data set for performance comparison. Results: In the validation data set (3039 PPG waveforms) consisting of three sequential PPG waveforms from 1013 participants (mean (SD) age, 68.4 (12.2) years; 46.8% men), the prevalence of AF was 2.8%. The area under the receiver operating characteristic curve (AUC) of the DCNN for AF detection was 0.997 (95% CI 0.996 to 0.999) and was significantly higher than all the other AF detectors (AUC range: 0.924–0.985). The sensitivity of the DCNN was 95.2% (95% CI 88.3% to 98.7%), specificity was 99.0% (95% CI 98.6% to 99.3%), positive predictive value (PPV) was 72.7% (95% CI 65.1% to 79.3%) and negative predictive value (NPV) was 99.9% (95% CI 99.7% to 100%) using a single 17 s PPG waveform. Using the three sequential PPG waveforms in combination (<1 min in total), the sensitivity was 100.0% (95% CI 87.7% to 100%), specificity was 99.6% (95% CI 99.0% to 99.9%), PPV was 87.5% (95% CI 72.5% to 94.9%) and NPV was 100% (95% CI 99.4% to 100%). Conclusions: In this evaluation of PPG waveforms from adults screened for AF in a real-world primary care setting, the DCNN had high sensitivity, specificity, PPV and NPV for detecting AF, outperforming other state-of-the-art methods based on handcrafted features.
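
The sensitivity, specificity, PPV and NPV quoted above all derive from the same 2 x 2 confusion matrix of DCNN calls against the cardiologist-adjudicated ECG reference. A short sketch of those calculations follows; the example counts are made up for illustration and are not the study's data.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard screening metrics from a 2x2 confusion matrix."""
    return {
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# Hypothetical counts for illustration only (not the published results).
print(diagnostic_metrics(tp=80, fp=30, fn=4, tn=2925))
```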


2021
Vol 2021
pp. 1-13
Author(s):
O. Obulesu
Suresh Kallam
Gaurav Dhiman
Rizwan Patan
Ramana Kadiyala
...

Cancer is a complicated worldwide health issue whose death rate has increased in recent years. With the rapid growth of high-throughput technology and the many machine learning methods that have emerged in recent years, progress in cancer diagnosis has been made on the basis of feature subsets, enabling efficient and precise disease diagnosis. Hence, advanced machine learning techniques that can reliably differentiate lung cancer patients from healthy persons are of great interest. This paper proposes a novel Wilcoxon Signed-Rank Gain Preprocessing combined with Generative Deep Learning, called Wilcoxon Signed Generative Deep Learning (WS-GDL), for lung cancer disease diagnosis. First, significance testing and information gain eliminate redundant and irrelevant attributes and extract informative, significant attributes. Then, the Generative Deep Learning method uses a generator function to learn deep features. Finally, a minimax game (i.e., minimizing error with maximum accuracy) is proposed to diagnose the disease. Numerical experiments on the Thoracic Surgery Data Set are used to test the WS-GDL method's disease diagnosis performance. The WS-GDL approach derives relevant and significant attributes and adaptively diagnoses the disease by selecting optimal learning-model parameters. Quantitative experimental results show that the WS-GDL method achieves better diagnosis performance and higher computing efficiency in terms of computational time, computational complexity, and false-positive rate compared to state-of-the-art approaches.
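
The preprocessing step described above screens attributes with a rank-based significance test and an information-gain score before the generative model is trained. A minimal sketch of that kind of two-filter selection is given below; the use of a Mann-Whitney rank test between the two diagnostic groups, mutual information as the gain measure, and the thresholds are assumptions for illustration, not the paper's exact WS-GDL formulation.

```python
import numpy as np
from scipy.stats import mannwhitneyu
from sklearn.feature_selection import mutual_info_classif

rng = np.random.default_rng(0)
X = rng.normal(size=(470, 16))    # stand-in for the Thoracic Surgery attributes
y = rng.integers(0, 2, size=470)  # 1 = lung cancer patient, 0 = healthy person

# Filter 1: keep features whose distributions differ significantly between classes.
p_values = np.array([
    mannwhitneyu(X[y == 1, j], X[y == 0, j]).pvalue for j in range(X.shape[1])
])

# Filter 2: keep features that carry information about the label.
gain = mutual_info_classif(X, y, random_state=0)

selected = np.where((p_values < 0.05) & (gain > np.median(gain)))[0]
print("selected feature indices:", selected)
```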


2021
Vol 2021
pp. 1-7
Author(s):
Xiaoshuo Li
Wenjun Tan
Pan Liu
Qinghua Zhou
Jinzhu Yang

Novel coronavirus pneumonia (NCP) has become a global pandemic disease, and computed tomography (CT)-based image analysis and recognition is one of the important tools for its clinical diagnosis. To assist medical personnel in achieving an efficient and fast diagnosis of patients with novel coronavirus pneumonia, this paper proposes an assisted diagnosis algorithm based on ensemble deep learning. The method combines stacked-generalization ensemble learning with the VGG16 deep network to form a cascade classifier; the information constituting the cascade classifier comes from multiple subsets of the training set, each of which is used to collect deviation information about the generalization behavior of the data set, and this deviation information feeds the cascade classifier. The algorithm was experimentally validated for classifying patients with novel coronavirus pneumonia, patients with common pneumonia (CP), and normal controls, and it achieved a prediction accuracy of 93.57%, sensitivity of 94.21%, specificity of 93.93%, precision of 89.40%, and F1-score of 91.74% across the three categories. The results show that the proposed method has good classification performance and can significantly improve the performance of deep neural networks in multicategory prediction tasks.
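
The cascade described above follows the stacked-generalization idea: base learners are fit on different subsets (folds) of the training data, and their out-of-fold predictions train a second-level combiner. A compact scikit-learn sketch of that pattern is shown below; it operates on pre-extracted image features (e.g., from a VGG16 backbone) as a stand-in and does not reproduce the paper's full VGG16 cascade.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 512))   # stand-in for CNN features of CT slices
y = rng.integers(0, 3, size=600)  # 0 = NCP, 1 = common pneumonia, 2 = normal control

# Level-0 learners are trained on cross-validation folds; their out-of-fold
# predictions (the "deviation" information) become the inputs of the level-1 combiner.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("svm", SVC(probability=True, random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,  # fold structure that generates the out-of-fold predictions
)
stack.fit(X, y)
print(stack.predict(X[:5]))
```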


Healthcare
2020
Vol 8 (3)
pp. 291
Author(s):
Chunwu Yin
Zhanbo Chen

Disease classification based on machine learning has become a crucial research topic in the fields of genetics and molecular biology. Generally, disease classification involves a supervised learning style; i.e., it requires a large number of labelled samples to achieve good classification performance. However, in the majority of cases, labelled samples are hard to obtain, so the amount of training data is limited. Meanwhile, many unclassified (unlabelled) sequences have been deposited in public databases, which may help the training procedure. Exploiting such data is called semi-supervised learning and is very useful in many applications. Self-training can be implemented by moving from high- to low-confidence samples to prevent noisy samples from affecting the robustness of semi-supervised learning during training. The deep forest method, with the hyperparameter settings used in this paper, can achieve excellent performance. Therefore, in this work, we propose a novel approach combining a deep learning model with self-training semi-supervised learning to improve performance in disease classification; it utilizes unlabelled samples in an update mechanism designed to increase the number of high-confidence pseudo-labelled samples. The experimental results show that our proposed model can achieve good performance in disease classification and disease-causing gene identification.
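
The self-training step above promotes only high-confidence predictions on unlabelled data to pseudo-labels, so that noisy samples do not enter the next training round. A minimal scikit-learn sketch of that mechanism follows; the random-forest base learner and the 0.9 confidence threshold are illustrative choices, not the paper's deep-forest configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.semi_supervised import SelfTrainingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 30))   # stand-in for gene-expression features
y = rng.integers(0, 2, size=1000)
y_partial = y.copy()
y_partial[200:] = -1              # -1 marks the unlabelled samples

# Only predictions with probability >= 0.9 are promoted to pseudo-labels each round.
self_training = SelfTrainingClassifier(
    RandomForestClassifier(n_estimators=200, random_state=0),
    threshold=0.9,
)
self_training.fit(X, y_partial)
print("samples pseudo-labelled:", int((self_training.transduction_[200:] != -1).sum()))
```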


2021
pp. 002203452110404
Author(s):
J. Hao
W. Liao
Y.L. Zhang
J. Peng
Z. Zhao
...

Digital dentistry plays a pivotal role in dental health care. A critical step in many digital dental systems is to accurately delineate individual teeth and the gingiva in 3-dimensional intraoral scanned mesh data. However, previous state-of-the-art methods are either time-consuming or error prone, hence hindering their clinical applicability. This article presents an accurate, efficient, and fully automated deep learning model trained on a data set of 4,000 intraoral scans annotated by experienced human experts. On a holdout data set of 200 scans, our model achieves a per-face accuracy, average-area accuracy, and area under the receiver operating characteristic curve of 96.94%, 98.26%, and 0.9991, respectively, significantly outperforming the state-of-the-art baselines. In addition, our model takes only about 24 s to generate segmentation outputs, as opposed to >5 min by the baseline and 15 min by human experts. A clinical performance test of 500 patients with malocclusion and/or abnormal teeth shows that 96.9% of the segmentations are satisfactory for clinical applications, 2.9% automatically trigger alarms for human improvement, and only 0.2% need rework. Our research demonstrates the potential for deep learning to improve the efficacy and efficiency of dental treatment and digital dentistry.
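
The per-face accuracy and average-area accuracy reported above are both computed over the faces of the scanned mesh, with the latter weighting each face by its surface area; that reading of the second metric is our assumption rather than a definition taken from the article. A small sketch under that assumption follows, with toy labels and areas in place of real mesh data.

```python
import numpy as np

def per_face_accuracy(pred: np.ndarray, true: np.ndarray) -> float:
    """Fraction of mesh faces assigned the correct label (tooth id or gingiva)."""
    return float((pred == true).mean())

def area_weighted_accuracy(pred: np.ndarray, true: np.ndarray, face_area: np.ndarray) -> float:
    """Accuracy where each face contributes proportionally to its surface area."""
    correct = (pred == true).astype(float)
    return float((correct * face_area).sum() / face_area.sum())

# Stand-in labels/areas for a tiny mesh (0 = gingiva, 1-16 = individual teeth).
rng = np.random.default_rng(0)
true = rng.integers(0, 17, size=10000)
pred = true.copy()
pred[rng.choice(10000, size=300, replace=False)] = 0  # simulate some mislabelled faces
area = rng.uniform(0.1, 1.0, size=10000)

print(per_face_accuracy(pred, true), area_weighted_accuracy(pred, true, area))
```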


2021
pp. 20200611
Author(s):
Masako Nishiyama
Kenichiro Ishibashi
Yoshiko Ariji
Motoki Fukuda
Wataru Nishiyama
...

Objective: The present study aimed to verify the classification performance of deep learning (DL) models for diagnosing fractures of the mandibular condyle on panoramic radiographs using data sets from two hospitals and to compare their internal and external validities. Methods: Panoramic radiographs of 100 condyles with and without fractures were collected from two hospitals, and a fivefold cross-validation method was employed to construct and evaluate the DL models. The internal and external validities of classification performance were evaluated as accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve (AUC). Results: For internal validity, high classification performance was obtained, with AUC values of >0.85. Conversely, external validity on the data sets from the two hospitals exhibited low performance. Using combined data sets from both hospitals, the DL model exhibited high performance, which was slightly superior or equal to that of the internal validity but without a statistically significant difference. Conclusion: The constructed DL model can be clinically employed for diagnosing fractures of the mandibular condyle using panoramic radiographs. However, the domain-shift phenomenon should be considered when generalizing DL systems.
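
The internal validity above comes from fivefold cross-validation within one hospital's radiographs, while the external validity comes from applying the resulting model to the other hospital's data. A short sketch of that evaluation split follows; the logistic-regression stand-in on precomputed image features is an illustrative assumption, not the study's DL model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
# Stand-ins: precomputed features for condyle regions from two hospitals.
X_a, y_a = rng.normal(size=(100, 64)), rng.integers(0, 2, 100)  # hospital A
X_b, y_b = rng.normal(size=(100, 64)), rng.integers(0, 2, 100)  # hospital B

model = LogisticRegression(max_iter=1000)

# Internal validity: fivefold cross-validation within hospital A.
internal_auc = cross_val_score(model, X_a, y_a, cv=5, scoring="roc_auc").mean()

# External validity: train on all of hospital A, test on hospital B.
model.fit(X_a, y_a)
external_auc = roc_auc_score(y_b, model.predict_proba(X_b)[:, 1])

print(f"internal AUC: {internal_auc:.2f}, external AUC: {external_auc:.2f}")
```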


2020
pp. 221-233
Author(s):
Yijiang Chen
Andrew Janowczyk
Anant Madabhushi

PURPOSE Deep learning (DL), a class of approaches involving self-learned discriminative features, is increasingly being applied to digital pathology (DP) images for tasks such as disease identification and segmentation of tissue primitives (eg, nuclei, glands, lymphocytes). One application of DP is in telepathology, which involves digitally transmitting DP slides over the Internet for secondary diagnosis by an expert at a remote location. Unfortunately, the places benefiting most from telepathology often have poor Internet quality, resulting in prohibitive transmission times of DP images. Image compression may help, but the degree to which image compression affects performance of DL algorithms has been largely unexplored. METHODS We investigated the effects of image compression on the performance of DL strategies in the context of 3 representative use cases involving segmentation of nuclei (n = 137), segmentation of lymph node metastasis (n = 380), and lymphocyte detection (n = 100). For each use case, test images at various levels of compression (JPEG compression quality score ranging from 1-100 and JPEG2000 compression peak signal-to-noise ratio ranging from 18-100 dB) were evaluated by a DL classifier. Performance metrics including F1 score and area under the receiver operating characteristic curve were computed at the various compression levels. RESULTS Our results suggest that DP images can be compressed by 85% while still maintaining the performance of the DL algorithms at 95% of what is achievable without any compression. Interestingly, the maximum compression level sustainable by DL algorithms is similar to where pathologists also reported difficulties in providing accurate interpretations. CONCLUSION Our findings seem to suggest that in low-resource settings, DP images can be significantly compressed before transmission for DL-based telepathology applications.
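
The central experiment above re-encodes test images at decreasing JPEG quality and measures how far the DL metrics degrade. A minimal Pillow sketch of generating such a compression sweep follows; the quality grid, the random stand-in image, and the placeholder `evaluate` hook are assumptions for illustration, not the study's evaluation harness.

```python
import io
import numpy as np
from PIL import Image

def jpeg_compress(img: Image.Image, quality: int) -> Image.Image:
    """Round-trip an image through JPEG at the given quality score (1-100)."""
    buf = io.BytesIO()
    img.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).copy()

def evaluate(img: Image.Image) -> float:
    """Hypothetical hook: run the DL model on the image and return a metric (e.g., F1)."""
    return 0.0  # placeholder, not a real model

rng = np.random.default_rng(0)
original = Image.fromarray(rng.integers(0, 255, size=(256, 256, 3)).astype(np.uint8))

for quality in (100, 75, 50, 25, 10, 1):
    compressed = jpeg_compress(original, quality)
    print(f"JPEG quality {quality}: score = {evaluate(compressed):.3f}")
```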


2020
Author(s):
Hyung Jun Park
Dae Yon Jung
Wonjun Ji
Chang-Min Choi

BACKGROUND Detecting bacteremia among surgical in-patients is more difficult than in other patients because of the inflammatory condition caused by the surgery. Previous criteria such as systemic inflammatory response syndrome or Sepsis-3 are not readily applicable in general wards, and thus many clinicians rely on practical experience to diagnose postoperative infection. OBJECTIVE This study aims to evaluate the performance of continuous monitoring with a deep learning model for early detection of bacteremia in surgical in-patients in the general ward and the intensive care unit (ICU). METHODS In this retrospective cohort study, we included 36,023 consecutive patients who underwent general surgery between October and December 2017 at a tertiary referral hospital in South Korea. The primary outcome was the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve (AUPRC) for detecting bacteremia with the deep learning model, and the secondary outcome was the feature explainability of the model by occlusion analysis. RESULTS Out of the 36,023 patients in the data set, 720 cases of bacteremia were included. Our deep learning–based model showed an AUROC of 0.97 (95% CI 0.974-0.981) and an AUPRC of 0.17 (95% CI 0.147-0.203) for detecting bacteremia in surgical in-patients. For predicting bacteremia within the previous 24-hour period, the AUROC and AUPRC values were 0.93 and 0.15, respectively. Occlusion analysis showed that vital signs and laboratory measurements (eg, kidney function tests and the white blood cell group) were the most important variables for detecting bacteremia. CONCLUSIONS A deep learning model based on time-series electronic health record data had a high detective ability for bacteremia in surgical in-patients in the general ward and the ICU. The model may be able to assist clinicians in evaluating infection among in-patients, ordering blood cultures, and prescribing antibiotics with real-time monitoring.
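
The occlusion analysis mentioned above estimates each input variable's importance by masking it and measuring how much the model's bacteremia score drops. A minimal PyTorch sketch of per-feature occlusion over a time-series input is given below; the tiny GRU model and the mean-value masking are illustrative assumptions, not the study's architecture or masking scheme.

```python
import torch
import torch.nn as nn

class TinyDetector(nn.Module):
    """Stand-in time-series classifier (not the study's model)."""
    def __init__(self, n_features: int):
        super().__init__()
        self.rnn = nn.GRU(n_features, 32, batch_first=True)
        self.head = nn.Linear(32, 1)

    def forward(self, x):                       # x: (batch, time, features)
        _, h = self.rnn(x)
        return torch.sigmoid(self.head(h[-1]))  # probability of bacteremia

torch.manual_seed(0)
model = TinyDetector(n_features=10).eval()
x = torch.randn(1, 48, 10)                      # 48 time steps of 10 vitals/lab values

with torch.no_grad():
    baseline = model(x).item()
    importance = []
    for j in range(x.shape[-1]):
        occluded = x.clone()
        occluded[..., j] = x[..., j].mean()     # mask feature j with its mean value
        importance.append(baseline - model(occluded).item())

print(importance)                               # larger drop => more important feature
```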

