Accurate recognition of colorectal cancer with semi-supervised deep learning on pathological images

Abstract Purposes: Machine-assisted recognition of colorectal cancer from pathological images has mainly relied on supervised learning, which suffers from a significant bottleneck: it requires a large number of labeled training images. Generating high-quality image labels is time-consuming and labor-intensive, and thus lags behind the rapid accumulation of pathological images. We hypothesize that semi-supervised deep learning, a method that leverages a small number of labeled images together with a large quantity of unlabeled images, can provide a powerful alternative strategy for colorectal cancer recognition. Method: We proposed semi-supervised classifiers based on deep learning that provide pathological predictions at both the patch level and the whole slide image (WSI) level. First, we developed a semi-supervised deep learning framework based on the mean teacher method to predict the cancer probability of an individual patch, using patch-level data generated by dividing a WSI into many patches. Second, we developed a patient-level method that applies a cluster-based and positive sensitivity strategy to WSIs to predict whether the WSI, or the associated patient, has cancer. We demonstrated the general utility of the semi-supervised learning method for colorectal cancer prediction on a large data set (13,111 WSIs from 8,803 subjects) gathered from 13 centers across China, the United States and Germany. On this data set, we compared the performance of our proposed semi-supervised learning method with that of prevailing supervised learning methods and six professional pathologists. Results: Our results confirmed that the semi-supervised learning model outperformed supervised learning models when only a small portion of a massive data set was labeled, and performed as well as a supervised learning model trained on massive labeled data.
Specifically, when a small number of training patches (~3,150) were labeled, the proposed semi-supervised learning model, trained with an additional ~40,950 unlabeled patches, performed better than the supervised learning model (AUC: 0.90 ± 0.06 vs. 0.84 ± 0.07, P value = 0.02). When more labeled training patches (~6,300) were available, the semi-supervised learning model, trained with an additional ~37,800 unlabeled patches, still performed significantly better than a supervised learning model (AUC: 0.98 ± 0.01 vs. 0.92 ± 0.04, P value = 0.0004), and its performance did not differ significantly from that of a supervised learning model trained on massive labeled patches (~44,100) (AUC: 0.98 ± 0.01 vs. 0.987 ± 0.01, P value = 0.134). Through extensive patient-level testing on 12,183 WSIs from 12 centers, we found no significant difference in patient-level diagnoses between the semi-supervised learning model (~6,300 labeled, ~37,800 unlabeled training patches) and a supervised learning model (~44,100 labeled training patches) (average AUC: 97.40% vs. 97.96%, P value = 0.117). Moreover, the diagnostic accuracy of the semi-supervised learning model was close to that of human pathologists (average AUC: 97.17% vs. 96.91%). Conclusions: We reported that semi-supervised learning can achieve excellent performance in patch-level and patient-level diagnoses of colorectal cancer in a multi-center study. This finding is particularly useful since massive labeled data are usually not readily available. We demonstrated that our newly proposed semi-supervised learning method can predict colorectal cancer with accuracy matching the average accuracy of pathologists. We thus suggested that semi-supervised learning has great potential for building artificial intelligence (AI) platforms for medical sciences and clinical practice, including pathological diagnosis.
These new platforms will dramatically reduce the cost and the amount of labeled data required for training, which in turn will allow broader adoption of AI-empowered systems for cancer image analysis.
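The abstract names the mean teacher method but does not give implementation details. As a rough illustration of the general technique only (not the authors' code), the sketch below shows its two usual ingredients: a teacher whose weights are an exponential moving average of the student's, and a consistency loss that needs no labels. NumPy arrays stand in for a real network, and the names `ema_update`, `consistency_loss`, `alpha` and `weight` are assumptions made for this example.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with max-subtraction for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def ema_update(teacher_weights, student_weights, alpha=0.99):
    """Move each teacher weight toward the student's (exponential moving average)."""
    return [alpha * t + (1.0 - alpha) * s
            for t, s in zip(teacher_weights, student_weights)]

def consistency_loss(student_logits, teacher_logits):
    """Mean squared difference between student and teacher class probabilities.
    It uses no labels, so it can be computed on unlabeled patches too."""
    diff = softmax(student_logits) - softmax(teacher_logits)
    return float(np.mean(diff ** 2))

def supervised_loss(logits, labels):
    """Cross-entropy, computed on the labeled patches only."""
    probs = softmax(logits)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))

def total_loss(labeled_logits, labels, student_logits, teacher_logits, weight=1.0):
    """Supervised term on labeled data plus weighted consistency term on all data."""
    return (supervised_loss(labeled_logits, labels)
            + weight * consistency_loss(student_logits, teacher_logits))
```

In a training loop, the student would be updated by gradient descent on `total_loss`, and `ema_update` would be called after each step; the value of `alpha` and the weighting schedule are tuning choices that the abstract does not report.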

Accurate recognition of colorectal cancer with semi-supervised deep learning on pathological images

Nature Communications ◽

10.1038/s41467-021-26643-8 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Gang Yu ◽

Kai Sun ◽

Chao Xu ◽

Xing-Hua Shi ◽

Chong Wu ◽

...

Keyword(s):

Artificial Intelligence ◽

Colorectal Cancer ◽

Deep Learning ◽

Supervised Learning ◽

Area Under The Curve ◽

Patient Level ◽

Significant Difference ◽

The Mean ◽

Whole Slide Images ◽

Better Than

Abstract Machine-assisted pathological recognition has focused on supervised learning (SL), which suffers from a significant annotation bottleneck. We propose a semi-supervised learning (SSL) method based on the mean teacher architecture using 13,111 whole slide images of colorectal cancer from 8803 subjects from 13 independent centers. SSL (~3150 labeled, ~40,950 unlabeled; ~6300 labeled, ~37,800 unlabeled patches) performs significantly better than SL. No significant difference is found between SSL (~6300 labeled, ~37,800 unlabeled) and SL (~44,100 labeled) at patch-level diagnoses (area under the curve (AUC): 0.980 ± 0.014 vs. 0.987 ± 0.008, P value = 0.134) and patient-level diagnoses (AUC: 0.974 ± 0.013 vs. 0.980 ± 0.010, P value = 0.117), which is close to human pathologists (average AUC: 0.969). The evaluation on 15,000 lung and 294,912 lymph node images also confirms that SSL can achieve performance similar to that of SL with massive annotations. SSL dramatically reduces the annotations required, which gives it great potential for building expert-level pathological artificial intelligence platforms in practice.
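The patch counts quoted in the abstract can be checked with simple arithmetic: both SSL settings use the same total number of training patches as the fully supervised baseline, and differ only in how many are labeled. The short sketch below works this out; the dictionary keys are labels chosen for this example.

```python
# Approximate (labeled, unlabeled) patch counts quoted in the abstract.
splits = {
    "SSL, fewer labels": (3150, 40950),
    "SSL, more labels": (6300, 37800),
    "SL, fully labeled": (44100, 0),
}

def labeled_fraction(labeled, unlabeled):
    """Share of the training patches that carry a label."""
    return labeled / (labeled + unlabeled)

for name, (labeled, unlabeled) in splits.items():
    total = labeled + unlabeled
    print(f"{name}: total={total}, labeled={labeled_fraction(labeled, unlabeled):.1%}")
# SSL, fewer labels: total=44100, labeled=7.1%
# SSL, more labels: total=44100, labeled=14.3%
# SL, fully labeled: total=44100, labeled=100.0%
```

So the setting reported as statistically indistinguishable from full supervision uses labels for roughly one patch in seven.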

Accurate recognition of colorectal cancer with semi-supervised deep learning on pathological images

10.21203/rs.3.rs-75912/v1 ◽

2020 ◽

Author(s):

Gang Yu ◽

Ting Xie ◽

Chao Xu ◽

Xing-Hua Shi ◽

Chong Wu ◽

...

Keyword(s):

Colorectal Cancer ◽

Deep Learning ◽

Supervised Learning ◽

Medical Sciences ◽

Patient Level ◽

General Utility ◽

Significant Difference ◽

Multi Center Study ◽

The Cost ◽

Whole Slide Images

Abstract Background: The machine-assisted recognition of colorectal cancer has mainly focused on supervised deep learning, which suffers from a significant bottleneck of requiring massive labeled data. We hypothesize that semi-supervised deep learning leveraging a small number of labeled data can provide a powerful alternative strategy. Method: We proposed a semi-supervised model based on mean teacher that provides pathological predictions at both the patch level and the patient level. We demonstrated the general utility of the model using 13,111 whole slide images from 8,803 subjects gathered from 13 centers. We compared our proposed method with prevailing supervised learning and six pathologists. Results: With a small number of labeled training patches (~3,150 labeled, ~40,950 unlabeled or ~6,300 labeled, ~37,800 unlabeled), the semi-supervised model performed significantly better than the supervised model (AUC: 0.90 ± 0.06 vs. 0.84 ± 0.07, P value = 0.02 or AUC: 0.98 ± 0.01 vs. 0.92 ± 0.04, P value = 0.0004). Moreover, we found no significant difference between the supervised model using a massive ~44,100 labeled patches and the semi-supervised model (~6,300 labeled, ~37,800 unlabeled) at patch-level diagnoses (AUC: 0.98 ± 0.01 vs. 0.987 ± 0.01, P value = 0.134) and patient-level diagnoses (average AUC: 97.40% vs. 97.96%, P value = 0.117). Our model was close to human pathologists (average AUC: 97.17% vs. 96.91%). Conclusions: We reported that semi-supervised learning can achieve excellent performance in a multi-center study. We thus suggested that semi-supervised learning has great potential for building artificial intelligence (AI) platforms, which will dramatically reduce the cost of labeled data and greatly facilitate the development and application of AI in medical sciences.
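The abstract does not say how patch predictions are combined into a patient-level call. Purely as a generic illustration, and not as the authors' rule, the sketch below flags a slide as positive when enough positive patches sit next to each other on the patch grid. The function names, the 4-neighbor connectivity, and the `threshold` and `min_cluster` values are all assumptions made for this example.

```python
from collections import deque

def largest_positive_cluster(grid, threshold=0.5):
    """Size of the largest 4-connected group of patches with probability >= threshold.
    `grid` is a list of rows of per-patch cancer probabilities."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    best = 0
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or grid[r][c] < threshold:
                continue
            # Breadth-first search over neighboring positive patches.
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            best = max(best, size)
    return best

def slide_is_positive(grid, threshold=0.5, min_cluster=2):
    """Hypothetical rule: positive if some cluster has at least `min_cluster` patches."""
    return largest_positive_cluster(grid, threshold) >= min_cluster
```

Requiring a cluster rather than a single positive patch is one common way to suppress isolated false positives; whether the paper does this, and with what thresholds, would need to be checked against the full text.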

Segmentation of Polyp Instruments using UNet based deep learning model

Nordic Machine Intelligence ◽

10.5617/nmi.9145 ◽

2021 ◽

Vol 1 (1) ◽

pp. 44-46

Author(s):

Ashar Mirza ◽

Rishav Kumar Rajak

Keyword(s):

Deep Learning ◽

Image Data ◽

Learning Model ◽

Learning Method ◽

Data Set ◽

Test Dataset ◽

Segmentation Task ◽

Deep Learning Model

In this paper, we present a UNet architecture-based deep learning method used to segment polyps and instruments in the image data set provided for the MedAI Challenge 2021. For the polyp segmentation task, we developed a UNet-based algorithm for segmenting polyps in images taken from endoscopies. The main focus of this task is to achieve high segmentation metrics on the supplied test dataset. Similarly, for the instrument segmentation task, we developed UNet-based algorithms for segmenting instruments present in colonoscopy videos.

A strategy learning model for autonomous agents based on classification

International Journal of Applied Mathematics and Computer Science ◽

10.1515/amcs-2015-0035 ◽

2015 ◽

Vol 25 (3) ◽

pp. 471-482 ◽

Cited By ~ 7

Author(s):

Bartłomiej Śnieżyński

Keyword(s):

Reinforcement Learning ◽

Supervised Learning ◽

Learning Process ◽

Autonomous Agents ◽

Good Alternative ◽

Learning Model ◽

Learning Method ◽

Complex Environments ◽

Agent Based ◽

Proposed Model

Abstract In this paper we propose a strategy learning model for autonomous agents based on classification. In the literature, the most commonly used learning method in agent-based systems is reinforcement learning. In our opinion, classification can be considered a good alternative. This type of supervised learning can be used to generate a classifier that allows the agent to choose an appropriate action for execution. Experimental results show that this model can be successfully applied for strategy generation even if rewards are delayed. We compare the efficiency of the proposed model and reinforcement learning using the farmer-pest domain and configurations of various complexity. In complex environments, supervised learning can improve the performance of agents much faster than reinforcement learning. If an appropriate knowledge representation is used, the learned knowledge may be analyzed by humans, which allows tracking the learning process.

Classification of Clinically Significant Prostate Cancer on Multi-Parametric MRI: A Validation Study Comparing Deep Learning and Radiomics

Cancers ◽

10.3390/cancers14010012 ◽

2021 ◽

Vol 14 (1) ◽

pp. 12

Author(s):

Jose M. Castillo T. ◽

Muhammad Arif ◽

Martijn P. A. Starmans ◽

Wiro J. Niessen ◽

Chris H. Bangma ◽

...

Keyword(s):

Prostate Cancer ◽

Deep Learning ◽

Characteristic Curve ◽

Model Development ◽

Learning Model ◽

Multiparametric Mri ◽

Data Sets ◽

Data Set ◽

Test Sets ◽

Deep Learning Model

The computer-aided analysis of prostate multiparametric MRI (mpMRI) could improve significant-prostate-cancer (PCa) detection. Various deep-learning- and radiomics-based methods for significant-PCa segmentation or classification have been reported in the literature. To assess the generalizability of the performance of these methods, using various external data sets is crucial. While deep-learning and radiomics approaches have been compared on the same data set from one center, a comparison of the performance of both approaches on various data sets from different centers and different scanners is lacking. The goal of this study was to compare the performance of a deep-learning model with the performance of a radiomics model for significant-PCa diagnosis in various patient cohorts. We included the data from two consecutive patient cohorts from our own center (n = 371 patients), and two external sets, of which one was a publicly available patient cohort (n = 195 patients) and the other contained data from patients from two hospitals (n = 79 patients). Multiparametric MRI (mpMRI) scans, radiologist tumor delineations and pathology reports were collected for all patients. During training, one of our patient cohorts (n = 271 patients) was used for both the deep-learning- and radiomics-model development, and the three remaining cohorts (n = 374 patients) were kept as unseen test sets. The performance of the models was assessed in terms of their area under the receiver-operating-characteristic curve (AUC). Whereas the internal cross-validation showed a higher AUC for the deep-learning approach, the radiomics model obtained AUCs of 0.88, 0.91 and 0.65 on the independent test sets compared to AUCs of 0.70, 0.73 and 0.44 for the deep-learning model.
Our radiomics model that was based on delineated regions resulted in a more accurate tool for significant-PCa classification in the three unseen test sets when compared to a fully automated deep-learning model.
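Several abstracts in this listing report results as AUC. As a reminder of what that number measures, the sketch below computes it from its rank-based definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the toy data are illustrative and not taken from any of the papers.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.
    `labels` holds 0/1 ground truth; `scores` holds the model outputs."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking, which is why a value such as 0.44 on an external test set indicates a model that does not generalize to that cohort.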

Semi supervised inspection algorithm of automatic packaging curve based on deep learning

Journal of Computational Methods in Sciences and Engineering ◽

10.3233/jcm-215690 ◽

2021 ◽

pp. 1-10

Author(s):

Yong He

Keyword(s):

Deep Learning ◽

Supervised Learning ◽

Optimization Algorithm ◽

Posterior Probability ◽

Detection Method ◽

Detection System ◽

Experimental Results ◽

Detection Accuracy ◽

Data Set ◽

Packaging Process

The current automatic packaging process is complex, requires a high level of professional knowledge, generalizes poorly, and is difficult to apply to multi-objective tasks and complex backgrounds. For this reason, optimization algorithms for automatic packaging have received wide attention. However, traditional automatic packaging detection has low accuracy and poor practicability. Therefore, a detection method for the automatic packaging curve based on deep learning and semi-supervised learning is proposed. Deep learning is used to extract features, and posterior probability is used to classify unlabeled data. The KDD CUP99 data set was used to verify the accuracy of the algorithm. Experimental results show that this method can effectively improve the performance of a semi-supervised detection system for the automatic packaging curve.
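The abstract says posterior probabilities are used to classify unlabeled data but gives no further detail. One common way to do this in semi-supervised learning is confidence-thresholded pseudo-labeling, sketched below as an assumption about the general idea rather than a description of the paper's algorithm. The function name `pseudo_label` and the `threshold` value are hypothetical.

```python
import numpy as np

def pseudo_label(posteriors, threshold=0.9):
    """Assign a class to each unlabeled sample whose top posterior is confident enough.

    posteriors: array of shape (n_samples, n_classes), rows summing to 1.
    Returns the indices of the accepted samples and their assigned classes.
    """
    posteriors = np.asarray(posteriors, dtype=float)
    confidence = posteriors.max(axis=1)      # highest class probability per sample
    labels = posteriors.argmax(axis=1)       # class achieving that probability
    keep = np.flatnonzero(confidence >= threshold)
    return keep, labels[keep]

# Toy posteriors for three unlabeled samples and two classes.
keep, labels = pseudo_label([[0.95, 0.05], [0.60, 0.40], [0.08, 0.92]])
print(keep.tolist(), labels.tolist())  # [0, 2] [0, 1]
```

The accepted samples can then be added to the labeled pool for the next training round, while low-confidence samples such as the second one stay unlabeled.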

Real-time automated diagnosis of colorectal cancer invasion depth using a deep learning model with multimodal data (with video)

Gastrointestinal Endoscopy ◽

10.1016/j.gie.2021.11.049 ◽

2021 ◽

Author(s):

Zihua Lu ◽

Youming Xu ◽

Liwen Yao ◽

Wei Zhou ◽

Wei Gong ◽

...

Keyword(s):

Colorectal Cancer ◽

Deep Learning ◽

Real Time ◽

Learning Model ◽

Cancer Invasion ◽

Automated Diagnosis ◽

Invasion Depth ◽

Multimodal Data ◽

Deep Learning Model

Simulation and Recognition of Concrete Lining Infiltration Degree via an Indoor Experiment

Geofluids ◽

10.1155/2020/8873315 ◽

2020 ◽

Vol 2020 ◽

pp. 1-11

Author(s):

Dongsheng Wang ◽

Jun Feng ◽

Xinpeng Zhao ◽

Yeping Bai ◽

Yujie Wang ◽

...

Keyword(s):

Deep Learning ◽

Cement Mortar ◽

Recognition Accuracy ◽

Image Data ◽

Concrete Lining ◽

Learning Method ◽

Recognition Method ◽

Data Set ◽

Recognition Model ◽

High Recognition Accuracy

Recognizing the degree of infiltration of a tunnel lining is difficult. To solve this problem, we propose a recognition method based on a deep convolutional neural network. We carry out laboratory tests, prepare cement mortar specimens with different saturation levels to simulate different degrees of infiltration of tunnel concrete linings, and establish an infrared thermal image data set covering these degrees of infiltration. Then, a Faster R-CNN+ResNet101 network is trained on the data set to establish a recognition model. The experiments show that the recognition model can select cement mortar specimens with different degrees of infiltration using an accurately minimized rectangular outer frame. These results show that the classification recognition model for tunnel concrete lining infiltration established by the indoor experimental method has high recognition accuracy.

Ground-truth uncertainty-aware metrics for machine learning applications on seismic image interpretation: Application to faults and horizon extraction

The Leading Edge ◽

10.1190/tle39100734.1 ◽

2020 ◽

Vol 39 (10) ◽

pp. 734-741

Author(s):

Sébastien Guillon ◽

Frédéric Joncour ◽

Pierre-Emmanuel Barrallon ◽

Laurent Castanié

Keyword(s):

Deep Learning ◽

Image Interpretation ◽

Ground Truth ◽

Learning Model ◽

Seismic Interpretation ◽

Data Set ◽

Seismic Image ◽

Machine Learning Applications ◽

Deep Learning Model

We propose new metrics to measure the performance of a deep learning model applied to seismic interpretation tasks such as fault and horizon extraction. Faults and horizons are thin geologic boundaries (1 pixel thick on the image) for which a small prediction error could lead to inappropriately large variations in common metrics (precision, recall, and intersection over union). Through two examples, we show how classical metrics could fail to indicate the true quality of fault or horizon extraction. Measuring the accuracy of reconstruction of thin objects or boundaries requires introducing a tolerance distance between ground truth and prediction images to manage the uncertainties inherent in their delineation. We therefore adapt our metrics by introducing a tolerance function and illustrate their ability to manage uncertainties in seismic interpretation. We compare classical and new metrics through different examples and demonstrate the robustness of our metrics. Finally, we show on a 3D West African data set how our metrics are used to tune an optimal deep learning model.
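The abstract describes adding a tolerance distance to precision and recall but does not define the tolerance function. The sketch below uses the simplest possible choice, a hard cutoff in Euclidean pixel distance, to show why a tolerance matters for 1-pixel-thick objects. The function name, the binary cutoff, and the toy images are assumptions for illustration, not the paper's metric.

```python
import numpy as np

def tolerant_precision_recall(truth, pred, tol=1.0):
    """Precision and recall where a pixel matches if the other image has a
    positive pixel within `tol` pixels (Euclidean distance).
    `truth` and `pred` are 2D boolean or 0/1 arrays of the same shape."""
    truth_pts = np.argwhere(truth)
    pred_pts = np.argwhere(pred)
    if len(truth_pts) == 0 or len(pred_pts) == 0:
        return 0.0, 0.0
    # Pairwise distances, shape (n_pred, n_truth); brute force is fine for small images.
    dist = np.linalg.norm(pred_pts[:, None, :] - truth_pts[None, :, :], axis=2)
    precision = float(np.mean(dist.min(axis=1) <= tol))  # predicted pixels near truth
    recall = float(np.mean(dist.min(axis=0) <= tol))     # truth pixels near a prediction
    return precision, recall

# A horizontal "horizon" and a prediction shifted down by one pixel.
truth = np.zeros((5, 5), dtype=bool)
pred = np.zeros((5, 5), dtype=bool)
truth[2, :] = True
pred[3, :] = True
print(tolerant_precision_recall(truth, pred, tol=0.0))  # (0.0, 0.0)
print(tolerant_precision_recall(truth, pred, tol=1.0))  # (1.0, 1.0)
```

With zero tolerance the one-pixel shift scores as a complete miss, even though the prediction is visually almost perfect; allowing a one-pixel tolerance recovers full precision and recall.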
