Artificial Intelligence for automated Gleason Grading in prostate cancer biopsies

2019 ◽  
Vol 18 (1) ◽  
pp. e724
Author(s):  
F-E. Marginean ◽  
A. Krzyzanowska ◽  
I. Arvidsson ◽  
A. Simoulis ◽  
E. Sjöblom ◽  
...  
2021 ◽  
Vol 1 (1) ◽  
Author(s):  
Ellery Wulczyn ◽  
Kunal Nagpal ◽  
Matthew Symonds ◽  
Melissa Moran ◽  
Markus Plass ◽  
...  

Abstract. Background: Gleason grading of prostate cancer is an important prognostic factor, but suffers from poor reproducibility, particularly among non-subspecialist pathologists. Although artificial intelligence (A.I.) tools have demonstrated Gleason grading on par with expert pathologists, it remains an open question whether and to what extent A.I. grading translates to better prognostication. Methods: In this study, we developed a system to predict prostate cancer-specific mortality via A.I.-based Gleason grading and subsequently evaluated its ability to risk-stratify patients on an independent retrospective cohort of 2807 prostatectomy cases from a single European center with 5–25 years of follow-up (median: 13, interquartile range 9–17). Results: Here, we show that the A.I.’s risk scores produced a C-index of 0.84 (95% CI 0.80–0.87) for prostate cancer-specific mortality. Upon discretizing these risk scores into risk groups analogous to pathologist Grade Groups (GG), the A.I. has a C-index of 0.82 (95% CI 0.78–0.85). On the subset of cases with a GG provided in the original pathology report (n = 1517), the A.I.’s C-indices are 0.87 and 0.85 for continuous and discrete grading, respectively, compared to 0.79 (95% CI 0.71–0.86) for GG obtained from the reports. These represent improvements of 0.08 (95% CI 0.01–0.15) and 0.07 (95% CI 0.00–0.14), respectively. Conclusions: Our results suggest that A.I.-based Gleason grading can lead to effective risk stratification, and warrants further evaluation for improving disease management.
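
For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index on toy data. It is not the authors' evaluation code: the function name `concordance_index`, the input layout, and the toy values are assumptions made for illustration, and the sketch ignores refinements such as tied event times and competing risks.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable patient pairs ranked correctly.

    times  -- follow-up time per patient
    events -- 1 if the event (e.g. death from the disease) was observed, 0 if censored
    risks  -- predicted risk score per patient (higher = worse prognosis)
    All three names are hypothetical; they are not taken from the study.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had the event strictly
            # before patient j's last known follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # higher risk, earlier event: correct order
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort of four patients (invented values).
toy_times = [2, 5, 7, 10]
toy_events = [1, 0, 1, 0]
perfect = concordance_index(toy_times, toy_events, [0.9, 0.4, 0.6, 0.1])
one_swap = concordance_index(toy_times, toy_events, [0.9, 0.4, 0.1, 0.6])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the 0.79 to 0.87 values in the abstract should be read.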


Author(s):  
Kimmo Kartasalo ◽  
Wouter Bulten ◽  
Brett Delahunt ◽  
Po-Hsuan Cameron Chen ◽  
Hans Pinckaers ◽  
...  

2019 ◽  
Vol 18 (8) ◽  
pp. e3082
Author(s):  
A. Krzyzanowska ◽  
F.E. Marginean ◽  
I. Arvidsson ◽  
A. Simoulis ◽  
N.C. Overgaard ◽  
...  

2022 ◽  
Author(s):  
Wouter Bulten ◽  
Kimmo Kartasalo ◽  
Po-Hsuan Cameron Chen ◽  
Peter Ström ◽  
Hans Pinckaers ◽  
...  

Abstract. Artificial intelligence (AI) has shown promise for diagnosing prostate cancer in biopsies. However, results have been limited to individual studies, lacking validation in multinational settings. Competitions have been shown to be accelerators for medical imaging innovations, but their impact is hindered by lack of reproducibility and independent validation. With this in mind, we organized the PANDA challenge (the largest histopathology competition to date, joined by 1,290 developers) to catalyze development of reproducible AI algorithms for Gleason grading using 10,616 digitized prostate biopsies. We validated that a diverse set of submitted algorithms reached pathologist-level performance on independent cross-continental cohorts, fully blinded to the algorithm developers. On United States and European external validation sets, the algorithms achieved agreements of 0.862 (quadratically weighted κ, 95% confidence interval (CI), 0.840–0.884) and 0.868 (95% CI, 0.835–0.900) with expert uropathologists. Successful generalization across different patient populations, laboratories and reference standards, achieved by a variety of algorithmic approaches, warrants evaluating AI-based Gleason grading in prospective clinical trials.
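
The agreement metric quoted above, quadratically weighted κ, can be sketched as follows. This is an illustrative implementation under stated assumptions, not the challenge's scoring code: the function name `quadratic_weighted_kappa`, the encoding of grades as integers from 0 to `num_classes - 1`, and the toy ratings are all hypothetical.

```python
def quadratic_weighted_kappa(rater_a, rater_b, num_classes):
    """Cohen's kappa with quadratic weights for ordinal labels 0..num_classes-1."""
    n = len(rater_a)
    # Observed confusion matrix between the two raters.
    observed = [[0] * num_classes for _ in range(num_classes)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1
    # Marginal label counts for each rater.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(num_classes))
              for j in range(num_classes)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Disagreement penalty grows with the squared distance between grades.
            weight = (i - j) ** 2 / (num_classes - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count if the two raters were statistically independent.
            weighted_expected += weight * hist_a[i] * hist_b[j] / n
    return 1.0 - weighted_observed / weighted_expected


# Toy ratings (invented): an algorithm versus a reference pathologist.
identical = quadratic_weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3], 4)
partial = quadratic_weighted_kappa([0, 1, 2], [0, 1, 1], 3)
```

Because the weight is quadratic in the grade difference, confusing adjacent grades is penalized far less than confusing distant ones, which suits an ordinal scale such as grade groups.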


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Renata Zelic ◽  
Francesca Giunchi ◽  
Luca Lianas ◽  
Cecilia Mascia ◽  
Gianluigi Zanetti ◽  
...  

Abstract. Virtual microscopy (VM) holds promise to reduce subjectivity as well as intra- and inter-observer variability for the histopathological evaluation of prostate cancer. We evaluated (i) the repeatability (intra-observer agreement) and reproducibility (inter-observer agreement) of the 2014 Gleason grading system and other selected features using standard light microscopy (LM) and an internally developed VM system, and (ii) the interchangeability of LM and VM. Two uro-pathologists reviewed 413 cores from 60 Swedish men diagnosed with non-metastatic prostate cancer in 1998–2014. Reviewer 1 performed two reviews using both LM and VM. Reviewer 2 performed one review using both methods. The intra- and inter-observer agreement within and between LM and VM were assessed using Cohen’s kappa and Bland and Altman’s limits of agreement. We found good repeatability and reproducibility for both LM and VM, as well as interchangeability between LM and VM, for primary and secondary Gleason pattern, Gleason Grade Groups, poorly formed glands, cribriform pattern and comedonecrosis, but not for the percentage of Gleason pattern 4. Our findings confirm the non-inferiority of VM compared to LM. The repeatability and reproducibility of the percentage of Gleason pattern 4 were poor regardless of the method used, warranting further investigation and improvement before it is used in clinical practice.
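
As a rough illustration of the Bland and Altman limits of agreement mentioned above, the sketch below computes the mean difference between two measurement methods and the conventional interval of 1.96 standard deviations around it. The function name `bland_altman_limits` and the toy percentages are assumptions for illustration; they are not data or code from the study.

```python
from statistics import mean, stdev


def bland_altman_limits(method_a, method_b):
    """Return (lower limit, bias, upper limit) for paired measurements.

    bias   -- mean of the paired differences (method_a - method_b)
    limits -- bias -/+ 1.96 * sample standard deviation of the differences
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias - spread, bias, bias + spread


# Invented example: percentage of pattern 4 read on the same cores
# with two hypothetical methods.
lm_reads = [10, 20, 30, 40]
vm_reads = [12, 18, 33, 37]
lower, bias, upper = bland_altman_limits(lm_reads, vm_reads)
```

Narrow limits centered near zero indicate that the two methods can be used interchangeably for that measurement; wide limits indicate that individual readings can differ substantially even when the average bias is small.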

