Brejnebøl MW, Hansen P, Axelsen M, Lundemann M, Bachmann R, et al. (2021). Diagnostic test accuracy, inter- and intrarater reliability of a knee osteoarthritis machine learning decision-aid algorithm in a near-clinical setting. Vol 29, pp. S337-S338.

Fleuren LM, Klausch TLT, Zwager CL, Schoonmade LJ, Guo T, et al. (2020). Vol 46(3), pp. 383-400.

Crowley RJ, Tan YJ, Ioannidis JPA (2020). Vol 27(7), pp. 1092-1101.

Abstract

Objective: Machine learning (ML) diagnostic tools have significant potential to improve health care. However, methodological pitfalls may affect the diagnostic test accuracy studies used to appraise such tools. We aimed to evaluate the prevalence and reporting of design characteristics within the literature, and to empirically assess whether design features are associated with different estimates of diagnostic accuracy.

Materials and Methods: We systematically retrieved 2 × 2 tables (n = 281) describing the performance of ML diagnostic tools, derived from 114 publications in 38 meta-analyses, from PubMed. Extracted data included test performance, sample sizes, and design features. A mixed-effects metaregression was run to quantify the association between design features and diagnostic accuracy.

Results: Participant ethnicity and blinding in test interpretation were unreported in 90% and 60% of studies, respectively. Reporting was occasionally lacking even for rudimentary characteristics such as study design (28% unreported). Internal validation without appropriate safeguards was used in 44% of studies. Several design features were associated with larger estimates of accuracy, including an unreported study design (relative diagnostic odds ratio [RDOR], 2.11; 95% confidence interval [CI], 1.43-3.1), a case-control design (RDOR, 1.27; 95% CI, 0.97-1.66), and recruiting participants for the index test (RDOR, 1.67; 95% CI, 1.08-2.59).

Discussion: Significant underreporting of experimental details was present. Study design features may affect estimates of diagnostic performance in the ML diagnostic test accuracy literature.

Conclusions: The present study identifies pitfalls that threaten the validity, generalizability, and clinical value of ML diagnostic tools and provides recommendations for improvement.
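The Methods above describe extracting 2 × 2 tables and summarizing each study's diagnostic accuracy. As a minimal illustration (not the authors' code; the function name and example counts are invented for this sketch), the standard metrics and the diagnostic odds ratio underlying the RDOR comparisons can be computed from a single 2 × 2 table like this:

```python
import math

def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, and diagnostic odds ratio (DOR) with an
    approximate 95% CI, from a 2x2 table of index-test results vs. the
    reference standard (tp/fp/fn/tn = true/false positives/negatives)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # DOR = (TP * TN) / (FP * FN); a common summary effect in diagnostic
    # test accuracy meta-analysis.
    dor = (tp * tn) / (fp * fn)
    # Wald-type 95% CI on the log scale, using the usual standard error
    # of a log odds ratio.
    se_log = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    ci = (math.exp(math.log(dor) - 1.96 * se_log),
          math.exp(math.log(dor) + 1.96 * se_log))
    return sensitivity, specificity, dor, ci

# Hypothetical counts, for illustration only:
sens, spec, dor, ci = diagnostic_metrics(tp=90, fp=10, fn=10, tn=90)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}, DOR={dor:.1f}")
```

A relative DOR (RDOR), as reported in the Results, then compares the pooled DOR between studies with and without a given design feature; the paper estimates this with a mixed-effects metaregression rather than the single-table arithmetic shown here.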


Kocks JWH, Andringa HJH, van Heijst E, Louis R, Ojanguren Arranz I, et al.
