Diagnostic Accuracy Studies in Radiology: How to Recognize and Address Potential Sources of Bias

2021, Vol 2021, pp. 1-10
Author(s): Athanasios Pavlou, Robert M. Kurtz, Jae W. Song

Accuracy is an important parameter of a diagnostic test. Studies that attempt to determine a test’s accuracy can suffer from various forms of bias. As radiology is a diagnostic specialty, many radiologists may design a diagnostic accuracy study or review one to understand how it may apply to their practice. Radiologists also frequently serve as consultants to other physicians regarding the selection of the most appropriate diagnostic exams. In these roles, understanding how to critically appraise the literature is important for all radiologists. The purpose of this review is to provide a framework for evaluating potential sources of study design biases that are found in diagnostic accuracy studies and to explain their impact on sensitivity and specificity estimates. To help the reader understand these biases, we also present examples from the radiology literature.
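To make the two headline measures concrete, the sketch below shows how sensitivity and specificity are computed from a 2x2 table of index-test results against a reference standard; the counts are hypothetical and are not taken from the review.

```python
# Illustrative sketch: sensitivity and specificity from a 2x2 table of
# index-test results against a reference standard. Counts are hypothetical.

def sensitivity_specificity(tp: int, fp: int, fn: int, tn: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from 2x2 table counts."""
    sensitivity = tp / (tp + fn)  # proportion of reference-positive cases the test detects
    specificity = tn / (tn + fp)  # proportion of reference-negative cases the test clears
    return sensitivity, specificity

if __name__ == "__main__":
    sens, spec = sensitivity_specificity(tp=90, fp=20, fn=10, tn=180)
    print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")  # 0.90, 0.90
```

Biases discussed in the review (for example spectrum or verification bias) act by distorting which patients end up in each cell of this table, which is why they shift the resulting sensitivity and specificity estimates.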

BMJ Open, 2018, Vol 8 (7), pp. e020627
Author(s): Iosief Abraha, Diego Serraino, Alessandro Montedori, Mario Fusco, Gianni Giovannini, ...

Objectives: To assess the accuracy of International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes in identifying patients diagnosed with incident carcinoma in situ and invasive breast cancer in three Italian administrative databases. Design: A diagnostic accuracy study comparing ICD-9-CM codes for carcinoma in situ (233.0) and for invasive breast cancer (174.x) with the medical chart as a reference standard. Case definition: (1) presence of a primary nodular lesion in the breast and (2) cytological or histological documentation of cancer from a primary or metastatic site. Setting: Administrative databases from the Umbria Region, Azienda Sanitaria Locale (ASL) Napoli 3 Sud (NA) and the Friuli Venezia Giulia (FVG) Region. Participants: Women with breast carcinoma in situ (n=246) or invasive breast cancer (n=384) diagnosed (in primary position) between 2012 and 2014. Outcome measures: Sensitivity and specificity for codes 233.0 and 174.x. Results: For invasive breast cancer, the sensitivities were 98% (95% CI 93% to 99%) for Umbria, 96% (95% CI 91% to 99%) for NA and 100% (95% CI 97% to 100%) for FVG. Specificities were 90% (95% CI 82% to 95%) for Umbria, 91% (95% CI 83% to 96%) for NA and 91% (95% CI 84% to 96%) for FVG. For carcinoma in situ, the sensitivities were 100% (95% CI 93% to 100%) for Umbria, 100% (95% CI 95% to 100%) for NA and 100% (95% CI 96% to 100%) for FVG. Specificities were 98% (95% CI 93% to 100%) for Umbria, 86% (95% CI 78% to 92%) for NA and 90% (95% CI 82% to 95%) for FVG. Conclusions: Administrative healthcare databases from Umbria, NA and FVG are accurate in identifying hospitalised new cases of carcinoma of the breast. The proposed case definition is a powerful tool for performing research on large populations of newly diagnosed patients with breast cancer.
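The 95% confidence intervals reported above depend only on the observed counts. As an illustration (not the study's actual computation, and using hypothetical counts rather than the study's raw data), an exact Clopper-Pearson interval for a sensitivity or specificity estimate can be obtained as follows.

```python
# A minimal sketch of an exact (Clopper-Pearson) 95% confidence interval for a
# proportion such as sensitivity or specificity. Counts are hypothetical.

from scipy.stats import beta

def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Exact binomial CI for a proportion k/n (e.g. sensitivity = TP / (TP + FN))."""
    lower = 0.0 if k == 0 else beta.ppf(alpha / 2, k, n - k + 1)
    upper = 1.0 if k == n else beta.ppf(1 - alpha / 2, k + 1, n - k)
    return lower, upper

if __name__ == "__main__":
    # Hypothetical example: 96 of 98 reference-positive cases flagged by the code
    lo, hi = clopper_pearson(k=96, n=98)
    print(f"sensitivity = {96/98:.1%} (95% CI {lo:.1%} to {hi:.1%})")
```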


2021
Author(s): Jérémie F. Cohen, Daniël A. Korevaar, Douglas G. Altman, David E. Bruns, Constantine A. Gatsonis, ...

Diagnostic accuracy studies are, like other clinical studies, at risk of bias due to shortcomings in design and conduct, and the results of a diagnostic accuracy study may not apply to other patient groups and settings. Readers of study reports need to be informed about study design and conduct in sufficient detail to judge the trustworthiness and applicability of the study findings. The STARD statement (Standards for Reporting of Diagnostic Accuracy Studies) was developed to improve the completeness and transparency of reports of diagnostic accuracy studies. STARD contains a list of essential items that can be used as a checklist by authors, reviewers and other readers to ensure that a report of a diagnostic accuracy study contains the necessary information. STARD was recently updated. All updated STARD materials, including the checklist, are available at http://www.equator-network.org/reporting-guidelines/stard. Here, we present the STARD 2015 explanation and elaboration document. Through commented examples of appropriate reporting, we clarify the rationale for each of the 30 items on the STARD 2015 checklist and describe what is expected from authors in developing sufficiently informative study reports. This article is a reprint, with Russian translation, of the original, which is available here: Cohen JF, Korevaar DA, Altman DG, et al. STARD 2015 guidelines for reporting diagnostic accuracy studies: explanation and elaboration. BMJ Open 2016;6:e012799. doi: 10.1136/bmjopen-2016-012799


2020, Vol 29 (10), pp. 2958-2971
Author(s): Maria Stark, Antonia Zapf

Introduction: In a confirmatory diagnostic accuracy study, sensitivity and specificity are considered co-primary endpoints. For the sample size calculation, the prevalence in the target population must be taken into account to obtain a representative sample. This raises a general problem: with a very low or very high prevalence, the study may be overpowered in one subpopulation. A further issue is the correct pre-specification of the true prevalence; with an incorrect assumption about the prevalence, the sample size will be over- or underestimated. Methods: To obtain the desired power independent of the prevalence, a method for an optimal sample size calculation is proposed for comparing an experimental diagnostic test against a pre-specified minimum sensitivity and specificity. To address the problem of an incorrectly pre-specified prevalence, a blinded one-time re-estimation design and a blinded repeated re-estimation design, both re-estimating the sample size from the observed prevalence, are evaluated in a simulation study. Both designs are compared with a fixed design and with each other. Results: The type I error rates of both blinded re-estimation designs are not inflated. Their empirical overall power equals the desired theoretical power, and both designs provide unbiased estimates of the prevalence. The repeated re-estimation design shows no advantage over the one-time re-estimation design in terms of the mean squared error of the re-estimated prevalence or sample size. The appropriate size of the internal pilot study in the one-time re-estimation design is 50% of the initially calculated sample size. Conclusions: A one-time re-estimation design of the prevalence, based on the optimal sample size calculation, is recommended in single-arm diagnostic accuracy studies.
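As a rough illustration of how prevalence enters such a calculation, the sketch below sizes each subpopulation for a one-sided test against a pre-specified minimum sensitivity or specificity and then scales by the assumed prevalence. The formula is a generic single-proportion calculation, not necessarily the authors' exact method, and all numerical values are illustrative.

```python
# Hedged sketch of a prevalence-scaled sample size calculation for co-primary
# sensitivity and specificity endpoints. Generic normal-approximation formula;
# parameter values are illustrative, not taken from the paper.

import math
from scipy.stats import norm

def n_one_proportion(p0: float, p1: float, alpha: float, beta: float) -> int:
    """Sample size for a one-sided test of H0: p <= p0 against the alternative p = p1."""
    z_a, z_b = norm.ppf(1 - alpha), norm.ppf(1 - beta)
    n = ((z_a * math.sqrt(p0 * (1 - p0)) + z_b * math.sqrt(p1 * (1 - p1))) / (p1 - p0)) ** 2
    return math.ceil(n)

def total_sample_size(min_se, exp_se, min_sp, exp_sp, prevalence, alpha=0.025, beta=0.2):
    """Scale the per-endpoint sizes by prevalence so both subpopulations are large enough."""
    n_diseased = n_one_proportion(min_se, exp_se, alpha, beta)   # needed reference-positive cases
    n_healthy = n_one_proportion(min_sp, exp_sp, alpha, beta)    # needed reference-negative cases
    return max(math.ceil(n_diseased / prevalence), math.ceil(n_healthy / (1 - prevalence)))

if __name__ == "__main__":
    # Minimum acceptable Se/Sp of 0.80, expected Se/Sp of 0.90, assumed prevalence 0.30
    print(total_sample_size(0.80, 0.90, 0.80, 0.90, prevalence=0.30))
```

The dependence of the result on the prevalence argument is exactly why a mis-specified prevalence leads to an over- or undersized study, and why the blinded re-estimation designs update this value from the accruing data.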


BMJ Open, 2013, Vol 3 (4), pp. e002394
Author(s): David McShefferty, William M Whitmer, Iain R C Swan, Michael A Akeroyd

BMJ, 2021, pp. n423
Author(s): Maya Moshe, Anna Daunt, Barnaby Flower, Bryony Simmons, Jonathan C Brown, ...

Abstract. Objective: To evaluate the performance of new lateral flow immunoassays (LFIAs) suitable for use in a national coronavirus disease 2019 (covid-19) seroprevalence programme (real time assessment of community transmission 2; React 2). Design: Diagnostic accuracy study. Setting: Laboratory analyses were performed in the United Kingdom at Imperial College, London, and university facilities in London. Research clinics for finger prick sampling were run in two affiliated NHS trusts. Participants: Sensitivity analyses were performed on sera stored from 320 previous participants in the React 2 programme with confirmed previous severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. Specificity analyses were performed on 1000 prepandemic serum samples. 100 new participants with confirmed previous SARS-CoV-2 infection attended study clinics for finger prick testing. Interventions: Laboratory sensitivity and specificity analyses were performed for seven LFIAs on a minimum of 200 serum samples from participants with confirmed SARS-CoV-2 infection and 500 prepandemic serum samples, respectively. Three LFIAs were found to have a laboratory sensitivity superior to the finger prick sensitivity of the LFIA currently used in React 2 seroprevalence studies (84%). These LFIAs were then further evaluated through finger prick testing on participants with confirmed previous SARS-CoV-2 infection: two LFIAs (Surescreen, Panbio) were evaluated in clinics in June-July 2020 and the third LFIA (AbC-19) in September 2020. A spike protein enzyme linked immunoassay and a hybrid double antigen binding assay were used as laboratory reference standards. Main outcome measures: The accuracy of LFIAs in detecting immunoglobulin G (IgG) antibodies to SARS-CoV-2 compared with the two reference standards. Results: The sensitivity and specificity of the seven new LFIAs analysed using sera varied from 69% to 100% and from 98.6% to 100%, respectively (compared with the two reference standards). Sensitivity on finger prick testing was 77% (95% confidence interval 61.4% to 88.2%) for Panbio, 86% (72.7% to 94.8%) for Surescreen, and 69% (53.8% to 81.3%) for AbC-19 compared with the reference standards. For AbC-19, sensitivity on matched clinical samples was significantly higher with serum (92%, 80.0% to 97.7%) than with finger prick testing (P=0.01). Antibody titres varied considerably among cohorts. The numbers of positive samples identified by finger prick in the lowest antibody titre quarter varied among LFIAs. Conclusions: One new LFIA was identified with clinical performance suitable for potential inclusion in seroprevalence studies. However, none of the LFIAs tested had clearly superior performance to the LFIA currently used in React 2 seroprevalence surveys, and none showed sufficient sensitivity and specificity to be considered for routine clinical use.
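The abstract does not state which paired test produced the P=0.01 serum versus finger prick comparison; one common choice for matched results on the same participants is an exact McNemar test on the discordant pairs, sketched below with purely hypothetical counts.

```python
# Illustrative only: an exact McNemar test for paired positive/negative results
# (e.g. serum vs finger prick on the same participants). This is not necessarily
# the analysis used in the study, and the counts are hypothetical.

from scipy.stats import binomtest

def exact_mcnemar(serum_pos_finger_neg: int, serum_neg_finger_pos: int) -> float:
    """Exact McNemar p value: binomial test on the discordant pairs with p = 0.5."""
    b, c = serum_pos_finger_neg, serum_neg_finger_pos
    return binomtest(b, b + c, 0.5).pvalue

if __name__ == "__main__":
    # Hypothetical: 12 participants positive on serum but negative on finger prick,
    # 2 positive on finger prick but negative on serum.
    print(f"P = {exact_mcnemar(12, 2):.3f}")
```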

