Understanding Sources of Bias in Diagnostic Accuracy Studies

2013 ◽  
Vol 137 (4) ◽  
pp. 558-565 ◽  
Author(s):  
Robert L. Schmidt ◽  
Rachel E. Factor

Context.—Accuracy is an important feature of any diagnostic test. There has been an increasing awareness of deficiencies in study design that can create bias in estimates of test accuracy. Many pathologists are unaware of these sources of bias. Objective.—To explain the causes and increase awareness of several common types of bias that result from deficiencies in the design of diagnostic accuracy studies. Data Sources.—We cite examples from the literature and provide calculations to illustrate the impact of study design features on estimates of diagnostic accuracy. In a companion article by Schmidt et al in this issue, we use these principles to evaluate diagnostic studies associated with a specific diagnostic test for risk of bias and reporting quality. Conclusions.—There are several sources of bias that are unique to diagnostic accuracy studies. Because pathologists are both consumers and producers of such studies, it is important that they be aware of the risk of bias.

2020 ◽  
Vol 27 (7) ◽  
pp. 1092-1101 ◽  
Author(s):  
Ryan J Crowley ◽  
Yuan Jin Tan ◽  
John P A Ioannidis

Abstract Objective: Machine learning (ML) diagnostic tools have significant potential to improve health care. However, methodological pitfalls may affect diagnostic test accuracy studies used to appraise such tools. We aimed to evaluate the prevalence and reporting of design characteristics within the literature. Further, we sought to empirically assess whether design features may be associated with different estimates of diagnostic accuracy. Materials and Methods: We systematically retrieved 2 × 2 tables (n = 281) describing the performance of ML diagnostic tools, derived from 114 publications in 38 meta-analyses, from PubMed. Data extracted included test performance, sample sizes, and design features. A mixed-effects metaregression was run to quantify the association between design features and diagnostic accuracy. Results: Participant ethnicity and blinding in test interpretation were unreported in 90% and 60% of studies, respectively. Reporting was occasionally lacking for rudimentary characteristics such as study design (28% unreported). Internal validation without appropriate safeguards was used in 44% of studies. Several design features were associated with larger estimates of accuracy, including having unreported (relative diagnostic odds ratio [RDOR], 2.11; 95% confidence interval [CI], 1.43-3.1) or case-control study designs (RDOR, 1.27; 95% CI, 0.97-1.66), and recruiting participants for the index test (RDOR, 1.67; 95% CI, 1.08-2.59). Discussion: Significant underreporting of experimental details was present. Study design features may affect estimates of diagnostic performance in the ML diagnostic test accuracy literature. Conclusions: The present study identifies pitfalls that threaten the validity, generalizability, and clinical value of ML diagnostic tools and provides recommendations for improvement.
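For readers unfamiliar with the quantities named in this abstract, the following is a minimal Python sketch of how a diagnostic odds ratio (DOR) is computed from a single 2 × 2 table and how a crude relative DOR compares two tables. The function names and all counts are hypothetical and are not taken from the study. The study itself estimated RDORs with a mixed-effects metaregression, which this simple ratio does not reproduce.

```python
import math


def diagnostic_odds_ratio(tp, fp, fn, tn):
    """DOR = (TP * TN) / (FP * FN).

    If any cell is zero, 0.5 is added to every cell (a common
    continuity correction) so the ratio stays finite.
    """
    cells = (tp, fp, fn, tn)
    if any(c == 0 for c in cells):
        tp, fp, fn, tn = (c + 0.5 for c in cells)
    return (tp * tn) / (fp * fn)


def dor_confidence_interval(tp, fp, fn, tn, z=1.96):
    """Approximate CI for the DOR, computed on the log scale.

    Uses SE(log DOR) = sqrt(1/TP + 1/FP + 1/FN + 1/TN) and
    assumes all four cells are non-zero.
    """
    log_dor = math.log(diagnostic_odds_ratio(tp, fp, fn, tn))
    se = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    return math.exp(log_dor - z * se), math.exp(log_dor + z * se)


def crude_relative_dor(table_a, table_b):
    """Ratio of two DORs; values above 1 mean table_a shows the larger DOR."""
    return diagnostic_odds_ratio(*table_a) / diagnostic_odds_ratio(*table_b)


# Hypothetical counts in the order (TP, FP, FN, TN).
case_control_table = (90, 10, 10, 90)
cohort_table = (80, 20, 20, 80)

if __name__ == "__main__":
    print(diagnostic_odds_ratio(*case_control_table))
    print(dor_confidence_interval(*case_control_table))
    print(crude_relative_dor(case_control_table, cohort_table))
```

A relative DOR above 1 for a design feature, as reported for several features here, means that studies with that feature tended to report higher accuracy than studies without it.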


2013 ◽  
Vol 137 (4) ◽  
pp. 566-575 ◽  
Author(s):  
Robert L. Schmidt ◽  
Rachel E. Factor ◽  
Benjamin L. Witt ◽  
Lester J. Layfield

Context.—The quality of diagnostic accuracy studies is determined by 2 key factors: risk of bias and comparability. Bias can distort accuracy estimates and poor reporting impairs comparability. While diagnostic accuracy studies for fine-needle aspiration cytology (FNAC) are frequently published, the methodologic issues associated with this body of literature have never been reviewed. Objective.—To assess the quality of design and reporting of diagnostic test accuracy studies in FNAC. Data Sources.—Diagnostic accuracy studies were identified by a Medline (US National Library of Medicine) search. Sixty-four FNAC diagnostic test accuracy studies were randomly selected for structured review with the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) survey. Studies were divided between 2 time periods: 2000-2001 and 2009-2011. Conclusions.—Diagnostic test accuracy studies of FNAC suffer from numerous deficiencies in study design, which negatively affect the reliability of accuracy estimates.


2019 ◽  
Author(s):  
Karoline Lukaschek ◽  
Milena Frank ◽  
Kathrin Halfter ◽  
Antonius Schneider ◽  
Jochen Gensichen

Abstract Background: As primary contacts, general practitioners can play a pivotal role in identifying suicidal behaviour in their patients. A brief instrument could help in identifying vulnerable patients. We summarized the available studies reporting diagnostic accuracy of short screening instruments for suicidal behaviour in primary care or the general population in a narrative synthesis. Methods: The databases MEDLINE, EMBASE, PsycINFO, PSYNDEX, and Cochrane Library were searched in January 2019 without any time constraints. Risk of bias and applicability concerns were assessed using the QUADAS-2 tool. The certainty of evidence was rated via GRADEpro. The authors followed the PRISMA extensions for Diagnostic Test Accuracy Studies. Results: We identified a total of 9 969 studies with our search strategy. After the selection process, six relevant studies fulfilled all criteria and were included. They used the following index tests: Kessler Psychological Distress Scale, Suicidal Ideation Screening Questionnaire, Suicidal Ideation Attributes Scale, Gate question suicide attempt, Gate question suicidal ideation, Feeling suicidal, Wishing you were dead, Thoughts of death, and Patient Health Questionnaire-9, item 9. Sensitivity and specificity varied widely (sensitivity: 26%-100%; specificity: 64%-99%). Risk of bias was rated moderate and concerns regarding applicability acceptable. A required sensitivity of at least 80% and specificity of 50% with a moderate to high GRADE rating was achieved by six of nine index tests. Conclusions: The identified studies were heterogeneous regarding sample size, index test and reference standard. Even though screening for suicidal behaviour in primary care is already recommended by several guidelines, only a few screeners in primary care have been examined regarding their diagnostic accuracy. Although they can assist GPs in their judgement of suicidal behaviour of patients at risk, the final assessment is always based on the clinical judgement of the attending physician. Further diagnostic test accuracy studies of promising short questionnaires are needed. Registration: The study protocol was registered at PROSPERO (ID: CRD42019122173).


2018 ◽  
Vol 146 (6) ◽  
pp. 747-756
Author(s):  
J.M. Hughes ◽  
C. Penney ◽  
S. Boyd ◽  
P. Daley

Abstract Commercial point-of-care (POC) diagnostic tests for Group A Streptococcus, Streptococcus pneumoniae, and influenza virus have large potential diagnostic and financial impact. Many published reports on test performance, often funded by diagnostics companies, are prone to bias. The Standards for Reporting of Diagnostic Accuracy (STARD 2015) are a protocol to encourage accurate, transparent reporting. The Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool evaluates risk of bias and transportability of results. We used these tools to evaluate diagnostic test accuracy studies of POC tests for three respiratory pathogens. For the 96 studies analysed, compliance was <25% for 14/34 STARD 2015 standards, and 3/7 QUADAS-2 domains showed a high risk of bias. All reports lacked reporting of at least one criterion. These biases should be considered in the interpretation of study results.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Pakpoom Subsoontorn ◽  
Manupat Lohitnavy ◽  
Chuenjid Kongkaew

Abstract Many recent studies reported coronavirus point-of-care tests (POCTs) based on isothermal amplification. However, the performance of these tests has not been systematically evaluated. The Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy was used as a guideline for conducting this systematic review. We searched peer-reviewed and preprint articles in PubMed, BioRxiv and MedRxiv up to 28 September 2020 to identify studies that provide data to calculate sensitivity, specificity and diagnostic odds ratio (DOR). Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) was applied for assessing the quality of included studies, and Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies (PRISMA-DTA) was followed for reporting. We included 81 studies from 65 research articles on POCTs of SARS, MERS and COVID-19. Most studies had high risk of patient selection and index test bias but low risk in other domains. Diagnostic specificities were high (> 0.95) for included studies, while sensitivities varied depending on the type of assay and sample used. Most studies (n = 51) used reverse transcription loop-mediated isothermal amplification (RT-LAMP) to diagnose coronaviruses. RT-LAMP of RNA purified from COVID-19 patient samples had a pooled sensitivity of 0.94 (95% CI: 0.90–0.96). RT-LAMP of crude samples had a substantially lower sensitivity of 0.78 (95% CI: 0.65–0.87). Abbott ID Now performance was similar to RT-LAMP of crude samples. Diagnostic performances of CRISPR and RT-LAMP on purified RNA were similar. Other diagnostic platforms, including RT-recombinase-assisted amplification (RT-RAA) and SAMBA-II, also offered high sensitivity (> 0.95). Future studies should focus on the use of unbiased patient cohorts, double-blinded index tests and detection assays that do not require RNA extraction.
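As an illustration of the per-study quantities this review extracted, here is a minimal Python sketch that computes sensitivity and specificity from 2 × 2 counts and attaches a Wilson score interval to each proportion. The counts and function names are hypothetical. The pooled estimates quoted in the abstract come from a meta-analysis across studies, which this single-study calculation does not attempt to reproduce.

```python
import math


def sensitivity(tp, fn):
    """Proportion of reference-positive samples the index test detects."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Proportion of reference-negative samples the index test clears."""
    return tn / (tn + fp)


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


# Hypothetical single-study counts, not taken from the review.
tp, fn = 47, 3    # reference-positive samples
tn, fp = 96, 4    # reference-negative samples

if __name__ == "__main__":
    print("sensitivity", sensitivity(tp, fn), wilson_interval(tp, tp + fn))
    print("specificity", specificity(tn, fp), wilson_interval(tn, tn + fp))
```

With small positive groups the interval around sensitivity is wide, which is one reason single-study estimates are pooled before platforms are compared.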


2020 ◽  
Vol 122 ◽  
pp. 129-141 ◽  
Author(s):  
Holger J. Schünemann ◽  
Reem A. Mustafa ◽  
Jan Brozek ◽  
Karen R. Steingart ◽  
Mariska Leeflang ◽  
...  

Author(s):  
S. Lindner ◽  
K. von Rudno ◽  
J. Gawlitza ◽  
J. Hardt ◽  
F. Sandra-Petrescu ◽  
...  

Abstract Purpose: This study investigates whether contrast enema (CE) and flexible endoscopy (FE) should be performed routinely after low anterior resection (LAR) before ileostomy reversal. Additionally, the impact of previous anastomotic leakage (AL) on diagnostic test accuracy (DTA) was assessed. Methods: This is a retrospective analysis of prospectively collected tertiary care data from two centers. Consecutive rectal cancer patients undergoing LAR with loop ileostomy formation were included. Before ileostomy reversal, all patients were assessed by CE and FE. DTA of FE and CE for asymptomatic AL was assessed separately in patients who had previously suffered from clinically relevant AL (group 1) and in those without apparent AL after LAR (group 0). Results: Two hundred ninety-three patients were included in the analysis, 86 in group 1 and 207 in group 0. Overall sensitivity for detection of asymptomatic AL was 76% (FE) and 60% (CE). Specificity was 100% for both tests. DTA of FE was equal or superior to CE in all subgroups. Prevalence of asymptomatic AL at the time of testing was 1.4% in group 0 and 25.6% in group 1. Conclusion: Flexible endoscopy is the more accurate diagnostic test for the detection of asymptomatic anastomotic leaks prior to ileostomy reversal. Contrast enema showed no gain of information. In the group without complications after the initial rectal resection, 104 patients must be tested to find one leak prior to reversal. In those patients, routine diagnostic testing in addition to digital rectal examination may be questioned.


2016 ◽  
Vol 17 (5) ◽  
pp. 706 ◽  
Author(s):  
Young Jun Choi ◽  
Mi Sun Chung ◽  
Hyun Jung Koo ◽  
Ji Eun Park ◽  
Hee Mang Yoon ◽  
...  

2016 ◽  
Vol 17 (1) ◽  
pp. 3-8 ◽  
Author(s):  
S. Buczinski ◽  
G. Fecteau ◽  
M. Chigerwe ◽  
J. M. Vandeweerd

Abstract Calves are highly dependent on colostrum (and antibody) intake because they are born agammaglobulinemic. The transfer of passive immunity in calves can be assessed directly by measuring immunoglobulin G (IgG) or indirectly by refractometry or Brix refractometry. The latter are easier to perform routinely in the field. This paper presents a protocol for a systematic review and meta-analysis to assess the diagnostic accuracy of refractometry or Brix refractometry versus IgG measurement as the reference standard test. With this review protocol we aim to report refractometer and Brix refractometer accuracy in terms of sensitivity and specificity, as well as to quantify the impact of any study characteristic on test accuracy.


2021 ◽  
Author(s):  
Jérémie F. Cohen ◽  
Daniël A. Korevaar ◽  
Douglas G. Altman ◽  
David E. Bruns ◽  
Constantine A. Gatsonis ◽  
...  

Diagnostic accuracy studies are, like other clinical studies, at risk of bias due to shortcomings in design and conduct, and the results of a diagnostic accuracy study may not apply to other patient groups and settings. Readers of study reports need to be informed about study design and conduct, in sufficient detail to judge the trustworthiness and applicability of the study findings. The STARD statement (Standards for Reporting of Diagnostic Accuracy Studies) was developed to improve the completeness and transparency of reports of diagnostic accuracy studies. STARD contains a list of essential items that can be used as a checklist, by authors, reviewers and other readers, to ensure that a report of a diagnostic accuracy study contains the necessary information. STARD was recently updated. All updated STARD materials, including the checklist, are available at http://www.equator-network.org/reporting-guidelines/stard. Here, we present the STARD 2015 explanation and elaboration document. Through commented examples of appropriate reporting, we clarify the rationale for each of the 30 items on the STARD 2015 checklist, and describe what is expected from authors in developing sufficiently informative study reports. This article is a reprint, with Russian translation, of the original: Cohen JF, Korevaar DA, Altman DG, et al. STARD 2015 guidelines for reporting diagnostic accuracy studies: explanation and elaboration. BMJ Open 2016;6:e012799. doi: 10.1136/bmjopen-2016-012799

