GRADE guidelines: 21 part 1. Study design, risk of bias, and indirectness in rating the certainty across a body of evidence for test accuracy

Context.—Accuracy is an important feature of any diagnostic test. There has been an increasing awareness of deficiencies in study design that can create bias in estimates of test accuracy. Many pathologists are unaware of these sources of bias. Objective.—To explain the causes and increase awareness of several common types of bias that result from deficiencies in the design of diagnostic accuracy studies. Data Sources.—We cite examples from the literature and provide calculations to illustrate the impact of study design features on estimates of diagnostic accuracy. In a companion article by Schmidt et al in this issue, we use these principles to evaluate diagnostic studies associated with a specific diagnostic test for risk of bias and reporting quality. Conclusions.—There are several sources of bias that are unique to diagnostic accuracy studies. Because pathologists are both consumers and producers of such studies, it is important that they be aware of the risk of bias.
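The kind of calculation this abstract refers to can be sketched with one well-known design flaw, partial verification, in which only some patients with a negative index test go on to receive the reference standard. The choice of this particular bias and every number below are illustrative assumptions, not figures from the article.

```python
def accuracy(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from 2x2 counts."""
    return tp / (tp + fn), tn / (tn + fp)


def partially_verified(tp, fp, fn, tn, frac_negatives_verified):
    """Counts left in the analysis when every test-positive patient gets the
    reference standard but only a fraction of test-negative patients do."""
    return tp, fp, fn * frac_negatives_verified, tn * frac_negatives_verified


# Hypothetical cohort: 100 diseased and 900 non-diseased patients,
# true sensitivity 0.80 and true specificity 0.90.
full = (80, 90, 20, 810)  # tp, fp, fn, tn

true_sens, true_spec = accuracy(*full)
obs_sens, obs_spec = accuracy(
    *partially_verified(*full, frac_negatives_verified=0.10)
)

print(f"fully verified:     sens={true_sens:.2f} spec={true_spec:.2f}")
print(f"partially verified: sens={obs_sens:.2f} spec={obs_spec:.2f}")
```

Under these assumed counts, verifying only 10% of test-negative patients raises the apparent sensitivity from 0.80 to about 0.98 and lowers the apparent specificity from 0.90 to about 0.47, although the test itself has not changed.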

Diagnostic tools for neurosyphilis: a systematic review

BMC Infectious Diseases ◽ 10.1186/s12879-021-06264-8 ◽ 2021 ◽ Vol 21 (1)

Author(s): Gustavo Henrique Pereira Boog ◽ João Vitor Ziroldo Lopes ◽ João Vitor Mahler ◽ Marina Solti ◽ Lucas Tokio Kawahara ◽ ...

Keyword(s): New Technologies ◽ Clinical Suspicion ◽ Risk Of Bias ◽ Diagnostic Methods ◽ Diagnostic Tools ◽ Test Accuracy ◽ Cerebrospinal Fluid Analysis ◽ Evaluation Methodologies ◽ Fluid Analysis ◽ Criteria For Diagnosis

Purpose: The increasing incidence of syphilis has heightened concern about the occurrence of neurosyphilis. This study aimed to understand the current diagnostic tools and their performance in detecting neurosyphilis, including new technologies and the variety of existing methods. Methods: We searched databases to select articles that reported neurosyphilis diagnostic methods and assessed their accuracy, presenting sensitivity and specificity values. Information was synthesized in tables. The risk of bias was examined using the Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy recommendations. Results: Fourteen studies were included. The main finding was a remarkable diversity of tests, which had varied purposes, techniques, and evaluation methodologies. There was no uniform criterion or gold standard to define neurosyphilis. The current basis for its diagnosis is clinical suspicion and cerebrospinal fluid analysis. There are promising new tests such as PCR tests and chemokine measurement assays. Conclusions: The diagnosis of neurosyphilis is still a challenge, despite the variety of existing and developing tests. We believe that the multiplicity of reference standards adopted as criteria for diagnosis reveals the imprecision of the current definitions of neurosyphilis. An important next step for the scientific community is to create a universally accepted diagnostic definition for this disease.

The Usefulness of the Pressure Algometer in the Diagnosis and Treatment of Orofacial Pain Patients: A Systematic Review

Occupational Therapy International ◽ 10.1155/2020/5168457 ◽ 2020 ◽ Vol 2020 ◽ pp. 1-8

Author(s): Agata Kamińska ◽ Bartosz Dalewski ◽ Ewa Sobolewska

Keyword(s): Study Design ◽ Orofacial Pain ◽ Pressure Pain Threshold ◽ Reliability And Validity ◽ Risk Of Bias ◽ Small Samples ◽ Menstrual Phase ◽ Efficient Approach ◽ Pressure Algometer ◽ Pain Patients

Study Design. Data were obtained from PubMed, Dentistry and Oral Sciences Source, ProQuest, Scopus, Medline (EBSCO), and ScienceDirect databases. The literature search was performed from 1 December 2017 through 12 January 2018. The titles and abstracts from electronic search results were screened for keywords and evaluated by two observers, with the following inclusion criteria: published since 1997, written in English, and encompassing human research. Exclusion criteria were as follows: articles published earlier than 1997, not written in English, animal studies, studies with the use of medicaments, and articles examining receptor interactions. Objectives. The pressure pain threshold (PPT) may be an efficient approach to screen and evaluate orofacial pain. However, the results of previous PPT studies have varied greatly. The aim of this paper was to determine whether the PPT is an efficient approach for screening and evaluating orofacial pain. Methods. The search yielded 123 articles. After removal of duplicates and screening of abstracts, 32 articles were selected for further evaluation. The Cochrane Collaboration tool for assessing the risk of bias was used for the evaluation of the studies. Results. The studies covered a total of 4403 adult patients, aged 16-62, and 30 children. The studies investigated the reliability and validity of the PPT (measured by a pressure algometer) in TMD patients. The PPT was investigated in relation to headache, menstrual cycle, oral contraception, occlusal interference, and occlusal appliances. Generally, the risk of bias was low to unclear. Some structural limitations were inherent in the studies, such as small samples and short duration of the testing involved. Also, the analyzed studies lacked consistency in study design and patient management. Pressure increase values differed from 20 kPa/s to 50 kPa/s and from 0.5 kg/cm²/s to 2 kg/cm²/s. Descriptions of the PPT examination points also varied, from very precise and repeatable to a simple listing of anatomical points. The number of measurements varied from 1 to 5 at each visit. The intervals ranged from 5 seconds to 15 minutes. However, some studies confirmed that the pressure algometer is an effective tool for determining the source of orofacial pain. Conclusions. Based on the analyzed articles, the authors argue that the PPT is not an efficient approach for screening and evaluating orofacial pain. What is more, it should not be used as the only diagnostic tool for patients with orofacial pain. Importantly, however, additional factors should be considered in the future for the evaluation of the PPT, including body symmetry and posture, hormone levels and the menstrual phase in women, and the use of medications and its influence on the PPT. Further clinical trials should also be performed on the PPT, examining head and neck pain patients, with more precise study design and larger samples.

Risk of bias and limits of reporting in diagnostic accuracy studies for commercial point-of-care tests for respiratory pathogens

Epidemiology and Infection ◽ 10.1017/s0950268818000596 ◽ 2018 ◽ Vol 146 (6) ◽ pp. 747-756

Author(s): J.M. Hughes ◽ C. Penney ◽ S. Boyd ◽ P. Daley

Keyword(s): Diagnostic Accuracy ◽ Test Performance ◽ Point Of Care ◽ Risk Of Bias ◽ Financial Impact ◽ Test Accuracy ◽ Respiratory Pathogens ◽ Study Results ◽ Group A ◽ Point Of Care Tests

Commercial point-of-care (POC) diagnostic tests for Group A Streptococcus, Streptococcus pneumoniae, and influenza virus have large potential diagnostic and financial impact. Many published reports on test performance, often funded by diagnostics companies, are prone to bias. The Standards for Reporting of Diagnostic Accuracy (STARD 2015) are a protocol to encourage accurate, transparent reporting. The Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool evaluates risk of bias and transportability of results. We used these tools to evaluate diagnostic test accuracy studies of POC tests for three respiratory pathogens. For the 96 studies analysed, compliance was <25% for 14/34 STARD 2015 standards, and 3/7 QUADAS-2 domains showed a high risk of bias. All reports lacked reporting of at least one criterion. These biases should be considered in the interpretation of study results.

Risk of bias tools in systematic reviews of health interventions: an analysis of PROSPERO-registered protocols

Systematic Reviews ◽ 10.1186/s13643-019-1172-8 ◽ 2019 ◽ Vol 8 (1) ◽ Cited By ~ 9

Author(s): Kelly Farrah ◽ Kelsey Young ◽ Matthew C. Tunis ◽ Linlu Zhao

Keyword(s): Random Sample ◽ Study Design ◽ Systematic Reviews ◽ Keyword Search ◽ Randomized Study ◽ Risk Of Bias ◽ Health Interventions ◽ Tool Selection ◽ Newcastle Ottawa Scale ◽ Education And Awareness

Background: Systematic reviews of health interventions are increasingly incorporating evidence outside of randomized controlled trials (RCT). While non-randomized study (NRS) types may be more prone to bias compared to RCT, the tools used to evaluate risk of bias (RoB) in NRS are less straightforward and no gold standard tool exists. The objective of this study was to evaluate the planned use of RoB tools in systematic reviews of health interventions, specifically for reviews that planned to incorporate evidence from RCT and/or NRS. Methods: We evaluated a random sample of non-Cochrane protocols for systematic reviews of interventions registered in PROSPERO between January 1 and October 12, 2018. For each protocol, we extracted data on the types of studies to be included (RCT and/or NRS) as well as the name and number of RoB tools planned to be used according to study design. We then conducted a longitudinal analysis of the most commonly reported tools in the random sample. Using keywords and name variants for each tool, we searched PROSPERO records by year since the inception of the database (2011 to December 7, 2018), restricting the keyword search to the “Risk of bias (quality) assessment” field. Results: In total, 471 randomly sampled PROSPERO protocols from 2018 were included in the analysis. About two-thirds (63%) of these planned to include NRS, while 37% restricted study design to RCT or quasi-RCT. Over half of the protocols that planned to include NRS listed only a single RoB tool, most frequently the Cochrane RoB Tool. The Newcastle-Ottawa Scale and ROBINS-I were the most commonly reported tools for NRS (39% and 33% respectively) for systematic reviews that planned to use multiple RoB tools. Looking at trends over time, the planned use of the Cochrane RoB Tool and ROBINS-I seems to be increasing. Conclusions: While RoB tool selection for RCT was consistent, with the Cochrane RoB Tool being the most frequently reported in PROSPERO protocols, RoB tools for NRS varied widely. Results suggest a need for more education and awareness on the appropriate use of RoB tools for NRS. Given the heterogeneity of study designs comprising NRS, multiple RoB tools tailored to specific designs may be required.

STARD 2015 guidelines for reporting diagnostic accuracy studies: explanation and elaboration. Translation to Russian

Digital Diagnostics ◽ 10.17816/dd71031 ◽ 2021

Author(s): Jérémie F. Cohen ◽ Daniël A. Korevaar ◽ Douglas G. Altman ◽ David E. Bruns ◽ Constantine A. Gatsonis ◽ ...

Keyword(s): Diagnostic Accuracy ◽ Study Design ◽ Clinical Studies ◽ Russian Translation ◽ Risk Of Bias ◽ Reporting Guidelines ◽ Diagnostic Accuracy Study ◽ Sufficient Detail ◽ Accuracy Study ◽ Patient Groups

Diagnostic accuracy studies are, like other clinical studies, at risk of bias due to shortcomings in design and conduct, and the results of a diagnostic accuracy study may not apply to other patient groups and settings. Readers of study reports need to be informed about study design and conduct, in sufficient detail to judge the trustworthiness and applicability of the study findings. The STARD statement (Standards for Reporting of Diagnostic Accuracy Studies) was developed to improve the completeness and transparency of reports of diagnostic accuracy studies. STARD contains a list of essential items that can be used as a checklist, by authors, reviewers and other readers, to ensure that a report of a diagnostic accuracy study contains the necessary information. STARD was recently updated. All updated STARD materials, including the checklist, are available at http://www.equator-network.org/reporting-guidelines/stard. Here, we present the STARD 2015 explanation and elaboration document. Through commented examples of appropriate reporting, we clarify the rationale for each of the 30 items on the STARD 2015 checklist, and describe what is expected from authors in developing sufficiently informative study reports. This article is a reprint, with Russian translation, of the original: Cohen JF, Korevaar DA, Altman DG, et al. STARD 2015 guidelines for reporting diagnostic accuracy studies: explanation and elaboration. BMJ Open 2016;6:e012799. doi: 10.1136/bmjopen-2016-012799

OP136 Full Texts Versus Conference Abstract Data: How Does The Message Change?

International Journal of Technology Assessment in Health Care ◽ 10.1017/s0266462318001538 ◽ 2018 ◽ Vol 34 (S1) ◽ pp. 50-51

Author(s): Jo Varley-Campbell ◽ Chris Cooper ◽ Helen Coelho ◽ Sophie Dodman ◽ Max Barnish ◽ ...

Keyword(s): Sensitivity And Specificity ◽ Full Text ◽ Preterm Labor ◽ Risk Of Bias ◽ Test Accuracy ◽ Conference Abstract ◽ High Quality Evidence ◽ Abstract Data ◽ Using Data ◽ Accuracy Data

Introduction: High quality evidence for test accuracy can be scarce. We assessed the test accuracy of two tests (Actim Partus and PartoSure) for the prediction of preterm birth. Twenty published full-text papers were included whilst conference abstracts were excluded. Since systematic reviews of diagnostic tests on other topics may need to rely on data from conference abstracts, we tested whether the findings of our review would change with conference abstracts included. Methods: Conference citations previously excluded (n=108) were re-screened for inclusion using the following criteria: i) the diagnostic test was Actim Partus or PartoSure; ii) test accuracy data of preterm delivery within seven days was reported; iii) the population was women with signs/symptoms of preterm labor with intact membranes. Relevant test accuracy data were extracted and used to calculate sensitivity and specificity. Pooled sensitivity and specificity for each test were calculated using data from full-text papers and conference abstracts combined. These values were compared with the pooled sensitivities and specificities produced for the systematic review using full-text papers only. Results: Preliminary pooled sensitivities of the sixteen full-text Actim Partus studies and sixteen full-texts and two abstracts were 0.77 (95% confidence interval (CI) 0.68, 0.83) and 0.76 (95% CI 0.69, 0.83), respectively, whilst pooled specificities were 0.81 (95% CI 0.76, 0.85) and 0.80 (95% CI 0.75, 0.84), respectively. Preliminary pooled sensitivities of the four full-text PartoSure studies and four full-texts and three abstracts were 0.83 (95% CI 0.61, 0.94) and 0.82 (95% CI 0.65, 0.92), respectively, whilst pooled specificities were 0.95 (95% CI 0.89, 0.98) and 0.96 (95% CI 0.94, 0.97), respectively. Conclusions: Our findings suggest that the test accuracy results would not alter substantially with the inclusion of conference abstracts. However, work is ongoing to investigate how the assessment of heterogeneity and risk of bias across studies would alter given the difficulties associated with limited methodological reporting from conference abstracts.
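The extraction step described in the Methods can be sketched as follows: each study contributes a 2×2 table, from which sensitivity and specificity are computed. The pooling shown here simply sums counts across studies, which is a deliberate simplification and not the method the abstract reports (reviews of this kind typically fit a bivariate random-effects model). The class name, function name, and all counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """2x2 table of index test result against the reference standard."""
    tp: int  # test positive, condition present
    fp: int  # test positive, condition absent
    fn: int  # test negative, condition present
    tn: int  # test negative, condition absent

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)


def pool_by_summing(studies):
    """Naive pooling: add up the cells of every study's 2x2 table."""
    return TwoByTwo(
        tp=sum(s.tp for s in studies),
        fp=sum(s.fp for s in studies),
        fn=sum(s.fn for s in studies),
        tn=sum(s.tn for s in studies),
    )


# Hypothetical counts, not data from the review.
full_texts = [TwoByTwo(tp=15, fp=20, fn=5, tn=80)]
abstracts = [TwoByTwo(tp=8, fp=10, fn=2, tn=90)]

without_abstracts = pool_by_summing(full_texts)
with_abstracts = pool_by_summing(full_texts + abstracts)

for label, table in [("full texts only", without_abstracts),
                     ("with abstracts", with_abstracts)]:
    print(f"{label}: sens={table.sensitivity:.2f} spec={table.specificity:.2f}")
```

Comparing the two pooled tables mirrors the comparison in the abstract: if adding the abstract-only studies barely moves sensitivity and specificity, the review's message is unlikely to change.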

Minimizing risk of bias in clinical implant research study design

Periodontology 2000 ◽ 10.1111/prd.12279 ◽ 2019 ◽ Vol 81 (1) ◽ pp. 18-28

Author(s): Georgios A. Kotsakis

Keyword(s): Study Design ◽ Research Study ◽ Risk Of Bias ◽ Research Study Design

Relationship of the Medial Patellofemoral Ligament Origin on the Distal Femur to the Distal Femoral Physis: A Systematic Review

The American Journal of Sports Medicine ◽ 10.1177/0363546520904685 ◽ 2020 ◽ Vol 49 (1) ◽ pp. 261-266 ◽ Cited By ~ 1

Author(s): Kyle R. Sochacki ◽ Kevin G. Shea ◽ Kunal Varshneya ◽ Marc R. Safran ◽ Geoffrey D. Abrams ◽ ...

Keyword(s): Systematic Review ◽ Quality Assessment ◽ Study Design ◽ Medial Patellofemoral Ligament ◽ Distal Femur ◽ Assessment Tool ◽ Risk Of Bias ◽ Skeletally Immature ◽ Quality Assessment Tool ◽ Femoral Physis

Background: The relationship between the medial patellofemoral ligament (MPFL) and the distal femoral physis has been reported in multiple studies. Purpose: To determine the distance from the MPFL central origin on the distal femur to the medial distal femoral physis in skeletally immature participants. Study Design: Systematic review. Methods: A systematic review was performed according to the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines. Multiple databases were searched for studies investigating the anatomic origin of the MPFL on the distal femur and its relationship to the medial distal femoral physis in skeletally immature participants. Study methodological quality was analyzed with the Anatomical Quality Assessment tool, with studies categorized as low risk, high risk, or unclear risk of bias. Continuous variable data were reported as mean ± SD. Categorical variable data were reported as frequency with percentage. Results: Seven articles were analyzed (298 femurs, 53.7% male patients; mean age, 11.7 ± 3.4 years). There was low risk of bias based on the Anatomical Quality Assessment tool. The distance from the MPFL origin to the distal femoral physis ranged from 3.7 mm proximal to the physis to 10.0 mm distal to the physis in individual studies. Six of 7 studies reported that the MPFL origin on the distal femur lies distal to the medial distal femoral physis in the majority of specimens. The MPFL originated distal to the medial distal femoral physis in 92.8% of participants at a mean distance of 6.9 ± 2.4 mm. Conclusion: The medial patellofemoral ligament originates distal to the medial distal femoral physis in the majority of cases at a mean proximal-to-distal distance of 7 mm distal to the physis. However, this is variable in the literature owing to study design and patient age and sex.

Quality Appraisal of Diagnostic Accuracy Studies in Fine-Needle Aspiration Cytology: A Survey of Risk of Bias and Comparability

Archives of Pathology & Laboratory Medicine ◽ 10.5858/arpa.2012-0199-ra ◽ 2013 ◽ Vol 137 (4) ◽ pp. 566-575 ◽ Cited By ~ 11

Author(s): Robert L. Schmidt ◽ Rachel E. Factor ◽ Benjamin L. Witt ◽ Lester J. Layfield

Keyword(s): Diagnostic Accuracy ◽ Diagnostic Test ◽ Needle Aspiration ◽ Risk Of Bias ◽ Diagnostic Test Accuracy ◽ Test Accuracy ◽ Aspiration Cytology ◽ Fine Needle ◽ Needle Aspiration Cytology

Context.—The quality of diagnostic accuracy studies is determined by 2 key factors: risk of bias and comparability. Bias can distort accuracy estimates and poor reporting impairs comparability. While diagnostic accuracy studies for fine-needle aspiration cytology (FNAC) are frequently published, the methodologic issues associated with this body of literature have never been reviewed. Objective.—To assess the quality of design and reporting of diagnostic test accuracy studies in FNAC. Data Sources.—Diagnostic accuracy studies were identified by a Medline (US National Library of Medicine) search. Sixty-four FNAC diagnostic test accuracy studies were randomly selected for structured review with the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) survey. Studies were divided between 2 time periods: 2000-2001 and 2009-2011. Conclusions.—Diagnostic test accuracy studies of FNAC suffer from numerous deficiencies in study design, which negatively affect the reliability of accuracy estimates.
