Observed Differences in Diagnostic Test Accuracy between Patient Subgroups: Is It Real or Due to Reference Standard Misclassification?

Abstract Before a new test is introduced in clinical practice, its accuracy should be assessed. In the past decade, researchers have put an increased emphasis on exploring differences in test sensitivity and specificity between patient subgroups. If the reference standard is imperfect and the prevalence of the target condition differs among subgroups, apparent differences in test sensitivity and specificity between subgroups may be caused by reference standard misclassification. We provide guidance on how to determine whether observed differences may be explained by reference standard misclassification. Such misclassification may be ascertained by examining how the apparent sensitivity and specificity change with the prevalence of the target condition in the subgroups.
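
The mechanism described in this abstract can be illustrated with a short calculation. The sketch below is not taken from the paper. It assumes that the index test and the reference standard err independently given the true disease status, and the function name `apparent_accuracy` and all numeric values are hypothetical.

```python
def apparent_accuracy(prev, se_t, sp_t, se_r, sp_r):
    """Apparent sensitivity/specificity of an index test judged against an
    imperfect reference standard.

    Assumption (not from the source): index-test and reference-standard
    errors are independent given the true disease status.
    prev        true prevalence of the target condition
    se_t, sp_t  true sensitivity/specificity of the index test
    se_r, sp_r  sensitivity/specificity of the reference standard
    """
    # Probability that the reference standard is positive / negative
    ref_pos = prev * se_r + (1 - prev) * (1 - sp_r)
    ref_neg = prev * (1 - se_r) + (1 - prev) * sp_r
    # Joint probabilities of agreement between index test and reference
    both_pos = prev * se_t * se_r + (1 - prev) * (1 - sp_t) * (1 - sp_r)
    both_neg = prev * (1 - se_t) * (1 - se_r) + (1 - prev) * sp_t * sp_r
    return both_pos / ref_pos, both_neg / ref_neg


# Hypothetical subgroups that differ only in prevalence; the true accuracy
# of the index test (0.90 / 0.90) is identical in every subgroup.
for prev in (0.1, 0.3, 0.5):
    app_se, app_sp = apparent_accuracy(prev, 0.90, 0.90, 0.95, 0.95)
    print(f"prevalence={prev:.1f}  apparent Se={app_se:.3f}  apparent Sp={app_sp:.3f}")
```

Under these assumptions the apparent sensitivity rises and the apparent specificity falls as prevalence increases, even though the true accuracy of the index test never changes. With a perfect reference standard (`se_r = sp_r = 1`) the apparent values equal the true ones at every prevalence.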

Use of Smartphones to Detect Diabetic Retinopathy: Scoping Review and Meta-Analysis of Diagnostic Test Accuracy Studies (Preprint)

10.2196/preprints.16658 ◽

2019 ◽

Author(s):

Choon Han Tan ◽

Bhone Myint Kyaw ◽

Helen Smith ◽

Colin S Tan ◽

Lorainne Tudor Car

Keyword(s):

Diabetic Retinopathy ◽

Diagnostic Accuracy ◽

Diagnostic Test ◽

Sensitivity And Specificity ◽

Meta Analysis ◽

Diagnostic Test Accuracy ◽

Fundus Photography ◽

Diabetic Patients ◽

Test Accuracy ◽

Reference Standard

BACKGROUND Diabetic retinopathy (DR), a common complication of diabetes mellitus, is the leading cause of impaired vision in adults worldwide. Smartphone ophthalmoscopy involves using a smartphone camera for digital retinal imaging. Utilizing smartphones to detect DR is potentially more affordable, accessible, and easier to use than conventional methods. OBJECTIVE This study aimed to determine the diagnostic accuracy of various smartphone ophthalmoscopy approaches for detecting DR in diabetic patients. METHODS We performed an electronic search on the Medical Literature Analysis and Retrieval System Online (MEDLINE), EMBASE, and Cochrane Library for literature published from January 2000 to November 2018. We included studies involving diabetic patients, which compared the diagnostic accuracy of smartphone ophthalmoscopy for detecting DR to an accurate or commonly employed reference standard, such as indirect ophthalmoscopy, slit-lamp biomicroscopy, and tabletop fundus photography. Two reviewers independently screened studies against the inclusion criteria, extracted data, and assessed the quality of included studies using the Quality Assessment of Diagnostic Accuracy Studies–2 tool, with disagreements resolved via consensus. Sensitivity and specificity were pooled using the random effects model. A summary receiver operating characteristic (SROC) curve was constructed. This review is reported in line with the Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies guidelines. RESULTS In all, nine studies involving 1430 participants were included. Most studies were of high quality, except one study with limited applicability because of its reference standard. 
The pooled sensitivity and specificity were 87% (95% CI 74%-94%) and 94% (95% CI 81%-98%) for detecting any DR; 39% (95% CI 10%-79%) and 95% (95% CI 91%-98%) for mild nonproliferative DR (NPDR); 71% (95% CI 57%-81%) and 95% (95% CI 88%-98%) for moderate NPDR; 80% (95% CI 49%-94%) and 97% (95% CI 88%-99%) for severe NPDR; 92% (95% CI 79%-97%) and 99% (95% CI 96%-99%) for proliferative DR (PDR); 79% (95% CI 63%-89%) and 93% (95% CI 82%-97%) for diabetic macular edema; and 91% (95% CI 86%-94%) and 89% (95% CI 56%-98%) for referral-warranted DR. The area under the SROC curve ranged from 0.879 to 0.979. The diagnostic odds ratio ranged from 11.3 to 1225. CONCLUSIONS We found heterogeneous evidence showing that smartphone ophthalmoscopy performs well in detecting DR. The diagnostic accuracy for PDR was highest. Future studies should standardize reference criteria and classification criteria and evaluate other available forms of smartphone ophthalmoscopy in primary care settings.

Ultrasonography for the diagnosis of extra-cranial carotid occlusion – diagnostic test accuracy meta-analysis

VASA ◽

10.1024/0301-1526/a000850 ◽

2020 ◽

Vol 49 (3) ◽

pp. 195-204

Author(s):

Djamila M. Rojoa ◽

Ahmad Q. D. Lodhi ◽

Nikos Kontopodis ◽

Christos V. Ioannou ◽

Nicos Labropoulos ◽

...

Keyword(s):

Systematic Review ◽

Diagnostic Test ◽

Sensitivity And Specificity ◽

Meta Analysis ◽

Diagnostic Test Accuracy ◽

Carotid Occlusion ◽

Test Accuracy ◽

Free Text ◽

Correct Identification ◽

Reference Standard

Summary: Background: The correct diagnosis of internal carotid artery (ICA) occlusion is crucial as it limits unnecessary intervention, whereas correct identification of patients with severe ICA stenosis is paramount in decision making and selecting patients who would benefit from intervention. We aimed to evaluate the accuracy of ultrasonography (US) in the diagnosis of ICA occlusion. Methods: We conducted a systematic review in compliance with the Preferred Reporting Items for a Systematic Review and Meta-analysis (PRISMA) of diagnostic test accuracy studies. We interrogated electronic bibliographic sources using a combination of free text and thesaurus terms to identify studies assessing the diagnostic accuracy of US in ICA occlusion. We used a mixed-effects logistic regression bivariate model to estimate summary sensitivity and specificity. We developed hierarchical summary receiver operating characteristic (HSROC) curves. Results: We identified 23 studies reporting a total of 5,675 arteries, of which 722 were proven to be occluded by the reference standard. The reference standard was digital subtraction or cerebral angiography in all but two studies, which used surgery to ascertain a carotid occlusion. The pooled estimates for sensitivity and specificity were 0.97 (95% confidence interval (CI) 0.94 to 0.99) and 0.99 (95% CI 0.98 to 1.00), respectively. The diagnostic odds ratio was 3,846.15 (95% CI 1,375.74 to 10,752.65). The positive and negative likelihood ratios were 114.71 (95% CI 58.84 to 223.63) and 0.03 (95% CI 0.01 to 0.06), respectively. Conclusions: US is a reliable and accurate method in diagnosing ICA occlusion. US can be used as a screening tool, with cross-sectional imaging being reserved for ambiguous cases.
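
For readers unfamiliar with the summary measures quoted in this abstract, the sketch below shows the standard textbook definitions of the likelihood ratios and the diagnostic odds ratio. It is an illustration only, not the authors' bivariate model, and the function name `likelihood_ratios` is hypothetical. Plugging in the rounded pooled estimates will not reproduce the reported values exactly, because those come from the fitted model and unrounded estimates.

```python
def likelihood_ratios(sens, spec):
    """Return (LR+, LR-, DOR) from a sensitivity and a specificity.

    LR+ = sens / (1 - spec)
    LR- = (1 - sens) / spec
    DOR = LR+ / LR-
    Requires 0 < sens < 1 and 0 < spec < 1 to avoid division by zero.
    """
    if not (0 < sens < 1 and 0 < spec < 1):
        raise ValueError("sens and spec must lie strictly between 0 and 1")
    lr_pos = sens / (1 - spec)
    lr_neg = (1 - sens) / spec
    return lr_pos, lr_neg, lr_pos / lr_neg


# Rounded pooled estimates quoted in the abstract
lr_pos, lr_neg, dor = likelihood_ratios(0.97, 0.99)
print(f"LR+ = {lr_pos:.2f}, LR- = {lr_neg:.3f}, DOR = {dor:.0f}")
```

Because specificity is so close to 1, small changes in its value shift the positive likelihood ratio and the diagnostic odds ratio substantially, which is one reason the reported confidence intervals for these measures are wide.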

Use of Smartphones to Detect Diabetic Retinopathy: Scoping Review and Meta-Analysis of Diagnostic Test Accuracy Studies

Journal of Medical Internet Research ◽

10.2196/16658 ◽

2020 ◽

Vol 22 (5) ◽

pp. e16658

Author(s):

Choon Han Tan ◽

Bhone Myint Kyaw ◽

Helen Smith ◽

Colin S Tan ◽

Lorainne Tudor Car

Keyword(s):

Diabetic Retinopathy ◽

Diagnostic Accuracy ◽

Diagnostic Test ◽

Sensitivity And Specificity ◽

Meta Analysis ◽

Diagnostic Test Accuracy ◽

Fundus Photography ◽

Diabetic Patients ◽

Test Accuracy ◽

Reference Standard

Background Diabetic retinopathy (DR), a common complication of diabetes mellitus, is the leading cause of impaired vision in adults worldwide. Smartphone ophthalmoscopy involves using a smartphone camera for digital retinal imaging. Utilizing smartphones to detect DR is potentially more affordable, accessible, and easier to use than conventional methods. Objective This study aimed to determine the diagnostic accuracy of various smartphone ophthalmoscopy approaches for detecting DR in diabetic patients. Methods We performed an electronic search on the Medical Literature Analysis and Retrieval System Online (MEDLINE), EMBASE, and Cochrane Library for literature published from January 2000 to November 2018. We included studies involving diabetic patients, which compared the diagnostic accuracy of smartphone ophthalmoscopy for detecting DR to an accurate or commonly employed reference standard, such as indirect ophthalmoscopy, slit-lamp biomicroscopy, and tabletop fundus photography. Two reviewers independently screened studies against the inclusion criteria, extracted data, and assessed the quality of included studies using the Quality Assessment of Diagnostic Accuracy Studies–2 tool, with disagreements resolved via consensus. Sensitivity and specificity were pooled using the random effects model. A summary receiver operating characteristic (SROC) curve was constructed. This review is reported in line with the Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies guidelines. Results In all, nine studies involving 1430 participants were included. Most studies were of high quality, except one study with limited applicability because of its reference standard. 
The pooled sensitivity and specificity were 87% (95% CI 74%-94%) and 94% (95% CI 81%-98%) for detecting any DR; 39% (95% CI 10%-79%) and 95% (95% CI 91%-98%) for mild nonproliferative DR (NPDR); 71% (95% CI 57%-81%) and 95% (95% CI 88%-98%) for moderate NPDR; 80% (95% CI 49%-94%) and 97% (95% CI 88%-99%) for severe NPDR; 92% (95% CI 79%-97%) and 99% (95% CI 96%-99%) for proliferative DR (PDR); 79% (95% CI 63%-89%) and 93% (95% CI 82%-97%) for diabetic macular edema; and 91% (95% CI 86%-94%) and 89% (95% CI 56%-98%) for referral-warranted DR. The area under the SROC curve ranged from 0.879 to 0.979. The diagnostic odds ratio ranged from 11.3 to 1225. Conclusions We found heterogeneous evidence showing that smartphone ophthalmoscopy performs well in detecting DR. The diagnostic accuracy for PDR was highest. Future studies should standardize reference criteria and classification criteria and evaluate other available forms of smartphone ophthalmoscopy in primary care settings.

Metadta: A Stata Command for Meta-analysis and Meta-regression of Diagnostic Test Accuracy Data – A Tutorial.

10.21203/rs.3.rs-721114/v1 ◽

2021 ◽

Author(s):

Victoria Nyawira Nyaga ◽

Marc Arbyn

Keyword(s):

Bladder Cancer ◽

Diagnostic Test ◽

Sensitivity And Specificity ◽

Fixed Effects ◽

Meta Analysis ◽

Diagnostic Test Accuracy ◽

Forest Plot ◽

Test Accuracy ◽

Statistical Procedures ◽

Meta Regression

Abstract Background Although statistical procedures for pooling of several epidemiological metrics are generally available in statistical packages, those for meta-analysis of diagnostic test accuracy studies, including options for multivariate regression, are lacking. Fitting regression models and processing the estimates often entail lengthy and tedious calculations. Therefore, packaging appropriate statistical procedures in a robust and user-friendly program is of great interest to the scientific community. Methods metadta is a statistical program for pooling of diagnostic accuracy test data in Stata. It implements both the bivariate random-effects and fixed-effects models, allows for meta-regression, and presents the results in tables, a forest plot and/or a summary receiver operating characteristic (SROC) plot. For a model without covariates, it also quantifies heterogeneity using an I² statistic that accounts for the mean-variance relationship and for the correlation between sensitivity and specificity, a typical characteristic of diagnostic data. To demonstrate metadta, we applied the program to two published meta-analyses on: 1) the sensitivity and specificity of cytology and other markers including telomerase for primary diagnosis of bladder cancer; and 2) the accuracy of human papillomavirus testing on self-collected versus clinician-collected samples to detect cervical precancer. Results Without requiring a continuity correction, metadta generated a pooled sensitivity and specificity of 0.77 [95% CI: 0.70, 0.82] and 0.91 [95% CI: 0.75, 0.97], respectively, of telomerase for the diagnosis of primary bladder cancer. metadta made it possible to assess the relative accuracy of human papillomavirus (HPV) testing on self- versus clinician-taken specimens in matched studies, taking into account two covariates. Under the condition of using assays based on target amplification, HPV tests were similarly sensitive in detecting cervical precancer, irrespective of clinical setting.
Conclusion The metadta program implements state-of-the-art statistical procedures in an attempt to close the gap between methodological statisticians and systematic reviewers. With metadta, we hope to further popularize the use of appropriate statistical methods for diagnostic meta-analysis.

A tutorial for meta-analysis of diagnostic tests for low-prevalence diseases: Bayesian models and software

Methodology ◽

10.5964/meth.4015 ◽

2020 ◽

Vol 16 (3) ◽

pp. 258-277

Author(s):

Johny J. Pambabay-Calero ◽

Sergio A. Bauz-Olvera ◽

Ana B. Nieto-Librero ◽

Maria Purificación Galindo-Villardón ◽

Ana B. Sánchez-García

Keyword(s):

Diagnostic Test ◽

Sensitivity And Specificity ◽

Meta Analysis ◽

Diagnostic Test Accuracy ◽

Mixed Effects Models ◽

Test Accuracy ◽

R Program ◽

Correct Model ◽

Analytical Integration ◽

Low Prevalence

Although measures such as sensitivity and specificity are used in the study of diagnostic test accuracy, these are not appropriate for integrating heterogeneous studies. Therefore, it is essential to assess in detail all related aspects prior to integrating a set of studies so that the correct model can then be selected. This work describes the scheme employed for making decisions regarding the use of the R, STATA and SAS statistical programs. We used the R Program Meta-Analysis of Diagnostic Accuracy package for determining the correlation between sensitivity and specificity. This package considers fixed, random and mixed effects models and provides excellent summaries and assesses heterogeneity. For selecting various cutoff points in the meta-analysis, we used the STATA module for meta-analytical integration of diagnostic test accuracy studies, which produces bivariate outputs for heterogeneity.

Diagnostic test accuracy using digital retinal imaging in the detection of any diabetic retinopathy by graders in Vietnam, against a reference standard from the UK

Acta Ophthalmologica ◽

10.1111/j.1755-3768.2022.210 ◽

2022 ◽

Vol 100 (S267) ◽

Author(s):

Katie Curran ◽

Nathan Congdon ◽

Tung Hoang ◽

Huong Tran ◽

Hue Nguyen ◽

...

Keyword(s):

Diabetic Retinopathy ◽

Diagnostic Test ◽

Retinal Imaging ◽

Diagnostic Test Accuracy ◽

Test Accuracy ◽

Reference Standard ◽

The UK

Introduction to diagnostic test accuracy studies

Acta Endocrinologica ◽

10.1530/eje-20-1239 ◽

2021 ◽

Vol 184 (2) ◽

pp. E5-E9

Author(s):

Alice J Sitch ◽

Olaf M Dekkers ◽

Barnaby R Scholefield ◽

Yemisi Takwoingi

Keyword(s):

Diagnostic Accuracy ◽

Diagnostic Test ◽

Sensitivity And Specificity ◽

Diagnostic Tests ◽

Clinical Situation ◽

Diagnostic Test Accuracy ◽

Test Accuracy ◽

Trade Off ◽

Specificity Test ◽

Uncertainty Of Results

Diagnostic accuracy studies are fundamental for the assessment of diagnostic tests. Researchers need to understand the implications of their chosen design, opting for comparative designs where possible. Researchers should analyse test accuracy studies using appropriate methods, acknowledge the uncertainty of results, and avoid both overstating conclusions and ignoring the clinical situation, which should inform the trade-off between sensitivity and specificity. Test accuracy studies should be reported transparently using the STAndards for the Reporting of Diagnostic accuracy studies (STARD) checklist.

Implications of covariate induced test dependence on the diagnostic accuracy of latent class analysis in pulmonary tuberculosis

10.21203/rs.3.rs-900353/v1 ◽

2021 ◽

Author(s):

Alfred Kipyegon Keter ◽

Lutgarde Lynen ◽

Alastair van Heerden ◽

Els Goetghebeur ◽

Bart K.M. Jacobs

Keyword(s):

Pulmonary Tuberculosis ◽

Diagnostic Accuracy ◽

Diagnostic Test ◽

Sensitivity And Specificity ◽

Diagnostic Tests ◽

Latent Class ◽

Test Accuracy ◽

Test Results ◽

Test Sensitivity ◽

Correct Estimation

Abstract Background Lack of a perfect reference standard for pulmonary tuberculosis (PTB) diagnosis complicates assessment of accuracy of new diagnostic tests. Alternative strategies such as discrepant resolution and use of composite reference standards may lead to incorrect inferences on disease prevalence and diagnostic test sensitivity and specificity. Latent class analysis (LCA), a statistical method for analyzing diagnostic test results in the absence of a gold standard, allows correct estimation under strict assumptions. The model assumes that the diagnostic tests are independent conditional on the true disease status and that the diagnostic test sensitivity and specificity remain constant across subpopulations. These assumptions are violated when a factor such as severe comorbidity affects the prevalence and/or alters the diagnostic test performance. We aim to provide guidance on correct estimation of the prevalence and diagnostic test accuracy based on LCA when a known factor induces dependence among the diagnostic tests. If unaccounted for, this dependence may lead to misleading inferences. Methods Through likelihood evaluation and simulation we examined implications of likely model violations on estimation of prevalence, sensitivity and specificity among passive case-finding presumptive PTB patients with or without HIV. We generated independent results for five diagnostic tests conditional on PTB and HIV. We performed Bayesian LCA, separately for five and three diagnostic tests using four working models with or without constant PTB prevalence and diagnostic test accuracy across HIV subpopulations. Results In evaluating three diagnostic tests, the models accounting for heterogeneity in diagnostic accuracy produced consistent estimates while the models ignoring it produced biased estimates. The model ignoring heterogeneity in PTB prevalence is less problematic. When evaluating five diagnostic tests, the models were robust to violation of the assumptions. 
Conclusions Well-chosen covariate-specific adaptations of the model can avoid bias implied by recognized heterogeneity in PTB patient populations generating otherwise dependent test results in LCA.
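
The conditional independence assumption described in this abstract can be made concrete with a small sketch. The code below is not the authors' Bayesian model. It only evaluates the probability of a test-result pattern under a basic two-class latent class model, then shows how pooling two strata with different sensitivities induces dependence between tests among diseased individuals. All function names and numeric values are hypothetical.

```python
from itertools import product
from math import prod


def pattern_probability(pattern, prevalence, sens, spec):
    """P(observed pattern) under a two-class latent class model in which the
    tests are independent given the true disease status.

    pattern     tuple of 0/1 results, one per test
    sens, spec  per-test sensitivities and specificities
    """
    p_if_diseased = prod(se if y else 1 - se for y, se in zip(pattern, sens))
    p_if_healthy = prod(1 - sp if y else sp for y, sp in zip(pattern, spec))
    return prevalence * p_if_diseased + (1 - prevalence) * p_if_healthy


def pooled_covariance_among_diseased(weights, sens_by_stratum):
    """Covariance between tests 1 and 2 among diseased individuals after
    pooling strata. Within each stratum the tests are independent.

    weights          share of diseased individuals in each stratum (sums to 1)
    sens_by_stratum  (se_test1, se_test2) for each stratum
    """
    p1 = sum(w * s[0] for w, s in zip(weights, sens_by_stratum))
    p2 = sum(w * s[1] for w, s in zip(weights, sens_by_stratum))
    p12 = sum(w * s[0] * s[1] for w, s in zip(weights, sens_by_stratum))
    return p12 - p1 * p2


# Hypothetical three-test example: pattern probabilities must sum to 1.
sens, spec = [0.9, 0.8, 0.7], [0.95, 0.90, 0.85]
total = sum(pattern_probability(p, 0.3, sens, spec)
            for p in product([0, 1], repeat=3))

# Hypothetical strata (e.g. with and without a comorbidity) whose
# sensitivities differ: pooling them yields a non-zero covariance.
cov = pooled_covariance_among_diseased([0.5, 0.5], [(0.9, 0.9), (0.5, 0.5)])
print(f"sum of pattern probabilities = {total:.6f}, pooled covariance = {cov:.3f}")
```

When the strata share the same sensitivities the pooled covariance is zero, which corresponds to the case in which a model ignoring the covariate remains valid.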

Diagnostic accuracy of the 4AT for delirium detection in older adults: systematic review and meta-analysis

Age and Ageing ◽

10.1093/ageing/afaa224 ◽

2020 ◽

Author(s):

Zoë Tieges ◽

Alasdair M J Maclullich ◽

Atul Anand ◽

Claire Brookes ◽

Marica Cassarino ◽

...

Keyword(s):

Systematic Review ◽

Older Adults ◽

Clinical Practice ◽

Diagnostic Accuracy ◽

Diagnostic Test ◽

Methodological Quality ◽

Meta Analysis ◽

Diagnostic Test Accuracy ◽

Test Accuracy ◽

Cochrane Central Register

Abstract Objective Detection of delirium in hospitalised older adults is recommended in national and international guidelines. The 4 ‘A’s Test (4AT) is a short (<2 minutes) instrument for delirium detection that is used internationally as a standard tool in clinical practice. We performed a systematic review and meta-analysis of diagnostic test accuracy of the 4AT for delirium detection. Methods We searched MEDLINE, EMBASE, PsycINFO, CINAHL, clinicaltrials.gov and the Cochrane Central Register of Controlled Trials, from 2011 (year of 4AT release on the website www.the4AT.com) until 21 December 2019. Inclusion criteria were: older adults (≥65 years); diagnostic accuracy study of the 4AT index test when compared to delirium reference standard (standard diagnostic criteria or validated tool). Methodological quality was assessed using the Quality Assessment of Diagnostic Accuracy Studies-2 tool. Pooled estimates of sensitivity and specificity were generated from a bivariate random effects model. Results Seventeen studies (3,702 observations) were included. Settings were acute medicine, surgery, a care home and the emergency department. Three studies assessed performance of the 4AT in stroke. The overall prevalence of delirium was 24.2% (95% CI 17.8–32.1%; range 10.5–61.9%). The pooled sensitivity was 0.88 (95% CI 0.80–0.93) and the pooled specificity was 0.88 (95% CI 0.82–0.92). Excluding the stroke studies, the pooled sensitivity was 0.86 (95% CI 0.77–0.92) and the pooled specificity was 0.89 (95% CI 0.83–0.93). The methodological quality of studies varied but was moderate to good overall. Conclusions The 4AT shows good diagnostic test accuracy for delirium in the 17 available studies. These findings support its use in routine clinical practice in delirium detection. PROSPERO Registration number CRD42019133702.

33 Diagnostic Test Accuracy of the 4AT for Delirium Detection: Systematic Review and Meta-Analysis

Age and Ageing ◽

10.1093/ageing/afab029.12 ◽

2021 ◽

Vol 50 (Supplement_1) ◽

pp. i7-i11

Author(s):

Z Tieges ◽

A M J MacLullich ◽

A Anand ◽

M Cassarino ◽

M O'Connor ◽

...

Keyword(s):

Systematic Review ◽

Older Adults ◽

Diagnostic Test ◽

Sensitivity And Specificity ◽

Assessment Tool ◽

Meta Analysis ◽

Assessment Method ◽

Diagnostic Test Accuracy ◽

Test Accuracy ◽

Cochrane Central Register

Abstract Introduction Detection of delirium in hospitalised older adults is recommended in national and international guidelines. The 4 ‘A’s Test (4AT; www.the4AT.com) is a short (<2 min) instrument for delirium detection that is used internationally as a standard tool in clinical practice. We performed a systematic review and meta-analysis of diagnostic test accuracy of the 4AT for delirium detection. Methods We searched the following electronic databases through Ovid: MEDLINE, Embase, and PsycINFO. Additional databases were searched: CINAHL (EBSCOhost), clinicaltrials.gov and Cochrane Central Register of Controlled Trials from 2011 (4AT publication) until 21 December 2019. Inclusion criteria: older adults (≥65) across any setting of care except critical care; validation study of the 4AT against a delirium reference standard (standard diagnostic criteria or validated tool). Two reviewers independently screened abstracts and papers and performed the data extraction. Pooled estimates of sensitivity and specificity were generated from a bivariate random effects model. Results 17 studies (n = 3,701 observations) were included. Various settings including acute medicine, surgery, stroke wards and the emergency department were represented. The overall prevalence of delirium was 24.2% (95% CI 17.8–32.1%; range 10.5–61.9%). The pooled sensitivity was 0.88 (95% CI 0.80–0.93) and the pooled specificity was 0.88 (95% CI 0.82–0.92). The methodological quality of studies was mostly good. Conclusions The 4AT is now supported by a substantial evidence base comparable to other well-studied tools such as the Confusion Assessment Method (CAM). The strong pooled sensitivity and specificity findings for the 4AT in this meta-analysis along with its brevity and lack of need for specific training provide support for its use as an effective assessment tool for delirium.
