Observed Differences in Diagnostic Test Accuracy between Patient Subgroups: Is It Real or Due to Reference Standard Misclassification?

2007 ◽  
Vol 53 (10) ◽  
pp. 1725-1729 ◽  
Author(s):  
Corné Biesheuvel ◽  
Les Irwig ◽  
Patrick Bossuyt

Abstract Before a new test is introduced in clinical practice, its accuracy should be assessed. In the past decade, researchers have put an increased emphasis on exploring differences in test sensitivity and specificity between patient subgroups. If the reference standard is imperfect and the prevalence of the target condition differs among subgroups, apparent differences in test sensitivity and specificity between subgroups may be caused by reference standard misclassification. We provide guidance on how to determine whether observed differences may be explained by reference standard misclassification. Such misclassification may be ascertained by examining how the apparent sensitivity and specificity change with the prevalence of the target condition in the subgroups.
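
The dependence of apparent accuracy on prevalence can be illustrated with a minimal sketch. It assumes that the index test and the reference standard err independently given true disease status, which the abstract does not state. The function name and all numeric values are hypothetical and chosen only for illustration.

```python
def apparent_accuracy(prev, se_test, sp_test, se_ref, sp_ref):
    """Apparent sensitivity and specificity of an index test when it is
    judged against an imperfect reference standard.

    Assumption (not from the abstract): errors of the index test and of the
    reference standard are independent given true disease status.
    """
    # Joint probability that index test and reference agree on positive / negative
    both_pos = prev * se_test * se_ref + (1 - prev) * (1 - sp_test) * (1 - sp_ref)
    both_neg = prev * (1 - se_test) * (1 - se_ref) + (1 - prev) * sp_test * sp_ref
    # Marginal probability of a positive / negative reference result
    ref_pos = prev * se_ref + (1 - prev) * (1 - sp_ref)
    ref_neg = 1 - ref_pos
    return both_pos / ref_pos, both_neg / ref_neg


# Hypothetical subgroups that differ only in prevalence; the true accuracy
# of the index test (0.90 / 0.90) is identical in all of them.
for prev in (0.05, 0.20, 0.50):
    se, sp = apparent_accuracy(prev, 0.90, 0.90, 0.95, 0.95)
    print(f"prevalence={prev:.2f}  apparent Se={se:.3f}  apparent Sp={sp:.3f}")
```

Under these assumptions the apparent sensitivity rises and the apparent specificity falls as prevalence increases, even though the true accuracy of the index test is constant. With a perfect reference standard (both reference parameters equal to 1) the apparent values equal the true ones at every prevalence.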

2019 ◽  
Author(s):  
Choon Han Tan ◽  
Bhone Myint Kyaw ◽  
Helen Smith ◽  
Colin S Tan ◽  
Lorainne Tudor Car

BACKGROUND Diabetic retinopathy (DR), a common complication of diabetes mellitus, is the leading cause of impaired vision in adults worldwide. Smartphone ophthalmoscopy involves using a smartphone camera for digital retinal imaging. Utilizing smartphones to detect DR is potentially more affordable, accessible, and easier to use than conventional methods. OBJECTIVE This study aimed to determine the diagnostic accuracy of various smartphone ophthalmoscopy approaches for detecting DR in diabetic patients. METHODS We performed an electronic search on the Medical Literature Analysis and Retrieval System Online (MEDLINE), EMBASE, and Cochrane Library for literature published from January 2000 to November 2018. We included studies involving diabetic patients, which compared the diagnostic accuracy of smartphone ophthalmoscopy for detecting DR to an accurate or commonly employed reference standard, such as indirect ophthalmoscopy, slit-lamp biomicroscopy, and tabletop fundus photography. Two reviewers independently screened studies against the inclusion criteria, extracted data, and assessed the quality of included studies using the Quality Assessment of Diagnostic Accuracy Studies–2 tool, with disagreements resolved via consensus. Sensitivity and specificity were pooled using the random effects model. A summary receiver operating characteristic (SROC) curve was constructed. This review is reported in line with the Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies guidelines. RESULTS In all, nine studies involving 1430 participants were included. Most studies were of high quality, except one study with limited applicability because of its reference standard. 
The pooled sensitivity and specificity were, respectively, 87% (95% CI 74%-94%) and 94% (95% CI 81%-98%) for any DR; 39% (95% CI 10%-79%) and 95% (95% CI 91%-98%) for mild nonproliferative DR (NPDR); 71% (95% CI 57%-81%) and 95% (95% CI 88%-98%) for moderate NPDR; 80% (95% CI 49%-94%) and 97% (95% CI 88%-99%) for severe NPDR; 92% (95% CI 79%-97%) and 99% (95% CI 96%-99%) for proliferative DR (PDR); 79% (95% CI 63%-89%) and 93% (95% CI 82%-97%) for diabetic macular edema; and 91% (95% CI 86%-94%) and 89% (95% CI 56%-98%) for referral-warranted DR. The area under the SROC curve ranged from 0.879 to 0.979. The diagnostic odds ratio ranged from 11.3 to 1225. CONCLUSIONS We found heterogeneous evidence showing that smartphone ophthalmoscopy performs well in detecting DR. The diagnostic accuracy for PDR was highest. Future studies should standardize reference criteria and classification criteria and evaluate other available forms of smartphone ophthalmoscopy in primary care settings.
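
The abstract states only that sensitivity and specificity were pooled "using the random effects model". As a rough illustration of what such pooling involves, the sketch below applies a univariate DerSimonian-Laird estimator on the logit scale. This is an assumed stand-in and not necessarily the model the authors fitted; the function name and the study counts are hypothetical.

```python
import math


def pool_logit_random_effects(events, totals):
    """DerSimonian-Laird random-effects pooling of proportions on the logit scale.

    events[i] / totals[i] is one study's proportion, e.g. TP / (TP + FN)
    when pooling sensitivity. Returns (pooled, ci_low, ci_high, tau2).
    """
    y, v = [], []
    for e, n in zip(events, totals):
        # Simplification: a 0.5 continuity correction is applied to every
        # study so the logit is defined even when e == 0 or e == n.
        a, b = e + 0.5, n - e + 0.5
        y.append(math.log(a / b))   # logit of the corrected proportion
        v.append(1 / a + 1 / b)     # approximate within-study variance
    w = [1 / vi for vi in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    # Cochran's Q and the method-of-moments between-study variance
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau2 to each within-study variance
    w_re = [1 / (vi + tau2) for vi in v]
    mu = sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))

    def expit(x):
        return 1 / (1 + math.exp(-x))

    return expit(mu), expit(mu - 1.96 * se), expit(mu + 1.96 * se), tau2


# Hypothetical true-positive counts and numbers of diseased eyes per study
pooled, low, high, tau2 = pool_logit_random_effects([45, 30, 60, 18], [50, 40, 65, 30])
print(f"pooled sensitivity={pooled:.3f} (95% CI {low:.3f}-{high:.3f}), tau^2={tau2:.3f}")
```

Pooling sensitivity and specificity separately ignores the correlation between them. Bivariate models of the kind mentioned in other abstracts on this page estimate both jointly.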


VASA ◽  
2020 ◽  
Vol 49 (3) ◽  
pp. 195-204
Author(s):  
Djamila M. Rojoa ◽  
Ahmad Q. D. Lodhi ◽  
Nikos Kontopodis ◽  
Christos V. Ioannou ◽  
Nicos Labropoulos ◽  
...  

Summary: Background: The correct diagnosis of internal carotid artery (ICA) occlusion is crucial as it limits unnecessary intervention, whereas correct identification of patients with severe ICA stenosis is paramount in decision making and selecting patients who would benefit from intervention. We aimed to evaluate the accuracy of ultrasonography (US) in the diagnosis of ICA occlusion. Methods: We conducted a systematic review in compliance with the Preferred Reporting Items for a Systematic Review and Meta-analysis (PRISMA) of diagnostic test accuracy studies. We interrogated electronic bibliographic sources using a combination of free text and thesaurus terms to identify studies assessing the diagnostic accuracy of US in ICA occlusion. We used a mixed-effects logistic regression bivariate model to estimate summary sensitivity and specificity. We developed hierarchical summary receiver operating characteristic (HSROC) curves. Results: We identified 23 studies reporting a total of 5,675 arteries of which 722 were proven to be occluded by the reference standard. The reference standard was digital subtraction or cerebral angiography in all but two studies, which used surgery to ascertain a carotid occlusion. The pooled estimates for sensitivity and specificity were 0.97 (95% confidence interval (CI) 0.94 to 0.99) and 0.99 (95% CI 0.98 to 1.00), respectively. The diagnostic odds ratio was 3,846.15 (95% CI 1,375.74 to 10,752.65). The positive and negative likelihood ratio were 114.71 (95% CI 58.84 to 223.63) and 0.03 (95% CI 0.01 to 0.06), respectively. Conclusions: US is a reliable and accurate method in diagnosing ICA occlusion. US can be used as a screening tool with cross-sectional imaging being reserved for ambiguous cases.
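
The likelihood ratios and the diagnostic odds ratio reported above are functions of sensitivity and specificity. The sketch below shows the standard definitions; the function name is hypothetical. Plugging in rounded pooled estimates need not reproduce the reported ratios, which come from the fitted model rather than from rounded values.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-, diagnostic odds ratio) for a binary test."""
    lr_pos = sensitivity / (1 - specificity)   # how much a positive result raises the odds
    lr_neg = (1 - sensitivity) / specificity   # how much a negative result lowers the odds
    dor = lr_pos / lr_neg                      # odds ratio of a positive result, diseased vs. not
    return lr_pos, lr_neg, dor


# Hypothetical accuracy values, for illustration only
lr_pos, lr_neg, dor = likelihood_ratios(0.90, 0.80)
print(f"LR+={lr_pos:.2f}  LR-={lr_neg:.3f}  DOR={dor:.1f}")
```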


10.2196/16658 ◽  
2020 ◽  
Vol 22 (5) ◽  
pp. e16658
Author(s):  
Choon Han Tan ◽  
Bhone Myint Kyaw ◽  
Helen Smith ◽  
Colin S Tan ◽  
Lorainne Tudor Car

Background Diabetic retinopathy (DR), a common complication of diabetes mellitus, is the leading cause of impaired vision in adults worldwide. Smartphone ophthalmoscopy involves using a smartphone camera for digital retinal imaging. Utilizing smartphones to detect DR is potentially more affordable, accessible, and easier to use than conventional methods. Objective This study aimed to determine the diagnostic accuracy of various smartphone ophthalmoscopy approaches for detecting DR in diabetic patients. Methods We performed an electronic search on the Medical Literature Analysis and Retrieval System Online (MEDLINE), EMBASE, and Cochrane Library for literature published from January 2000 to November 2018. We included studies involving diabetic patients, which compared the diagnostic accuracy of smartphone ophthalmoscopy for detecting DR to an accurate or commonly employed reference standard, such as indirect ophthalmoscopy, slit-lamp biomicroscopy, and tabletop fundus photography. Two reviewers independently screened studies against the inclusion criteria, extracted data, and assessed the quality of included studies using the Quality Assessment of Diagnostic Accuracy Studies–2 tool, with disagreements resolved via consensus. Sensitivity and specificity were pooled using the random effects model. A summary receiver operating characteristic (SROC) curve was constructed. This review is reported in line with the Preferred Reporting Items for a Systematic Review and Meta-analysis of Diagnostic Test Accuracy Studies guidelines. Results In all, nine studies involving 1430 participants were included. Most studies were of high quality, except one study with limited applicability because of its reference standard. 
The pooled sensitivity and specificity were, respectively, 87% (95% CI 74%-94%) and 94% (95% CI 81%-98%) for any DR; 39% (95% CI 10%-79%) and 95% (95% CI 91%-98%) for mild nonproliferative DR (NPDR); 71% (95% CI 57%-81%) and 95% (95% CI 88%-98%) for moderate NPDR; 80% (95% CI 49%-94%) and 97% (95% CI 88%-99%) for severe NPDR; 92% (95% CI 79%-97%) and 99% (95% CI 96%-99%) for proliferative DR (PDR); 79% (95% CI 63%-89%) and 93% (95% CI 82%-97%) for diabetic macular edema; and 91% (95% CI 86%-94%) and 89% (95% CI 56%-98%) for referral-warranted DR. The area under the SROC curve ranged from 0.879 to 0.979. The diagnostic odds ratio ranged from 11.3 to 1225. Conclusions We found heterogeneous evidence showing that smartphone ophthalmoscopy performs well in detecting DR. The diagnostic accuracy for PDR was highest. Future studies should standardize reference criteria and classification criteria and evaluate other available forms of smartphone ophthalmoscopy in primary care settings.


2021 ◽  
Author(s):  
Victoria Nyawira Nyaga ◽  
Marc Arbyn

Abstract Background Although statistical procedures for pooling of several epidemiological metrics are generally available in statistical packages, those for meta-analysis of diagnostic test accuracy studies including options for multivariate regression are lacking. Fitting regression models and the processing of the estimates often entails lengthy and tedious calculations. Therefore, packaging appropriate statistical procedures in a robust and user-friendly program is of great interest to the scientific community. Methods metadta is a statistical program for pooling of diagnostic accuracy test data in Stata. It implements both the bivariate random-effects and fixed-effects model, allows for meta-regression, and presents the results in tables, a forest plot and/or summary receiver operating characteristic (SROC) plot. For a model without covariates, it also quantifies heterogeneity using an I² statistic that accounts for the mean-variance relationship and the correlation between sensitivity and specificity, a typical characteristic of diagnostic data. To demonstrate metadta, we applied the program to two published meta-analyses: 1) the sensitivity and specificity of cytology and other markers including telomerase for primary diagnosis of bladder cancer; and 2) the accuracy of human papillomavirus testing on self-collected versus clinician-collected samples to detect cervical precancer. Results Without requiring a continuity correction, metadta generated a pooled sensitivity and specificity of 0.77 [95% CI: 0.70, 0.82] and 0.91 [95% CI: 0.75, 0.97], respectively, for telomerase in the diagnosis of primary bladder cancer. metadta made it possible to assess the relative accuracy of human papillomavirus (HPV) testing on self- versus clinician-taken specimens in matched studies, taking into account two covariates. Provided that assays based on target amplification were used, HPV tests were similarly sensitive in detecting cervical precancer, irrespective of clinical setting. 
Conclusion The metadta program implements state-of-the-art statistical procedures in an attempt to close the gap between methodological statisticians and systematic reviewers. With metadta, we hope to further popularize the use of appropriate statistical methods for diagnostic meta-analysis.


Methodology ◽  
2020 ◽  
Vol 16 (3) ◽  
pp. 258-277
Author(s):  
Johny J. Pambabay-Calero ◽  
Sergio A. Bauz-Olvera ◽  
Ana B. Nieto-Librero ◽  
Maria Purificación Galindo-Villardón ◽  
Ana B. Sánchez-García

Although measures such as sensitivity and specificity are used in the study of diagnostic test accuracy, these are not appropriate for integrating heterogeneous studies. Therefore, it is essential to assess in detail all related aspects prior to integrating a set of studies so that the correct model can then be selected. This work describes the scheme employed for making decisions regarding the use of the R, STATA and SAS statistical programs. We used the R Program Meta-Analysis of Diagnostic Accuracy package for determining the correlation between sensitivity and specificity. This package considers fixed, random and mixed effects models and provides excellent summaries and assesses heterogeneity. For selecting various cutoff points in the meta-analysis, we used the STATA module for meta-analytical integration of diagnostic test accuracy studies, which produces bivariate outputs for heterogeneity.


2021 ◽  
Vol 184 (2) ◽  
pp. E5-E9
Author(s):  
Alice J Sitch ◽  
Olaf M Dekkers ◽  
Barnaby R Scholefield ◽  
Yemisi Takwoingi

Diagnostic accuracy studies are fundamental for the assessment of diagnostic tests. Researchers need to understand the implications of their chosen design, opting for comparative designs where possible. They should analyse test accuracy studies using appropriate methods, acknowledge the uncertainty of results, avoid overstating conclusions, and take account of the clinical situation, which should inform the trade-off between sensitivity and specificity. Test accuracy studies should be reported transparently using the STAndards for the Reporting of Diagnostic accuracy studies (STARD) checklist.


2021 ◽  
Author(s):  
Alfred Kipyegon Keter ◽  
Lutgarde Lynen ◽  
Alastair van Heerden ◽  
Els Goetghebeur ◽  
Bart K.M. Jacobs

Abstract Background Lack of a perfect reference standard for pulmonary tuberculosis (PTB) diagnosis complicates assessment of accuracy of new diagnostic tests. Alternative strategies such as discrepant resolution and use of composite reference standards may lead to incorrect inferences on disease prevalence and diagnostic test sensitivity and specificity. Latent class analysis (LCA), a statistical method for analyzing diagnostic test results in the absence of a gold standard, allows correct estimation under strict assumptions. The model assumes that the diagnostic tests are independent conditional on the true disease status and that the diagnostic test sensitivity and specificity remain constant across subpopulations. These assumptions are violated when a factor such as severe comorbidity affects the prevalence and/or alters the diagnostic test performance. We aim to provide guidance on correct estimation of the prevalence and diagnostic test accuracy based on LCA when a known factor induces dependence among the diagnostic tests. If unaccounted for, this dependence may lead to misleading inferences. Methods Through likelihood evaluation and simulation we examined implications of likely model violations on estimation of prevalence, sensitivity and specificity among passive case-finding presumptive PTB patients with or without HIV. We generated independent results for five diagnostic tests conditional on PTB and HIV. We performed Bayesian LCA, separately for five and three diagnostic tests using four working models with or without constant PTB prevalence and diagnostic test accuracy across HIV subpopulations. Results In evaluating three diagnostic tests, the models accounting for heterogeneity in diagnostic accuracy produced consistent estimates while the models ignoring it produced biased estimates. The model ignoring heterogeneity in PTB prevalence is less problematic. When evaluating five diagnostic tests, the models were robust to violation of the assumptions. 
Conclusions Well-chosen covariate-specific adaptations of the model can avoid bias implied by recognized heterogeneity in PTB patient populations generating otherwise dependent test results in LCA.
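
The conditional independence assumption behind LCA can be made concrete with a small sketch. It computes the probability of an observed pattern of test results in a two-class model, first with one set of parameters and then as a mixture over two strata with their own parameters (standing in for covariate-specific adaptation). All function names and numeric values are hypothetical; this is not the authors' Bayesian implementation.

```python
from itertools import product


def pattern_probability(pattern, prev, se, sp):
    """P(pattern) in a two-class latent class model where tests are
    independent given true disease status. pattern is a tuple of 0/1."""
    p_diseased, p_healthy = prev, 1 - prev
    for result, se_j, sp_j in zip(pattern, se, sp):
        p_diseased *= se_j if result else (1 - se_j)
        p_healthy *= (1 - sp_j) if result else sp_j
    return p_diseased + p_healthy


def stratified_probability(pattern, strata):
    """Mixture over strata, each given as (weight, prev, se, sp).
    Within a stratum the tests are conditionally independent."""
    return sum(w * pattern_probability(pattern, prev, se, sp)
               for w, prev, se, sp in strata)


# Hypothetical parameters for three tests in two strata (e.g. by HIV status)
strata = [
    (0.4, 0.30, [0.60, 0.70, 0.80], [0.98, 0.95, 0.90]),
    (0.6, 0.15, [0.80, 0.85, 0.90], [0.98, 0.95, 0.90]),
]
for pattern in product((0, 1), repeat=3):
    print(pattern, round(stratified_probability(pattern, strata), 4))
```

A model that forces a single prevalence and a single set of sensitivities onto the pooled population has fewer parameters than this mixture. That restriction is the kind of constraint the abstract identifies as a source of bias when accuracy truly differs between subpopulations.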


2020 ◽  
Author(s):  
Zoë Tieges ◽  
Alasdair M J Maclullich ◽  
Atul Anand ◽  
Claire Brookes ◽  
Marica Cassarino ◽  
...  

Abstract Objective Detection of delirium in hospitalised older adults is recommended in national and international guidelines. The 4 ‘A’s Test (4AT) is a short (<2 minutes) instrument for delirium detection that is used internationally as a standard tool in clinical practice. We performed a systematic review and meta-analysis of diagnostic test accuracy of the 4AT for delirium detection. Methods We searched MEDLINE, EMBASE, PsycINFO, CINAHL, clinicaltrials.gov and the Cochrane Central Register of Controlled Trials, from 2011 (year of 4AT release on the website www.the4AT.com) until 21 December 2019. Inclusion criteria were: older adults (≥65 years); diagnostic accuracy study of the 4AT index test when compared to delirium reference standard (standard diagnostic criteria or validated tool). Methodological quality was assessed using the Quality Assessment of Diagnostic Accuracy Studies-2 tool. Pooled estimates of sensitivity and specificity were generated from a bivariate random effects model. Results Seventeen studies (3,702 observations) were included. Settings were acute medicine, surgery, a care home and the emergency department. Three studies assessed performance of the 4AT in stroke. The overall prevalence of delirium was 24.2% (95% CI 17.8–32.1%; range 10.5–61.9%). The pooled sensitivity was 0.88 (95% CI 0.80–0.93) and the pooled specificity was 0.88 (95% CI 0.82–0.92). Excluding the stroke studies, the pooled sensitivity was 0.86 (95% CI 0.77–0.92) and the pooled specificity was 0.89 (95% CI 0.83–0.93). The methodological quality of studies varied but was moderate to good overall. Conclusions The 4AT shows good diagnostic test accuracy for delirium in the 17 available studies. These findings support its use in routine clinical practice in delirium detection. PROSPERO Registration number CRD42019133702.
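
Because the prevalence of delirium varied widely across the included studies, the meaning of a positive or negative 4AT result in a given setting depends on that setting's prevalence. The sketch below applies Bayes' theorem to the pooled sensitivity and specificity at three prevalence values taken from the abstract (the lower bound, overall estimate and upper bound of the reported range). The function name is hypothetical, and the resulting predictive values are illustrative rather than results reported by the review.

```python
def predictive_values(prev, sensitivity, specificity):
    """Return (PPV, NPV) for a binary test at a given prevalence."""
    tp = prev * sensitivity               # true positives per person tested
    fp = (1 - prev) * (1 - specificity)   # false positives
    fn = prev * (1 - sensitivity)         # false negatives
    tn = (1 - prev) * specificity         # true negatives
    return tp / (tp + fp), tn / (tn + fn)


# Pooled Se and Sp from the abstract, applied across the reported prevalence range
for prev in (0.105, 0.242, 0.619):
    ppv, npv = predictive_values(prev, 0.88, 0.88)
    print(f"prevalence={prev:.3f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```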


2021 ◽  
Vol 50 (Supplement_1) ◽  
pp. i7-i11
Author(s):  
Z Tieges ◽  
A M J MacLullich ◽  
A Anand ◽  
M Cassarino ◽  
M O'Connor ◽  
...  

Abstract Introduction Detection of delirium in hospitalised older adults is recommended in national and international guidelines. The 4 ‘A’s Test (4AT; www.the4AT.com) is a short (<2 min) instrument for delirium detection that is used internationally as a standard tool in clinical practice. We performed a systematic review and meta-analysis of diagnostic test accuracy of the 4AT for delirium detection. Methods We searched the following electronic databases through Ovid: MEDLINE, Embase, and PsycINFO. Additional databases were searched: CINAHL (EBSCOhost), clinicaltrials.gov and Cochrane Central Register of Controlled Trials from 2011 (4AT publication) until 21 December 2019. Inclusion criteria: older adults (≥65) across any setting of care except critical care; validation study of the 4AT against a delirium reference standard (standard diagnostic criteria or validated tool). Two reviewers independently screened abstracts and papers and performed the data extraction. Pooled estimates of sensitivity and specificity were generated from a bivariate random effects model. Results 17 studies (n = 3,701 observations) were included. Various settings including acute medicine, surgery, stroke wards and the emergency department were represented. The overall prevalence of delirium was 24.2% (95% CI 17.8–32.1%; range 10.5–61.9%). The pooled sensitivity was 0.88 (95% CI 0.80–0.93) and the pooled specificity was 0.88 (95% CI 0.82–0.92). The methodological quality of studies was mostly good. Conclusions The 4AT is now supported by a substantial evidence base comparable to other well-studied tools such as the Confusion Assessment Method (CAM). The strong pooled sensitivity and specificity findings for the 4AT in this meta-analysis along with its brevity and lack of need for specific training provide support for its use as an effective assessment tool for delirium.

