How does under-reporting of negative and inconclusive results affect the false-positive rate in meta-analysis? A simulation study

BMJ Open ◽  
2014 ◽  
Vol 4 (8) ◽  
pp. e004831-e004831 ◽  
Author(s):  
M. Kicinski


2019 ◽  
Author(s):  
Amanda Kvarven ◽  
Eirik Strømland ◽  
Magnus Johannesson

Andrews & Kasy (2019) propose an approach for adjusting effect sizes in meta-analysis for publication bias. We use the Andrews-Kasy estimator to adjust the results of 15 meta-analyses and compare the adjusted results to 15 large-scale multiple-labs replication studies estimating the same effects. The pre-registered replications provide precisely estimated effect sizes that do not suffer from publication bias. The Andrews-Kasy approach moderately reduces the inflated effect sizes in the meta-analyses. However, it still overestimates effect sizes by a factor of about two or more and has an estimated false-positive rate of between 57% and 100%.
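The inflation mechanism being corrected for can be illustrated with a toy simulation (this is not the Andrews-Kasy estimator itself; the true effect, standard error, and significance cutoff below are arbitrary assumptions): when only statistically significant estimates are published, the naive mean of published effects overshoots the truth.

```python
import numpy as np

# Toy illustration of publication bias, NOT the Andrews-Kasy estimator:
# simulate many studies of a small true effect, "publish" only the
# significant ones, and compare the naive mean to the true effect.
rng = np.random.default_rng(0)
true_effect, se, n_studies = 0.1, 0.15, 100_000

estimates = rng.normal(true_effect, se, n_studies)  # per-study effect estimates
published = estimates[estimates / se > 1.96]        # one-sided significance filter

naive_mean = published.mean()
print(f"true effect: {true_effect:.2f}, mean of published estimates: {naive_mean:.2f}")
```

With these assumed parameters the published-only mean comes out roughly three times the true effect, the same qualitative pattern the replication comparison documents.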


2018 ◽  
Author(s):  
Qianying Wang ◽  
Jing Liao ◽  
Kaitlyn Hair ◽  
Alexandra Bannach-Brown ◽  
Zsanett Bahor ◽  
...  

Abstract
Background: Meta-analysis is increasingly used to summarise the findings identified in systematic reviews of animal studies modelling human disease. Such reviews typically identify a large number of individually small studies, testing efficacy under a variety of conditions. This leads to substantial heterogeneity, and identifying potential sources of this heterogeneity is an important function of such analyses. However, the statistical performance of different approaches (normalised compared with standardised mean difference estimates of effect size; stratified meta-analysis compared with meta-regression) is not known.
Methods: We used data from 3116 experiments in focal cerebral ischaemia to construct a linear model predicting observed improvement in outcome contingent on 25 independent variables. We used stochastic simulation to attribute these variables to simulated studies according to their prevalence. To ascertain the ability to detect an effect of a given variable, we additionally introduced a "variable of interest" of given prevalence and effect. To establish any impact of a latent variable on the apparent influence of the variable of interest, we also introduced a "latent confounding variable" with given prevalence and effect, and allowed the prevalence of the variable of interest to differ in the presence and absence of the latent variable.
Results: Generally, the normalised mean difference (NMD) approach had higher statistical power than the standardised mean difference (SMD) approach. Even when the effect size and the number of studies contributing to the meta-analysis were small, there was good statistical power to detect the overall effect, with a low false-positive rate. For detecting an effect of the variable of interest, stratified meta-analysis was associated with a substantial false-positive rate with NMD estimates of effect size, while using an SMD estimate of effect size had very low statistical power. Univariate and multivariable meta-regression performed substantially better, with a low false-positive rate for both NMD and SMD approaches; power was higher for NMD than for SMD. The presence or absence of a latent confounding variable only introduced an apparent effect of the variable of interest when there was substantial asymmetry in the prevalence of the variable of interest in the presence or absence of the confounding variable.
Conclusions: In meta-analysis of data from animal studies, NMD estimates of effect size should be used in preference to SMD estimates, and meta-regression should, where possible, be chosen over stratified meta-analysis. The power to detect the influence of the variable of interest depends on its effect and prevalence, but unless effects are very large, adequate power is only achieved once at least 100 experiments are included in the meta-analysis.
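For readers unfamiliar with the two effect-size metrics being compared, a minimal sketch of common definitions (the NMD definition here, percent change relative to the control mean, is one widely used variant; the infarct-volume numbers in the usage note are invented):

```python
import math

def smd(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardised mean difference (Cohen's d with pooled SD)."""
    pooled_sd = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2)
                          / (n_t + n_c - 2))
    return (mean_t - mean_c) / pooled_sd

def nmd(mean_t, mean_c):
    """Normalised mean difference: treatment effect expressed as a
    percentage of the control mean (scale-free across outcome measures)."""
    return 100.0 * (mean_t - mean_c) / mean_c
```

With invented infarct volumes of 30 (treated) vs 40 (control), SD 8 in both arms, and 10 animals per group, `smd` gives −1.25 and `nmd` gives −25, i.e. a 25% improvement relative to control. The SMD depends on the observed SDs, while the NMD does not, which is one root of their differing statistical behaviour.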


2017 ◽  
Vol 52 (12) ◽  
pp. 1168-1170 ◽  
Author(s):  
Zachary K. Winkelmann ◽  
Ashley K. Crossway

Reference/Citation:  Harmon KG, Zigman M, Drezner JA. The effectiveness of screening history, physical exam, and ECG to detect potentially lethal cardiac disorders in athletes: a systematic review/meta-analysis. J Electrocardiol. 2015;48(3):329–338. Clinical Question:  Which screening method should be considered best practice to detect potentially lethal cardiac disorders during the preparticipation physical examination (PE) of athletes? Data Sources:  The authors completed a comprehensive literature search of MEDLINE, CINAHL, Cochrane Library, Embase, Physiotherapy Evidence Database (PEDro), and SPORTDiscus from January 1996 to November 2014. The following key words were used individually and in combination: ECG, athlete, screening, pre-participation, history, and physical. A manual review of reference lists and key journals was performed to identify additional studies. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were followed for this review. Study Selection:  Studies selected for this analysis involved (1) outcomes of cardiovascular screening in athletes using the history, PE, and electrocardiogram (ECG); (2) history questions and PE based on the American Heart Association recommendations and guidelines; and (3) ECGs interpreted following modern standards. The exclusion criteria were (1) articles not in English, (2) conference abstracts, and (3) clinical commentary articles. Study quality was assessed on a 7-point scale for risk of bias; a score of 7 indicated the highest quality. Articles with potential bias were excluded. Data Extraction:  Data included number and sex of participants, number of true- and false-positives and negatives, type of ECG criteria used, number of cardiac abnormalities, and specific cardiac conditions. The sensitivity, specificity, false-positive rate, and positive predictive value of each screening tool were calculated and summarized using a bivariate random-effects meta-analysis model. 
Main Results:  Fifteen articles reporting on 47 137 athletes were fully reviewed. The overall quality of the 15 articles ranged from 5 to 7 on the 7-item assessment scale (ie, participant selection criteria, representative sample, prospective data with at least 1 positive finding, modern ECG criteria used for screening, cardiovascular screening history and PE per American Heart Association guidelines, individual test outcomes reported, and abnormal screening findings evaluated by appropriate diagnostic testing). The athletes (66% males and 34% females) were ethnically and racially diverse, were from several countries, and ranged in age from 5 to 39 years. The sensitivity and specificity of the screening methods were, respectively, ECG, 94% and 93%; history, 20% and 94%; and PE, 9% and 97%. The overall false-positive rate for ECG (6%) was less than that for history (8%) or PE (10%). The positive likelihood ratios of each screening method were 14.8 for ECG, 3.22 for history, and 2.93 for PE. The negative likelihood ratios were 0.055 for ECG, 0.85 for history, and 0.93 for PE. A total of 160 potentially lethal cardiovascular conditions were detected, for a rate of 0.3%, or 1 in 294 patients. The most common conditions were Wolff-Parkinson-White syndrome (n = 67, 42%), long QT syndrome (n = 18, 11%), hypertrophic cardiomyopathy (n = 18, 11%), dilated cardiomyopathy (n = 11, 7%), coronary artery disease or myocardial ischemia (n = 9, 6%), and arrhythmogenic right ventricular cardiomyopathy (n = 4, 3%). Conclusions:  The most effective strategy to screen athletes for cardiovascular disease was ECG. This test was 5 times more sensitive than history and 10 times more sensitive than PE, and it had a higher positive likelihood ratio, lower negative likelihood ratio, and lower false-positive rate than history or PE. 
The 12-lead ECG interpreted using modern criteria should be considered the best practice in screening athletes for cardiovascular disease, and the use of history and PE alone as screening tools should be reevaluated.
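The likelihood ratios quoted above follow directly from sensitivity and specificity; a quick sketch (applying it to the rounded percentages in the abstract reproduces the paper's pooled ratios only approximately, since those came from a bivariate random-effects model rather than the rounded summaries):

```python
def likelihood_ratios(sensitivity, specificity):
    """LR+ = sens / (1 - spec); LR- = (1 - sens) / spec."""
    return sensitivity / (1 - specificity), (1 - sensitivity) / specificity

# ECG: 94% sensitivity, 93% specificity, per the abstract
lr_pos, lr_neg = likelihood_ratios(0.94, 0.93)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.3f}")  # LR+ = 13.4, LR- = 0.065
```

The same two inputs also give the false-positive rate directly, as 1 − specificity (7% here from the rounded figures, vs the 6% pooled estimate reported).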


2020 ◽  
Author(s):  
Se Jin Cho ◽  
Leonard Sunwoo ◽  
Sung Hyun Baik ◽  
Yun Jung Bae ◽  
Byung Se Choi ◽  
...  

Abstract
Background: Accurate detection of brain metastasis (BM) is important for cancer patients. We aimed to systematically review the performance and quality of machine-learning-based BM detection on MRI in the relevant literature.
Methods: A systematic literature search was performed for relevant studies reported before April 27, 2020. We assessed the quality of the studies using modified tailored questionnaires of the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) criteria and the Checklist for Artificial Intelligence in Medical Imaging (CLAIM). Pooled detectability was calculated using an inverse-variance weighting model.
Results: A total of 12 studies were included, which showed a clear transition from classical machine learning (cML) to deep learning (DL) after 2018. The studies on DL used a larger sample size than those on cML. The cML and DL groups also differed in the composition of the dataset and in technical details such as data augmentation. The pooled proportions of detectability of BM were 88.7% (95% CI, 84-93%) and 90.1% (95% CI, 84-95%) in the cML and DL groups, respectively. The false-positive rate per person was lower in the DL group than in the cML group (10 vs 135, P < 0.001). In the patient selection domain of QUADAS-2, three studies (25%) were designated as high risk due to non-consecutive enrollment and arbitrary exclusion of nodules.
Conclusion: A comparable detectability of BM with a lower false-positive rate per person was found in the DL group compared with the cML group. Improvements are required in terms of quality and study design.
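Fixed-effect inverse-variance pooling of proportions, of the kind named in the Methods, can be sketched as follows (the binomial normal approximation and the per-study counts in the example are assumptions for illustration; the review's actual model details are not given in the abstract):

```python
import numpy as np

def pool_proportions(events, totals):
    """Fixed-effect inverse-variance pooled proportion with a 95% CI,
    using the binomial normal approximation for each study's variance."""
    events, totals = np.asarray(events, float), np.asarray(totals, float)
    p = events / totals
    var = p * (1 - p) / totals        # requires 0 < p < 1 in every study
    weights = 1.0 / var               # precise studies get more weight
    pooled = (weights * p).sum() / weights.sum()
    se = np.sqrt(1.0 / weights.sum())
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se)

# Invented per-study counts (metastases detected / metastases present)
pooled, ci = pool_proportions([45, 90, 170], [50, 100, 200])
```

Each study's proportion is weighted by the inverse of its variance, so larger or more precise studies dominate the pooled detectability, which is why the DL studies' larger samples matter.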


2021 ◽  
Author(s):  
Bihan Li ◽  
Zhiyong Sun ◽  
Ying Liu ◽  
Jinxia Song ◽  
Lijuan Wang ◽  
...  

Abstract
Purpose: Severe combined immunodeficiency (SCID) threatens newborns' quality of life. The aim of our study was to evaluate the value of detecting the copy number of T-cell receptor excision circles (TRECs) in newborn screening for SCID.
Methods: We searched for eligible studies in PubMed, Web of Science, EMBASE, the Cochrane Library, and the China National Knowledge Infrastructure (CNKI), and used EndNote to screen the retrieved studies. Meta-analysis was conducted with Meta-DiSc 1.4 and STATA 12.0. We used sensitivity and the false-positive rate (FPR) to evaluate the screening method, and subgroup analysis to explore sources of heterogeneity. Literature quality was assessed with the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool.
Results: Fifteen studies were included, comprising 3,570 cases in the decreased-TREC-copy-number group and 12,186,899 cases in the control group. All included studies were of high quality. Heterogeneity in the diagnostic odds ratio (DOR) was 77.9%. The summary estimates showed a pooled sensitivity of 100% (95% confidence interval (CI): 99-100%), a false-positive rate of 0.00 (95% CI: 0.00-0.00), and a positive predictive value (PPV) of 0.14 (95% CI: 0.08-0.19). The current evidence shows that detecting TREC copy number has screening value, with high sensitivity and a high positive likelihood ratio (LR+) for newborn SCID, without publication bias (p > 0.05). Our results suggest 90 and 40 copies/µL of TRECs as the primary and secondary cut-off values.
Conclusion: Detecting individual TREC content could be used as a basis for diagnosing SCID in newborns.
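The gap between near-perfect sensitivity and a PPV of only 0.14 is a base-rate effect; a minimal sketch via Bayes' rule (the prevalence and false-positive rate in the example are invented round numbers, not the study's estimates):

```python
def ppv(sensitivity, fpr, prevalence):
    """Positive predictive value: P(disease | positive screen)."""
    true_pos = sensitivity * prevalence
    false_pos = fpr * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Even a perfectly sensitive test has a low PPV when disease is rare.
# Illustrative numbers only: assumed prevalence 1 in 50,000, FPR 0.01%.
print(ppv(1.0, 0.0001, 1 / 50_000))  # roughly 0.17
```

Because unaffected newborns vastly outnumber affected ones, even a tiny false-positive rate produces false positives on the same order as the true positives, capping the PPV well below 1.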


2008 ◽  
Vol 4;11 (8;4) ◽  
pp. 513-538
Author(s):  
Lee Wolfer

Background: Lumbar provocation discography is a controversial diagnostic test. Currently, there is concern that the test has an unacceptably high false-positive rate. Study Design: Systematic review and meta-analysis. Objective: To perform a systematic review of lumbar discography studies in asymptomatic subjects and discs, with a meta-analysis of the specificity and false-positive rate of lumbar discography. Methods: A systematic review of the literature was conducted via a PubMed search. Studies were included or excluded according to modern discography practices. Study quality was scored using the Agency for Healthcare Research and Quality (AHRQ) instrument for diagnostic accuracy. Specific data were extracted from studies and tabulated per published criteria and standards to determine the false-positive rates. A meta-analysis of specificity was performed. Strength of evidence was rated according to the AHRQ U.S. Preventive Services Task Force (USPSTF) criteria. Results: Eleven studies were identified. Combining all extractable data, a false-positive rate of 9.3% per patient and 6.0% per disc is obtained. Data pooled from asymptomatic subjects without low back pain or confounding factors show a false-positive rate of 3.0% per patient and 2.1% per disc. In data pooled from chronic pain patients without low back pain, the false-positive rate is 5.6% per patient and 3.85% per disc. Chronic pain does not appear to be a confounding factor in a chronic low back pain patient's ability to distinguish between positive (pathologic) and negative (non-pathologic) discs. Among additional asymptomatic patient subgroups analyzed, the false-positive rates per patient and per disc were as follows: iliac crest pain, 12.5% and 7.1%; chronic neck pain, 0%; somatization disorder, 50% and 22.2%; and post-discectomy, 15% and 9.1%, respectively. In patients with chronic backache, no false-positive rate could be calculated. Low-pressure positive criteria (≤ 15 psi a.o.) 
can yield a low false-positive rate. Based on meta-analysis of the data, using the ISIS standard, discography has a specificity of 0.94 (95% CI 0.88-0.98) and a false-positive rate of 0.06. Conclusions: Strength of evidence is level II-2 based on the Agency for Healthcare Research and Quality (AHRQ) USPSTF criteria for the diagnostic accuracy of discography. Contrary to recently published studies, discography has a low false-positive rate for the diagnosis of discogenic pain. Key words: meta-analysis, lumbar discography, false-positive, asymptomatic subjects
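The abstract reports false-positive rates both per patient and per disc. Under a simplifying independence assumption (mine, not the authors'), the two are related by the probability of at least one false-positive disc per patient:

```python
def per_patient_fpr(per_disc_fpr, discs_tested):
    """P(at least one false-positive disc in a patient), assuming
    independent false-positive events across discs (a simplification)."""
    return 1.0 - (1.0 - per_disc_fpr) ** discs_tested

# With the pooled per-disc rate of 6.0% and, say, three discs tested:
print(per_patient_fpr(0.06, 3))  # about 0.17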


2002 ◽  
Vol 41 (01) ◽  
pp. 37-41 ◽  
Author(s):  
S. Shung-Shung ◽  
S. Yu-Chien ◽  
Y. Mei-Due ◽  
W. Hwei-Chung ◽  
A. Kao

Summary
Aim: Even with careful observation, the overall false-positive rate of laparotomy remains 10-15% when acute appendicitis is suspected. We therefore assessed the clinical efficacy of the Tc-99m HMPAO-labelled leukocyte (TC-WBC) scan for the diagnosis of acute appendicitis in patients presenting with atypical clinical findings.
Patients and Methods: Eighty patients presenting with acute abdominal pain and possible acute appendicitis but atypical findings were included in this study. After intravenous injection of TC-WBC, serial anterior abdominal/pelvic images at 30, 60, 120 and 240 min with 800k counts were obtained with a gamma camera. Any abnormal localization of radioactivity in the right lower quadrant of the abdomen, equal to or greater than bone marrow activity, was considered a positive scan.
Results: 36 of 49 patients with positive TC-WBC scans underwent appendectomy; all proved to have positive pathological findings. Five positive TC-WBC scans were unrelated to acute appendicitis, being due to other pathological lesions. Eight patients were not operated on, and clinical follow-up after one month revealed no acute abdominal condition. Three of 31 patients with negative TC-WBC scans underwent appendectomy; they also had positive pathological findings. The remaining 28 patients were not operated on and showed no evidence of appendicitis after at least one month of follow-up. The overall sensitivity, specificity, accuracy, positive and negative predictive values of the TC-WBC scan for diagnosing acute appendicitis were 92, 78, 86, 82, and 90%, respectively.
Conclusion: The TC-WBC scan provides a rapid and highly accurate method for the diagnosis of acute appendicitis in patients with equivocal clinical examinations. It proved useful in reducing the false-positive rate of laparotomy and shortening the time necessary for clinical observation.
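Most of the reported figures can be reconstructed from the counts in the abstract. A sketch, reading the 2×2 table as TP = 36, FP = 8 (positive scans, not operated, no disease on follow-up), FN = 3, TN = 28, and setting aside the 5 positives explained by other pathology (this is my reading of the abstract, not the authors' stated table, and it reproduces accuracy as 85% rather than the reported 86%, presumably a rounding or tabulation difference):

```python
def screening_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, accuracy, PPV and NPV from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy":    (tp + tn) / (tp + fp + fn + tn),
        "ppv":         tp / (tp + fp),
        "npv":         tn / (tn + fn),
    }

m = screening_metrics(tp=36, fp=8, fn=3, tn=28)
# roughly 92%, 78%, 85%, 82%, 90% -- close to the reported values
```
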


1993 ◽  
Vol 32 (02) ◽  
pp. 175-179 ◽  
Author(s):  
B. Brambati ◽  
T. Chard ◽  
J. G. Grudzinskas ◽  
M. C. M. Macintosh

Abstract: The analysis of the clinical efficiency of a biochemical parameter in the prediction of chromosome anomalies is described, using a database of 475 cases including 30 abnormalities. Two approaches to the statistical analysis were compared: Gaussian frequency distributions with likelihood ratios, and logistic regression. Both methods computed that, for a 5% false-positive rate, approximately 60% of anomalies are detected on the basis of maternal age and serum PAPP-A. Logistic regression is appropriate where the outcome variable (chromosome anomaly) is binary, and its detection rates refer to the original data only. The likelihood ratio method is used to predict the outcome in the general population; it depends on the data, or some transformation of the data, fitting a known frequency distribution (Gaussian in this case). The precision of the predicted detection rates is limited by the small sample of abnormal cases (30). Varying the means and standard deviations of the fitted log Gaussian distributions to the limits of their 95% confidence intervals resulted in a detection rate varying between 42% and 79% for a 5% false-positive rate. Thus, although the likelihood ratio method is potentially the better method for determining the usefulness of a test in the general population, larger numbers of abnormal cases are required to stabilise the means and standard deviations of the fitted log Gaussian distributions.
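The Gaussian screening calculation described above amounts to fixing a cutoff at the 95th percentile of the unaffected distribution and asking what fraction of affected cases fall beyond it. A sketch with invented distribution parameters (the paper's fitted log-Gaussian means and SDs are not given in the abstract, and the direction of abnormality is assumed to be upward here):

```python
from statistics import NormalDist

def detection_rate(affected, unaffected, fpr=0.05):
    """Fraction of affected cases flagged when the cutoff is set so that
    exactly `fpr` of unaffected cases screen positive (higher = abnormal)."""
    cutoff = unaffected.inv_cdf(1.0 - fpr)
    return 1.0 - affected.cdf(cutoff)

# Invented example: marker averages 1.5 SD higher in affected pregnancies
rate = detection_rate(NormalDist(1.5, 1.0), NormalDist(0.0, 1.0))
print(f"{rate:.0%} detected at a 5% false-positive rate")
```

This also makes the paper's sensitivity analysis concrete: shifting the affected distribution's mean or SD within its confidence limits moves the detection rate substantially, which is why only 30 abnormal cases yields the wide 42-79% range.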

