Identification of and Correction for Publication Bias: Comment

2019
Author(s): Amanda Kvarven, Eirik Strømland, Magnus Johannesson

Andrews & Kasy (2019) propose an approach for adjusting effect sizes in meta-analysis for publication bias. We use the Andrews-Kasy estimator to adjust the results of 15 meta-analyses and compare the adjusted results to 15 large-scale multiple labs replication studies estimating the same effects. The pre-registered replications provide precisely estimated effect sizes, which do not suffer from publication bias. The Andrews-Kasy approach leads to a moderate reduction of the inflated effect sizes in the meta-analyses. However, the approach still overestimates effect sizes by a factor of about two or more and has an estimated false positive rate of between 57% and 100%.

2019
Author(s): Amanda Kvarven, Eirik Strømland, Magnus Johannesson

Many researchers rely on meta-analysis to summarize research evidence. However, recent replication projects in the behavioral sciences suggest that effect sizes of original studies are overestimated, and this overestimation is typically attributed to publication bias and selective reporting of scientific results. As the validity of meta-analyses depends on the primary studies, there is a concern that systematic overestimation of effect sizes may translate into biased meta-analytic effect sizes. We compare the results of meta-analyses to large-scale pre-registered replications in psychology carried out at multiple labs. The multiple labs replications provide relatively precisely estimated effect sizes, which do not suffer from publication bias or selective reporting. Searching the literature, we identified 17 meta-analyses (spanning more than 1,200 effect sizes and more than 370,000 participants) on the same topics as the multiple labs replications. We find that the meta-analytic effect sizes are significantly different from the replication effect sizes for 12 out of the 17 meta-replication pairs. These differences are systematic: on average, meta-analytic effect sizes are about three times as large as the replication effect sizes.
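The headline comparison, testing whether a meta-analytic estimate differs from a replication estimate, can be sketched as a z-test on two independent effect-size estimates. This is an illustrative reconstruction, not the authors' exact procedure, and the effect sizes and standard errors used below are hypothetical.

```python
import math

def z_test_difference(est1, se1, est2, se2):
    """Two-sided z-test for the difference between two independent
    effect-size estimates (e.g. meta-analytic vs. replication)."""
    diff = est1 - est2
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    z = diff / se_diff
    # two-sided p-value from the standard normal distribution
    p = 1 - math.erf(abs(z) / math.sqrt(2))
    return z, p

# Hypothetical pair: meta-analytic d = 0.60 (SE 0.08) vs. replication d = 0.20 (SE 0.03)
z, p = z_test_difference(0.60, 0.08, 0.20, 0.03)
```

With these illustrative inputs the difference is large relative to its standard error, so the test rejects equality of the two estimates.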


2018
Author(s): Qianying Wang, Jing Liao, Kaitlyn Hair, Alexandra Bannach-Brown, Zsanett Bahor, ...

Abstract
Background: Meta-analysis is increasingly used to summarise the findings identified in systematic reviews of animal studies modelling human disease. Such reviews typically identify a large number of individually small studies, testing efficacy under a variety of conditions. This leads to substantial heterogeneity, and identifying potential sources of this heterogeneity is an important function of such analyses. However, the statistical performance of different approaches (normalised compared with standardised mean difference estimates of effect size; stratified meta-analysis compared with meta-regression) is not known.
Methods: We used data from 3,116 experiments in focal cerebral ischaemia to construct a linear model predicting observed improvement in outcome contingent on 25 independent variables. We used stochastic simulation to attribute these variables to simulated studies according to their prevalence. To ascertain the ability to detect an effect of a given variable, we additionally introduced a "variable of interest" of given prevalence and effect. To establish any impact of a latent variable on the apparent influence of the variable of interest, we also introduced a "latent confounding variable" with given prevalence and effect, and allowed the prevalence of the variable of interest to differ in the presence and absence of the latent variable.
Results: Generally, the normalised mean difference (NMD) approach had higher statistical power than the standardised mean difference (SMD) approach. Even when the effect size and the number of studies contributing to the meta-analysis were small, there was good statistical power to detect the overall effect, with a low false positive rate. For detecting an effect of the variable of interest, stratified meta-analysis was associated with a substantial false positive rate with NMD estimates of effect size, while using an SMD estimate of effect size had very low statistical power. Univariate and multivariable meta-regression performed substantially better, with a low false positive rate for both NMD and SMD approaches; power was higher for NMD than for SMD. The presence or absence of a latent confounding variable only introduced an apparent effect of the variable of interest when there was substantial asymmetry in the prevalence of the variable of interest in the presence or absence of the confounding variable.
Conclusions: In meta-analysis of data from animal studies, NMD estimates of effect size should be used in preference to SMD estimates, and meta-regression should, where possible, be chosen over stratified meta-analysis. The power to detect the influence of the variable of interest depends on its effect and prevalence, but unless effects are very large, adequate power is only achieved once at least 100 experiments are included in the meta-analysis.
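The two effect-size metrics compared above can be computed from two-group summary statistics. This is a minimal sketch: `smd` is the standard pooled-SD formulation, while `nmd` uses one common formulation for preclinical outcomes where lower is better (percentage improvement relative to the control mean), which is not necessarily the exact normalisation used in these simulations.

```python
import math

def smd(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Standardised mean difference (Cohen's d with pooled SD)."""
    sd_pooled = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2)
                          / (n_t + n_c - 2))
    return (mean_t - mean_c) / sd_pooled

def nmd(mean_t, mean_c):
    """Normalised mean difference: treatment effect as a percentage
    improvement relative to the control group's outcome (one common
    formulation for outcomes such as infarct volume, where lower is better)."""
    return 100 * (mean_c - mean_t) / mean_c
```

For example, a treated group with mean 10 (SD 2, n = 10) against a control with mean 14 (SD 2, n = 10) gives an SMD of -2, and a treated mean of 7 against a control mean of 10 gives an NMD of 30%.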


2015, Vol 2015, pp. 1-12
Author(s): Siyu Lin, Hao Wu

Cyber-physical systems (CPSs) connect with the physical world via communication networks, which significantly increases the security risks of CPSs. To secure sensitive data, secure forwarding is an essential component of CPSs. However, CPSs have high-dimensional, multiattribute, and multilevel security requirements due to their significantly increased scale and diversity, which imposes high demands on the querying and storage of secure forwarding information. To tackle these challenges, we propose a practical secure data forwarding scheme for CPSs. Considering the limited storage capability and computational power of entities, we adopt a Bloom filter to store the secure forwarding information for each entity, which achieves a good balance between storage consumption and query delay. Furthermore, a novel link-based Bloom filter construction method is designed to reduce the false positive rate during Bloom filter construction. Finally, the effects of the false positive rate on the performance of Bloom filter-based secure forwarding under different routing policies are discussed.
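A minimal Bloom filter illustrates the storage/query trade-off the scheme relies on: membership queries are fast and the bit array is compact, at the cost of a tunable false positive rate. This is a generic textbook sketch, not the paper's link-based construction.

```python
import hashlib
import math

class BloomFilter:
    def __init__(self, n_items, fp_rate):
        # Optimal bit-array size m and hash count k for a target
        # false positive rate: m = -n ln(p) / (ln 2)^2, k = (m/n) ln 2.
        self.m = math.ceil(-n_items * math.log(fp_rate) / math.log(2) ** 2)
        self.k = max(1, round(self.m / n_items * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _indexes(self, item):
        # Derive k indexes by salting a single hash function.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for idx in self._indexes(item):
            self.bits[idx // 8] |= 1 << (idx % 8)

    def __contains__(self, item):
        # May return a false positive, but never a false negative.
        return all(self.bits[idx // 8] & (1 << (idx % 8))
                   for idx in self._indexes(item))

# Hypothetical usage: store forwarding entries with a 1% target FP rate.
bf = BloomFilter(n_items=1000, fp_rate=0.01)
bf.add("route-A")
```

Queries for inserted entries always succeed; queries for absent entries fail except with the configured (small) false positive probability.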


2017
Author(s): Robbie Cornelis Maria van Aert, Jelte M. Wicherts, Marcel A. L. M. van Assen

Publication bias is a substantial problem for the credibility of research in general and of meta-analyses in particular, as it yields overestimated effects and may suggest the existence of non-existing effects. Although there is consensus that publication bias exists, how strongly it affects different scientific literatures is currently less well known. We examined evidence of publication bias in a large-scale data set of primary studies that were included in 83 meta-analyses published in Psychological Bulletin (representing meta-analyses from psychology) and 499 systematic reviews from the Cochrane Database of Systematic Reviews (CDSR; representing meta-analyses from medicine). Publication bias was assessed on all homogeneous subsets (3.8% of all subsets of meta-analyses published in Psychological Bulletin) of primary studies included in meta-analyses, because publication bias methods do not have good statistical properties if the true effect size is heterogeneous. The Monte Carlo simulation study revealed that the creation of homogeneous subsets resulted in challenging conditions for publication bias methods, since the number of effect sizes in a subset was rather small (the median number of effect sizes was 6). No evidence of bias was obtained using the publication bias tests. Overestimation was minimal but statistically significant, providing evidence of publication bias that appeared to be similar in both fields. These and other findings, in combination with the small percentages of statistically significant primary effect sizes (28.9% and 18.9% for subsets published in Psychological Bulletin and the CDSR, respectively), led to the conclusion that evidence for publication bias in the studied homogeneous subsets is weak, but suggestive of mild publication bias in both psychology and medicine.
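One standard publication bias test of the kind applied to such subsets is Egger's regression test for funnel-plot asymmetry. A self-contained sketch (illustrative, not necessarily among the exact methods the authors applied):

```python
import math

def egger_test(effects, ses):
    """Egger's regression test for funnel-plot asymmetry: regress the
    standard normal deviate (effect / SE) on precision (1 / SE).
    An intercept far from zero suggests small-study effects,
    consistent with publication bias."""
    y = [e / s for e, s in zip(effects, ses)]
    x = [1 / s for s in ses]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in resid) / (n - 2)          # residual variance
    se_intercept = math.sqrt(s2 * (1 / n + mx ** 2 / sxx))
    t = intercept / se_intercept                        # t with n-2 df
    return intercept, t

# Hypothetical subset of four effect sizes with their standard errors.
b0, t = egger_test([0.5, 0.4, 0.3, 0.6], [0.1, 0.2, 0.3, 0.15])
```

In practice the t statistic is compared against a t distribution with n - 2 degrees of freedom; with only a handful of effect sizes per subset, as here, the test has little power.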


2021
Author(s): Ying-Shi Sun, Yu-Hong Qu, Dong Wang, Yi Li, Lin Ye, ...

Abstract Background: Computer-aided diagnosis using deep learning algorithms has been applied in the field of mammography, but there is no large-scale clinical application. Methods: This study proposed to develop and verify an artificial intelligence model based on mammography. First, retrospectively collected mammograms from six centers were randomized into a training dataset and a validation dataset to establish the model. Second, the model was tested by comparing the performance of 12 radiologists with and without it. Finally, prospectively collected multicenter mammograms were diagnosed by radiologists using the model. The detection and diagnostic capabilities were evaluated using the free-response receiver operating characteristic (FROC) curve and the ROC curve. Results: The sensitivity of the model for detecting lesions after matching was 0.908 at a false positive rate of 0.25 in unilateral images. The area under the ROC curve (AUC) for distinguishing benign from malignant lesions was 0.855 (95% CI: 0.830, 0.880). The performance of the 12 radiologists with the model was higher than that of the radiologists alone (AUC: 0.852 vs. 0.808, P = 0.005). The mean reading time with the model was shorter than that of reading alone (62.28 s vs. 80.18 s, P = 0.03). In prospective application, the sensitivity of detection reached 0.887 at a false positive rate of 0.25; the AUC of radiologists with the model was 0.983 (95% CI: 0.978, 0.988), with sensitivity, specificity, PPV, and NPV of 94.36%, 98.07%, 87.76%, and 99.09%, respectively. Conclusions: The artificial intelligence model exhibits high accuracy for detecting and diagnosing breast lesions, improves diagnostic accuracy, and saves reading time. Trial registration: NCT, NCT03708978. Registered 17 April 2018, https://register.clinicaltrials.gov/prs/app/ NCT03708978
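The reported PPV and NPV follow from sensitivity, specificity, and disease prevalence via Bayes' rule. The sketch below uses the abstract's sensitivity and specificity with a hypothetical prevalence of 20% for illustration; the study's actual prevalence is not given here.

```python
def ppv_npv(sensitivity, specificity, prevalence):
    """Positive and negative predictive values via Bayes' rule,
    from per-person probabilities of each confusion-matrix cell."""
    tp = sensitivity * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)

# Abstract's sensitivity/specificity with an assumed 20% prevalence.
ppv, npv = ppv_npv(0.9436, 0.9807, 0.20)
```

Because PPV and NPV depend on prevalence, the same sensitivity and specificity yield different predictive values in screening versus diagnostic populations.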


2020, Vol 25 (1), pp. 51-72
Author(s): Christian Franz Josef Woll, Felix D. Schönbrodt

Abstract. Recent meta-analyses come to conflicting conclusions about the efficacy of long-term psychoanalytic psychotherapy (LTPP). Our first goal was to reproduce the most recent meta-analysis by Leichsenring, Abbass, Luyten, Hilsenroth, and Rabung (2013), who found evidence for the efficacy of LTPP in the treatment of complex mental disorders. Our replicated effect sizes were in general slightly smaller. Second, we conducted an updated meta-analysis of randomized controlled trials comparing LTPP (lasting for at least 1 year and 40 sessions) to other forms of psychotherapy in the treatment of complex mental disorders. We focused on a transparent research process according to open science standards and applied a series of elaborated meta-analytic procedures to test and control for publication bias. Our updated meta-analysis, comprising 191 effect sizes from 14 eligible studies, revealed small, statistically significant effect sizes at post-treatment for the outcome domains psychiatric symptoms, target problems, social functioning, and overall effectiveness (Hedges' g ranging between 0.24 and 0.35). The effect size for the domain personality functioning (0.24) was not significant (p = .08). No signs of publication bias were detected. In light of a heterogeneous study set and some methodological shortcomings in the primary studies, these results should be interpreted cautiously. In conclusion, LTPP might be superior to other forms of psychotherapy in the treatment of complex mental disorders. Notably, our effect sizes represent the additional gain of LTPP versus other forms of primarily long-term psychotherapy, so large differences in effect sizes are not to be expected.
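Hedges' g, the effect-size metric reported above, is Cohen's d with a small-sample bias correction. A minimal sketch with hypothetical group statistics:

```python
import math

def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Hedges' g: standardised mean difference with the approximate
    small-sample bias correction J = 1 - 3 / (4N - 9)."""
    sd_pooled = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2)
                          / (n_t + n_c - 2))
    d = (mean_t - mean_c) / sd_pooled
    j = 1 - 3 / (4 * (n_t + n_c) - 9)  # correction factor, < 1
    return d * j

# Hypothetical trial: LTPP arm mean 12 (SD 4, n = 20) vs. comparison arm mean 10 (SD 4, n = 20)
g = hedges_g(12, 4, 20, 10, 4, 20)
```

With 20 participants per arm the correction shrinks d = 0.50 only slightly, to g of about 0.49; the correction matters most for very small trials.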


2017, Vol 52 (12), pp. 1168-1170
Author(s): Zachary K. Winkelmann, Ashley K. Crossway

Reference/Citation:  Harmon KG, Zigman M, Drezner JA. The effectiveness of screening history, physical exam, and ECG to detect potentially lethal cardiac disorders in athletes: a systematic review/meta-analysis. J Electrocardiol. 2015;48(3):329–338. Clinical Question:  Which screening method should be considered best practice to detect potentially lethal cardiac disorders during the preparticipation physical examination (PE) of athletes? Data Sources:  The authors completed a comprehensive literature search of MEDLINE, CINAHL, Cochrane Library, Embase, Physiotherapy Evidence Database (PEDro), and SPORTDiscus from January 1996 to November 2014. The following key words were used individually and in combination: ECG, athlete, screening, pre-participation, history, and physical. A manual review of reference lists and key journals was performed to identify additional studies. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines were followed for this review. Study Selection:  Studies selected for this analysis involved (1) outcomes of cardiovascular screening in athletes using the history, PE, and electrocardiogram (ECG); (2) history questions and PE based on the American Heart Association recommendations and guidelines; and (3) ECGs interpreted following modern standards. The exclusion criteria were (1) articles not in English, (2) conference abstracts, and (3) clinical commentary articles. Study quality was assessed on a 7-point scale for risk of bias; a score of 7 indicated the highest quality. Articles with potential bias were excluded. Data Extraction:  Data included number and sex of participants, number of true- and false-positives and negatives, type of ECG criteria used, number of cardiac abnormalities, and specific cardiac conditions. The sensitivity, specificity, false-positive rate, and positive predictive value of each screening tool were calculated and summarized using a bivariate random-effects meta-analysis model. 
Main Results:  Fifteen articles reporting on 47 137 athletes were fully reviewed. The overall quality of the 15 articles ranged from 5 to 7 on the 7-item assessment scale (ie, participant selection criteria, representative sample, prospective data with at least 1 positive finding, modern ECG criteria used for screening, cardiovascular screening history and PE per American Heart Association guidelines, individual test outcomes reported, and abnormal screening findings evaluated by appropriate diagnostic testing). The athletes (66% males and 34% females) were ethnically and racially diverse, were from several countries, and ranged in age from 5 to 39 years. The sensitivity and specificity of the screening methods were, respectively, ECG, 94% and 93%; history, 20% and 94%; and PE, 9% and 97%. The overall false-positive rate for ECG (6%) was less than that for history (8%) or PE (10%). The positive likelihood ratios of each screening method were 14.8 for ECG, 3.22 for history, and 2.93 for PE. The negative likelihood ratios were 0.055 for ECG, 0.85 for history, and 0.93 for PE. A total of 160 potentially lethal cardiovascular conditions were detected, for a rate of 0.3%, or 1 in 294 patients. The most common conditions were Wolff-Parkinson-White syndrome (n = 67, 42%), long QT syndrome (n = 18, 11%), hypertrophic cardiomyopathy (n = 18, 11%), dilated cardiomyopathy (n = 11, 7%), coronary artery disease or myocardial ischemia (n = 9, 6%), and arrhythmogenic right ventricular cardiomyopathy (n = 4, 3%). Conclusions:  The most effective strategy to screen athletes for cardiovascular disease was ECG. This test was 5 times more sensitive than history and 10 times more sensitive than PE, and it had a higher positive likelihood ratio, lower negative likelihood ratio, and lower false-positive rate than history or PE. 
The 12-lead ECG interpreted using modern criteria should be considered the best practice in screening athletes for cardiovascular disease, and the use of history and PE alone as screening tools should be reevaluated.
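The likelihood ratios reported above derive directly from sensitivity and specificity. Note that the review's pooled values come from a bivariate random-effects model, so plugging its headline sensitivity and specificity into the raw formulas will not reproduce its ratios exactly; the numbers below are hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios of a screening test:
    LR+ = sens / (1 - spec); LR- = (1 - sens) / spec."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg

# Hypothetical screening test: 90% sensitivity, 95% specificity.
lr_pos, lr_neg = likelihood_ratios(0.90, 0.95)
```

A large LR+ (a positive result strongly raises the odds of disease) combined with a small LR- (a negative result strongly lowers them), as seen for ECG in this review, is the profile of a useful screening test.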


2020
Author(s): Se Jin Cho, Leonard Sunwoo, Sung Hyun Baik, Yun Jung Bae, Byung Se Choi, ...

Abstract Background: Accurate detection of brain metastasis (BM) is important for cancer patients. We aimed to systematically review the performance and quality of machine-learning-based BM detection on MRI in the relevant literature. Methods: A systematic literature search was performed for relevant studies reported before April 27, 2020. We assessed the quality of the studies using modified tailored questionnaires of the Quality Assessment of Diagnostic Accuracy Studies 2 (QUADAS-2) criteria and the Checklist for Artificial Intelligence in Medical Imaging (CLAIM). Pooled detectability was calculated using an inverse-variance weighting model. Results: A total of 12 studies were included, which showed a clear transition from classical machine learning (cML) to deep learning (DL) after 2018. The studies on DL used a larger sample size than those on cML. The cML and DL groups also differed in the composition of the dataset and in technical details such as data augmentation. The pooled proportions of detectability of BM were 88.7% (95% CI, 84-93%) and 90.1% (95% CI, 84-95%) in the cML and DL groups, respectively. The false-positive rate per person was lower in the DL group than in the cML group (10 vs 135, P < 0.001). In the patient selection domain of QUADAS-2, three studies (25%) were designated as high risk due to non-consecutive enrollment and arbitrary exclusion of nodules. Conclusion: Compared with the cML group, the DL group achieved comparable detectability of BM with a lower false-positive rate per person. Improvements are required in terms of quality and study design.
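Inverse-variance pooling of proportions, as used here for detectability, can be sketched as follows. This is a fixed-effect sketch with a normal-approximation variance for each study (assuming 0 < p < 1 in every study), not necessarily the authors' exact model.

```python
import math

def pooled_proportion(successes, totals):
    """Fixed-effect inverse-variance pooling of proportions.
    Each study's weight is 1 / variance, with variance p(1-p)/n
    (normal approximation; requires 0 < p < 1 in each study)."""
    props, weights = [], []
    for k, n in zip(successes, totals):
        p = k / n
        var = p * (1 - p) / n
        props.append(p)
        weights.append(1 / var)
    pooled = sum(w * p for w, p in zip(weights, props)) / sum(weights)
    se = math.sqrt(1 / sum(weights))          # SE of the pooled estimate
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se)

# Hypothetical pair of studies: 80/100 and 90/100 lesions detected.
pooled, ci = pooled_proportion([80, 90], [100, 100])
```

More precise studies get larger weights, so the pooled estimate (about 0.864 here) sits closer to the larger, more precise study than a simple average would.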

