Bias in emerging biomarkers for bipolar disorder

2016 ◽  
Vol 46 (11) ◽  
pp. 2287-2297 ◽  
Author(s):  
A. F. Carvalho ◽  
C. A. Köhler ◽  
B. S. Fernandes ◽  
J. Quevedo ◽  
K. W. Miskowiak ◽  
...  

Background: To date, no comprehensive evaluation has appraised the likelihood of bias or the strength of the evidence for peripheral biomarkers of bipolar disorder (BD). Here we performed an umbrella review of meta-analyses of peripheral non-genetic biomarkers for BD. Method: The PubMed/MEDLINE, EMBASE and PsycInfo electronic databases were searched up to May 2015. Two independent authors conducted searches, examined references for eligibility, and extracted data. Meta-analyses in any language examining peripheral non-genetic biomarkers in participants with BD (across different mood states) compared to unaffected controls were included. Results: Six references, which examined 13 biomarkers across 20 meta-analyses (5474 BD cases and 4823 healthy controls), met inclusion criteria. Evidence for excess of significance bias (i.e. bias favoring publication of ‘positive’, nominally significant results) was observed in 11 meta-analyses. Heterogeneity was high (I² ⩾ 50%) for 16 meta-analyses. Only two biomarkers met criteria for suggestive evidence, namely the soluble IL-2 receptor and morning cortisol. The median power of included studies, using the effect size of the largest dataset as the plausible true effect size of each meta-analysis, was 15.3%. Conclusions: Our findings suggest that there is an excess of statistically significant results in the literature on peripheral biomarkers for BD. Selective publication of ‘positive’ results and selective reporting of outcomes are possible mechanisms.
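The excess-of-significance assessment referred to above compares how many studies report nominally significant results with how many would be expected to do so, given each study's power to detect a plausible true effect. A minimal sketch of that comparison, using made-up effect sizes and standard errors rather than data from the review:

```python
# Sketch of an excess-significance check in the spirit of Ioannidis & Trikalinos (2007).
# The studies and the assumed true effect below are illustrative, not data from the review.
from scipy import stats

# (standardized mean difference, standard error, nominally significant?)
studies = [(0.45, 0.20, True), (0.50, 0.25, True),
           (0.30, 0.22, False), (0.55, 0.18, True)]

true_effect = 0.35          # e.g. the estimate from the largest data set
alpha = 0.05
z_crit = stats.norm.ppf(1 - alpha / 2)

# Power of each study to detect `true_effect` with a two-sided z-test.
powers = [stats.norm.sf(z_crit - true_effect / se) +
          stats.norm.cdf(-z_crit - true_effect / se)
          for _, se, _ in studies]

expected = sum(powers)                      # expected number of significant studies
observed = sum(sig for *_, sig in studies)  # observed number of significant studies

# Binomial test of the observed vs expected proportion of significant results.
p_excess = stats.binomtest(observed, len(studies),
                           expected / len(studies),
                           alternative='greater').pvalue
print(f"observed={observed}, expected={expected:.2f}, p={p_excess:.3f}")
```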

2017 ◽  
Vol 4 (2) ◽  
pp. 160254 ◽  
Author(s):  
Estelle Dumas-Mallet ◽  
Katherine S. Button ◽  
Thomas Boraud ◽  
Francois Gonon ◽  
Marcus R. Munafò

Studies with low statistical power increase the likelihood that a statistically significant finding represents a false positive result. We conducted a review of meta-analyses of studies investigating the association of biological, environmental or cognitive parameters with neurological, psychiatric and somatic diseases, excluding treatment studies, in order to estimate the average statistical power across these domains. Taking the effect size indicated by a meta-analysis as the best estimate of the likely true effect size, and assuming a threshold for declaring statistical significance of 5%, we found that approximately 50% of studies have statistical power in the 0–10% or 11–20% range, well below the minimum of 80% that is often considered conventional. Studies with low statistical power appear to be common in the biomedical sciences, at least in the specific subject areas captured by our search strategy. However, we also observe evidence that this depends in part on research methodology, with candidate gene studies showing very low average power and studies using cognitive/behavioural measures showing high average power. This warrants further investigation.
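As a rough illustration of the power calculations such a survey rests on, the sketch below (using an illustrative effect size of d = 0.2 rather than any value from the paper) shows how far typical sample sizes fall short of the conventional 80% power.

```python
# Power of a two-sample t-test when the meta-analytic effect size is taken as the
# true effect. Effect size and sample sizes are illustrative, not values from the review.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
for n in (20, 50, 200, 800):
    pw = analysis.power(effect_size=0.2, nobs1=n, alpha=0.05, ratio=1.0)
    print(f"n per group = {n:4d}: power = {pw:.2f}")

# Sample size per group needed to reach the conventional 80% power.
n_needed = analysis.solve_power(effect_size=0.2, power=0.8, alpha=0.05)
print(f"n per group for 80% power ≈ {n_needed:.0f}")
```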


2020 ◽  
Author(s):  
Robbie Cornelis Maria van Aert ◽  
Joris Mulder

Meta-analysis methods are used to synthesize the results of multiple studies on the same topic. The most frequently used statistical model in meta-analysis is the random-effects model, containing parameters for the average effect, the between-study variance in the primary studies' true effect sizes, and random effects for the study-specific effects. We propose Bayesian hypothesis testing and estimation methods using the marginalized random-effects meta-analysis (MAREMA) model, in which the study-specific true effects are regarded as nuisance parameters and are integrated out of the model. A flat prior distribution is placed on the overall effect size in the case of estimation, and a proper unit information prior for the overall effect size is proposed in the case of hypothesis testing. For the between-study variance in true effect size, a proper uniform prior is placed on the proportion of total variance that can be attributed to between-study variability. Bayes factors are used for hypothesis testing, allowing both point and one-sided hypotheses to be tested. The proposed methodology has several attractive properties. First, the MAREMA model encompasses models with zero, negative, and positive between-study variance, which enables testing a zero between-study variance because it is not a boundary problem. Second, the methodology is suitable for default Bayesian meta-analyses, as it requires no prior information about the unknown parameters. Third, the methodology can be used even in the extreme case when only two studies are available, because Bayes factors are not based on large-sample theory. We illustrate the developed methods by applying them to two meta-analyses and introduce easy-to-use software in the R package BFpack to compute the proposed Bayes factors.
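To make the marginalized model concrete: once the study-specific effects are integrated out, each observed effect is normal with mean equal to the overall effect and variance equal to its sampling variance plus the between-study variance. The sketch below fits that marginal likelihood by maximum likelihood for illustration only; it is not the Bayesian machinery of BFpack, and for simplicity it constrains the between-study variance to be non-negative, whereas the MAREMA model also allows negative values.

```python
# Illustrative ML fit of the marginal random-effects likelihood:
# y_i ~ N(mu, tau^2 + sigma_i^2) after integrating out the study-specific effects.
# Effect sizes and standard errors are made up.
import numpy as np
from scipy import optimize, stats

y = np.array([0.30, 0.12, 0.45, 0.22, 0.05])     # observed effect sizes
se = np.array([0.10, 0.15, 0.20, 0.12, 0.18])    # their standard errors

def neg_marginal_loglik(params):
    mu, log_tau2 = params
    var = np.exp(log_tau2) + se ** 2             # marginal variance of each study
    return -np.sum(stats.norm.logpdf(y, loc=mu, scale=np.sqrt(var)))

res = optimize.minimize(neg_marginal_loglik, x0=[0.0, np.log(0.01)],
                        method="Nelder-Mead")
mu_hat, tau2_hat = res.x[0], np.exp(res.x[1])
print(f"overall effect ≈ {mu_hat:.3f}, between-study variance ≈ {tau2_hat:.4f}")
```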


Angiology ◽  
2020 ◽  
Vol 71 (10) ◽  
pp. 934-941
Author(s):  
Yupeng Huang ◽  
Hongyan Jin ◽  
Guokang Yang

This meta-analysis was conducted to estimate associations between CDKN2B antisense (CDKN2B-AS) polymorphisms and susceptibility to atherosclerotic cardio-cerebral vascular diseases (ASCVD). A systematic literature search of PubMed, Medline, Web of Science, Embase, and CNKI was performed to identify eligible studies. Overall, 34 studies were included in the meta-analyses. Pooled overall analyses showed that the rs1333040, rs1333049, rs2383206, and rs2383207 polymorphisms were associated with susceptibility to ASCVD in the whole population. Further analyses by ethnicity revealed that all investigated polymorphisms were associated with susceptibility to ASCVD in East Asians. Moreover, the rs2383206, rs2383207, rs10757274, and rs10757278 polymorphisms were associated with susceptibility to ASCVD in West Asians, while rs2383206, rs10757274, and rs10757278 were associated with susceptibility to ASCVD in Caucasians. When we stratified eligible studies by type of disease, positive results were found for all investigated polymorphisms in patients with coronary artery disease (CAD) or myocardial infarction, whereas positive results were detected only for the rs2383206 and rs10757274 polymorphisms in patients with ischemic stroke (IS). Our findings suggest that the rs1333040, rs1333049, rs2383206, rs2383207, rs10757274, and rs10757278 polymorphisms might serve as genetic biomarkers of CAD, and that the rs2383206 and rs10757274 polymorphisms might serve as genetic biomarkers of IS.


2021 ◽  
Author(s):  
Anton Olsson-Collentine ◽  
Robbie Cornelis Maria van Aert ◽  
Marjan Bakker ◽  
Jelte M. Wicherts

There are arbitrary decisions to be made (i.e., researcher degrees of freedom) in the execution and reporting of most research. These decisions allow for many possible outcomes from a single study. Selective reporting of results from this ‘multiverse’ of outcomes, whether intentional (p-hacking) or not, can lead to inflated effect size estimates and false positive results in the literature. In this study, we examine and illustrate the consequences of researcher degrees of freedom in primary research, both for primary outcomes and for subsequent meta-analyses. We used a set of 10 preregistered multi-lab direct replication projects from psychology (Registered Replication Reports) with a total of 14 primary outcome variables, 236 labs and 37,602 participants. By exploiting researcher degrees of freedom in each project, we were able to compute between 3,840 and 2,621,440 outcomes per lab. We show that researcher degrees of freedom in primary research can cause substantial variability in effect size that we denote the Underlying Multiverse Variability (UMV). In our data, the median UMV across labs was 0.1 standard deviations (interquartile range = 0.09–0.15). In one extreme case, the effect size estimate could change by d = 1.27, evidence that p-hacking in some (rare) cases can provide support for almost any conclusion. We also show that researcher degrees of freedom in primary research provide another source of uncertainty in meta-analysis beyond those usually estimated. This would not be a large concern for meta-analysis if researchers made all arbitrary decisions at random. However, emulating selective reporting of lab results led to inflation of meta-analytic average effect size estimates in our data by as much as 0.1–0.48 standard deviations, depending to a large degree on the number of possible outcomes at the lab level (i.e., multiverse size). Our results illustrate the importance of making research decisions transparent (e.g., through preregistration and multiverse analysis), evaluating studies for selective reporting, and whenever feasible making raw data available.
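A toy version of such a multiverse helps fix ideas: enumerate every combination of a few arbitrary analysis decisions and record the effect size each one yields. The decisions, data, and multiverse size below are invented for illustration and are far smaller than those in the Registered Replication Reports.

```python
# Toy multiverse for one simulated lab: each combination of arbitrary
# analysis decisions gives a different Cohen's d.
import itertools
import numpy as np

rng = np.random.default_rng(1)
treat = rng.normal(0.3, 1.0, 60)   # simulated treatment group
ctrl = rng.normal(0.0, 1.0, 60)    # simulated control group

def cohens_d(a, b):
    sp = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return (a.mean() - b.mean()) / sp

# Arbitrary decisions: outlier cut-off and whether to trim the top 10% of values.
outlier_sd = [2.0, 2.5, 3.0, None]
trim_top = [True, False]

effects = []
for cut, trim in itertools.product(outlier_sd, trim_top):
    a, b = treat.copy(), ctrl.copy()
    if cut is not None:
        a, b = a[np.abs(a) < cut], b[np.abs(b) < cut]
    if trim:
        a, b = np.sort(a)[:int(0.9 * len(a))], np.sort(b)[:int(0.9 * len(b))]
    effects.append(cohens_d(a, b))

print(f"{len(effects)} outcomes, d ranges from "
      f"{min(effects):.2f} to {max(effects):.2f}")
```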


2018 ◽  
Author(s):  
Robbie Cornelis Maria van Aert

More and more scientific research is published nowadays, calling for statistical methods that enable researchers to get an overview of the literature in a particular research field. For that purpose, meta-analysis methods were developed that can be used to statistically combine the effect sizes of independent primary studies on the same topic. My dissertation focuses on two issues that are crucial when conducting a meta-analysis: publication bias and heterogeneity in the primary studies' true effect sizes. Accurate estimation of both the meta-analytic effect size and the between-study variance in true effect size is crucial, since the results of meta-analyses are often used for policy making. Publication bias distorts the results of a meta-analysis, as it refers to situations where publication of a primary study depends on its results. We developed new meta-analysis methods, p-uniform and p-uniform*, which estimate effect sizes corrected for publication bias and also test for publication bias. Although these methods perform well in many conditions, they and the other existing methods are shown not to perform well when researchers use questionable research practices. Additionally, when publication bias is absent or limited, traditional methods that do not correct for publication bias outperform p-uniform and p-uniform*. Surprisingly, we found no strong evidence for the presence of publication bias in our pre-registered study of a large-scale data set consisting of 83 meta-analyses and 499 systematic reviews published in the fields of psychology and medicine. We also developed two methods for meta-analyzing a statistically significant published original study and a replication of that study, a situation often encountered by researchers. One method is frequentist, whereas the other is Bayesian. Both methods are shown to perform better than traditional meta-analytic methods that do not take the statistical significance of the original study into account. Analytical studies of both methods also show that the original study is sometimes better discarded for optimal estimation of the true effect size. Furthermore, we developed a program for determining the required sample size in a replication, analogous to power analysis in null hypothesis testing. Computing the required sample size with this method revealed that large sample sizes (approximately 650 participants) are required to be able to distinguish a zero from a small true effect. Finally, in the last two chapters, we derived a new multi-step estimator for the between-study variance in the primary studies' true effect sizes, and examined the statistical properties of two methods (the Q-profile and generalized Q-statistic methods) for computing the confidence interval of the between-study variance in true effect size. We proved that the multi-step estimator converges to the Paule-Mandel estimator, which is nowadays one of the recommended methods for estimating the between-study variance in true effect sizes. Two Monte Carlo simulation studies showed that the coverage probabilities of the Q-profile and generalized Q-statistic methods can be substantially below the nominal coverage rate when the assumptions underlying the random-effects meta-analysis model are violated.
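As one concrete piece of this work, the Paule-Mandel estimator mentioned above chooses the between-study variance so that the generalized Q statistic equals its expected value of k − 1. A minimal sketch with illustrative inputs:

```python
# Paule-Mandel estimator of the between-study variance: find tau^2 such that
# the generalized Q statistic equals k - 1. Inputs are illustrative.
import numpy as np
from scipy import optimize

y = np.array([0.42, 0.10, 0.55, 0.31, -0.05, 0.60])   # study effect sizes
se = np.array([0.15, 0.20, 0.25, 0.18, 0.30, 0.22])   # their standard errors
k = len(y)

def generalized_q(tau2):
    w = 1.0 / (se ** 2 + tau2)
    mu = np.sum(w * y) / np.sum(w)
    return np.sum(w * (y - mu) ** 2)

if generalized_q(0.0) <= k - 1:
    tau2_pm = 0.0                      # no detectable between-study variance
else:
    # Q(tau2) decreases in tau2, so bracket a root of Q(tau2) - (k - 1).
    tau2_pm = optimize.brentq(lambda t: generalized_q(t) - (k - 1), 0.0, 10.0)

print(f"Paule-Mandel tau^2 ≈ {tau2_pm:.4f}")
```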


2020 ◽  
Vol 228 (1) ◽  
pp. 43-49 ◽  
Author(s):  
Michael Kossmeier ◽  
Ulrich S. Tran ◽  
Martin Voracek

Abstract. Currently, dedicated graphical displays to depict study-level statistical power in the context of meta-analysis are unavailable. Here, we introduce the sunset (power-enhanced) funnel plot to visualize this relevant information for assessing the credibility, or evidential value, of a set of studies. The sunset funnel plot highlights the statistical power of primary studies to detect an underlying true effect of interest in the well-known funnel display, with color-coded power regions and a second power axis. This graphical display allows meta-analysts to incorporate power considerations into classic funnel plot assessments of small-study effects. Nominally significant but low-powered studies might be seen as less credible and as more likely to be affected by selective reporting. We exemplify the application of the sunset funnel plot with two published meta-analyses from medicine and psychology. Software to create this variation of the funnel plot is provided via a tailored R function. In conclusion, the sunset (power-enhanced) funnel plot is a novel and useful graphical display for critically examining and presenting study-level power in the context of meta-analysis.
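A rough impression of the display can be produced with standard plotting tools: shade the funnel's background by the power that a study with a given standard error would have to detect an assumed true effect. The sketch below is not the authors' R function; the assumed true effect, study estimates, and standard errors are invented.

```python
# Funnel-style plot with background bands coloured by study-level power to
# detect an assumed true effect (here d = 0.3). All inputs are illustrative.
import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize, stats

d_true, alpha = 0.3, 0.05
z_crit = stats.norm.ppf(1 - alpha / 2)

def power(se):
    """Two-sided z-test power at a given standard error."""
    return stats.norm.sf(z_crit - d_true / se) + stats.norm.cdf(-z_crit - d_true / se)

def se_at_power(p):
    """Standard error at which power equals p (power falls as SE grows)."""
    return optimize.brentq(lambda s: power(s) - p, 1e-3, 10.0)

est = np.array([0.45, 0.10, 0.62, 0.28, 0.05, 0.75])   # illustrative estimates
se = np.array([0.12, 0.20, 0.28, 0.15, 0.25, 0.30])    # illustrative SEs

fig, ax = plt.subplots()
bands = [(0.0, se_at_power(0.8), "#2ca02c"),            # >= 80% power
         (se_at_power(0.8), se_at_power(0.5), "#bcbd22"),
         (se_at_power(0.5), se_at_power(0.2), "#ff7f0e"),
         (se_at_power(0.2), 0.40, "#d62728")]           # < 20% power
for lo, hi, colour in bands:
    ax.axhspan(lo, hi, color=colour, alpha=0.35)
ax.scatter(est, se, color="black", zorder=3)
ax.invert_yaxis()                                        # small SEs at the top
ax.set_xlabel("effect size")
ax.set_ylabel("standard error")
plt.show()
```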


2019 ◽  
Vol 227 (4) ◽  
pp. 261-279 ◽  
Author(s):  
Frank Renkewitz ◽  
Melanie Keiner

Abstract. Publication biases and questionable research practices are assumed to be two of the main causes of low replication rates. Both of these problems lead to severely inflated effect size estimates in meta-analyses. Methodologists have proposed a number of statistical tools to detect such bias in meta-analytic results. We present an evaluation of the performance of six of these tools. To assess the Type I error rate and the statistical power of these methods, we simulated a large variety of literatures that differed with regard to true effect size, heterogeneity, number of available primary studies, and sample sizes of these primary studies; furthermore, simulated studies were subjected to different degrees of publication bias. Our results show that across all simulated conditions, no method consistently outperformed the others. Additionally, all methods performed poorly when true effect sizes were heterogeneous or primary studies had a small chance of being published, irrespective of their results. This suggests that in many actual meta-analyses in psychology, bias will remain undiscovered no matter which detection method is used.
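The logic of such simulations can be sketched in a few lines: draw primary studies around a known true effect, publish non-significant results only with some probability, and watch the naive pooled estimate inflate. The settings below are illustrative and are not the paper's simulation design.

```python
# Toy publication-bias simulation: non-significant studies are published with
# only a small probability, which inflates the naive meta-analytic estimate.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
true_d, n_per_group, k = 0.15, 40, 2000
pub_prob_nonsig = 0.2                      # chance a null result is published

se = np.sqrt(2 / n_per_group)
d_hat = rng.normal(true_d, se, size=k)     # simulated study effect estimates
pvals = 2 * stats.norm.sf(np.abs(d_hat) / se)
published = (pvals < 0.05) | (rng.random(k) < pub_prob_nonsig)

naive_estimate = d_hat[published].mean()   # equal SEs, so a plain mean
print(f"true effect: {true_d}, estimate from published studies: "
      f"{naive_estimate:.3f} based on {published.sum()} of {k} studies")
```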


2021 ◽  
pp. 146531252110272
Author(s):  
Despina Koletsi ◽  
Anna Iliadi ◽  
Theodore Eliades

Objective: To evaluate all available evidence on the prediction of rotational tooth movements with aligners. Data sources: Seven databases of published and unpublished literature were searched up to 4 August 2020 for eligible studies. Data selection: Studies were deemed eligible if they included evaluation of rotational tooth movement with any type of aligner, through the comparison of software-based and actually achieved data after patient treatment. Data extraction and data synthesis: Data extraction was done independently and in duplicate, and risk of bias assessment was performed with the QUADAS-2 tool. Random-effects meta-analyses with effect sizes and their 95% confidence intervals (CIs) were performed, and the quality of the evidence was assessed through GRADE. Results: Seven articles were included in the qualitative synthesis, of which three contributed to meta-analyses. Overall results revealed inaccurate prediction of the outcome by the software-based data, irrespective of the use of attachments or interproximal enamel reduction (IPR). Maxillary canines demonstrated the lowest percentage accuracy for rotational tooth movement (three studies: effect size = 47.9%; 95% CI = 27.2–69.5; P < 0.001), although high levels of heterogeneity were identified (I² = 86.9%; P < 0.001). In contrast, mandibular incisors presented the highest percentage accuracy for predicted rotational movement (two studies: effect size = 70.7%; 95% CI = 58.9–82.5; P < 0.001; I² = 0.0%; P = 0.48). Risk of bias was unclear to low overall, while quality of the evidence ranged from low to moderate. Conclusion: Allowing for all identified caveats, prediction of rotational tooth movements with aligner treatment does not appear accurate, especially for canines. Careful selection of patients and malocclusions for aligner treatment decisions remains challenging.


Circulation ◽  
2007 ◽  
Vol 116 (suppl_16) ◽  
Author(s):  
George A Diamond ◽  
Sanjay Kaul

Background: A highly publicized meta-analysis of 42 clinical trials comprising 27,844 diabetics ignited a firestorm of controversy by charging that treatment with rosiglitazone was associated with a “…worrisome…” 43% greater risk of myocardial infarction (p = 0.03) and a 64% greater risk of cardiovascular death (p = 0.06). Objective: The investigators excluded 4 trials from the infarction analysis and 19 trials from the mortality analysis in which no events were observed. We sought to determine whether these exclusions biased the results. Methods: We compared the index study to a Bayesian meta-analysis of the entire 42 trials (using the odds ratio as the measure of effect size) and to fixed-effects and random-effects analyses with and without a continuity correction that adjusts for values of zero. Results: The odds ratios and confidence intervals for the analyses are summarized in the Table. Odds ratios for infarction ranged from 1.43 to 1.22 and for death from 1.64 to 1.13. Corrected models resulted in substantially smaller odds ratios and narrower confidence intervals than did uncorrected models. Although corrected risks remain elevated, none are statistically significant (p < 0.05). Conclusions: Given the fragility of the effect sizes and confidence intervals, the charge that rosiglitazone increases the risk of adverse events is not supported by these additional analyses. The exaggerated values observed in the index study are likely the result of excluding the zero-event trials from analysis. Continuity adjustments mitigate this error and provide more consistent and reliable assessments of true effect size. Transparent sensitivity analyses should therefore be performed over a realistic range of the operative assumptions to verify the stability of such assessments, especially when outcome events are rare. Given the relatively wide confidence intervals, additional data will be required to adjudicate these inconclusive results.
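For readers unfamiliar with the continuity correction discussed here, the usual device is to add 0.5 to every cell of a 2×2 table that contains a zero before computing the odds ratio and its variance, so zero-event trials can remain in the pooled analysis. A minimal sketch with invented event counts:

```python
# Continuity-corrected odds ratios pooled with inverse-variance (fixed-effect)
# weights. Event counts are invented for illustration.
import numpy as np

# (events_treatment, n_treatment, events_control, n_control)
trials = [(2, 300, 1, 300), (0, 150, 0, 150), (3, 500, 1, 480), (0, 200, 1, 210)]

log_ors, weights = [], []
for a, n1, c, n2 in trials:
    b, d = n1 - a, n2 - c
    if 0 in (a, b, c, d):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5   # continuity correction
    log_or = np.log((a * d) / (b * c))
    var = 1 / a + 1 / b + 1 / c + 1 / d                    # Woolf variance
    log_ors.append(log_or)
    weights.append(1 / var)

log_ors, weights = np.array(log_ors), np.array(weights)
pooled = np.sum(weights * log_ors) / np.sum(weights)        # fixed-effect pooling
se_pooled = np.sqrt(1 / np.sum(weights))
ci = np.exp([pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled])
print(f"pooled OR ≈ {np.exp(pooled):.2f}, 95% CI {ci[0]:.2f} to {ci[1]:.2f}")
```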


2013 ◽  
Vol 2013 ◽  
pp. 1-9 ◽  
Author(s):  
Liansheng Larry Tang ◽  
Michael Caudy ◽  
Faye Taxman

Multiple meta-analyses may use similar search criteria and focus on the same topic of interest, but they may yield different or sometimes discordant results. The lack of statistical methods for synthesizing these findings makes it challenging to properly interpret the results from multiple meta-analyses, especially when their results are conflicting. In this paper, we first introduce a method to synthesize the meta-analytic results when multiple meta-analyses use the same type of summary effect estimates. When meta-analyses use different types of effect sizes, the meta-analysis results cannot be directly combined. We propose a two-step frequentist procedure to first convert the effect size estimates to the same metric and then summarize them with a weighted mean estimate. Our proposed method offers several advantages over existing methods by Hemming et al. (2012). First, different types of summary effect sizes are considered. Second, our method provides the same overall effect size as conducting a meta-analysis on all individual studies from multiple meta-analyses. We illustrate the application of the proposed methods in two examples and discuss their implications for the field of meta-analysis.
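The flavour of the two-step idea can be sketched as follows, though this is only an illustration and not necessarily the authors' exact conversion or weighting scheme: convert, say, a log odds ratio to a standardized mean difference with the usual logit-based conversion, then combine all estimates with an inverse-variance weighted mean.

```python
# Convert effect sizes to a common metric, then combine with an
# inverse-variance weighted mean. Input values are illustrative.
import numpy as np

# Meta-analysis 1 reports a standardized mean difference, meta-analysis 2 a log OR.
d1, var_d1 = 0.25, 0.010
log_or2, var_logor2 = 0.55, 0.040

# Convert log OR to Cohen's d (d = ln(OR) * sqrt(3) / pi); variance scales by 3 / pi^2.
d2 = log_or2 * np.sqrt(3) / np.pi
var_d2 = var_logor2 * 3 / np.pi ** 2

estimates = np.array([d1, d2])
weights = 1 / np.array([var_d1, var_d2])
pooled = np.sum(weights * estimates) / np.sum(weights)
se_pooled = np.sqrt(1 / np.sum(weights))
print(f"combined estimate ≈ {pooled:.3f} (SE ≈ {se_pooled:.3f})")
```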

