The statistical significance of meta-analyses is frequently fragile: definition of a fragility index for meta-analyses

Purpose: Meta-analyses occupy the highest level of evidence and thereby guide clinical decision-making. Recently, randomised-controlled trials were evaluated for the robustness of their findings by calculating the fragility index. The fragility index is the number of events that needs to be added to one treatment arm until the statistical significance collapses. We, therefore, aimed to evaluate the robustness of paediatric surgical meta-analyses. Methods: We searched MEDLINE for paediatric surgical meta-analyses published in the last decade. All meta-analyses on a paediatric surgical condition were eligible for analysis if they based their conclusion on a statistically significant meta-analysis. Results: We screened 303 records and conducted a full-text evaluation of 60 manuscripts. Of these, 39 were included in our analysis; together they conducted 79 individual meta-analyses with significant results. Median fragility index was 5 (Q25–Q75: 2–11). Median fragility in relation to included patients was 0.77% (Q25–Q75: 0.29–1.87%). Conclusion: Paediatric surgical meta-analyses are often fragile. In almost 60% of results, the statistical significance depends on less than 1% of the included population. However, as the fragility index is just a transformation of the P value, it basically conveys the same information in a different format. It therefore should be avoided.
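To make the definition concrete, the following is a minimal sketch of a fragility index for a single two-arm comparison. It is not the authors' meta-analytic procedure: it assumes a two-sided Fisher exact test, a 0.05 threshold, and that events are added to the arm with the lower event rate. The function names and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def fragility_index(events_a, n_a, events_b, n_b, alpha=0.05):
    """Number of added events needed to lose significance, or None if not applicable."""
    # Assumption: always modify the arm with the lower event rate.
    if events_a * n_b > events_b * n_a:
        events_a, n_a, events_b, n_b = events_b, n_b, events_a, n_a
    if fisher_exact_p(events_a, n_a - events_a, events_b, n_b - events_b) >= alpha:
        return None  # result was not significant to begin with
    added = 0
    while events_a < n_a:
        events_a += 1  # convert one non-event into an event
        added += 1
        if fisher_exact_p(events_a, n_a - events_a, events_b, n_b - events_b) >= alpha:
            return added
    return None


# Hypothetical trial: 0/50 events versus 10/50 events.
print(fragility_index(0, 50, 10, 50))
```

Dividing the returned count by the total number of patients gives a relative measure comparable in spirit to the "fragility in relation to included patients" reported above.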


Type M error can explain Weisburd's Paradox

10.31235/osf.io/ahnd4 ◽

2016 ◽

Author(s):

Andrew Gelman

Keyword(s):

Quality Control ◽

Sample Size ◽

Negative Correlation ◽

Statistical Power ◽

Statistical Significance ◽

Research Goal ◽

Definition Of ◽

Meta Analyses ◽

Post Hoc ◽

Weisburd Paradox

Simple calculations seem to show that larger studies should have higher statistical power, but empirical meta-analyses of published work in criminology have found zero or weak correlations between sample size and estimated statistical power. This is “Weisburd’s paradox” and has been attributed by Weisburd, Petrosino, and Mason (1993) to a difficulty in maintaining quality control as studies get larger, and attributed by Nelson, Wooditch, and Dario (2014) to a negative correlation between sample sizes and the underlying sizes of the effects being measured. We argue against the necessity of both these explanations, instead suggesting that the apparent Weisburd paradox is an artifact of systematic overestimation inherent in post-hoc power calculations, a bias that is large with small n. Furthermore, we recommend abandoning the use of statistical power as a measure of the strength of a study, because implicit in the definition of power is the bad idea of statistical significance as a research goal.
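The overestimation described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the paper's own calculation: estimates are drawn from a normal distribution around a hypothetical true effect, significance is judged with a two-sided z-test, and the Type M (magnitude) error is taken as the average significant estimate divided by the true effect. All names and numbers are illustrative.

```python
import random
from statistics import NormalDist, mean


def type_m_error(true_effect, se, n_sims=20000, alpha=0.05, seed=0):
    """Return (power, exaggeration ratio) for estimates that reach significance."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Keep only the absolute estimates that would be declared significant.
    significant = [
        abs(est)
        for est in (rng.gauss(true_effect, se) for _ in range(n_sims))
        if abs(est) > z_crit * se
    ]
    power = len(significant) / n_sims
    exaggeration = mean(significant) / abs(true_effect)
    return power, exaggeration


# Same hypothetical true effect, two standard errors (small n versus large n).
for label, se in [("small study", 0.5), ("large study", 0.05)]:
    power, ratio = type_m_error(true_effect=0.2, se=se)
    print(f"{label}: power={power:.2f}, exaggeration={ratio:.2f}")
```

Because a post-hoc power calculation plugs the published (significant) estimate back in as if it were the true effect, an exaggerated estimate from a small study yields an inflated power figure, which is the mechanism the abstract points to.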


A new definition of statistical significance

Authorea ◽

10.22541/au.151140201.11838644 ◽

2017 ◽

Author(s):

Thomas F Heston

Keyword(s):

Statistical Significance ◽

Definition Of


State-of-the-art colorectal disease: postoperative ileus

International Journal of Colorectal Disease ◽

10.1007/s00384-021-03939-1 ◽

2021 ◽

Author(s):

Nils P. Sommer ◽

Reiner Schneider ◽

Sven Wehner ◽

Jörg C. Kalff ◽

Tim O. Vilz

Keyword(s):

Postoperative Ileus ◽

Enhanced Recovery ◽

Treatment Strategies ◽

Colorectal Disease ◽

Opioid Receptor Antagonists ◽

Hospital Stays ◽

Definition Of ◽

Meta Analyses ◽

Epidural Catheters ◽

Prolonged Hospital

Purpose: Postoperative ileus (POI) remains an important complication for patients after abdominal surgery, with an incidence of 10–27%, representing an everyday issue for abdominal surgeons. It accounts for patients’ discomfort, increased morbidity, prolonged hospital stays, and a high economic burden. This review outlines the current understanding of POI pathophysiology and focuses on preventive treatments that have proven to be effective or at least show promising effects. Methods: Pathophysiology and recommendations for POI treatment are summarized on the basis of a selective literature review. Results: While many therapies have been researched over the past decades, most of them failed to prove successful in meta-analyses. To date, there is no evidence-based treatment once POI has manifested. In the era of enhanced recovery after surgery or fast-track regimes, a few approaches show a beneficial effect in preventing POI: multimodal, opioid-sparing analgesia with placement of epidural catheters or transversus abdominis plane block; μ-opioid-receptor antagonists; goal-directed fluid therapy; and, in general, the use of minimally invasive surgery. Conclusion: The results of different studies are often contradictory, as a concise definition of POI and reliable surrogate endpoints are still absent. These will be needed to advance POI research and provide clinicians with consistent data to improve the treatment strategies.


Validity of observational evidence on putative risk and protective factors: appraisal of 3744 meta-analyses on 57 topics

BMC Medicine ◽

10.1186/s12916-021-02020-6 ◽

2021 ◽

Vol 19 (1) ◽

Author(s):

Perrine Janiaud ◽

Arnav Agarwal ◽

Ioanna Tzoulaki ◽

Evropi Theodoratou ◽

Konstantinos K. Tsilidis ◽

...

Keyword(s):

Protective Factors ◽

Observational Studies ◽

Prediction Interval ◽

Statistical Significance ◽

Median Number ◽

Small Study ◽

Study Heterogeneity ◽

Large Heterogeneity ◽

Meta Analyses ◽

No Gold Standard

Background: The validity of observational studies and their meta-analyses is contested. Here, we aimed to appraise thousands of meta-analyses of observational studies using a pre-specified set of quantitative criteria that assess the significance, amount, consistency, and bias of the evidence. We also aimed to compare results from meta-analyses of observational studies against meta-analyses of randomized controlled trials (RCTs) and Mendelian randomization (MR) studies. Methods: We retrieved from PubMed (last update, November 19, 2020) umbrella reviews including meta-analyses of observational studies assessing putative risk or protective factors, regardless of the nature of the exposure and health outcome. We extracted information on 7 quantitative criteria that reflect the level of statistical support, the amount of data, the consistency across different studies, and hints pointing to potential bias. These criteria were level of statistical significance (pre-categorized according to 10⁻⁶, 0.001, and 0.05 p-value thresholds), sample size, statistical significance for the largest study, 95% prediction intervals, between-study heterogeneity, and the results of tests for small study effects and for excess significance. Results: 3744 associations (in 57 umbrella reviews) assessed by a median number of 7 (interquartile range 4 to 11) observational studies were eligible. Most associations were statistically significant at P < 0.05 (61.1%, 2289/3744). Only 2.6% of associations had P < 10⁻⁶, ≥1000 cases (or ≥20,000 participants for continuous factors), P < 0.05 in the largest study, 95% prediction interval excluding the null, and no large between-study heterogeneity, small study effects, or excess significance. Across the 57 topics, large heterogeneity was observed in the proportion of associations fulfilling various quantitative criteria. The quantitative criteria were mostly independent from one another. Across 62 associations assessed in both RCTs and in observational studies, 37.1% had effect estimates in opposite directions and 43.5% had effect estimates differing beyond chance in the two designs. Across 94 comparisons assessed in both MR and observational studies, such discrepancies occurred in 30.8% and 54.7%, respectively. Conclusions: Acknowledging that no gold standard exists to judge whether an observational association is genuine, statistically significant results are common in observational studies, but they are rarely convincing or corroborated by randomized evidence.


A Meta-Analysis of Brain-Derived Neurotrophic Factor Effects on Brain Volume in Schizophrenia: Genotype and Serum Levels

Neuropsychobiology ◽

10.1159/000514126 ◽

2021 ◽

pp. 1-14

Author(s):

Anthony O. Ahmed ◽

Samantha Kramer ◽

Naama Hofman ◽

John Flynn ◽

Marie Hansen ◽

...

Keyword(s):

Neurotrophic Factor ◽

Psychotic Disorders ◽

Brain Volume ◽

Statistical Significance ◽

Brain Derived Neurotrophic Factor ◽

Volume Data ◽

Analytic Review ◽

Volume Measurements ◽

Meta Analyses ◽

Serum Bdnf

Aim: The Val66Met single-nucleotide polymorphism (SNP) on the BDNF gene has established pleiotropic effects on schizophrenia incidence and morphologic alterations in the illness. The effects of brain-derived neurotrophic factor (BDNF) on brain volume measurements are, however, mixed and appear less well established for most brain regions. The current meta-analytic review examined (1) the association of the Val66Met SNP and brain volume alterations in schizophrenia by comparing Met allele carriers to Val/Val homozygotes and (2) the association of serum BDNF with brain volume measurements. Method: Studies included in the meta-analyses were identified through an electronic search of PubMed and PsycInfo (via EBSCO) for English language publications from January 2000 through December 2017. Included studies had conducted a genotyping procedure of Val66Met or obtained assays of serum BDNF and obtained brain volume data in patients with psychotic disorders. Nonhuman studies were excluded. Results: Study 1, which included 52 comparisons of Met carriers and Val/Val homozygotes, found evidence of lower right and left hippocampal volumes among Met allele carriers with schizophrenia. Frontal measurements, while also lower among Met carriers, did not achieve statistical significance. Study 2, which included 7 examinations of the correlation between serum BDNF and brain volume, found significant associations between serum BDNF levels and right and left hippocampal volume, with lower BDNF corresponding to lower volumes. Discussion: The meta-analyses provided evidence of associations between brain volume alterations in schizophrenia and variations on the Val66Met SNP and serum BDNF. Given the limited number of studies, it remains unclear if BDNF effects are global or regionally specific.


Recurrent Instability after Arthroscopic Glenoid Labral Repair with a Minimum of Three Points of Fixation: Do the Number of Anchors or Fixation Points Correlate to Outcomes?

Surgical Technology Online ◽

10.52198/21.sti.38.os1411 ◽

2021 ◽

Author(s):

Sean Mc Millan ◽

Brian Fliegel ◽

Michael Stark ◽

Elizabeth Ford ◽

Manuel Pontes ◽

...

Keyword(s):

Shoulder Instability ◽

Statistical Significance ◽

Labral Repair ◽

Secondary Outcome ◽

Consecutive Series ◽

Recurrent Instability ◽

Significant Difference ◽

Fixation Points ◽

Definition Of ◽

Shoulder Score

Introduction: The goal of this study was to evaluate the recurrence rate of instability following arthroscopic Bankart repairs in regard to the number and types of fixation utilized. A Bankart lesion is a tear in the anteroinferior capsulolabral complex within the shoulder, occurring in association with an anterior shoulder dislocation. These injuries can result in glenoid bone loss, decreased range of motion, and recurrent shoulder instability. Successful repair of these lesions has been reported in the literature with repair constructs that have three points of fixation. However, the definition of “one point of fixation” is yet to be fully elucidated. Materials and Methods: A consecutive series of arthroscopically repaired Bankart lesions were evaluated pertaining to the points of fixation required to achieve shoulder stability. This included the number, position, and types of anchors used. Patients consented to complete a series of surveys at a minimum of two years postoperatively. The primary outcome was recurrent instability, assessed via the UCLA Shoulder Score, the ROWE Shoulder Instability Score, and the Oxford Shoulder Score. A secondary outcome was pain on a Visual Analog Scale (VAS). Results: There were 116 patients reviewed: 46 patients achieved three points of fixation in their surgical repair via two anchors, and 70 patients achieved a similar fixation with three or more anchors. There was no significant difference in the mean age, gender, or body mass index (BMI). Patients receiving two anchors demonstrated recurrent instability 8.7% of the time (4 of 46 patients). Patients who received three or more anchors demonstrated recurrent instability 8.6% of the time (6 of 70 patients). Overall, there was no statistically significant difference in recurrence by the number or types of anchors used. Between the two cohorts, there was no statistically significant difference in VAS, ROWE, UCLA, or Oxford scores. There was a significant difference in pain reported on the VAS, with an average score of 0.43 versus 2.5 in those without and with recurrent instability, respectively. Conclusion: Contention still exists surrounding the exact definition of “a point of fixation” in arthroscopic Bankart repairs. Three-point constructs can be created through a variety of combinations including anchors and sutures, ultimately achieving the goal of a stable shoulder.


Structural model of forest ecosystem services’ payment for water quality improvement

SIMI 2019, Abstract Book - International Symposium "The Environmental and The Industry" ◽

10.21698/simi.2019.fp31 ◽

2019 ◽

pp. 238-245

Author(s):

Mariyana Lybenova ◽

Alexandre Chikalanov ◽

Yulian Petkov

Keyword(s):

Quality Improvement ◽

Ecosystem Services ◽

Structural Model ◽

Use Case ◽

Forest Ecosystem Services ◽

Payment Schemes ◽

Structured Information ◽

Definition Of ◽

Meta Analyses ◽

Web Platform

The publication deals with the development of a structural model of payment schemes for ecosystem services (PES) oriented to the use of forests for water, soil and microclimate quality improvement. The proposed structural model is built on a meta-analysis of more than 50 PES schemes worldwide. It has three top-down levels: groups of categories, categories and attributes. There are seven groups of categories, 17 categories and more than 120 attributes. The structured information about the studied PES schemes is stored in a warehouse managed by a unique web platform created by the authors. An important part of the presented study is the generic use case developed for PES schemes, which defines seven participating actors.


Competence of mentally ill patients: a comparative empirical study

Psychological Medicine ◽

10.1017/s0033291703008389 ◽

2003 ◽

Vol 33 (8) ◽

pp. 1463-1471 ◽

Cited By ~ 91

Author(s):

J. VOLLMANN ◽

A. BAUER ◽

H. DANKER-HOPFE ◽

H. HELMCHEN

Keyword(s):

Clinical Assessment ◽

Assessment Tool ◽

Statistical Significance ◽

Chi Square ◽

Impaired Performance ◽

Test Instrument ◽

Chi Square Test ◽

Assessment Of Competence ◽

Definition Of ◽

Attending Physician

Background. This study investigates the competence of patients with dementia, depression and schizophrenia to make treatment decisions. The outcome of an objective test instrument is presented and compared with clinical assessment of competence by the attending physician. Method. The MacArthur Competence Assessment Tool-Treatment (MacCAT-T), a test instrument to assess abilities in different standards of competence, was administered to patients with diagnoses of dementia (N=31), depression (N=35) and schizophrenia (N=43). Statistical significance of group differences in the MacCAT-T results was tested with the chi-square test. The concordance of the test and clinical assessment of competence by the attending physician was evaluated by Cohen's kappa coefficient. Results. Patients with dementia, as a group, showed impaired performance significantly more often than those with schizophrenia, who in turn were more impaired than depressed patients. Patients were classified as impaired or not depending on the standards used. By combination of all standards, substantially more patients were classified as impaired than by clinical assessment (67·7 v. 48·4% of patients with dementia, 20·0 v. 2·9% of patients with depression, 53·5 v. 18·4% of patients with schizophrenia). Conclusions. Using different standards of competence, the study showed substantial differences among patients with dementia, depression and schizophrenia. The high proportion of patients identified as incompetent raises several ethical questions, in particular those referring to the selection of standards or the definition of cut-offs for incompetence. The discrepancy between clinical and formal evaluations points out the influence of the procedure used on competence judgements.
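For readers unfamiliar with the concordance measure mentioned in the Method, the following is a minimal sketch of Cohen's kappa for two raters and two categories. The agreement counts are hypothetical and are not taken from the study; rows stand for the test instrument's classification and columns for the clinician's.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table (rows: rater 1, columns: rater 2)."""
    total = sum(sum(row) for row in table)
    k = len(table)
    # Observed agreement: share of cases on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / total
    # Chance agreement: product of the marginal proportions, summed over categories.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total**2
    return (observed - expected) / (1 - expected)


# Hypothetical counts: [[both impaired, test only], [clinician only, both unimpaired]].
example = [[20, 5], [10, 15]]
print(round(cohens_kappa(example), 3))
```

Kappa corrects the raw agreement rate for the agreement expected by chance, so a value near 0 indicates chance-level concordance and a value of 1 indicates perfect concordance.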


Incidence of Respiratory Symptoms for Residents Living Near a Petrochemical Industrial Complex: A Meta-Analysis

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph17072474 ◽

2020 ◽

Vol 17 (7) ◽

pp. 2474

Author(s):

Wen-Wen Chang ◽

Hathaichon Boonhat ◽

Ro-Ting Lin

Keyword(s):

Respiratory Symptoms ◽

Meta Analysis ◽

Statistical Significance ◽

Random Effect ◽

Study Groups ◽

Industrial Complex ◽

Residential Exposure ◽

Increased Risk ◽

Subgroup Analyses ◽

Meta Analyses

The air pollution emitted by petrochemical industrial complexes (PICs) may affect the respiratory health of surrounding residents. Previous meta-analyses have indicated a higher risk of lung cancer mortality and incidence among residents near a PIC. Therefore, in this study, a meta-analysis was conducted to estimate the degree to which PIC exposure increases the risk of the development of nonmalignant respiratory symptoms among residents. We followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines to systematically identify, select, and critically appraise relevant research. Finally, we identified 16 study groups reporting 5 types of respiratory symptoms: asthma, bronchitis, cough, rhinitis, and wheezing. We estimated pooled odds ratios (ORs) using random-effect models and investigated the robustness of pooled estimates in subgroup analyses by location, observation period, and age group. We determined that residential exposure to a PIC was significantly associated with a higher incidence of cough (OR = 1.35), wheezing (OR = 1.28), bronchitis (OR = 1.26), rhinitis (OR = 1.17), and asthma (OR = 1.15), although the latter two associations did not reach statistical significance. Subgroup analyses suggested that the association remained robust across different groups for cough and bronchitis. We identified high heterogeneity for asthma, rhinitis, and wheezing, which could be due to higher ORs in South America. Our meta-analysis indicates that residential exposure to a PIC is associated with an increased risk of nonmalignant respiratory symptoms.
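The pooling step named above can be sketched as follows. This is a minimal illustration that assumes the DerSimonian-Laird estimator for the between-study variance; the abstract does not state which random-effects estimator was used, and the study-level log odds ratios and variances below are hypothetical.

```python
import math


def dersimonian_laird(log_ors, variances):
    """Random-effects pooling; returns (pooled log OR, standard error, tau^2)."""
    w = [1 / v for v in variances]  # inverse-variance (fixed-effect) weights
    sum_w = sum(w)
    fixed = sum(wi * y for wi, y in zip(w, log_ors)) / sum_w
    # Cochran's Q measures dispersion around the fixed-effect estimate.
    q = sum(wi * (y - fixed) ** 2 for wi, y in zip(w, log_ors))
    df = len(log_ors) - 1
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add the between-study variance to each study.
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(ws * y for ws, y in zip(w_star, log_ors)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, se, tau2


# Hypothetical study-level odds ratios and variances of their logarithms.
log_ors = [math.log(x) for x in (1.10, 1.45, 1.60, 1.20)]
variances = [0.02, 0.05, 0.04, 0.03]
pooled, se, tau2 = dersimonian_laird(log_ors, variances)
low, high = math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)
print(f"pooled OR={math.exp(pooled):.2f}, 95% CI {low:.2f} to {high:.2f}, tau^2={tau2:.3f}")
```

A pooled OR whose 95% interval includes 1 would correspond to an association that does not reach statistical significance at the 0.05 level.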
