No Evidence Against the Greater Male Variability Hypothesis: A Commentary on Harrison et al.’s Meta-Analysis of Animal Personality

2022 ◽  
Author(s):  
Marco Del Giudice ◽  
Steven Gangestad

Harrison et al. (2021) set out to test the greater male variability hypothesis with respect to personality in non-human animals. Based on the non-significant results of their meta-analysis, they concluded that there is no evidence to support the hypothesis, and that biological explanations for greater male variability in human psychological traits should be called into question. Here, we show that these conclusions are unwarranted. Specifically: (a) in mammals, birds, and reptiles/amphibians, the magnitude of the sex differences in variability found in the meta-analysis is entirely in line with previous findings from both humans and non-human animals; (b) the generalized lack of statistical significance does not imply that effect sizes were too small to be considered meaningful, as the study was severely underpowered to detect effect sizes in the plausible range; (c) the results of the meta-analysis can be expected to underestimate the true magnitude of sex differences in the variability of personality, because the behavioral measures employed in most of the original studies contain large amounts of measurement error; and (d) variability effect sizes based on personality scores, latencies, and proportions suffer from lack of statistical validity, adding even more noise to the meta-analysis. In total, Harrison et al.’s study does nothing to disprove the greater male variability hypothesis in mammals, let alone in humans. To the extent that they are valid, the data remain compatible with a wide range of plausible scenarios.

1998 ◽  
Vol 21 (2) ◽  
pp. 228-235 ◽  
Author(s):  
Siu L. Chow

Entertaining diverse assumptions about empirical research, commentators give a wide range of verdicts on the NHSTP defence in Statistical significance. The null-hypothesis significance-test procedure (NHSTP) is defended in a framework in which deductive and inductive rules are deployed in theory corroboration in the spirit of Popper's Conjectures and refutations (1968b). The defensible hypothetico-deductive structure of the framework is used to make explicit the distinctions between (1) substantive and statistical hypotheses, (2) statistical alternative and conceptual alternative hypotheses, and (3) making statistical decisions and drawing theoretical conclusions. These distinctions make it easier to show that (1) H0 can be true, (2) the effect size is irrelevant to theory corroboration, and (3) “strong” hypotheses make no difference to NHSTP. Reservations about statistical power, meta-analysis, and the Bayesian approach are still warranted.


2020 ◽  
pp. 174077452096913
Author(s):  
Hwanhee Hong ◽  
Chenguang Wang ◽  
Gary L Rosner

Background/aims: Regulatory approval of a drug or device involves an assessment of not only the benefits but also the risks of adverse events associated with the therapeutic agent. Although randomized controlled trials (RCTs) are the gold standard for evaluating effectiveness, the number of treated patients in a single RCT may not be enough to detect a rare but serious side effect of the treatment. Meta-analysis plays an important role in the evaluation of the safety of medical products and has an advantage over analyzing a single RCT when estimating the rate of adverse events. Methods: In this article, we compare 15 widely used meta-analysis models under both Bayesian and frequentist frameworks when outcomes are extremely infrequent or rare. We present extensive simulation study results and then apply these methods to a real meta-analysis that considers RCTs investigating the effect of rosiglitazone on the risks of myocardial infarction and of death from cardiovascular causes. Results: Our simulation studies suggest that the beta hyperprior method modeling treatment group-specific parameters and accounting for heterogeneity performs the best. Most models ignoring between-study heterogeneity give poor coverage probability when such heterogeneity exists. In the data analysis, different methods provide a wide range of log odds ratio estimates between rosiglitazone and control treatments with a mixed conclusion on their statistical significance based on 95% confidence (or credible) intervals. Conclusion: In the rare event setting, treatment effect estimates obtained from traditional meta-analytic methods may be biased and provide poor coverage probability. This trend worsens when the data have large between-study heterogeneity. In general, we recommend methods that first estimate the summaries of treatment-specific risks across studies and then relative treatment effects based on the summaries when appropriate. Furthermore, we recommend fitting various methods, comparing the results and model performance, and investigating any significant discrepancies among them.


2019 ◽  
Vol 35 (2) ◽  
pp. 350-356 ◽  
Author(s):  
Juan Botella ◽  
Juan I. Durán

Meta-analysis is a firmly established methodology and an integral part of the process of generating knowledge across the empirical sciences. Meta-analysis has also focused on methodology and has become a dominant critic of methodological shortcomings. We highlight several problematic issues in how research is conducted in psychology: excess of heterogeneity in the results and difficulties for replication, publication bias, suboptimal methodological quality, and questionable practices of the researchers. These and other problems led to a “crisis of confidence” in psychology. We discuss how the meta-analytical perspective and its procedures can help to overcome the crisis. A more cooperative perspective, instead of a competitive one, can lead researchers to consider replication a more valuable contribution. Knowledge cannot be based on isolated studies. Given the nature of the object of study of psychology, the natural unit to generate knowledge must be the estimated distribution of the effect sizes, not the dichotomous decision on statistical significance in specific studies. Some suggestions are offered on how to redirect researchers' research and practices, so that their personal interests and those of science as such are better aligned.


1990 ◽  
Vol 24 (3) ◽  
pp. 405-415 ◽  
Author(s):  
Nathaniel McConaghy

Meta-analysis replaced statistical significance with effect size in the hope of resolving controversy concerning evaluation of treatment effects. Statistical significance measured reliability of the effect of treatment, not its efficacy. It was strongly influenced by the number of subjects investigated. Effect size, as originally assessed, eliminated this influence, but by standardizing the size of the treatment effect it could distort it. Meta-analyses which combine the results of studies which employ different subject types, outcome measures, treatment aims, no-treatment rather than placebo controls or therapists with varying experience can be misleading. To ensure discussion of these variables, meta-analyses should be used as an aid to rather than a substitute for literature review. While meta-analyses produce contradictory findings, it seems unwise to rely on the conclusions of an individual analysis. Their consistent finding that placebo treatments obtain markedly higher effect sizes than no treatment hopefully will render the use of untreated control groups obsolete.


2006 ◽  
Vol 37 (1) ◽  
pp. 3-14 ◽  
Author(s):  
STEPHEN KISELY ◽  
LESLIE ANNE CAMPBELL ◽  
ANITA SCOTT ◽  
NEIL J. PRESTON ◽  
JIANGUO XIAO

Background. There is limited randomized controlled trial (RCT) evidence for compulsory community treatment. Other study methods may clarify their effectiveness. We reviewed RCT and non-RCT evidence for the effect of compulsory community treatment on hospital admissions, bed-days, compliance and out-patient contacts. Method. A systematic review of RCTs, controlled before-and-after (CBA) studies, and interrupted time series (ITS) analyses. Meta-analysis of RCTs. Results. Eight papers covering five studies (two RCTs and three CBAs) met inclusion criteria (total n=1108). There was no statistical difference in 12-month admission rates between subjects on involuntary out-patient treatment and controls. Survival analyses of time to admission were equivocal. All five studies reported decreases in the number of bed-days following involuntary out-patient treatment but this only reached statistical significance in one situation; patients receiving the intervention were less likely to have admissions of over 100 days. There was no difference in treatment adherence between the intervention and control groups in either RCT or two of the CBA studies. However, the third CBA study reported a statistically significant increase of nearly five visits in the mean number of overall contacts in the involuntary out-patient treatment group. Conclusions. The evidence for involuntary out-patient treatment in reducing either admissions or bed-days is very limited. It therefore cannot be seen as a less restrictive alternative to admission. Other effects are uncertain. Evaluation of a wide range of outcomes should be included if this type of legislation is introduced.


2020 ◽  
Author(s):  
Michael W. Beets ◽  
R. Glenn Weaver ◽  
John P.A. Ioannidis ◽  
Alexis Jones ◽  
Lauren von Klinggraeff ◽  
...  

Background: Pilot/feasibility studies or studies with small sample sizes may be associated with inflated effects. This study explores the vibration of effect sizes (VoE) in meta-analyses when considering different inclusion criteria based upon sample size or pilot/feasibility status. Methods: Searches were conducted for meta-analyses of behavioral interventions on topics related to the prevention/treatment of childhood obesity from 01-2016 to 10-2019. The computed summary effect sizes (ES) were extracted from each meta-analysis. Individual studies included in the meta-analyses were classified into one of the following four categories: self-identified pilot/feasibility studies or based upon sample size (N≤100, N>100, and N>370, the upper 75th percentile of sample size). The VoE was defined as the absolute difference (ABS) between the re-estimations of summary ES restricted to study classifications compared to the originally reported summary ES. Concordance (kappa) of statistical significance between summary ES was assessed. Fixed and random effects models and meta-regressions were estimated. Three case studies are presented to illustrate the impact of including pilot/feasibility and N≤100 studies on the estimated summary ES. Results: A total of 1,602 effect sizes, representing 145 reported summary ES, were extracted from 48 meta-analyses containing 603 unique studies (avg. 22 per meta-analysis, range 2-108) and included 227,217 participants. Pilot/feasibility and N≤100 studies comprised 22% (0-58%) and 21% (0-83%) of studies. Meta-regression indicated that the ABS between the re-estimated and original summary ES was 0.29 where summary ES comprised ≥40% N≤100 studies. The ABS ES was 0.46 when summary ES comprised >80% of both pilot/feasibility and N≤100 studies. Where ≤40% of the studies comprising a summary ES had N>370, the ABS ES ranged from 0.20-0.30. Concordance was low when removing both pilot/feasibility and N≤100 studies (kappa=0.53) and restricting analyses only to the largest studies (N>370, kappa=0.35), with 20% and 26% of the originally reported statistically significant ES rendered non-significant. Reanalysis of the three case study meta-analyses resulted in re-estimated ES that were either non-significant or half of the originally reported ES. Conclusions: When meta-analyses of behavioral interventions include a substantial proportion of both pilot/feasibility and N≤100 studies, summary ES can be affected markedly and should be interpreted with caution.
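The ABS measure defined in this abstract can be illustrated with a minimal sketch. It assumes a simple fixed-effect (inverse-variance) summary as the re-estimation step and compares against a summary recomputed from all studies rather than against a published value; the abstract reports that both fixed and random effects models were used, so this is only one possible variant. The `Study` class, the function names, and the data are hypothetical and not taken from the study.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Study:
    es: float     # study-level effect size
    var: float    # sampling variance of the effect size
    n: int        # total sample size
    pilot: bool   # self-identified pilot/feasibility study


def fixed_effect_summary(studies: List[Study]) -> float:
    """Inverse-variance weighted (fixed-effect) summary effect size."""
    weights = [1.0 / s.var for s in studies]
    return sum(w * s.es for w, s in zip(weights, studies)) / sum(weights)


def abs_difference(
    studies: List[Study], keep: Callable[[Study], bool]
) -> Optional[float]:
    """Absolute difference between the summary ES re-estimated on the
    studies selected by `keep` and the summary ES from all studies."""
    subset = [s for s in studies if keep(s)]
    if not subset:
        return None  # nothing left to re-estimate from
    return abs(fixed_effect_summary(subset) - fixed_effect_summary(studies))


# Hypothetical data, for illustration only.
studies = [
    Study(es=0.8, var=0.04, n=40, pilot=True),
    Study(es=0.6, var=0.04, n=80, pilot=False),
    Study(es=0.2, var=0.01, n=400, pilot=False),
]

# Restrict to the largest studies (N > 370).
abs_large_only = abs_difference(studies, lambda s: s.n > 370)

# Drop pilot/feasibility studies and studies with N <= 100.
abs_no_small = abs_difference(studies, lambda s: not s.pilot and s.n > 100)
```

In this toy data set the small studies carry the larger effects, so both restrictions pull the summary ES downward; the ABS value quantifies how far.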


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Dawit Gebreegziabiher Hagos ◽  
Henk D. F. H. Schallig ◽  
Yazezew K. Kiros ◽  
Mahmud Abdulkadir ◽  
Dawit Wolday

Background: Visceral leishmaniasis (VL) is a severely neglected disease affecting millions of people, with high mortality if left untreated. In Ethiopia, the primary laboratory diagnosis of VL relies on rapid diagnostic tests (RDT) that use an antigen from a 39-amino acid sequence repeat of a kinesin-related protein (rK39) of the Leishmania donovani complex (L. donovani). Different rk39 RDT brands are available with highly variable performance, and studies from Ethiopia have shown a very wide range of sensitivity and specificity. Therefore, a systematic review and meta-analysis was conducted to determine the pooled sensitivity and specificity of rk39 RDT in Ethiopia. Method: PUBMED, EMBASE, and other sources were searched using predefined search terms to retrieve all relevant articles from 2007 to 2020. Heterogeneity was assessed by visually inspecting summary receiver operating characteristic (SROC) curves, the Spearman correlation coefficient (rs), Cochran's Q test statistic, the I² inconsistency statistic, and subgroup analysis. The presence and statistical significance of publication bias were assessed by Egger's test at p < 0.05, and all the measurements showed the presence of considerable heterogeneity. The Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) checklist was used to assess the quality of the included studies. Results: A total of 664 articles were retrieved, and of these, 12 articles were included in the meta-analysis. Overall pooled sensitivity and specificity of the rk39 RDT to diagnose VL in Ethiopia were 88.0% (95% CI 86.0% to 89.0%) and 84.0% (95% CI 82.0% to 86.0%), respectively. The sensitivity and specificity of the rk39 RDT commercial test kits were DiaMed: 86.9% (95% CI 84.3% to 89.1%) and 82.2% (95% CI 79.3% to 85.0%), and InBios: 80.0% (95% CI 77.0% to 82.8%) and 97.4% (95% CI 95.0% to 98.8%), respectively. Conclusion: Based on our results, the rk39 RDT can be considered an essential rapid diagnostic test for VL diagnosis. In addition to its diagnostic accuracy, the test is easy to perform, quick (10–20 min), cheap, equipment-free, independent of electricity and a cold chain, and reproducible; the rk39 RDT should therefore remain in practice as a diagnostic test, at least in remote VL-endemic localities, until a better test becomes available.


2005 ◽  
Vol 77 (1) ◽  
pp. 45-76 ◽  
Author(s):  
Lee-Ann C. Hayek ◽  
W. Ronald Heyer

Several analytic techniques have been used to determine sexual dimorphism in vertebrate morphological measurement data with no emergent consensus on which technique is superior. A further confounding problem for frog data is the existence of considerable measurement error. To determine dimorphism, we examine a single hypothesis (Ho = equal means) for two groups (females and males). We demonstrate that frog measurement data meet assumptions for clearly defined statistical hypothesis testing with statistical linear models rather than those of exploratory multivariate techniques such as principal components, correlation or correspondence analysis. In order to distinguish biological from statistical significance of hypotheses, we propose a new protocol that incorporates measurement error and effect size. Measurement error is evaluated with a novel measurement error index. Effect size, widely used in the behavioral sciences and in meta-analysis studies in biology, proves to be the most useful single metric to evaluate whether statistically significant results are biologically meaningful. Definitions for a range of small, medium, and large effect sizes specifically for frog measurement data are provided. Examples with measurement data for species of the frog genus Leptodactylus are presented. The new protocol is recommended not only to evaluate sexual dimorphism for frog data but for any animal measurement data for which the measurement error index and observed or a priori effect sizes can be calculated.


2005 ◽  
Vol 68 (9) ◽  
pp. 1884-1894 ◽  
Author(s):  
SUMEET R. PATIL ◽  
SHERYL CATES ◽  
ROBERTA MORALES

Risk communication and consumer education to promote safer handling of food can be the best way of managing the risk of foodborne illness at the consumer end of the food chain. Thus, an understanding of the overall status of food handling knowledge and practices is needed. Although traditional qualitative reviews can be used for combining information from several studies on specific food handling behaviors, a structured approach of meta-analysis can be more advantageous in a holistic assessment. We combined findings from 20 studies using meta-analysis methods to estimate percentages of consumers engaging in risky behaviors, such as consumption of raw food, poor hygiene, and cross-contamination, separated by various demographic categories. We estimated standard errors to reflect sampling error and between-study random variation. Then we evaluated the statistical significance of differences in behaviors across demographic categories and across behavioral measures. There were considerable differences in behaviors across demographic categories, possibly because of socioeconomic and cultural differences. For example, compared with women, men reported greater consumption of raw or undercooked foods, poorer hygiene, poorer practices to prevent cross-contamination, and less safe defrosting practices. Mid-age adults consumed more raw food (except milk) than did young adults and seniors. High-income individuals reported greater consumption of raw foods, less knowledge of hygiene, and poorer cross-contamination practices. The highest raw ground beef and egg consumption and the poorest hygiene and cross-contamination practices were found in the U.S. Mountain region. Meta-analysis was useful for identifying important data gaps and demographic groups with risky behaviors, and this information can be used to prioritize further research.


2020 ◽  
Vol 9 (2) ◽  
pp. 206-224
Author(s):  
Manuel Alcaraz-Ibáñez ◽  
Adrian Paterna ◽  
Álvaro Sicilia ◽  
Mark D. Griffiths

Background and aims: This study examined the relationship between self-reported symptoms of morbid exercise behaviour (MEB) and eating disorders (ED) using meta-analytic techniques. Methods: We systematically searched MEDLINE, PsycINFO, Web of Science, SciELO and Scopus. Random effects models were used to compute pooled effect size estimates (r). The robustness of the summarized estimates was examined through sensitivity analyses by removing studies one at a time. Results: Sixty-six studies comprising 135 effect sizes (N = 21,816) were included. The results revealed: (a) a small-sized relationship in the case of bulimic symptoms (r = 0.19), (b) small- (r = 0.28) to medium-sized relationships (r = 0.41) in the case of body/eating concerns, and (c) medium-sized relationships in the case of overall ED symptoms (r = 0.35) and dietary restraint (r = 0.42). Larger effect sizes were observed in the case of overall ED symptoms in clinical, younger, and thinner populations, as well as when employing a continuously-scored instrument for assessing ED or the Compulsive Exercise Test for assessing MEB. Larger effect sizes were also found in female samples when the ED outcome was dietary restraint. Conclusions: The identified gaps in the literature suggest that future research on the topic may benefit from: (a) considering a range of clinical (in terms of diagnosed ED) and non-clinical populations from diverse exercise modalities, (b) addressing a wide range of ED symptomatology, and (c) employing longitudinal designs that clarify the temporal direction of the relationship under consideration.
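As an illustration of how a random-effects pooled correlation and a leave-one-out sensitivity analysis of the kind mentioned in this abstract might be computed, here is a minimal sketch. The abstract does not state which estimator was used; the sketch assumes Fisher's z transformation with the DerSimonian-Laird estimate of between-study variance, which is one common choice. The function names and inputs are hypothetical.

```python
import math
from typing import List, Tuple


def pool_correlations(rs: List[float], ns: List[int]) -> Tuple[float, float]:
    """Random-effects pooled correlation (assumed approach: Fisher z
    transformation plus DerSimonian-Laird between-study variance).

    Returns (pooled r, tau^2 on the Fisher z scale).
    """
    zs = [math.atanh(r) for r in rs]       # Fisher z transform
    vs = [1.0 / (n - 3) for n in ns]       # sampling variance of z
    w = [1.0 / v for v in vs]              # fixed-effect weights

    z_fixed = sum(wi * zi for wi, zi in zip(w, zs)) / sum(w)
    q = sum(wi * (zi - z_fixed) ** 2 for wi, zi in zip(w, zs))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    k = len(rs)

    # DerSimonian-Laird estimate, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    w_re = [1.0 / (v + tau2) for v in vs]  # random-effects weights
    z_re = sum(wi * zi for wi, zi in zip(w_re, zs)) / sum(w_re)
    return math.tanh(z_re), tau2           # back-transform to r


def leave_one_out(rs: List[float], ns: List[int]) -> List[float]:
    """Pooled r re-estimated with each study removed in turn."""
    return [
        pool_correlations(rs[:i] + rs[i + 1:], ns[:i] + ns[i + 1:])[0]
        for i in range(len(rs))
    ]
```

If the leave-one-out estimates stay close to the full pooled value, no single study is driving the result; a large shift flags an influential study.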

