How Powerful is the Evidence in Criminology? On Whether We Should Fear a Coming Crisis of Confidence

2017 ◽  
Author(s):  
J.C. Barnes

A crisis of confidence has struck the behavioral and social sciences. A key factor driving the crisis is the low levels of statistical power in many studies. Low power is problematic because it leads to increased rates of false-negative results, inflated false-discovery rates, and overestimated effect sizes. To determine whether these issues affect criminology, we computed estimates of statistical power by drawing 322 mean effect sizes and 271 average sample sizes from 81 meta-analyses. Results indicated that criminological studies have, on average, a moderate level of power (mean = 0.605), but there is variability. This variability is observed across general studies as well as those designed to test interventions. Studies using macro-level data tend to have lower power than studies using individual-level data. To avoid a crisis of confidence, criminologists must not ignore statistical power and should be skeptical of large effects found in studies with small samples.
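To make the calculation concrete, here is a minimal sketch (not the authors' exact procedure) of how power can be approximated from a meta-analytic mean effect size and an average sample size, assuming a two-sided independent-samples t-test at alpha = .05; the numbers are hypothetical placeholders, not values from the paper:

```python
# Sketch: approximate a priori power for a study, given a meta-analytic
# mean effect size and an average per-group sample size.
from statsmodels.stats.power import TTestIndPower

mean_d = 0.30          # hypothetical meta-analytic mean effect size (Cohen's d)
avg_n_per_group = 75   # hypothetical average group size

power = TTestIndPower().power(effect_size=mean_d,
                              nobs1=avg_n_per_group,
                              ratio=1.0,
                              alpha=0.05,
                              alternative='two-sided')
print(f"Approximate power: {power:.3f}")
```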

2017 ◽  
Author(s):  
Josine Min ◽  
Gibran Hemani ◽  
George Davey Smith ◽  
Caroline Relton ◽  
Matthew Suderman

Abstract Background Technological advances in high-throughput DNA methylation microarrays have allowed dramatic growth of a new branch of epigenetic epidemiology. DNA methylation datasets are growing ever larger in terms of the number of samples profiled, the extent of genome coverage, and the number of studies being meta-analysed. Novel computational solutions are required to efficiently handle these data. Methods We have developed meffil, an R package designed to quality control, normalize and perform epigenome-wide association studies (EWAS) efficiently on large samples of Illumina Infinium HumanMethylation450 and MethylationEPIC BeadChip microarrays. We tested meffil by applying it to 6000 450k microarrays generated from blood collected for two different datasets, the Accessible Resource for Integrative Epigenomic Studies (ARIES) and The Genetics of Overweight Young Adults (GOYA) study. Results A complete reimplementation of functional normalization minimizes computational memory requirements to 5% of that required by other R packages, without increasing running time. Incorporating fixed and random effects alongside functional normalization, together with automated estimation of functional normalization parameters, reduces technical variation in DNA methylation levels, thus reducing false positive associations and improving power. We also demonstrate that the ability to normalize datasets distributed across physically different locations, without sharing any biologically-based individual-level data, may reduce heterogeneity in meta-analyses of epigenome-wide association studies. However, we show that when batch is perfectly confounded with cases and controls, functional normalization is unable to prevent spurious associations. Conclusions meffil is available online (https://github.com/perishky/meffil/) along with tutorials covering typical use cases.
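meffil's R interface is not reproduced here; the following Python/numpy sketch only illustrates the core idea behind functional normalization, assuming hypothetical arrays `quantiles` (per-sample signal quantiles) and `control_probes` (per-sample control-probe intensities): technical variation captured by control-probe principal components is regressed out of the signal quantiles.

```python
import numpy as np

def functional_normalize_quantiles(quantiles, control_probes, n_pcs=10):
    """Sketch of functional normalization: remove variation in signal
    quantiles that is predictable from control-probe principal components.
    quantiles:      (n_samples, n_quantiles) per-sample signal quantiles
    control_probes: (n_samples, n_controls) control-probe intensities
    """
    # Principal components of the control probes capture technical variation.
    X = control_probes - control_probes.mean(axis=0)
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    pcs = U[:, :n_pcs] * S[:n_pcs]                 # sample scores on top PCs

    # Regress each quantile level on the PCs and keep the residuals
    # plus the original mean level of each quantile.
    design = np.column_stack([np.ones(len(pcs)), pcs])
    beta, *_ = np.linalg.lstsq(design, quantiles, rcond=None)
    fitted = design @ beta
    return quantiles - fitted + quantiles.mean(axis=0)
```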


2017 ◽  
Author(s):  
Ulrich Schimmack ◽  
Jerry Brunner

In recent years, the replicability of original findings published in psychology journals has been questioned. A key concern is that selection for significance inflates observed effect sizes and observed power. If selection bias is severe, replication studies are unlikely to reproduce a significant result. We introduce z-curve as a new method that can estimate the average true power for sets of studies that are selected for significance. We compare this method with p-curve, which has been used for the same purpose. Simulation studies show that both methods perform well when all studies have the same power, but p-curve overestimates power if power varies across studies. Based on these findings, we recommend z-curve to estimate power for sets of studies that are heterogeneous and selected for significance. Application of z-curve to various datasets suggests that the average replicability of published results in psychology is approximately 50%, but there is substantial heterogeneity and many psychological studies remain underpowered and are likely to produce false negative results. To increase replicability and credibility of published results it is important to reduce selection bias and to increase statistical power.
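The selection problem that z-curve addresses can be illustrated with a short simulation (this is not the z-curve estimator itself; the true effect size, group size, and number of simulated studies are arbitrary): when only significant results are "published", observed effect sizes are inflated relative to the truth.

```python
# Sketch: simulate how selection for significance inflates observed effects.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_d, n = 0.2, 50                      # hypothetical true effect and group size
d_obs = []
for _ in range(20_000):
    a = rng.normal(true_d, 1, n)
    b = rng.normal(0.0, 1, n)
    t, p = stats.ttest_ind(a, b)
    if p < 0.05 and t > 0:               # journals "select" significant results
        d_obs.append((a.mean() - b.mean()) /
                     np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2))

# Share of studies selected approximates the true power of this design.
print(f"True power ~ {len(d_obs) / 20_000:.2f}, true d = {true_d}, "
      f"mean published d = {np.mean(d_obs):.2f}")
```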


2013 ◽  
Vol 168 (2) ◽  
pp. 1102-1107 ◽  
Author(s):  
Zaina AlBalawi ◽  
Finlay A. McAlister ◽  
Kristian Thorlund ◽  
Michelle Wong ◽  
Jørn Wetterslev

2015 ◽  
Author(s):  
Guo-Bo Chen ◽  
Sang Hong Lee ◽  
Matthew R Robinson ◽  
Maciej Trzaskowski ◽  
Zhi-Xiang Zhu ◽  
...  

Genome-wide association studies (GWASs) have been successful in discovering replicable SNP-trait associations for many quantitative traits and common diseases in humans. Typically the effect sizes of SNP alleles are very small, and this has led to large genome-wide association meta-analyses (GWAMA) to maximize statistical power. A trend towards ever-larger GWAMA is likely to continue, yet dealing with summary statistics from hundreds of cohorts increases logistical and quality control problems, including unknown sample overlap, and these can lead to both false positive and false negative findings. In this study we propose a new set of metrics and visualization tools for GWAMA, using summary statistics from cohort-level GWASs. We propose a pair of methods for examining the concordance between demographic information and summary statistics. In method I, we use the population genetics Fst statistic to verify the genetic origin of each cohort and its geographic location, and demonstrate using GWAMA data from the GIANT Consortium that geographic locations of cohorts can be recovered and outlier cohorts can be detected. In method II, we conduct principal component analysis of reported allele frequencies, which is able to recover the ancestral information for each cohort. In addition, we propose a new statistic that uses the reported allelic effect sizes and their standard errors to identify significant sample overlap or heterogeneity between pairs of cohorts. Finally, to quantify unknown sample overlap across all pairs of cohorts, we propose a method based on randomly generated genetic predictors that does not require sharing individual-level genotype data and does not breach individual privacy.
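A minimal sketch of the method II idea, under the assumption that only cohort-level reported allele frequencies are available (the array names and shapes below are hypothetical; the published method may differ in detail):

```python
# Sketch: PCA of cohort-level reported allele frequencies to recover
# ancestry clustering of cohorts from summary statistics alone.
import numpy as np

def cohort_frequency_pca(freqs, n_pcs=2):
    """freqs: (n_cohorts, n_snps) array of reported allele frequencies."""
    X = freqs - freqs.mean(axis=0)          # center each SNP across cohorts
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :n_pcs] * S[:n_pcs]         # cohort scores on the top PCs

# Cohorts with similar ancestry should land close together in PC space;
# outlier cohorts (e.g., mislabelled ancestry) stand apart.
```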


2020 ◽  
Vol 63 (5) ◽  
pp. 1572-1580
Author(s):  
Laura Gaeta ◽  
Christopher R. Brydges

Purpose The purpose was to examine effect size distributions reported in published audiology and speech-language pathology research in order to provide researchers and clinicians with more relevant guidelines for the interpretation of potentially clinically meaningful findings. Method Cohen's d, Hedges' g, Pearson r, and sample sizes (n = 1,387) were extracted from 32 meta-analyses in speech-language pathology and audiology journals. Percentile ranks (25th, 50th, 75th) were calculated to determine estimates for small, medium, and large effect sizes, respectively. The median sample size was also used to explore statistical power for small, medium, and large effect sizes. Results For individual differences research, effect sizes of Pearson r = .24, .41, and .64 were found. For group differences, Cohen's d/Hedges' g = 0.25, 0.55, and 0.93. These values can be interpreted as small, medium, and large effect sizes in speech-language pathology and audiology. The majority of published research was inadequately powered to detect a medium effect size. Conclusions Effect size interpretations from published research in audiology and speech-language pathology were found to be underestimated based on Cohen's (1988, 1992) guidelines. Researchers in the field should consider using Pearson r = .25, .40, and .65 and Cohen's d/Hedges' g = 0.25, 0.55, and 0.95 as small, medium, and large effect sizes, respectively, and collect larger sample sizes to ensure that both significant and nonsignificant findings are robust and replicable.
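The percentile-based benchmarking and the power check at the median sample size can be sketched as follows; the effect sizes and group size below are hypothetical placeholders, not the study's data, and an independent-samples t-test is assumed for the power calculation:

```python
# Sketch: derive field-specific effect size benchmarks from published effects
# (25th/50th/75th percentiles) and check power at a median group size.
import numpy as np
from statsmodels.stats.power import TTestIndPower

published_d = np.array([0.10, 0.22, 0.31, 0.48, 0.55, 0.70, 0.95, 1.10])
small, medium, large = np.percentile(published_d, [25, 50, 75])

median_n_per_group = 20   # hypothetical median group size
power = TTestIndPower().power(effect_size=medium, nobs1=median_n_per_group,
                              ratio=1.0, alpha=0.05)
print(f"Benchmarks: small={small:.2f}, medium={medium:.2f}, large={large:.2f}; "
      f"power for the medium effect at n={median_n_per_group}/group: {power:.2f}")
```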


Author(s):  
Richard J. Glassock ◽  
Daniel C. Cattran

The literature on the subject of treatment of glomerular disease is immense (over 15,000 articles in PubMed as of July 2008). Negotiating this broad and complex panorama can be a difficult task, especially when evaluating the best evidence for a particular treatment strategy for a specific disease entity occurring in an individual patient. Perfection is not attainable in clinical trials of therapy, and every report has some pitfall or limitation. Some studies, however, stand out as excellent examples of design and execution. Unfortunately, in the field of treatment of glomerular disease such studies are relatively uncommon. The good news is that well designed and executed studies of treatment of primary glomerular disease are being reported with increasing frequency in recent years. This has occurred in part because of increased collaboration among groups interested in furthering knowledge in this important area of inquiry, but also because of better recognition of the deficiencies of past efforts to study treatment of glomerular disease in clinical trials. Many interinstitutional collaborative studies have been aided by improvements in trial design and by more complete descriptions of the natural history of untreated disease. One of the main weaknesses of clinical studies of therapy in primary glomerular disease is the small number of subjects studied in individual reports. This increases the risks of confounding and of both false positive and false negative results. The purpose of this chapter is to provide a concise analysis of the strengths and weaknesses of the various approaches to the study of therapeutic efficacy and safety of agents used in primary glomerular disease. The focus will be on observational studies, controlled clinical trials, and meta-analyses of published reports. The specific aim is to equip the discerning reader with an improved understanding of the evidence base for therapy of primary glomerular disease. The details of the specific reports, and how they can be integrated into an 'evidence-based' approach to therapeutic decision-making, are dealt with in the chapters devoted to specific disease entities that follow.


Author(s):  
Yayouk E. Willems ◽  
Jian-bin Li ◽  
Anne M. Hendriks ◽  
Meike Bartels ◽  
Catrin Finkenauer

Theoretical studies propose an association between family violence and low self-control in adolescence, yet empirical findings on this association are inconclusive. The aim of the present research was to systematically summarize available findings on the relation between family violence and self-control across adolescence. We included 27 studies with 143 effect sizes, representing more than 25,000 participants from eight countries, spanning early to late adolescence. Applying a multilevel meta-analysis, which takes dependency between effect sizes into account while retaining statistical power, we examined the magnitude and direction of the overall effect size. Additionally, we investigated whether theoretical moderators (e.g., age, gender, country) and methodological moderators (cross-sectional/longitudinal design, informant) influenced the magnitude of the association between family violence and self-control. Our results revealed that family violence and self-control have a small to moderate, significant negative association (r = -.191). This association did not vary across gender, country, or informants. The strength of the association, however, decreased with age and in longitudinal studies. This finding provides evidence that researchers and clinicians may expect low self-control in the wake of family violence, especially in early adolescence. Recommendations for future research in the area are discussed.
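The pooling step can be illustrated with a simple random-effects (DerSimonian-Laird) sketch; note that the paper used a multilevel model to handle dependent effect sizes within studies, which this sketch deliberately ignores, and the correlations and sample sizes below are invented for illustration:

```python
# Sketch: pooling correlations with a DerSimonian-Laird random-effects model.
import numpy as np

def random_effects_pool(r, n):
    """r: array of study correlations; n: array of study sample sizes."""
    z = np.arctanh(r)                 # Fisher z-transform
    v = 1.0 / (n - 3)                 # sampling variance of z
    w = 1.0 / v
    # DerSimonian-Laird estimate of between-study variance tau^2
    z_fixed = np.sum(w * z) / np.sum(w)
    Q = np.sum(w * (z - z_fixed) ** 2)
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (Q - (len(z) - 1)) / c)
    # Random-effects weights and pooled estimate, back-transformed to r
    w_star = 1.0 / (v + tau2)
    return np.tanh(np.sum(w_star * z) / np.sum(w_star))

print(random_effects_pool(np.array([-0.10, -0.25, -0.20]),
                          np.array([200, 500, 350])))
```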


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Tamar Sofer ◽  
Xiuwen Zheng ◽  
Cecelia A. Laurie ◽  
Stephanie M. Gogarten ◽  
Jennifer A. Brody ◽  
...  

Abstract In modern Whole Genome Sequencing (WGS) epidemiological studies, participant-level data from multiple studies are often pooled and results are obtained from a single analysis. We consider the impact of differential phenotype variances by study, which we term 'variance stratification'. Unaccounted for, variance stratification can lead to both decreased statistical power and increased false-positive rates, depending on how allele frequencies, sample sizes, and phenotypic variances vary across the studies that are pooled. We develop a procedure to compute variant-specific inflation factors and show how it can be used to diagnose genetic association analyses of pooled individual-level data from multiple studies. We describe a WGS-appropriate analysis approach, implemented in freely available software, which allows for study-specific variances and thereby improves performance in practice. We illustrate the variance stratification problem, its solutions, and the proposed diagnostic procedure in simulations and in data from the Trans-Omics for Precision Medicine Whole Genome Sequencing Program (TOPMed), used in association tests for hemoglobin concentrations and BMI.
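A small simulation can show the phenomenon; this is purely illustrative and is not the paper's variant-specific inflation-factor formula or the TOPMed pipeline (all sample sizes, variances, and frequencies below are made up):

```python
# Sketch: two pooled studies with different phenotype variances; a variant
# whose allele is common in the high-variance study shows an inflated naive
# pooled test at null variants.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n1, n2 = 2000, 2000
sd1, sd2 = 1.0, 3.0                      # phenotype SD differs by study
y = np.concatenate([rng.normal(0, sd1, n1), rng.normal(0, sd2, n2)])

chi2_stats = []
for _ in range(2000):                    # many independent null variants
    maf1, maf2 = 0.02, 0.30              # allele common in the high-variance study
    g = np.concatenate([rng.binomial(2, maf1, n1), rng.binomial(2, maf2, n2)])
    res = stats.linregress(g, y)         # naive pooled regression, one residual variance
    chi2_stats.append((res.slope / res.stderr) ** 2)

lam = np.median(chi2_stats) / stats.chi2.ppf(0.5, df=1)
print(f"Inflation of the naive pooled test (genomic-control style): {lam:.2f}")
# Flipping the frequencies (common allele in the low-variance study) gives
# deflation instead, i.e., lost power rather than false positives.
```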


2009 ◽  
Vol 24 (S1) ◽  
pp. 1-1
Author(s):  
A. Leon

Dr. Leon will present the biostatistical considerations that contribute to clinical trial design and the strategies that enhance signal detection. These include minimizing bias in the estimate of treatment effect while maintaining a nominal level of type I error (i.e., false positive results) and maintaining sufficient statistical power (i.e., reducing the likelihood of false negative results). Particular attention will be paid to reducing the problems of attrition and the hazards of multiplicity. Methods to examine moderators of the treatment effect will also be explored. Examples from psychopharmacologic and psychotherapy trials for the treatment of depression and panic disorder will be provided to illustrate these issues. Following the didactic session, participants will be encouraged to bring forward their own questions regarding clinical trial design for a 45-minute interactive discussion with the presenters. The objectives of the workshop are to improve the participants' understanding of the goals of clinical trial design and the methods to achieve those goals, in order to improve their own research techniques, grantsmanship, and ability to judge more accurately the results of studies presented in the literature.


2018 ◽  
Vol 75 (6) ◽  
pp. 443-445 ◽  
Author(s):  
John P A Ioannidis

Objectives Meta-analyses are generally considered the highest level of evidence, but concerns have been voiced about their massive, low-quality production. This paper aimed to evaluate the landscape of meta-analyses in the field of occupational and environmental health and medicine. Methods Using relevant search terms, all meta-analyses in the field were identified, and those published in 2015 were further assessed for their origin, whether they included randomised trials and individual-level data, and whether they had authors from industry or consultancy firms. Results PubMed searches (last update February 2017) identified 1251 eligible meta-analyses in this field. There was a rapid increase over time (n=16 published in 1995 vs n=163 published in 2015). Of the 163 eligible meta-analyses published in 2015, 49 were from China, followed at a distance by the USA (n=19). Only 16 considered randomised (intervention) trials and 13 included individual-level data. Only 1 of the 150 meta-analyses had industry authors and none had consultancy firm authors. As an example of conflicting findings, 12 overlapping meta-analyses addressed mobile phones and brain cancer risk, and they differed substantially in the number of studies included, eligibility criteria, and conclusions. Conclusions There has been a major increase in the publication of meta-analyses in occupational and environmental health over time, with the majority of these studies focusing on observational data, while a commendable fraction used individual-level data. Authorship is still limited largely to academic and non-profit authors. With massive production of meta-analyses, redundancy needs to be anticipated and efforts should be made to safeguard quality and protect against bias.

