Substantive Importance and the Veil of Statistical Significance

Abstract: Political science is gradually moving away from an exclusive focus on statistical significance and toward an emphasis on the magnitude and importance of effects. While we welcome this change, we argue that the current practice of “magnitude-and-significance,” in which researchers only interpret the magnitude of a statistically significant point estimate, barely improves the much-maligned “sign-and-significance” approach, in which researchers focus only on the statistical significance of an estimate. This exclusive focus on the point estimate hides the uncertainty behind a veil of statistical significance. Instead, we encourage researchers to explicitly account for uncertainty by interpreting the range of values contained in the confidence interval. Especially when making judgments about the importance of estimated effects, we advise researchers to make empirical claims if and only if those claims hold for the entire confidence interval.
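
As a rough illustration of the rule proposed in this abstract, the sketch below checks a claim against every value in the confidence interval rather than against the point estimate alone. The function name, the threshold `m` (the smallest effect treated as substantively meaningful), and the example numbers are assumptions made for illustration and are not taken from the paper.

```python
def classify_effect(ci_lower, ci_upper, m):
    """Classify an effect using the whole confidence interval.

    m is a hypothetical threshold: the smallest effect (in absolute
    value) that would count as substantively meaningful.
    """
    if ci_lower > ci_upper or m <= 0:
        raise ValueError("need ci_lower <= ci_upper and m > 0")
    if ci_lower >= m:
        # Every value in the interval is at least m.
        return "meaningful positive effect"
    if ci_upper <= -m:
        # Every value in the interval is at most -m.
        return "meaningful negative effect"
    if -m < ci_lower and ci_upper < m:
        # Every value in the interval is smaller than m in absolute value.
        return "negligible effect"
    # The interval contains both meaningful and negligible values.
    return "inconclusive"


# Illustrative numbers only.
print(classify_effect(0.02, 0.90, m=0.25))   # inconclusive
print(classify_effect(0.30, 0.90, m=0.25))   # meaningful positive effect
print(classify_effect(-0.10, 0.12, m=0.25))  # negligible effect
```

The first call is the case the abstract warns about: the interval excludes zero, so the estimate is statistically significant, yet it also contains values too small to matter under the assumed threshold.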

Equivalence Testing for Regression Discontinuity Designs

Political Analysis ◽

10.1017/pan.2020.43 ◽

2020 ◽

pp. 1-17

Author(s):

Erin Hartman

Keyword(s):

Political Science ◽

Current Practice ◽

Regression Discontinuity ◽

Regression Function ◽

Superior Performance ◽

Equivalence Testing ◽

Simulation Studies ◽

Equivalence Tests ◽

Regression Discontinuity Designs ◽

Testing Approach

Abstract: Regression discontinuity (RD) designs are increasingly common in political science. They have many advantages, including a known and observable treatment assignment mechanism. The literature has emphasized the need for “falsification tests” and ways to assess the validity of the design. When implementing RD designs, researchers typically rely on two falsification tests, based on empirically testable implications of the identifying assumptions, to argue the design is credible. These tests, one for continuity in the regression function for a pretreatment covariate, and one for continuity in the density of the forcing variable, use a null of no difference in the parameter of interest at the discontinuity. Common practice can, incorrectly, conflate a failure to reject evidence of a flawed design with evidence that the design is credible. The well-known equivalence testing approach addresses these problems, but how to implement equivalence tests in the RD framework is not straightforward. This paper develops two equivalence tests tailored for RD designs that allow researchers to provide statistical evidence that the design is credible. Simulation studies show the superior performance of equivalence-based tests over tests-of-difference, as used in current practice. The tests are applied to the close elections RD data presented in Eggers et al. (2015b).
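
To make the general idea of equivalence testing concrete, here is a minimal sketch of the generic two one-sided tests (TOST) procedure under a normal approximation. It is not the RD-specific tests developed in the paper; the function name, the equivalence margin `epsilon`, and the example numbers are assumptions chosen for illustration.

```python
from statistics import NormalDist


def tost_equivalence(estimate, std_error, epsilon, alpha=0.05):
    """Generic TOST with a normal approximation (illustrative sketch).

    Null hypothesis: |true effect| >= epsilon.
    Alternative:     |true effect| <  epsilon.
    Returns (p_value, equivalent), where p_value is the larger of the
    two one-sided p-values.
    """
    if std_error <= 0 or epsilon <= 0:
        raise ValueError("std_error and epsilon must be positive")
    z = NormalDist()
    # One-sided test against the lower margin (effect <= -epsilon).
    p_lower = 1 - z.cdf((estimate + epsilon) / std_error)
    # One-sided test against the upper margin (effect >= +epsilon).
    p_upper = z.cdf((estimate - epsilon) / std_error)
    p_value = max(p_lower, p_upper)
    return p_value, p_value < alpha


# Hypothetical covariate discontinuity estimates with the same point
# estimate but different precision.
print(tost_equivalence(0.01, 0.02, epsilon=0.1))  # precise: equivalence supported
print(tost_equivalence(0.01, 0.20, epsilon=0.1))  # noisy: equivalence not supported
```

In the second call a conventional test of difference would also fail to reject, which shows why a non-significant difference alone is not evidence that the discontinuity is negligible.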

The Safety of COVID-19 Vaccinations—We Should Rethink the Policy

Vaccines ◽

10.3390/vaccines9070693 ◽

2021 ◽

Vol 9 (7) ◽

pp. 693

Author(s):

Harald Walach ◽

Rainer J. Klement ◽

Wouter Aukema

Keyword(s):

Side Effects ◽

Confidence Interval ◽

Field Study ◽

Adverse Reactions ◽

Safety Data ◽

European Medicines Agency ◽

Point Estimate ◽

Drug Reactions ◽

Vaccination Policy ◽

Risks And Benefits

Background: COVID-19 vaccines have had expedited reviews without sufficient safety data. We wanted to compare risks and benefits. Method: We calculated the number needed to vaccinate (NNTV) from a large Israeli field study to prevent one death. We accessed the Adverse Drug Reactions (ADR) database of the European Medicines Agency and of the Dutch National Register (lareb.nl) to extract the number of cases reporting severe side effects and the number of cases with fatal side effects. Result: The NNTV is between 200–700 to prevent one case of COVID-19 for the mRNA vaccine marketed by Pfizer, while the NNTV to prevent one death is between 9000 and 50,000 (95% confidence interval), with 16,000 as a point estimate. The number of cases experiencing adverse reactions has been reported to be 700 per 100,000 vaccinations. Currently, we see 16 serious side effects per 100,000 vaccinations, and the number of fatal side effects is at 4.11/100,000 vaccinations. For three deaths prevented by vaccination we have to accept two inflicted by vaccination. Conclusions: This lack of clear benefit should cause governments to rethink their vaccination policy.

Improving Data Analysis in Political Science

World Politics ◽

10.2307/2009670 ◽

1969 ◽

Vol 21 (4) ◽

pp. 641-654 ◽

Cited By ~ 28

Author(s):

Edward R. Tufte

Keyword(s):

Data Analysis ◽

Measurement Error ◽

Political Science ◽

Statistical Significance ◽

Large Body ◽

Small Collection

Students of politics use statistical and quantitative techniques to: summarize a large body of numbers into a small collection of typical values; confirm (and perhaps sanctify) the results of the analysis by using tests of statistical significance that help protect against sampling and measurement error; discover what's going on in their data and expose some new relationships; and inform their audience what's going on in the data.

Protozoa enumeration via microscope – some remarks on methodology

E3S Web of Conferences ◽

10.1051/e3sconf/20184400121 ◽

2018 ◽

Vol 44 ◽

pp. 00121

Author(s):

Sara Nicpoń ◽

Paula Iliaszewicz ◽

Maciej Leoniak ◽

Agnieszka Trusz-Zdybek

Keyword(s):

Confidence Interval ◽

Activated Sludge ◽

Sharp Increase ◽

Statistical Significance ◽

Lower Number ◽

Average Confidence ◽

Statistical Results

For proper enumeration of protozoa in activated sludge, good methodology is required. In this paper we present some remarks on the microscopic methodology of protozoa enumeration. These remarks concern the number of repetitions from one sample required to obtain good statistical results, as well as the influence of sample aeration on the number of protozoa found. The presented data show that at least 10 repetitions are required from each sample to obtain a low average confidence interval. A lower number of repetitions leads to a sharp increase in the average confidence interval and loss of statistical significance, while a higher number does not decrease the average confidence interval substantially. Because measurements last for a few hours, lack of sample aeration during measurement leads to detection of 27% fewer protozoa.
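
The pattern described in this abstract is what a standard t-based confidence interval for a mean would predict. The sketch below is an illustration under that assumption (independent, roughly normal counts); it does not use the paper's data, and the function name is hypothetical. It expresses the 95% half-width in units of the sample standard deviation for a few repetition counts.

```python
import math

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
T_CRIT_95 = {2: 4.303, 4: 2.776, 9: 2.262, 19: 2.093}


def relative_half_width(n):
    """Half-width of the 95% CI for a mean, in units of the sample SD.

    Only the repetition counts covered by T_CRIT_95 are supported.
    """
    return T_CRIT_95[n - 1] / math.sqrt(n)


for n in (3, 5, 10, 20):
    print(n, round(relative_half_width(n), 2))
```

Under these assumptions the interval widens far more when going from 10 repetitions down to 3 than it narrows when going from 10 up to 20.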

What can we learn from neurocognitive approaches to political science? : A critical review of the veil of ignorance experiment in neuropolitics

The Annuals of Japanese Political Science Association ◽

10.7218/nenpouseijigaku.68.2_173 ◽

2017 ◽

Vol 68 (2) ◽

pp. 2_173-2_203

Author(s):

Junko KATO ◽

Shiro SAKAIYA ◽

Hirofumi TAKESUE

Keyword(s):

Political Science ◽

Critical Review ◽

Veil Of Ignorance ◽

The Veil

Perioperative Glucocorticoid Therapy for Patients with Adrenal Insufficiency: Dosing Based on Pharmacokinetic Data

The Journal of Clinical Endocrinology & Metabolism ◽

10.1210/clinem/dgaa042 ◽

2020 ◽

Vol 105 (3) ◽

pp. e753-e761 ◽

Cited By ~ 4

Author(s):

Baha M Arafah

Keyword(s):

Adrenal Insufficiency ◽

Half Life ◽

Current Practice ◽

Volume Of Distribution ◽

Glucocorticoid Therapy ◽

Major Surgery ◽

Pharmacokinetic Data ◽

Healthy Individuals ◽

Hormone Injection ◽

Range Of Values

Background: Perioperative glucocorticoid therapy for patients with adrenal insufficiency (AI) is currently based on anecdotal reports, without supporting pharmacokinetic data. Methods: We determined the half-life, clearance, and volume of distribution of 2 consecutive intravenously (IV)-administered doses of hydrocortisone (15 or 25 mg every 6 hours) to 22 dexamethasone-suppressed healthy individuals and used the data to develop a novel protocol to treat 68 patients with AI who required surgical procedures. Patients received 20 mg of hydrocortisone orally 2 to 4 hours before intubation and were started on 25 mg of IV hydrocortisone every 6 hours for 24 hours and 15 mg every 6 hours during the second day. Nadir cortisol concentrations were repeatedly measured during that period. Results: In healthy individuals, cortisol half-life was longer when the higher hydrocortisone dose was administered (2.02 ± 0.15 vs 1.81 ± 0.11 hours; P < 0.01), and in patients with AI, the half-life was longer than in healthy individuals given the same hydrocortisone dose. In both populations, the cortisol half-life increased further with the second hormone injection. Prolongation of cortisol half-life was due to decreased hydrocortisone clearance and an increase in its volume of distribution. Nadir cortisol levels determined throughout the 48 postoperative hours were within the range of values and often exceeded those observed perioperatively in patients without adrenal dysfunction. Conclusions: Cortisol pharmacokinetics are altered in the postoperative period and indicate that lower doses of hydrocortisone can be safely administered to patients with AI undergoing major surgery. The findings of this investigation call into question the current practice of administering excessive glucocorticoid supplementation during stress.
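
The link between half-life, clearance, and volume of distribution mentioned in the results can be sketched with the textbook one-compartment relation, half-life = ln(2) × volume of distribution / clearance. The abstract does not state which pharmacokinetic model was used, so the model choice here is an assumption, and the input values below are made-up round numbers rather than data from the study.

```python
import math


def half_life_hours(volume_l, clearance_l_per_h):
    """Elimination half-life for a one-compartment, first-order model."""
    if volume_l <= 0 or clearance_l_per_h <= 0:
        raise ValueError("volume and clearance must be positive")
    return math.log(2) * volume_l / clearance_l_per_h


# Hypothetical values, for illustration only.
first_dose = half_life_hours(volume_l=30.0, clearance_l_per_h=10.0)
# Lower clearance and larger volume both lengthen the half-life.
second_dose = half_life_hours(volume_l=33.0, clearance_l_per_h=8.0)
print(round(first_dose, 2), round(second_dose, 2))
```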

Physical restraints versus seclusion room for management of people with acute aggression or agitation due to psychotic illness (TREC-SAVE): a randomized trial

Psychological Medicine ◽

10.1017/s0033291712000372 ◽

2012 ◽

Vol 42 (11) ◽

pp. 2265-2273 ◽

Cited By ~ 22

Author(s):

G. Huf ◽

E. S. F. Coutinho ◽

C. E. Adams

Keyword(s):

Health Care ◽

Relative Risk ◽

Confidence Interval ◽

Care Pathway ◽

Statistical Significance ◽

The Other ◽

Physical Restraints ◽

Psychotic Illness ◽

The People ◽

Mean Time

Background: After de-escalation techniques have failed, restraints, seclusion and/or rapid tranquillization may be used for people whose aggression is due to psychosis. Most coercive acts of health care have not been evaluated in trials. Method: People admitted to the emergency room of Instituto Philippe Pinel, Rio de Janeiro, Brazil, whose aggression/agitation was thought due to psychosis and for whom staff were unsure if best to restrict using physical restraints or a seclusion room, were randomly allocated to one or the other and followed up to 14 days. The primary outcomes were ‘no need to change intervention early – within 1 h’ and ‘not restricted by 4 h’. Results: A total of 105 people were randomized. Two-thirds of the people secluded were able to be fully managed in this way. Even taking into account the move out of seclusion into restraints, this study provides evidence that embarking on the less restrictive care pathway (seclusion) does not increase overall time in restriction of some sort [not restricted by 4 h: relative risk 1.09, 95% confidence interval 0.75–1.58; mean time to release: restraints 337.6 (s.d.=298.2) min, seclusion room 316.3 (s.d.=264.5) min, p=0.48]. Participants tended to be more satisfied with their care in the seclusion group (17.0% v. 11.1%) but this did not reach conventional levels of statistical significance (p=0.42). Conclusions: This study should be replicated, but suggests that opting for the least restrictive option in circumstances where there is clinical doubt does not harm or prolong coercion.

Improving the odds? How to pick the winner of the English Derby

Comparative Exercise Physiology ◽

10.3920/cep10017 ◽

2014 ◽

Vol 10 (1) ◽

pp. 57-62

Author(s):

D.J. Marlin ◽

J.M. Williams ◽

T. Parkin

Keyword(s):

Logistic Regression ◽

Confidence Interval ◽

Regression Models ◽

Mixed Effects ◽

Point Estimate ◽

Prize Money ◽

Logistic Regression Models ◽

Predictive Variables ◽

Time Period ◽

Predictive Probability

Many consider the English Derby on Epsom Downs to be ‘The Blue Riband of the Turf’. The Epsom Derby has been run annually since 1780 and the colt Diomed was the first winner. Today the Epsom Derby, run over 1.5 miles, is one of five classic races and is the second leg of the English Triple Crown, preceded by the 2,000 Guineas and followed by the St Leger. The prize money for 2010 has been in excess of £1.25 million. To the best of our knowledge, whilst epidemiological techniques have previously been applied in an attempt to identify risk factors for injury, the purpose of the present study, which we believe is unique, was to use an epidemiological approach to analyse factors that may be predictive of success (or failure) in a single race over the course of a number of consecutive years: The Epsom Derby. Information was collected on the horses competing in the last 22 runnings of the Epsom Derby between 1988 and 2009. Univariate and multivariable single-level and mixed effects logistic regression models were developed using winning the Epsom Derby as the dependent variable. Between 1988 and 2009 in 22 runnings of the Derby, a total of 344 horses started the Epsom Derby. The number of runners in the race has varied between 12 and 25 over the same time period. On average the probability of winning the Derby between 1988 and 2009 was approximately 6% (22/344), without accounting for any potentially predictive variables. Variables that were related to an increased chance of success were being the favourite (odds ratio (OR) 4.75; 95% confidence interval (CI) 1.58-14.3; P=0.006), the number of 2-year-old wins (OR 1.45; CI 1.03-2.04; P=0.03), being foaled in Ireland (OR 2.80; CI 1.12-7.04; P=0.041) and having the same jockey in all races throughout the horse's career up to and including the Derby (OR 2.53; CI 1.0-6.41; P=0.05). The highest predictive probability was for horses that started the race as a favourite, were Irish bred, had been ridden by a single jockey and had won twice as a 2-year-old. Although the point estimate for this probability was 52%, the degree of uncertainty around this estimate was wide, i.e. the 95% CI was 17.5 to 86.5%. Nevertheless, even at the lower confidence limit this still represents a significant improvement on the approximately 6% chance of picking a winner at random. In conclusion, using mixed effects logistic regression models would allow one to improve the odds of picking the winner of the Epsom Derby over the past 22 runnings.
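
The comparison made at the end of this abstract can be checked with simple arithmetic on the figures it reports (22 winners among 344 starters, and a predicted probability of 52% with a 95% CI of 17.5% to 86.5%). The sketch below is only that arithmetic; the function name is hypothetical and nothing here refits the paper's regression models.

```python
def fold_improvement(probability, baseline):
    """How many times larger a probability is than the baseline."""
    return probability / baseline


winners, starters = 22, 344
baseline = winners / starters  # chance of picking a winner at random

ci_low, point, ci_high = 0.175, 0.52, 0.865  # figures reported in the abstract

print(round(baseline, 3))
print(round(fold_improvement(ci_low, baseline), 2))
print(round(fold_improvement(point, baseline), 2))
```

Because even the lower confidence limit sits above the baseline, the claim of improvement holds across the whole reported interval, although the size of the improvement remains very uncertain.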

Identifying Strategies Programs Adopt to Meet Healthy Eating and Physical Activity Standards in Afterschool Programs

Health Education & Behavior ◽

10.1177/1090198116676252 ◽

2016 ◽

Vol 44 (4) ◽

pp. 536-547 ◽

Cited By ~ 4

Author(s):

Robert G. Weaver ◽

Justin B. Moore ◽

Brie Turner-McGrievy ◽

Ruth Saunders ◽

Aaron Beighle ◽

...

Keyword(s):

Physical Activity ◽

Confidence Interval ◽

Healthy Eating ◽

Vigorous Physical Activity ◽

Statistical Significance ◽

National Level ◽

Routine Practice ◽

Afterschool Programs ◽

Point Increase ◽

The Relationship

Background. The YMCA of USA has adopted Healthy Eating and Physical Activity (HEPA) Standards for its afterschool programs (ASPs). Little is known about strategies YMCA ASPs are implementing to achieve Standards and these strategies’ effectiveness. Aims. (1) Identify strategies implemented in YMCA ASPs and (2) evaluate the relationship between strategy implementation and meeting Standards. Method. HEPA was measured via accelerometer (moderate-to-vigorous-physical-activity [MVPA]) and direct observation (snacks served) in 20 ASPs. Strategies were identified and mapped onto a capacity building framework (Strategies To Enhance Practice [STEPs]). Mixed-effects regression estimated increases in HEPA outcomes as implementation increased. Model-implied estimates were calculated for high (i.e., highest implementation score achieved), moderate (median implementation score across programs), and low (lowest implementation score achieved) implementation for both HEPA separately. Results. Programs implemented a variety of strategies identified in STEPs. For every 1-point increase in implementation score, 1.45% (95% confidence interval = 0.33% to 2.55%, p ≤ .001) more girls accumulated 30 min/day of MVPA and fruits and/or vegetables were served on 0.11 more days (95% confidence interval = 0.11-0.45, p ≤ .01). Relationships between implementation and other HEPA outcomes did not reach statistical significance. Still, regression estimates indicated that desserts are served on 1.94 fewer days (i.e., 0.40 vs. 2.34) in the highest implementing program than the lowest implementing program and water is served 0.73 more days (i.e., 2.37 vs. 1.64). Conclusions. Adopting HEPA Standards at the national level does not lead to changes in routine practice in all programs. Practical strategies that programs could adopt to more fully comply with the HEPA Standards are identified.

Health Diagnosis of Structural Systems Using a Repetitive Model Updating Approach

Volume 8: Seismic Engineering ◽

10.1115/pvp2007-26134 ◽

2007 ◽

Author(s):

Jeng-Wen Lin ◽

Chong-Shien Tsai ◽

Chih-Wei Huang

Keyword(s):

Confidence Interval ◽

Model Updating ◽

Structural Parameters ◽

Statistical Significance ◽

Structural Systems ◽

Least Squares Regression ◽

Accurate Identification ◽

Statistical Confidence ◽

Multivariable Polynomial ◽

Health Diagnosis

This paper proposes a statistical confidence interval based model updating approach for the health diagnosis of structural systems subjected to seismic excitations. The proposed model updating approach uses the 95% confidence interval of the estimated structural parameters to determine their statistical significance in a least-squares regression setting. When the parameters’ confidence interval covers the “null” value, it is statistically sustainable to truncate such parameters. The remaining parameters will repetitively undergo such parameter sifting process for model updating until all the parameters’ statistical significance cannot be further improved. This newly developed model updating approach is implemented for the developed series models of multivariable polynomial expansions: the linear, the Taylor series, and the power series model, leading to a more accurate identification as well as a more controllable design for system vibration control.
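
The sifting loop described in this abstract can be sketched for a plain linear least-squares model. The sketch is a simplified illustration, not the authors' implementation: it uses a normal critical value of 1.96 in place of a t quantile, drops every parameter whose interval covers zero in each pass, and runs on a small synthetic data set. All function names and the data are assumptions.

```python
import math


def solve_normal_equations(X, y):
    """Return (beta, inverse of X'X) for ordinary least squares."""
    n, p = len(X), len(X[0])
    # Augmented matrix [X'X | I] for Gauss-Jordan inversion.
    A = [[sum(X[k][i] * X[k][j] for k in range(n)) for j in range(p)]
         + [1.0 if i == j else 0.0 for j in range(p)] for i in range(p)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        scale = A[col][col]
        A[col] = [v / scale for v in A[col]]
        for r in range(p):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    inv = [row[p:] for row in A]
    xty = [sum(X[k][i] * y[k] for k in range(n)) for i in range(p)]
    beta = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    return beta, inv


def prune_by_confidence_interval(X, y, z=1.96):
    """Refit repeatedly, dropping columns whose approximate 95% CI covers zero.

    Returns (indices of kept columns, their final estimates).
    """
    keep = list(range(len(X[0])))
    while keep:
        Xk = [[row[j] for j in keep] for row in X]
        beta, inv = solve_normal_equations(Xk, y)
        n, p = len(Xk), len(keep)
        resid = [y[k] - sum(b * v for b, v in zip(beta, Xk[k]))
                 for k in range(n)]
        sigma2 = sum(r * r for r in resid) / (n - p)
        se = [math.sqrt(sigma2 * inv[i][i]) for i in range(p)]
        # The interval beta +/- z * se covers zero when |beta| <= z * se.
        drop = [keep[i] for i in range(p) if abs(beta[i]) <= z * se[i]]
        if not drop:
            return keep, beta
        keep = [j for j in keep if j not in drop]
    return keep, []


# Synthetic data: y depends on the intercept and the first regressor only.
signs = (-1.0, 1.0)
X = [[1.0, a, b, c] for a in signs for b in signs for c in signs]
y = [1.0 + 2.0 * a + 0.1 * a * b * c for _, a, b, c in X]

keep, beta = prune_by_confidence_interval(X, y)
print(keep, [round(b, 3) for b in beta])
```

Dropping several parameters in one pass is a design shortcut; removing one parameter at a time and refitting is more cautious when regressors are correlated.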
