Causal inference via string diagram surgery

Author(s):  
Bart Jacobs ◽  
Aleks Kissinger ◽  
Fabio Zanasi

Abstract Extracting causal relationships from observed correlations is a growing area in probabilistic reasoning, originating with the seminal work of Pearl and others from the early 1990s. This paper develops a new, categorically oriented view based on a clear distinction between syntax (string diagrams) and semantics (stochastic matrices), connected via interpretations as structure-preserving functors. A key notion in the identification of causal effects is that of an intervention, whereby a variable is forcefully set to a particular value independent of any prior propensities. We represent the effect of such an intervention as an endo-functor which performs ‘string diagram surgery’ within the syntactic category of string diagrams. This diagram surgery in turn yields a new, interventional distribution via the interpretation functor. While in general there is no way to compute interventional distributions purely from observed data, we show that this is possible in certain special cases using a calculational tool called comb disintegration. We demonstrate the use of this technique on two well-known toy examples. In the first, we predict the causal effect of smoking on cancer in the presence of a confounding common cause, and we show that comb disintegration provides simple sufficient conditions for computing interventions that apply to a wide variety of situations considered in the causal inference literature. The second illustrates counterfactual reasoning, where the same interventional techniques are applied in a ‘twinned’ set-up, with two versions of the world – one factual and one counterfactual – joined together via exogenous variables that capture the uncertainties at hand.
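For readers less familiar with the categorical machinery, the intervention itself has a concrete reading on finite discrete models: delete the mechanism that generates the target variable and replace it with a point distribution. The fragment below is only a rough, non-categorical sketch of that idea (none of the string-diagram or comb-disintegration apparatus is reproduced), contrasting the observational conditional P(C | S) with the interventional P(C | do(S)) in a toy smoking/cancer model with a hidden common cause; all probabilities and variable names are invented for illustration.

```python
# A rough, non-categorical sketch of an intervention on a discrete causal model
# with a hidden confounder H -> S (smoking) and H -> C (cancer), plus S -> C.
# All probabilities below are invented for illustration.
P_H = {0: 0.7, 1: 0.3}                        # P(H = h)
P_S_given_H = {0: {0: 0.8, 1: 0.2},           # P(S = s | H = h), keyed as [h][s]
               1: {0: 0.3, 1: 0.7}}
P_C1_given_SH = {(0, 0): 0.1, (0, 1): 0.3,    # P(C = 1 | S = s, H = h), keyed as (s, h)
                 (1, 0): 0.4, (1, 1): 0.6}

def observe(s):
    """P(C = 1 | S = s): conditioning keeps H correlated with S."""
    num = sum(P_H[h] * P_S_given_H[h][s] * P_C1_given_SH[(s, h)] for h in P_H)
    den = sum(P_H[h] * P_S_given_H[h][s] for h in P_H)
    return num / den

def intervene(s):
    """P(C = 1 | do(S = s)): the mechanism for S is cut, so H is averaged over P(H)."""
    return sum(P_H[h] * P_C1_given_SH[(s, h)] for h in P_H)

for s in (0, 1):
    print(f"S={s}: P(C=1 | S={s}) = {observe(s):.3f}   P(C=1 | do(S={s})) = {intervene(s):.3f}")
```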

2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Lola Étiévant ◽  
Vivian Viallon

Abstract Many causal models of interest in epidemiology involve longitudinal exposures, confounders and mediators. However, repeated measurements are not always available or used in practice, leading analysts to overlook the time-varying nature of exposures and work under over-simplified causal models. Our objective is to assess whether – and how – causal effects identified under such misspecified causal models relate to true causal effects of interest. We derive sufficient conditions ensuring that the quantities estimated in practice under over-simplified causal models can be expressed as weighted averages of longitudinal causal effects of interest. Unsurprisingly, these sufficient conditions are very restrictive, and our results indicate that the quantities estimated in practice should generally be interpreted with caution, as they usually do not relate to any longitudinal causal effect of interest. Our simulations further illustrate that the bias between the quantities estimated in practice and the weighted averages of longitudinal causal effects of interest can be substantial. Overall, our results confirm the need for repeated measurements to conduct proper analyses, and/or for the development of sensitivity analyses when such measurements are not available.
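As a rough illustration of the kind of discrepancy described here (not the authors' own simulation design), the sketch below generates a two-period exposure, analyses only the final measurement, and shows that the resulting coefficient matches neither the effect of the later exposure nor the total effect of the earlier one; all parameters are invented.

```python
# A hypothetical sketch (not the authors' simulation) of how ignoring the
# time-varying nature of an exposure can yield an estimate that matches no
# longitudinal causal effect of interest.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
b0, b1 = 1.0, 0.2                                 # true effects of A0 and A1 on Y

A0 = rng.normal(size=n)                           # early exposure
A1 = 0.8 * A0 + rng.normal(scale=0.6, size=n)     # later exposure, driven by A0
Y = b0 * A0 + b1 * A1 + rng.normal(size=n)

# 'Over-simplified' analysis: only the final exposure measurement is used.
naive = np.polyfit(A1, Y, 1)[0]

print(f"naive cross-sectional slope : {naive:.2f}")
print(f"effect of A1 alone (b1)     : {b1:.2f}")
print(f"total effect of A0          : {b0 + 0.8 * b1:.2f}")
```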


2019 ◽  
Vol 188 (9) ◽  
pp. 1682-1685 ◽  
Author(s):  
Hailey R Banack

Abstract Authors aiming to estimate causal effects from observational data frequently discuss 3 fundamental identifiability assumptions for causal inference: exchangeability, consistency, and positivity. However, too often, studies fail to acknowledge the importance of measurement bias in causal inference. In the presence of measurement bias, the aforementioned identifiability conditions are not sufficient to estimate a causal effect. The most fundamental requirement for estimating a causal effect is knowing who is truly exposed and unexposed. In this issue of the Journal, Caniglia et al. (Am J Epidemiol. 2019;000(00):000–000) present a thorough discussion of methodological challenges when estimating causal effects in the context of research on distance to obstetrical care. Their article highlights empirical strategies for examining nonexchangeability due to unmeasured confounding and selection bias and potential violations of the consistency assumption. In addition to the important considerations outlined by Caniglia et al., authors interested in estimating causal effects from observational data should also consider implementing quantitative strategies to examine the impact of misclassification. The objective of this commentary is to emphasize that you can’t drive a car with only three wheels, and you also cannot estimate a causal effect in the presence of exposure misclassification bias.
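One simple quantitative strategy of the kind suggested here is matrix-style bias analysis for exposure misclassification: back-correct the observed 2×2 table with assumed sensitivity and specificity and recompute the effect estimate. The sketch below is only illustrative; the counts and classification parameters are invented.

```python
# A minimal sketch of quantitative bias analysis for non-differential exposure
# misclassification: back-correct an observed 2x2 table using assumed
# sensitivity (Se) and specificity (Sp). Counts and parameters are invented.
def corrected_counts(exposed, unexposed, se, sp):
    """Return (true exposed, true unexposed) for one row of the 2x2 table."""
    total = exposed + unexposed
    true_exposed = (exposed - (1 - sp) * total) / (se + sp - 1)
    return true_exposed, total - true_exposed

# Observed counts: cases (a, b) and non-cases (c, d); assumed Se/Sp of exposure
a, b = 200, 800        # cases: classified exposed / unexposed
c, d = 300, 1700       # non-cases: classified exposed / unexposed
se, sp = 0.85, 0.95

A, B = corrected_counts(a, b, se, sp)
C, D = corrected_counts(c, d, se, sp)

or_observed = (a * d) / (b * c)
or_corrected = (A * D) / (B * C)
print(f"observed OR  : {or_observed:.2f}")
print(f"corrected OR : {or_corrected:.2f}")
```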


2019 ◽  
Vol 24 (3) ◽  
pp. 109-112 ◽  
Author(s):  
Steven D Stovitz ◽  
Ian Shrier

Evidence-based medicine (EBM) calls on clinicians to incorporate the ‘best available evidence’ into clinical decision-making. For decisions regarding treatment, the best evidence is that which determines the causal effect of treatments on the clinical outcomes of interest. Unfortunately, research often provides evidence where associations are not due to cause-and-effect, but rather due to non-causal reasons. These non-causal associations may provide valid evidence for diagnosis or prognosis, but biased evidence for treatment effects. Causal inference aims to determine when we can infer that associations are or are not due to causal effects. Since recommending treatments that do not have beneficial causal effects will not improve health, causal inference can advance the practice of EBM. The purpose of this article is to familiarise clinicians with some of the concepts and terminology that are being used in the field of causal inference, including graphical diagrams known as ‘causal directed acyclic graphs’. In order to demonstrate some of the links between causal inference methods and clinical treatment decision-making, we use a clinical vignette of assessing treatments to lower cardiovascular risk. As the field of causal inference advances, clinicians familiar with the methods and terminology will be able to improve their adherence to the principles of EBM by distinguishing causal effects of treatment from results due to non-causal associations that may be a source of bias.
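As a toy numerical companion to the vignette (not taken from the article), the sketch below shows confounding by indication: a drug that truly lowers the outcome looks harmful in the crude comparison because it is preferentially prescribed to high-risk patients, and the beneficial causal effect only reappears after adjusting for the confounder; all numbers are invented.

```python
# A toy illustration (not from the article) of a non-causal association
# misleading a treatment decision: a risk-lowering drug is prescribed mainly
# to high-risk patients, so the crude comparison suggests harm, while
# adjusting for baseline risk recovers the true beneficial effect.
import numpy as np

rng = np.random.default_rng(6)
n = 100_000
risk = rng.normal(size=n)                                # baseline cardiovascular risk
treated = rng.random(n) < 1 / (1 + np.exp(-2 * risk))    # sicker patients get the drug
outcome = risk - 0.5 * treated + rng.normal(size=n)      # drug truly lowers outcome by 0.5

crude = outcome[treated].mean() - outcome[~treated].mean()
design = np.column_stack([np.ones(n), treated, risk])
adjusted = np.linalg.lstsq(design, outcome, rcond=None)[0][1]

print(f"crude difference   : {crude:+.2f}   (looks harmful)")
print(f"adjusted difference: {adjusted:+.2f}   (true effect is -0.5)")
```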


2021 ◽  
Author(s):  
Kentaro Fukumoto ◽  
Charles T. McClean ◽  
Kuninori Nakagawa

Abstract As COVID-19 spread in 2020, most countries shut down schools in the hopes of slowing the pandemic. Yet, studies have not reached a consensus about the effectiveness of these policies partly because they lack rigorous causal inference. Our study aims to estimate the causal effects of school closures on the number of confirmed cases. To do so, we apply matching methods to municipal-level data in Japan. We do not find that school closures caused a reduction in the spread of the coronavirus. Our results suggest that policies on school closures should be reexamined given the potential negative consequences for children and parents.
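The sketch below gives a rough flavour of a matching analysis of this kind, not the authors' exact specification: each municipality that closed schools is paired with the nearest control on pre-treatment covariates, and subsequent case counts are compared; the data and covariates are synthetic.

```python
# A rough, hypothetical sketch of covariate matching (not the authors' exact
# specification): each 'closed schools' municipality is matched to its nearest
# control on pre-treatment covariates, and case counts are then compared.
import numpy as np

rng = np.random.default_rng(1)
n = 500
covariates = rng.normal(size=(n, 3))          # e.g. population, density, prior cases
closed = rng.integers(0, 2, size=n).astype(bool)
cases = 10 + covariates @ np.array([2.0, 1.5, 3.0]) + rng.normal(size=n)  # no true effect

treated, controls = covariates[closed], covariates[~closed]
y_treated, y_controls = cases[closed], cases[~closed]

# 1-nearest-neighbour matching on Euclidean distance (with replacement)
dists = ((treated[:, None, :] - controls[None, :, :]) ** 2).sum(axis=2)
match = dists.argmin(axis=1)

att = (y_treated - y_controls[match]).mean()
print(f"matched estimate of the effect of closure: {att:.2f}")
```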


Author(s):  
Negar Hassanpour

To identify the appropriate action to take, an intelligent agent must infer the causal effects of every possible action choice. A prominent example is precision medicine, which attempts to identify which medical procedure will benefit each individual patient the most. This requires answering counterfactual questions such as: "Would this patient have lived longer, had she received an alternative treatment?" In my PhD, I attempt to explore ways to address the challenges associated with causal effect estimation, with a focus on devising methods that enhance performance according to individual-based measures (as opposed to population-based measures).


2021 ◽  
Author(s):  
Jonathan Sulc ◽  
Jenny Sjaarda ◽  
Zoltan Kutalik

Causal inference is a critical step in improving our understanding of biological processes and Mendelian randomisation (MR) has emerged as one of the foremost methods to efficiently interrogate diverse hypotheses using large-scale, observational data from biobanks. Although many extensions have been developed to address the three core assumptions of MR-based causal inference (relevance, exclusion restriction, and exchangeability), most approaches implicitly assume that any putative causal effect is linear. Here we propose PolyMR, an MR-based method which provides a polynomial approximation of an (arbitrary) causal function between an exposure and an outcome. We show that this method provides accurate inference of the shape and magnitude of causal functions with greater accuracy than existing methods. We applied this method to data from the UK Biobank, testing for effects between anthropometric traits and continuous health-related phenotypes and found most of these (84%) to have causal effects which deviate significantly from linear. These deviations ranged from slight attenuation at the extremes of the exposure distribution, to large changes in the magnitude of the effect across the range of the exposure (e.g. a 1 kg/m2 change in BMI having stronger effects on glucose levels if the initial BMI was higher), to non-monotonic causal relationships (e.g. the effects of BMI on cholesterol forming an inverted U shape). Finally, we show that the linearity assumption of the causal effect may lead to the misinterpretation of health risks at the individual level or heterogeneous effect estimates when using cohorts with differing average exposure levels.
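PolyMR's exact estimator is not reproduced here; as a rough illustration of estimating a non-linear exposure-outcome causal function with a genetic instrument, the sketch below uses a control-function-style two-stage polynomial regression on simulated data, which recovers the quadratic causal term despite confounding under the stated additive assumptions.

```python
# A rough illustration (not PolyMR's exact estimator) of recovering a
# non-linear causal function with a genetic instrument via a control-function
# style two-stage polynomial regression; all data are simulated.
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
G = rng.normal(size=n)                     # genetic score (instrument)
U = rng.normal(size=n)                     # unobserved confounder
X = 0.5 * G + U + rng.normal(size=n)       # exposure (e.g. BMI)
Y = 0.3 * X**2 + U + rng.normal(size=n)    # outcome with a non-linear causal effect

# First stage: residualise the exposure on the instrument.
v_hat = X - np.polyval(np.polyfit(G, X, 1), G)

# Second stage: outcome on a polynomial of the exposure plus the control function.
design = np.column_stack([np.ones(n), X, X**2, v_hat])
coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
print(f"estimated causal function: {coef[1]:+.2f}*X {coef[2]:+.2f}*X^2")
```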


2021 ◽  
Author(s):  
Eric W. Bridgeford ◽  
Michael Powell ◽  
Gregory Kiar ◽  
Ross Lawrence ◽  
Brian Caffo ◽  
...  

Abstract Batch effects, undesirable sources of variance across multiple experiments, present a substantial hurdle for scientific and clinical discoveries. Specifically, the presence of batch effects can create both spurious discoveries and hide veridical signals, contributing to the ongoing reproducibility crisis. Typical approaches to dealing with batch effects conceptualize ‘batches’ as an associational effect, rather than a causal effect, despite the fact that the sources of variance that comprise the batch – potentially including experimental design and population demographics – causally impact downstream inferences. We therefore cast batch effects as a causal problem rather than an associational problem. This reformulation enables us to make explicit the assumptions and limitations of existing approaches for dealing with batch effects. We then develop causal batch effect strategies – CausalDcorr for discovering batch effects and CausalComBat for mitigating them – which build upon existing statistical associational methods by incorporating modern causal inference techniques. We apply these strategies to a large mega-study of human connectomes assembled by the Consortium for Reliability and Reproducibility, consisting of 24 batches and over 1700 individuals, to illustrate that existing approaches create more spurious discoveries (false positives) and miss more veridical signals (true positives) than our proposed approaches. Our work therefore introduces a conceptual framing, as well as open source code, for combining multiple distinct datasets to increase confidence in claims of scientific and clinical discoveries.
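CausalDcorr and CausalComBat themselves are not reproduced here; as a rough sketch of the underlying idea (treat 'batch' as an exposure and balance the demographic covariates that confound it before comparing or harmonising measurements), the fragment below matches subjects across two synthetic batches on age before contrasting their measurements.

```python
# A rough sketch of the idea behind a causal treatment of batch effects (not
# the CausalDcorr/CausalComBat implementations): treat 'batch' as an exposure,
# balance the covariates that confound it by matching subjects across batches,
# and only then compare measurements. All data are synthetic.
import numpy as np

rng = np.random.default_rng(3)
n = 2_000
age = rng.uniform(18, 80, size=n)
batch = (age + rng.normal(scale=10, size=n)) > 50          # recruitment differs by age
signal = 0.05 * age + rng.normal(scale=0.5, size=n)        # age effect, no true batch effect

# Naive comparison confounds batch with age.
naive_gap = signal[batch].mean() - signal[~batch].mean()

# Match each batch-1 subject to the closest batch-0 subject on age.
idx = np.abs(age[batch][:, None] - age[~batch][None, :]).argmin(axis=1)
matched_gap = (signal[batch] - signal[~batch][idx]).mean()

print(f"naive batch difference   : {naive_gap:.3f}")
print(f"matched batch difference : {matched_gap:.3f}")
```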


2019 ◽  
Vol 109 (9) ◽  
pp. 3307-3338 ◽  
Author(s):  
Simon Freyaldenhoven ◽  
Christian Hansen ◽  
Jesse M. Shapiro

We consider a linear panel event-study design in which unobserved confounds may be related both to the outcome and to the policy variable of interest. We provide sufficient conditions to identify the causal effect of the policy by exploiting covariates related to the policy only through the confounds. Our model implies a set of moment equations that are linear in parameters. The effect of the policy can be estimated by 2SLS, and causal inference is valid even when endogeneity leads to pre-event trends (“pre-trends”) in the outcome. Alternative approaches perform poorly in our simulations. (JEL C23, C26)
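The following is only a generic two-stage least squares sketch on simulated data, not the paper's specific moment conditions or event-study design: the policy variable is endogenous because of an unobserved confound, and an excluded instrument is used to recover the causal effect.

```python
# A generic two-stage least squares sketch (not the paper's specific moment
# conditions): the policy x is endogenous because of an unobserved confound,
# and z is an excluded instrument for it. All data are simulated.
import numpy as np

rng = np.random.default_rng(4)
n = 50_000
u = rng.normal(size=n)                      # unobserved confound
z = rng.normal(size=n)                      # instrument: moves x, excluded from y
x = 0.7 * z + u + rng.normal(size=n)        # policy of interest
y = 2.0 * x + 3.0 * u + rng.normal(size=n)  # outcome; true policy effect = 2

ols = np.polyfit(x, y, 1)[0]                # biased by the confound

# 2SLS: first stage x on z, second stage y on the fitted x.
x_hat = np.polyval(np.polyfit(z, x, 1), z)
tsls = np.polyfit(x_hat, y, 1)[0]

print(f"OLS  estimate: {ols:.2f}")
print(f"2SLS estimate: {tsls:.2f}")
```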


2016 ◽  
Vol 4 (2) ◽  
Author(s):  
Peter M. Aronow

Abstract Recent approaches in causal inference have proposed estimating average causal effects that are local to some subpopulation, often for reasons of efficiency. These inferential targets are sometimes data-adaptive, in that they are dependent on the empirical distribution of the data. In this short note, we show that if researchers are willing to adapt the inferential target on the basis of efficiency, then extraordinary gains in precision can potentially be obtained. Specifically, when causal effects are heterogeneous, any asymptotically normal and root-$n$ consistent estimator of the population average causal effect is superefficient for a data-adaptive local average causal effect.
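A quick numerical illustration of this point (not the paper's derivation): under complete randomisation with heterogeneous effects, the difference-in-means estimator concentrates more tightly around the sample average treatment effect, a data-adaptive target, than around the population average treatment effect; all parameters below are invented.

```python
# A quick numerical illustration (not the paper's derivation): with
# heterogeneous effects, the difference-in-means estimator is closer to the
# sample average treatment effect (a data-adaptive target) than to the
# population average treatment effect.
import numpy as np

rng = np.random.default_rng(5)
n, reps = 200, 20_000
pop_ate = 1.0                                   # population average effect

err_sate, err_pate = [], []
for _ in range(reps):
    tau = rng.normal(pop_ate, 2.0, size=n)      # heterogeneous unit-level effects
    y0 = rng.normal(size=n)
    y1 = y0 + tau
    treat = rng.permutation(np.arange(n)) < n // 2
    est = y1[treat].mean() - y0[~treat].mean()
    err_sate.append((est - tau.mean()) ** 2)    # error w.r.t. sample ATE
    err_pate.append((est - pop_ate) ** 2)       # error w.r.t. population ATE

print(f"MSE around sample ATE    : {np.mean(err_sate):.4f}")
print(f"MSE around population ATE: {np.mean(err_pate):.4f}")
```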


