Causal Inference

Sociology ◽  
2020 ◽  
Author(s):  
Pablo Geraldo Bastías ◽  
Jennie E. Brand

Causal inference is a growing interdisciplinary subfield in statistics, computer science, economics, epidemiology, and the social sciences. In contrast with both traditional quantitative methods and cutting-edge approaches like machine learning, causal inference questions are defined in relation to potential outcomes, or variable values that are counterfactual to the observed world and therefore cannot be answered from joint probabilities alone, even with infinite data. The fact that one can possibly observe at most one potential outcome among those of interest is known as the “fundamental problem of causal inference.” For example, in this framework, the economic return to college education can be defined as a comparison between two potential outcomes: the wages of an individual with a college education versus the wages that the same individual would have received had he or she not attended college. In general, researchers are interested in estimating such effects for certain groups and comparing the effects for different subpopulations. Critical to causal inference is recognizing that, to answer causal questions from observed data, one has to rely on untestable assumptions about how the data were generated. In other words, there is no particular statistical method that would render a conclusion “causal”; the validity of such an interpretation depends on a combination of data, assumptions about the data-generating process based on expert judgment, and estimation techniques. In the last several decades, our understanding of causality has improved enormously, owing to a conceptual apparatus and a mathematical language that enables rigorous conceptualization of causal quantities and formal representation of causal assumptions, while still employing familiar statistical methods. 
Potential outcomes or the Neyman-Rubin causal model and structural equations encoded as directed acyclic graphs (DAGs, also known as structural causal models) are two common approaches for conceptualizing causal relationships. The symbiosis of both languages offers a powerful framework to address causal questions. This review covers developments in both causal identification (i.e., deciding if a quantity of interest would be recoverable from infinite data, based on our assumptions) and causal effect estimation (i.e., the use of statistical methods to approximate that answer with finite, although potentially big, data). The literature is presented following the type of assumptions and questions frequently encountered in empirical research, ending with a discussion of promising new directions in the field.
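
The college-wage example above can be sketched as a small simulation. This is an illustrative sketch only, not taken from the review: the variable `ability`, the wage equation, and the constant effect of 10 are all hypothetical assumptions chosen to show why a comparison of observed groups can differ from the causal effect when people self-select into treatment, and why randomized assignment recovers it.

```python
import random
import statistics


def simulate(n=20000, seed=0):
    """Generate both potential outcomes for each unit (hypothetical model)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)  # unobserved confounder (assumed)
        y0 = 30.0 + 5.0 * ability + rng.gauss(0.0, 1.0)  # wage without college
        y1 = y0 + 10.0  # wage with college; assumed constant effect of 10
        # Self-selection: higher-ability individuals attend more often.
        attends = rng.random() < (0.8 if ability > 0 else 0.2)
        # Alternative world: attendance assigned by a fair coin flip.
        randomized = rng.random() < 0.5
        rows.append((y0, y1, attends, randomized))
    return rows


def diff_in_means(rows, treat_index):
    """Compare observed outcomes: only one potential outcome is seen per unit."""
    treated = [r[1] for r in rows if r[treat_index]]
    control = [r[0] for r in rows if not r[treat_index]]
    return statistics.fmean(treated) - statistics.fmean(control)


rows = simulate()
true_ate = statistics.fmean([y1 - y0 for y0, y1, _, _ in rows])
naive = diff_in_means(rows, 2)         # self-selected attendance
experimental = diff_in_means(rows, 3)  # randomized attendance

print(f"true ATE: {true_ate:.2f}")
print(f"naive observational contrast: {naive:.2f}")
print(f"randomized contrast: {experimental:.2f}")
```

In a real study only one of `y0` and `y1` is ever observed for each individual, so `true_ate` cannot be computed directly; the simulation can compute it only because it generates both potential outcomes.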

2019 ◽  
Vol 24 (3) ◽  
pp. 109-112 ◽  
Author(s):  
Steven D Stovitz ◽  
Ian Shrier

Evidence-based medicine (EBM) calls on clinicians to incorporate the ‘best available evidence’ into clinical decision-making. For decisions regarding treatment, the best evidence is that which determines the causal effect of treatments on the clinical outcomes of interest. Unfortunately, research often provides evidence where associations are not due to cause-and-effect, but rather due to non-causal reasons. These non-causal associations may provide valid evidence for diagnosis or prognosis, but biased evidence for treatment effects. Causal inference aims to determine when we can infer that associations are or are not due to causal effects. Since recommending treatments that do not have beneficial causal effects will not improve health, causal inference can advance the practice of EBM. The purpose of this article is to familiarise clinicians with some of the concepts and terminology that are being used in the field of causal inference, including graphical diagrams known as ‘causal directed acyclic graphs’. In order to demonstrate some of the links between causal inference methods and clinical treatment decision-making, we use a clinical vignette of assessing treatments to lower cardiovascular risk. As the field of causal inference advances, clinicians familiar with the methods and terminology will be able to improve their adherence to the principles of EBM by distinguishing causal effects of treatment from results due to non-causal associations that may be a source of bias.


2013 ◽  
Vol 1 (1) ◽  
pp. 1-20 ◽  
Author(s):  
Tyler J. VanderWeele ◽  
Miguel A. Hernan

Abstract: In this article, we discuss causal inference when there are multiple versions of treatment. The potential outcomes framework, as articulated by Rubin, makes an assumption of no multiple versions of treatment, and here we discuss an extension of this potential outcomes framework to accommodate causal inference under violations of this assumption. A variety of examples are discussed in which the assumption may be violated. Identification results are provided for the overall treatment effect and the effect of treatment on the treated when multiple versions of treatment are present and also for the causal effect comparing a version of one treatment to some other version of the same or a different treatment. Further identification and interpretative results are given for cases in which the version precedes the treatment as when an underlying treatment variable is coarsened or dichotomized to create a new treatment variable for which there are effectively “multiple versions”. Results are also given for effects defined by setting the version of treatment to a prespecified distribution. Some of the identification results bear resemblance to identification results in the literature on direct and indirect effects. We describe some settings in which ignoring multiple versions of treatment, even when present, will not lead to incorrect inferences.


Author(s):  
Eleanor J Murray ◽  
Brandon D L Marshall ◽  
Ashley L Buchanan

Abstract: Agent-based models are a key tool for investigating the emergent properties of population health settings, such as infectious disease transmission, where the exposure often violates the key ‘no interference’ assumption of traditional causal inference under the potential outcomes framework. Agent-based models and other simulation-based modeling approaches have generally been viewed as a separate knowledge-generating paradigm from the potential outcomes framework, but this can lead to confusion about how to interpret the results of these models in real-world settings. By explicitly incorporating the target trial framework into the development of an agent-based or other simulation model, we can clarify the causal parameters of interest, as well as make explicit the assumptions required for valid causal effect estimation within or between populations. In this paper, we describe the use of the target trial framework for designing agent-based models when the goal is estimation of causal effects in the presence of interference, or spillover.


2019 ◽  
Vol 7 (1) ◽  
Author(s):  
Wei Luo ◽  
Wenbo Wu ◽  
Yeying Zhu

Abstract: Often the research interest in causal inference is on the regression causal effect, which is the mean difference in the potential outcomes conditional on the covariates. In this paper, we use sufficient dimension reduction to estimate a lower dimensional linear combination of the covariates that is sufficient to model the regression causal effect. Compared with the existing applications of sufficient dimension reduction in causal inference, our approaches are more efficient in reducing the dimensionality of covariates, and avoid estimating the individual outcome regressions. The proposed approaches can be used in three ways to assist in modeling the regression causal effect: to conduct variable selection, to improve the estimation accuracy, and to detect heterogeneity. Their usefulness is illustrated by both simulation studies and a real data example.
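
The two ideas in this abstract can be written compactly. The notation below is an assumption made for illustration and may differ from the paper's own: $Y(1)$ and $Y(0)$ are the potential outcomes, $X \in \mathbb{R}^{p}$ the covariates, and $B$ a hypothetical $p \times d$ matrix whose columns give the reduced linear combinations.

```latex
\[
  \tau(x) \;=\; \mathbb{E}\bigl[\, Y(1) - Y(0) \mid X = x \,\bigr]
\]
\[
  \mathbb{E}\bigl[\, Y(1) - Y(0) \mid X \,\bigr]
  \;=\;
  \mathbb{E}\bigl[\, Y(1) - Y(0) \mid B^{\top} X \,\bigr],
  \qquad B \in \mathbb{R}^{p \times d},\; d < p .
\]
```

The first line defines the regression causal effect; the second states the dimension-reduction condition that the effect depends on the covariates only through $B^{\top} X$.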


2019 ◽  
pp. 41-78
Author(s):  
Daniel Westreich

Chapter 3 discusses basic concepts in causal inference, beginning with an introduction to potential outcomes and definitions of causal contrasts (or causal estimates of effect), concepts, terms, and notation. Many concepts introduced here will be developed further in subsequent chapters. The author discusses sufficient conditions for estimation of causal effects (which are sometimes called causal identification conditions), causal directed acyclic graphs (sometimes called causal diagrams), and four key types of systematic error (confounding bias, missing data bias, selection bias, and measurement error/information bias). The author also briefly discusses alternative approaches to causal inference.


2017 ◽  
Vol 5 (2) ◽  
Author(s):  
Peng Ding ◽  
Xinran Li ◽  
Luke W. Miratrix

Abstract: There are two general views in causal analysis of experimental data: the super population view that the units are an independent sample from some hypothetical infinite population, and the finite population view that the potential outcomes of the experimental units are fixed and the randomness comes solely from the treatment assignment. These two views differ conceptually and mathematically, resulting in different sampling variances of the usual difference-in-means estimator of the average causal effect. Practically, however, these two views result in identical variance estimators. By recalling a variance decomposition and exploiting a completeness-type argument, we establish a connection between these two views in completely randomized experiments. This alternative formulation could serve as a template for bridging finite and super population causal inference in other scenarios.
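
A minimal sketch of the quantities this abstract contrasts, using the classical Neyman formulas for a completely randomized experiment rather than anything specific to the paper. The function names are hypothetical. The finite-population sampling variance subtracts a term involving the variance of unit-level effects, which cannot be estimated because both potential outcomes are never observed together, so the usual estimator omits it.

```python
import statistics


def difference_in_means(y_treated, y_control):
    """Point estimate and the usual (Neyman) variance estimator.

    The variance estimator s1^2/n1 + s0^2/n0 uses only observed outcomes.
    """
    estimate = statistics.fmean(y_treated) - statistics.fmean(y_control)
    var_hat = (statistics.variance(y_treated) / len(y_treated)
               + statistics.variance(y_control) / len(y_control))
    return estimate, var_hat


def finite_population_variance(y1, y0, n_treated):
    """True sampling variance under the finite-population view.

    Requires BOTH potential outcomes for every unit, so it is computable
    only in simulations: S1^2/n1 + S0^2/n0 - S_tau^2/n.
    """
    n = len(y1)
    n_control = n - n_treated
    s1 = statistics.variance(y1)
    s0 = statistics.variance(y0)
    s_tau = statistics.variance([a - b for a, b in zip(y1, y0)])
    return s1 / n_treated + s0 / n_control - s_tau / n


# Observed data from one hypothetical experiment.
estimate, var_hat = difference_in_means([3.0, 5.0, 7.0], [1.0, 2.0, 3.0])

# Full (normally unobservable) potential outcomes for four hypothetical units.
true_var = finite_population_variance([3.0, 5.0, 7.0, 9.0],
                                      [1.0, 2.0, 3.0, 4.0],
                                      n_treated=2)
print(estimate, var_hat, true_var)
```

Dropping the `s_tau / n` term gives the super-population form of the variance, which is why the same estimator is used under both views and is conservative, on average, for the finite-population variance when effects vary across units.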


2021 ◽  
Vol 15 (5) ◽  
pp. 1-46
Author(s):  
Liuyi Yao ◽  
Zhixuan Chu ◽  
Sheng Li ◽  
Yaliang Li ◽  
Jing Gao ◽  
...  

Causal inference has been a critical research topic across many domains, such as statistics, computer science, education, public policy, and economics, for decades. Estimating causal effects from observational data has become an appealing research direction because of the large amount of available data and the low budget requirement compared with randomized controlled trials. Alongside rapid developments in machine learning, various causal effect estimation methods for observational data have emerged. In this survey, we provide a comprehensive review of causal inference methods under the potential outcome framework, one of the well-known causal inference frameworks. The methods are divided into two categories depending on whether or not they require all three assumptions of the potential outcome framework. For each category, both the traditional statistical methods and the recent machine-learning-enhanced methods are discussed and compared. Plausible applications of these methods are also presented, including applications in advertising, recommendation, medicine, and so on. Moreover, commonly used benchmark datasets and open-source code are summarized to help researchers and practitioners explore, evaluate, and apply the causal inference methods.
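
As one concrete instance of a traditional method in this setting, the sketch below applies inverse probability weighting to simulated observational data. It is not taken from the survey. It assumes the three assumptions are the ones commonly stated for the potential outcome framework (no interference between units, ignorability given the observed covariate, and positivity), and the data-generating model, the single binary covariate `x`, and the true effect of 2.0 are all hypothetical.

```python
import random
import statistics


def simulate(n=20000, seed=1):
    """Observational data with one observed binary confounder x (hypothetical)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        p_treat = 0.7 if x == 1 else 0.3  # positivity: strictly between 0 and 1
        t = 1 if rng.random() < p_treat else 0
        y = 2.0 * t + 3.0 * x + rng.gauss(0.0, 1.0)  # assumed true effect: 2.0
        data.append((x, t, y))
    return data


def naive_contrast(data):
    """Unadjusted difference in observed means; confounded by x."""
    treated = [y for _, t, y in data if t == 1]
    control = [y for _, t, y in data if t == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


def ipw_ate(data):
    """Normalized (Hajek-style) inverse probability weighting estimate."""
    # Estimate the propensity score as the treated fraction in each x stratum.
    propensity = {}
    for level in {x for x, _, _ in data}:
        propensity[level] = statistics.fmean([t for x, t, _ in data if x == level])
    w1 = sum(t / propensity[x] for x, t, _ in data)
    w0 = sum((1 - t) / (1 - propensity[x]) for x, t, _ in data)
    mu1 = sum(t * y / propensity[x] for x, t, y in data) / w1
    mu0 = sum((1 - t) * y / (1 - propensity[x]) for x, t, y in data) / w0
    return mu1 - mu0


data = simulate()
naive = naive_contrast(data)
adjusted = ipw_ate(data)
print(f"naive contrast: {naive:.2f}")
print(f"IPW estimate: {adjusted:.2f}")
```

The weighting removes the bias here only because the sole confounder is observed; with an unobserved confounder, ignorability fails and the same code would return a biased estimate without any warning.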


2021 ◽  
pp. 096228022110028
Author(s):  
Yun Li ◽  
Irina Bondarenko ◽  
Michael R Elliott ◽  
Timothy P Hofer ◽  
Jeremy MG Taylor

With medical tests becoming increasingly available, concerns about over-testing, over-treatment and health care costs have increased dramatically. Hence, it is important to understand the influence of testing on treatment selection in general practice. Most statistical methods focus on average effects of testing on treatment decisions. However, this may be ill-advised, particularly for patient subgroups that tend not to benefit from such tests. Furthermore, missing data are common, representing large and often unaddressed threats to the validity of most statistical methods. Finally, it is often desirable to conduct analyses that can be interpreted causally. Using the Rubin Causal Model framework, we propose to classify patients into four potential outcomes subgroups, defined by whether or not a patient’s treatment selection is changed by the test result and by the direction in which the test result changes treatment selection. This subgroup classification naturally captures the differential influence of medical testing on treatment selections for different patients, which can suggest targets to improve the utilization of medical tests. We can then examine patient characteristics associated with patient potential outcomes subgroup memberships. We used multiple imputation methods to simultaneously impute the missing potential outcomes as well as regular missing values. This approach can also provide estimates of many traditional causal quantities of interest. We find that explicitly incorporating causal inference assumptions into the multiple imputation process can improve the precision for some causal estimates of interest. We also find that bias can occur when the potential outcomes conditional independence assumption is violated; sensitivity analyses are proposed to assess the impact of this violation. We applied the proposed methods to examine the influence of the 21-gene assay, the most commonly used genomic test in the United States, on chemotherapy selection among breast cancer patients.

