A Bayesian Approach for Estimating Causal Effects from Observational Data

2020 ◽  
Vol 34 (04) ◽  
pp. 5395-5402
Author(s):  
Johan Pensar ◽  
Topi Talvitie ◽  
Antti Hyttinen ◽  
Mikko Koivisto

We present a novel Bayesian method for the challenging task of estimating causal effects from passively observed data when the underlying causal DAG structure is unknown. To rigorously capture the inherent uncertainty associated with the estimate, our method builds a Bayesian posterior distribution of the linear causal effect, by integrating Bayesian linear regression and averaging over DAGs. For computing the exact posterior for all cause-effect variable pairs, we give an algorithm that runs in time O(3^d d) for d variables, which is feasible for up to around 20 variables. We also give a variant that computes the posterior probabilities of all pairwise ancestor relations within the same time complexity, significantly improving on the fastest previous algorithm. In simulations, our Bayesian method outperforms previous methods in estimation accuracy, especially for small sample sizes. We further show that our method for effect estimation is well adapted for detecting strong causal effects markedly deviating from zero, while our variant for computing posteriors of ancestor relations is the method of choice for detecting the mere existence of a causal relation. Finally, we apply our method to observational flow cytometry data, detecting several causal relations that concur with previous findings from experimental data.
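The following Python sketch illustrates the model-averaging idea on a toy scale: it enumerates every DAG over a handful of variables, weights each DAG by a BIC approximation to its marginal likelihood (a crude stand-in for the paper's exact Bayesian linear-regression score), and averages the back-door-adjusted linear effect. Brute-force enumeration is only feasible for very small d; the paper's O(3^d d) dynamic program is what makes d up to 20 practical. The scoring choice, helper names, and data here are our assumptions, not the authors'.

```python
# Brute-force Bayesian model averaging of a linear causal effect over DAGs.
# BIC stands in for an exact marginal likelihood; feasible only for tiny d.
import itertools
import numpy as np
import networkx as nx
from sklearn.linear_model import LinearRegression

def bic_score(X, node, parents):
    """BIC of a Gaussian linear regression of `node` on its `parents`."""
    y = X[:, node]
    n = len(y)
    if parents:
        Z = X[:, parents]
        resid = y - LinearRegression().fit(Z, y).predict(Z)
    else:
        resid = y - y.mean()
    rss = float(resid @ resid)
    k = len(parents) + 2                      # slopes + intercept + noise variance
    return n * np.log(rss / n) + k * np.log(n)

def posterior_effect(X, cause, target):
    """Posterior mean of the linear effect cause -> target, averaged over all DAGs."""
    d = X.shape[1]
    candidate_edges = list(itertools.permutations(range(d), 2))
    scores, effects = [], []
    for mask in itertools.product([0, 1], repeat=len(candidate_edges)):
        G = nx.DiGraph([e for e, keep in zip(candidate_edges, mask) if keep])
        G.add_nodes_from(range(d))
        if not nx.is_directed_acyclic_graph(G):
            continue
        scores.append(sum(bic_score(X, v, sorted(G.predecessors(v))) for v in range(d)))
        if nx.has_path(G, cause, target):
            # Back-door adjustment: regress target on cause and the cause's parents.
            pa = sorted(G.predecessors(cause))
            Z = X[:, [cause] + pa]
            effects.append(LinearRegression().fit(Z, X[:, target]).coef_[0])
        else:
            effects.append(0.0)               # no directed path, no causal effect
    log_w = -0.5 * np.array(scores)
    log_w -= log_w.max()                      # guard against underflow
    w = np.exp(log_w)
    w /= w.sum()
    return float(w @ np.array(effects))

# Synthetic collider x0 -> x2 <- x1: the v-structure makes the DAG identifiable.
rng = np.random.default_rng(0)
x0 = rng.normal(size=500)
x1 = rng.normal(size=500)
x2 = 0.5 * x0 + 0.7 * x1 + rng.normal(size=500)
print(posterior_effect(np.column_stack([x0, x1, x2]), cause=0, target=2))  # near 0.5
```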

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Florent Le Borgne ◽  
Arthur Chatton ◽  
Maxime Léger ◽  
Rémi Lenain ◽  
Yohann Foucher

Abstract In clinical research, there is a growing interest in the use of propensity score-based methods to estimate causal effects. G-computation is an alternative because of its high statistical power. Machine learning is also increasingly used because of its possible robustness to model misspecification. In this paper, we aimed to propose an approach that combines machine learning and G-computation when both the outcome and the exposure status are binary, and that is able to deal with small samples. Through simulations, we evaluated the performance of several methods, including penalized logistic regressions, a neural network, a support vector machine, boosted classification and regression trees, and a super learner. We proposed six different scenarios characterised by various sample sizes, numbers of covariates, and relationships between covariates, exposure statuses, and outcomes. We also illustrated the application of these methods by using them to estimate the efficacy of barbiturates prescribed during the first 24 h of an episode of intracranial hypertension. In the context of G-computation, for estimating the individual outcome probabilities in the two counterfactual worlds, we found that the super learner tended to outperform the other approaches in terms of both bias and variance, especially for small sample sizes. The support vector machine also performed well, but its mean bias was slightly higher than that of the super learner. In the investigated scenarios, G-computation combined with the super learner was a performant method for drawing causal inferences, even from small sample sizes.
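To make the procedure concrete, here is a minimal G-computation sketch: an outcome model is fit on the observed data and then used to predict each subject's outcome probability in the two counterfactual worlds (everyone exposed, no one exposed). A scikit-learn stacking classifier stands in for the paper's super learner, and the data and variable names are purely illustrative.

```python
# Minimal G-computation for a binary exposure and binary outcome.
import numpy as np
from sklearn.ensemble import StackingClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(1)
n = 300                                                  # deliberately small sample
X = rng.normal(size=(n, 3))                              # covariates
a = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))          # confounded binary exposure
y = rng.binomial(1, 1 / (1 + np.exp(-(0.8 * a + X[:, 0] - X[:, 1]))))  # binary outcome

# Outcome model Q(a, x) ~ P(Y = 1 | A = a, X = x); a stand-in for the super learner.
Q = StackingClassifier(
    estimators=[("lr", LogisticRegression()),
                ("svm", SVC(probability=True)),
                ("gbt", GradientBoostingClassifier())],
    final_estimator=LogisticRegression(),
)
Q.fit(np.column_stack([a, X]), y)

# Individual outcome probabilities in the two counterfactual worlds.
p1 = Q.predict_proba(np.column_stack([np.ones(n), X]))[:, 1]
p0 = Q.predict_proba(np.column_stack([np.zeros(n), X]))[:, 1]
print("marginal risk difference:", (p1 - p0).mean())
```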


Author(s):  
Negar Hassanpour

To identify the appropriate action to take, an intelligent agent must infer the causal effects of every possible action choice. A prominent example is precision medicine, which attempts to identify the medical procedure that will benefit each individual patient the most. This requires answering counterfactual questions such as: "Would this patient have lived longer, had she received an alternative treatment?". In my PhD, I attempt to explore ways to address the challenges associated with causal effect estimation, with a focus on devising methods that enhance performance according to individual-based measures (as opposed to population-based measures).


2021 ◽  
Vol 12 ◽  
Author(s):  
Yixin Gao ◽  
Jinhui Zhang ◽  
Huashuo Zhao ◽  
Fengjun Guan ◽  
Ping Zeng

Background In two-sample Mendelian randomization (MR) studies, sex instrumental heterogeneity is an important problem that needs to be addressed carefully; however, it is often overlooked and may lead to misleading causal inference.

Methods We first employed cross-trait linkage disequilibrium score regression (LDSC), Pearson’s correlation analysis, and Cochran’s Q test to examine sex genetic similarity and heterogeneity in instrumental variables (IVs) of exposures. A simulation was further performed to explore the influence of sex instrumental heterogeneity on causal effect estimation in sex-specific two-sample MR analyses. Furthermore, as an illustrative example of the importance of taking sex heterogeneity of instruments into account in MR studies, we chose breast/prostate cancer as outcomes and four anthropometric traits as exposures.

Results The simulation demonstrated that sex-combined IVs can lead to biased causal effect estimates in sex-specific two-sample MR studies. In our real applications, both LDSC and Pearson’s correlation analyses showed high genetic correlation between sex-combined and sex-specific IVs of the four anthropometric traits, while nearly all the correlation coefficients were larger than zero but less than one. Cochran’s Q test also displayed sex heterogeneity for some instruments. When applying sex-specific instruments, significant discrepancies in the magnitude of estimated causal effects were detected for body mass index (BMI) on breast cancer (P = 1.63E-6), for hip circumference (HIP) on breast cancer (P = 1.25E-20), and for waist circumference (WC) on prostate cancer (P = 0.007), compared with those generated with sex-combined instruments.

Conclusion Our study reveals that sex instrumental heterogeneity has a non-ignorable impact on sex-specific two-sample MR studies, and that the causal effects of anthropometric traits on breast/prostate cancer would be biased if sex-combined IVs were incorrectly employed.
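For concreteness, Cochran’s Q statistic for a single instrument compares its sex-specific effect estimates against their inverse-variance-weighted mean; a small p-value flags sex heterogeneity. The sketch below is the standard textbook computation, with invented effect sizes and standard errors.

```python
# Cochran's Q test for heterogeneity of one instrument's effect across sexes.
import numpy as np
from scipy.stats import chi2

def cochran_q(betas, ses):
    """Q statistic and p-value for heterogeneity across strata (here: the two sexes)."""
    betas, ses = np.asarray(betas), np.asarray(ses)
    w = 1.0 / ses**2                           # inverse-variance weights
    b_fixed = np.sum(w * betas) / np.sum(w)    # fixed-effect pooled estimate
    q = np.sum(w * (betas - b_fixed) ** 2)
    return q, chi2.sf(q, df=len(betas) - 1)

# Hypothetical male and female per-allele effect estimates for one SNP.
q, p = cochran_q(betas=[0.042, 0.011], ses=[0.008, 0.007])
print(f"Q = {q:.2f}, p = {p:.3g}")             # a small p suggests sex heterogeneity
```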


2019 ◽  
Vol 8 (5) ◽  
pp. 66
Author(s):  
Balgobin Nandram ◽  
Yuan Yu

In sample surveys with sensitive items, sampled units may not respond, or they may respond untruthfully. Often a negative answer is given when the truthful answer is positive, thereby leading to an estimate of the population proportion of positives (the sensitive proportion) that is too small. In our study, we have binary data obtained from the unrelated-question design, and both the sensitive proportion and the nonsensitive proportion are of interest. A respondent answers the sensitive item with a known probability, and to avoid non-identifiable parameters, at least two (not necessarily exactly two) different random mechanisms are used, but only one for each cluster of respondents. The key point here is that the counts are sparse (very small sample sizes), and we show how to overcome some of the problems associated with the unrelated-question design. A standard approach to this problem is to use the expectation-maximization (EM) algorithm. However, because we consider only small sample sizes (sparse counts), the EM algorithm may not converge, and asymptotic theory, which would permit normality assumptions for inference, is not appropriate; so we develop a Bayesian method. To compare the EM algorithm and the Bayesian method, we present an example with sparse data on college cheating and a simulation study that illustrates the properties of our procedure. Finally, we discuss two extensions to accommodate finite population sampling and optional responses.
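In this design, the probability of a "yes" in a cluster using mechanism j is theta_j = p_j * pi_s + (1 - p_j) * pi_n, where p_j is the known chance of receiving the sensitive item; using at least two distinct p_j values makes (pi_s, pi_n) identifiable. The sketch below evaluates the joint posterior on a grid under uniform priors, which stays well behaved even with sparse counts; it is a simple stand-in for the paper's Bayesian machinery, and the counts are invented.

```python
# Grid-based posterior for the unrelated-question design with sparse counts.
import numpy as np
from scipy.stats import binom

p = np.array([0.7, 0.3])          # known P(sensitive item) for the two mechanisms
yes = np.array([4, 2])            # sparse "yes" counts, one cluster per mechanism
n = np.array([15, 12])            # cluster sizes

grid = np.linspace(0.001, 0.999, 200)
pi_s, pi_n = np.meshgrid(grid, grid, indexing="ij")      # sensitive, nonsensitive
theta = p * pi_s[..., None] + (1 - p) * pi_n[..., None]  # per-cluster yes-probability

log_post = binom.logpmf(yes, n, theta).sum(axis=-1)      # uniform priors: likelihood only
post = np.exp(log_post - log_post.max())
post /= post.sum()

print("posterior mean of the sensitive proportion:   ", (post * pi_s).sum())
print("posterior mean of the nonsensitive proportion:", (post * pi_n).sum())
```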


1989 ◽  
Vol 8 (4) ◽  
pp. 307-312 ◽  
Author(s):  
O.F.D. Chadwick ◽  
H.R. Anderson

1. The evidence from studies of the neuropsychological consequences of chronic volatile substance abuse is reviewed.
2. Studies of occupational exposure to solvent vapour are of limited relevance when considering the effects of volatile substance abuse, because occupational exposure is normally to small quantities of many different compounds over prolonged periods of time.
3. Many studies of chronic volatile substance abusers suffer from serious shortcomings, such as the use of small sample sizes, inadequate controls, failure to exclude the possibility of acute toxic effects, and a disregard of other factors which could account for the findings.
4. There is reasonably good evidence that neuropsychological impairment is often present amongst volatile substance abusers with definite neurological abnormalities.
5. Although most studies have found that volatile substance abusers without reported neurological abnormalities obtain lower psychometric test scores than non-abusers, it remains uncertain whether these deficits are best explained in terms of a causal effect of volatile substance abuse, rather than as a reflection of other factors associated with it, such as background, social disadvantage or history of delinquency.


2021 ◽  
pp. 1-21
Author(s):  
Keith A. Markus

Abstract Rubin and Pearl offered approaches to causal effect estimation, and Lewis and Pearl offered theories of counterfactual conditionals. Arguments offered by Pearl and his collaborators support a weak form of equivalence: notation from the rival theory can be repurposed to express Pearl’s theory in a way that is equivalent to Pearl’s theory expressed in its native notation. Nonetheless, the many fundamental differences between the theories rule out any stronger form of equivalence. A renewed emphasis on comparative research can help to guide applications, further develop each theory, and better understand their relative strengths and weaknesses.


Author(s):  
Jing Ma ◽  
Ruocheng Guo ◽  
Aidong Zhang ◽  
Jundong Li

One fundamental problem in causality learning is to estimate the causal effects of one or multiple treatments (e.g., the medicines in a prescription) on an important outcome (e.g., the cure of a disease). One major challenge of causal effect estimation is the existence of unobserved confounders, the unobserved variables that affect both the treatments and the outcome. Recent studies have shown that, by jointly modeling how instances are assigned to their different treatments, the patterns of unobserved confounders can be captured through learned latent representations. However, the interpretability of the representations in these works is limited. In this paper, we approach the multi-cause effect estimation problem from a new perspective by learning disentangled representations of confounders. The disentangled representations not only facilitate treatment effect estimation but also strengthen the understanding of the causality learning process. Experimental results on both synthetic and real-world datasets show the superiority of our proposed framework from different aspects.
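The jointly-modeled-treatments idea can be illustrated with a deliberately simple stand-in: fit a factor model to the multiple treatment assignments and use the recovered latent variable as a substitute confounder in the outcome regression (the "deconfounder" style of multi-cause adjustment). This is not the paper's disentangled-representation framework, only a sketch of the underlying intuition; all data are synthetic.

```python
# Multi-cause adjustment via a substitute confounder learned from the treatments.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
n = 2000
u = rng.normal(size=n)                                       # unobserved confounder
T = rng.binomial(1, 1 / (1 + np.exp(-np.outer(u, [1.0, 1.5, -0.5]))))  # 3 treatments
y = T @ np.array([0.5, 0.0, 0.3]) + 2.0 * u + rng.normal(size=n)       # outcome

z_hat = FactorAnalysis(n_components=1).fit_transform(T)      # substitute confounder
naive = LinearRegression().fit(T, y).coef_
adjusted = LinearRegression().fit(np.column_stack([T, z_hat]), y).coef_[:3]
print("naive effects:   ", naive.round(2))                   # biased by u
print("adjusted effects:", adjusted.round(2))                # closer to [0.5, 0.0, 0.3]
```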


2021 ◽  
Vol 9 (1) ◽  
pp. 211-218
Author(s):  
Sergio Garrido ◽  
Stanislav Borysov ◽  
Jeppe Rich ◽  
Francisco Pereira

Abstract The estimation of causal effects is fundamental in situations where the underlying system will be subject to active interventions. Part of building a causal inference engine is defining how variables relate to each other, that is, defining the functional relationships between variables entailed by the conditional dependencies of the graph. In this article, we deviate from the common assumption of linear relationships in causal models by making use of neural autoregressive density estimators, which we use to estimate causal effects within Pearl’s do-calculus framework. Using synthetic data, we show that the approach can retrieve causal effects from non-linear systems without explicitly modeling the interactions between the variables, and that confidence bands can be included using the non-parametric bootstrap. We also explore scenarios that deviate from the ideal causal effect estimation setting, such as poor data support or unobserved confounders.
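The core do-calculus computation here is back-door adjustment, E[Y | do(X = x)] = E_Z[E[Y | X = x, Z]]: fit a flexible model of the outcome given treatment and confounders, then average its predictions over the empirical confounder distribution at a fixed treatment value. In the sketch below, a gradient-boosted regressor stands in for the article's neural autoregressive density estimators, and the data-generating process is invented; the article's confidence bands could be mimicked by repeating the whole fit on bootstrap resamples.

```python
# Back-door adjustment E[Y | do(X=x)] = E_Z[E[Y | X=x, Z]] with a flexible regressor.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(3)
n = 2000
z = rng.normal(size=n)                        # observed confounder
x = np.sin(z) + 0.5 * rng.normal(size=n)      # non-linear treatment assignment
y = x**2 + z + 0.3 * rng.normal(size=n)       # non-linear outcome

model = GradientBoostingRegressor().fit(np.column_stack([x, z]), y)

def effect_of_do(x_val):
    """Monte-Carlo estimate of E[Y | do(X = x_val)], averaging over empirical P(Z)."""
    return model.predict(np.column_stack([np.full(n, x_val), z])).mean()

for x_val in (-1.0, 0.0, 1.0):
    print(f"E[Y | do(X = {x_val:+.1f})] ~ {effect_of_do(x_val):.2f}")  # truth: x_val**2
```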

