Penalized regression procedures for variable selection in the potential outcomes framework

Mediation has played a critical role in developmental theory and research. Yet, developmentalists rarely discuss the methodological challenges of establishing causality in mediation analysis or potential strategies to improve the identification of causal mediation effects. In this article, we discuss the potential outcomes framework from statistics as a means for highlighting several fundamental challenges of establishing causality in mediation analysis, including the difficulty of meeting the key assumption of sequential ignorability, even in experimental studies. We argue that this framework—which, although commonplace in other fields, has not yet been taken up in developmental science—can inform solutions to these challenges. Based on the framework, we offer a series of recommendations for improving causal inference in mediation analysis, including an overview of best practices in both study design and analysis, as well as resources for conducting analysis. In doing so, our overall objective in this article is to support the use of rigorous methods for understanding questions of mechanism in developmental science.

Appendix: Qualitative Causal Models and the Potential-Outcomes Framework

Multi-Method Social Science ◽

10.1017/cbo9781316160831.009 ◽

2016 ◽

pp. 192-207

Author(s):

Jason Seawright

Keyword(s):

Causal Models ◽

Potential Outcomes ◽

Potential Outcomes Framework

Variable selection in microbiome compositional data analysis

NAR Genomics and Bioinformatics ◽

10.1093/nargab/lqaa029 ◽

2020 ◽

Vol 2 (2) ◽

Cited By ~ 2

Author(s):

Antoni Susin ◽

Yiwen Wang ◽

Kim-Anh Lê Cao ◽

M Luz Calle

Keyword(s):

Data Analysis ◽

Variable Selection ◽

Compositional Data ◽

Penalized Regression ◽

Compositional Data Analysis ◽

Forward Selection ◽

Computationally Efficient ◽

Parsimonious Model ◽

Microbiome Data ◽

Log Ratio

Abstract: Though variable selection is one of the most relevant tasks in microbiome analysis, e.g. for the identification of microbial signatures, many studies still rely on methods that ignore the compositional nature of microbiome data. The uptake of compositional data analysis methods has been hampered by the limited availability of software and the difficulty of interpreting their results. This work focuses on three methods for variable selection that acknowledge the compositional structure of microbiome data: selbal, a forward selection approach for the identification of compositional balances, and clr-lasso and coda-lasso, two penalized regression models for compositional data analysis. This study highlights the link between these methods and brings out some limitations of the centered log-ratio transformation for variable selection. In particular, because the transformation is not subcompositionally consistent, the microbial signatures obtained from clr-lasso are not readily transferable. Coda-lasso is computationally efficient and suitable when the focus is the identification of the most associated microbial taxa. Selbal stands out when the goal is to obtain a parsimonious model with optimal prediction performance, but it is computationally greedy. We provide a reproducible vignette for the application of these methods that will enable researchers to fully leverage their potential in microbiome studies.
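The remark about subcompositional consistency can be illustrated with a small numerical sketch. The code below is not taken from the paper or its vignette; the `clr` helper and the abundance values are assumptions made up for illustration. It shows that the centered log-ratio coordinates of a taxon change when another taxon is dropped from the composition, while a pairwise log-ratio does not.

```python
import numpy as np

def clr(x):
    """Centered log-ratio transform of a strictly positive composition."""
    log_x = np.log(np.asarray(x, dtype=float))
    return log_x - log_x.mean()

# Hypothetical relative abundances of four taxa (made-up numbers).
full = np.array([0.1, 0.2, 0.3, 0.4])

# Subcomposition: drop the last taxon and re-close so the parts sum to 1.
sub = full[:3] / full[:3].sum()

clr_full = clr(full)
clr_sub = clr(sub)

# The clr coordinates of the retained taxa shift once a taxon is removed,
# because the geometric-mean reference is computed over a different set.
shift = clr_sub - clr_full[:3]

# A pairwise log-ratio between two retained taxa is unaffected.
lr_full = np.log(full[0] / full[1])
lr_sub = np.log(sub[0] / sub[1])

print("clr shift per retained taxon:", shift)
print("pairwise log-ratio (full, sub):", lr_full, lr_sub)
```

Because the clr features themselves depend on which taxa are present, a penalized regression fitted on them can yield coefficients that do not carry over directly to a dataset with a different set of taxa.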

All your data are always missing: incorporating bias due to measurement error into the potential outcomes framework

International Journal of Epidemiology ◽

10.1093/ije/dyu272 ◽

2015 ◽

Vol 44 (4) ◽

pp. 1452-1459 ◽

Cited By ~ 27

Author(s):

Jessie K Edwards ◽

Stephen R Cole ◽

Daniel Westreich

Keyword(s):

Measurement Error ◽

Potential Outcomes ◽

Potential Outcomes Framework

Outlier detection and variable selection via difference based regression model and penalized regression

Journal of the Korean Data and Information Science Society ◽

10.7465/jkdi.2018.29.3.815 ◽

2018 ◽

Vol 29 (3) ◽

pp. 815-825 ◽

Cited By ~ 1

Author(s):

InHae Choi ◽

Chun Gun Park ◽

Kyeong Eun Lee

Keyword(s):

Variable Selection ◽

Regression Model ◽

Outlier Detection ◽

Penalized Regression

Applying a potential outcomes framework to estimate policy-relevant effects of exposure mixtures

ISEE Conference Abstracts ◽

10.1289/isee.2021.o-sy-100 ◽

2021 ◽

Vol 2021 (1) ◽

Author(s):

Jessie P. Buckley

Keyword(s):

Potential Outcomes ◽

Potential Outcomes Framework

A mixture of local and quadratic approximation variable selection algorithm in nonconcave penalized regression

Revue Africaine de la Recherche en Informatique et Mathématiques Appliquées ◽

10.46298/arima.1962 ◽

2013 ◽

Vol 16, 2012 ◽

Author(s):

Assi N'GUESSAN ◽

Ibrahim Sidi Zakari ◽

Assi Mkhadri

Keyword(s):

Variable Selection ◽

Penalized Likelihood ◽

Real Data ◽

Penalized Regression ◽

Penalty Functions ◽

Selection Algorithm ◽

Data Set ◽

One Step ◽

Nonconvex Penalty

We consider the problem of variable selection via penalized likelihood using nonconvex penalty functions. To maximize the non-differentiable and nonconcave objective function, an algorithm that is based on local linear approximation and adopts a naturally sparse representation was recently proposed. However, although it has promising theoretical properties, it inherits some drawbacks of the Lasso in high-dimensional settings. To overcome these drawbacks, we propose an algorithm (MLLQA) for maximizing the penalized likelihood for a large class of nonconvex penalty functions. The convergence property of MLLQA and the oracle property of the one-step MLLQA estimator are established. Simulations and an application to a real data set are also presented.
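For orientation, the penalized likelihood objective and the local linear approximation step mentioned in this abstract are commonly written as below. This is the standard textbook formulation, given as a sketch; it is not necessarily the notation of the paper and it does not describe the MLLQA algorithm itself. The symbols $\ell$ (log-likelihood), $p_\lambda$ (penalty function), $n$ (sample size), $d$ (number of coefficients) and $\beta^{(0)}$ (current iterate) are assumptions introduced here.

```latex
% Penalized log-likelihood to be maximized over \beta
Q(\beta) = \ell(\beta) - n \sum_{j=1}^{d} p_\lambda\bigl(|\beta_j|\bigr)

% Local linear approximation of the penalty around the current iterate \beta^{(0)}
p_\lambda\bigl(|\beta_j|\bigr) \approx
  p_\lambda\bigl(|\beta_j^{(0)}|\bigr)
  + p_\lambda'\bigl(|\beta_j^{(0)}|\bigr)\,\bigl(|\beta_j| - |\beta_j^{(0)}|\bigr)
```

Substituting the approximation into $Q$ turns each update into a weighted $\ell_1$-penalized problem, which is the source of the sparse iterates that the abstract refers to.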

Invited Commentary: Counterfactuals in Social Epidemiology—Thinking Outside of “the Box”

American Journal of Epidemiology ◽

10.1093/aje/kwz198 ◽

2019 ◽

Vol 189 (3) ◽

pp. 175-178 ◽

Cited By ~ 1

Author(s):

Tyler J VanderWeele

Keyword(s):

Social Movements ◽

Causal Inference ◽

Social Determinants Of Health ◽

Social Determinants ◽

Social Epidemiology ◽

Determinants Of Health ◽

Potential Outcomes ◽

The Social ◽

Potential Outcomes Framework

Abstract: There are inherent tensions between many of the social exposures examined within social epidemiology and the assumptions embedded in the quantitative potential-outcomes-based causal inference framework. The potential-outcomes framework characteristically requires a well-defined hypothetical intervention. As noted by Galea and Hernán (Am J Epidemiol. 2020;189(3):167–170), for many social exposures, such well-defined hypothetical exposures do not exist or there is no consensus on what they might be. Nevertheless, the quantitative potential-outcomes framework can still be useful for the study of some of these social exposures through creative adaptations that 1) redefine the exposure, 2) separate the exposure from the hypothetical intervention, or 3) allow for a distribution of hypothetical interventions. These various approaches and adaptations are reviewed and discussed. However, even these approaches have their limits. For certain important historical and social determinants of health such as social movements or wars, the quantitative potential-outcomes framework with well-defined hypothetical interventions is the wrong tool. Other modes of inquiry are needed.

Propensity scores-potential outcomes framework to incorporate severity probabilities in the Highway Safety Manual crash prediction algorithm

Accident Analysis & Prevention ◽

10.1016/j.aap.2014.05.017 ◽

2014 ◽

Vol 71 ◽

pp. 183-193 ◽

Cited By ~ 9

Author(s):

Lekshmi Sasidharan ◽

Eric T. Donnell

Keyword(s):

Propensity Scores ◽

Highway Safety ◽

Potential Outcomes ◽

Prediction Algorithm ◽

Crash Prediction ◽

Highway Safety Manual ◽

Potential Outcomes Framework

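Variable Selection Using a Genetic Algorithm on Two-Level Supersaturated Fractional Factorial Design Data (original title: Seleksi Peubah menggunakan Algoritme Genetika pada Data Rancangan Faktorial Pecahan Lewat Jenuh Dua Taraf)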

Xplore Journal of Statistics ◽

10.29244/xplore.v10i1.473 ◽

2020 ◽

Vol 10 (1) ◽

pp. 55-69

Author(s):

Ani Safitri ◽

Rahma Anisa ◽

Bagus Sartono

Keyword(s):

Genetic Algorithm ◽

Variable Selection ◽

Stepwise Regression ◽

Computation Time ◽

Penalized Regression ◽

Fractional Factorial ◽

Supersaturated Design ◽

Forward Selection ◽

Main Factors ◽

Genetic Algorithm Model

In certain fields, experiments involve many factors and are constrained by cost. Reducing the number of runs is one way to lower experimental costs, but it can leave fewer runs than factors. Such an experimental design is known as a supersaturated design. The important factors in this design are generally identified through variable selection methods such as forward selection, stepwise regression, and penalized regression. A genetic algorithm is another method that can be used for variable selection, especially for high-dimensional data or supersaturated designs. This study uses a genetic algorithm for variable selection in a supersaturated design and compares its results with those of stepwise regression, which is generally used for simple designs. The study also draws on fractional factorial design principles. The results showed that the main factors and interactions selected by the genetic algorithm and by stepwise regression were quite different, although the underlying principle was the same because the variables were correlated. The genetic algorithm model had smaller AIC and BIC values, and all of the main factors and interactions it selected were significant at the 0.1% level. The genetic algorithm model was therefore chosen, although its computation time was much longer than that of stepwise regression.
