Penalized regression procedures for variable selection in the potential outcomes framework

2015 ◽  
Vol 34 (10) ◽  
pp. 1645-1658 ◽  
Author(s):  
Debashis Ghosh ◽  
Yeying Zhu ◽  
Donna L. Coffman
2021 ◽  
pp. 016502542098164
Author(s):  
Jorge Cuartas ◽  
Dana Charles McCoy

Mediation has played a critical role in developmental theory and research. Yet, developmentalists rarely discuss the methodological challenges of establishing causality in mediation analysis or potential strategies to improve the identification of causal mediation effects. In this article, we discuss the potential outcomes framework from statistics as a means for highlighting several fundamental challenges of establishing causality in mediation analysis, including the difficulty of meeting the key assumption of sequential ignorability, even in experimental studies. We argue that this framework—which, although commonplace in other fields, has not yet been taken up in developmental science—can inform solutions to these challenges. Based on the framework, we offer a series of recommendations for improving causal inference in mediation analysis, including an overview of best practices in both study design and analysis, as well as resources for conducting analysis. In doing so, our overall objective in this article is to support the use of rigorous methods for understanding questions of mechanism in developmental science.


2020 ◽  
Vol 2 (2) ◽  
Author(s):  
Antoni Susin ◽  
Yiwen Wang ◽  
Kim-Anh Lê Cao ◽  
M Luz Calle

Abstract Though variable selection is one of the most relevant tasks in microbiome analysis, e.g. for the identification of microbial signatures, many studies still rely on methods that ignore the compositional nature of microbiome data. The applicability of compositional data analysis methods has been hampered by the limited availability of software and the difficulty of interpreting their results. This work is focused on three methods for variable selection that acknowledge the compositional structure of microbiome data: selbal, a forward selection approach for the identification of compositional balances, and clr-lasso and coda-lasso, two penalized regression models for compositional data analysis. This study highlights the link between these methods and brings out some limitations of the centered log-ratio transformation for variable selection. In particular, the fact that it is not subcompositionally consistent makes the microbial signatures obtained from clr-lasso not readily transferable. Coda-lasso is computationally efficient and suitable when the focus is the identification of the most associated microbial taxa. Selbal stands out when the goal is to obtain a parsimonious model with optimal prediction performance, but it is computationally greedy. We provide a reproducible vignette for the application of these methods that will enable researchers to fully leverage their potential in microbiome studies.
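The remark about subcompositional consistency can be illustrated with a small sketch. It is not taken from the authors' vignette: the function names and the four-taxon abundances below are hypothetical, and only the standard definition of the centered log-ratio (log of each part minus the mean log over all parts) is assumed.

```python
import math

def close(parts):
    """Rescale positive parts so they sum to 1 (a composition)."""
    total = sum(parts)
    return [p / total for p in parts]

def clr(composition):
    """Centered log-ratio: log of each part minus the mean log of all parts."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

# Hypothetical relative abundances for four taxa (illustration only).
full = close([10.0, 20.0, 30.0, 40.0])
# Subcomposition: drop the last taxon and re-close the remaining three.
sub = close(full[:3])

clr_full = clr(full)
clr_sub = clr(sub)

# The clr value of the first taxon changes when a taxon is removed ...
shift = clr_full[0] - clr_sub[0]
# ... while the log-ratio between two retained taxa is unchanged.
ratio_full = clr_full[0] - clr_full[1]
ratio_sub = clr_sub[0] - clr_sub[1]
```

Because each clr coordinate depends on the geometric mean of all parts, a coefficient fitted on clr-transformed data is tied to the particular set of taxa included, which is one way to read the transferability concern raised in the abstract.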


Author(s):  
Assi N'GUESSAN ◽  
Ibrahim Sidi Zakari ◽  
Assi Mkhadri

We consider the problem of variable selection via penalized likelihood using nonconvex penalty functions. To maximize the non-differentiable and nonconcave objective function, an algorithm that is based on local linear approximation and adopts a naturally sparse representation was recently proposed. However, although it has promising theoretical properties, it inherits some drawbacks of the Lasso in high-dimensional settings. To overcome these drawbacks, we propose an algorithm (MLLQA) for maximizing the penalized likelihood for a large class of nonconvex penalty functions. The convergence property of MLLQA and the oracle property of the one-step MLLQA estimator are established. Some simulations and an application to a real data set are also presented.
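As a rough illustration of the local linear approximation idea mentioned above (not of MLLQA itself, whose details are not given here), the sketch below computes the weights that such an approximation assigns to each coefficient. It assumes the SCAD penalty as one example of a nonconvex penalty; the function names and the initial estimates are hypothetical.

```python
def scad_derivative(beta, lam, a=3.7):
    """Derivative of the SCAD penalty evaluated at |beta| (requires a > 2).

    Equals lam for |beta| <= lam, decays linearly to 0 on (lam, a*lam],
    and is 0 beyond a*lam.
    """
    b = abs(beta)
    if b <= lam:
        return lam
    return max(a * lam - b, 0.0) / (a - 1.0)

def lla_weights(initial_beta, lam, a=3.7):
    """Per-coefficient L1 weights from linearizing the penalty at initial_beta.

    Under local linear approximation, the nonconvex penalty is replaced by a
    weighted L1 penalty whose weights are the penalty derivative at the
    current estimate, so the next step is a weighted Lasso problem.
    """
    return [scad_derivative(b, lam, a) for b in initial_beta]

# Hypothetical initial estimates: small coefficients keep the full penalty,
# large ones are penalized less or not at all.
weights = lla_weights([0.0, 0.5, 2.0, 10.0], lam=1.0)
```

Coefficients with zero weight are left unpenalized in the next step, which is how this kind of scheme reduces the shrinkage bias of the plain Lasso on large effects.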


2019 ◽  
Vol 189 (3) ◽  
pp. 175-178 ◽  
Author(s):  
Tyler J VanderWeele

Abstract There are inherent tensions between many of the social exposures examined within social epidemiology and the assumptions embedded in the quantitative potential-outcomes-based causal inference framework. The potential-outcomes framework characteristically requires a well-defined hypothetical intervention. As noted by Galea and Hernán (Am J Epidemiol. 2020;189(3):167–170), for many social exposures, such well-defined hypothetical exposures do not exist or there is no consensus on what they might be. Nevertheless, the quantitative potential-outcomes framework can still be useful for the study of some of these social exposures by creative adaptations that 1) redefine the exposure, 2) separate the exposure from the hypothetical intervention, or 3) allow for a distribution of hypothetical interventions. These various approaches and adaptations are reviewed and discussed. However, even these approaches have their limits. For certain important historical and social determinants of health such as social movements or wars, the quantitative potential-outcomes framework with well-defined hypothetical interventions is the wrong tool. Other modes of inquiry are needed.


2020 ◽  
Vol 10 (1) ◽  
pp. 55-69
Author(s):  
Ani Safitri ◽  
Rahma Anisa ◽  
Bagus Sartono

In certain fields, experiments involve many factors and are constrained by cost. Reducing the number of runs is one way to lower experimental costs, but it can leave fewer runs than factors. Such an experimental design is known as a supersaturated design. The important factors in this design are generally estimated using variable selection methods such as forward selection, stepwise regression, and penalized regression. A genetic algorithm is another method that can be used for variable selection, especially for high-dimensional data or supersaturated designs. This study aims to use a genetic algorithm for variable selection in a supersaturated design and to compare its results with those of stepwise regression, which is generally used for simple designs. The study also applied fractional factorial design principles. The results showed that the main factors and interactions selected by the genetic algorithm and by stepwise regression were quite different. However, the underlying principle was the same because the variables were correlated. The genetic algorithm model had smaller AIC and BIC values, and all of the selected main factors and interactions were significant at the 0.1% level. Therefore, the genetic algorithm model was chosen, although its computation time was much longer than that of stepwise regression.

