Optimized application of penalized regression methods to diverse genomic data

2011 ◽  
Vol 27 (24) ◽  
pp. 3399-3406 ◽  
Author(s):  
Levi Waldron ◽  
Melania Pintilie ◽  
Ming-Sound Tsao ◽  
Frances A. Shepherd ◽  
Curtis Huttenhower ◽  
...  
Author(s):  
Mayrim Vega-Hernández ◽  
Eduardo Martínez-Montes ◽  
Jhoanna Pérez-Hidalgo-Gato ◽  
José M. Sánchez-Bornot ◽  
Pedro Valdés-Sosa

Author(s):  
Pascalis Kadaro Matthew ◽  
Abubakar Yahaya

A few decades ago, penalized regression techniques were developed for linear regression specifically to address the weaknesses in prediction accuracy of classical ordinary least squares (OLS) regression. In this paper, we used a diabetes data set obtained from previous literature to compare three of these well-known techniques: the Least Absolute Shrinkage and Selection Operator (LASSO), Elastic Net, and Correlation Adjusted Elastic Net (CAEN). After thorough analysis, it was observed that CAEN generated a less complex model.
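For orientation, two of the penalized objectives named in this abstract can be written in their common textbook form. The notation is an assumption made here, not taken from the abstract: $y$ is the centered response, $X$ the $n \times p$ standardized design matrix, $\beta$ the coefficient vector, $\lambda \ge 0$ the penalty strength, and $\alpha \in [0, 1]$ the mixing parameter. The CAEN penalty is not reproduced because the abstract does not state it.

```latex
\begin{align}
\hat{\beta}_{\mathrm{lasso}} &= \arg\min_{\beta}\; \frac{1}{2n}\lVert y - X\beta \rVert_2^2 + \lambda \lVert \beta \rVert_1 \\
\hat{\beta}_{\mathrm{enet}} &= \arg\min_{\beta}\; \frac{1}{2n}\lVert y - X\beta \rVert_2^2 + \lambda \left( \alpha \lVert \beta \rVert_1 + \frac{1-\alpha}{2} \lVert \beta \rVert_2^2 \right)
\end{align}
```

Setting $\lambda = 0$ recovers the OLS fit in both cases, and setting $\alpha = 1$ reduces the elastic net objective to the LASSO objective.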


2018 ◽  
Vol 61 (4) ◽  
pp. 451-458
Author(s):  
Suna Akkol

Abstract. The least absolute shrinkage and selection operator (LASSO) and adaptive LASSO have become popular methods in the last decade, especially for data with a multicollinearity problem. This study was conducted to estimate the live weight (LW) of Hair goats from biometric measurements and to select variables in order to reduce model complexity by using penalized regression methods: LASSO and adaptive LASSO with γ=0.5 and γ=1. The data were obtained from 132 adult goats in Honaz district of Denizli province. Age, gender, forehead width, ear length, head length, chest width, rump height, withers height, back height, chest depth, chest girth, and body length were used as explanatory variables. The adjusted coefficient of determination ($R^2_{\mathrm{adj}}$), root mean square error (RMSE), Akaike's information criterion (AIC), Schwarz Bayesian criterion (SBC), and average square error (ASE) were used to compare the effectiveness of the methods. It was concluded that adaptive LASSO (γ=1) estimated the LW with the highest accuracy for both male ($R^2_{\mathrm{adj}}$ = 0.9048; RMSE = 3.6250; AIC = 79.2974; SBC = 65.2633; ASE = 7.8843) and female ($R^2_{\mathrm{adj}}$ = 0.7668; RMSE = 4.4069; AIC = 392.5405; SBC = 308.9888; ASE = 18.2193) Hair goats when all the criteria were considered.
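The role of γ in adaptive LASSO can be illustrated with a minimal sketch. It assumes the usual formulation in which each coefficient's penalty is weighted by 1/|initial estimate|^γ, with an unpenalized least-squares fit as the initial estimate. It is not the software or procedure used in the study; the function names, the coordinate-descent solver, the small `eps` guard, and the absence of an intercept (data assumed centered) are all illustrative choices.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; return 0.0 when |z| <= t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def weighted_lasso(X, y, lam, weights, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * sum_j weights[j] * |b_j|.

    X is a list of rows, y a list of responses; no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X @ beta, kept up to date
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam * weights[j]) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
    return beta


def adaptive_lasso(X, y, lam, gamma=1.0, eps=1e-8):
    """Two-stage adaptive LASSO: initial unpenalized fit, then a reweighted penalty."""
    p = len(X[0])
    # Stage 1: lam = 0 turns coordinate descent into an iterative least-squares fit.
    init = weighted_lasso(X, y, 0.0, [1.0] * p)
    # Stage 2: small initial coefficients receive large penalties (eps avoids 1/0).
    weights = [1.0 / (abs(b) ** gamma + eps) for b in init]
    return weighted_lasso(X, y, lam, weights)
```

With this weighting, a larger γ penalizes coefficients with small initial estimates more heavily relative to those with large initial estimates, which is why γ=0.5 and γ=1 can select different variable subsets on the same data.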


PLoS Genetics ◽  
2015 ◽  
Vol 11 (12) ◽  
pp. e1005689 ◽  
Author(s):  
Silvia Pineda ◽  
Francisco X. Real ◽  
Manolis Kogevinas ◽  
Alfredo Carrato ◽  
Stephen J. Chanock ◽  
...  

2015 ◽  
Vol 14s2 ◽  
pp. CIN.S17295 ◽  
Author(s):  
Jenna Czarnota ◽  
Chris Gennings ◽  
David C. Wheeler

In evaluation of cancer risk related to environmental chemical exposures, the effect of many chemicals on disease is ultimately of interest. However, because of potentially strong correlations among chemicals that occur together, traditional regression methods suffer from collinearity effects, including regression coefficient sign reversal and variance inflation. In addition, penalized regression methods designed to remediate collinearity may have limitations in selecting the truly bad actors among many correlated components. The recently proposed method of weighted quantile sum (WQS) regression attempts to overcome these problems by estimating a body burden index, which identifies important chemicals in a mixture of correlated environmental chemicals. Our focus was on assessing through simulation studies the accuracy of WQS regression in detecting subsets of chemicals associated with health outcomes (binary and continuous) in site-specific analyses and in non-site-specific analyses. We also evaluated the performance of the penalized regression methods of lasso, adaptive lasso, and elastic net in correctly classifying chemicals as bad actors or unrelated to the outcome. We based the simulation study on data from the National Cancer Institute Surveillance Epidemiology and End Results Program (NCI-SEER) case-control study of non-Hodgkin lymphoma (NHL) to achieve realistic exposure situations. Our results showed that WQS regression had good sensitivity and specificity across a variety of conditions considered in this study. The shrinkage methods had a tendency to incorrectly identify a large number of components, especially in the case of strong association with the outcome.
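The sensitivity and specificity used to judge whether a method correctly classifies components as bad actors can be computed as in the following minimal sketch. The function name, argument names, and the example component labels are hypothetical and are not taken from the study.

```python
def selection_accuracy(selected, truly_active, all_components):
    """Return (sensitivity, specificity) of a variable-selection result.

    sensitivity: fraction of truly active components that were selected.
    specificity: fraction of inactive components that were left unselected.
    """
    selected = set(selected)
    truly_active = set(truly_active)
    inactive = set(all_components) - truly_active
    true_pos = len(selected & truly_active)
    true_neg = len(inactive - selected)
    sensitivity = true_pos / len(truly_active) if truly_active else float("nan")
    specificity = true_neg / len(inactive) if inactive else float("nan")
    return sensitivity, specificity


# Hypothetical example: five components, two of which truly affect the outcome.
components = ["c1", "c2", "c3", "c4", "c5"]
sens, spec = selection_accuracy(
    selected=["c1", "c3", "c4"],
    truly_active=["c1", "c2"],
    all_components=components,
)
```

A method that selects many unrelated components, as reported here for the shrinkage methods, can keep high sensitivity while its specificity falls.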


2016 ◽  
Vol 81 (3) ◽  
pp. 142-149 ◽  
Author(s):  
Chen Lu ◽  
George T. O'Connor ◽  
Josée Dupuis ◽  
Eric D. Kolaczyk

2019 ◽  
Vol 30 (3) ◽  
pp. 697-719 ◽  
Author(s):  
Fan Wang ◽  
Sach Mukherjee ◽  
Sylvia Richardson ◽  
Steven M. Hill

Abstract. Penalized likelihood approaches are widely used for high-dimensional regression. Although many methods have been proposed and the associated theory is now well developed, the relative efficacy of different approaches in finite-sample settings, as encountered in practice, remains incompletely understood. There is therefore a need for empirical investigations in this area that can offer practical insight and guidance to users. In this paper, we present a large-scale comparison of penalized regression methods. We distinguish between three related goals: prediction, variable selection and variable ranking. Our results span more than 2300 data-generating scenarios, including both synthetic and semisynthetic data (real covariates and simulated responses), allowing us to systematically consider the influence of various factors (sample size, dimensionality, sparsity, signal strength and multicollinearity). We consider several widely used approaches (Lasso, Adaptive Lasso, Elastic Net, Ridge Regression, SCAD, the Dantzig Selector and Stability Selection). We find considerable variation in performance between methods. Our results support a “no panacea” view, with no unambiguous winner across all scenarios or goals, even in this restricted setting where all data align well with the assumptions underlying the methods. The study allows us to make some recommendations as to which approaches may be most (or least) suitable given the goal and some data characteristics. Our empirical results complement existing theory and provide a resource to compare methods across a range of scenarios and metrics.

