mecor: An R package for measurement error correction in linear regression models with a continuous outcome

Author(s):  
Linda Nab ◽  
Maarten van Smeden ◽  
Ruth H. Keogh ◽  
Rolf H.H. Groenwold
2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Jan Klosa ◽  
Noah Simon ◽  
Pål Olof Westermark ◽  
Volkmar Liebscher ◽  
Dörte Wittenburg

Abstract

Background: Statistical analyses of biological problems in the life sciences often lead to high-dimensional linear models. To solve the corresponding system of equations, penalization approaches are often the methods of choice. They are especially useful in the case of multicollinearity, which appears if the number of explanatory variables exceeds the number of observations or for some biological reason. In these approaches, the model goodness of fit is penalized by some suitable function of interest. Prominent examples are the lasso, group lasso and sparse-group lasso. Here, we offer a fast and numerically cheap implementation of these operators via proximal gradient descent. The grid search for the penalty parameter is realized by warm starts. The step size between consecutive iterations is determined with backtracking line search. Finally, seagull, the R package presented here, produces complete regularization paths.

Results: Publicly available high-dimensional methylation data are used to compare seagull to the established R package SGL. The results of both packages enabled a precise prediction of biological age from DNA methylation status. Although the results of seagull and SGL were very similar (R² > 0.99), seagull computed the solution in a fraction of the time needed by SGL. Additionally, seagull enables the incorporation of weights for each penalized feature.

Conclusions: The following operators for linear regression models are available in seagull: lasso, group lasso, sparse-group lasso and Integrative LASSO with Penalty Factors (IPF-lasso). Thus, seagull is a convenient envelope of lasso variants.
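The combination of proximal gradient descent, backtracking line search and warm starts named in the abstract can be illustrated for the plain lasso. The following is a minimal sketch in pure Python, not the seagull implementation: the function names, the objective scaling (1/(2n) times the residual sum of squares plus lambda times the L1 norm), the stopping tolerance and the shrink factor are all assumptions made for illustration.

```python
def soft_threshold(z, t):
    """Proximal operator of t * |.|: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def smooth_loss(X, y, beta):
    """Differentiable part of the objective: residual sum of squares / (2n)."""
    total = 0.0
    for row, yi in zip(X, y):
        r = yi - sum(x * b for x, b in zip(row, beta))
        total += r * r
    return total / (2.0 * len(y))


def gradient(X, y, beta):
    """Gradient of smooth_loss with respect to beta."""
    n = len(y)
    grad = [0.0] * len(beta)
    for row, yi in zip(X, y):
        r = sum(x * b for x, b in zip(row, beta)) - yi
        for j, x in enumerate(row):
            grad[j] += x * r / n
    return grad


def lasso_prox_grad(X, y, lam, beta0, max_iter=500, tol=1e-10, shrink=0.5):
    """Proximal gradient descent for the lasso with backtracking line search."""
    beta = list(beta0)
    step = 1.0  # assumed initial step size; only ever decreased
    for _ in range(max_iter):
        g = gradient(X, y, beta)
        f_old = smooth_loss(X, y, beta)
        while True:
            # Gradient step on the smooth part, then the L1 proximal step.
            cand = [soft_threshold(b - step * gj, step * lam)
                    for b, gj in zip(beta, g)]
            diff = [c - b for c, b in zip(cand, beta)]
            # Quadratic upper bound on the smooth loss at the candidate.
            bound = (f_old
                     + sum(gj * d for gj, d in zip(g, diff))
                     + sum(d * d for d in diff) / (2.0 * step))
            if smooth_loss(X, y, cand) <= bound + 1e-12:
                break
            step *= shrink  # backtrack: step was too large
        beta = cand
        if max(abs(d) for d in diff) < tol:
            break
    return beta


def lasso_path(X, y, lambdas):
    """Regularization path from largest to smallest penalty, with warm starts."""
    beta = [0.0] * len(X[0])
    path = []
    for lam in sorted(lambdas, reverse=True):
        beta = lasso_prox_grad(X, y, lam, beta)  # start from previous solution
        path.append((lam, beta))
    return path
```

Calling lasso_path(X, y, [1.0, 0.1]) returns one coefficient vector per penalty value. Because each fit starts from the solution for the previous, larger penalty, it typically needs fewer iterations than a cold start from zero.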


The R Journal ◽  
2019 ◽  
Vol 11 (2) ◽  
pp. 99
Author(s):  
Lily Medina ◽  
Ann-Kristin Kreutzmann ◽  
Natalia Rojas-Perilla ◽  
Piedad Castro

Author(s):  
Anusha Musunuru ◽  
Richard J. Porter

Road safety modelers frequently use average annual daily traffic (AADT) as a measure of exposure in regression models of expected crash frequency for road segments and intersections. Recorded AADT values at most locations are estimated by state and local transportation agencies with significant uncertainty, often by extrapolating short-term traffic counts over time and space. This uncertainty in the traffic volume estimates, known in a modeling context as measurement error in right-hand-side variables, can have serious effects on model estimation, including: 1) biased regression coefficient estimates; and 2) increases in dispersion. The structure and magnitude of measurement error in AADT estimates are not clearly understood by researchers or practitioners, which makes it difficult to account for this error explicitly in statistical road safety models and, ultimately, to correct for it. This study explores the impacts of measurement error in traffic volume estimates on statistical road safety models by employing measurement error correction approaches, including regression calibration and simulation extrapolation. The concept is demonstrated using crash, traffic, and roadway data from rural, two-lane horizontal curves in the State of Washington. The overall results show that positive regression coefficient estimates became larger and negative estimates became smaller (i.e., more negative) when the measurement error correction methods were applied to the regression models of expected crash frequency. Future directions in applications of measurement error correction approaches to road safety research are provided.
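The bias described above can be illustrated in the simplest setting: a linear model with one covariate subject to classical additive measurement error. The sketch below is a simplification, not the crash-frequency models or the procedure used in the study. It assumes that the error variance is known and that the outcome is continuous, and all function names and simulation settings are hypothetical. In this setting, dividing the naive slope by the estimated reliability ratio is a method-of-moments correction that coincides with regression calibration.

```python
import random


def variance(v):
    """Sample variance with the n - 1 denominator."""
    m = sum(v) / len(v)
    return sum((a - m) ** 2 for a in v) / (len(v) - 1)


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, beta=2.0, sd_x=1.0, sd_u=0.5, seed=1):
    """Simulate y = beta * x + noise, observing only w = x + u instead of x."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, sd_x) for _ in range(n)]
    w = [xi + rng.gauss(0.0, sd_u) for xi in x]       # error-prone covariate
    y = [beta * xi + rng.gauss(0.0, 1.0) for xi in x]  # outcome depends on true x
    return w, y


def corrected_slope(w, y, error_var):
    """Return (naive slope, slope corrected by the reliability ratio)."""
    naive = ols_slope(w, y)
    var_w = variance(w)
    # Reliability ratio: estimated var(x) / var(w), with var(x) = var(w) - error_var.
    reliability = (var_w - error_var) / var_w
    return naive, naive / reliability


w, y = simulate()
naive, corrected = corrected_slope(w, y, error_var=0.5 ** 2)
print(f"naive slope: {naive:.3f}, corrected slope: {corrected:.3f}")
```

With these settings the theoretical reliability ratio is 1 / (1 + 0.25) = 0.8, so the naive slope is attenuated toward zero from the true value of 2.0 and the correction scales it back up. This matches the direction of change reported above for positive coefficients.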

