covariate selection
Recently Published Documents

TOTAL DOCUMENTS: 82 (five years: 26)
H-INDEX: 16 (five years: 3)

2022 ◽ Vol 114 ◽ pp. 105950
Author(s): Reinhard Uehleke ◽ Martin Petrick ◽ Silke Hüttel

Complexity ◽ 2022 ◽ Vol 2022 ◽ pp. 1-10
Author(s): Sara Muhammadullah ◽ Amena Urooj ◽ Faridoon Khan ◽ Mohammed N Alshahrani ◽ Mohammed Alqawba ◽ ...

To reduce the dimensionality of the parameter space and improve out-of-sample forecasting performance, this research compares regularization techniques with Autometrics in time-series modeling. We focus mainly on comparing the weighted lag adaptive LASSO (WLAdaLASSO) with Autometrics, but as benchmarks we also estimate other popular regularization methods: LASSO, AdaLASSO, SCAD, and MCP. For the comparison, we run Monte Carlo simulations and assess the performance of these techniques in terms of out-of-sample root mean square error (RMSE), gauge, and potency, varying the autocorrelation coefficient and the sample size. The simulation experiment indicates that WLAdaLASSO outperforms Autometrics and the other regularization approaches in both covariate selection and forecasting, especially when the linear dependency between predictors is strong. The computational efficiency of Autometrics, in contrast, decreases under strong linear dependency between predictors. With a large sample and weak linear dependency between predictors, however, the potency of Autometrics approaches 1 and its gauge approaches the nominal significance level α, whereas LASSO, AdaLASSO, SCAD, and MCP select more covariates and have higher RMSE than Autometrics and WLAdaLASSO. To compare the techniques on real data, we built a general unrestricted model for covariate selection and out-of-sample forecasting of Pakistan's trade balance, training the model on 1985–2015 observations and using 2016–2020 observations as test data for the out-of-sample forecast.
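
As a rough illustration of the adaptive-LASSO machinery behind WLAdaLASSO, the sketch below rescales candidate lagged predictors by coefficient-specific weights and then solves an ordinary LASSO on the rescaled design. The ridge initial estimate, the simple lag-decay factor, and the helper name adaptive_lasso_lags are illustrative assumptions, not the authors' exact specification.

# Hedged sketch of the adaptive-LASSO idea behind WLAdaLASSO (illustrative only;
# the paper's exact lag-weighting scheme is not reproduced here).
import numpy as np
from sklearn.linear_model import LassoCV, Ridge

def adaptive_lasso_lags(X, y, gamma=1.0, lag_of=None, decay=0.1):
    """Select lagged covariates using coefficient-specific penalty weights.

    X      : (n, p) matrix of candidate lagged predictors
    y      : (n,) response
    gamma  : exponent on the initial-estimate weights
    lag_of : optional array of lag orders per column, used to penalise distant
             lags more heavily (the "weighted lag" idea; assumed form below)
    decay  : strength of the extra lag penalty, weight factor 1 + decay * lag
    """
    # Step 1: initial estimate (ridge keeps this stable when p is large).
    beta_init = Ridge(alpha=1.0).fit(X, y).coef_

    # Step 2: adaptive weights; larger initial coefficients are penalised less.
    w = 1.0 / (np.abs(beta_init) ** gamma + 1e-8)
    if lag_of is not None:
        w = w * (1.0 + decay * np.asarray(lag_of))  # extra penalty on long lags

    # Step 3: plain LASSO on rescaled columns, then undo the scaling.
    X_scaled = X / w
    lasso = LassoCV(cv=5).fit(X_scaled, y)
    beta = lasso.coef_ / w

    selected = np.flatnonzero(np.abs(beta) > 0)
    return beta, selected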


2021
Author(s): Alexia Couture ◽ Danielle Iuliano ◽ Howard H Chang ◽ Neha N Patel ◽ Matthew Gilmer ◽ ...

Introduction: In the United States, COVID-19 is a nationally notifiable disease; cases and hospitalizations are reported to the CDC by the states. Identifying and reporting every case from every facility in the United States may not be feasible in the long term, so creating sustainable methods for estimating the burden of COVID-19 from established sentinel surveillance systems is becoming more important. We aimed to provide a method that leverages surveillance data as a long-term solution for estimating monthly rates of COVID-19 hospitalizations. Methods: We estimated monthly COVID-19 hospitalization rates from May 2020 through April 2021 for the 50 states using surveillance data from the COVID-19-Associated Hospitalization Surveillance Network (COVID-NET) and a Bayesian hierarchical model for extrapolation. We fit separate models for six age groups (0-17, 18-49, 50-64, 65-74, 75-84, and ≥85 years). We identified covariates from multiple data sources that varied by age, state, and/or month, and performed covariate selection for each age group using two methods: the Least Absolute Shrinkage and Selection Operator (LASSO) and spike-and-slab selection. We validated our method by checking the sensitivity of model estimates to covariate selection and model extrapolation, and by comparing our results to external data. Results: We estimated 3,569,500 (90% credible interval: 3,238,000 - 3,934,700) hospitalizations, for a cumulative incidence of 1,089.8 (988.6 - 1,201.3) COVID-19 hospitalizations per 100,000 population in the United States from May 2020 through April 2021. Cumulative incidence varied from 352 to 1,821 per 100,000 between states. Cumulative incidence was highest in the ≥85 years age group (5,583.1; 5,061.0 - 6,157.5), and the monthly hospitalization rate was highest in December (183.8; 154.5 - 218.0). Our monthly estimates by state showed variation across states in the magnitude, number, and timing of peaks. Conclusions: Our novel approach to estimating COVID-19 hospitalizations has the potential to provide sustainable estimates for monitoring COVID-19 burden, as well as a flexible framework that leverages surveillance data.
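
As a hedged sketch of the LASSO arm of the covariate-selection step, the snippet below screens candidate covariates separately for each age group. The data-frame layout, column names, and the function name select_covariates_by_age are hypothetical placeholders rather than COVID-NET variable names, and the spike-and-slab arm and the Bayesian hierarchical extrapolation are not reproduced.

# Per-age-group covariate screening with LASSO (assumed data layout; not the
# authors' production pipeline).
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

AGE_GROUPS = ["0-17", "18-49", "50-64", "65-74", "75-84", "85+"]

def select_covariates_by_age(df, candidate_cols, outcome_col="hosp_rate"):
    """Return the covariates LASSO keeps (nonzero coefficients) for each age group."""
    selected = {}
    for age in AGE_GROUPS:
        sub = df[df["age_group"] == age]
        X = StandardScaler().fit_transform(sub[candidate_cols])  # standardise predictors
        y = sub[outcome_col].to_numpy()
        fit = LassoCV(cv=5, max_iter=10_000).fit(X, y)           # cross-validated penalty
        selected[age] = [c for c, b in zip(candidate_cols, fit.coef_) if abs(b) > 0]
    return selected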


2021
Author(s): Wen Wei Loh ◽ Dongning Ren

Valid inference of cause-and-effect relations in observational studies necessitates adjusting for common causes of the focal predictor (i.e., treatment) and the outcome. When such common causes, henceforth termed confounders, remain unadjusted for, they generate spurious correlations that lead to biased causal effect estimates. But routine adjustment for all available covariates, when only a subset are truly confounders, is known to yield potentially inefficient and unstable estimators. In this article, we introduce a data-driven confounder selection strategy that focuses on stable estimation of the treatment effect. The approach exploits the causal knowledge that, after adjusting for confounders to eliminate all confounding biases, adding any remaining non-confounding covariates associated with only treatment or outcome, but not both, should not systematically change the effect estimator. The strategy proceeds in two steps. First, we prioritize covariates for adjustment by probing how strongly each covariate is associated with treatment and outcome. Next, we gauge the stability of the effect estimator by evaluating its trajectory as different covariate subsets are adjusted for. The smallest subset that yields a stable effect estimate is then selected. Thus, the strategy offers direct insight into the (in)sensitivity of the effect estimator to the chosen covariates for adjustment. The ability to correctly select confounders and yield valid causal inference following data-driven covariate selection is evaluated empirically using extensive simulation studies. Furthermore, we compare the proposed method empirically with routine variable selection methods. Finally, we demonstrate the procedure using two publicly available real-world datasets.
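
A minimal sketch of the two-step idea follows, assuming a linear outcome model, a correlation-product priority score, and a fixed stability tolerance; none of these specifics are prescribed by the abstract.

# Hedged sketch of prioritise-then-stabilise confounder selection (illustrative
# assumptions: OLS effect estimate, correlation-product ranking, tolerance "tol").
import numpy as np
import statsmodels.api as sm

def effect_estimate(y, t, X):
    """OLS coefficient of treatment t on outcome y, adjusting for covariates X."""
    design = sm.add_constant(np.column_stack([t, X]) if X.shape[1] else t)
    return sm.OLS(y, design).fit().params[1]

def stable_confounder_selection(y, t, covariates, tol=0.01):
    p = covariates.shape[1]
    # Step 1: rank covariates by |corr with treatment| * |corr with outcome|.
    score = np.array([abs(np.corrcoef(covariates[:, j], t)[0, 1]) *
                      abs(np.corrcoef(covariates[:, j], y)[0, 1]) for j in range(p)])
    order = np.argsort(-score)
    # Step 2: trace the effect estimate as the adjustment set grows, and stop
    # at the smallest set for which the estimate no longer moves.
    estimates = [effect_estimate(y, t, covariates[:, order[:k]]) for k in range(p + 1)]
    for k in range(1, p + 1):
        if abs(estimates[k] - estimates[k - 1]) < tol:  # estimate has stabilised
            return order[:k], estimates[k]
    return order, estimates[-1]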


2021
Author(s): Martin Green ◽ Eliana Lima ◽ Robert Hyde

Abstract: Epidemiological research commonly involves identifying causal factors from within high-dimensional (wide) data, where predictor variables outnumber observations. In this situation, conventional stepwise selection procedures perform poorly. Selection stability is one method to aid robust variable selection: a model is refit to repeated resamples of the data and the proportion of times each covariate is selected is calculated. A key problem when applying selection stability is determining a threshold of stability above which a covariate is deemed 'important'. In this research we describe and illustrate a two-step process for implementing a stability threshold for covariate selection. First, covariate stability distributions are established with a permuted model (randomly reordering the outcome to sever its relationship with the predictors), summarized with a cumulative distribution function. Covariate stability is then estimated using the true outcome, and covariates with stability above a threshold defined from the permuted model are selected into a final model. The proposed method performed well across 22 varied, simulated datasets with known outcomes; selection error rates were consistently lower than with conventional implementations of equivalent models. This method of covariate selection appears to offer substantial advantages over current methods for accurately identifying the correct covariates within a large, complex parameter space.
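
A minimal sketch of the permutation-calibrated stability threshold follows, assuming LASSO as the base selector, bootstrap resampling, and the 95th percentile of the permuted stability distribution as the cutoff; the paper's exact choices may differ.

# Hedged sketch: selection stability with a threshold learned from a permuted outcome.
import numpy as np
from sklearn.linear_model import LassoCV

def selection_stability(X, y, n_resamples=100, rng=None):
    """Proportion of bootstrap resamples in which each covariate gets a nonzero coefficient."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_resamples):
        idx = rng.integers(0, n, size=n)                       # bootstrap resample
        fit = LassoCV(cv=5, max_iter=10_000).fit(X[idx], y[idx])
        counts += (np.abs(fit.coef_) > 0)
    return counts / n_resamples

def permutation_threshold_selection(X, y, n_resamples=100, quantile=0.95, rng=None):
    rng = np.random.default_rng(rng)
    # Step 1: null stability distribution from a permuted outcome (signal severed).
    y_perm = rng.permutation(y)
    null_stability = selection_stability(X, y_perm, n_resamples, rng=rng)
    threshold = np.quantile(null_stability, quantile)
    # Step 2: stability with the true outcome; keep covariates above the threshold.
    true_stability = selection_stability(X, y, n_resamples, rng=rng)
    return np.flatnonzero(true_stability > threshold), true_stability, threshold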

