A Simple Aggregation Rule for Penalized Regression Coefficients after Multiple Imputation

2021 ◽  
pp. 1-14
Author(s):  
Ryan A. Peterson
2018 ◽  
Vol 22 (4) ◽  
pp. 391-409
Author(s):  
John M. Roberts ◽  
Aki Roberts ◽  
Tim Wadsworth

Incident-level homicide datasets such as the Supplementary Homicide Reports (SHR) commonly exhibit missing data. We evaluated multiple imputation methods (that produce multiple completed datasets, across which imputed values may vary) via unique data that included actual values, from police agency incident reports, of seemingly missing SHR data. This permitted evaluation under a real, not assumed or simulated, missing data mechanism. We compared analytic results based on multiply imputed and actual data; multiple imputation rather successfully recovered victim–offender relationship distributions and regression coefficients that hold in the actual data. Results are encouraging for users of multiple imputation, though it is still important to minimize the extent of missing information in SHR and similar data.
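Analyses of multiply imputed data are commonly combined across the completed datasets with Rubin's rules. The abstract does not say which pooling rule the authors applied, so the Python sketch below is an illustration under that assumption, and the function name `pool_rubin` is hypothetical.

```python
import numpy as np

def pool_rubin(estimates, variances):
    """Pool m estimates (one per completed dataset) with Rubin's rules.

    estimates: length-m sequence of point estimates (e.g. one coefficient)
    variances: length-m sequence of their squared standard errors
    Returns the pooled estimate and its total variance.
    """
    q = np.asarray(estimates, dtype=float)
    u = np.asarray(variances, dtype=float)
    m = q.shape[0]
    q_bar = q.mean(axis=0)               # pooled point estimate
    u_bar = u.mean(axis=0)               # average within-imputation variance
    b = q.var(axis=0, ddof=1)            # between-imputation variance
    total = u_bar + (1.0 + 1.0 / m) * b  # total variance of the pooled estimate
    return q_bar, total

# Illustrative (made-up) coefficient estimates from three completed datasets
pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The total variance grows with the disagreement between completed datasets, which is how imputation uncertainty enters the pooled standard error.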


Mathematics ◽  
2021 ◽  
Vol 9 (11) ◽  
pp. 1288
Author(s):  
Daniela De Canditiis ◽  
Italia De Feis

We introduce a new methodology for anomaly detection (AD) in multichannel fast oscillating signals based on nonparametric penalized regression. Assuming the signals share similar shapes and characteristics, the estimation procedures are based on the use of the Rational-Dilation Wavelet Transform (RADWT), equipped with a tunable Q-factor able to provide sparse representations of functions with different oscillations persistence. Under the standard hypothesis of Gaussian additive noise, we model the signals by the RADWT and the anomalies as additive in each signal. Then we perform AD imposing a double penalty on the multiple regression model we obtained, promoting group sparsity both on the regression coefficients and on the anomalies. The first constraint preserves a common structure on the underlying signal components; the second one aims to identify the presence/absence of anomalies. Numerical experiments show the performance of the proposed method in different synthetic scenarios as well as in a real case.


2018 ◽  
Vol 28 (5) ◽  
pp. 1311-1327
Author(s):  
Faisal M Zahid ◽  
Christian Heumann

Missing data is a common issue that can cause problems in estimation and inference in biomedical, epidemiological and social research. Multiple imputation is an increasingly popular approach for handling missing data. When a large number of covariates have missing data, existing multiple imputation software packages may not work properly and often produce errors. We propose a multiple imputation algorithm called mispr based on sequential penalized regression models. Each variable with missing values is assumed to have a different distributional form and is imputed with its own imputation model using the ridge penalty. When the number of predictors is large relative to the sample size, the quadratic penalty guarantees unique parameter estimates and leads to better predictions than the usual Maximum Likelihood Estimation (MLE), with a good compromise between bias and variance. As a result, the proposed algorithm performs well and provides better imputed values even for a large number of covariates with small samples. The results are compared with the existing software packages mice, VIM and Amelia in simulation studies. The missing at random mechanism was the main assumption in the simulation study. The imputation performance of the proposed algorithm is evaluated with mean squared imputation error and mean absolute imputation error. The mean squared error, parameter estimates with their standard errors, and confidence intervals are also computed to compare the performance in the regression context. The proposed algorithm is observed to be a good competitor to the existing algorithms, with smaller mean squared imputation error, mean absolute imputation error and mean squared error. The algorithm's performance becomes considerably better than that of the existing algorithms as the number of covariates increases, especially when the number of predictors is close to or even greater than the sample size. Two real-life datasets are also used to examine the performance of the proposed algorithm using simulations.
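The uniqueness claim for the quadratic penalty can be checked numerically. The following Python sketch is not the mispr algorithm; it only shows, on simulated data with more predictors than observations, that the ridge normal equations have a single solution even though the unpenalized ones are rank deficient. The helper `ridge_fit` and all data are assumptions made for illustration.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Solve (X'X + lam * I) beta = X'y, which has a unique solution for lam > 0."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

rng = np.random.default_rng(0)
n, p = 10, 25                        # more predictors than observations
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

# X'X is p x p but has rank at most n, so least squares has no unique solution
rank = np.linalg.matrix_rank(X.T @ X)

# Adding lam * I makes the system positive definite and therefore solvable
beta = ridge_fit(X, y, lam=1.0)
```

In a sequential imputation scheme, a fit of this kind would be repeated for each incomplete variable in turn, which is why the penalty matters when covariates outnumber cases.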


Marketing ZFP ◽  
2019 ◽  
Vol 41 (4) ◽  
pp. 33-42
Author(s):  
Thomas Otter

Empirical research in marketing often is, at least in parts, exploratory. The goal of exploratory research, by definition, extends beyond the empirical calibration of parameters in well established models and includes the empirical assessment of different model specifications. In this context researchers often rely on the statistical information about parameters in a given model to learn about likely model structures. An example is the search for the 'true' set of covariates in a regression model based on confidence intervals of regression coefficients. The purpose of this paper is to illustrate and compare different measures of statistical information about model parameters in the context of a generalized linear model: classical confidence intervals, bootstrapped confidence intervals, and Bayesian posterior credible intervals from a model that adapts its dimensionality as a function of the information in the data. I find that inference from the adaptive Bayesian model dominates that based on classical and bootstrapped intervals in a given model.
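As a point of reference for one of the interval types compared, here is a minimal Python sketch of a percentile bootstrap confidence interval for a regression slope. It assumes an ordinary linear model on simulated data rather than the generalized linear model studied in the paper, and the function name `bootstrap_slope_ci` is hypothetical.

```python
import numpy as np

def bootstrap_slope_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the slope of y on x."""
    rng = np.random.default_rng(seed)
    n = len(x)
    slopes = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)              # resample cases with replacement
        slopes[b] = np.polyfit(x[idx], y[idx], 1)[0]  # first coefficient is the slope
    lo, hi = np.quantile(slopes, [alpha / 2, 1 - alpha / 2])
    return lo, hi

# Simulated data with a true slope of 2 and small noise
rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 2.0 * x + 0.1 * rng.normal(size=100)
lo, hi = bootstrap_slope_ci(x, y)
```

Like the classical interval, this one is conditional on the chosen model, which is the limitation the abstract contrasts with the adaptive Bayesian approach.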


2010 ◽  
Author(s):  
Stanley J. Zarnoch ◽  
H. Ken Cordell ◽  
Carter J. Betz ◽  
John C. Bergstrom

2020 ◽  
pp. 89-97
Author(s):  
A. U. Yakupov ◽  
D. A. Cherentsov ◽  
K. S. Voronin ◽  
Yu. D. Zemenkov

This article processes the results of a computer experiment to determine the cooling time of oil in a stopped oil pipeline. In previous work we proposed a calculation model that simulates the oil cooling process. Those earlier results needed to be verified by a laboratory experiment on a test stand with soil, which first required planning the experiment. The factors affecting the cooling time of oil in the pipeline, which will be varied in the proposed experiment, are determined, and empirical relationships are established. A regression analysis was carried out, and the homogeneity of variances was checked using the Cochran criterion. Estimates of the reproducibility variances are calculated. The adequacy hypothesis was tested using the Fisher criterion. Significant regression coefficients are established.
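The variance homogeneity check mentioned above rests on Cochran's C statistic, the ratio of the largest sample variance to the sum of all sample variances. The Python sketch below computes only the statistic; the replicate measurements are invented for illustration, the function name `cochran_c` is hypothetical, and the comparison against a tabulated critical value is left out.

```python
import numpy as np

def cochran_c(groups):
    """Cochran's C: largest sample variance divided by the sum of sample variances.

    groups: sequence of equally sized replicate samples, one per experimental point.
    """
    variances = np.array([np.var(g, ddof=1) for g in groups])
    return variances.max() / variances.sum()

# Made-up replicate cooling times (hours) at four experimental points
replicates = [
    [10.0, 11.0, 12.0],   # sample variance 1.0
    [20.0, 22.0, 20.0],   # sample variance 4/3
    [30.0, 30.5, 31.0],   # sample variance 0.25
    [40.0, 41.0, 39.0],   # sample variance 1.0
]
c_stat = cochran_c(replicates)
```

Variances are treated as homogeneous when the statistic stays below the critical value for the given number of groups and replicates.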


The present study explored the relationship between spot and futures coffee prices. Correlation and regression analyses were carried out on monthly observations of International Coffee Organization (ICO) indicator prices for the four groups (Colombian Milds, Other Milds, Brazilian Naturals, and Robustas), representing the Spot markets, and on the averages of the 2nd and 3rd positions of the Intercontinental Exchange (ICE) New York for Arabica and ICE Europe for Robusta, representing the Futures market, for the period 1990 to 2019. The study also used the monthly average prices paid to coffee growers in India from 1990 to 2019. The estimated correlation coefficients indicated that Futures prices and Spot prices of coffee are highly correlated. Further, the estimated regression coefficients revealed a very strong relationship between Futures prices and Spot prices for all four ICO group indicator prices. Hence, the ICE New York (Arabica) and ICE Europe (Robusta) coffee futures prices are very closely related to Spot prices. The estimated regression coefficients between Futures prices and the price paid to coffee growers in India confirmed a positive relationship, but the wider dispersion of prices around the trend line indicates a weaker correlation between the price paid to growers in India and Futures market prices during the study period.


2019 ◽  
Author(s):  
Josh Colston ◽  
Pablo Peñataro Yori ◽  
Lawrence H. Moulton ◽  
Maribel Paredes Olortegui ◽  
Peter S. Kosek ◽  
...  
