Centering in Multiple Regression Does Not Always Reduce Multicollinearity: How to Tell When Your Estimates Will Not Benefit From Centering

2018 ◽  
Vol 79 (5) ◽  
pp. 813-826 ◽  
Author(s):  
Oscar L. Olvera Astivia ◽  
Edward Kroc

Within the context of moderated multiple regression, mean centering is recommended both to simplify the interpretation of the coefficients and to reduce the problem of multicollinearity. For almost 30 years, theoreticians and applied researchers have advocated for centering as an effective way to reduce the correlation between variables and thus produce more stable estimates of regression coefficients. By reviewing the theory on which this recommendation is based, this article presents three new findings. First, that the original assumption of expectation-independence among predictors on which this recommendation is based can be expanded to encompass many other joint distributions. Second, that for many jointly distributed random variables, even some that enjoy considerable symmetry, the correlation between the centered main effects and their respective interaction can increase when compared with the correlation of the uncentered effects. Third, that the higher order moments of the joint distribution play as much of a role as lower order moments such that the symmetry of lower dimensional marginals is a necessary but not sufficient condition for a decrease in correlation between centered main effects and their interaction. Theoretical and simulation results are presented to help conceptualize the issues.
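
The second finding can be illustrated with a small simulation. The sketch below is not the authors' simulation design: the two distributions, the function names (`pearson`, `corr_main_vs_interaction`, `simulate`) and the sample size are assumptions chosen for illustration. Case A uses independent normal predictors with mean 5, where the population correlation between X and XY is about 0.70 before centering and 0 after. Case B uses a skewed X (a shifted exponential) with Y = X + noise, with means chosen so that the uncentered population correlation is 0 while the centered one is 2/3.

```python
import math
import random


def pearson(u, v):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def corr_main_vs_interaction(x, y, center):
    """Correlation between the main effect x and the product term x*y.

    With center=True, both predictors are mean-centered before the
    product term is formed.
    """
    if center:
        mx, my = sum(x) / len(x), sum(y) / len(y)
        x = [a - mx for a in x]
        y = [b - my for b in y]
    product = [a * b for a, b in zip(x, y)]
    return pearson(x, product)


def simulate(n=100_000, seed=0):
    rng = random.Random(seed)

    # Case A (assumed example): independent normals with mean 5, sd 1.
    xa = [rng.gauss(5, 1) for _ in range(n)]
    ya = [rng.gauss(5, 1) for _ in range(n)]

    # Case B (assumed example): skewed, dependent predictors.
    # xc is a mean-zero exponential; yc = xc + standard normal noise.
    # Shifting both means to -1 makes cov(X, XY) zero in the population.
    xc = [rng.expovariate(1.0) - 1.0 for _ in range(n)]
    yc = [a + rng.gauss(0, 1) for a in xc]
    xb = [a - 1.0 for a in xc]
    yb = [b - 1.0 for b in yc]

    return {
        "normal_raw": corr_main_vs_interaction(xa, ya, center=False),
        "normal_centered": corr_main_vs_interaction(xa, ya, center=True),
        "skewed_raw": corr_main_vs_interaction(xb, yb, center=False),
        "skewed_centered": corr_main_vs_interaction(xb, yb, center=True),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

In Case B the third central moment of X is nonzero, so the centered main effect stays correlated with the centered product term. This is one way to see why higher order moments matter for whether centering helps.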

2021 ◽  
Vol 106 (3) ◽  
pp. 467-475
Author(s):  
Jeffrey B. Vancouver ◽  
Bruce W. Carlson ◽  
Lindsay Y. Dhanani ◽  
Cassandra E. Colton

2019 ◽  
Author(s):  
Adam Altmejd ◽  
Anna Dreber ◽  
Eskil Forsell ◽  
Teck Hua Ho ◽  
Juergen Huber ◽  
...  

We measure how accurately replication of experimental results can be predicted by a black-box statistical model. With data from four large-scale replication projects in experimental psychology and economics, and techniques from machine learning, we train a predictive model and study which variables drive predictable replication. The model predicts binary replication with a cross-validated accuracy rate of 70% (AUC of 0.79) and relative effect size with a Spearman ρ of 0.38. The accuracy level is similar to the market-aggregated beliefs of peer scientists (Camerer et al., 2016; Dreber et al., 2015). The predictive power is validated in a pre-registered out-of-sample test of the outcome of Camerer et al. (2018b), where 71% (AUC of 0.73) of replications are predicted correctly and effect size correlations amount to ρ = 0.25. Basic features such as the sample and effect sizes in original papers, and whether reported effects are single-variable main effects or two-variable interactions, are predictive of successful replication. The models presented in this paper are simple tools to produce cheap, prognostic replicability metrics. These models could be useful in institutionalizing the process of evaluation of new findings and guiding resources to those direct replications that are likely to be most informative.


Author(s):  
K. P. Singh ◽  
B. Patel ◽  
Rakesh Kumar ◽  
R. K. Roy ◽  
S. K. Singh

The study on cauliflower cv. ‘Pusa Dipali’ was carried out to estimate the correlation and multiple regression coefficients of yield and yield-contributing characters. Yield was highly and significantly positively correlated with all the ancillary characters, viz. curd depth (0.9180), curd diameter (0.9050), weight of curd (0.8990), plant height (0.8898), weight of plant (0.8768) and plant girth (0.6880). The multiple regression coefficients were non-significant owing to multicollinearity among the characters. Stepwise regression analysis showed that curd depth made the highest contribution to yield, followed by curd weight, curd diameter and plant height, while plant girth and weight of plant contributed least.


1981 ◽  
Vol 61 (2) ◽  
pp. 255-263 ◽  
Author(s):  
R. M. De PAUW ◽  
D. G. FARIS ◽  
C. J. WILLIAMS

Three cultivars of each crop, wheat (Triticum aestivum L.), oats (Avena sativa L.), and barley (Hordeum vulgare L.), were grown for 4 yr at five locations north of the 55th parallel in northwestern Canada. There were highly significant differences among all main effects and interactions. Galt barley produced the highest seed yield followed by Centennial barley, Random oats and Harmon oats. Victory oats, Olli barley, Neepawa wheat and Pitic 62 wheat yielded similarly to each other while Thatcher wheat was significantly lower yielding. Mean environment yields ranged from 2080 to 5610 kg/ha. The genotype-environment (GE) interaction of species and cultivars was sufficiently complicated that it could not be characterized by one or two statistics (e.g., stability variances or regression coefficients). However, variability in frost-free period among years and locations contributed to the GE interaction because, for example, some cultivars yielded well (e.g., Pitic 62) only in those year-location environments with a relatively long frost-free period while other early maturing cultivars (e.g., Olli) performed well even in a short frost-free period environment.


2019 ◽  
Vol 11 (13) ◽  
pp. 3523 ◽  
Author(s):  
Benjamín García García ◽  
Caridad Rosique Jiménez ◽  
Felipe Aguado-Giménez ◽  
José García García

Equations were developed through multiple regression analysis (MRA) to explain the variability of potential environmental impacts (PEIs) estimated by life cycle assessment (LCA). The case studied is the production of seabass in basic offshore fish farms. Contribution analysis showed that the components of the system that most influence the potential environmental impacts are the feed (54% of the overall impact) and the fuel consumed by vessels operating in the farm (23%). Feed and fuel varied widely from one fish farm to another owing to factors such as the efficiency of the feeding system used in each farm and the distance from the harbor to the farm. Therefore, 13 scenarios were simulated with different values of both factors, and the resulting PEIs were fitted by MRA to the model: PEI = a + b × Feed + c × Fuel. For all the PEIs, the regression coefficients were significant (p < 0.05) and R² was 1. These equations allow quick and simple estimation of very different scenarios that reflect the reality of different farms at present, as well as future scenarios based on the implementation of technologies that will decrease both feed and fuel consumption.
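
The fitted model PEI = a + b × Feed + c × Fuel can be sketched as an ordinary least-squares fit. The code below is a minimal illustration, not the authors' procedure: the 13 scenario values and the coefficients in `TRUE_COEF` are made up, and the helper names are hypothetical. The synthetic PEI is generated as an exact linear function of feed and fuel, which is an assumption here and is what yields an R² of 1 in this sketch.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(rows, y):
    """Least-squares (a, b, c) for y = a + b*x1 + c*x2 via normal equations."""
    design = [[1.0, x1, x2] for x1, x2 in rows]
    k = 3
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def r_squared(rows, y, coef):
    """Coefficient of determination of the fitted plane."""
    a, b, c = coef
    pred = [a + b * x1 + c * x2 for x1, x2 in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical values for illustration only (not taken from the study).
TRUE_COEF = (0.5, 2.0, 3.0)  # assumed a, b, c
SCENARIOS = [(1.5 + 0.1 * (i % 5), 0.10 + 0.05 * (i % 4)) for i in range(13)]
PEI = [TRUE_COEF[0] + TRUE_COEF[1] * feed + TRUE_COEF[2] * fuel
       for feed, fuel in SCENARIOS]

coef = fit_ols(SCENARIOS, PEI)
r2 = r_squared(SCENARIOS, PEI, coef)

if __name__ == "__main__":
    print("a, b, c =", [round(c, 6) for c in coef])
    print("R^2 =", round(r2, 6))
```

Once fitted, a new scenario is evaluated by plugging its feed and fuel values into a + b × Feed + c × Fuel, which is the quick estimation the abstract describes.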

