Improving on Adjusted R-Squared

2019 ◽  
Author(s):  
Julian Karch

The amount of variance explained is widely reported for quantifying the model fit of a multiple linear regression model. The default adjusted R-squared estimator has the disadvantage of being biased. The theoretically optimal Olkin-Pratt estimator is unbiased. Despite this, it is not used because it is difficult to compute. In this paper, I present an algorithm for the exact and fast computation of the Olkin-Pratt estimator, which makes its use practical. I compare the Olkin-Pratt, the adjusted R-squared, and 18 alternative estimators using a simulation study. The metrics I use for comparison closely resemble established theoretical optimality properties. Importantly, the exact Olkin-Pratt estimator is shown to be optimal under the standard metric, which considers an estimator optimal if it has the least mean squared error among all unbiased estimators. Under the important alternative metric, which aims for the estimator with the lowest mean squared error, no optimal estimator could be identified. Based on these results, I provide careful recommendations on when to use which estimator, which first and foremost depends on which metric is deemed most appropriate. If such a choice is infeasible, I recommend using the exact Olkin-Pratt instead of the default adjusted R-squared estimator. To facilitate this, I provide the R package altR2, which implements the Olkin-Pratt estimator as well as all other estimators.

2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Julian Karch

The amount of variance explained is widely reported for quantifying the model fit of a multiple linear regression model. The default adjusted R-squared estimator has the disadvantage of being biased. The theoretically optimal Olkin-Pratt estimator is unbiased. Despite this, it is not used because it is difficult to compute. In this paper, I present an algorithm for the exact and fast computation of the Olkin-Pratt estimator, which facilitates its use. I compare the Olkin-Pratt, the adjusted R-squared, and 18 alternative estimators using a simulation study. The metrics I use for comparison closely resemble established theoretical optimality properties. Importantly, the exact Olkin-Pratt estimator is shown to be optimal under the standard metric, which considers an estimator optimal if it has the least mean squared error among all unbiased estimators. Under the important alternative metric, which aims for the estimator with the lowest mean squared error, no optimal estimator could be identified. Based on these results, I provide careful recommendations on when to use which estimator, which first and foremost depends on which metric is deemed most appropriate. If such a choice is infeasible, I recommend using the exact Olkin-Pratt instead of the default adjusted R-squared estimator. To facilitate this, I provide the R package altR2, which implements the Olkin-Pratt estimator as well as all other estimators.
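
To make the two estimators concrete, here is a minimal Python sketch. It assumes the usual adjusted R-squared formula and the closed form in which the Olkin-Pratt estimator is commonly stated, 1 - (N - 3)/(N - p - 1) * (1 - R^2) * 2F1(1, 1; (N - p + 1)/2; 1 - R^2), with N observations and p predictors. The function names are hypothetical, and the naive truncated series below is only a stand-in: it is not the exact algorithm presented in the paper, nor the interface of the altR2 package.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R-squared for n observations and p predictors."""
    return 1.0 - (n - 1) / (n - p - 1) * (1.0 - r2)


def hyp2f1_11(c: float, z: float, tol: float = 1e-12, max_terms: int = 100_000) -> float:
    """Truncated Gauss series for 2F1(1, 1; c; z), restricted to 0 <= z < 1.

    Naive illustration only: convergence becomes slow as z approaches 1.
    """
    if not 0.0 <= z < 1.0:
        raise ValueError("this sketch only handles 0 <= z < 1")
    term = 1.0   # k = 0 term of the series
    total = 1.0
    for k in range(max_terms):
        # Ratio of consecutive terms is (k + 1) * z / (c + k).
        term *= (k + 1) * z / (c + k)
        total += term
        if term < tol * total:
            break
    return total


def olkin_pratt_r2(r2: float, n: int, p: int) -> float:
    """Olkin-Pratt estimate from the sample R-squared (assumed closed form)."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1")
    z = 1.0 - r2
    c = (n - p + 1) / 2.0
    return 1.0 - (n - 3) / (n - p - 1) * z * hyp2f1_11(c, z)


# Hypothetical inputs: sample R-squared of 0.5, 50 observations, 3 predictors.
print(adjusted_r2(0.5, 50, 3), olkin_pratt_r2(0.5, 50, 3))
```

Because the series is restricted to arguments below 1, this sketch raises an error for a sample R-squared of exactly 0; a production implementation would need a more careful evaluation of the hypergeometric function.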


1993 ◽  
Vol 9 (1) ◽  
pp. 62-80 ◽  
Author(s):  
Jan F. Kiviet ◽  
Garry D.A. Phillips

The small sample bias of the least-squares coefficient estimator is examined in the dynamic multiple linear regression model with normally distributed white-noise disturbances and an arbitrary number of regressors which are all exogenous except for the one-period lagged-dependent variable. We employ large sample (T → ∞) and small disturbance (σ → 0) asymptotic theory and derive and compare expressions to O(T⁻¹) and to O(σ²), respectively, for the bias in the least-squares coefficient vector. In some simulations and for an empirical example, we examine the mean (squared) error of these expressions and of corrected estimation procedures that yield estimates that are unbiased to O(T⁻¹) and to O(σ²), respectively. The large sample approach proves to be superior, easily applicable, and capable of generating more efficient and less biased estimators.


2021 ◽  
Vol 19 (1) ◽  
pp. 2-21
Author(s):  
Talha Omer ◽  
Zawar Hussain ◽  
Muhammad Qasim ◽  
Said Farooq Shah ◽  
Akbar Ali Khan

Shrinkage estimators are introduced for the scale parameter of the Rayleigh distribution by using two different shrinkage techniques. The mean squared error properties of the proposed estimators are derived. The proposed classes of estimators are compared with the respective conventional unbiased estimators in terms of mean squared error in a simulation study. Simulation results show that the proposed shrinkage estimators yield smaller mean squared error than the existing unbiased estimators.
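
The general idea that a shrunken estimator can beat an unbiased one in mean squared error can be illustrated with a small simulation. The sketch below is a generic example and not the estimators proposed in the paper: it assumes the target is σ² of a Rayleigh(σ) distribution, uses the unbiased estimator ΣX²/(2n), and shrinks it toward zero by the factor n/(n + 1). All names and parameter values are hypothetical.

```python
import math
import random


def rayleigh_sample(sigma: float, n: int, rng: random.Random) -> list[float]:
    """Draw n Rayleigh(sigma) variates by inverse-transform sampling."""
    # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
    return [sigma * math.sqrt(-2.0 * math.log(1.0 - rng.random())) for _ in range(n)]


def compare_mse(sigma: float = 2.0, n: int = 5, reps: int = 20000, seed: int = 0):
    """Monte Carlo MSE of an unbiased and a shrunken estimator of sigma**2."""
    rng = random.Random(seed)
    theta = sigma ** 2
    se_unbiased = 0.0
    se_shrunk = 0.0
    for _ in range(reps):
        t = sum(x * x for x in rayleigh_sample(sigma, n, rng)) / 2.0
        se_unbiased += (t / n - theta) ** 2       # unbiased: E[t / n] = sigma**2
        se_shrunk += (t / (n + 1) - theta) ** 2   # shrunk toward zero by n / (n + 1)
    return se_unbiased / reps, se_shrunk / reps


mse_unbiased, mse_shrunk = compare_mse()
print(f"unbiased: {mse_unbiased:.3f}  shrunk: {mse_shrunk:.3f}")
```

For this particular pair the theoretical mean squared errors are σ⁴/n and σ⁴/(n + 1), so the shrunken version should come out ahead at every sample size, with the gap narrowing as n grows.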


2014 ◽  
Vol 2014 ◽  
pp. 1-6 ◽  
Author(s):  
Jibo Wu

The stochastic restricted r-k class estimator and the stochastic restricted r-d class estimator are proposed for the vector of parameters in a multiple linear regression model with stochastic linear restrictions. The mean squared error matrix of the proposed estimators is derived and compared, and some properties of the proposed estimators are also discussed. Finally, a numerical example is given to show some of the theoretical results.


1994 ◽  
Vol 115 (2) ◽  
pp. 335-363 ◽  
Author(s):  
Stephen Man Sing Lee

A parametric bootstrap estimate (PB) may be more accurate than its non-parametric version (NB) if the parametric model upon which it is based is, at least approximately, correct. Construction of an optimal estimator based on both PB and NB is pursued with the aim of minimizing the mean squared error. Our approach is to pick an empirical estimate of the optimal tuning parameter ε∈[0, 1] which minimizes the mean square error of εNB+(1−ε) PB. The resulting hybrid estimator is shown to be more reliable than either PB or NB uniformly over a rich class of distributions. Theoretical asymptotic results show that the asymptotic error of this hybrid estimator is quite close in distribution to the smaller of the errors of PB and NB. All these errors typically have the same convergence rate of order . A particular example is also presented to illustrate the fact that this hybrid estimate can indeed be strictly better than either of the pure bootstrap estimates in terms of minimizing mean squared error. Two simulation studies were conducted to verify the theoretical results and demonstrate the good practical performance of the hybrid method.


1992 ◽  
Vol 22 (2) ◽  
pp. 149-165 ◽  
Author(s):  
Ragnar Norberg

The theory of linear filtering of stochastic processes provides continuous time analogues of finite-dimensional linear Bayes estimators known to actuaries as credibility methods. In the present paper a self-contained theory is built for processes of bounded variation, which are of particular relevance to insurance. Two methods for constructing the optimal estimator and its mean squared error are devised. Explicit solutions are obtained in a continuous time variation of Hachemeister's regression model and in a homogeneous doubly stochastic generalized Poisson process. The traditional discrete time set-up is compared to the one with continuous time, and some merits of the latter are pointed out.


2012 ◽  
Vol 61 (2) ◽  
pp. 277-290 ◽  
Author(s):  
Ádám Csorba ◽  
Vince Láng ◽  
László Fenyvesi ◽  
Erika Michéli

There is a growing demand for technologies and methods that enable rapid, cost-effective, and environmentally friendly collection and evaluation of soil data. Reflectance spectroscopy, which is based on reflectance measurements in the visible (VIS) and near-infrared (NIR) range (350–2500 nm) of the electromagnetic spectrum, meets these demands. Because the reflectance spectrum recorded from soils is very rich in information, and many soil constituents have a characteristic spectral "fingerprint" in the studied range, a large number of key soil parameters can be determined simultaneously from a single curve. In this paper we present the first steps of a methodological development, built on the foundations of reflectance spectroscopy, that aims to determine the composition of soils. We built and tested predictive models based on multivariate mathematical-statistical methods (partial least squares regression, PLSR) that allow the organic carbon and CaCO3 content of soils to be estimated. Testing the models showed that the procedure gave high R2 values for both soil parameters [R2(organic carbon) = 0.815; R2(CaCO3) = 0.907]. The root mean squared error (RMSE), which indicates the accuracy of the estimate, can be considered moderate for both parameters [RMSE(organic carbon) = 0.467; RMSE(CaCO3) = 3.508], and could be improved considerably by standardizing the reflectance measurement protocols. Based on our investigations, we conclude that combining reflectance spectroscopy with multivariate chemometric procedures yields a rapid and cost-effective method for data collection and evaluation.
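
The two fit statistics reported above can be computed from observed and predicted values as in the following sketch. It assumes the coefficient-of-determination definition of R2 (one minus the ratio of residual to total sum of squares); the abstract does not say which definition the authors used, and the data below are hypothetical.

```python
import math


def r_squared(observed: list[float], predicted: list[float]) -> float:
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed: list[float], predicted: list[float]) -> float:
    """Root mean squared error, in the same units as the observations."""
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(ss_res / len(observed))


# Hypothetical measured and model-predicted values for four samples.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(r_squared(observed, predicted), rmse(observed, predicted))
```

Because RMSE carries the units of the measured quantity, the two RMSE values in the abstract are not directly comparable with each other, whereas the two R2 values are.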

