Evaluating the Statistical Significance of Models Developed by Stepwise Regression

1983 ◽  
Vol 20 (1) ◽  
pp. 1-11 ◽  
Author(s):  
Shelby H. McIntyre ◽  
David B. Montgomery ◽  
V. Srinivasan ◽  
Barton A. Weitz

Information for evaluating the statistical significance of stepwise regression models developed with a forward selection procedure is presented. Cumulative distributions of the adjusted coefficient of determination (R̄²) under the null hypothesis of no relationship between the dependent variable and m potential independent variables are derived from a Monte Carlo simulation study. The study design included sample sizes of 25, 50, and 100, available independent variables of 10, 20, and 40, and three criteria for including variables in the regression model. The results reveal that the biases involved in testing statistical significance by two well-known rules are very large, thus demonstrating the desirability of using the Monte Carlo cumulative R̄² distributions developed by the authors. Although the results were derived under the assumption of uncorrelated predictors, the authors show that the results continue to be useful for the correlated predictor case.
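The simulation idea described in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' procedure: it assumes independent standard normal data, a single hypothetical F-to-enter threshold of 4.0 (the abstract does not state the three inclusion criteria), and far fewer replications than a real study would use. The function names are illustrative.

```python
import math
import random


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def forward_select_adj_r2(y, columns, f_enter=4.0):
    """Forward selection with an F-to-enter rule; returns (adjusted R^2, k)."""
    n = len(y)
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]  # residual of y given the intercept
    tss = dot(resid, resid)
    # Centre each candidate so it is orthogonal to the intercept.
    cols = [[v - sum(c) / n for v in c] for c in columns]
    remaining = list(range(len(cols)))
    rss, k = tss, 0
    while remaining and n - k - 2 > 0 and rss > 1e-12 * tss:
        # RSS reduction from adding each (already residualised) candidate.
        gains = {}
        for j in remaining:
            ss = dot(cols[j], cols[j])
            if ss > 1e-12:
                gains[j] = dot(cols[j], resid) ** 2 / ss
        if not gains:
            break
        best = max(gains, key=gains.get)
        new_rss = rss - gains[best]
        if new_rss <= 0:
            f_stat = math.inf
        else:
            f_stat = gains[best] / (new_rss / (n - k - 2))
        if f_stat < f_enter:  # hypothetical stopping rule
            break
        q = cols[best]
        qq = dot(q, q)
        b = dot(q, resid) / qq
        resid = [r - b * v for r, v in zip(resid, q)]
        rss, k = max(new_rss, 0.0), k + 1
        remaining.remove(best)
        # Residualise the remaining candidates on the variable just entered.
        for j in remaining:
            g = dot(cols[j], q) / qq
            cols[j] = [c - g * v for c, v in zip(cols[j], q)]
    r2 = 1.0 - rss / tss
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1), k


def null_adj_r2_quantile(n, m, reps=200, q=0.95, f_enter=4.0, seed=0):
    """Empirical quantile of adjusted R^2 when y is unrelated to all m predictors."""
    rng = random.Random(seed)
    values = []
    for _ in range(reps):
        y = [rng.gauss(0, 1) for _ in range(n)]
        cols = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(m)]
        values.append(forward_select_adj_r2(y, cols, f_enter)[0])
    values.sort()
    return values[min(reps - 1, int(q * reps))]


print(null_adj_r2_quantile(n=25, m=10))
```

An observed adjusted R² from a stepwise fit would then be compared with the simulated quantile for the matching n and m rather than with a critical value that ignores the selection step.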

Author(s):  
Rolando Pena-Sanchez ◽  
Jacques Verville ◽  
Christine Bernadas

Researchers in the field of information systems often face problems with variable selection for model building, as well as difficulties associated with their data (small samples and/or non-normality). The goal of this article is to present an original statistical blocking technique based on relative variability for screening variables in multivariate regression models. We applied the blocking technique and a nonparametric bootstrapping method to data collected on the USA-South border for research concerning enterprise software (ES) acquisition contracts. Three mutually exclusive blocks of relative variability for the response variables were formed, and their corresponding regression models were built and explained. A conclusion was drawn about the decreasing tendency in the magnitude of the adjusted coefficient of determination (R²adj) when the blocks change from a low (L) to a high (H) condition of relative variability. The models obtained via stepwise regression exhibited significant p-values (0.0001).


2013 ◽  
Vol 11 (2) ◽  
pp. 147
Author(s):  
. Anjarwati

This study analyzes the development of intermediation in relation to economic growth in Indonesia and tests whether the effect of intermediation on economic growth is significant. The coefficient of determination (R2) of the multiple linear regression model is 0.916, meaning that the independent variables jointly explain 91.6% of the variation in the dependent variable. The t-tests show that lending and the loan interest rate each have a significant effect on economic growth, since t-count > t-table in both cases: for lending (X1), t-count 7.944 > t-table 2.026, and for the loan interest rate (X2), t-count 4.521 > t-table 2.026. The simultaneous F-test indicates that the independent variables jointly have a significant effect on economic growth, with calculated F 204.012 > F-table 3.25, so H0 is rejected.
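The overall F statistic of a linear regression can be recovered from R2 through the standard identity F = (R2 / k) / ((1 - R2) / (n - k - 1)). The sketch below applies it to the reported R2 of 0.916 with k = 2 predictors. The sample size is not stated in the abstract; n = 40 is an assumption, chosen only because 37 residual degrees of freedom are consistent with the quoted critical values of 2.026 and 3.25.

```python
def f_statistic_from_r2(r2, n, k):
    """Overall F statistic for a linear regression with k predictors and an intercept."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


# R2 = 0.916 and k = 2 come from the abstract; n = 40 is an assumption.
f_value = f_statistic_from_r2(0.916, n=40, k=2)
print(round(f_value, 1))
```

Under that assumption the identity gives a value of about 202, which is close to the reported 204.012 given that R2 is rounded to three decimals.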


PEDIATRICS ◽  
1987 ◽  
Vol 80 (6) ◽  
pp. 969-970
Author(s):  
JACOB KELLER ◽  
MARY ELLEN AVERY

In Reply.— We appreciate the opportunity to clarify the issues raised by Barry et al. We tested the significance of the differences among centers in the incidence of chronic lung disease, adjusted for sex, race, and birth weight, by computing two multiple logistic regression models. The first model included only sex, race, and birth weight as independent variables. The second model additionally included seven dummy variables representing the eight centers. Twice the difference in the log likelihoods in the two models has a χ2 distribution with 7 df under the null hypothesis of no center differences.
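The nested-model comparison described in this reply is a likelihood ratio test. As a minimal sketch using only the standard library, the statistic and its p-value could be computed as below. The log-likelihood values are hypothetical placeholders, not figures from the study, and `chi2_sf` is an illustrative helper valid for positive integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    h = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)             # Q(1, h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))  # Q(1/2, h)
    # Recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1).
    while a < df / 2.0:
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_reduced, loglik_full, df):
    """Return (statistic, p-value) for two nested models fitted by maximum likelihood."""
    stat = 2.0 * (loglik_full - loglik_reduced)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods; df = 7 for the seven centre dummy variables.
stat, p_value = likelihood_ratio_test(-412.3, -401.8, df=7)
print(stat, p_value)
```

The reduced model here corresponds to sex, race, and birth weight only, and the full model adds the seven dummy variables, so the difference in parameter count gives the 7 degrees of freedom.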


2018 ◽  
Vol 7 (2) ◽  
pp. 47-50
Author(s):  
Martin Richter ◽  
Eva Richterová ◽  
Iveta Zentková

The aim of the paper is to determine the relationship between beer production and its determinants in the individual V4 countries. Logarithmic regression analysis corrected for heteroscedasticity and autocorrelation is used to examine the relationship between beer production as the dependent variable and the independent variables beer consumption, barley and hops yield, and imported and exported quantity. All estimated logarithmic regression models are statistically highly significant. Variables without statistical significance are not interpreted. All interpreted variables are inelastic. Except in the Czech Republic, beer consumption is a significant variable in relation to beer production, which may suggest an export orientation of Czech breweries. Exported beer quantity is a significant variable in that model. In Slovakia, barley yield is statistically significant but negatively correlated with beer production. In the other countries, barley and hops yields are not statistically significant. This may mean that these crops are not directly used in domestic beer production and are instead exported so that primary producers obtain higher added value. On that basis, beer is mostly produced from intermediate products or imported from other countries.


2011 ◽  
Vol 23 (2) ◽  
pp. 131-138 ◽  
Author(s):  
Andrzej Kalisz

Growth and earliness of Chinese cabbage (Brassica rapa var. chinensis) as a function of time and weather conditions

The aim of the research, which was carried out in the years 2003-2005, was to assess the possibility of creating regression models for the earliness of Chinese cabbage (Brassica rapa var. chinensis, pak choy) plants cultivated under field conditions during the summer-autumn months. Four variables were chosen for developing the prediction model: the minimum, mean and maximum air temperatures and sunshine hours. After stepwise regression analysis, it was noted that crop earliness could be described as a function of mean air temperature and sunshine hours. For this model, the coefficient of determination R2 was equal to 0.962. Additional models based on various thermal indices - growing degree days (GDD), heliothermal units (HTU) and photothermal units (PTU) - were also constructed and taken into consideration. The linear regression equation that includes GDD could only simulate the earliness of the plants with a precision below 14%. The model built on the basis of PTU showed a better fit of the predicted data to the observed data (around 49%), while the last model, which incorporated HTU, was the most accurate (R2 was equal to 0.843). Models for the growth of pak choy plants in the field based on a time variable are also presented in this paper.
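The three thermal indices named in this abstract are commonly computed as running sums over daily weather records. The sketch below uses the textbook definitions (daily GDD as mean temperature above a base temperature, HTU as daily GDD times sunshine hours, PTU as daily GDD times day length). The abstract does not give the paper's exact formulas or base temperature, so the 5.0 °C default and the sample values are assumptions for illustration only.

```python
def thermal_indices(t_mean, sunshine_hours, day_length, t_base=5.0):
    """Accumulate GDD, HTU and PTU over a sequence of days.

    t_base is a hypothetical base temperature in degrees Celsius.
    """
    gdd = htu = ptu = 0.0
    for temp, sun, length in zip(t_mean, sunshine_hours, day_length):
        daily = max(0.0, temp - t_base)  # no negative contributions
        gdd += daily
        htu += daily * sun       # heliothermal units
        ptu += daily * length    # photothermal units
    return gdd, htu, ptu


# Three made-up days: mean temperature, sunshine hours, day length in hours.
print(thermal_indices([10.0, 4.0, 15.0], [5.0, 8.0, 0.0], [12.0, 12.0, 11.0]))
```

Each accumulated index can then serve as the single predictor in a linear regression for crop earliness, which is how the abstract compares the three.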


Author(s):  
Gorgees Shaheed Mohammad

In this paper, we propose estimating standard errors for R2 and adjusted R2 and constructing their confidence intervals using the usual and smoothed bootstrap methods. Monte Carlo experiments show that the smoothed bootstrap standard errors are more accurate estimates than those of the usual bootstrap method. They also show that although the usual and smoothed bootstrap 0.95 confidence intervals of R2 do not include the true value of the parent coefficient of determination in some particular cases, this phenomenon does not occur when adjusted R2 is used.
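The difference between the usual and the smoothed bootstrap can be illustrated for the simplest case of one predictor, where R2 is the squared sample correlation. This is a sketch under stated assumptions, not the paper's method: the smoothing here adds Gaussian kernel noise scaled by a hypothetical `bandwidth` fraction of each variable's sample standard deviation, and `bandwidth=0.0` gives the usual pairs bootstrap. The data are synthetic.

```python
import random
import statistics


def r_squared(x, y):
    """Squared sample correlation, i.e. R^2 of a one-predictor regression."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0  # degenerate resample; R^2 is undefined, treated as 0 here
    return sxy * sxy / (sxx * syy)


def bootstrap_se_r2(x, y, reps=500, bandwidth=0.0, seed=0):
    """Bootstrap standard error of R^2; bandwidth > 0 smooths the resamples."""
    rng = random.Random(seed)
    n = len(x)
    sd_x, sd_y = statistics.stdev(x), statistics.stdev(y)
    replicates = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]  # resample (x, y) pairs
        bx = [x[i] + bandwidth * sd_x * rng.gauss(0, 1) for i in idx]
        by = [y[i] + bandwidth * sd_y * rng.gauss(0, 1) for i in idx]
        replicates.append(r_squared(bx, by))
    return statistics.stdev(replicates)


# Synthetic data for illustration only.
rng = random.Random(42)
x = [rng.gauss(0, 1) for _ in range(40)]
y = [a + rng.gauss(0, 1) for a in x]
print(bootstrap_se_r2(x, y), bootstrap_se_r2(x, y, bandwidth=0.2))
```

The bandwidth value of 0.2 is arbitrary. A percentile confidence interval could be formed from the sorted replicates in the same loop.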

