Regression

2021 ◽  
pp. 71-84
Author(s):  
Andy Hector

This chapter extends the use of linear models to relationships with continuous explanatory variables, in other words, linear regression. The goal of the worked example (on timber hardness data) given in detail in this chapter is prediction, not hypothesis testing. Confidence intervals and prediction intervals are explained. Graphical approaches to checking the assumptions of linear-model analysis are explored in further detail. The effects of transformations on linearity, normality, and equality of variance are investigated.
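The difference between the two interval types can be made concrete with a short numerical sketch. The following Python fragment (the book itself works in R, and the data here are invented for illustration) applies the standard textbook formulas for a simple regression:

```python
# Illustrative sketch (not the book's R code): 95% confidence vs. prediction
# intervals for a simple linear regression, with invented data.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1])

n = len(x)
xbar, ybar = x.mean(), y.mean()
Sxx = ((x - xbar) ** 2).sum()
b1 = ((x - xbar) * (y - ybar)).sum() / Sxx   # slope
b0 = ybar - b1 * xbar                        # intercept
resid = y - (b0 + b1 * x)
s = np.sqrt((resid ** 2).sum() / (n - 2))    # residual standard error

x0 = 4.5     # point at which to predict
t = 2.447    # two-sided 95% t quantile, df = n - 2 = 6
se_mean = s * np.sqrt(1 / n + (x0 - xbar) ** 2 / Sxx)      # for the mean
se_pred = s * np.sqrt(1 + 1 / n + (x0 - xbar) ** 2 / Sxx)  # for a new obs

yhat = b0 + b1 * x0
ci = (yhat - t * se_mean, yhat + t * se_mean)  # confidence interval
pi = (yhat - t * se_pred, yhat + t * se_pred)  # prediction interval
# The prediction interval is always wider: it adds the noise of a single
# new observation on top of the uncertainty in the fitted mean.
```

The extra "1 +" under the square root is the whole story: a prediction interval must cover a new noisy observation, not just the estimated mean response.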

2021 ◽  
pp. 139-180
Author(s):  
Justin C. Touchon

Chapter 6 continues exploring the world of statistics covered by the linear model, namely two-way and three-way ANOVA, linear regression, and analysis of covariance (ANCOVA). For each type of model, a detailed description of how to interpret the summary output is provided, including how to interpret and plot interactions. Conducting post-hoc analyses and using the predict() function are also covered. The chapter ends by reinforcing earlier plotting skills in ggplot2, walking through an example of making a professional-looking figure with multiple non-linear regression curves and confidence intervals.
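As a language-neutral illustration of what an interaction term means (the chapter itself works in R), the following Python sketch fits a 2 × 2 design with invented data; under treatment coding, the interaction coefficient is the "difference of differences" between the four cell means:

```python
# Hedged sketch with invented data: the interaction coefficient in a
# two-way model with treatment coding equals the difference of differences
# between cell means.
import numpy as np

# Two replicates per cell of a 2x2 design; the cell means are exactly
# 10, 12, 15, 21, so the interaction is (21 - 15) - (12 - 10) = 4.
A = np.array([0, 0, 0, 0, 1, 1, 1, 1])
B = np.array([0, 0, 1, 1, 0, 0, 1, 1])
y = np.array([9.0, 11.0, 11.0, 13.0, 14.0, 16.0, 20.0, 22.0])

X = np.column_stack([np.ones_like(y), A, B, A * B])  # treatment coding
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
# beta = [b0, bA, bB, bAB]; bAB is the interaction effect.
# Predicted mean for the (A=1, B=1) cell: b0 + bA + bB + bAB = 21.
```

Plotting the four fitted cell means as two non-parallel lines is exactly what "plotting an interaction" amounts to.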


2021 ◽  
pp. 51-70
Author(s):  
Andy Hector

The last chapter conducted a simple analysis of Darwin’s maize data using R as an oversized pocket calculator to work out confidence intervals ‘by hand’. This is a simple way to learn about analysis and good for demystifying the process, but it is inefficient. Instead, we want to take advantage of the more sophisticated functions that R provides that are designed to perform linear-model analysis. This chapter explores those functions by repeating and extending the analysis of Darwin’s maize data.


2007 ◽  
Vol 22 (3) ◽  
pp. 637-650 ◽  
Author(s):  
Ian T. Jolliffe

Abstract When a forecast is assessed, a single value for a verification measure is often quoted. This is of limited use, as it needs to be complemented by some idea of the uncertainty associated with the value. If this uncertainty can be quantified, it is then possible to make statistical inferences based on the value observed. There are two main types of inference: confidence intervals can be constructed for an underlying “population” value of the measure, or hypotheses can be tested regarding the underlying value. This paper will review the main ideas of confidence intervals and hypothesis tests, together with the less well known “prediction intervals,” concentrating on aspects that are often poorly understood. Comparisons will be made between different methods of constructing confidence intervals—exact, asymptotic, bootstrap, and Bayesian—and the difference between prediction intervals and confidence intervals will be explained. For hypothesis testing, multiple testing will be briefly discussed, together with connections between hypothesis testing, prediction intervals, and confidence intervals.
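The bootstrap approach the paper reviews can be illustrated in a few lines. This Python sketch (toy data, arbitrary seed and resample count, not the paper's own examples) computes a 95% percentile bootstrap confidence interval for the mean error of a hypothetical forecast:

```python
# Hedged sketch of a percentile bootstrap confidence interval for a
# verification measure; the "measure" here is just the mean error of a
# toy forecast, and the seed and resample count are arbitrary choices.
import numpy as np

rng = np.random.default_rng(42)
errors = rng.normal(loc=0.5, scale=1.0, size=200)   # toy forecast errors

stat = errors.mean()                                # observed measure
boot = np.array([
    rng.choice(errors, size=errors.size, replace=True).mean()
    for _ in range(2000)
])
lo, hi = np.percentile(boot, [2.5, 97.5])           # 95% percentile interval
# An asymptotic normal interval, stat +/- 1.96 * errors.std(ddof=1) /
# sqrt(errors.size), would be similar here; the bootstrap needs no
# normality assumption.
```

The interval is for the underlying population mean of the measure; a prediction interval for the *next* error would, as the paper emphasizes, be much wider.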


PeerJ ◽  
2016 ◽  
Vol 4 ◽  
pp. e2662 ◽  
Author(s):  
Christiaan W. Winterbach ◽  
Sam M. Ferreira ◽  
Paul J. Funston ◽  
Michael J. Somers

Background: The range, population size and trend of large carnivores are important parameters for assessing their status globally and for planning conservation strategies. Linear models can be used to assess population size and trends of large carnivores from track-based surveys on suitable substrates. A conventional linear model with an intercept may not pass through the origin, yet may fit the data better than a linear model forced through the origin. We assess whether a linear regression through the origin is more appropriate than a linear regression with intercept for modelling large African carnivore densities and track indices. Methods: We carried out simple linear regression with intercept and simple linear regression through the origin, and used the confidence interval for β in the linear model y = αx + β, the Standard Error of Estimate, the Mean Squares Residual and the Akaike Information Criterion to evaluate the models. Results: The Lion on Clay and Low Density on Sand models with intercept were not significant (P > 0.05). The other four models with intercept and the six models through the origin were all significant (P < 0.05). The models using linear regression with intercept all included zero in the confidence interval for β, so the null hypothesis that β = 0 could not be rejected. All comparisons showed that the linear model through the origin provided a better fit than the linear model with intercept, as indicated by the Standard Error of Estimate and Mean Square Residuals. The Akaike Information Criterion showed that the linear models through the origin were better and that none of the linear models with intercept had substantial support. Discussion: Our results show that linear regression through the origin is justified over the more typical linear regression with intercept for all the models we tested. A general model can be used to estimate large carnivore densities from track densities across species and study areas.
The formula "observed track density = 3.26 × carnivore density" can be used to estimate densities of large African carnivores from track counts on sandy substrates in areas where carnivore densities are 0.27 carnivores/100 km² or higher. To improve the current models, we need independent data to validate them and data to test for a non-linear relationship between track indices and true density at low densities.
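The model comparison is easy to reproduce in outline. This Python sketch uses simulated data (true slope 3.26 and zero intercept, echoing the paper's fitted value, not the authors' survey data) to contrast the two estimators:

```python
# Hedged sketch with simulated data: regression with an intercept vs.
# regression forced through the origin, as in the carnivore-track analysis.
import numpy as np

rng = np.random.default_rng(1)
density = rng.uniform(0.3, 5.0, size=30)            # carnivores / 100 km^2
tracks = 3.26 * density + rng.normal(0, 0.5, 30)    # simulated track index

# With intercept: solve [1, x] @ beta = y by least squares.
X = np.column_stack([np.ones_like(density), density])
(b0, b1), *_ = np.linalg.lstsq(X, tracks, rcond=None)

# Through the origin: slope = sum(x * y) / sum(x^2).
b_origin = (density * tracks).sum() / (density ** 2).sum()
# If the true intercept is zero, b0 should be near 0 and both slopes near
# 3.26; the origin model achieves the fit with one fewer parameter, which
# is what the AIC comparison in the paper rewards.
```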


2021 ◽  
pp. 177-194
Author(s):  
Andy Hector

This book began with the simplest types of linear model: one-way ANOVA and its simple linear regression equivalent. However, once more complex ANOVA and ANCOVA designs were encountered some complexities arose that were then skipped over. This chapter explores these complexities of linear-model analysis and some additional ones that arise with unbalanced designs—those with unequal numbers of replicates in the different treatment groups.


Author(s):  
Shaolin Hu ◽  
Karl Meinke ◽  
Rushan Chen ◽  
Ouyang Huajiang

Iterative Estimators of Parameters in Linear Models with Partially Variant Coefficients
A new kind of linear model with partially variant coefficients is proposed, and a series of iterative algorithms are introduced and verified. The new generalized linear model includes the ordinary linear regression model as a special case. The iterative algorithms efficiently overcome some computational difficulties with multidimensional inputs and continually appended parameters. An important application described at the end of the article shows that the new model is reasonable and applicable in practical fields.


Author(s):  
Nguyen Minh Ha

The purpose of this research is to identify factors that explain the gap in exit rate between state and non-state firms in Vietnam. With a sample of 7,962 Vietnamese firms, and using the Oaxaca-Blinder decomposition extended for non-linear models, the research finds that a very large part of the ownership gap in exit rate between state and non-state firms cannot be explained by the included covariates but is largely explained by the effects of differences in the coefficients of those covariates. In particular, differences in the coefficients of initial assets, the industrial sector (mining, construction and manufacturing industries), and the service sector considerably increase the ownership gap in exit rate, while the difference in the coefficient of initial employment reduces it. Moreover, differences in explanatory variables between state and non-state firms explain only a very small part of the ownership gap in exit rate: the estimated reduction in exit-rate gap from giving non-state firms the same characteristics as state firms is very small. Differences in coefficients thus have a much greater impact on differences in exit rates than differences in characteristics, which may be due to existing discrimination between state and non-state firms.
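For readers unfamiliar with the decomposition, the following Python sketch shows the linear Oaxaca-Blinder identity on invented data (the paper itself uses an extension for non-linear models): the mean-outcome gap between two groups splits exactly into an "explained" part due to covariate differences and a part due to coefficient differences:

```python
# Hedged toy sketch of the linear Oaxaca-Blinder decomposition; the data
# and coefficients are invented purely for illustration.
import numpy as np

rng = np.random.default_rng(7)

def fit(X, y):
    """OLS coefficients, including an intercept column."""
    Xc = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(Xc, y, rcond=None)
    return beta

Xa = rng.normal(1.0, 1.0, size=(500, 2))  # group a covariates
Xb = rng.normal(0.5, 1.0, size=(500, 2))  # group b covariates
ya = 1.0 + Xa @ np.array([0.8, 0.2]) + rng.normal(0, 0.1, 500)
yb = 0.2 + Xb @ np.array([0.3, 0.2]) + rng.normal(0, 0.1, 500)

ba, bb = fit(Xa, ya), fit(Xb, yb)
xa = np.concatenate([[1.0], Xa.mean(axis=0)])  # group mean covariates
xb = np.concatenate([[1.0], Xb.mean(axis=0)])

gap = ya.mean() - yb.mean()
explained = (xa - xb) @ bb      # covariate differences at group-b coefficients
unexplained = xa @ (ba - bb)    # coefficient ("returns") differences
# explained + unexplained reproduces the gap exactly, because an OLS fit
# with an intercept passes through the point of means.
```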


2015 ◽  
Vol 733 ◽  
pp. 910-913
Author(s):  
Jing Zhang ◽  
Hong Xia Guo

Because the partially linear regression model contains both a parametric part and a nonparametric part, it is more flexible than the ordinary linear model and can better capture the characteristics of the data. This paper first reduces the dimension of the expenditure index data using principal component analysis. Then, based on the dimension-reduced data, a partially linear model is established to forecast military expenditure. The results show a clear advantage over stepwise linear regression analysis.
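The first step described here, dimension reduction by principal component analysis, can be sketched in a few lines of Python; the indicator data and loadings below are invented for illustration:

```python
# Hedged sketch of PCA via SVD on centred data; the six "expenditure
# indicators" are invented, all driven by one underlying factor.
import numpy as np

rng = np.random.default_rng(3)
latent = rng.normal(size=(100, 1))                  # hidden common factor
loads = np.array([[1.0, 0.8, 1.2, 0.9, 1.1, 1.0]])  # invented loadings
X = latent @ loads + 0.1 * rng.normal(size=(100, 6))

Xc = X - X.mean(axis=0)                   # centre each indicator
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained_var = S ** 2 / (S ** 2).sum()   # variance share per component
scores = Xc @ Vt.T                        # principal component scores
# With one dominant factor, the first component captures almost all of
# the variance, so a downstream regression can use scores[:, 0] alone.
```

The dimension-reduced scores then feed the parametric part of the partially linear model.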


2017 ◽  
Vol 6 (5) ◽  
pp. 140
Author(s):  
Theodosia Prodromou

Following recent scholarly interest in teaching informal linear regression models, this study looks at teachers’ reasoning about informal lines of best fit and their role in pedagogy. The case results presented in this journal paper provide insights into the reasoning used when developing a simple informal linear model to best fit the available data. This study also suggests potential in specific aspects of bidirectional modelling to help foster the development of robust knowledge of the logic of inference for those investigating and coordinating relations between models developed during modelling exercises and informal inferences based on these models. These insights can inform refinement of instructional practices using simple linear models to support students’ learning of statistical inference, both formal and informal.


2012 ◽  
Vol 61 (1-6) ◽  
pp. 126-132 ◽  
Author(s):  
Chunfa Tong ◽  
Guangxin Liu ◽  
Liwei Yang ◽  
Jisen Shi

Abstract Diallel mating designs have been extensively employed by crop and tree breeders to gain genetic information, but the analysis of diallel data faces some challenges because the same parent plays both male and female roles. Little theoretical attention has been paid to statistical inference and hypothesis testing for a fixed diallel linear model. In this paper we provide a uniform solution to any fixed diallel linear model, expressed in matrix form, based on the theory of restricted linear models. We derive formulae for estimating diallel parameters and their standard errors, and obtain uniform statistics for hypothesis testing of parameters, factors and differences between general combining abilities (GCA) or specific combining abilities (SCA). To put the result into practice, we have developed a Windows® software program, "GSCA", for analyzing a flexible diallel linear model that can contain GCA, SCA, reciprocal, block and environment effects as well as interaction effects such as GCA by environment. GSCA can perform analyses not only for Griffing's four types of diallel crosses but also for more complicated diallel crosses, whether the data structure is balanced or unbalanced. A real example is given to illustrate the convenience, flexibility and power of our software for diallel analysis.

