The Comparison of Some Methods in Analysis of Linear Regression Using R Software

2019 ◽  
Vol 3 (3) ◽  
pp. 22
Author(s):  
Ilir Palla

This article presents the OLS method, the WLS method, and bootstrap methods for estimating the coefficients of a linear regression and their standard deviations. If the random errors of the regression have constant variance and are independent and normally distributed, we can use the least squares method, which yields accurate inferences under these assumptions. If the errors are heteroscedastic, meaning that their variance depends on the explanatory variable, or if they have different weights, the least squares method cannot be relied on for accurate results. If we know the weight of each error, we can use the weighted least squares method. In this article we also describe bootstrap methods for evaluating regression parameters. The bootstrap methods improved quantile estimation. We simulated errors with non-constant variances in a linear regression using the R program and compared the results. Using this software we obtained confidence intervals, estimated coefficients, plots, and results for each case.
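
The comparison described in this abstract can be illustrated with a minimal sketch. It is written in Python rather than R, uses only the standard library, and is not the author's code. The function names (`weighted_fit`, `bootstrap_slope_ci`), the simulated model (true intercept 2, true slope 3, error standard deviation proportional to x), and the choice of a pairs bootstrap with a percentile interval are all assumptions made for illustration.

```python
import random


def weighted_fit(x, y, w=None):
    """Fit y = a + b*x by weighted least squares; w=None gives plain OLS."""
    if w is None:
        w = [1.0] * len(x)
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return ybar - b * xbar, b


def bootstrap_slope_ci(x, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile confidence interval for the OLS slope, resampling (x, y) pairs."""
    rng = random.Random(seed)
    n = len(x)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        ys = [y[i] for i in idx]
        if max(xs) == min(xs):  # degenerate resample, slope undefined
            continue
        slopes.append(weighted_fit(xs, ys)[1])
    slopes.sort()
    lo = slopes[int((alpha / 2) * len(slopes))]
    hi = slopes[int((1 - alpha / 2) * len(slopes)) - 1]
    return lo, hi


# Simulated heteroscedastic data (assumed model, for illustration only):
# the error standard deviation grows in proportion to x.
rng = random.Random(42)
x = [1 + 0.1 * i for i in range(100)]
y = [2.0 + 3.0 * xi + rng.gauss(0, 0.5 * xi) for xi in x]
weights = [1.0 / xi ** 2 for xi in x]  # proportional to 1 / variance

print("OLS (intercept, slope):", weighted_fit(x, y))
print("WLS (intercept, slope):", weighted_fit(x, y, weights))
print("Bootstrap 95% CI for slope:", bootstrap_slope_ci(x, y))
```

Weights proportional to the inverse error variance are the usual choice when the variances are known. The pairs bootstrap is used here because it does not require the errors to have constant variance.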

2014 ◽  
Vol 71 (1) ◽  
Author(s):  
Bello Abdulkadir Rasheed ◽  
Robiah Adnan ◽  
Seyed Ehsan Saffari ◽  
Kafi Dano Pati

In a linear regression model, the ordinary least squares (OLS) method is considered the best method to estimate the regression parameters if the assumptions are met. However, if the data do not satisfy the underlying assumptions, the results will be misleading. Violation of the assumption of constant variance in least squares regression is caused by the presence of outliers and heteroscedasticity in the data. This assumption of constant variance (homoscedasticity) is very important in linear regression, in which the least squares estimators enjoy the property of minimum variance. Therefore, a robust regression method is required to handle the problem of outliers in the data. However, this research uses weighted least squares (WLS) techniques to estimate the regression coefficients when the assumption on the error variance is violated in the data. WLS estimation is the same as carrying out OLS on transformed variables. WLS can easily be affected by outliers. To remedy this, we suggest a robust technique for estimating the regression parameters in the presence of heteroscedasticity and outliers. Here we apply robust regression by M-estimation, using iteratively reweighted least squares (IRWLS) with the Huber and Tukey bisquare functions, and the resistant regression estimator of least trimmed squares, to estimate the model parameters for state-wide crime in the United States in 1993. The outcomes of the study indicate that the estimators obtained from the M-estimation techniques and the least trimmed squares method are more effective than those obtained from OLS.
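
The idea of M-estimation by iteratively reweighted least squares mentioned in this abstract can be sketched as follows. This is a minimal illustration for a single predictor, not the authors' implementation. The function names, the tuning constant c = 1.345, the MAD-based scale estimate, and the toy data are assumptions chosen for the example.

```python
import statistics


def wls_line(x, y, w):
    """Weighted least squares fit of y = a + b*x."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return ybar - b * xbar, b


def huber_irls(x, y, c=1.345, n_iter=50, tol=1e-10):
    """Huber M-estimate of a line via iteratively reweighted least squares."""
    a, b = wls_line(x, y, [1.0] * len(x))  # start from the OLS fit
    for _ in range(n_iter):
        r = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust scale: median absolute deviation, rescaled for normal errors.
        med = statistics.median(r)
        s = statistics.median([abs(ri - med) for ri in r]) / 0.6745
        if s == 0:  # residuals (almost) all identical, nothing left to reweight
            break
        # Huber weights: 1 inside the threshold, shrinking as c*s/|r| outside it.
        w = [1.0 if abs(ri) <= c * s else c * s / abs(ri) for ri in r]
        a_new, b_new = wls_line(x, y, w)
        converged = abs(a_new - a) + abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b


# Toy data (illustrative only): points on y = 1 + 2x with one gross outlier.
x = list(range(10))
y = [1.0 + 2.0 * xi for xi in x]
y[9] = 100.0

ols = wls_line(x, y, [1.0] * len(x))
rob = huber_irls(x, y)
print("OLS   (intercept, slope):", ols)
print("Huber (intercept, slope):", rob)
```

Each pass recomputes the residuals, derives weights from them, and refits by WLS, which is why the procedure is described as iteratively reweighted. The Tukey bisquare variant differs only in the weight function.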


2013 ◽  
Vol 278-280 ◽  
pp. 1323-1326
Author(s):  
Yan Hua Yu ◽  
Li Xia Song ◽  
Kun Lun Zhang

Fuzzy linear regression has been extensively studied since its inception, marked by the work of Tanaka et al. in 1982. As one of the main estimation methods, the fuzzy least squares approach is appealing because it corresponds, to some extent, to well-known statistical regression analysis. In this article, a restricted least squares method is proposed to fit fuzzy linear models with crisp inputs and symmetric fuzzy output. The paper puts forward a fuzzy linear regression model based on structured elements. The model has precise input data and fuzzy output data. The paper gives a method for determining the regression coefficients and the fuzzy degree function using the least squares method, and studies the degree of fit between the observed and forecast values.


1982 ◽  
Vol 58 (5) ◽  
pp. 213-219 ◽  
Author(s):  
Jean Beaulieu ◽  
Yvan J. Hardy

This paper presents a method of analysis which differentiates between spruce budworm-caused mortality and regular mortality on balsam fir in the Gatineau region in Quebec. A first attempt was made using multiple linear regression and a uniform random number generator. In order to overcome the bias inherent to the least squares method when dealing with a binary (0,1) dependent variable, a probit analysis was also conducted. In this case, the parameters and their variance were estimated using the likelihood method. These two approaches proved to be equivalent when percent budworm-caused mortality was compared within the 1958 to 1979 period covered by the data at hand, while the outbreak lasted from 1968 to 1975. In 1979, approximately 55% of the stems had been killed by the budworm, accounting for 53% of the volume. Maple-yellow birch associations were more affected than fir associations, although no significant difference was found. Fir mortality was delayed by aerial spraying of insecticides, but this advantage disappeared as soon as the spray operations came to an end.


2019 ◽  
Vol 20 (2) ◽  
pp. 83-92
Author(s):  
Małgorzata Kobylińska

This paper presents the application of the regression maximum depth for the estimation of linear regression function structural elements. For two-dimensional sets including untypical observations, regression functions were developed using the classical least squares method and a method based on the concept of observation depth measure in a sample. The effect of untypical observations on the estimated models has been noted.


2019 ◽  
Vol 631 ◽  
pp. A145 ◽  
Author(s):  
G. Damljanović ◽  
F. Taris

Context. The second solution of the Gaia catalog, which has been available since April 2018, plays an important role in the realization of the future Gaia reference frame. Since 1997, the reference frame has been materialized by the optical HIPPARCOS positions of about 120 000 stars. The HIPPARCOS catalog has been compared with and linked to the International Celestial Reference Frame (ICRF). The ICRF is materialized by means of the radio positions of extragalactic sources using very long baseline interferometry observations. Both the HIPPARCOS and Gaia missions belong to the European Space Agency, and it is important to note that the Gaia catalog is going to replace the HIPPARCOS catalog. Aims. It has been shown that the International Latitude Service (ILS) zenith telescope data pertaining to ground-based surveys that span a time baseline of about 80 yr, and which are also key when measuring proper motions, could be useful for the accurate determination of μδ for 387 ILS stars. Therefore, in this study we aim first to reduce these stars to the HIPPARCOS reference system; second, to make our original catalog of μδ, which we refer to as the ILS catalog, for these 387 bright stars; third, to present comparison results of the four catalogs by pairs (the ILS, HIPPARCOS or HIP, new HIPPARCOS or NHIP, and Gaia DR2); and fourth, to analyze the differences in μδ between pairs of catalogs to characterize the μδ errors for these catalogs, with a special focus on the Gaia DR2 and ILS catalogs. Methods. At seven ILS sites around the world at latitude 39.°1, a set of seven telescopes was used to monitor the latitude variation via observations of the same stars for about 80 yr. Here, the inverse task was applied to improve the μδ values of the 387 HIPPARCOS stars using the previously mentioned observations. Because the Horrebow-Talcott method measures star pairs, it is difficult to determine μδ for each single star. However, we achieved this by developing an original method and combining it with the HIPPARCOS data. We used the previously developed least squares method and formula to determine the coefficients, which describe the systematic part of the differences in μδ between the pairs of catalogs. Results. We calculated the coefficients with the aforementioned formula (in line with the coordinates, stellar magnitude, and color index of every star) to compare the ILS, HIP, NHIP, and Gaia DR2 values of μδ against each other using the set of 387 stars. The presented differences in μδ show that the systematic errors in the four catalogs are nearly at the same level of 0.1 mas yr−1. This means that the DR2 and ILS μδ values are in good agreement with each other, and with values from the HIPPARCOS and new HIPPARCOS catalogs. Also, the random errors of the differences are small; they are near 1 mas yr−1 for ILS-HIP and ILS-NHIP, and about 2 mas yr−1 for ILS-DR2, HIP-DR2, and NHIP-DR2. It is important to note that there is a similar level of proper motion formal errors in the HIPPARCOS and new HIPPARCOS catalogs.


Author(s):  
Kazuhisa Takemura ◽  

Fuzzy linear regression analysis using the least squares method under linear constraint, where input data, output data, and coefficients are represented by triangular fuzzy numbers, was proposed and compared to possibilistic linear regression analysis proposed by Sakawa and Yano (1992) using fuzzy rating data in a psychological study. Major findings of the comparison were as follows: (1) Under the proposed analysis, the width between the maximum and minimum of the predicted model was nearer to the width of the dependent variable than that of possibilistic linear regression analysis, (2) the representative prediction by the proposed analysis was also nearer to that of the dependent variable, compared to that of possibilistic linear regression analysis.


2004 ◽  
Vol 50 (11) ◽  
pp. 81-88 ◽  
Author(s):  
J.-L. Bertrand-Krajewski

In order to replace traditional sampling and analysis techniques, turbidimeters can be used to estimate TSS concentration in sewers, by means of sensor- and site-specific empirical equations established by linear regression of on-site turbidity T values with TSS concentrations C measured in corresponding samples. As the ordinary least-squares method is not able to account for measurement uncertainties in both the T and C variables, an appropriate regression method is used to solve this difficulty and to evaluate correctly the uncertainty in TSS concentrations estimated from measured turbidity. The regression method is described, including detailed calculations of the variances and covariance of the regression parameters. An example of application is given for a calibrated turbidimeter used in a combined sewer system, with data collected during three dry weather days. In order to show how the established regression could be used, an independent 24-hour dry weather turbidity data series recorded at 2 min intervals is transformed into estimated TSS concentrations and compared to TSS concentrations measured in samples. The comparison appears satisfactory and suggests that turbidity measurements could replace traditional samples. Further developments, including wet weather periods and other types of sensors, are suggested.
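
The abstract does not name the regression method it uses. As one possible illustration of fitting a line when both variables carry measurement error, the sketch below uses Deming regression, which is a swapped-in technique and not necessarily the one in the paper. It assumes a known, constant ratio `delta` of the error variance in C to the error variance in T. The function name `deming_fit` and the turbidity and TSS values are hypothetical.

```python
import math


def deming_fit(x, y, delta=1.0):
    """Deming regression of y on x.

    delta is the assumed ratio var(error in y) / var(error in x).
    Returns (intercept, slope).
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    if sxy == 0:
        raise ValueError("x and y are uncorrelated; slope is undefined")
    d = syy - delta * sxx
    slope = (d + math.sqrt(d * d + 4.0 * delta * sxy * sxy)) / (2.0 * sxy)
    return ybar - slope * xbar, slope


# Hypothetical calibration pairs, invented for illustration:
# turbidity T (arbitrary units) and TSS concentration C (mg/L).
T = [50.0, 80.0, 120.0, 160.0, 210.0, 260.0]
C = [62.0, 95.0, 150.0, 188.0, 255.0, 310.0]

intercept, slope = deming_fit(T, C, delta=1.0)
print("C is approximately", intercept, "+", slope, "* T")
```

Unlike OLS, which attributes all the scatter to C, this fit splits it between T and C according to `delta`. A method that handles a different uncertainty for each measurement would need per-point weights, which this sketch does not include.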


Author(s):  
Warha, Abdulhamid Audu ◽  
Yusuf Abbakar Muhammad ◽  
Akeyede, Imam

Linear regression measures the relationship between two or more variables, known as the dependent and independent variables. The classical least squares method for estimating regression models consists of minimising the sum of the squared residuals. Among the assumptions of the ordinary least squares (OLS) method is that there is no correlation (multicollinearity) between the independent variables. Violation of this assumption arises often in regression analysis and can lead to inefficiency of the least squares method. This study, therefore, determined the more efficient estimator between Least Absolute Deviation (LAD) and Weighted Least Squares (WLS) in multiple linear regression models at different levels of multicollinearity in the explanatory variables. Simulations were conducted using the R statistical software to investigate the performance of the two estimators when the assumption of no multicollinearity is violated. Their performance was compared at different sample sizes. Finite-sample criteria, namely mean absolute error, absolute bias, and mean squared error, were used to compare the methods. The best estimator was selected based on the minimum value of these criteria at a specified level of multicollinearity and sample size. The results showed that LAD was the best at different levels of multicollinearity, and it was recommended as an alternative to OLS under this condition. The performance of both estimators decreased as the level of multicollinearity increased.

