Calibration of stormwater quality regression models: a random process?

2010 ◽  
Vol 62 (4) ◽  
pp. 875-882 ◽  
Author(s):  
A. Dembélé ◽  
J.-L. Bertrand-Krajewski ◽  
B. Barillon

Regression models are among the most frequently used models for estimating pollutant event mean concentrations (EMC) in wet weather discharges in urban catchments. Two main questions concerning the calibration of EMC regression models are investigated: i) the sensitivity of models to the size and content of the data sets used for their calibration, and ii) the change in modelling results when models are re-calibrated as data sets grow and change over time with newly collected experimental data. Based on an experimental data set of 64 rain events monitored in a densely urbanised catchment, four TSS EMC regression models (two log-linear and two linear models) with two or three explanatory variables have been derived and analysed. Model calibration with the iteratively re-weighted least squares method is less sensitive and leads to more robust results than the ordinary least squares method. Three calibration options have been investigated: two options accounting for the chronological order of the observations, and one option using random samples of events from the whole available data set. Results obtained with the best performing nonlinear model clearly indicate that the model is highly sensitive to the size and the content of the data set used for its calibration.
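The contrast between iteratively re-weighted least squares (IRLS) and ordinary least squares calibration can be sketched numerically. The following is a minimal illustration only, not the authors' model: the log-linear EMC form, the choice of explanatory variables (rainfall depth H and antecedent dry weather period ADWP), the Huber weighting scheme, and all synthetic data are assumptions.

```python
import numpy as np

# Hypothetical log-linear EMC model: EMC = a * H^b * ADWP^c, fitted in log space.
rng = np.random.default_rng(0)
n = 64                                   # same order as the 64 monitored rain events
H = rng.uniform(2.0, 40.0, n)            # rainfall depth (mm), synthetic
ADWP = rng.uniform(0.5, 15.0, n)         # antecedent dry weather period (days), synthetic
log_emc = 5.0 + 0.4 * np.log(H) - 0.2 * np.log(ADWP) + rng.normal(0, 0.3, n)
log_emc[:3] += 2.0                       # a few outlying events to stress OLS

X = np.column_stack([np.ones(n), np.log(H), np.log(ADWP)])
y = log_emc

def irls(X, y, k=1.345, n_iter=50):
    """Huber-weighted IRLS; k is the usual tuning constant."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]           # OLS starting point
    for _ in range(n_iter):
        r = y - X @ beta
        s = np.median(np.abs(r - np.median(r))) / 0.6745 + 1e-12  # robust scale (MAD)
        u = np.abs(r / s)
        w = np.where(u <= k, 1.0, k / u)                  # Huber weights
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)[0]
    return beta

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_irls = irls(X, y)
```

Because the Huber weights cap the influence of outlying events, the IRLS coefficients move less when a few unusual events enter the calibration set, which is the robustness property the abstract reports.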

2009 ◽  
Vol 2009 ◽  
pp. 1-8 ◽  
Author(s):  
Janet Myhre ◽  
Daniel R. Jeske ◽  
Michael Rennie ◽  
Yingtao Bi

A heteroscedastic linear regression model is developed from plausible assumptions that describe the time evolution of performance metrics for equipment. The inherent motivation for the related weighted least squares analysis of the model is an essential and attractive selling point to engineers with an interest in equipment surveillance methodologies. A simple test for the significance of the heteroscedasticity suggested by a data set is derived, and a simulation study is used to evaluate the power of the test and compare it with several other applicable tests that were designed under different contexts. Tolerance intervals within the context of the model are derived, thus generalizing well-known tolerance intervals for ordinary least squares regression. Use of the model and its associated analyses is illustrated with an aerospace application where hundreds of electronic components are continuously monitored by an automated system that flags components suspected of unusual degradation patterns.
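The general workflow the abstract describes (test for heteroscedasticity, then weight the fit accordingly) can be sketched as follows. This is a generic Breusch-Pagan-style score test and variance-weighted fit under an assumed linear variance function, not the authors' specific test; all data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
t = np.linspace(0.1, 10.0, n)            # time in service (assumed covariate)
# Error variance grows linearly with t: Var = 0.2 + 0.3*t (an assumption)
y = 2.0 + 0.5 * t + rng.normal(0, 1, n) * np.sqrt(0.2 + 0.3 * t)

X = np.column_stack([np.ones(n), t])
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta_ols

# Breusch-Pagan-style score: regress squared residuals on t; LM = n * R^2,
# approximately chi-square(1) under homoscedasticity.
Z = np.column_stack([np.ones(n), t])
g = resid**2
gamma = np.linalg.lstsq(Z, g, rcond=None)[0]
ss_res = np.sum((g - Z @ gamma)**2)
ss_tot = np.sum((g - g.mean())**2)
lm_stat = n * (1.0 - ss_res / ss_tot)

# Weighted least squares using the fitted variance function
var_hat = np.clip(Z @ gamma, 1e-6, None)
w = 1.0 / var_hat
sw = np.sqrt(w)
beta_wls = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)[0]
```

With this strongly heteroscedastic setup the score statistic comfortably exceeds the 5% chi-square(1) critical value (3.84), so the move to weighted least squares is warranted.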


1985 ◽  
Vol 15 (2) ◽  
pp. 331-340 ◽  
Author(s):  
T. Cunia ◽  
R. D. Briggs

To construct biomass tables for various tree components that are consistent with each other, one may use linear regression techniques with dummy variables. When the biomass of these components is measured on the same sample trees, one should also use the generalized rather than the ordinary least squares method. A procedure is shown which allows the estimation of the covariance matrix of the sample biomass values and circumvents the problem of storing and inverting large covariance matrices. Applied to 20 sets of sample tree data, the generalized least squares regressions generated estimates which, on the average, were slightly higher (by about 1%) than the sample data. The confidence and prediction bands about the regression function were wider, sometimes considerably wider, than those estimated by ordinary weighted least squares.
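The dummy-variable GLS setup can be illustrated on a toy problem: two biomass components measured on the same trees, so their errors are correlated within a tree. Everything below (the component equations, the per-tree error covariance, the diameter covariate) is an assumed example, not the authors' data or exact procedure.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
dbh = rng.uniform(10, 40, n)             # diameter at breast height (cm), synthetic

# Per-tree error covariance between the two components (assumed)
Sigma_tree = np.array([[1.0, 0.6], [0.6, 1.5]])
L = np.linalg.cholesky(Sigma_tree)
e = (L @ rng.normal(size=(2, n))).T      # correlated errors, one row per tree
stem = 2.0 + 0.30 * dbh + e[:, 0]        # component 1
branch = 0.5 + 0.10 * dbh + e[:, 1]      # component 2

# Stack components and use a dummy variable to keep the equations consistent
y = np.concatenate([stem, branch])
d = np.concatenate([np.zeros(n), np.ones(n)])        # 0 = stem, 1 = branch
dd = np.concatenate([dbh, dbh])
X = np.column_stack([np.ones(2 * n), d, dd, d * dd])

# Block covariance: trees independent, components correlated within a tree
Sigma = np.kron(Sigma_tree, np.eye(n))
Si = np.linalg.inv(Sigma)
beta_gls = np.linalg.solve(X.T @ Si @ X, X.T @ Si @ y)   # GLS estimator
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]          # OLS, ignores correlation
```

The Kronecker structure here is what makes the full covariance matrix tractable without storing an arbitrary dense matrix, echoing the storage-and-inversion concern raised in the abstract.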


Geophysics ◽  
2018 ◽  
Vol 83 (4) ◽  
pp. V243-V252
Author(s):  
Wail A. Mousa

A stable explicit depth wavefield extrapolation is obtained using [Formula: see text] iteratively reweighted least-squares (IRLS) frequency-space (f-x) finite-impulse response digital filters. The problem of designing such filters to obtain stable images of challenging seismic data is formulated as an [Formula: see text] IRLS minimization. Prestack depth imaging of the challenging Marmousi model data set is then performed using explicit depth wavefield extrapolation with the proposed [Formula: see text] IRLS-based algorithm. In terms of extrapolation filter design accuracy, the [Formula: see text] IRLS minimization method resulted in an image of higher quality when compared with the weighted least-squares method. The method can, therefore, be used to design high-accuracy extrapolation filters.
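The core numerical idea, designing an FIR filter by an lp-norm IRLS minimization, can be sketched in a generic form. The target response, grid, filter length, and the choice p = 1.2 below are illustrative assumptions; this is not the paper's extrapolation operator, only the standard IRLS filter-design pattern it builds on.

```python
import numpy as np

# Design a symmetric (zero-phase) FIR filter minimizing ||A h - d||_p via IRLS.
M = 10                                     # half-order; filter length 2M+1
wgrid = np.linspace(0, np.pi, 512)
d = (wgrid <= 0.4 * np.pi).astype(float)   # ideal lowpass target (assumed)
# Amplitude response of a symmetric filter is a cosine series in the half-taps
A = np.cos(np.outer(wgrid, np.arange(M + 1)))

p = 1.2                                    # 1 < p < 2: IRLS converges from an L2 start
h0 = np.linalg.lstsq(A, d, rcond=None)[0]  # plain least-squares design
h = h0.copy()
for _ in range(30):
    e = A @ h - d
    w = (np.abs(e) + 1e-8) ** (p - 2)      # IRLS weights |e|^(p-2), smoothed
    sw = np.sqrt(w)
    h = np.linalg.lstsq(sw[:, None] * A, sw * d, rcond=None)[0]

err_p_l2 = np.sum(np.abs(A @ h0 - d) ** p)   # p-norm error of the L2 design
err_p = np.sum(np.abs(A @ h - d) ** p)       # p-norm error after IRLS
```

Re-solving a weighted least-squares problem at each pass drives down the p-norm design error relative to the plain L2 fit, which is the accuracy gain the abstract attributes to the IRLS formulation.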


2018 ◽  
Vol 22 (5) ◽  
pp. 358-371 ◽  
Author(s):  
Radoslaw Trojanek ◽  
Michal Gluszak ◽  
Justyna Tanas

In the paper, we analysed the impact of proximity to urban green areas on apartment prices in Warsaw. The data set contained 43,075 geo-coded apartment transactions for the years 2010 to 2015. In this research, the hedonic method was used with Ordinary Least Squares (OLS), Weighted Least Squares (WLS) and Median Quantile Regression (Median QR) models. We found substantial evidence that proximity to an urban green area is positively linked with apartment prices. On average, the presence of a green area within 100 metres of an apartment increases the price of a dwelling by 2.8% to 3.1%. The effect of park/forest proximity on house prices is more significant for newer apartments than for those built before 1989. We found that proximity to a park or a forest is particularly important (and consequently carries a higher implicit price) in the case of buildings constructed after 1989. The impact of urban green areas was particularly high in the case of post-transformation housing estates. Close vicinity (less than 100 m distance) to an urban green area increased the sales prices of apartments in new residential buildings by 8.0–8.6%, depending on the model.
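The hedonic approach with a proximity dummy can be sketched as a log-price regression. The covariates, coefficient values, and data below are invented for illustration and bear no relation to the Warsaw transactions; only the mechanic (a dummy for green proximity, with the implicit price read off its coefficient) follows the abstract.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000
area = rng.uniform(30, 120, n)             # floor area in m^2 (assumed covariate)
green = rng.binomial(1, 0.3, n)            # 1 if a green area lies within 100 m (assumed)
# Assumed hedonic log-price equation with a 3% green-proximity effect
log_price = 8.0 + 0.012 * area + 0.03 * green + rng.normal(0, 0.15, n)

X = np.column_stack([np.ones(n), area, green])
beta = np.linalg.lstsq(X, log_price, rcond=None)[0]

# In a log-linear hedonic model, the percentage premium implied by the dummy is
premium_pct = 100 * (np.exp(beta[2]) - 1)
```

The exp(beta)-1 transformation is the standard way to convert a dummy coefficient in a log-price model into the percentage premium reported in hedonic studies.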


Author(s):  
Warha, Abdulhamid Audu ◽  
Yusuf Abbakar Muhammad ◽  
Akeyede, Imam

Linear regression measures the relationship between two or more variables, known as the dependent and independent variables. The classical least squares method for estimating regression models consists of minimising the sum of the squared residuals. Among the assumptions of the ordinary least squares (OLS) method is that there is no correlation (multicollinearity) between the independent variables. Violation of this assumption arises often in regression analysis and can lead to inefficiency of the least squares method. This study, therefore, determined the more efficient estimator between Least Absolute Deviation (LAD) and Weighted Least Squares (WLS) in multiple linear regression models at different levels of multicollinearity in the explanatory variables. Simulation techniques were conducted using the R statistical software to investigate the performance of the two estimators under violation of the assumption of no multicollinearity. Their performances were compared at different sample sizes. Finite-sample criteria, namely mean absolute error, absolute bias and mean squared error, were used for comparing the methods. The best estimator was selected based on the minimum value of these criteria at a specified level of multicollinearity and sample size. The results showed that LAD was the best at the different levels of multicollinearity and is recommended as an alternative to OLS under this condition. The performance of both estimators decreased as the level of multicollinearity increased.
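The study's simulation design can be sketched in miniature: generate correlated explanatory variables at a chosen multicollinearity level, then compare estimators by mean squared error over replications. The error distribution, sample size, and LAD-via-IRLS implementation below are assumptions for illustration (the original study used R and also considered WLS).

```python
import numpy as np

rng = np.random.default_rng(4)

def lad_irls(X, y, n_iter=50, eps=1e-6):
    """Least absolute deviations approximated by IRLS with weights 1/|residual|."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        w = 1.0 / np.maximum(np.abs(y - X @ beta), eps)
        sw = np.sqrt(w)
        beta = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)[0]
    return beta

rho = 0.9                                   # level of multicollinearity (assumed)
n, reps = 100, 200
beta_true = np.array([1.0, 2.0, -1.0])
C = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))
mse_ols = mse_lad = 0.0
for _ in range(reps):
    Z = rng.normal(size=(n, 2)) @ C.T       # correlated explanatory variables
    X = np.column_stack([np.ones(n), Z])
    y = X @ beta_true + rng.standard_t(3, n)  # heavy-tailed errors favour LAD
    mse_ols += np.mean((np.linalg.lstsq(X, y, rcond=None)[0] - beta_true)**2) / reps
    mse_lad += np.mean((lad_irls(X, y) - beta_true)**2) / reps
```

Under heavy-tailed errors, LAD typically attains a lower mean squared error than OLS even with strongly correlated regressors, consistent with the study's recommendation of LAD as an alternative.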


2020 ◽  
Vol 50 (1) ◽  
Author(s):  
Guilherme Alves Puiatti ◽  
Paulo Roberto Cecon ◽  
Moysés Nascimento ◽  
Ana Carolina Campana Nascimento ◽  
Antônio Policarpo Souza Carneiro ◽  
...  

ABSTRACT: The objective of this study was to fit nonlinear quantile regression models for the study of dry matter accumulation in garlic plants over time, and to compare them with models fitted by the ordinary least squares method. The total dry matter of nine garlic accessions belonging to the Vegetable Germplasm Bank of Universidade Federal de Viçosa (BGH/UFV) was measured at four stages (60, 90, 120 and 150 days after planting), and those values were used to fit the nonlinear regression models. For each accession, one quantile regression model (τ=0.5) and one model based on the least squares method were fitted. The nonlinear regression model fitted was the logistic. The Akaike Information Criterion was used to evaluate the goodness of fit of the models. Accessions were grouped using the UPGMA algorithm, with the estimates of the parameters with biological interpretation as variables. Nonlinear quantile regression is efficient for fitting models of dry matter accumulation in garlic plants over time. The estimated parameters are more uniform and robust in the presence of asymmetry in the distribution of the data, heterogeneous variances, and outliers.
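Fitting a logistic growth curve at the median (τ=0.5) amounts to minimizing the quantile check loss over the curve parameters. The sketch below uses synthetic dry matter values, an assumed logistic parameterization a/(1+b·exp(-c·t)), and a generic optimizer; it is not the BGH/UFV data or the authors' fitting code.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(5)
# Four measurement stages, several synthetic plants per stage (assumed data)
days = np.tile(np.array([60.0, 90.0, 120.0, 150.0]), 6)
a, b, c = 30.0, 8.0, 0.05                  # assumed true logistic parameters
dm = a / (1 + b * np.exp(-c * days)) * np.exp(rng.normal(0, 0.1, days.size))

def logistic(t, th):
    return th[0] / (1 + th[1] * np.exp(-th[2] * t))

def check_loss(th, tau=0.5):
    """Quantile (pinball) loss; at tau=0.5 this is half the absolute deviation."""
    r = dm - logistic(days, th)
    return np.sum(np.where(r >= 0, tau * r, (tau - 1) * r))

th0 = np.array([dm.max(), 5.0, 0.04])      # crude starting values
fit = minimize(check_loss, th0, method="Nelder-Mead",
               options={"maxiter": 5000, "xatol": 1e-8, "fatol": 1e-8})
theta_q = fit.x                            # median-regression logistic parameters
```

Replacing the squared-error objective with the check loss is exactly what makes the fitted parameters insensitive to outlying plants and asymmetric errors, the robustness the abstract highlights.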


2016 ◽  
Vol 72 (2) ◽  
pp. 250-260 ◽  
Author(s):  
Bertrand Fournier ◽  
Jesse Sokolow ◽  
Philip Coppens

Two methods for scaling of multicrystal data collected in time-resolved photocrystallography experiments are discussed. The WLS method is based on a weighted least-squares refinement of laser-ON/laser-OFF intensity ratios. The other, previously applied, is based on the average absolute system response to light exposure. A more advanced application of these methods for scaling within a data set, necessary because of frequent anisotropy of light absorption in crystalline samples, is proposed. The methods are applied to recently collected synchrotron data on the tetra-nuclear compound Ag2Cu2L4, with L = 2-diphenylphosphino-3-methylindole. A statistical analysis of the weighted least-squares refinement residual terms is performed to test the importance of the scaling procedure.
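The weighted least-squares scaling idea can be reduced to a toy calculation: estimate the relative scale between two crystals' ON/OFF intensity ratios by minimizing a weighted sum of squared differences. The data model, noise levels, and weights below are assumptions for illustration, not the paper's refinement.

```python
import numpy as np

rng = np.random.default_rng(6)
m = 300
true_k = 1.3                               # assumed relative scale between crystals
eta = rng.uniform(0.8, 1.2, m)             # underlying photoinduced response per reflection
sig1, sig2 = 0.02, 0.03                    # assumed measurement uncertainties
r1 = true_k * eta + rng.normal(0, sig1, m) # ON/OFF ratios, crystal 1
r2 = eta + rng.normal(0, sig2, m)          # ON/OFF ratios, crystal 2

# WLS scale: minimize sum w_i (r1_i - k * r2_i)^2 over k. With constant weights
# (from the assumed uncertainties) the minimizer has a closed form:
w = 1.0 / (sig1**2 + sig2**2)
k_hat = np.sum(w * r1 * r2) / np.sum(w * r2**2)
```

In practice the weights would come from the per-reflection measured uncertainties (and, as the abstract notes, the scaling must also handle anisotropic light absorption), but the closed-form weighted ratio fit is the core of the WLS method.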


2018 ◽  
Vol 7 (6) ◽  
pp. 33
Author(s):  
Morteza Marzjarani

Selecting a proper model for a data set is a challenging task. In this article, an attempt was made to find a suitable model for a given data set. A general linear model (GLM) was introduced along with three different methods for estimating the parameters of the model: ordinary least squares (OLS), generalized least squares (GLS), and feasible generalized least squares (FGLS). In the case of GLS, two different weights were selected for reducing the severity of heteroscedasticity, and the proper weight(s) were deployed. The third weight was selected through the application of FGLS. Analyses showed that only two of the three weights, including the FGLS weight, were effective in reducing the severity of heteroscedasticity. In addition, each data set was divided into training, validation, and testing subsets, producing a more reliable set of estimates for the parameters in the model. Partitioning data is a relatively new approach in statistics, borrowed from the field of machine learning. Stepwise and forward selection methods, along with a number of statistics including the average squared error for testing (ASE), Adj. R-Sq, AIC, AICC, and validation ASE, together with proper hierarchies, were deployed to select a more appropriate model(s) for a given data set. Furthermore, the response variable in both data files was transformed using the Box-Cox method to meet the assumption of normality. Analysis showed that the logarithmic transformation solved this issue in a satisfactory manner. Since the issues of heteroscedasticity, model selection, and partitioning of data have not been addressed in fisheries, for introduction and demonstration purposes only, the 2015 and 2016 shrimp data in the Gulf of Mexico (GOM) were selected and the above methods were applied to these data sets. In conclusion, some variations of the GLM were identified as possible leading candidates for the above data sets.
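The FGLS idea, estimate the variance function from OLS residuals and then weight accordingly, can be sketched generically. The power-of-x variance function and all data below are assumptions for illustration, not the shrimp data or the article's exact weights.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 500
x = rng.uniform(1, 10, n)                  # assumed covariate
sigma = 0.2 * x                            # error sd grows with x (heteroscedastic)
y = 1.0 + 0.8 * x + rng.normal(0, 1, n) * sigma

X = np.column_stack([np.ones(n), x])

# Step 1: OLS fit (unbiased but inefficient under heteroscedasticity)
b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
r = y - X @ b_ols

# Step 2: estimate the variance function by regressing log r^2 on log x,
# i.e. assume sigma^2 proportional to x^g (an assumed functional form)
G = np.column_stack([np.ones(n), np.log(x)])
g = np.linalg.lstsq(G, np.log(r**2 + 1e-12), rcond=None)[0]
var_hat = np.exp(g[0]) * x ** g[1]

# Step 3: FGLS = weighted least squares with the estimated weights
w = 1.0 / var_hat
sw = np.sqrt(w)
b_fgls = np.linalg.lstsq(sw[:, None] * X, sw * y, rcond=None)[0]
```

Because the true sd here is proportional to x, the estimated variance power lands near 2, and the FGLS fit down-weights the noisy high-x observations, the efficiency gain FGLS is meant to deliver.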

