Ridge regression models of urban crime

1979 ◽  
Vol 9 (2-3) ◽  
pp. 247-260
Author(s):  
Tarald O. Kvålseth
2020 ◽  
Vol 20 (1) ◽  
pp. 163-176
Author(s):  
Sebastian Gnat

Abstract
Research background: Mass appraisal is a process in which multiple properties are appraised simultaneously, with a uniform approach. One tool that can be used in this area is the multiple regression model. In real estate valuation, features are often described on an ordinal or nominal scale. Replacing them with dummy variables when the number of observations is insufficient leads to multicollinearity. There is also a risk of overfitting the model. One way to eliminate or weaken these phenomena is to introduce regularization, which penalizes the model for high values of its weights.
Purpose: The aim of the study is to test whether regularized regression reduces the errors of property valuation and to determine which of the analyzed methods is the most effective in this context.
Research methodology: The article presents a study in which two regularization methods, ridge and lasso regression, are applied and assessed by their impact on the errors of property valuation. The analyzed data set includes over 300 land properties valued by property appraisers. The key aspects of the study are the selection of optimal values of the regularization parameter and its influence on the model's errors for different numbers of observations in the training sets.
Results: The study showed that regularization improves valuation results and, more specifically, yields lower mean absolute percentage errors. The improvement in model effectiveness was more pronounced for ridge regression. Another important result is that, for smaller training sets, regularization provided higher valuation accuracy than multiple regression models.
Novelty: The article confirms the effectiveness of regularization as a way to mitigate multicollinearity and overfitting. The results showed that ridge regression can be an effective way of modelling the value of real estate, especially when little market data is available, which is an important conclusion in the context of the real estate market.
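The procedure described in this abstract (fit ridge regression for several values of the regularization parameter, then keep the value with the lowest mean absolute percentage error on held-out data) can be sketched as follows. This is a minimal pure-Python illustration, not the study's implementation: the function names, the grid of candidate values, and the toy numbers are all assumptions, and the intercept is left unpenalized by choice.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_fit(X, y, lam):
    """Return ridge coefficients, intercept first; the intercept is not penalized."""
    rows = [[1.0] + list(r) for r in X]  # prepend a column of ones
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    for i in range(1, p):  # add the penalty to the slope entries of the diagonal only
        xtx[i][i] += lam
    return solve(xtx, xty)


def predict(beta, X):
    """Predict with coefficients in the layout returned by ridge_fit."""
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]


def mape(y_true, y_pred):
    """Mean absolute percentage error; assumes no true value is zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def pick_lambda(X_tr, y_tr, X_val, y_val, grid):
    """Return (validation MAPE, lambda) for the best candidate in grid."""
    scored = [(mape(y_val, predict(ridge_fit(X_tr, y_tr, lam), X_val)), lam)
              for lam in grid]
    return min(scored)


if __name__ == "__main__":
    # Made-up toy data with two strongly correlated features (not from the study).
    X_tr = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9]]
    y_tr = [10.2, 19.8, 31.1, 39.7]
    X_val = [[2.5, 2.4], [3.5, 3.6]]
    y_val = [25.0, 35.5]
    print(pick_lambda(X_tr, y_tr, X_val, y_val, [0.0, 0.01, 0.1, 1.0, 10.0]))
```

Setting the penalty to zero recovers ordinary multiple regression, so the same function covers the unregularized baseline the abstract compares against.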


Author(s):  
Husam H. Alkinani ◽  
Abo Taleb T. Al-Hameedi ◽  
Shari Dunn-Norman ◽  
Munir Aldin ◽  
Deepak Gokaraju ◽  
...  

Abstract Elastic moduli such as Young's modulus (E), Poisson's ratio (ν), and bulk modulus (K) are vital to creating geomechanical models for wellbore stability, hydraulic fracturing, sand production, etc. Because core samples are difficult to obtain and rock testing is difficult to perform, wireline measurements can be used instead to estimate dynamic moduli. However, dynamic moduli differ significantly from static moduli due to many factors. In this paper, correlations for three zones (Nahr Umr shale, Zubair shale, and Zubair sandstone) located in southern Iraq were created to estimate static E, K, and ν from dynamic data. Core plugs from these three zones were acquired alongside wireline measurements for the same sections. Single-stage triaxial (SST) tests with CT scans were executed on the core plugs. The data were separated into two parts, training (70%) and testing (30%), to ensure the models can be generalized to new data. Regularized ridge regression models were created to estimate static E, K, and ν from dynamic data (wireline measurements). The shrinkage parameter (α) was selected for each model through an iterative process whose goal was to obtain the smallest error. The results showed that all models had testing R² values between 0.92 and 0.997, consistent with the training results. All models of E, K, and ν were linear except the ν models for the Zubair sandstone and shale, which were second-degree polynomials. Furthermore, root mean squared error (RMSE) and mean absolute error (MAE) were used to assess the error of the models. Both RMSE and MAE were consistently low in training and testing, without a large discrepancy between the two. Thus, given the ridge regularization and the consistently low error in training and testing, it can be concluded that the proposed models can be generalized to new data and that no overfitting is observed. The proposed models for Nahr Umr shale, Zubair shale, and Zubair sandstone can be used to estimate E, K, and ν from readily available dynamic data, which can contribute to creating robust geomechanical models for hydraulic fracturing, sand production, wellbore stability, etc.
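The evaluation scheme in this abstract (a 70/30 split, then R², RMSE, and MAE compared between training and testing) can be illustrated with the short sketch below. It is an assumption-laden illustration rather than the authors' code: the helper names, the shuffling seed, and the rounding rule for the test-set size are hypothetical choices.

```python
import math
import random


def train_test_split(rows, test_fraction=0.3, seed=0):
    """Shuffle a copy of rows and return (train, test); the seed is an arbitrary choice."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination; assumes y_true is not constant."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Computing the three metrics on both the training and the testing portion, and checking that they stay close, is the kind of consistency check the abstract uses to argue that no overfitting is observed.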


Author(s):  
Pradeep Lall ◽  
Dinesh Arunachalam ◽  
Jeff Suhling

Goldmann Constants and Norris-Landzberg acceleration factors for lead-free solders have been developed based on ridge regression (RR) models for reliability prediction and part selection of area-array packaging architectures under thermo-mechanical loads. Ridge regression adds a small positive bias to the diagonal of the covariance matrix to reduce sensitivity to correlated variables. The proposed procedure proves to be a better tool for prediction than multiple-linear regression models. Models have been developed in conjunction with Stepwise Regression Methods for identification of the main effects. Package architectures studied include BGA packages mounted on copper-core and no-core printed circuit assemblies in harsh environments. The models have been developed based on thermo-mechanical reliability data acquired on copper-core and no-core assemblies in four different thermal cycling conditions. Packages with Sn3Ag0.5Cu solder alloy interconnects have been examined. The models have been developed based on perturbation of accelerated test thermo-mechanical failure data. Data have been gathered on nine different thermal cycle conditions with SAC305 alloys. The thermal cycle conditions differ in temperature range, dwell times, maximum temperature, and minimum temperature to enable development of the constants needed for life prediction and assessment of acceleration factors. Norris-Landzberg acceleration factors have been benchmarked against previously published values. In addition, model predictions have been validated against validation datasets that were not used for model development. Convergence of the statistical models with experimental data has been demonstrated using a single-factor design of experiment study for individual factors including temperature cycle magnitude, relative coefficient of thermal expansion, and diagonal length of the chip. The predicted and measured acceleration factors have also been computed and correlated. Good correlations have been achieved for the parameters examined.
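The "small positive bias" on the diagonal mentioned in this abstract corresponds to the textbook ridge estimator, shown here next to ordinary least squares for comparison. The notation is an assumption, not taken from the paper: X is the matrix of predictors, y the response vector, I the identity matrix, and k the bias (ridge) parameter.

```latex
\hat{\beta}_{\mathrm{OLS}} = (X^{\top} X)^{-1} X^{\top} y,
\qquad
\hat{\beta}_{\mathrm{RR}}(k) = (X^{\top} X + k I)^{-1} X^{\top} y,
\quad k \ge 0 .
```

Setting k = 0 recovers the least-squares fit. When predictors are strongly correlated, X^T X is close to singular, and a positive k keeps the inverse well conditioned at the cost of some bias in the coefficients.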


2007 ◽  
Vol 20 (12) ◽  
pp. 2810-2826
Author(s):  
Timothy DelSole

Abstract This paper presents a framework based on Bayesian regression and constrained least squares methods for incorporating prior beliefs in a linear regression problem. Prior beliefs are essential in regression theory when the number of predictors is not a small fraction of the sample size, a situation that leads to overfitting—that is, to fitting variability due to sampling errors. Under suitable assumptions, both the Bayesian estimate and the constrained least squares solution reduce to standard ridge regression. New generalizations of ridge regression based on priors relevant to multimodel combinations also are presented. In all cases, the strength of the prior is measured by a parameter called the ridge parameter. A “two-deep” cross-validation procedure is used to select the optimal ridge parameter and estimate the prediction error. The proposed regression estimates are tested on the Development of a European Multimodel Ensemble System for Seasonal to Interannual Prediction (DEMETER) hindcasts of seasonal mean 2-m temperature over land. Surprisingly, none of the regression models proposed here can consistently beat the skill of a simple multimodel mean, despite the fact that one of the regression models recovers the multimodel mean in a suitable limit. This discrepancy arises from the fact that methods employed to select the ridge parameter are themselves sensitive to sampling errors. It is plausible that incorporating the prior belief that regression parameters are “large scale” can reduce overfitting and result in improved performance relative to the multimodel mean. Despite this, results from the multimodel mean demonstrate that seasonal mean 2-m temperature is predictable for at least three months in several regions.
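The statement that the Bayesian estimate reduces to ridge regression can be illustrated with the standard Gaussian case. This is a textbook form given under assumed notation, not necessarily the exact assumptions or the generalized priors used in the paper: σ² is the noise variance, τ² the prior variance, β₀ the prior mean, and λ the ridge parameter.

```latex
y \mid \beta \sim \mathcal{N}(X\beta, \sigma^{2} I), \qquad
\beta \sim \mathcal{N}(\beta_{0}, \tau^{2} I)
\;\Longrightarrow\;
\mathbb{E}[\beta \mid y]
  = \beta_{0} + (X^{\top} X + \lambda I)^{-1} X^{\top} (y - X\beta_{0}),
\qquad \lambda = \sigma^{2} / \tau^{2} .
```

With β₀ = 0 this is standard ridge regression. As λ grows, the estimate approaches the prior mean β₀, which is one way a regression estimate can recover a fixed weighting in a limit. Minimizing the squared error subject to a bound on the squared norm of β − β₀ gives an estimate of the same form, with λ playing the role of the Lagrange multiplier.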


2021 ◽  
Vol 13 (7) ◽  
pp. 1307
Author(s):  
Pariya Pourmohammadi ◽  
Michael P. Strager ◽  
Michael J. Dougherty ◽  
Donald A. Adjeroh

Land development processes are driven by complex interactions between socio-economic and spatial factors. Understanding such processes and the underlying procedures helps urban and regional planners, environmental scientists, and policy makers to base their decisions on valid and thorough information. In this work, remote-sensing-derived land-cover data were used to characterize the patterns of land development from the beginning of 1985 to the beginning of 2015 in the state of West Virginia (WV), US. We applied spatial pattern analysis, ridge regression, and Geographically Weighted Ridge Regression (GWRR) to examine the impact of population, energy resources, existing land development dynamics, and economic status on land transformation. We showed that, in the presence of multicollinearity among explanatory variables, penalizing regression models at both local and global levels leads to a better fit and decreases the model's variance. We used geographical error analysis of regression models to visualize the difference between the model estimates and actual values. The findings of this research indicate that, because of the shifting geography of opportunities, the patterns and processes of land development in the studied region are unstable. This leads to fragmented land developments and prevents the formation of large communities.
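A common way to write a geographically weighted ridge estimator, given here as a hedged sketch and not necessarily the exact formulation used in this study, is to fit a separate penalized regression at each location u_i, with observations weighted by their distance to that location. The Gaussian kernel, the bandwidth h, the distance d_ij between locations i and j, and the penalty λ are assumed notation.

```latex
\hat{\beta}(u_i) = \bigl( X^{\top} W(u_i) X + \lambda I \bigr)^{-1} X^{\top} W(u_i)\, y,
\qquad
W(u_i) = \mathrm{diag}\bigl( w_{i1}, \dots, w_{in} \bigr),
\quad
w_{ij} = \exp\!\Bigl( -\tfrac{d_{ij}^{2}}{2 h^{2}} \Bigr).
```

Replacing W(u_i) with the identity matrix gives the global ridge estimator, which is the sense in which the penalty can be applied at both local and global levels.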

