Effects of measurement errors in predictor selection of linear regression model

2007 ◽  
Vol 52 (2) ◽  
pp. 1183-1195 ◽  
Author(s):  
Kimmo Vehkalahti ◽  
Simo Puntanen ◽  
Lauri Tarkkonen
2021 ◽  
Vol 2099 (1) ◽  
pp. 012024
Author(s):  
V N Lutay ◽  
N S Khusainov

Abstract: This paper discusses constructing a linear regression model with regularization of the system matrix of the normal equations. In contrast to conventional ridge regression, where positive parameters are added to all diagonal entries of the matrix, the proposed method increases only those diagonal entries that correspond to highly correlated data. This lowers the condition number of the matrix and, therefore, reduces the corresponding coefficients of the regression equation. The entries to be increased are selected using the triangular decomposition of the correlation matrix of the original dataset. The effectiveness of the method is tested on a known dataset, and it is compared not only with ridge regression but also with the results of the widely used LARS and Lasso algorithms.
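A minimal sketch of the selective regularization described in this abstract, under assumptions that are not taken from the paper. The function name `selective_ridge`, the parameters `lam` and `pivot_tol`, and the selection rule are hypothetical: a predictor is flagged when the squared diagonal entry of the Cholesky factor of the correlation matrix is small, which means the predictor is almost a linear combination of the predictors that precede it. The authors' actual selection rule may differ.

```python
import numpy as np

def selective_ridge(X, y, lam=1.0, pivot_tol=0.1):
    """Ridge-like fit that penalizes only predictors flagged as collinear.

    lam and pivot_tol are illustrative parameters, not values from the paper.
    """
    # Standardize predictors and center the response so that
    # Xs.T @ Xs / n is the correlation matrix of the predictors.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    yc = y - y.mean()
    n = Xs.shape[0]
    R = Xs.T @ Xs / n

    # Triangular (Cholesky) decomposition R = L @ L.T.
    # This raises LinAlgError if predictors are exactly collinear.
    L = np.linalg.cholesky(R)

    # Assumed selection rule: L[j, j]**2 equals 1 - R^2 of predictor j
    # regressed on predictors 0..j-1, so a small value signals collinearity.
    flagged = np.diag(L) ** 2 < pivot_tol

    # Increase only the flagged diagonal entries of the normal-equations matrix.
    penalty = np.where(flagged, lam, 0.0)
    beta = np.linalg.solve(R + np.diag(penalty), Xs.T @ yc / n)
    return beta, flagged
```

With `lam=0` the function reduces to ordinary least squares on the standardized data, which gives a baseline for comparing coefficient sizes. The flags depend on the column order, because each predictor is checked only against the columns before it.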


2021 ◽  
Vol 20 (3) ◽  
pp. 425-449
Author(s):  
Haruka Murayama ◽  
Shota Saito ◽  
Yuji Iikubo ◽  
Yuta Nakahara ◽  
Toshiyasu Matsushima

Abstract: Prediction based on a single linear regression model is one of the most common approaches in various fields of study. It enables us to understand the structure of the data, but it might not be suitable for data whose structure is complex. To express the structure of the data more accurately, we assume that the data can be divided into clusters and that each cluster has its own linear regression model. In this case, we can assume that each explanatory variable has its own role: explaining the assignment to the clusters, explaining the regression on the target variable, or both. By introducing a probabilistic structure into the data-generating process, we derive the optimal prediction under the Bayes criterion and an algorithm that calculates it sub-optimally using a variational inference method. One advantage of our algorithm is that it automatically weights the probability of each possible number of clusters as it runs, and therefore addresses the concern about selecting the number of clusters. Experiments are performed on both synthetic and real data to demonstrate these advantages and to reveal some behaviors and tendencies of the algorithm.
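The modeling assumption in this abstract can be illustrated with a small synthetic example. This is only a sketch of the cluster-wise regression setting, not the authors' variational Bayes algorithm: the cluster labels are treated as known, the number of clusters is fixed at two, and the variable names `u` (explains cluster assignment) and `v` (explains the regression) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400
u = rng.uniform(-1, 1, size=n)   # hypothetical variable explaining cluster assignment
v = rng.uniform(-1, 1, size=n)   # hypothetical variable explaining the regression
z = (u > 0).astype(int)          # cluster label, treated as known in this sketch
slopes = np.array([2.0, -2.0])   # each cluster has its own regression slope
y = slopes[z] * v + 0.1 * rng.normal(size=n)

def fit_predict(A, y):
    """Least-squares fit of y on design matrix A, returning in-sample predictions."""
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return A @ coef

# Single linear regression model on all data and both variables.
A = np.column_stack([np.ones(n), u, v])
mse_single = np.mean((y - fit_predict(A, y)) ** 2)

# One linear regression model per cluster, using only the regression variable.
pred = np.empty(n)
for k in (0, 1):
    m = z == k
    Ak = np.column_stack([np.ones(m.sum()), v[m]])
    pred[m] = fit_predict(Ak, y[m])
mse_clustered = np.mean((y - pred) ** 2)

print(f"single model MSE:      {mse_single:.3f}")
print(f"per-cluster model MSE: {mse_clustered:.3f}")
```

Because the two clusters have opposite slopes, the single model fits poorly while the per-cluster models recover the structure. The approach in the abstract goes further by inferring the cluster assignments and weighting over the number of clusters rather than fixing them.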

