ESTIMATION OF THE DYNAMICS OF FACTOR CONTRIBUTIONS IN A LINEAR REGRESSION MODEL

Author(s):  
С.И. Носков

Two algorithmic schemes are developed for estimating the parameters of a linear regression under the requirement that the approximation error for a given observation is zero. On their basis, methods are derived for calculating dynamic estimates of the contributions that the factors on the right-hand side of the linear regression model make to the values of the dependent variable. One scheme is based on solving a quadratic programming problem; the other uses the weighted least squares method. The resulting iterative process recalculates the matrix of weighting coefficients for each observation in the data sample.
Contributions were calculated for the following factors in a regression model of loading on railway transport: the volume of coal production, the volume of exported timber, and the working fleet of loaded railway cars (daily average). The volume of coal production has the largest influence on the output variable, although this influence shows a general tendency to decrease, by almost 4 points over 14 years. The influence of the second most important factor, the working fleet of loaded railway cars, also weakens somewhat, by 3 points. The least significant indicator, the volume of exported timber, shows a clear tendency to increase its influence, which has grown by almost 7 points.
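The requirement that the approximation error vanish at one chosen observation can be illustrated with a generic equality-constrained least squares fit. The sketch below is not the author's quadratic programming or weighted least squares scheme, whose details the abstract does not give; it solves the standard KKT system for minimizing the sum of squared residuals subject to a zero residual at observation k. The function names and the demo data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_with_exact_observation(X, y, k):
    """Least squares coefficients with zero residual at observation k.

    Solves the KKT system
        [2 X^T X  x_k] [a     ]   [2 X^T y]
        [x_k^T    0  ] [lambda] = [y_k    ]
    """
    p = len(X[0])
    xtx = [[2.0 * sum(row[i] * row[j] for row in X) for j in range(p)]
           for i in range(p)]
    xty = [2.0 * sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    kkt = [xtx[i] + [X[k][i]] for i in range(p)] + [list(X[k]) + [0.0]]
    rhs = xty + [y[k]]
    return solve_linear_system(kkt, rhs)[:p]  # drop the multiplier


# Hypothetical demo data: intercept column plus one factor.
X_demo = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y_demo = [0.1, 1.2, 1.9, 3.3]
coef_demo = fit_with_exact_observation(X_demo, y_demo, k=3)
```

Repeating the fit with each observation in turn as the constrained one gives one coefficient vector per observation, which is the kind of per-observation recalculation the abstract describes.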

Author(s):  
Chahyun Oh ◽  
Boohwi Hong ◽  
Yumin Jo ◽  
Woosuk Chung ◽  
Hoseop Kim ◽  
...  

Background: The optimal insertion length for right subclavian vein catheterization in infants has not been determined. This study retrospectively compared landmark-based and linear regression model-based estimation of optimal insertion length for right subclavian vein catheterization in pediatric patients of corrected age < 1 year. Methods: Fifty catheterizations of the right subclavian vein were analyzed. The landmark-related distances were: from the needle insertion point (I) to the tip of the sternal head of the right clavicle (A), and from A to the midpoint (B) of the perpendicular line drawn from the sternal head of the right clavicle to the line connecting the nipples. The optimal length of insertion was determined retrospectively by reviewing post-procedural chest radiographs. Estimates from a landmark-based equation (IA + AB – intercept) and from a linear regression model were compared with the optimal length of insertion. Results: The landmark-based equation was determined as IA + AB – 5. The mean difference between the landmark-based estimate and the optimal insertion length was 1.0 mm (95% limits of agreement –18.2 to 20.3 mm). The mean difference between the linear regression model (26.681 – 4.014 × weight + 0.576 × IA + 0.537 × AB – 0.482 × postmenstrual age) and the optimal insertion length was 0 mm (95% limits of agreement –16.7 to 16.7 mm). The difference between the estimates from the two methods was not significant. Conclusion: A simple landmark-based equation may be useful for estimating optimal insertion length in pediatric patients of corrected age < 1 year undergoing right subclavian vein catheterization.
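The two estimators reported in the Results can be written out directly, which makes the difference in required inputs easy to see. This is a minimal sketch: the coefficients are taken from the abstract, but the function and parameter names are hypothetical, and the abstract does not state the units of weight, postmenstrual age, IA, or AB, so the inputs are assumed to be in the units used in the study.

```python
def landmark_estimate(ia, ab):
    """Landmark-based estimate from the abstract: IA + AB - 5."""
    return ia + ab - 5.0


def regression_estimate(weight, ia, ab, postmenstrual_age):
    """Linear regression estimate with the coefficients from the abstract.

    Units are NOT given in the abstract; inputs are assumed to use
    the same units as the original study.
    """
    return (26.681
            - 4.014 * weight
            + 0.576 * ia
            + 0.537 * ab
            - 0.482 * postmenstrual_age)
```

The landmark-based form needs only the two measured distances, while the regression form also needs weight and postmenstrual age.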


Author(s):  
Cécile Haberstich ◽  
Anthony Nouy ◽  
Guillaume Perrin

One of the most challenging tasks in computational science is the approximation of high-dimensional functions. Most of the time, only little information about the function is available, and approximating high-dimensional functions requires exploiting their low-dimensional structures. In this work, the approximation of a function u is built using point evaluations of the function, where these evaluations are selected adaptively. Such problems are encountered when the function represents the output of a black-box computer code, a system, or a physical experiment for a given value of a set of input variables. This algorithm relies on an extension of principal component analysis (PCA) to multivariate functions in order to estimate the tensors $v_{\alpha}$. In practice, the PCA is performed on sample-based projections of the function u, using interpolation or least-squares regression. Least-squares regression can provide a stable projection, but it usually requires a high number of evaluations of u, which is not affordable when one evaluation is very costly. In [1] the authors proposed an optimal weighted least-squares method, with a choice of weights and samples that guarantees an approximation error of the order of the best approximation error using a minimal number of samples. We here present an extension of this methodology to approximation in tree-based format, where the optimal weighted least-squares method is used for the projection onto tensor product spaces. This approach will be compared with a strategy using a standard least-squares method or interpolation (as proposed in [2]).


2021 ◽  
Vol 25 (2) ◽  
pp. 83-108
Author(s):  
Qiqing Yu ◽  

Under the right censorship model and under the linear regression model where may not exist, the modified semi-parametric MLE (MSMLE) was proposed by Yu and Wong [17]. The MSMLE of satisfying infinitely often) if is discontinuous, and the simulation study suggests that it is also consistent and efficient under certain regularity conditions. In this paper, we establish the consistency of the MSMLE under the necessary and sufficient condition that is identifiable. Notice that under the latter assumption, the Buckley-James estimator and the median regression estimator can be inconsistent (see Yu and Dong [20]).


2021 ◽  
Vol 27 (127) ◽  
pp. 213-228
Author(s):  
Qasim Mohammed Saheb ◽  
Saja Mohammad Hussein

Linear regression is one of the most important statistical tools for identifying the relationship between a response variable and one or more independent variables, and it is widely used in many fields of science. Heteroscedasticity is one of the problems of linear regression, and its effect leads to inaccurate conclusions. Heteroscedasticity may be accompanied by extreme outliers in the independent variables, known as high leverage points (HLPs); the presence of HLPs in the data set results in unrealistic estimates and misleading inferences. In this paper, we review some robust weighted estimation methods that combine robust and classical methods for detecting HLPs and determining the weights. The methods are Diagnostic Robust Generalized Potential Based on Minimum Volume Ellipsoid (DRGP (MVE)), Diagnostic Robust Generalized Potential Based on Minimum Covariance Determinant (DRGP (MCD)), and Diagnostic Robust Generalized Potential Based on Index Set Equality (DRGP (ISE)). The comparison was made according to the standard error criterion of the estimated parameters SE ( ) and SE ( ) of the general linear regression model, for sample sizes n = 60, n = 100, and n = 160, with different degrees (severity) of heterogeneity and HLP contamination percentages of τ = 10% and τ = 30%. The comparison showed that weighted least squares estimation based on the weights of the DRGP (ISE) method performed best in estimating the parameters of the multiple linear regression model, because it gave the lowest standard errors of the estimators ( ) and ( ) compared with the other methods. Paper type: A case study
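How weights enter the estimate can be shown with a minimal weighted least squares fit for a single predictor. This sketch assumes the weights are already given; it does not implement DRGP (MVE), DRGP (MCD), or DRGP (ISE), whose weight constructions the abstract does not spell out. The function name is hypothetical.

```python
def weighted_least_squares_line(x, y, w):
    """Fit y ~ b0 + b1 * x by minimizing sum_i w_i * (y_i - b0 - b1 * x_i)^2.

    The weights w are assumed to be supplied by some diagnostic
    procedure; small weights reduce the pull of suspect points.
    """
    total = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / total  # weighted mean of x
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / total  # weighted mean of y
    sxy = sum(wi * (xi - x_mean) * (yi - y_mean)
              for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))
    if sxx == 0.0:
        raise ValueError("x has no weighted variation")
    b1 = sxy / sxx
    b0 = y_mean - b1 * x_mean
    return b0, b1
```

With all weights equal, the result coincides with ordinary least squares; down-weighting a high leverage point limits its effect on the fitted slope.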

