variance inflation factor
Recently Published Documents


TOTAL DOCUMENTS: 71 (FIVE YEARS: 43)

H-INDEX: 9 (FIVE YEARS: 1)

2021 ◽  
Author(s):  
Osman U. Ekiz ◽  

In multiple linear regression analysis, the variance inflation factor is a well-known collinearity measure. It is defined as a function of the coefficient of determination between the explanatory variables, and it is based on the maximum likelihood estimator of the regression coefficients. However, leverage observations, in addition to outliers, can have a significant impact on the coefficient of determination and thereby on the variance inflation factor. This study presents an improved robust variance inflation factor estimator that is not affected by these observations. Simulation studies and a real data analysis indicate that the modified robust variance inflation factor estimator performs better than the traditional one.
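
As an illustration of the definition in this abstract, the classical (non-robust) VIF of predictor j can be sketched as 1 / (1 - R_j^2), where R_j^2 is the coefficient of determination from an ordinary least squares regression of that predictor on the remaining ones. This is a minimal sketch of the traditional estimator only, not the robust estimator proposed in the study; the function name `variance_inflation_factors` and the synthetic data are assumptions made for the example.

```python
import numpy as np

def variance_inflation_factors(X):
    """Classical VIF for each column of X: VIF_j = 1 / (1 - R_j^2)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]
        # Regress column j on an intercept plus all other columns.
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = resid @ resid
        ss_tot = np.sum((y - y.mean()) ** 2)
        # 1 / (1 - R^2) simplifies to SS_tot / SS_res.
        vifs[j] = ss_tot / ss_res
    return vifs

# Hypothetical data: x3 is almost a copy of x1, x2 is independent.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.1 * rng.normal(size=200)
vifs = variance_inflation_factors(np.column_stack([x1, x2, x3]))
print(vifs)
```

In this synthetic setup the two nearly identical columns should receive large VIFs while the independent column stays close to 1. Because R_j^2 is computed by least squares, a single high-leverage row can move these values noticeably, which is the sensitivity the abstract describes.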


2021 ◽  
Vol 17 (5) ◽  
pp. 636-646
Author(s):  
Shelan Saied Ismaeel ◽  
Habshah Midi ◽  
Muhammed Sani

It is now evident that high leverage points (HLPs) can induce multicollinearity in the data of a fixed effect panel data model. The observations responsible for this phenomenon are called high leverage collinearity-enhancing observations (HLCEOs). The commonly used within-group ordinary least squares (WOLS) estimator of the parameters of the fixed effect panel data model is easily affected by HLCEOs. In their presence, the WOLS estimates may have large variances, which leads to erroneous interpretation. It is therefore imperative to detect multicollinearity caused by HLCEOs. The classical variance inflation factor (CVIF) is the commonly used diagnostic for detecting multicollinearity in panel data, but it does not diagnose multicollinearity correctly in the presence of HLCEOs. Hence, this paper proposes three new robust diagnostics for multicollinearity in panel data, namely RVIF (WGM-FIMGT), RVIF (WGM-DRGP), and RVIF (WMM), and compares their performance with that of the CVIF. The numerical evidence shows that the CVIF incorrectly diagnoses multicollinearity, whereas the proposed methods correctly diagnose no multicollinearity in the presence of HLCEOs; RVIF (WGM-FIMGT) is the best method because it has the shortest computational running time.


Media Wisata ◽  
2021 ◽  
Vol 12 (2) ◽  
Author(s):  
Ali Hasan ◽  
Irma Kharisma Hatibie

This research aims to determine the effectiveness of e-marketing on interest in tourist visits to Saronde Island, Gorontalo, and to benefit the institutions involved, the object of research, the authors, and future research. The research location is Saronde Island in Gorontalo. E-marketing is one of the factors that play an important role in running a marketing business. E-marketing, or electronic marketing, is a fairly reliable breakthrough in marketing a tourism product. Saronde Island, one of the tourist attractions managed by GAB (Natural Gorontalo Maritime), uses e-marketing as a powerful way to market its products. This research is quantitative, with measurement data in the form of numbers. The variables examined are Facebook marketing, email marketing, web marketing, and BlackBerry Messenger marketing. The sample consists of 60 respondents. The data were subjected to a normality test, an autocorrelation test based on the Durbin-Watson statistic, and a multicollinearity test based on the VIF (Variance Inflation Factor), and were analyzed with multiple regression. The analysis showed that BlackBerry Messenger is the most effective variable in influencing tourist visits, compared with the other variables. The results showed a significant effect on interest in visiting Saronde Island, Gorontalo. This study is intended to be helpful to readers and to anyone who wants to continue this line of work in future research. In hypothesis testing of the dependent variable against the independent variables, the probability value is < 0.01.


Author(s):  
V. G. Jemilohun

This study investigates the impact of violating an assumption of the hierarchical linear model, where level-1 covariates are collinear, under both the correct functional form and an omitted variable model. This was carried out via Monte Carlo simulation, in which omitted variable bias was introduced. The study considers the effects of multicollinearity when the models are in the correct form and when they are not. In addition, a multicollinearity test using the Variance Inflation Factor (VIF) was carried out on the data set to determine whether multicollinearity is present. The results show that an omitted variable has a tremendous impact on the hierarchical linear model.


Mathematics ◽  
2021 ◽  
Vol 9 (13) ◽  
pp. 1570
Author(s):  
Daniel Homocianu ◽  
Aurelian-Petruș Plopeanu ◽  
Rodica Ianole-Calin

The paper aims to emphasize the advantages of several advanced statistical and data mining techniques when applied to the dense literature on corruption measurements and determinants. For this purpose, we used all seven waves of the World Values Survey and employed the Naive Bayes technique in SQL Server Analysis Services 2016 and the LASSO package, together with logit and melogit regressions with raw coefficients, in Stata 16. We further conducted different types of tests and cross-validations on the wave, country, gender, and age categories. To eliminate multicollinearity, we used predictor correlation matrices. Moreover, we assessed the maximum computed variance inflation factor (VIF) against a maximum acceptable threshold that depends on the model’s R squared in Ordinary Least Squares (OLS) regressions. Our main contribution consists of a methodology for exploring and validating the most important predictors of the risk associated with bribery tolerance. We found a significant role for three influences corresponding to questions about attitudes towards property, authority and public services, and other people, in terms of anti-cheating, anti-evasion, and anti-violence. We used scobit, probit, and logit regressions with average marginal effects to build and test the index based on these attitudes. We also successfully tested the index using risk prediction nomograms and accuracy measurements (AUCROC > 0.9).


Author(s):  
Deshiwa Budilaksana ◽  
I Made Sukarsa ◽  
Anak Agung Ketut Agung Cahyawan Wiranatha

Demand for automobiles in Indonesia has never subsided, since the human need for transportation greatly affects people's daily lives. Manufacturers make various attempts to produce cars whose quality matches the costs incurred and follows market demand. Prediction is a process that can help achieve this goal. One prediction method that can be used in this case is k-Nearest Neighbor. The prediction process consists of a preprocessing stage that cleans the data and filters out unnecessary variables, followed by a variable multicollinearity test with the Variance Inflation Factor (VIF). The multicollinearity test identified 4 variables with a specific influence on the prediction; their VIF values are 2.22, 2.08, 1.53, and 1.10 for Horse Power, Car Width, Highend, and Hatchback, respectively. These four variables are positively correlated with price, the dependent variable. The prediction model is built using the 4 variables selected by the VIF test. To determine the accuracy of the methods used, the Linear Regression model and k-Nearest Neighbor are validated with Mean Absolute Error (MAE) and R2. The k-Nearest Neighbor method produces an MAE of 0.06 and an R2 of 0.843. It can be concluded that the k-Nearest Neighbor method performs well overall in making predictions for continuous-valued variables, or in other words in regression.
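
The validation step described in this abstract can be sketched as follows: a k-Nearest Neighbor regressor predicts a continuous target as the mean of the k closest training targets, and the predictions are scored with MAE and R2. This is a minimal illustration under assumed names (`knn_predict`, `mean_absolute_error`, `r_squared`) and toy data; it is not the authors' implementation and does not reproduce their reported figures.

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Predict each test target as the mean of its k nearest training targets."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    # Squared Euclidean distances, shape (n_test, n_train).
    d2 = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]
    return y_train[nearest].mean(axis=1)

def mean_absolute_error(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean(np.abs(y_true - y_pred))

def r_squared(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Toy data (hypothetical): one feature, target equal to the feature.
X_train = [[0.0], [1.0], [2.0], [3.0]]
y_train = [0.0, 1.0, 2.0, 3.0]
pred = knn_predict(X_train, y_train, [[1.0], [2.0]], k=3)
print(pred, mean_absolute_error([1.0, 2.0], pred), r_squared([1.0, 2.0], pred))
```

Features on very different scales (for example horsepower and car width) would normally be standardized before computing distances; that step is omitted here for brevity.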


Author(s):  
Alhassan Umar Ahmad et al.

In this paper, the consequences of missing observations for data-based multicollinearity were analyzed. Different missing-value patterns have different effects on multicollinearity in a multiple regression model. Therefore, to clarify the relationship between multicollinearity and missing values, the collinear effects were studied for two types of missing values, monotone and arbitrary. A comparison was also made to investigate the response of multicollinearity to each missing-value pattern using the same data. It was found that the tolerance and the variance inflation factor fluctuate as information goes missing from the analyzed sample at different percentages of missing values. It was observed that the more missing values there are in a sample obtained from either population statistics or a survey, the more multicollinearity is found in the multiple regression system, because as missingness increases the tolerance level decreases drastically for both the monotone and arbitrary types.
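
To make the two diagnostics in this abstract concrete: tolerance is the reciprocal of the VIF, and the VIFs equal the diagonal of the inverse of the predictors' correlation matrix. The sketch below recomputes both after blanking out an increasing share of cells at random. The listwise deletion of incomplete rows, the function name `tolerance_and_vif`, and the synthetic data are assumptions for illustration; the abstract does not state how missing values were handled, and the sketch does not reproduce the study's findings.

```python
import numpy as np

def tolerance_and_vif(X):
    """Tolerance and VIF per column, using complete rows only (listwise deletion)."""
    X = np.asarray(X, dtype=float)
    complete = X[~np.isnan(X).any(axis=1)]
    corr = np.corrcoef(complete, rowvar=False)
    # The diagonal of the inverse correlation matrix holds the VIFs.
    vif = np.diag(np.linalg.inv(corr))
    return 1.0 / vif, vif

# Hypothetical predictors: x2 is correlated with x1, x3 is independent.
rng = np.random.default_rng(1)
n = 300
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

for frac in (0.0, 0.1, 0.3, 0.5):
    X_missing = X.copy()
    # Arbitrary (non-monotone) pattern: blank out cells at random.
    mask = rng.random(X.shape) < frac
    X_missing[mask] = np.nan
    tol, vif = tolerance_and_vif(X_missing)
    print(f"missing={frac:.0%}  tolerance={np.round(tol, 3)}  VIF={np.round(vif, 3)}")
```

A monotone pattern could be simulated instead by blanking out the trailing columns of a block of rows, so that once a value is missing in a row every later column is missing as well.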


2021 ◽  
Author(s):  
Abolfazl Ghanbari ◽  
Behzad Baradaran ◽  
Hamed Ahmadi ◽  
Maryam Ahmadi

Abstract Background: Within six months of the COVID-19 outbreak, 350,279 people were infected and 20,125 people died of COVID-19 in Iran. There is an urgent need to identify the most accurate indicators of this disease's outbreak in order to control and predict it. Methods: We examined the effect of 36 demographic, economic, environmental, health infrastructure, social, and topographic independent variables on the COVID-19 infection and mortality rates using the ordinary least squares (OLS) model in ArcGIS 10.5. Based on an adjusted R-squared > 0.7, we selected 20 variables for the COVID-19 infection rate and 16 variables for the mortality rate. The collinearity problem among the selected variables was resolved using the variance inflation factor (VIF). Then, we performed the OLS and geographically weighted regression (GWR) models in ArcGIS 10.5. Results: Having a large number of men, having a large population, a lack of specialist doctors, a lack of hospitals, having a large urban population, having a large number of people aged 65 and over, and a high natural mortality rate had the most prominent impact on the increase in the COVID-19 infection rate. Also, a lack of ICU beds, a low number of insured people, a lack of subspecialist physicians, and a lack of hospital beds had the most prominent impact on the increase in COVID-19 mortality. The variables with a VIF above 7.5 were then removed, and finally a high rate of incoming immigrants and a lack of nurses were identified as two independent variables for predicting the COVID-19 infection rate. In addition, a high rate of incoming immigrants and a high number of doctor consultations were identified as two variables for predicting the mortality rate due to COVID-19. The results of the Akaike information criterion (AIC) and adjusted R2 showed that both models were appropriate for these analyses. Conclusions: Based on our results, there would be a considerable increase in COVID-19 infection in Kerman, Esfahan, and Kermanshah provinces. In addition, there would be a remarkable decrease in COVID-19 infection in Khuzestan, Lorestan, Azarbayjan Shargi, and Tehran provinces. Regarding COVID-19 mortality, there would be a substantial rise in Fars and Khorasan Razavi provinces. Moreover, our analyses predicted a considerable decrease in COVID-19 mortality in Tehran, Ardebil, Zanjan, Gilan, Golestan, Lorestan, Khuzestan, Bushehr, and Hormozgan provinces.
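
The screening step mentioned in this abstract, removing variables with a VIF above 7.5, can be sketched as an iterative procedure that drops the predictor with the largest VIF and recomputes until every remaining VIF is at or below the cutoff. Only the 7.5 threshold comes from the abstract; the one-at-a-time removal rule, the function names, the variable names, and the synthetic data are assumptions, since the study ran its analysis in ArcGIS and does not describe the removal procedure.

```python
import numpy as np

def vif_values(X):
    """VIFs as the diagonal of the inverse correlation matrix of the predictors."""
    corr = np.corrcoef(X, rowvar=False)
    # Note: perfectly collinear columns make corr singular and inv() will fail.
    return np.diag(np.linalg.inv(corr))

def drop_high_vif(X, names, threshold=7.5):
    """Repeatedly remove the predictor with the largest VIF above the threshold."""
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        vifs = vif_values(X)
        worst = int(np.argmax(vifs))
        if vifs[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return X, names

# Hypothetical predictors: urban population nearly duplicates population.
rng = np.random.default_rng(2)
population = rng.normal(size=250)
urban_population = population + 0.05 * rng.normal(size=250)
nurses = rng.normal(size=250)
X = np.column_stack([population, urban_population, nurses])

X_kept, kept = drop_high_vif(X, ["population", "urban_population", "nurses"])
print(kept, np.round(vif_values(X_kept), 3))
```

Dropping one variable at a time matters because VIFs are interdependent: removing one member of a collinear pair usually brings the other back under the threshold, so removing every variable above the cutoff in a single pass can discard more predictors than necessary.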


Foods ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 632
Author(s):  
Antonio González Ariza ◽  
Ander Arando Arbulu ◽  
Francisco Javier Navas González ◽  
Juan Vicente Delgado Bermejo ◽  
María Esperanza Camacho Vallejo

This study aimed to develop a tool to validate multivariety breed egg quality classification depending on quality-related internal and external traits using a discriminant canonical analysis approach. A flock of 60 Utrerana hens (Franciscan, White, Black, and Partridge) and a control group of 10 Leghorn hens were placed in individual cages to follow the traceability of the eggs and perform an individual internal and external quality assessment. Egg groups were determined depending on their commercial size (S, M, L, and XL), laying hen breed, and variety. Egg weight, major diameter, minor diameter, shell b*, albumen height, and the presence or absence of visual defects in yolk and/or albumen showed multicollinearity problems (variance inflation factor (VIF) > 5) and were discarded. Albumen weight, eggshell weight, and yolk weight were the most responsible traits for the differences among egg quality categories (Wilks’ lambda: 0.335, 0.539, and 0.566 for albumen weight, eggshell weight, and yolk weight, respectively). The combination of traits in the first two dimensions explained 55.02% and 20.62% variability among groups, respectively. Shared properties between Partridge and Franciscan varieties may stem from their eggs presenting heavier yolks and slightly lower weights, while White Utrerana and Leghorn hens’ similarities may be ascribed to hybridization reminiscences.

