correlated predictors
Recently Published Documents


Total documents: 23 (five years: 7)
H-index: 8 (five years: 0)

2021
Author(s): João Veríssimo

Mixed-effects models containing both fixed and random effects have become widely used in the cognitive sciences, as they are particularly appropriate for the analysis of clustered data. However, testing hypotheses in the presence of random effects is not completely straightforward, and a set of best practices for statistical inference in mixed-effects models is still lacking. Van Doorn et al. (2021) investigated how Bayesian hypothesis testing in mixed-effects models is impacted by particular model specifications. Here, we extend their work to the more complex case of models with three-level factorial predictors and, more generally, with multiple correlated predictors. We show how non-maximal models with correlated predictors contain 'mismatches' between fixed and random effects, in which the same predictor can refer to different effects in the fixed and random parts of a model. We then demonstrate through a series of Bayesian model comparisons that such mismatches can lead to inaccurate estimations of random variance, and in turn to biases in the assessment of evidence for the effect of interest. We present specific recommendations for how researchers can resolve mismatches or avoid them altogether: by fitting maximal models, eliminating correlations between predictors, or by residualising the random effects. Our results reinforce the observation that model comparisons with mixed-effects models can be surprisingly intricate and highlight that researchers should carefully and explicitly consider which hypotheses are being tested by each model comparison. Data and code are publicly available in an OSF repository at https://osf.io/njaup.
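
As a rough illustration of why a three-level factor can produce correlated predictors, the sketch below computes the correlation between the two contrast columns of a balanced three-level factor under three common coding schemes. The scheme names and contrast values are standard textbook codings chosen here for illustration; the abstract does not say which codings the authors used, and this is not their code (that is in the OSF repository).

```python
def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Each row gives the two contrast values for one level of the factor.
# One row per level stands in for a balanced design (equal cell counts).
# These codings are illustrative assumptions, not taken from the paper.
schemes = {
    "treatment": [(0, 0), (1, 0), (0, 1)],
    "sum": [(1, 0), (0, 1), (-1, -1)],
    "helmert": [(-1, -1), (1, -1), (0, 2)],
}

correlations = {}
for name, rows in schemes.items():
    first = [row[0] for row in rows]
    second = [row[1] for row in rows]
    correlations[name] = pearson(first, second)

for name, r in correlations.items():
    print(f"{name:>9}: r = {r:+.2f}")
```

In a balanced design, treatment and sum coding give contrast columns correlated at -0.5 and +0.5 respectively, whereas the orthogonal Helmert columns are uncorrelated, which is one way to read "eliminating correlations between predictors".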


Author(s): Mariella Gregorich, Susanne Strohmaier, Daniela Dunkler, Georg Heinze

Regression models have been in use for decades to explore and quantify the association between a dependent response and several independent variables in environmental sciences, epidemiology and public health. However, researchers often encounter situations in which some independent variables exhibit high bivariate correlation, or may even be collinear. Improper statistical handling of this situation will most certainly generate models of little or no practical use and misleading interpretations. By means of two example studies, we demonstrate how diagnostic tools for collinearity or near-collinearity may fail in guiding the analyst. Instead, the most appropriate way of handling collinearity should be driven by the research question at hand and, in particular, by the distinction between predictive and explanatory aims.
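
For readers unfamiliar with collinearity diagnostics, here is a minimal sketch of one widely used measure, the variance inflation factor (VIF), in the special case of exactly two predictors, where it reduces to 1 / (1 - r^2). The abstract does not name the diagnostics the authors examined, so the choice of VIF is an assumption, and the data values are hypothetical.

```python
def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def vif_two_predictors(r):
    """VIF for either predictor when a model has exactly two predictors
    whose Pearson correlation is r."""
    if abs(r) >= 1.0:
        raise ValueError("perfectly collinear predictors: VIF is undefined")
    return 1.0 / (1.0 - r * r)

# Hypothetical, nearly collinear predictor values (not from either study).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]

r = pearson(x1, x2)
vif = vif_two_predictors(r)
print(f"r = {r:.4f}, VIF = {vif:.1f}")
```

A large VIF flags that the two coefficients are hard to estimate separately, but, as the abstract argues, what to do about it depends on whether the aim is prediction or explanation.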


2020, pp. 1471082X2092097
Author(s): Lauren J Beesley, Jeremy MG Taylor

Multistate modelling is a strategy for jointly modelling related time-to-event outcomes that can handle complicated outcome relationships, has appealing interpretations, can provide insight into different aspects of disease development and can be useful for making individualized predictions. A challenge with using multistate modelling in practice is the large number of parameters, and variable selection and shrinkage strategies are needed in order for these models to gain wider adoption. Application of existing selection and shrinkage strategies in the multistate modelling setting can be challenging due to complicated patterns of data missingness, inclusion of highly correlated predictors and hierarchical parameter relationships. In this article, we discuss how to modify and implement several existing Bayesian variable selection and shrinkage methods in a general multistate modelling setting. We compare the performance of these methods in terms of parameter estimation and model selection in a multistate cure model of recurrence and death in patients treated for head and neck cancer. We can view this work as a case study of variable selection and shrinkage in a complicated modelling setting with missing data.


2020, Vol 5 (1), pp. 48
Author(s): Pandit Putu Dharma, Hari Setijono, Edy Mintarto

Abstract: The purpose of this study is (a) to analyze anthropometric factors in a multivariate regression model for predicting the serve velocities of elite tennis players who competed at Roland Garros 2017, with the anthropometric factors defined as player height, weight, age, and Body Mass Index (BMI), and (b) to determine the significance level of the model in predicting serve speed. The data were collected from Match Details by IBM SlamTracker. The results show that the correlations of age, height, weight, and BMI with serve speed are -0.0094, 0.7457, 0.7135, and 0.1944, respectively. Height and weight are therefore the predictors most strongly correlated with serve speed. Moreover, the coefficient of determination (R2) of the model was 0.5767: the model accounts for 57.67% of the variation in the dependent variable, serve speed, and the remaining 42.33% is attributable to factors not included in the model. These findings underline the importance of player height and weight in determining the serve speed of players at Roland Garros 2017.
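
To put the reported correlations next to the model's R2, the short sketch below squares each correlation. The squared correlation is the R2 that a simple regression on that predictor alone would give. Only the four correlations and the model R2 come from the abstract; the single-predictor R2 values are arithmetic on those numbers, not results reported by the authors.

```python
# Correlations with serve speed as reported in the abstract.
reported_r = {
    "age": -0.0094,
    "height": 0.7457,
    "weight": 0.7135,
    "BMI": 0.1944,
}
model_r2 = 0.5767  # R2 of the full model, as reported

# R2 of a one-predictor regression equals the squared correlation.
single_r2 = {name: round(r ** 2, 4) for name, r in reported_r.items()}

for name, r2 in single_r2.items():
    print(f"{name:>6}: r^2 = {r2:.4f}")
print(f" model: R^2 = {model_r2:.4f}")
```

Height alone gives an r^2 of about 0.556, close to the full model's 0.5767, which is what one would expect when height and weight are themselves strongly correlated and so carry overlapping information.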


2020, Vol 4 (1), pp. 203-215
Author(s): Asep Andri Fauzi, Agus M. Soleh, Anik Djuraidah

Highly correlated predictors and nonlinear relationships between the response and the predictors can degrade the performance of predictive modeling, especially when using the ordinary least squares (OLS) method. A simple remedy is to use another method, such as Partial Least Squares Regression (PLSR), Support Vector Regression with a Radial Basis Function kernel (SVR-RBF), or Random Forest Regression (RFR). The purpose of this study is to compare OLS, PLSR, SVR-RBF, and RFR using simulated data. The methods were evaluated by the root mean square error of prediction (RMSEP). The results showed that, for the linear model, SVR-RBF and RFR have large RMSEP; OLS and PLSR perform better than SVR-RBF and RFR, and PLSR provides much more stable predictions than OLS when predictors are highly correlated and the sample size is small. For nonlinear data, RFR produced the smallest RMSEP when the data contain highly correlated predictors.
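
The evaluation metric named above, RMSEP, can be sketched as follows: it is the square root of the mean squared difference between observed and predicted responses on held-out data. The function name and the numbers are hypothetical, used only to show the calculation; they are not the study's simulation data.

```python
import math

def rmsep(observed, predicted):
    """Root mean square error of prediction over paired values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical held-out responses and model predictions.
y_test = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]

score = rmsep(y_test, y_pred)
print(f"RMSEP = {score:.4f}")
```

A comparison like the one in the study would compute this score for each method on the same test set, with a lower RMSEP indicating better predictive accuracy.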

