Random effects regression trees for the analysis of INVALSI data

Author(s):  
Giulia Vannucci ◽  
Anna Gottard ◽  
Leonardo Grilli ◽  
Carla Rampichini

Mixed or multilevel models exploit random effects to deal with hierarchical data, where statistical units are clustered in groups and cannot be assumed to be independent. Sometimes the assumption that a response depends linearly on a set of explanatory variables is not plausible, and model specification becomes a challenging task. Regression trees can help capture non-linear effects of the predictors. This method was extended to clustered data by modelling the fixed effects with a decision tree while accounting for the random effects with a linear mixed model in a separate step (Hajjem & Larocque, 2011; Sela & Simonoff, 2012). Random effect regression trees are shown to be less sensitive to parametric assumptions and to provide better predictive power than both linear models with random effects and regression trees without random effects. We propose a new random effect model, called the Tree embedded linear mixed model, where the regression function is piecewise-linear, consisting of the sum of a tree component and a linear component. This model can deal with non-linear effects, interaction effects and cluster mean dependencies. The proposal is the mixed effect version of semi-linear regression trees (Vannucci, 2019; Vannucci & Gottard, 2019). The model is fitted by an iterative two-stage estimation procedure, in which the fixed and the random effects are jointly estimated. The proposed model allows the effect of a given predictor to be decomposed into within-cluster and between-cluster components. We show via a simulation study and an application to INVALSI data that these extensions improve the predictive performance of the model in the presence of quasi-linear relationships, avoiding overfitting and facilitating interpretability.
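The within- and between-cluster decomposition of a predictor mentioned in this abstract is often obtained by cluster-mean centring. The abstract does not say how the authors implement it, so the following is only a minimal sketch under that assumption; the function name `within_between` and the toy data are hypothetical.

```python
from collections import defaultdict


def within_between(values, clusters):
    """Split each value into its cluster mean (between part) and
    its deviation from that cluster mean (within part)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, g in zip(values, clusters):
        sums[g] += x
        counts[g] += 1
    means = {g: sums[g] / counts[g] for g in sums}
    between = [means[g] for g in clusters]
    within = [x - means[g] for x, g in zip(values, clusters)]
    return between, within


# Hypothetical toy data: one predictor for five students in two schools
x = [4.0, 6.0, 8.0, 1.0, 3.0]
school = ["A", "A", "A", "B", "B"]
between, within = within_between(x, school)
```

Entering `between` and `within` as separate predictors lets a model estimate a distinct effect for each component, which is the usual motivation for this kind of centring.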

2020 ◽  
pp. 1-37
Author(s):  
Tal Yarkoni

Abstract Most theories and hypotheses in psychology are verbal in nature, yet their evaluation overwhelmingly relies on inferential statistical procedures. The validity of the move from qualitative to quantitative analysis depends on the verbal and statistical expressions of a hypothesis being closely aligned—that is, that the two must refer to roughly the same set of hypothetical observations. Here I argue that many applications of statistical inference in psychology fail to meet this basic condition. Focusing on the most widely used class of model in psychology—the linear mixed model—I explore the consequences of failing to statistically operationalize verbal hypotheses in a way that respects researchers' actual generalization intentions. I demonstrate that whereas the "random effect" formalism is used pervasively in psychology to model inter-subject variability, few researchers accord the same treatment to other variables they clearly intend to generalize over (e.g., stimuli, tasks, or research sites). The under-specification of random effects imposes far stronger constraints on the generalizability of results than most researchers appreciate. Ignoring these constraints can dramatically inflate false positive rates, and often leads researchers to draw sweeping verbal generalizations that lack a meaningful connection to the statistical quantities they are putatively based on. I argue that failure to take the alignment between verbal and statistical expressions seriously lies at the heart of many of psychology's ongoing problems (e.g., the replication crisis), and conclude with a discussion of several potential avenues for improvement.


2020 ◽  
pp. 1471082X2096691
Author(s):  
Amani Almohaimeed ◽  
Jochen Einbeck

Random effect models have been a mainstream statistical technique for several decades, and the same can be said of response transformation models such as the Box–Cox transformation. The latter aims to ensure that the assumptions of normality and homoscedasticity of the response distribution are fulfilled, which are essential conditions for inference based on a linear model or a linear mixed model. However, methodology for response transformation with simultaneous inclusion of random effects has rarely been developed and implemented, and is so far restricted to Gaussian random effects. We develop such methodology without requiring parametric assumptions on the distribution of the random effects. This is achieved by extending the ‘Nonparametric Maximum Likelihood’ technique towards a ‘Nonparametric profile maximum likelihood’ technique, which can deal with overdispersion as well as two-level data scenarios.
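For readers unfamiliar with the Box–Cox transformation, a minimal sketch of the standard one-parameter form follows. It illustrates only the transformation itself, not the nonparametric profile likelihood method of the paper; the function name `box_cox` is chosen here for illustration.

```python
import math


def box_cox(y, lam):
    """Standard one-parameter Box-Cox transform of a positive response y.

    Returns (y**lam - 1) / lam for lam != 0 and log(y) for lam == 0,
    which is the limit of the first expression as lam tends to 0.
    """
    if y <= 0:
        raise ValueError("Box-Cox requires a strictly positive response")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam
```

With `lam = 1` the response is only shifted by one unit, and with `lam = 0` it is log-transformed, so the parameter interpolates between "no transformation" and a log transformation.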


2018 ◽  
Vol 147 ◽  
Author(s):  
A. Aswi ◽  
S. M. Cramb ◽  
P. Moraga ◽  
K. Mengersen

Abstract Dengue fever (DF) is one of the world's most disabling mosquito-borne diseases, with a variety of approaches available to model its spatial and temporal dynamics. This paper aims to identify and compare the different spatial and spatio-temporal Bayesian modelling methods that have been applied to DF and examine influential covariates that have been reportedly associated with the risk of DF. A systematic search was performed in December 2017, using Web of Science, Scopus, ScienceDirect, PubMed, ProQuest and Medline (via Ebscohost) electronic databases. The search was restricted to refereed journal articles published in English from January 2000 to November 2017. Thirty-one articles met the inclusion criteria. Using a modified quality assessment tool, the median quality score across studies was 14/16. The most popular Bayesian statistical approach to dengue modelling was a generalised linear mixed model with spatial random effects described by a conditional autoregressive prior. A limited number of studies included spatio-temporal random effects. Temperature and precipitation were shown to often influence the risk of dengue. Developing spatio-temporal random-effect models, considering other priors, using a dataset that covers an extended time period, and investigating other covariates would help to better understand and control DF transmission.


2020 ◽  
Author(s):  
Amanda Lee ◽  
Meggan Graves ◽  
Andrea Lear ◽  
Sherry Cox ◽  
Marc Caldwell ◽  
...  

Abstract Pain management should be utilized with castration to reduce physiological and behavioral changes. Transdermal application of drugs requires less animal management and carries fewer labor risks than oral administration or injections. The objective was to determine the effects of transdermal flunixin meglumine on meat goats’ behavior post-castration. Male goats (N = 18; mean body weight ± standard deviation: 26.4 ± 1.6 kg) were housed individually in pens and randomly assigned to 1 of 3 treatments: (1) castrated, dosed with transdermal flunixin meglumine; (2) castrated, dosed with transdermal placebo; and (3) sham castrated, dosed with transdermal flunixin meglumine. Body position, rumination, and head-pressing were observed for 1 h ± 10 minutes twice daily on days −1, 0, 1, 2, and 5 around castration. Each goat was observed once every 5 minutes (scan samples), and behaviors were reported as a percentage of observations. Accelerometers were used to measure standing, lying, and laterality (total time, bouts, and bout duration). A linear mixed model was fitted using GLIMMIX. Fixed effects of treatment, day relative to castration, and the treatment × day relative to castration interaction were included, together with random effects of date and goat nested within treatment. Treatment 1 goats (32.7 ± 2.8%) and treatment 2 goats (32.5 ± 2.8%) ruminated less than treatment 3 goats (47.4 ± 2.8%, P = 0.0012). Head pressing was greater on the day of castration in treatment 2 goats (P < 0.001). Standing bout duration was greatest in treatment 2 goats on day 1 post-castration (P < 0.001). Lying bout duration was greatest in treatment 2 goats on day 1 post-castration compared to treatment 1 and treatment 3 goats (P < 0.001). Transdermal flunixin meglumine improved goats’ fluidity of movement post-castration and decreased head pressing, indicating a mitigation of pain behavior.


Stats ◽  
2018 ◽  
Vol 1 (1) ◽  
pp. 48-76
Author(s):  
Freddy Hernández ◽  
Viviana Giampaoli

Mixed models are useful tools for analyzing clustered and longitudinal data. These models assume that random effects are normally distributed. However, this assumption may be unrealistic or too restrictive to represent the information in the data. Several papers have quantified the impact of misspecifying the shape of the random effects distribution in mixed models. Notably, these studies concentrated primarily on models with response variables that have normal, logistic and Poisson distributions, and the results were not conclusive. We therefore investigated the misspecification of the shape of the random effects in a Weibull regression mixed model with random intercepts in the two parameters of the Weibull distribution. Through an extensive simulation study considering six random effect distributions and assuming normality for the random effects in the estimation procedure, we found that misspecification affects the estimates of the fixed effects associated with the second parameter σ of the Weibull distribution. The variance components of the model were also affected by the misspecification.


2020 ◽  
Author(s):  
Brandon LeBeau

The linear mixed model is commonly used for longitudinal or nested data because it can account for the dependency in such data. Researchers typically rely on the random effects to adequately account for the dependency due to correlated data; however, serial correlation can also be used. If the random effect structure is misspecified (perhaps due to convergence problems), can the addition of serial correlation overcome this misspecification and allow for unbiased estimation and accurate inferences? This study explored this question with a simulation. Simulation results show that the fixed effects are unbiased; however, the empirical type I error rate is inflated when a random effect is missing from the model. Implications for applied researchers are discussed.
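To make "serial correlation" concrete, the sketch below builds a first-order autoregressive, AR(1), correlation matrix for the residuals of one subject. The abstract does not state which serial correlation structure was studied, so AR(1) is an assumption made here for illustration, and the function name is hypothetical.

```python
def ar1_correlation(n_times, rho):
    """Return the n_times x n_times AR(1) correlation matrix,
    where corr(e_t, e_s) = rho ** |t - s|."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return [[rho ** abs(t - s) for s in range(n_times)]
            for t in range(n_times)]


# Hypothetical example: four equally spaced measurement occasions
R = ar1_correlation(4, 0.5)
```

Under this structure the correlation decays geometrically with the time lag, whereas a random intercept alone implies the same correlation between any two occasions.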


2018 ◽  
Vol 18 (2) ◽  
pp. 303-310 ◽  
Author(s):  
Mervyn Travers ◽  
Penny Moss ◽  
William Gibson ◽  
Dana Hince ◽  
Sheree Yorke ◽  
...  

Abstract Background and aims: Exercise-induced hypoalgesia (EIH) is a well-established phenomenon in pain-free individuals that describes a decrease in pain sensitivity after an acute bout of exercise. The EIH response has been demonstrated to be sub-optimal in the presence of persisting pain. Menstrual pain is a common recurrent painful problem with many women experiencing high levels of pain each cycle. However, the EIH response has not been examined in a cohort of women with high levels of menstrual pain. This research aimed to examine whether EIH manifests differently in women with varying levels of menstrual pain. The primary hypothesis was that women with high levels of menstrual pain would demonstrate compromised EIH. Secondary aims were to explore relationships between EIH and emotional state, sleep quality, body mass index (BMI) or physical activity levels. Methods: Pressure pain thresholds (PPT) were measured in 64 participants using a digital handheld algometer before and after a submaximal isometric-handgrip exercise. EIH index was compared between low (VAS 0–3), moderate (VAS 4–7) and high (VAS 8–10) pain groups, using a linear mixed model analysis with participant as a random effect, and site, menstrual pain category and the interaction between the two, as fixed effects. Results: EIH was consistently induced in all groups. However, there was no statistically significant difference between the pain groups for EIH index (p=0.835) or for any co-variates (p>0.05). Conclusions: EIH was not found to differ between women who report regular low, moderate or high levels of menstrual pain, when measured at a point in their menstrual cycle when they are pain free. Implications: This study provides insight that EIH does not vary in women with differing levels of menstrual pain when they are not currently experiencing pain. 
The current findings indicate that, although menstrual pain can involve regular episodes of high pain levels, it may not be associated with the same central nervous system dysfunctions as seen in sustained chronic pain conditions.


2021 ◽  
Vol 21 (2) ◽  
pp. 72-80
Author(s):  
ASEP RUSYANA ◽  
KHAIRIL ANWAR NOTODIPUTRO ◽  
BAGUS SARTONO

The Generalized Linear Mixed Model (GLMM) is a framework that combines a response variable, fixed effects, and random effects. The response variable follows a distribution from the exponential family, whereas the random effects are normally distributed. Parameters can be estimated by maximum likelihood using either the Laplace approximation or Gauss-Hermite Quadrature (GHQ). The purpose of this study was to identify factors that trigger students' interest in continuing their studies at Universitas Syiah Kuala (USK) using both techniques. The GLMM is suitable for the data because the response variable has a Bernoulli distribution and the random effects are assumed to be normally distributed. The model also helps identify the relationship between the dependent variable and the predictors. This study uses data from six high schools in Banda Aceh city drawn using a two-stage sampling technique. In stage 1, we randomly chose six out of sixteen public senior high schools in Banda Aceh. In stage 2, we selected students in each school from four different major classes. The GLMM includes one binary response variable, five numerical fixed effects, and two random effects. The response variable is the interest of high school students in continuing their studies at USK (yes or no). The five fixed effects in the model are the scores for Collaboration (C), Action (A), Emotion (E), Purposes (P), and Hope (H). The random effects are schools (S) and majors (M). In this study, the Laplace and GHQ techniques produce identical results. The predictors that explain student interest are A, E, and H, all of which have a positive effect. The random effects of schools and majors are not significantly different from zero. The model with the three significant predictors is better than the model with all predictors.
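As a rough illustration of how Gauss-Hermite quadrature enters GLMM estimation, the sketch below approximates the marginal likelihood of one cluster in a logistic model with a single normal random intercept, using a fixed three-node rule. This is a toy under stated assumptions: the study has two random effects, and real software typically uses more nodes and adaptive centring (the Laplace approximation corresponds to adaptive quadrature with a single node). The function name and inputs are hypothetical.

```python
import math

# Three-point Gauss-Hermite rule for the weight function exp(-x^2)
NODES = (-math.sqrt(1.5), 0.0, math.sqrt(1.5))
WEIGHTS = (math.sqrt(math.pi) / 6.0,
           2.0 * math.sqrt(math.pi) / 3.0,
           math.sqrt(math.pi) / 6.0)


def cluster_likelihood(y, eta, sigma):
    """Approximate the integral over u ~ N(0, sigma^2) of
    prod_i Bernoulli(y_i | logistic(eta_i + u)) for one cluster.

    y     : list of 0/1 responses in the cluster
    eta   : list of fixed-effect linear predictors, one per response
    sigma : standard deviation of the random intercept
    """
    total = 0.0
    for x, w in zip(NODES, WEIGHTS):
        u = math.sqrt(2.0) * sigma * x  # change of variable to N(0, sigma^2)
        lik = 1.0
        for yi, ei in zip(y, eta):
            p = 1.0 / (1.0 + math.exp(-(ei + u)))
            lik *= p if yi == 1 else 1.0 - p
        total += w * lik
    return total / math.sqrt(math.pi)
```

The full log-likelihood would sum the logarithm of this quantity over clusters and be maximised over the fixed effects and `sigma`.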


Animals ◽  
2021 ◽  
Vol 12 (1) ◽  
pp. 26
Author(s):  
Ali Hardan ◽  
Philip C. Garnsworthy ◽  
Matt J. Bell

The aim of this study was to investigate the use of signal processing to detect eructation peaks in CH4 released by cows during robotic milking, and to compare recordings from three gas analysers (Guardian SP and NG, and IRMAX) differing in volume of air sampled and response time. To allow comparison of gas analysers using the signal processing approach, CH4 in air (parts per million) was measured by each analyser at the same time and continuously every second from the feed bin of a robotic milking station. Peak analysis software was used to extract maximum CH4 amplitude (ppm) from the concentration signal during each milking. A total of 5512 CH4 spot measurements were recorded from 65 cows during three consecutive sampling periods. Data were analysed with a linear mixed model including analyser × period, parity, and days in milk as fixed effects, and cow ID as a random effect. In period one, air sampling volume and recorded CH4 concentration were the same for all analysers. In periods two and three, air sampling volume was increased for IRMAX, resulting in higher CH4 concentrations recorded by IRMAX and lower concentrations recorded by Guardian SP (p < 0.001), particularly in period three, but no change in average concentrations measured by Guardian NG across periods. Measurements by Guardian SP and IRMAX had the highest correlation; Guardian SP and NG produced similar repeatability and detected more variation among cows compared with IRMAX. The findings show that signal processing can provide a reliable and accurate means to detect CH4 eructations from animals when using different gas analysers.
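The abstract does not describe the peak analysis software, so the following is only a generic sketch of the idea: locate local maxima above a threshold in a per-second concentration signal and take the largest as the maximum amplitude for a milking. The function name, threshold, and readings are all hypothetical.

```python
def find_peaks(signal, min_height):
    """Return (index, value) pairs for local maxima of at least min_height.

    A point counts as a peak if it is strictly higher than its left
    neighbour and at least as high as its right neighbour, so only the
    first sample of a flat-topped peak is reported.
    """
    peaks = []
    for i in range(1, len(signal) - 1):
        if signal[i] >= min_height and signal[i - 1] < signal[i] >= signal[i + 1]:
            peaks.append((i, signal[i]))
    return peaks


# Hypothetical per-second CH4 readings (ppm) during one milking
ch4_ppm = [210, 230, 480, 300, 250, 620, 640, 410, 220]
peaks = find_peaks(ch4_ppm, min_height=400)
max_amplitude = max(value for _, value in peaks)
```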


2019 ◽  
Vol 97 (Supplement_3) ◽  
pp. 172-172
Author(s):  
Morgan Van Davelaar ◽  
David R Notter ◽  
D Lee Wright ◽  
Anna M Zajac ◽  
Scott P Greiner ◽  
...  

Abstract The objective was to determine the magnitude of yearly differences in parasite load in growing ram lambs. Data were obtained from the Southwest Virginia Agriculture Research and Extension Center in Glade Spring, VA. The center conducts a forage-based ram growth test during the summer where rams also undergo a parasite challenge. Data consisted of 488 Katahdin rams tested from 2012 to 2018. Rams were dewormed at delivery, and at the start of the test, each ram received an oral dose of 5,000 H. contortus larvae adjusted for body weight. Fecal egg count was measured 70 d later when rams were on average (SD) 200 (18) d old. Fecal egg counts were not normally distributed. The Box-Cox procedure indicated that a log transformation was appropriate, but the residuals were not normally distributed for several linear models. A zero-inflated negative binomial generalized linear mixed model was used for data analysis with the glmmTMB package in R. The model included fixed effects of centered and scaled weight, centered and scaled age, year, birth type, and rearing type, and the random effect of consignor. Birth type, rearing type, and age were not significant (P > 0.10). The least squares mean (SE) fecal egg counts by year were 344 (118) for 2012, 623 (214) for 2013, 574 (195) for 2014, 1,125 (409) for 2015, 745 (253) for 2016, 408 (142) for 2017, and 239 (86) for 2018. Differences in summer precipitation could affect average parasite load. Despite similar total summer precipitation, 2012 had 1 extremely wet month whereas 2015 had consistent precipitation throughout the summer. We conclude that producers should not compare fecal egg counts across years because the overall average may differ several-fold from year to year. Ranking rams within year will be more effective for selecting for improved parasite resistance.
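For orientation, a zero-inflated negative binomial distribution mixes a point mass at zero with a negative binomial count distribution. The sketch below evaluates its probability mass function using the mean and dispersion parameterization (variance mu + mu^2 / theta). Which negative binomial variant the authors fitted is not stated in the abstract, so this parameterization is an assumption, and the function name and parameter values are hypothetical.

```python
import math


def zinb_pmf(k, mu, theta, pi_zero):
    """P(Y = k) for a zero-inflated negative binomial distribution.

    mu      : mean of the negative binomial component
    theta   : dispersion (variance of that component is mu + mu**2 / theta)
    pi_zero : probability of a structural (excess) zero
    """
    log_nb = (math.lgamma(k + theta) - math.lgamma(theta) - math.lgamma(k + 1)
              + theta * math.log(theta / (theta + mu))
              + k * math.log(mu / (theta + mu)))
    nb = math.exp(log_nb)
    if k == 0:
        return pi_zero + (1.0 - pi_zero) * nb
    return (1.0 - pi_zero) * nb


# Hypothetical parameter values, chosen only to exercise the function
total = sum(zinb_pmf(k, mu=5.0, theta=0.8, pi_zero=0.2) for k in range(2000))
mean = sum(k * zinb_pmf(k, mu=5.0, theta=0.8, pi_zero=0.2) for k in range(2000))
```

The overall mean is `(1 - pi_zero) * mu`, so excess zeros pull the average count down relative to the count component alone.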

