The generalizability crisis

2020 ◽  
pp. 1-37
Author(s):  
Tal Yarkoni

Abstract Most theories and hypotheses in psychology are verbal in nature, yet their evaluation overwhelmingly relies on inferential statistical procedures. The validity of the move from qualitative to quantitative analysis depends on the verbal and statistical expressions of a hypothesis being closely aligned—that is, that the two must refer to roughly the same set of hypothetical observations. Here I argue that many applications of statistical inference in psychology fail to meet this basic condition. Focusing on the most widely used class of model in psychology—the linear mixed model—I explore the consequences of failing to statistically operationalize verbal hypotheses in a way that respects researchers' actual generalization intentions. I demonstrate that whereas the "random effect" formalism is used pervasively in psychology to model inter-subject variability, few researchers accord the same treatment to other variables they clearly intend to generalize over (e.g., stimuli, tasks, or research sites). The under-specification of random effects imposes far stronger constraints on the generalizability of results than most researchers appreciate. Ignoring these constraints can dramatically inflate false positive rates, and often leads researchers to draw sweeping verbal generalizations that lack a meaningful connection to the statistical quantities they are putatively based on. I argue that failure to take the alignment between verbal and statistical expressions seriously lies at the heart of many of psychology's ongoing problems (e.g., the replication crisis), and conclude with a discussion of several potential avenues for improvement.
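The inflation of false positive rates described in this abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: the design (20 subjects, 10 stimuli per condition, stimuli nested within condition), the variance values, and the function names `simulate_once` and `false_positive_rates` are all assumptions chosen for illustration. There is no true condition effect, so every significant result is a false positive. A by-subject paired t-test, which implicitly treats the sampled stimuli as fixed, is compared with a by-stimulus t-test, which treats them as a random sample.

```python
import random
from statistics import fmean, stdev, variance

# Two-sided 5% critical values of Student's t for the default design below
# (df = 19 for 20 subjects, df = 18 for 2 x 10 stimuli).
T_CRIT_DF19 = 2.093
T_CRIT_DF18 = 2.101


def simulate_once(rng, n_subj=20, n_items=10, sd_item=0.5, sd_noise=1.0):
    """Simulate one null experiment; return (by_subject_sig, by_item_sig)."""
    # Each condition uses its own random sample of stimuli (hypothetical design).
    item_fx = [[rng.gauss(0.0, sd_item) for _ in range(n_items)] for _ in range(2)]
    subj_fx = [rng.gauss(0.0, 1.0) for _ in range(n_subj)]
    # y[c][s][i]: response of subject s to stimulus i of condition c.
    y = [[[subj_fx[s] + item_fx[c][i] + rng.gauss(0.0, sd_noise)
           for i in range(n_items)]
          for s in range(n_subj)]
         for c in range(2)]

    # By-subject analysis: paired t-test on per-subject condition means.
    diffs = [fmean(y[0][s]) - fmean(y[1][s]) for s in range(n_subj)]
    t_subj = fmean(diffs) / (stdev(diffs) / n_subj ** 0.5)

    # By-stimulus analysis: pooled two-sample t-test on per-stimulus means.
    means = [[fmean(y[c][s][i] for s in range(n_subj)) for i in range(n_items)]
             for c in range(2)]
    pooled_var = (variance(means[0]) + variance(means[1])) / 2.0
    t_item = (fmean(means[0]) - fmean(means[1])) / (2.0 * pooled_var / n_items) ** 0.5

    return abs(t_subj) > T_CRIT_DF19, abs(t_item) > T_CRIT_DF18


def false_positive_rates(n_sims=1000, seed=1):
    """Return (by-subject, by-stimulus) rejection rates under the null."""
    rng = random.Random(seed)
    hits = [simulate_once(rng) for _ in range(n_sims)]
    return (sum(h[0] for h in hits) / n_sims,
            sum(h[1] for h in hits) / n_sims)


if __name__ == "__main__":
    by_subject, by_item = false_positive_rates()
    print(f"by-subject false positive rate:  {by_subject:.3f}")
    print(f"by-stimulus false positive rate: {by_item:.3f}")
```

Under these assumed settings the by-subject test should reject far more often than the nominal 5%, while the by-stimulus test should stay close to it. The by-stimulus test is adequate here only because the simulation includes no subject-by-condition variability; a mixed model with random effects for both subjects and stimuli would handle both sources of variation in one analysis.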


2020 ◽  
pp. 1471082X2096691
Author(s):  
Amani Almohaimeed ◽  
Jochen Einbeck

Random effect models have been a mainstream statistical technique for several decades, and the same can be said of response transformation models such as the Box–Cox transformation. The latter aims at ensuring that the assumptions of normality and of homoscedasticity of the response distribution are fulfilled, which are essential conditions for inference based on a linear model or a linear mixed model. However, methodology for response transformation with simultaneous inclusion of random effects has rarely been developed or implemented, and is so far restricted to Gaussian random effects. We develop such methodology without requiring parametric assumptions on the distribution of the random effects. This is achieved by extending the ‘Nonparametric Maximum Likelihood’ technique towards a ‘Nonparametric profile maximum likelihood’ technique, which makes it possible to deal with overdispersion as well as two-level data scenarios.
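For readers unfamiliar with the Box–Cox transformation mentioned above, the following is a minimal sketch of the ordinary transformation and a grid search over its profile log-likelihood. It covers only the plain case of an intercept-only normal model with no random effects, so it is not the authors' nonparametric profile maximum likelihood method. The function names and the grid of candidate values are assumptions made for illustration.

```python
import math
import random


def box_cox(y, lam):
    """Box-Cox transform of a positive value y with parameter lam."""
    if abs(lam) < 1e-12:
        return math.log(y)  # limiting case as lam -> 0
    return (y ** lam - 1.0) / lam


def profile_loglik(ys, lam):
    """Profile log-likelihood of lam for an intercept-only normal model."""
    n = len(ys)
    z = [box_cox(y, lam) for y in ys]
    mean_z = sum(z) / n
    sigma2_hat = sum((v - mean_z) ** 2 for v in z) / n  # ML variance estimate
    log_jacobian = (lam - 1.0) * sum(math.log(y) for y in ys)
    return -0.5 * n * math.log(sigma2_hat) + log_jacobian


def estimate_lambda(ys, grid=None):
    """Pick the grid value of lam that maximizes the profile log-likelihood."""
    if grid is None:
        grid = [k / 20.0 for k in range(-40, 41)]  # -2.0 to 2.0, step 0.05
    return max(grid, key=lambda lam: profile_loglik(ys, lam))


if __name__ == "__main__":
    rng = random.Random(0)
    # Log-normal data: the log transform (lam = 0) makes it exactly normal.
    ys = [rng.lognormvariate(0.0, 1.0) for _ in range(500)]
    print("estimated lambda:", estimate_lambda(ys))
```

With log-normal input the selected value should fall near zero, which corresponds to the log transform.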


2018 ◽  
Vol 147 ◽  
Author(s):  
A. Aswi ◽  
S. M. Cramb ◽  
P. Moraga ◽  
K. Mengersen

Abstract Dengue fever (DF) is one of the world's most disabling mosquito-borne diseases, with a variety of approaches available to model its spatial and temporal dynamics. This paper aims to identify and compare the different spatial and spatio-temporal Bayesian modelling methods that have been applied to DF and examine influential covariates that have been reportedly associated with the risk of DF. A systematic search was performed in December 2017, using Web of Science, Scopus, ScienceDirect, PubMed, ProQuest and Medline (via Ebscohost) electronic databases. The search was restricted to refereed journal articles published in English from January 2000 to November 2017. Thirty-one articles met the inclusion criteria. Using a modified quality assessment tool, the median quality score across studies was 14/16. The most popular Bayesian statistical approach to dengue modelling was a generalised linear mixed model with spatial random effects described by a conditional autoregressive prior. A limited number of studies included spatio-temporal random effects. Temperature and precipitation were shown to often influence the risk of dengue. Developing spatio-temporal random-effect models, considering other priors, using a dataset that covers an extended time period, and investigating other covariates would help to better understand and control DF transmission.


Author(s):  
Giulia Vannucci ◽  
Anna Gottard ◽  
Leonardo Grilli ◽  
Carla Rampichini

Mixed or multilevel models exploit random effects to deal with hierarchical data, where statistical units are clustered in groups and cannot be assumed independent. Sometimes, the assumption of linear dependence of a response on a set of explanatory variables is not plausible, and model specification becomes a challenging task. Regression trees can be helpful to capture non-linear effects of the predictors. This method was extended to clustered data by modelling the fixed effects with a decision tree while accounting for the random effects with a linear mixed model in a separate step (Hajjem & Larocque, 2011; Sela & Simonoff, 2012). Random effect regression trees are shown to be less sensitive to parametric assumptions and provide improved predictive power compared to linear models with random effects and regression trees without random effects. We propose a new random effect model, called Tree embedded linear mixed model, where the regression function is piecewise-linear, consisting of the sum of a tree component and a linear component. This model can deal with both non-linear and interaction effects and cluster mean dependencies. The proposal is the mixed effect version of the semi-linear regression trees (Vannucci, 2019; Vannucci & Gottard, 2019). Model fitting is obtained by an iterative two-stage estimation procedure, where both the fixed and the random effects are jointly estimated. The proposed model allows a decomposition of the effect of a given predictor within and between clusters. We will show via a simulation study and an application to INVALSI data that these extensions improve the predictive performance of the model in the presence of quasi-linear relationships, avoiding overfitting, and facilitating interpretability.


2020 ◽  
Author(s):  
Amanda Lee ◽  
Meggan Graves ◽  
Andrea Lear ◽  
Sherry Cox ◽  
Marc Caldwell ◽  
...  

Abstract Pain management should be utilized with castration to reduce physiological and behavioral changes. Transdermal application of drugs requires less animal management and poses fewer of the labor risks that can occur with oral administration or injections. The objective was to determine the effects of transdermal flunixin meglumine on meat goats’ behavior post-castration. Male goats (N = 18; mean body weight ± standard deviation: 26.4 ± 1.6 kg) were housed individually in pens and randomly assigned to 1 of 3 treatments: (1) castrated, dosed with transdermal flunixin meglumine; (2) castrated, dosed with transdermal placebo; and (3) sham castrated, dosed with transdermal flunixin meglumine. Body position, rumination, and head-pressing were observed for 1 h ± 10 minutes twice daily on days −1, 0, 1, 2, and 5 around castration. Each goat was observed once every 5 minutes (scan samples), and behaviors were reported as a percentage of observations. Accelerometers were used to measure standing, lying, and laterality (total time, bouts, and bout duration). A linear mixed model was fitted using GLIMMIX. Fixed effects of treatment, day relative to castration, and treatment*day relative to castration and random effects of date and goat nested within treatment were included. Treatment 1 goats (32.7 ± 2.8%) and treatment 2 goats (32.5 ± 2.8%) ruminated less than treatment 3 goats (47.4 ± 2.8%, P = 0.0012). Head pressing was greater on day of castration in treatment 2 goats (P < 0.001). Standing bout duration was greatest in treatment 2 goats on day 1 post-castration (P < 0.001). Lying bout duration was greatest in treatment 2 goats on day 1 post-castration compared to treatment 1 and treatment 3 goats (P < 0.001). Transdermal flunixin meglumine improved goats’ fluidity of movement post-castration and decreased head pressing, indicating a mitigation of pain behavior.


Parasitology ◽  
2001 ◽  
Vol 122 (5) ◽  
pp. 563-569 ◽  
Author(s):  
D. A. ELSTON ◽  
R. MOSS ◽  
T. BOULINIER ◽  
C. ARROWSMITH ◽  
X. LAMBIN

The statistical aggregation of parasites among hosts is often described empirically by the negative binomial (Poisson-gamma) distribution. Alternatively, the Poisson-lognormal model can be used. This has the advantage that it can be fitted as a generalized linear mixed model, thereby quantifying the sources of aggregation in terms of both fixed and random effects. We give a worked example, assigning aggregation in the distribution of sheep ticks Ixodes ricinus on red grouse Lagopus lagopus scoticus chicks to temporal (year), spatial (altitude and location), brood and individual effects. Apparent aggregation among random individuals in random broods fell 8-fold when spatial and temporal effects had been accounted for.
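The two count models named above are both Poisson mixtures, which a short simulation can make concrete. The sketch below is only an illustration under assumed parameter values (mean count 2, gamma shape 1, log-scale standard deviation 1) with hypothetical function names; it does not fit a generalized linear mixed model and does not reproduce the authors' tick data. It draws counts from a plain Poisson, a Poisson-gamma (negative binomial), and a Poisson-lognormal model, then reports the variance-to-mean ratio, which exceeds 1 when counts are aggregated.

```python
import math
import random
from statistics import fmean, variance


def poisson(rng, rate):
    """Draw a Poisson count with Knuth's method (adequate for modest rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def variance_to_mean(counts):
    """Sample variance divided by sample mean; about 1 for Poisson counts."""
    return variance(counts) / fmean(counts)


def simulate(n=5000, mean=2.0, shape=1.0, sigma=1.0, seed=0):
    """Return variance-to-mean ratios for three count models with equal means."""
    rng = random.Random(seed)
    plain = [poisson(rng, mean) for _ in range(n)]
    # Negative binomial: the Poisson rate is gamma distributed (scale = mean/shape).
    pg = [poisson(rng, rng.gammavariate(shape, mean / shape)) for _ in range(n)]
    # Poisson-lognormal: the log rate is normal; mu keeps the expected rate at `mean`.
    mu = math.log(mean) - 0.5 * sigma ** 2
    pln = [poisson(rng, rng.lognormvariate(mu, sigma)) for _ in range(n)]
    return {
        "poisson": variance_to_mean(plain),
        "poisson_gamma": variance_to_mean(pg),
        "poisson_lognormal": variance_to_mean(pln),
    }


if __name__ == "__main__":
    for name, ratio in simulate().items():
        print(f"{name}: variance/mean = {ratio:.2f}")
```

In the Poisson-lognormal case the random term sits on the log scale of the rate, which is why that model can be written as a generalized linear mixed model with a log link and normally distributed random effects.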


Stats ◽  
2018 ◽  
Vol 1 (1) ◽  
pp. 48-76
Author(s):  
Freddy Hernández ◽  
Viviana Giampaoli

Mixed models are useful tools for analyzing clustered and longitudinal data. These models assume that random effects are normally distributed. However, this may be unrealistic or restrictive when representing information of the data. Several papers have been published to quantify the impacts of misspecification of the shape of the random effects in mixed models. Notably, these studies primarily concentrated their efforts on models with response variables that have normal, logistic and Poisson distributions, and the results were not conclusive. As such, we investigated the misspecification of the shape of the random effects in a Weibull regression mixed model with random intercepts in the two parameters of the Weibull distribution. Through an extensive simulation study considering six random effect distributions and assuming normality for the random effects in the estimation procedure, we found an impact of misspecification on the estimations of the fixed effects associated with the second parameter σ of the Weibull distribution. Additionally, the variance components of the model were also affected by the misspecification.


2000 ◽  
Vol 40 (7) ◽  
pp. 969 ◽  
Author(s):  
R. A. Lawes ◽  
M. K. Wegener ◽  
K. E. Basford ◽  
R. J. Lawn

Commercial cane sugar (CCS), as measured by sugar mills, is in decline in the wet tropics of Australia. One of these mills, Tully Sugar Ltd, has measured CCS in the factory as required by legislation and also measured whole clean stalk CCS through a small mill, which is free of contaminants. ‘Factory CCS’ measures the CCS of cane entering the mill, after it has been harvested. The harvesting and transport process delivers to the mill cane that is contaminated by extraneous matter such as leaf material and soil. Between 1988 and 1998, 1516 blocks were sampled for ‘small mill CCS’. These data were combined with block productivity information to determine the trends in small mill CCS and factory CCS using a linear mixed model analysis as the data were unbalanced. Other data, including the date of harvest for factory CCS, date of sampling for small mill CCS, farm of origin and cane variety were available and fitted as random effects in the mixed model. Year was fixed to determine time related trends in the 2 measures of CCS. Small mill CCS was higher than factory CCS and remained constant from 1988 to 1998. Predicted factory CCS declined from 12.76 units in 1988 to 10.91 units in 1998. We conclude that the CCS levels in whole clean stalks were actually stable, since small mill CCS remained constant over the 10-year period. Possible reasons for the differences in the trends for the 2 CCS measures are discussed.


2019 ◽  
Vol 97 (Supplement_2) ◽  
pp. 216-216
Author(s):  
Mariana Boscato Menegat ◽  
Joel M DeRouchey ◽  
Jason C Woodworth ◽  
Mike D Tokach ◽  
Steve S Dritz ◽  
...  

Abstract This study was conducted to determine the effects of a multi-species direct-fed microbial (DFM) product based on lactic acid bacteria and Bacillus subtilis on growth performance and carcass characteristics of grow-finish pigs. A total of 1,188 pigs (PIC 359 × 1050; initially 25.8 kg BW) were used in a 121-d growth trial with 27 pigs/pen and 22 pens/treatment. Pigs were allotted to treatments based on initial BW in a randomized complete block design. Treatments included a control diet and the control diet with added DFM (BiOWiSH Technologies Inc., Cincinnati, OH) included at 0.055% of the diet at the expense of corn. Diets were based on corn, distillers dried grains with solubles, and soybean meal and fed in four dietary phases. Data were analyzed using a linear mixed model (PROC GLIMMIX, SAS®) with treatment as fixed effect, block as random effect, and pen as experimental unit. Overall (d 0 to 121), pigs fed the control diet had greater ADG (P < 0.05) and final BW (P < 0.001) compared to pigs fed the DFM diet (Table 1). There was no evidence for differences (P > 0.05) in ADFI or G:F between treatments. The difference in final BW resulted in heavier (P < 0.05) HCW in control pigs compared to DFM pigs, but no evidence for differences (P > 0.05) was observed in carcass yield, backfat, loin depth, and percentage lean between treatments. In conclusion, the inclusion of this multi-species DFM in growing-finishing diets reduced ADG in this commercial study. This response could be related to inclusion rate, feeding duration, or other factors not identified in this study, warranting further research to characterize the effects on pig performance.


Biometrics ◽  
2004 ◽  
Vol 60 (4) ◽  
pp. 945-953 ◽  
Author(s):  
Wendimagegn Ghidey ◽  
Emmanuel Lesaffre ◽  
Paul Eilers
