Random Effects Won’t Solve the Problem of Generalizability

2021 ◽  
Author(s):  
Adam Bear ◽  
Jonathan Scott Phillips

Yarkoni argues that researchers making broad inferences often use impoverished statistical models that fail to include important sources of variation as random effects. We argue, however, that for many common study designs random effects are inappropriate and insufficient to draw general inferences, as the source of variation is not random, but systematic.

2017 ◽  
Vol 27 (4) ◽  
pp. 321-327 ◽  
Author(s):  
Signe M. Jensen ◽  
Christian Andreasen ◽  
Jens C. Streibig ◽  
Eshagh Keshtkar ◽  
Christian Ritz

Abstract In recent years germination experiments have become increasingly complex. Typically, they are replicated in time as independent runs, and at each time point they involve hierarchical, often factorial experimental designs, which are now commonly analysed by means of linear mixed models. However, in order to characterize germination in response to time elapsed, specific event-time models are needed, and mixed model extensions of these models are not readily available, either in theory or in practice. As a practical workaround we propose a two-step approach that combines and weighs together results from event-time models fitted separately to data from each germination test by means of meta-analytic random effects models. We show that this approach provides a more appropriate appreciation of the sources of variation in hierarchically structured germination experiments, as both between- and within-experiment variation may be recovered from the data.
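The second step of the two-step approach can be sketched as a random-effects meta-analysis of per-run estimates. The sketch below uses the DerSimonian-Laird moment estimator, which is one common choice; the abstract does not say which estimator the authors used. The function name and the input numbers are hypothetical (for example, an event-time model parameter and its squared standard error from each independent run).

```python
import math

def pool_random_effects(estimates, variances):
    """Pool per-run estimates with a DerSimonian-Laird random-effects model.

    estimates: parameter estimate from each separately fitted run
    variances: squared standard error of each estimate (within-run variance)
    Returns (pooled estimate, its standard error, between-run variance tau2).
    """
    k = len(estimates)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two estimates with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled mean.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w

    # Cochran's Q measures heterogeneity between runs.
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, estimates))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w

    # Moment estimate of the between-run variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau2 to each within-run variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, estimates)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2

# Hypothetical estimates from three independent runs (made-up numbers).
pooled, se, tau2 = pool_random_effects([4.1, 5.0, 4.6], [0.04, 0.09, 0.05])
print(f"pooled={pooled:.3f} se={se:.3f} tau2={tau2:.3f}")
```

Here `tau2` plays the role of the between-experiment variation and the supplied variances carry the within-experiment variation, which is how both sources can be recovered from separately fitted models.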


Author(s):  
Mohamed Elhadi Rahmani ◽  
Abdelmalek Amine

Computer modeling of ecological systems is the activity of implementing computer solutions to analyze data related to the fields of remote sensing, earth science, biology, and oceans. Ecologists analyze the data to identify the relationships between a response and a set of predictors, using statistical models that do not accurately describe the main sources of variation in the response variable. Knowledge discovery techniques are often more powerful, flexible, and effective for exploratory analysis than statistical techniques. This chapter aims to test the use of data mining in ecology. It discusses the exploration of ecological data by first defining data mining, its advantages, and its different types. The authors then detail the field of bio-inspiration and meta-heuristics. Finally, they give case studies in which they applied these two areas to explore ecological data.


2021 ◽  
Vol 99 (Supplement_3) ◽  
pp. 240-241
Author(s):  
Hinayah R Oliveira ◽  
Stephen P Miller ◽  
Luiz F Brito ◽  
Flavio S Schenkel

Abstract The goals of this study were to develop a genetic evaluation system for a novel trait called functional heifer longevity (FHL), and to determine whether this novel trait is heritable. The FHL trait was defined as binary: heifers received the code 1 if they had calved by the end of their third year (n = 377,938), or 0 if they were culled/sold during this period (n = 368,308). Analyses were performed using linear animal models and Bayesian inference. The significant systematic effects included in the statistical models were birth by embryo transfer, year-season of birth, and age at calving (in months). Three models, differing according to their random effects (i.e., reduced model, which included only herd-year-season and additive genetic random effects; maternal genetic model, which added maternal genetic effects; and complete model, which further added maternal permanent environmental effects), were compared based on the deviance information criterion (DIC) and the estimates of genetic parameters. The reduced model was preferred according to the DIC values. However, high maternal heritabilities were estimated using the maternal genetic (0.51) and complete (0.36) models, indicating that maternal effects can impact the selection of heifers for breeding. Similar additive genetic heritabilities were estimated among the three models (0.24, 0.27, and 0.25 using the reduced, maternal genetic, and complete models, respectively), and no significant re-ranking of selection candidates was observed based on their additive genetic breeding values. Total heritabilities and correlations estimated between additive genetic and maternal genetic effects were 0.37 and -0.28 for the maternal genetic model, and 0.31 and -0.27 for the complete model, respectively. This study shows that FHL is heritable, and that including maternal effects in the statistical models might be important.
These results contribute to a larger project studying the genetics of female longevity in Angus cattle.
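The reported total heritabilities can be checked against the other estimates in the abstract. The abstract does not state which formula was used; the sketch below assumes Willham's definition, in which total heritability is (additive variance + 1.5 × additive-maternal covariance + 0.5 × maternal variance) divided by phenotypic variance. The function name is hypothetical. Under that assumption, the rounded inputs give values close to the reported 0.37 and 0.31.

```python
import math

def total_heritability(h2_additive, h2_maternal, r_am):
    """Total heritability under Willham's definition (an assumption here).

    h2_additive: additive genetic variance / phenotypic variance
    h2_maternal: maternal genetic variance / phenotypic variance
    r_am: correlation between additive and maternal genetic effects
    """
    # Additive-maternal covariance expressed as a fraction of phenotypic variance.
    cov_am = r_am * math.sqrt(h2_additive * h2_maternal)
    return h2_additive + 1.5 * cov_am + 0.5 * h2_maternal

# Rounded estimates taken from the abstract.
print(f"maternal genetic model: {total_heritability(0.27, 0.51, -0.28):.3f}")
print(f"complete model:         {total_heritability(0.25, 0.36, -0.27):.3f}")
```

The negative additive-maternal correlation is what pulls the total below the simple sum of the additive and half the maternal heritability.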


Author(s):  
Thomas J Glorioso ◽  
Gary K Grunwald ◽  
Colin O’Donnell ◽  
Wenhui Liu ◽  
Thomas M Maddox ◽  
...  

Background: Multilevel models for non-normal outcomes are widely used in outcomes research to estimate effects of covariates on outcomes, e.g. hospital readmissions following percutaneous coronary intervention (PCI). We refine Reference Effect Measures (REM) to compare effects of individual covariates, sets of covariates and random effects on the same scale (e.g. odds ratio, OR) and present a novel approach for displaying these effects. We illustrate this method by studying these sources of variation in 30-day readmission rates for Veterans Administration (VA) patients undergoing PCI. Methods: We used mixed effects logistic regression with 13 patient and 3 hospital covariates to study 30-day readmission rates in a national cohort of 45,521 VA Clinical Assessment, Reporting and Tracking (CART) patients who received a PCI during 10/2007-9/2012 at 49 VA hospitals. OR was used as a REM to compare percentiles of hospital or patient risk with median hospitals or patients to assess levels of variation. Results: Overall 30-day readmission rate was 11.5%, ranging from 6.8% to 17.3% across the 49 sites. The figure below shows effects of individual patient and hospital covariates, combined effects of patient and hospital characteristics (shaded bars), and hospital random variation (shaded curve) for 30-day readmissions. The OR for comparing a 97.5th percentile hospital with a median hospital, all covariates being equal, was 1.43, which was substantially less than the 2.88 OR found when comparing patients at the same percentile of the combined patient risk distribution. The largest patient covariate effect (congestive heart failure) had an OR of 1.58, equivalent to comparing a 99th percentile hospital with a median hospital, or an 81st percentile risk patient to a median risk patient. Combining all hospital characteristics, the OR for a 97.5th percentile hospital versus a median hospital was 1.16.
Conclusions: REMs are simple to compute, interpret and graph, and provide direct comparison on the same scale across random effects and covariates. The methods apply generally to linear, logistic, log, count, and time to event outcomes. These methods showed that patient risk drives 30-day readmission rates for VA PCI patients and the figure provided a new way of visualizing variation across these effects.
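The idea of putting a random effect on the odds-ratio scale can be sketched as follows. This is a minimal illustration, not the authors' refined REM: it assumes hospital random intercepts are normally distributed on the log-odds scale, so that a hospital at a given percentile differs from the median hospital by z × sigma. The function names are hypothetical, and the abstract does not report sigma, so the sketch backs it out from the reported OR of 1.43 at the 97.5th percentile.

```python
import math
from statistics import NormalDist

def percentile_vs_median_or(sigma, percentile):
    """OR comparing a hospital at `percentile` of the random-intercept
    distribution with the median hospital, assuming intercepts are
    Normal(0, sigma^2) on the log-odds scale."""
    z = NormalDist().inv_cdf(percentile)
    return math.exp(z * sigma)

def implied_sigma(odds_ratio, percentile):
    """Random-intercept SD implied by a percentile-vs-median OR."""
    if percentile == 0.5:
        raise ValueError("the median carries no information about sigma")
    z = NormalDist().inv_cdf(percentile)
    return math.log(odds_ratio) / z

# SD implied by the reported OR of 1.43 at the 97.5th percentile,
# under the normality assumption stated above.
sigma = implied_sigma(1.43, 0.975)
print(f"implied sigma = {sigma:.3f}")
print(f"OR at 90th percentile = {percentile_vs_median_or(sigma, 0.90):.2f}")
```

Because covariate effects from a logistic model are also odds ratios, values computed this way sit on the same scale as the covariate ORs, which is what makes the side-by-side comparison possible.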


1993 ◽  
Vol 47 (2) ◽  
pp. 231-240 ◽  
Author(s):  
Karen A. Weissbecker ◽  
Barry Wolf ◽  
Lindon J. Eaves ◽  
Mary L. Marazita ◽  
Walter E. Nance

2003 ◽  
Vol 90 (6) ◽  
pp. 1087-1095 ◽  
Author(s):  
Gary K. Grunwald ◽  
Debra K. Sullivan ◽  
Mary Hise ◽  
Joseph E. Donnelly ◽  
Dennis J. Jacobsen ◽  
...  

Dietary studies are often conducted as longitudinal intervention or crossover trials using multiple days of measurement on each subject during each of several measurement periods, and determining the required numbers of days and subjects is important in designing these studies. Linear mixed statistical models were used to derive equations for precision, statistical power and sample size (number of days and number of subjects) and to obtain estimates of between-subject, period-to-period, and day-to-day variation needed to apply the equations. Two cohorts of an on-going exercise intervention study, and a crossover study of Olestra, each with 14 d of measurement/subject per period, were used to obtain estimates of variability for energy and macronutrient intake. Numerical examples illustrate how the equations for calculating the number of days or number of subjects are applied in typical situations, and sample SAS code is given. It was found that between-subject, period-to-period, and day-to-day variation all contributed significantly to the variation in energy and macronutrient intake. The ratio of period-to-period and day-to-day standard deviations controls the trade-off between the number of days and the number of subjects, and this remained relatively stable across studies and energy and macronutrient intake variables. The greatest gains in precision were seen over the first few measurement days. Greater precision and fewer required days were noted in the study (Olestra) that exerted greater control over the subjects and diets during the feeding protocol.
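The trade-off between days and subjects can be illustrated with a simple variance-components calculation. This is a sketch under stated assumptions, not the paper's equations: it assumes a nested model in which each daily intake is a period mean plus independent subject, subject-by-period, and day effects, and it looks at the mean within-subject change between two periods, where the between-subject component cancels. The function names and the standard deviations are hypothetical.

```python
import math

def se_mean_change(n_subjects, n_days, sd_period, sd_day):
    """SE of the mean within-subject change between two periods.

    Each subject's period mean has variance sd_period^2 + sd_day^2 / n_days;
    the difference of two period means doubles that, and averaging over
    subjects divides by n_subjects.
    """
    per_subject_var = 2.0 * (sd_period ** 2 + sd_day ** 2 / n_days)
    return math.sqrt(per_subject_var / n_subjects)

def subjects_needed(target_se, n_days, sd_period, sd_day):
    """Smallest number of subjects that reaches the target SE."""
    per_subject_var = 2.0 * (sd_period ** 2 + sd_day ** 2 / n_days)
    return math.ceil(per_subject_var / target_se ** 2)

# Hypothetical SDs (say, kcal/day) to show the diminishing return from extra days.
for days in (1, 3, 7, 14):
    n = subjects_needed(100.0, days, sd_period=200.0, sd_day=500.0)
    print(f"{days:2d} days per period -> {n} subjects")
```

In this sketch the day-to-day term shrinks as 1/days while the period-to-period term does not, so the ratio of the two standard deviations decides how quickly adding days stops helping and only adding subjects improves precision.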


2021 ◽  
Author(s):  
Bradly Alicea

Abstract As a biochemical process, direct cellular reprogramming is slow and complex. The early stages of this process are the most critical determinant of successful phenotypic conversion. This study provides insight into the statistical signatures that describe temporal structure in the reprogramming process. We examine two sources of variation in reprogramming cells: clonal instances from various tissues of origin and rate of expansion between these lines. Our analytical strategy involved modeling the potential of populations to reprogram, and then applying statistical models to capture this potential in action. This two-fold approach utilizes both conventional and novel techniques that allow us to infer and confirm a host of properties that define the phenomenon. These results can be summarized in a number of ways, and essentially suggest that reprogramming is organized around changes in gene expression phenotype (phases) which happen sporadically across a cellular population (bursts).


2008 ◽  
Vol 20 (1) ◽  
pp. 64-76 ◽  
Author(s):  
Ying Zhang ◽  
Peng Bi ◽  
Janet E. Hiller

This article reviews studies examining the relationship between climate variability and the transmission of vector- and rodent-borne diseases, including malaria, dengue fever, Ross River virus infection, and hemorrhagic fever with renal syndrome. The review evaluated the study designs, statistical analysis methods, use of meteorological variables, and results of those studies. The authors found limitations in the analytical methods of most of the articles. Few of the articles included factors other than climatic variables that can affect the transmission of vector-borne disease (eg, socioeconomic status). In addition, the quantitative relationship between climate and vector-borne diseases is inconsistent. Further research should be conducted among different populations in various climatic/ecological regions, using appropriate statistical models.

