LuxUS: DNA methylation analysis using generalized linear mixed model with spatial correlation

2020 ◽  
Vol 36 (17) ◽  
pp. 4535-4543
Author(s):  
Viivi Halla-aho ◽  
Harri Lähdesmäki

Abstract Motivation DNA methylation is an important epigenetic modification with multiple functions. DNA methylation and its connections to diseases have been extensively studied in recent years. It is known that DNA methylation levels of neighboring cytosines are correlated and that differential DNA methylation typically occurs across regions rather than at individual cytosines. Results We have developed a generalized linear mixed model, LuxUS, that makes use of the correlation between neighboring cytosines to facilitate analysis of differential methylation. LuxUS implements a likelihood model for bisulfite sequencing data that accounts for experimental variation in the underlying biochemistry. LuxUS can model both binary and continuous covariates, and the mixed model formulation enables including replicate and cytosine random effects. Spatial correlation is included in the model through a cytosine random effect correlation structure. We show with simulation experiments that using the spatial correlation increases the power of statistical testing for differential DNA methylation. Results with a real bisulfite sequencing dataset show that LuxUS is able to detect biologically significant differentially methylated cytosines. Availability and implementation The tool is available at https://github.com/hallav/LuxUS. Supplementary information Supplementary data are available at Bioinformatics online.
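
As an illustration of what a cytosine random effect correlation structure can look like, the sketch below builds a covariance matrix that decays with genomic distance and draws one set of correlated cytosine effects from it. The exponential kernel, the function name spatial_covariance, and the parameters variance and length_scale are assumptions made for this example; the abstract does not state the exact form used in LuxUS.

```python
import numpy as np

def spatial_covariance(positions, variance=1.0, length_scale=50.0):
    """Covariance between cytosine random effects.

    Assumed form (not taken from the abstract): correlation decays
    exponentially with the genomic distance between two cytosines.
    """
    pos = np.asarray(positions, dtype=float)
    dist = np.abs(pos[:, None] - pos[None, :])  # pairwise distances in bp
    return variance * np.exp(-dist / length_scale)

# Hypothetical genomic coordinates of four cytosines in one window.
positions = [100, 110, 130, 400]
cov = spatial_covariance(positions)

# One draw of correlated cytosine random effects on the linear-predictor scale.
rng = np.random.default_rng(0)
effects = rng.multivariate_normal(np.zeros(len(positions)), cov)
```

Under this kernel, the cytosines at positions 100 and 110 are strongly correlated, while the one at position 400 is nearly independent of the rest.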

2019 ◽  
Author(s):  
Viivi Halla-aho ◽  
Harri Lähdesmäki

Abstract Motivation DNA methylation is an important epigenetic modification with multiple functions. DNA methylation and its connections to diseases have been extensively studied in recent years. It is known that DNA methylation levels of neighboring cytosines are correlated and that differential DNA methylation typically occurs across regions rather than at individual cytosines. Results We have developed a generalized linear mixed model, LuxUS, that makes use of the correlation between neighboring cytosines to facilitate analysis of differential methylation. LuxUS implements a likelihood model for bisulfite sequencing data that accounts for experimental variation in the underlying biochemistry. LuxUS can model both binary and continuous covariates, and the mixed model formulation enables including replicate and cytosine random effects. Spatial correlation is included in the model through a cytosine random effect correlation structure. We show with simulation experiments that utilizing the spatial correlation increases the power of statistical testing for differential DNA methylation. Results with a real bisulfite sequencing data set show that LuxUS is able to detect biologically significant differentially methylated cytosines. Availability The tool is available at https://github.com/hallav/LuxUS. Supplementary information Supplementary data are available at bioRxiv.


2020 ◽  
Vol 36 (Supplement_1) ◽  
pp. i128-i135
Author(s):  
Rui Zhu ◽  
Chao Jiang ◽  
Xiaofeng Wang ◽  
Shuang Wang ◽  
Hao Zheng ◽  
...  

Abstract Motivation The generalized linear mixed model (GLMM) is an extension of the generalized linear model (GLM) in which the linear predictor takes random effects into account. Given its power to precisely model the mixed effects from multiple sources of random variation, the method has been widely used in biomedical computation, for instance in genome-wide association studies (GWASs) that aim to detect genetic variance significantly associated with phenotypes such as human diseases. Collaborative GWAS on large cohorts of patients across multiple institutions is often impeded by privacy concerns about sharing personal genomic and other health data. To address such concerns, we present in this paper a privacy-preserving Expectation–Maximization (EM) algorithm to build GLMM collaboratively when input data are distributed among multiple participating parties and cannot be transferred to a central server. We assume that the data are horizontally partitioned among participating parties: i.e. each party holds a subset of records (including observational values of fixed effect variables and their corresponding outcome), and for all records, the outcome is regulated by the same set of known fixed effects and random effects. Results Our collaborative EM algorithm is mathematically equivalent to the original EM algorithm commonly used in GLMM construction. The algorithm also runs efficiently when tested on simulated and real human genomic data, and thus can be practically used for privacy-preserving GLMM construction. We implemented the algorithm for collaborative GLMM (cGLMM) construction in R. The data communication was implemented using the rsocket package. Availability and implementation The software is released as open source at https://github.com/huthvincent/cGLMM. Supplementary information Supplementary data are available at Bioinformatics online.
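
To illustrate why horizontal partitioning allows a distributed fit to match a pooled one, the sketch below uses a deliberately simpler setting than the paper: ordinary least squares with fixed effects only, no random effects, no EM iterations, and no privacy protection of the exchanged values. Each party shares only its summed statistics, and because those sums are additive over records, the combined estimate equals the pooled estimate. The function names local_statistics and combine_and_solve and the simulated data are hypothetical; this is not the cGLMM algorithm.

```python
import numpy as np

def local_statistics(X, y):
    """Per-party sufficient statistics; raw records never leave the party."""
    return X.T @ X, X.T @ y

def combine_and_solve(stats):
    """Sum the per-party statistics and solve the normal equations."""
    xtx = sum(s[0] for s in stats)
    xty = sum(s[1] for s in stats)
    return np.linalg.solve(xtx, xty)

# Simulated data, split by rows (records) among three hypothetical parties.
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 3))
y = X @ np.array([0.5, -1.0, 2.0]) + rng.normal(scale=0.1, size=60)
parties = [(X[:20], y[:20]), (X[20:45], y[20:45]), (X[45:], y[45:])]

beta_distributed = combine_and_solve(
    [local_statistics(Xp, yp) for Xp, yp in parties]
)
beta_pooled = np.linalg.lstsq(X, y, rcond=None)[0]  # fit on all rows at once
```

The two coefficient vectors agree up to floating-point error, which is the same kind of equivalence the abstract claims for its collaborative EM algorithm relative to the original one.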


2020 ◽  
Author(s):  
Viivi Halla-aho ◽  
Harri Lähdesmäki

Abstract Bisulfite sequencing (BS-seq) is a popular method for measuring DNA methylation at base-pair resolution. Many BS-seq data analysis tools assume spatial correlation among the methylation states of neighboring cytosines. While this is a fair assumption, most existing methods leave out the possibility of deviation from the spatial correlation pattern. Our approach builds on a method which combines a generalized linear mixed model (GLMM) with a likelihood that is specific to BS-seq data and that incorporates spatial correlation for methylation levels. We propose a novel technique that uses a sparsity-promoting prior to allow cytosines to deviate from the spatial correlation pattern. The method is tested with both simulated and real BS-seq data and compared to other differential methylation analysis tools.


2019 ◽  
Vol 41 (5) ◽  
pp. 733-755
Author(s):  
Amanda Edmonds ◽  
Aarnes Gudmestad ◽  
Thomas Metzger

Abstract This investigation responds to the need for longitudinal data-driven research on additional-language (AL) acquisition by examining grammatical-gender marking among AL learners of French during a 21-month period, which included an academic year abroad (LANGSNAP corpus). The analysis of oral production consists of a generalized linear mixed model that examines a range of linguistic and extralinguistic factors shown to be important for gender marking in previous research, as well as a random effect for participant. Results show evidence of both change across time and consistency in the interlanguage. Drawing on variationism and usage-based approaches, we argue that longitudinal investigations that are focused on how learners use their additional language have much to offer our understanding of AL acquisition processes.


2020 ◽  
Author(s):  
James L. Peugh ◽  
Sarah J. Beal ◽  
Meghan E. McGrady ◽  
Michael D. Toland ◽  
Constance Mara

Author(s):  
Miriam Romero-López ◽  
María Carmen Pichardo ◽  
Ana Justicia-Arráez ◽  
Judit Bembibre-Serrano

The objective of this study is to measure the effectiveness of a program for improving inhibitory and emotional control among children. In addition, it assesses whether the improvement of these skills has an effect on the reduction of aggressive behavior in pre-school children. The participants were 100 children, 50 belonging to the control group and 50 to the experimental group, aged between 5 and 6 years. Pre-intervention and post-intervention measures of inhibitory and emotional control (BRIEF-P) and aggression (BASC) were taken. A generalized linear mixed model (GLMM) analysis was performed, which found that children in the experimental group scored higher on inhibitory and emotional control compared to their peers in the control group. In addition, these improvements have an effect on the decrease in aggressiveness. In conclusion, preventive research should have among its priorities the design of such programs, given their implications for psychosocial development.


2020 ◽  
pp. 1-37
Author(s):  
Tal Yarkoni

Abstract Most theories and hypotheses in psychology are verbal in nature, yet their evaluation overwhelmingly relies on inferential statistical procedures. The validity of the move from qualitative to quantitative analysis depends on the verbal and statistical expressions of a hypothesis being closely aligned—that is, that the two must refer to roughly the same set of hypothetical observations. Here I argue that many applications of statistical inference in psychology fail to meet this basic condition. Focusing on the most widely used class of model in psychology—the linear mixed model—I explore the consequences of failing to statistically operationalize verbal hypotheses in a way that respects researchers' actual generalization intentions. I demonstrate that whereas the "random effect" formalism is used pervasively in psychology to model inter-subject variability, few researchers accord the same treatment to other variables they clearly intend to generalize over (e.g., stimuli, tasks, or research sites). The under-specification of random effects imposes far stronger constraints on the generalizability of results than most researchers appreciate. Ignoring these constraints can dramatically inflate false positive rates, and often leads researchers to draw sweeping verbal generalizations that lack a meaningful connection to the statistical quantities they are putatively based on. I argue that failure to take the alignment between verbal and statistical expressions seriously lies at the heart of many of psychology's ongoing problems (e.g., the replication crisis), and conclude with a discussion of several potential avenues for improvement.


Agriculture ◽  
2021 ◽  
Vol 11 (8) ◽  
pp. 722
Author(s):  
Bethan Cavendish ◽  
John McDonagh ◽  
Georgios Tzimiropoulos ◽  
Kimberley R. Slinger ◽  
Zoë J. Huggett ◽  
...  

The aim of this study was to characterize calving behavior of dairy cows and to compare the duration and frequency of behaviors for assisted and unassisted dairy cows at calving. Behavioral data from nine hours prior to calving were collected for 35 Holstein-Friesian dairy cows. Cows were continuously monitored under 24 h video surveillance. The behaviors of standing, lying, walking, shuffle, eating, drinking and contractions were recorded for each cow until birth. A generalized linear mixed model was used to assess differences in the duration and frequency of behaviors prior to calving for assisted and unassisted cows. The nine hours prior to calving were assessed in three-hour time periods. The study found that the cows spent a large proportion of their time either lying (0.49) or standing (0.35), with a higher frequency of standing (0.36) and shuffle (0.26) bouts than other behaviors during the study. There were no differences in behavior between assisted and unassisted cows. During the three hours prior to calving, the duration and bouts of lying, including contractions, were higher than during other time periods. While changes in behavior failed to identify an association with calving assistance, the monitoring of behavioral patterns could be used as an alert to the progress of parturition.


2020 ◽  
pp. 1471082X2096691
Author(s):  
Amani Almohaimeed ◽  
Jochen Einbeck

Random effect models have been widely used as a mainstream statistical technique over several decades, and the same can be said for response transformation models such as the Box–Cox transformation. The latter aims at ensuring that the assumptions of normality and of homoscedasticity of the response distribution are fulfilled, which are essential conditions for inference based on a linear model or a linear mixed model. However, methodology for response transformation and simultaneous inclusion of random effects has been developed and implemented only scarcely, and is so far restricted to Gaussian random effects. We develop such methodology without requiring parametric assumptions on the distribution of the random effects. This is achieved by extending the ‘Nonparametric Maximum Likelihood’ towards a ‘Nonparametric profile maximum likelihood’ technique, which makes it possible to deal with overdispersion as well as two-level data scenarios.
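
For reference, the standard Box–Cox transformation mentioned above maps a strictly positive response y to (y^λ − 1)/λ when λ ≠ 0 and to log y when λ = 0. The sketch below implements only this transformation; the function name box_cox is chosen for this example, and the nonparametric profile maximum likelihood estimation of λ described in the abstract is not reproduced here.

```python
import numpy as np

def box_cox(y, lam):
    """Standard Box-Cox transform of a strictly positive response.

    lam == 0 gives the log transform; otherwise (y**lam - 1) / lam.
    """
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("Box-Cox requires strictly positive responses")
    if lam == 0:
        return np.log(y)
    return (y**lam - 1.0) / lam

# lam = 1 only shifts the data; lam = 0.5 is a rescaled square root.
shifted = box_cox([2.0, 3.0], 1)
root_like = box_cox([1.0, 4.0], 0.5)
```

In practice λ is estimated from the data, typically by maximizing a profile likelihood over a grid of candidate values.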

