FastGWA-GLMM: a generalized linear mixed model association tool for biobank-scale data

Author(s):  
Jian Yang ◽  
Longda Jiang ◽  
Zhili Zheng

Abstract Compared to linear mixed model-based genome-wide association (GWA) methods, generalized linear mixed model (GLMM)-based methods have better statistical properties when applied to binary traits but are computationally much slower. Here, leveraging efficient sparse matrix-based algorithms, we developed a GLMM-based GWA tool (called fastGWA-GLMM) that is orders of magnitude faster than the state-of-the-art tool (e.g., ~37 times faster when n=400,000) with more scalable memory usage. We show by simulation that the fastGWA-GLMM test-statistics of both common and rare variants are well-calibrated under the null, even for traits with an extreme case-control ratio (e.g., 0.1%). We applied fastGWA-GLMM to the UK Biobank data of 456,348 individuals, 11,842,647 variants and 2,989 binary traits (full summary statistics available at http://fastgwa.info/ukbimpbin) and identified 259 rare variants associated with 75 traits, demonstrating the use of imputed genotype data in a large cohort to discover rare variants for binary complex traits.

Author(s):  
Yang Hai ◽  
Yalu Wen

Abstract Motivation: Accurate disease risk prediction is essential for precision medicine. Existing models assume that diseases are caused either by groups of predictors with small-to-moderate effects or by a few isolated predictors with large effects. Their performance can be sensitive to the underlying disease mechanisms, which are usually unknown in advance. Results: We developed a Bayesian linear mixed model (BLMM), where genetic effects were modelled using a hybrid of the sparsity regression and linear mixed model with multiple random effects. The parameters in BLMM were inferred through a computationally efficient variational Bayes algorithm. The proposed method can resemble the shape of the true effect size distributions, can capture the predictive effects from both common and rare variants, and is robust against various disease models. Through extensive simulations and the application to a whole-genome sequencing dataset obtained from the Alzheimer’s Disease Neuroimaging Initiatives, we have demonstrated that BLMM has better prediction performance than existing methods and can detect variables and/or genetic regions that are predictive. Availability: The R-package is available at https://github.com/yhai943/BLMM. Supplementary information: Supplementary data are available at Bioinformatics online.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Xuan Zhou ◽  
S. Hong Lee

Abstract Complementary to the genome, the concept of exposome has been proposed to capture the totality of human environmental exposures. While there has been some recent progress on the construction of the exposome, few tools exist that can integrate the genome and exposome for complex trait analyses. Here we propose a linear mixed model approach to bridge this gap, which jointly models the random effects of the two omics layers on phenotypes of complex traits. We illustrate our approach using traits from the UK Biobank (e.g., BMI and height for N ~ 35,000) with a small fraction of the exposome that comprises 28 lifestyle factors. The joint model of the genome and exposome explains substantially more phenotypic variance and significantly improves phenotypic prediction accuracy, compared to the model based on the genome alone. The additional phenotypic variance captured by the exposome includes its additive effects as well as non-additive effects such as genome–exposome (gxe) and exposome–exposome (exe) interactions. For example, 19% of variation in BMI is explained by additive effects of the genome, while an additional 7.2% is explained by additive effects of the exposome, 1.9% by exe interactions and 4.5% by gxe interactions. Correspondingly, the prediction accuracy for BMI, computed using Pearson’s correlation between the observed and predicted phenotypes, improves from 0.15 (based on the genome alone) to 0.35 (based on the genome and exposome). We also show, using established theories, that integrating genomic and exposomic data can be an effective way of attaining a clinically meaningful level of prediction accuracy for disease traits. In conclusion, the genomic and exposomic effects can contribute to phenotypic variation via their latent relationships, i.e. genome–exposome correlation, and gxe and exe interactions, and modelling these effects has the potential to improve phenotypic prediction accuracy and thus holds great promise for future clinical practice.


2019 ◽  
Author(s):  
Wei Zhou ◽  
Zhangchen Zhao ◽  
Jonas B. Nielsen ◽  
Lars G. Fritsche ◽  
Jonathon LeFaive ◽  
...  

Abstract With very large sample sizes, population-based cohorts and biobanks provide an exciting opportunity to identify genetic components of complex traits. To analyze rare variants, gene or region-based multiple variant aggregate tests are commonly used to increase association test power. However, due to the substantial computation cost, existing region-based rare variant tests cannot analyze hundreds of thousands of samples while accounting for confounders, such as population stratification and sample relatedness. Here we propose a scalable generalized mixed model region-based association test that can handle large sample sizes and accounts for unbalanced case-control ratios for binary traits. This method, SAIGE-GENE, utilizes state-of-the-art optimization strategies to reduce computational and memory cost, and hence is applicable to exome-wide and genome-wide region-based analysis for hundreds of thousands of samples. Through the analysis of the HUNT study of 69,716 Norwegian samples and the UK Biobank data of 408,910 White British samples, we show that SAIGE-GENE can efficiently analyze large sample data (N > 400,000) with type I error rates well controlled.


Author(s):  
Xuan Zhou ◽  
S. Hong Lee

Abstract Complementary to the genome, the concept of exposome has been proposed to capture the totality of human environmental exposures. While there has been some recent progress on the construction of the exposome, few tools exist that can integrate the genome and exposome for complex trait analyses. Here we propose a linear mixed model approach to bridge this gap, which jointly models the random effects of the two omics layers on phenotypes of complex traits. We illustrate our approach using traits from the UK Biobank (e.g., BMI and height for N ~ 40,000) with a small fraction of the exposome that comprises 28 lifestyle factors. The joint model of the genome and exposome explains substantially more phenotypic variance and significantly improves phenotypic prediction accuracy, compared to the model based on the genome alone. The additional phenotypic variance captured by the exposome includes its additive effects as well as non-additive effects such as genome-exposome (gxe) and exposome-exposome (exe) interactions. For example, 19% of variation in BMI is explained by additive effects of the genome, while an additional 7.2% is explained by additive effects of the exposome, 1.9% by exe interactions and 4.5% by gxe interactions. Correspondingly, the prediction accuracy for BMI, computed using Pearson’s correlation between the observed and predicted phenotypes, improves from 0.15 (based on the genome alone) to 0.35 (based on the genome and exposome). We also show, using established theories, that integrating genomic and exposomic data is essential to attaining a clinically meaningful level of prediction accuracy for disease traits. In conclusion, the genomic and exposomic effects can contribute to phenotypic variation via their latent relationships, i.e. genome-exposome correlation, and gxe and exe interactions, and modelling these effects has great potential to improve phenotypic prediction accuracy and thus holds great promise for future clinical practice.


2020 ◽  
Author(s):  
James L. Peugh ◽  
Sarah J. Beal ◽  
Meghan E. McGrady ◽  
Michael D. Toland ◽  
Constance Mara

Author(s):  
Miriam Romero-López ◽  
María Carmen Pichardo ◽  
Ana Justicia-Arráez ◽  
Judit Bembibre-Serrano

The objective of this study is to measure the effectiveness of a program for improving inhibitory and emotional control among children. In addition, it assesses whether the improvement of these skills has an effect on the reduction of aggressive behavior in pre-school children. The participants were 100 children, 50 belonging to the control group and 50 to the experimental group, aged between 5 and 6 years. Pre-intervention and post-intervention measures of inhibitory and emotional control (BRIEF-P) and aggression (BASC) were taken. A Generalized Linear Mixed Model (GLMM) analysis was performed, which found that children in the experimental group scored higher on inhibitory and emotional control compared to their peers in the control group. In addition, these improvements had an effect on the decrease in aggressiveness. In conclusion, preventive research should have among its priorities the design of such programs, given their implications for psychosocial development.


Agriculture ◽  
2021 ◽  
Vol 11 (8) ◽  
pp. 722
Author(s):  
Bethan Cavendish ◽  
John McDonagh ◽  
Georgios Tzimiropoulos ◽  
Kimberley R. Slinger ◽  
Zoë J. Huggett ◽  
...  

The aim of this study was to characterize calving behavior of dairy cows and to compare the duration and frequency of behaviors for assisted and unassisted dairy cows at calving. Behavioral data from nine hours prior to calving were collected for 35 Holstein-Friesian dairy cows. Cows were continuously monitored under 24 h video surveillance. The behaviors of standing, lying, walking, shuffle, eating, drinking and contractions were recorded for each cow until birth. A generalized linear mixed model was used to assess differences in the duration and frequency of behaviors prior to calving for assisted and unassisted cows. The nine hours prior to calving were assessed in three-hour time periods. The study found that the cows spent a large proportion of their time either lying (0.49) or standing (0.35), with a higher frequency of standing (0.36) and shuffle (0.26) bouts than other behaviors during the study. There were no differences in behavior between assisted and unassisted cows. During the three hours prior to calving, the duration and bouts of lying, including contractions, were higher than during other time periods. While changes in behavior failed to identify an association with calving assistance, the monitoring of behavioral patterns could be used as an alert to the progress of parturition.

