Genome-Wide Association Analyses of Fertility Traits in Beef Heifers

The ability of livestock to reproduce efficiently is critical to the sustainability of animal agriculture. Antral follicle count (AFC) and reproductive tract scores (RTS) can be used to estimate fertility in beef heifers, but the genetic mechanisms influencing variation in these measures are not well understood. Two genome-wide association studies (GWAS) were conducted to identify the significant loci associated with these traits. In total, 293 crossbred beef heifers were genotyped on the Bovine GGP 50K chip and genotypes were imputed to 836,121 markers. A GWAS was performed with the AFC phenotype for 217 heifers with a multi-locus mixed model, conducted using the year, age at time of sampling and principal component analysis groupings as the covariates. The RTS GWAS was performed with 289 heifers using an additive correlation/trend test comparing prepubertal to pubertal heifers. The loci on chromosomes 2, 3 and 23 were significant in the AFC GWAS and the loci on chromosomes 2, 8, 10 and 11 were significant in the RTS GWAS. The significant region on chromosome 2 was similar between both analyses. These regions contained genes associated with cell proliferation, transcription, apoptosis and development. This study proposes candidate genes for beef cattle fertility, although future research is needed to elucidate the precise mechanisms.

GWAS-Flow: A GPU accelerated framework for efficient permutation based genome-wide association studies

10.1101/783100 ◽

2019 ◽

Cited By ~ 2

Author(s):

Jan A. Freudenthal ◽

Markus J. Ankenbrand ◽

Dominik G. Grimm ◽

Arthur Korte

Keyword(s):

Complex Traits ◽

Mixed Model ◽

Linear Mixed Model ◽

Association Studies ◽

Large Datasets ◽

Genome Wide Association ◽

Small Data ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Non Gaussian

Abstract

Motivation: Genome-wide association studies (GWAS) are one of the most commonly used methods to detect associations between complex traits and genomic polymorphisms. As both genotyping and phenotyping of large populations have become easier, typical modern GWAS have to cope with massive amounts of data. Thus, the computational demand of these analyses has grown remarkably over the last decades. This is especially true if one wants to implement permutation-based significance thresholds instead of using the naïve Bonferroni threshold. Permutation-based methods have the advantage of providing an adjusted multiple hypothesis correction threshold that takes the underlying phenotypic distribution into account and thus removes the need to find the correct transformation for non-Gaussian phenotypes. To enable efficient analyses of large datasets and the computation of permutation-based significance thresholds, we used the machine learning framework TensorFlow to develop a linear mixed model (GWAS-Flow) that can make use of the available CPU or GPU infrastructure to decrease the time of the analyses, especially for large datasets.

Results: We were able to show that our application GWAS-Flow outperforms custom GWAS scripts in terms of speed without losing accuracy. Apart from p-values, GWAS-Flow also computes summary statistics, such as the effect size and its standard error for each individual marker. The CPU-based version is the default choice for small data, while the GPU-based version of GWAS-Flow is especially suited for the analyses of big data.

Availability: GWAS-Flow is freely available on GitHub (https://github.com/Joyvalley/GWAS_Flow) and is released under the terms of the MIT-License.
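The idea of a permutation-based significance threshold can be illustrated with a minimal sketch. This is not GWAS-Flow's implementation: as a stand-in for the linear mixed model, it scores each marker by its absolute Pearson correlation with the phenotype, and the function names (`abs_correlation`, `permutation_threshold`) and parameters are hypothetical. The phenotype is shuffled repeatedly, the strongest genome-wide statistic of each shuffle is recorded, and the (1 - alpha) quantile of those maxima is used as the threshold.

```python
import math
import random


def abs_correlation(x, y):
    """Absolute Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant marker or phenotype carries no signal
    return abs(sxy) / math.sqrt(sxx * syy)


def permutation_threshold(genotypes, phenotype, n_perm=200, alpha=0.05, seed=0):
    """Empirical genome-wide threshold on the association statistic.

    genotypes: list of markers, each a list of allele counts per individual.
    phenotype: list of trait values, one per individual.
    Assumption: the statistic is |correlation|, not an LMM p-value.
    """
    rng = random.Random(seed)
    shuffled = list(phenotype)
    null_maxima = []
    for _ in range(n_perm):
        # Shuffling breaks any genotype-phenotype link but keeps the
        # phenotype distribution, including non-Gaussian shapes.
        rng.shuffle(shuffled)
        null_maxima.append(max(abs_correlation(m, shuffled) for m in genotypes))
    null_maxima.sort()
    index = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return null_maxima[index]
```

A marker whose observed statistic exceeds the returned value would be called significant at the chosen family-wise level. Because the threshold is derived from the shuffled phenotype itself, no transformation to normality is assumed, which is the property the abstract highlights.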

Maximizing the Power of Principal-Component Analysis of Correlated Phenotypes in Genome-wide Association Studies

The American Journal of Human Genetics ◽

10.1016/j.ajhg.2014.03.016 ◽

2014 ◽

Vol 94 (5) ◽

pp. 662-676 ◽

Cited By ~ 85

Author(s):

Hugues Aschard ◽

Bjarni J. Vilhjálmsson ◽

Nicolas Greliche ◽

Pierre-Emmanuel Morange ◽

David-Alexandre Trégouët ◽

...

Keyword(s):

Principal Component Analysis ◽

Association Studies ◽

Principal Component ◽

Component Analysis ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide

Variable selection in heterogeneous datasets: A truncated-rank sparse linear mixed model with applications to genome-wide association studies

Methods ◽

10.1016/j.ymeth.2018.04.021 ◽

2018 ◽

Vol 145 ◽

pp. 2-9 ◽

Cited By ~ 1

Author(s):

Haohan Wang ◽

Bryon Aragam ◽

Eric P. Xing

Keyword(s):

Variable Selection ◽

Mixed Model ◽

Linear Mixed Model ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Heterogeneous Datasets

Genome-Wide Association Studies Reveal Susceptibility Loci for Digital Dermatitis in Holstein Cattle

Animals ◽

10.3390/ani10112009 ◽

2020 ◽

Vol 10 (11) ◽

pp. 2009

Author(s):

Ellen Lai ◽

Alexa L. Danner ◽

Thomas R. Famula ◽

Anita M. Oberbauer

Keyword(s):

Predictive Value ◽

Mixed Model ◽

Linear Mixed Model ◽

Bos Taurus ◽

Association Studies ◽

Bayesian Regression ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Digital Dermatitis ◽

Genome Wide

Digital dermatitis (DD) causes lameness in dairy cattle. To detect the quantitative trait loci (QTL) associated with DD, genome-wide association studies (GWAS) were performed using high-density single nucleotide polymorphism (SNP) genotypes and binary case/control, quantitative (average number of FW per hoof trimming record) and recurrent (cases with ≥2 DD episodes vs. controls) phenotypes from cows across four dairies (controls n = 129 vs. FW n = 85). Linear mixed model (LMM) and random forest (RF) approaches identified the top SNPs, which were used as predictors in Bayesian regression models to assess the SNP predictive value. The LMM and RF analyses identified QTL regions containing candidate genes on Bos taurus autosome (BTA) 2 for the binary and recurrent phenotypes and BTA7 and 20 for the quantitative phenotype that related to epidermal integrity, immune function, and wound healing. Although larger sample sizes are necessary to reaffirm these small effect loci amidst a strong environmental effect, the sample cohort used in this study was sufficient for estimating SNP effects with a high predictive value.

Variable selection in heterogeneous datasets: A truncated-rank sparse linear mixed model with applications to genome-wide association studies

2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) ◽

10.1109/bibm.2017.8217687 ◽

2017 ◽

Cited By ~ 9

Author(s):

Haohan Wang ◽

Bryon Aragam ◽

Eric P. Xing

Keyword(s):

Variable Selection ◽

Mixed Model ◽

Linear Mixed Model ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Heterogeneous Datasets

A mixed model reduces spurious genetic associations produced by population stratification in genome-wide association studies

Genomics ◽

10.1016/j.ygeno.2015.01.006 ◽

2015 ◽

Vol 105 (4) ◽

pp. 191-196 ◽

Cited By ~ 18

Author(s):

Jimin Shin ◽

Chaeyoung Lee

Keyword(s):

Population Stratification ◽

Mixed Model ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genetic Associations ◽

Genome Wide

Sparse Principal Component Analysis for Identifying Ancestry-Informative Markers in Genome-Wide Association Studies

Genetic Epidemiology ◽

10.1002/gepi.21621 ◽

2012 ◽

Vol 36 (4) ◽

pp. 293-302 ◽

Cited By ~ 26

Author(s):

Seokho Lee ◽

Michael P. Epstein ◽

Richard Duncan ◽

Xihong Lin

Keyword(s):

Principal Component Analysis ◽

Association Studies ◽

Principal Component ◽

Component Analysis ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Ancestry Informative Markers ◽

Sparse Principal Component Analysis ◽

Genome Wide

A Mixed Model Approach to Genome-Wide Association Studies for Selection Signatures, with Application to Mice Bred for Voluntary Exercise Behavior

Genetics ◽

10.1534/genetics.117.300102 ◽

2017 ◽

pp. genetics.300102.2017 ◽

Cited By ~ 5

Author(s):

Shizhong Xu ◽

Theodore Garland

Keyword(s):

Mixed Model ◽

Association Studies ◽

Exercise Behavior ◽

Voluntary Exercise ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Selection Signatures ◽

Mixed Model Approach ◽

Genome Wide ◽

Model Approach

Variable Selection in Heterogeneous Datasets: A Truncated-rank Sparse Linear Mixed Model with Applications to Genome-wide Association Studies

10.1101/228106 ◽

2017 ◽

Cited By ~ 2

Author(s):

Haohan Wang ◽

Bryon Aragam ◽

Eric P. Xing

Keyword(s):

Population Structure ◽

Variable Selection ◽

Mixed Model ◽

Linear Mixed Model ◽

Association Studies ◽

Genome Wide Association ◽

Low Rank ◽

Genome Wide Association Studies ◽

Unified Framework ◽

Genome Wide

Abstract

A fundamental challenge in modern datasets of ever-increasing dimensionality is variable selection, which has attracted renewed interest due to the growth of biological and medical datasets with complex, non-i.i.d. structures. Naïvely applying classical variable selection methods such as the Lasso to such datasets may lead to a large number of false discoveries. Motivated by genome-wide association studies in genetics, we study the problem of variable selection for datasets arising from multiple subpopulations, when this underlying population structure is unknown to the researcher. We propose a unified framework for sparse variable selection that adaptively corrects for population structure via a low-rank linear mixed model. Most importantly, the proposed method does not require prior knowledge of sample structure in the data and adaptively selects a covariance structure of the correct complexity. Through extensive experiments, we illustrate the effectiveness of this framework over existing methods. Further, we test our method on three different genomic datasets from plants, mice, and humans, and discuss the knowledge we discover with our method.

Increasing the Efficiency of Genome-wide Association Mapping via Hidden Markov Models

10.1101/039099 ◽

2016 ◽

Author(s):

Hong Gao ◽

Hua Tang ◽

Carlos Bustamante

Keyword(s):

Markov Model ◽

Hidden Markov Model ◽

Large Scale ◽

Hidden Markov ◽

Association Studies ◽

Genome Wide Association ◽

Trend Test ◽

Genome Wide Association Studies ◽

Data Set ◽

Genome Wide

With the rapid production of high-dimensional genetic data, one major challenge in genome-wide association studies is to develop effective and efficient statistical tools that resolve the low-power problem of detecting causal SNPs with low to moderate susceptibility, whose effects are often obscured by substantial background noise. Here we present a novel method that serves as an optimal technique for reducing background noise and improving detection power in genome-wide association studies. The approach uses a hidden Markov model and its derivative, the Markov hidden Markov model, to estimate the posterior probability of a marker being in an associated state. We conducted extensive simulations based on the human whole-genome genotype data from the GlaxoSmithKline-POPRES project to calibrate the sensitivity and specificity of our method, and compared it with many popular approaches for detecting positive signals, including the χ^2 test for association and the Cochran-Armitage trend test. Our simulation results suggested that at very low false positive rates (<10^-6), our method reaches a power of 0.9 and is more powerful than the other approaches when the allelic effect of the causal variant is non-additive or unknown. Application of our method to the data set generated by the Wellcome Trust Case Control Consortium, using 14,000 cases and 3,000 controls, confirmed its power and efficiency in the context of large-scale genome-wide association studies.
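For orientation, the Cochran-Armitage trend test used as a baseline above can be sketched as follows. This is a textbook version of the baseline only, not the hidden Markov model method of the paper; the function name `cochran_armitage_trend` and the default additive weights (0, 1, 2) are assumptions made for illustration. Given case and control counts per genotype class, it returns the z statistic and a two-sided p-value from the normal approximation.

```python
import math


def cochran_armitage_trend(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on a 2 x k genotype table.

    cases, controls: counts per genotype class (for example aa, Aa, AA).
    weights: scores per class; (0, 1, 2) encodes an additive model.
    Returns (z, two-sided p-value).
    """
    n_cases = sum(cases)
    n_controls = sum(controls)
    n_total = n_cases + n_controls
    class_totals = [r + s for r, s in zip(cases, controls)]

    # Trend statistic: weighted difference between cases and controls.
    t_stat = sum(
        w * (r * n_controls - s * n_cases)
        for w, r, s in zip(weights, cases, controls)
    )

    # Variance of the statistic under the null of no association.
    sum_w2n = sum(w * w * n for w, n in zip(weights, class_totals))
    sum_wn = sum(w * n for w, n in zip(weights, class_totals))
    variance = (n_cases * n_controls / n_total) * (n_total * sum_w2n - sum_wn ** 2)
    if variance == 0:
        return 0.0, 1.0  # degenerate table: no variation to test

    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return z, p_value
```

With additive weights the test has its best power when risk rises linearly with allele count, which is consistent with the abstract's remark that the comparison is most favorable to the new method when the allelic effect is non-additive or unknown.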
