301 Methods of genome-wide association studies and their applications in dairy cattle

2020 ◽  
Vol 98 (Supplement_4) ◽  
pp. 31-31
Author(s):  
Li Ma

Abstract Genome-wide association studies (GWAS) have been widely used to map quantitative trait loci (QTL) of complex traits and diseases since 2007. To date, the human GWAS catalog has accumulated 4,410 publications and 172,351 associations, and the animal QTLdb has curated 983 publications and 130,407 QTLs for cattle, the largest number among livestock species. During the past 13 years of development, GWAS methods have evolved from simple linear regression, to the use of principal components to address sample relatedness, to mixed models, and finally to Bayesian full-model approaches. Each of these methods has advantages and limitations, so it is important to choose an appropriate method, especially for studies in livestock, where sample size is often limited. Note that the most popular GWAS approach, the mixed model method, originated from animal breeding and genetics research. Leveraging the national cattle genomic database at the Council on Dairy Cattle Breeding (CDCB), we have conducted GWAS analyses of various dairy traits to identify QTLs and SNP markers of importance. By combining these results with sequence and functional annotation data, we seek to understand the genetic basis of complex traits and to reveal useful knowledge that can be incorporated into more accurate genomic predictions in the future.
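
As an illustration of the simplest method named in the abstract above, the following is a minimal sketch of a single-marker linear regression test, assuming allele dosages coded 0/1/2 and a quantitative phenotype. The function name and the toy data are hypothetical and are not taken from the abstract. The sketch deliberately ignores sample relatedness, which is what the principal-component and mixed-model extensions are meant to address.

```python
import math


def single_marker_regression(dosages, phenotypes):
    """Ordinary least squares of phenotype on allele dosage for one SNP.

    Returns (slope, standard_error, t_statistic). This is the naive
    single-marker test; it does not correct for relatedness or structure.
    """
    n = len(dosages)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need at least 3 paired observations")
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("monomorphic marker: dosage has no variance")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares with n - 2 degrees of freedom.
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(dosages, phenotypes))
    se = math.sqrt(rss / (n - 2) / sxx)
    if se == 0:
        # Perfect fit: the t statistic is unbounded unless the slope is zero.
        t_stat = math.copysign(math.inf, slope) if slope != 0 else 0.0
    else:
        t_stat = slope / se
    return slope, se, t_stat


# Toy data (hypothetical): six individuals, one marker.
slope, se, t_stat = single_marker_regression(
    [0, 0, 1, 1, 2, 2], [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
)
print(f"effect per allele copy: {slope:.3f} (SE {se:.3f}, t = {t_stat:.2f})")
```

In a genome-wide scan this function would be applied to each marker in turn, and the resulting statistics compared against a multiple-testing threshold.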

2019 ◽  
Author(s):  
Jan A. Freudenthal ◽  
Markus J. Ankenbrand ◽  
Dominik G. Grimm ◽  
Arthur Korte

Abstract Motivation Genome-wide association studies (GWAS) are one of the most commonly used methods to detect associations between complex traits and genomic polymorphisms. As both genotyping and phenotyping of large populations have become easier, typical modern GWAS have to cope with massive amounts of data. Thus, the computational demand for these analyses has grown remarkably over the last decades. This is especially true if one wants to implement permutation-based significance thresholds instead of using the naïve Bonferroni threshold. Permutation-based methods have the advantage of providing an adjusted multiple hypothesis correction threshold that takes the underlying phenotypic distribution into account, and thus remove the need to find the correct transformation for non-Gaussian phenotypes. To enable efficient analyses of large datasets and the computation of permutation-based significance thresholds, we used the machine learning framework TensorFlow to develop a linear mixed model (GWAS-Flow) that can make use of the available CPU or GPU infrastructure to decrease the time of the analyses, especially for large datasets. Results We were able to show that our application GWAS-Flow outperforms custom GWAS scripts in terms of speed without losing accuracy. Apart from p-values, GWAS-Flow also computes summary statistics, such as the effect size and its standard error for each individual marker. The CPU-based version is the default choice for small data, while the GPU-based version of GWAS-Flow is especially suited for the analyses of big data. Availability GWAS-Flow is freely available on GitHub (https://github.com/Joyvalley/GWAS_Flow) and is released under the terms of the MIT-License.
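
The contrast between a Bonferroni threshold and a permutation-based threshold can be sketched as follows. This is not GWAS-Flow's implementation: instead of a linear mixed model it uses a plain squared correlation between dosage and phenotype as the per-marker statistic, and all function names and parameters are hypothetical. The idea shown is only the general one: shuffle the phenotypes, record the strongest association across all markers for each shuffle, and take an upper quantile of those maxima as the genome-wide threshold.

```python
import random


def squared_correlation(x, y):
    """Squared Pearson correlation; 0.0 if either vector is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def bonferroni_threshold(n_markers, alpha=0.05):
    """Per-marker p-value cutoff that assumes independent tests."""
    return alpha / n_markers


def permutation_threshold(genotypes, phenotypes, n_permutations=200,
                          alpha=0.05, seed=0):
    """Upper-alpha quantile of the maximum statistic under permutation.

    genotypes: list of markers, each a list of dosages per individual.
    The returned value is on the squared-correlation scale: a marker is
    called significant if its observed statistic exceeds it.
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_permutations):
        # Shuffling breaks the genotype-phenotype link but keeps the
        # phenotype distribution and the correlation between markers.
        rng.shuffle(shuffled)
        maxima.append(max(squared_correlation(m, shuffled) for m in genotypes))
    maxima.sort()
    index = min(len(maxima) - 1, int((1 - alpha) * len(maxima)))
    return maxima[index]
```

Because the phenotype values themselves are reused in every permutation, the resulting threshold reflects their actual distribution, which is the property the abstract highlights for non-Gaussian phenotypes. The cost is one full scan per permutation, which is why fast scans matter here.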


2019 ◽  
Vol 20 (S23) ◽  
Author(s):  
Haohan Wang ◽  
Tianwei Yue ◽  
Jingkang Yang ◽  
Wei Wu ◽  
Eric P. Xing

Abstract Background Genome-wide Association Studies (GWAS) have contributed to unraveling associations between genetic variants in the human genome and complex traits for more than a decade. While many methods have been proposed as follow-ups to detect interactions between SNPs, epistasis has yet to be modeled and discovered more thoroughly. Results In this paper, following the previous study of detecting marginal epistasis signals, and motivated by the universal approximation power of deep learning, we propose a neural network method that can potentially model arbitrary interactions between SNPs in genetic association studies as an extension to the mixed models in correcting confounding factors. Our method, namely Deep Mixed Model, consists of two components: 1) a confounding factor correction component, which is a large-kernel convolutional neural network that focuses on calibrating the residual phenotypes by removing factors such as population stratification, and 2) a fixed-effect estimation component, which mainly consists of a Long Short-Term Memory (LSTM) model that estimates the association effect size of SNPs with the residual phenotype. Conclusions After validating the performance of our method using simulation experiments, we further apply it to Alzheimer’s disease data sets. Our results help gain some exploratory understanding of the genetic architecture of Alzheimer’s disease.


Animals ◽  
2021 ◽  
Vol 11 (2) ◽  
pp. 541
Author(s):  
Long Chen ◽  
Jennie E. Pryce ◽  
Ben J. Hayes ◽  
Hans D. Daetwyler

Structural variations (SVs) are large DNA segments of deletions, duplications, copy number variations, inversions and translocations in a re-sequenced genome compared to a reference genome. They have been found to be associated with several complex traits in dairy cattle and could potentially help to improve genomic prediction accuracy of dairy traits. Imputation of SVs was performed in individuals genotyped with single-nucleotide polymorphism (SNP) panels without the expense of sequencing them. In this study, we generated 24,908 high-quality SVs in a total of 478 whole-genome sequenced Holstein and Jersey cattle. We imputed 4489 SVs with R² > 0.5 into 35,568 Holstein and Jersey dairy cattle with 578,999 SNPs using two pipelines, FImpute and Eagle2.3-Minimac3. Genome-wide association studies for production, fertility and overall type with these 4489 SVs revealed four significant SVs, two of which were highly linked to significant SNPs. We also estimated the variance components for SNP and SV models for these traits using genomic best linear unbiased prediction (GBLUP). Furthermore, we assessed the effect on genomic prediction accuracy of adding SVs to GBLUP models. The estimated percentage of genetic variance captured by SVs for production traits was up to 4.57% for milk yield in bulls and 3.53% for protein yield in cows. Finally, no consistent increase in genomic prediction accuracy was observed when including SVs in GBLUP.


Author(s):  
Meng Luo ◽  
Shiliang Gu

Abstract During the past decades, genome-wide association studies (GWAS) have been used to successfully identify tens of thousands of genetic variants associated with complex traits in humans, animals, and plants. All common genome-wide association (GWA) methods rely on population structure correction to avoid false genotype and phenotype associations. However, population structure correction is a stringent penalization, which also impedes the identification of real associations. Here, we used recent statistical advances and proposed iterative screen regression (ISR), which enables simultaneous multiple marker associations and is shown to appropriately correct for population stratification and cryptic relatedness in GWAS. Results from analyses of simulated data suggest that the proposed ISR method performed well in terms of power (sensitivity) versus FDR (False Discovery Rate) and specificity, and showed less bias (higher accuracy) in effect (PVE) estimation than the existing multi-loci (mixed) model and the single-locus (mixed) model. We also show the practicality of our approach by applying it to rice, outbred mice, and A. thaliana datasets. It identified several new causal loci that other methods did not detect. Our ISR provides an alternative for multi-loci GWAS, and the implementation was computationally efficient, making the analysis of large datasets (n > 100,000) practicable.
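
The power and FDR comparison mentioned above is typically computed in simulations where the causal markers are known. The following is a minimal sketch of that bookkeeping, assuming a detected marker counts as a true positive only if its identifier exactly matches a simulated causal marker; simulation studies often use a distance window instead, and the abstract does not say which rule was used. The function name and marker identifiers are hypothetical.

```python
def power_and_fdr(detected, causal):
    """Return (power, false discovery proportion) for one simulation run.

    detected: identifiers of markers the method called significant.
    causal:   identifiers of the simulated causal markers.
    """
    detected = set(detected)
    causal = set(causal)
    true_positives = len(detected & causal)
    # Power (sensitivity): fraction of causal markers that were found.
    power = true_positives / len(causal) if causal else 0.0
    # False discovery proportion: fraction of calls that are not causal.
    fdp = (len(detected) - true_positives) / len(detected) if detected else 0.0
    return power, fdp


power, fdp = power_and_fdr(
    detected=["snp1", "snp2", "snp5"],
    causal=["snp1", "snp2", "snp3", "snp4"],
)
print(f"power = {power:.2f}, false discovery proportion = {fdp:.2f}")
```

For a single run the second value is the false discovery proportion; averaging it over many simulated datasets gives an estimate of the FDR.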


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Javed Akhatar ◽  
Anna Goyal ◽  
Navneet Kaur ◽  
Chhaya Atri ◽  
Meenakshi Mittal ◽  
...  

Abstract Timely transition to flowering, maturity and plant height are important for agronomic adaptation and productivity of Indian mustard (B. juncea), which is a major edible oilseed crop of low input ecologies in the Indian subcontinent. Breeding manipulation for these traits is difficult because of the involvement of multiple interacting genetic and environmental factors. Here, we report a genetic analysis of these traits using a population comprising 92 diverse genotypes of mustard. These genotypes were evaluated under deficient (N75), normal (N100) or excess (N125) conditions of nitrogen (N) application. Lower N availability induced early flowering and maturity in most genotypes, while high N conditions delayed both. A genotyping-by-sequencing approach helped to identify 406,888 SNP markers and undertake genome-wide association studies (GWAS). In total, 282 significant marker-trait associations (MTAs) were identified. We detected strong interactions between GWAS loci and nitrogen levels. Though some trait-associated SNPs were detected repeatedly across fertility gradients, the majority were identified under deficient or normal levels of N application. Annotation of the genomic region(s) within ± 50 kb of the peak SNPs facilitated prediction of 30 candidate genes belonging to light perception, circadian, floral meristem identity, flowering regulation, gibberellic acid pathways and plant development. These included more than one copy each of AGL24, AP1, FVE, FRI, GID1A and GNC. FLC and CO were predicted on chromosomes A02 and B08, respectively. CDF1, CO, FLC, AGL24, GNC and FAF2 appeared to influence the variation for plant height. Our findings may help in improving phenotypic plasticity of mustard across fertility gradients through marker-assisted breeding strategies.
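
The ± 50 kb annotation step described above amounts to an interval-overlap query around each peak SNP. A minimal sketch follows, assuming gene annotations are available as (name, chromosome, start, end) tuples with 1-based inclusive coordinates. The gene names and coordinates in the example are placeholders, not values from the study, and the authors' actual annotation pipeline is not described in the abstract.

```python
def genes_near_peak(peak_chrom, peak_pos, genes, window=50_000):
    """Return names of genes overlapping [peak_pos - window, peak_pos + window].

    genes: iterable of (name, chrom, start, end), 1-based inclusive.
    """
    lower = peak_pos - window
    upper = peak_pos + window
    hits = []
    for name, chrom, start, end in genes:
        # Two intervals overlap when each starts before the other ends.
        if chrom == peak_chrom and start <= upper and end >= lower:
            hits.append(name)
    return hits


# Placeholder annotation records (hypothetical names and coordinates).
annotation = [
    ("geneA", "A02", 100_000, 102_000),
    ("geneB", "A02", 160_000, 161_000),
    ("geneC", "B08", 100_000, 101_000),
]
print(genes_near_peak("A02", 105_000, annotation))
```

A linear scan like this is adequate for a few hundred peaks; for genome-scale annotation sets an interval tree or sorted-coordinate bisection would be the usual choice.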


2021 ◽  
Vol 42 (1) ◽  
Author(s):  
Dinesh K. Saini ◽  
Yuvraj Chopra ◽  
Jagmohan Singh ◽  
Karansher S. Sandhu ◽  
Anand Kumar ◽  
...  

Author(s):  
Nasa Sinnott-Armstrong ◽  
Sahin Naqvi ◽  
Manuel Rivas ◽  
Jonathan K Pritchard

Summary Genome-wide association studies (GWAS) have been used to study the genetic basis of a wide variety of complex diseases and other traits. However, for most traits it remains difficult to interpret what genes and biological processes are impacted by the top hits. Here, as a contrast, we describe UK Biobank GWAS results for three molecular traits (urate, IGF-1, and testosterone) that are biologically simpler than most diseases, and for which we know a great deal in advance about the core genes and pathways. Unlike most GWAS of complex traits, for all three traits we find that most top hits are readily interpretable. We observe huge enrichment of significant signals near genes involved in the relevant biosynthesis, transport, or signaling pathways. We show how GWAS data illuminate the biology of variation in each trait, including insights into differences in testosterone regulation between females and males. Meanwhile, in other respects the results are reminiscent of GWAS for more-complex traits. In particular, even these molecular traits are highly polygenic, with most of the variance coming not from core genes, but from thousands to tens of thousands of variants spread across most of the genome. Given that diseases are often impacted by many distinct biological processes, including these three, our results help to illustrate why so many variants can affect risk for any given disease.


2021 ◽  
Author(s):  
Dev Paudel ◽  
Rocheteau Dareus ◽  
Julia Rosenwald ◽  
Maria Munoz-Amatriain ◽  
Esteban Rios

Cowpea (Vigna unguiculata [L.] Walp., diploid, 2n = 22) is a major crop used as a protein source for human consumption as well as a quality feed for livestock. It is drought and heat tolerant and has been bred to develop varieties that are resilient to changing climates. The adaptation of plants to new climates and their yield are strongly affected by flowering time. Therefore, understanding the genetic basis of flowering time is critical to advance cowpea breeding. The aim of this study was to perform genome-wide association studies (GWAS) to identify marker-trait associations for flowering time in cowpea using single nucleotide polymorphism (SNP) markers. A total of 367 accessions from a cowpea mini-core collection were evaluated in Ft. Collins, CO in 2019 and 2020, and 292 accessions were evaluated in Citra, FL in 2018. These accessions were genotyped using the Cowpea iSelect Consortium Array, which contained 51,128 SNPs. GWAS revealed seven reliable SNPs for flowering time that explained 8-12% of the phenotypic variance. Candidate genes associated with flowering time, including FT, GI, CRY2, LSH3, UGT87A2, LIF2, and HTA9, were identified for the significant SNP markers. Further efforts to validate these loci will help to understand their role in flowering time in cowpea and could facilitate the transfer of some of this knowledge to other closely related legume species.

