Genome-Wide Association Studies with a Genomic Relationship Matrix: A Case Study with Wheat and Arabidopsis

2016 ◽  
Vol 6 (10) ◽  
pp. 3241-3256 ◽  
Author(s):  
Daniel Gianola ◽  
Maria I. Fariello ◽  
Hugo Naya ◽  
Chris-Carolin Schön
Genetics ◽  
2020 ◽  
Vol 216 (3) ◽  
pp. 651-669
Author(s):  
Yong Jiang ◽  
Jochen C. Reif

The genomic relationship matrix plays a key role in the analysis of genetic diversity, genomic prediction, and genome-wide association studies. The epistatic genomic relationship matrix is a natural generalization of the classic genomic relationship matrix in the sense that it implicitly models the epistatic effects among all markers. Calculating the exact form of the epistatic relationship matrix imposes a high computational load and is hence not feasible when the number of markers is large or when a high degree of epistasis is under consideration. Currently, many studies use the Hadamard product of the classic genomic relationship matrix as an approximation. However, the quality of the approximation is difficult to investigate in the strict mathematical sense. In this study, we derived iterative formulas for the precise form of the epistatic genomic relationship matrix for an arbitrary degree of epistasis, including both additive and dominance interactions. The key to our theoretical results is the observation of an interesting link between the elements in the genomic relationship matrix and symmetric polynomials, which motivated the application of the corresponding mathematical theory. Based on the iterative formulas, efficient recursive algorithms were implemented. Compared with the approximation by the Hadamard product, our algorithms provided a complete solution to the problem of calculating the exact epistatic genomic relationship matrix. As an application, we showed that our new algorithms easily relieved the computational burden in a previous study on the approximation behavior of two limit models.
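The Hadamard-product approximation mentioned in this abstract can be illustrated with a minimal sketch. It assumes a VanRaden-style additive relationship matrix as the "classic" matrix (the abstract does not say which definition is used), and the function names `additive_grm` and `hadamard_epistatic_grm` are hypothetical. It shows only the approximation, not the authors' exact iterative formulas.

```python
import numpy as np

def additive_grm(genotypes):
    """Additive genomic relationship matrix (VanRaden-style, an assumption).

    genotypes: (individuals x markers) array coded as 0/1/2 allele counts.
    """
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0              # estimated allele frequency per marker
    Z = M - 2.0 * p                       # center each marker by its expectation
    denom = 2.0 * np.sum(p * (1.0 - p))   # scaling by summed marker variance
    return Z @ Z.T / denom

def hadamard_epistatic_grm(G, degree=2):
    """Approximate epistatic GRM of a given degree by an elementwise power of G."""
    return np.asarray(G, dtype=float) ** degree

# Toy data (simulated, for illustration only): 5 individuals, 200 markers.
rng = np.random.default_rng(0)
genotypes = rng.binomial(2, 0.3, size=(5, 200))
G = additive_grm(genotypes)
H = hadamard_epistatic_grm(G, degree=2)   # pairwise (degree-2) approximation
```

Because the Hadamard product is elementwise, its cost is independent of the number of markers once `G` is available, which is why it is a popular substitute for the exact matrix.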


2019 ◽  
Author(s):  
Katie O'Connor ◽  
Ben Hayes ◽  
Craig Hardner ◽  
Catherine Nock ◽  
Abdul Baten ◽  
...  

Abstract Background: Breeding for new macadamia cultivars with high nut yield is expensive in terms of time, labour and cost. Most trees set nuts after four to five years, and candidate varieties for breeding are evaluated for at least eight years for various traits. Genome-wide association studies (GWAS) are promising methods to reduce evaluation and selection cycles by identifying genetic markers linked with key traits, potentially enabling early selection through marker-assisted selection (MAS). This study used 295 progeny from 32 full-sib families and 29 parents (18 phenotyped), which were planted across four sites, with each tree genotyped for 4,113 SNPs. ASReml-R was used to perform association analyses with linear mixed models including a genomic relationship matrix to account for population structure. Traits investigated were: nut weight (NW), kernel weight (KW), kernel recovery (KR), percentage of whole kernels (WK), tree trunk circumference (TC), percentage of racemes that survived from flowering through to nut set, and number of nuts per raceme. Results: Seven SNPs were significantly associated with NW (at a genome-wide false discovery rate of <0.05), and four with WK. Multiple regression, as well as mapping of markers to genome assembly scaffolds, suggested that some SNPs were detecting the same QTL. There were 44 significant SNPs identified for TC, although multiple regression suggested detection of 16 separate QTLs. Conclusions: These findings have important implications for macadamia breeding, and highlight the difficulties of heterozygous populations with rapid LD decay. By coupling validated marker-trait associations detected through GWAS with MAS, genetic gain could be increased by reducing the selection time for economically important nut characteristics. Genomic selection may be a more appropriate method to predict complex traits like tree size and yield.


Genome ◽  
2010 ◽  
Vol 53 (11) ◽  
pp. 876-883 ◽  
Author(s):  
Ben Hayes ◽  
Mike Goddard

Results from genome-wide association studies in livestock and humans have led to the conclusion that the effects of individual quantitative trait loci (QTL) on complex traits, such as yield, are likely to be small; therefore, a large number of QTL are necessary to explain genetic variation in these traits. Given this genetic architecture, gains from marker-assisted selection (MAS) programs using only a small number of DNA markers to trace a limited number of QTL are likely to be small. This has led to the development of an alternative technology for using the available dense single nucleotide polymorphism (SNP) information, called genomic selection. Genomic selection uses a genome-wide panel of dense markers so that all QTL are likely to be in linkage disequilibrium with at least one SNP. The genomic breeding values are predicted to be the sum of the effects of these SNPs across the entire genome. In dairy cattle breeding, the accuracy of genomic estimated breeding values (GEBV) that can be achieved and the fact that these are available early in life have led to rapid adoption of the technology. Here, we discuss the design of experiments necessary to achieve accurate prediction of GEBV in future generations in terms of the number of markers necessary and the size of the reference population where marker effects are estimated. We also present a simple method for implementing genomic selection using a genomic relationship matrix. Future challenges discussed include using whole genome sequence data to improve the accuracy of genomic selection and management of inbreeding through genomic relationships.
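As a rough illustration of how genomic selection can be implemented with a genomic relationship matrix, the sketch below computes best linear unbiased predictions for the model y = 1μ + g + e with g ~ N(0, Gσ²g). It is not necessarily the method presented in the article: the function name `gblup` is hypothetical, the only fixed effect is an overall mean, and the variance ratio `lam` = σ²e/σ²g is assumed to be known rather than estimated.

```python
import numpy as np

def gblup(G, y, lam):
    """Return (mu, gebv) for y = 1*mu + g + e, with g ~ N(0, G*sigma_g^2).

    G:   (n x n) genomic relationship matrix of the phenotyped individuals.
    y:   length-n phenotype vector.
    lam: assumed-known ratio sigma_e^2 / sigma_g^2 (must be positive).
    """
    G = np.asarray(G, dtype=float)
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    ones = np.ones(n)
    V = G + lam * np.eye(n)                  # phenotypic covariance up to sigma_g^2
    Vinv_y = np.linalg.solve(V, y)
    Vinv_1 = np.linalg.solve(V, ones)
    mu = (ones @ Vinv_y) / (ones @ Vinv_1)   # generalized least squares mean
    gebv = G @ np.linalg.solve(V, y - mu * ones)  # BLUP of genomic values
    return mu, gebv

# Toy check (made-up numbers): unrelated individuals, so G is the identity.
y_demo = np.array([1.0, 2.0, 3.0, 6.0])
mu_demo, gebv_demo = gblup(np.eye(4), y_demo, lam=1.0)
```

With an identity relationship matrix and `lam=1.0`, each predicted value is simply half the deviation of the phenotype from the mean, which makes the shrinkage effect of the variance ratio easy to see.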


2009 ◽  
Vol 15 (32) ◽  
pp. 3764-3772 ◽  
Author(s):  
Amy Murphy ◽  
Jessica Lasky-Su ◽  
Kelan Tantisira ◽  
Augusto Litonjua ◽  
Christoph Lange ◽  
...  

2021 ◽  
Author(s):  
Peter E Chen ◽  
B. Jesse Shapiro

Since the advent of genome-wide association studies (GWAS) in human genomes, increasingly sophisticated methods have been developed for more robust association detection. Currently, the backbone of human GWAS approaches is allele-counting-based methods, where the signal of association is derived from alleles that are identical-by-state. Borrowing this approach from human GWAS, allele-counting-based methods have been popularized in microbial GWAS, notably the generalized linear model using either dimension reduction for fixed covariates and/or a genetic relationship matrix as a random effect in a mixed model to control for population stratification. In this work, we show how the effects of linkage disequilibrium (LD) can potentially obscure true-positive genotype-phenotype associations (i.e., genetic variants causally associated with the phenotype of interest) and also lead to unacceptably high rates of false-positive associations when applying these classical approaches to GWAS in weakly recombining microbial genomes. We developed a GWAS method called POUTINE (https://github.com/Peter-Two-Point-O/POUTINE), which relies on homoplastic mutation to both clarify the source of putative causal variants and reduce likely false-positive associations compared to traditional allele-counting methods. Using datasets of M. tuberculosis genomes and antibiotic-resistance phenotypes, we show that LD can in fact render all association signals from allele-counting methods fully indistinguishable from hundreds to thousands of sites scattered across an entire genome. These classic GWAS methods thus fail to pinpoint likely causal genotype-phenotype associations and separate them from background noise, even after applying methods to correct for population structure. We therefore urge caution when utilizing classical approaches, particularly in populations that are strongly clonal.

