The utility of genomic prediction models in evolutionary genetics

2021 ◽  
Vol 288 (1956) ◽  
pp. 20210693
Author(s):  
Suzanne E. McGaugh ◽  
Aaron J. Lorenz ◽  
Lex E. Flagel

Variation in complex traits is the result of contributions from many loci of small effect. Based on this principle, genomic prediction methods are used to make predictions of breeding value for an individual using genome-wide molecular markers. Genomic prediction models have been used in plant and animal breeding for almost two decades to increase rates of genetic improvement and reduce the length of artificial selection experiments. However, evolutionary genomics studies have been slow to incorporate this technique to select individuals for breeding in a conservation context or to learn more about the genetic architecture of traits, the genetic value of missing individuals or microevolution of breeding values. Here, we outline the utility of genomic prediction and provide an overview of the methodology. We highlight opportunities to apply genomic prediction in evolutionary genetics of wild populations and the best practices when using these methods on field-collected phenotypes.

2016 ◽  
Vol 283 (1835) ◽  
pp. 20160569 ◽  
Author(s):  
M. E. Goddard ◽  
K. E. Kemper ◽  
I. M. MacLeod ◽  
A. J. Chamberlain ◽  
B. J. Hayes

Complex or quantitative traits are important in medicine, agriculture and evolution, yet, until recently, few of the polymorphisms that cause variation in these traits were known. Genome-wide association studies (GWAS), based on the ability to assay thousands of single nucleotide polymorphisms (SNPs), have revolutionized our understanding of the genetics of complex traits. We advocate the analysis of GWAS data by a statistical method that fits all SNP effects simultaneously, assuming that these effects are drawn from a prior distribution. We illustrate how this method can be used to predict future phenotypes, to map and identify the causal mutations, and to study the genetic architecture of complex traits. The genetic architecture of complex traits is even more complex than previously thought: in almost every trait studied there are thousands of polymorphisms that explain genetic variation. Methods of predicting future phenotypes, collectively known as genomic selection or genomic prediction, have been widely adopted in livestock and crop breeding, leading to increased rates of genetic improvement.
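As a rough illustration of fitting all SNP effects simultaneously, the sketch below assumes the simplest possible prior, a single Gaussian on every effect, which reduces to ridge regression (often called SNP-BLUP or RR-BLUP). This is not necessarily the prior the authors advocate. The simulated genotypes, the function names and the shrinkage value `lam` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def fit_snp_effects(Z, y, lam):
    """Estimate all marker effects jointly under a Gaussian prior (ridge).

    Z   : (n_individuals, n_snps) genotypes coded 0/1/2
    y   : (n_individuals,) phenotypes
    lam : shrinkage, interpreted here as sigma_e^2 / sigma_u^2 (assumed known)
    """
    col_means = Z.mean(axis=0)
    Zc = Z - col_means                      # centre each SNP
    mu = y.mean()
    n_snps = Z.shape[1]
    # Solve (Zc'Zc + lam * I) u = Zc'(y - mu) for the vector of SNP effects
    u = np.linalg.solve(Zc.T @ Zc + lam * np.eye(n_snps), Zc.T @ (y - mu))
    return mu, col_means, u

def predict(Z_new, mu, col_means, u):
    """Predict phenotypes of new individuals from their genotypes."""
    return mu + (Z_new - col_means) @ u

# Simulated toy data: 200 individuals, 500 SNPs, many small effects
rng = np.random.default_rng(0)
n, m = 200, 500
Z = rng.binomial(2, 0.3, size=(n, m)).astype(float)
true_u = rng.normal(0.0, 0.1, size=m)
y = Z @ true_u + rng.normal(0.0, 1.0, size=n)

# Train on the first 150 individuals, predict the remaining 50
mu, col_means, u_hat = fit_snp_effects(Z[:150], y[:150], lam=100.0)
y_pred = predict(Z[150:], mu, col_means, u_hat)
accuracy = np.corrcoef(y_pred, y[150:])[0, 1]
print(f"correlation between predicted and observed phenotypes: {accuracy:.2f}")
```

Other priors (for example mixtures of normal distributions) change how strongly individual effects are shrunk, but the overall structure of fitting every SNP in one model stays the same.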


2018 ◽  
Author(s):  
Doug Speed ◽  
David J Balding

LD Score Regression (LDSC) has been widely applied to the results of genome-wide association studies. However, its estimates of SNP heritability are derived from an unrealistic model in which each SNP is expected to contribute equal heritability. As a consequence, LDSC tends to over-estimate confounding bias, under-estimate the total phenotypic variation explained by SNPs, and provide misleading estimates of the heritability enrichment of SNP categories. Therefore, we present SumHer, software for estimating SNP heritability from summary statistics using more realistic heritability models. After demonstrating its superiority over LDSC, we apply SumHer to the results of 24 large-scale association studies (average sample size 121 000). First we show that these studies have tended to substantially over-correct for confounding, and as a result the number of genome-wide significant loci has been under-reported by about 20%. Next we estimate enrichment for 24 categories of SNPs defined by functional annotations. A previous study using LDSC reported that conserved regions were 13-fold enriched, and found a further twelve categories with above 2-fold enrichment. By contrast, our analysis using SumHer finds that conserved regions are only 1.6-fold (SD 0.06) enriched, and that no category has enrichment above 1.7-fold. SumHer provides an improved understanding of the genetic architecture of complex traits, which enables more efficient analysis of future genetic data.


Cells ◽  
2021 ◽  
Vol 10 (12) ◽  
pp. 3372
Author(s):  
Cesar A. Medina ◽  
Harpreet Kaur ◽  
Ian Ray ◽  
Long-Xi Yu

Agronomic traits such as biomass yield and abiotic stress tolerance are genetically complex and challenging to improve through conventional breeding approaches. Genomic selection (GS) is an alternative approach in which genome-wide markers are used to determine the genomic estimated breeding value (GEBV) of individuals in a population. In alfalfa (Medicago sativa L.), previous results indicated that low to moderate prediction accuracy values (<70%) were obtained in complex traits, such as yield and abiotic stress resistance. There is a need to increase the prediction value in order to employ GS in breeding programs. In this paper we reviewed different statistical models and their applications in polyploid crops, such as alfalfa and potato. Specifically, we used empirical data associated with alfalfa yield under salt stress to investigate approaches that use DNA marker importance values derived from machine learning models, and genome-wide association studies (GWAS) of marker-trait association scores based on different GWASpoly models, in weighted GBLUP analyses. This approach increased prediction accuracies from 50% to more than 80% for alfalfa yield under salt stress. Finally, we extended the weighted GBLUP approach to potato and analyzed 13 phenotypic traits and obtained similar results. This is the first report on alfalfa to use variable importance and GWAS-assisted approaches to increase the prediction accuracy of GS, thus helping to select superior alfalfa lines based on their GEBVs.
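The weighted GBLUP idea described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' pipeline: it uses diploid 0/1/2 coding rather than tetraploid dosages, one common scaling of the relationship matrix, and hypothetical marker weights (squared marginal covariances computed on the training set) as a stand-in for the machine-learning importance values or GWAS scores used in the paper. All function names and parameter values are illustrative.

```python
import numpy as np

def weighted_grm(Z, weights):
    """Weighted genomic relationship matrix G = Zc D Zc' / (2 * sum p(1-p)).

    Z       : (n, m) genotypes coded 0/1/2 (diploid coding assumed)
    weights : (m,) non-negative marker weights; rescaled to average 1
    """
    p = Z.mean(axis=0) / 2.0                # allele frequencies
    Zc = Z - 2.0 * p                        # centre by expected dosage
    d = weights / weights.mean()            # D = diag(d)
    scale = 2.0 * np.sum(p * (1.0 - p))
    return (Zc * d) @ Zc.T / scale          # column j of Zc is multiplied by d[j]

def gblup_predict(G, y_train, train_idx, test_idx, lam):
    """Predict GEBV-based phenotypes for test individuals from relationships."""
    mu = y_train.mean()
    G_tt = G[np.ix_(train_idx, train_idx)]  # train x train relationships
    G_st = G[np.ix_(test_idx, train_idx)]   # test x train relationships
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train_idx)), y_train - mu)
    return mu + G_st @ alpha

# Simulated toy data: only the first 10 of 300 markers affect the trait
rng = np.random.default_rng(1)
n, m = 120, 300
Z = rng.binomial(2, 0.4, size=(n, m)).astype(float)
y = Z[:, :10] @ rng.normal(0.0, 0.5, size=10) + rng.normal(0.0, 1.0, size=n)
train_idx = np.arange(90)
test_idx = np.arange(90, n)

# Hypothetical weights, computed from training individuals only
Zc_train = Z[train_idx] - Z[train_idx].mean(axis=0)
weights = (Zc_train.T @ (y[train_idx] - y[train_idx].mean())) ** 2 + 1e-8

G_w = weighted_grm(Z, weights)
gebv = gblup_predict(G_w, y[train_idx], train_idx, test_idx, lam=1.0)
print("predictive correlation:", np.corrcoef(gebv, y[test_idx])[0, 1])
```

Passing a constant weight vector recovers ordinary (unweighted) GBLUP, so the two can be compared with the same code. Deriving weights from the training set only, as here, avoids leaking test-set phenotypes into the relationship matrix.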


2017 ◽  
Author(s):  
Siraj Ismail Kayondo ◽  
Dunia Pino Del Carpio ◽  
Roberto Lozano ◽  
Alfred Ozimati ◽  
Marnin Wolfe ◽  
...  

Cassava (Manihot esculenta Crantz), a key carbohydrate dietary source for millions of people in Africa, faces severe yield losses due to two viral diseases: cassava brown streak disease (CBSD) and cassava mosaic disease (CMD). The completion of the cassava genome sequence and the whole genome marker profiling of clones from African breeding programs (www.nextgencassava.org) provide cassava breeders the opportunity to deploy additional breeding strategies and develop superior varieties with both farmer- and industry-preferred traits. Here the identification of genomic segments associated with resistance to CBSD foliar symptoms and root necrosis as measured in two breeding panels at different growth stages and locations is reported. Using genome-wide association mapping and genomic prediction models we describe the genetic architecture for CBSD severity and identify loci strongly associated on chromosomes 4 and 11. Moreover, the significantly associated region on chromosome 4 colocalises with a Manihot glaziovii introgression segment and the significant SNP markers on chromosome 11 are situated within a cluster of nucleotide-binding site leucine-rich repeat (NBS-LRR) genes previously described in cassava. Overall, predictive accuracy values found in this study varied between CBSD severity traits and across GS models with Random Forest and RKHS showing the highest predictive accuracies for foliar and root CBSD severity scores.


2021 ◽  
Author(s):  
Michaela Jung ◽  
Beat Keller ◽  
Morgane Roth ◽  
Maria Jose Aranzana ◽  
Annemarie Auwerkerken ◽  
...  

Implementation of genomic tools is desirable to increase the efficiency of apple breeding. The apple reference population (apple REFPOP) proved useful for rediscovering loci, estimating genomic prediction accuracy, and studying genotype by environment interactions (GxE). Here we show contrasting genetic architecture and genomic prediction accuracies for 30 quantitative traits across up to six European locations using the apple REFPOP. A total of 59 stable and 277 location-specific associations were found using GWAS, 69.2% of which are novel when compared with 41 reviewed publications. Average genomic prediction accuracies of 0.18-0.88 were estimated using single-environment univariate, single-environment multivariate, multi-environment univariate, and multi-environment multivariate models. The GxE accounted for up to 24% of the phenotypic variability. This most comprehensive genomic study in apple in terms of trait-environment combinations provided knowledge of trait biology and prediction models that can be readily applied for marker-assisted or genomic selection, thus facilitating increased breeding efficiency.


2021 ◽  
Author(s):  
Charlotte Brault ◽  
Vincent Segura ◽  
Patrice This ◽  
Loïc Le Cunff ◽  
Timothée Flutre ◽  
...  

Crop breeding involves two selection steps: choosing progenitors and selecting offspring within progenies. Genomic prediction, based on genome-wide marker estimation of genetic values, could facilitate these steps. However, its potential usefulness in grapevine (Vitis vinifera L.) has only been evaluated in non-breeding contexts, mainly through cross-validation within a single population. We tested across-population genomic prediction in a more realistic breeding configuration, from a diversity panel to ten bi-parental crosses connected within a half-diallel mating design. Prediction quality was evaluated over 15 traits of interest (related to yield, berry composition, phenology and vigour), for both the average genetic value of each cross (cross mean) and the genetic values of individuals within each cross (individual values). Genomic prediction in these conditions was found useful: for cross mean, average per-trait predictive ability was 0.6, while per-cross predictive ability was halved on average, but reached a maximum of 0.7. Mean predictive ability for individual values within crosses was 0.26, about half the within-half-diallel value taken as a reference. For some traits and/or crosses, these across-population predictive ability values are promising for implementing genomic selection in grapevine breeding. This study also provided key insights on variables affecting predictive ability. Per-cross predictive ability was well predicted by genetic distance between parents and when this predictive ability was below 0.6, it was improved by training set optimization. For individual values, predictive ability mostly depended on trait-related variables (magnitude of the cross effect and heritability). These results will greatly help in designing grapevine breeding programs assisted by genomic prediction.


Cells ◽  
2021 ◽  
Vol 10 (11) ◽  
pp. 3184
Author(s):  
Nikolay V. Kondratyev ◽  
Margarita V. Alfimova ◽  
Arkadiy K. Golov ◽  
Vera E. Golimbet

Scientifically interesting as well as practically important phenotypes often belong to the realm of complex traits. To the extent that these traits are hereditary, they are usually ‘highly polygenic’. The study of such traits presents a challenge for researchers, as the complex genetic architecture of such traits makes it nearly impossible to utilise many of the usual methods of reverse genetics, which often focus on specific genes. In recent years, thousands of genome-wide association studies (GWAS) were undertaken to explore the relationships between complex traits and a large number of genetic factors, most of which are characterised by tiny effects. In this review, we aim to familiarise ‘wet biologists’ with approaches for the interpretation of GWAS results, to clarify some issues that may seem counterintuitive and to assess the possibility of using GWAS results in experiments on various complex traits.


2019 ◽  
Author(s):  
Huwenbo Shi ◽  
Kathryn S. Burch ◽  
Ruth Johnson ◽  
Malika K. Freund ◽  
Gleb Kichaev ◽  
...  

Despite strong transethnic genetic correlations reported in the literature for many complex traits, the non-transferability of polygenic risk scores across populations suggests the presence of population-specific components of genetic architecture. We propose an approach that models GWAS summary data for one trait in two populations to estimate genome-wide proportions of population-specific/shared causal SNPs. In simulations across various genetic architectures, we show that our approach yields approximately unbiased estimates with in-sample LD and slight upward-bias with out-of-sample LD. We analyze 9 complex traits in individuals of East Asian and European ancestry, restricting to common SNPs (MAF > 5%), and find that most common causal SNPs are shared by both populations. Using the genome-wide estimates as priors in an empirical Bayes framework, we perform fine-mapping and observe that high-posterior SNPs (for both the population-specific and shared causal configurations) have highly correlated effects in East Asians and Europeans. In population-specific GWAS risk regions, we observe a 2.8x enrichment of shared high-posterior SNPs, suggesting that population-specific GWAS risk regions harbor shared causal SNPs that are undetected in the other GWAS due to differences in LD, allele frequencies, and/or sample size. Finally, we report enrichments of shared high-posterior SNPs in 53 tissue-specific functional categories and find evidence that SNP-heritability enrichments are driven largely by many low-effect common SNPs.


Author(s):  
Alicia R. Martin ◽  
Solomon Teferra ◽  
Marlo Möller ◽  
Eileen G. Hoal ◽  
Mark J. Daly

Human genetic studies have long been vastly Eurocentric, raising a key question about the generalizability of these study findings to other populations. Because humans originated in Africa, African populations retain more genetic diversity, and yet individuals of African descent have been tremendously underrepresented in genetic studies. The diversity in Africa affords ample opportunities to improve fine-mapping resolution for associated loci, discover novel genetic associations with phenotypes, build more generalizable genetic risk prediction models, and better understand the genetic architecture of complex traits and diseases subject to varying environmental pressures. Thus, it is both ethically and scientifically imperative that geneticists globally surmount challenges that have limited progress in African genetic studies to date while meaningfully including African investigators, as greater inclusivity and enhanced research capacity afford enormous opportunities to accelerate genomic discoveries that translate more effectively to all populations. We review the advantages and challenges of studying the genetic architecture of complex traits and diseases in Africa. For example, with greater genetic diversity comes greater ancestral heterogeneity; this higher level of understudied diversity can yield novel genetic findings, but some methods that assume homogeneous population structure and work well in European populations may work less well in the presence of greater diversity and heterogeneity in African populations. Consequently, we advocate for methodological development that will accelerate studies important for all populations, especially those currently underrepresented in genetics.


2021 ◽  
Vol 12 ◽  
Author(s):  
Md. Abdullah Al Bari ◽  
Ping Zheng ◽  
Indalecio Viera ◽  
Hannah Worral ◽  
Stephen Szwiec ◽  
...  

Phenotypic evaluation and efficient utilization of germplasm collections can be time-intensive, laborious, and expensive. However, with the plummeting costs of next-generation sequencing and the addition of genomic selection to the plant breeder’s toolbox, we now can more efficiently tap the genetic diversity within large germplasm collections. In this study, we evaluated the potential of genomic prediction on a set of 482 pea (Pisum sativum L.) accessions, genotyped with 30,600 single nucleotide polymorphism (SNP) markers and phenotyped for seed yield and yield-related components, to enhance selection of accessions from the USDA Pea Germplasm Collection. Genomic prediction models and several factors affecting predictive ability were evaluated in a series of cross-validation schemes across complex traits. Different genomic prediction models gave similar results, with predictive ability across traits ranging from 0.23 to 0.60, with no model working best across all traits. Increasing the training population size improved the predictive ability of most traits, including seed yield. Predictive abilities increased and reached a plateau with increasing number of markers, presumably due to extensive linkage disequilibrium in the pea genome. Accounting for population structure effects did not significantly boost predictive ability, but we observed a slight improvement in seed yield. By applying the best genomic prediction model (e.g., RR-BLUP), we then examined the distribution of genotyped but nonphenotyped accessions and the reliability of genomic estimated breeding values (GEBV). The distribution of GEBV suggested that none of the nonphenotyped accessions were expected to perform outside the range of the phenotyped accessions. Desirable breeding values with higher reliability can be used to identify and screen favorable germplasm accessions. Expanding the training set and incorporating additional orthogonal information (e.g., transcriptomics, metabolomics, physiological traits) into the genomic prediction framework can enhance prediction accuracy.
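Predictive ability in a cross-validation scheme, as reported above, is commonly computed as the correlation between predicted and observed phenotypes in held-out folds. The sketch below shows one way to do this with a ridge (RR-BLUP-style) model. The simulated data, the fold count, the shrinkage value and the function names are illustrative assumptions and do not reproduce the study's cross-validation schemes.

```python
import numpy as np

def ridge_predict(Z_train, y_train, Z_test, lam):
    """Ridge (RR-BLUP-style) prediction, solved in its dual form.

    The dual form solves an n x n system, which is cheaper than the
    m x m system when markers (m) outnumber individuals (n).
    """
    mu = y_train.mean()
    means = Z_train.mean(axis=0)
    Zc = Z_train - means
    alpha = np.linalg.solve(Zc @ Zc.T + lam * np.eye(len(y_train)), y_train - mu)
    u = Zc.T @ alpha                         # marker effects
    return mu + (Z_test - means) @ u

def cv_predictive_ability(Z, y, k, lam, seed=0):
    """Return the per-fold correlation between predicted and observed y."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    abilities = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        y_hat = ridge_predict(Z[train_idx], y[train_idx], Z[test_idx], lam)
        abilities.append(np.corrcoef(y_hat, y[test_idx])[0, 1])
    return np.array(abilities)

# Simulated toy data standing in for genotyped and phenotyped accessions
rng = np.random.default_rng(2)
n, m = 200, 400
Z = rng.binomial(2, 0.3, size=(n, m)).astype(float)
y = Z @ rng.normal(0.0, 0.1, size=m) + rng.normal(0.0, 1.0, size=n)

abilities = cv_predictive_ability(Z, y, k=5, lam=100.0)
print("mean predictive ability over folds:", abilities.mean())
```

Rerunning the function on random subsets of increasing size is a simple way to examine how training population size affects predictive ability.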

