Trends of genetic changes uncovered by Env- and Eigen-GWAS in wheat and barley

Author(s):  
Rajiv Sharma ◽  
James Cockram ◽  
Keith A. Gardner ◽  
Joanne Russell ◽  
Luke Ramsay ◽  
...  

Abstract Key message Variety age and population structure detect novel QTL for yield and adaptation in wheat and barley without the need to phenotype. Abstract The process of crop breeding over the last century has delivered new varieties with increased genetic gains, resulting in higher crop performance and yield. However, in many cases, the alleles and genomic regions underpinning this success remain unknown. This is partly due to the difficulty of generating sufficient phenotypic data on large numbers of historical varieties to enable such analyses. Here we demonstrate the ability to circumvent such bottlenecks by identifying genomic regions selected over 100 years of crop breeding using age of a variety as a surrogate for yield. Rather than collecting phenotype data, we deployed ‘environmental genome-wide association scans’ (EnvGWAS) based on variety age in two of the world’s most important crops, wheat and barley, and detected strong signals of selection across both genomes. EnvGWAS identified 16 genomic regions in barley and 10 in wheat with contrasting patterns between spring and winter types of the two crops. To further examine changes in genome structure, we used the genomic relationship matrix of the genotypic data to derive eigenvectors for analysis in EigenGWAS. This detected seven major chromosomal introgressions that contributed to adaptation in wheat. EigenGWAS and EnvGWAS based on variety age avoid costly phenotyping and facilitate the identification of genomic tracts that have been under selection during breeding. Our results demonstrate the potential of using historical cultivar collections coupled with genomic data to identify chromosomal regions under selection and may help guide future plant breeding strategies to maximise the rate of genetic gain and adaptation.
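The idea of scanning on variety age can be illustrated with a deliberately naive sketch: regress year of release on allele dosage, one marker at a time. The abstract does not state the statistical model, so this is an assumption-laden illustration rather than the authors' method; a real scan would normally correct for population structure, for example with a mixed model. The function name `age_marker_slope` and the toy data are hypothetical.

```python
from statistics import mean


def age_marker_slope(dosages, release_years):
    """Ordinary least-squares slope of release year on allele dosage (0/1/2).

    Naive single-marker regression with no correction for population
    structure (an assumption made for illustration only).
    """
    if len(dosages) != len(release_years):
        raise ValueError("one release year is needed per variety")
    x_bar = mean(dosages)
    y_bar = mean(release_years)
    sxx = sum((x - x_bar) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("marker is monomorphic in this sample")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(dosages, release_years))
    return sxy / sxx


# Hypothetical toy data: four varieties, one marker.
dosages = [0, 0, 2, 2]
years = [1920, 1940, 1980, 2000]
slope = age_marker_slope(dosages, years)  # years of release per allele copy
```

A marker whose slope differs strongly from zero is one whose allele frequency has shifted with release year, which is the kind of signal a scan on variety age looks for.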

2020 ◽  
Author(s):  
Rajiv Sharma ◽  
James Cockram ◽  
Keith A. Gardner ◽  
Joanne Russell ◽  
Luke Ramsay ◽  
...  

Abstract The process of crop breeding over the last century has delivered new varieties with increased genetic gains, resulting in higher crop performance and yield. However, in many cases, the underlying alleles and genomic regions that have underpinned this success remain unknown. This is due, in part, to the difficulty in generating sufficient phenotypic data on large numbers of historical varieties to allow such analyses to be undertaken. Here we demonstrate the ability to circumvent such bottlenecks by identifying genomic regions selected over 100 years of crop breeding using the age of a variety as a surrogate for yield. Using ‘environmental genome-wide association scans’ (EnvGWAS) on variety age in two of the world’s most important crops, wheat and barley, we found strong signals of selection across the genomes of our target crops. EnvGWAS identified 16 genomic regions in barley and 10 in wheat with contrasting patterns between spring and winter types of the two crops. To further examine changes in genome structure in wheat and barley over the past century, we used the same genotypic data to derive eigenvectors for deployment in EigenGWAS. This resulted in the detection of seven major chromosomal introgressions that contributed to adaptation in wheat. The deployment of both EigenGWAS and EnvGWAS based on variety age avoids costly phenotyping and will facilitate the identification of genomic tracts that have been under selection during plant breeding in underutilized historical cultivar collections. Our results demonstrate the potential of using historical cultivar collections coupled with genomic data not only to identify chromosomal regions that have been under selection but also to guide future plant breeding strategies to maximise the rate of genetic gain and adaptation in crop improvement programs.
Significance Statement 100 years of plant breeding have greatly improved crop adaptation, resilience, and productivity. Generating the trait data required for these studies is prohibitively expensive and can be impossible on large historical traits. This study reports using variety age and eigenvectors of the genomic relationship matrix as surrogate traits in GWAS to locate the genomic regions that have undergone selection during varietal development in wheat and barley. In several cases these were confirmed as associated with yield and other selected traits. The success and the simplicity of the approach mean it can easily be extended to other crops with a recent recorded history of plant breeding and available genomic resources.


2020 ◽  
Author(s):  
Seyed Mohammad Ghoreishifar ◽  
Hossein Moradi-Shahrbabak ◽  
Mohammad Hossein Fallahi ◽  
Ali Jalil Sarghale ◽  
Mohammad Moradi-Shahrbabak ◽  
...  

Abstract Background: Consecutive homozygous fragments of a genome inherited by offspring from a common ancestor are known as runs of homozygosity (ROH). ROH can be used to calculate genomic inbreeding and to identify genomic regions that are potentially under historical selection pressure. The dataset of our study consisted of 254 Azeri (AZ) and 115 Khuzestani (KHZ) river buffalo genotyped for ~65000 SNPs for the following two purposes: 1) to estimate and compare inbreeding calculated using ROH (FROH), excess of homozygosity (FHOM), correlation between uniting gametes (FUNI), and diagonal elements of the genomic relationship matrix (FGRM); 2) to identify frequently occurring ROH (i.e. ROH islands) for our selection signature and gene enrichment studies. Results: In this study, 9102 ROH were identified, with an average number of 21.2±13.1 and 33.2±15.9 segments per animal in AZ and KHZ breeds, respectively. On average in AZ, 4.35% (108.8±120.3 Mb), and in KHZ, 5.96% (149.1±107.7 Mb) of the genome was autozygous. The estimated inbreeding values based on FHOM, FUNI and FGRM were higher in AZ than they were in KHZ, which was in contrast to the FROH estimates. We identified 11 ROH islands (four in AZ and seven in KHZ). In the KHZ breed, the genes located in ROH islands were enriched for multiple Gene Ontology (GO) terms (P≤0.05). The genes located in ROH islands were associated with diverse biological functions and traits such as body size and muscle development (BMP2), immune response (CYP27B1), milk production and components (MARS, ADRA1A, and KCTD16), coat colour and pigmentation (PMEL and MYO1A), reproductive traits (INHBC, INHBE, STAT6 and PCNA), and bone development (SUOX). Conclusion: The calculated FROH was in line with expected higher inbreeding in KHZ than in AZ because of the smaller effective population size of KHZ. Thus, we find that FROH can be used as a robust estimate of genomic inbreeding. 
Further, the majority of ROH peaks were overlapped with or in close proximity to the previously reported genomic regions with signatures of selection. This tells us that it is likely that the genes in the ROH islands have been subject to artificial or natural selection.
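As a rough illustration of the ROH-based inbreeding coefficient, FROH is commonly computed as the total length of an animal's ROH divided by the length of the autosomal genome covered by markers. The sketch below assumes that definition. The segment lengths are hypothetical, chosen to sum to the mean AZ total of 108.8 Mb, and the 2,500 Mb autosome length is an assumed value that the abstract does not report.

```python
def f_roh(roh_lengths_mb, autosome_length_mb):
    """Fraction of the autosomal genome covered by runs of homozygosity."""
    if autosome_length_mb <= 0:
        raise ValueError("autosome length must be positive")
    return sum(roh_lengths_mb) / autosome_length_mb


# Hypothetical animal: three ROH segments totalling 108.8 Mb.
segments_mb = [12.5, 30.0, 66.3]
assumed_autosome_mb = 2500.0  # assumption, not taken from the abstract
froh = f_roh(segments_mb, assumed_autosome_mb)  # about 0.0435
```

Under these assumed inputs the result is close to the 4.35% average reported for AZ, which shows how the percentage and megabase figures in the abstract relate to each other.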


2019 ◽  
Author(s):  
Seyed Mohammad Ghoreishifar ◽  
Hossein Moradi-Shahrbabak ◽  
Mohammad Hossein Fallahi ◽  
Ali Jalil-Sarghaleh ◽  
Mohammad Moradi-Shahrbabak ◽  
...  

Abstract Background: Consecutive homozygous fragments of the genome inherited by offspring from a common ancestor are known as runs of homozygosity (ROH). ROH can be used to calculate genomic inbreeding and to identify genomic regions that are potentially under historical selection pressure. The dataset of our study consisted of 254 Azeri (AZ) and 115 Khuzestani (KHZ) river buffalo genotyped for ~65000 SNPs for the following two purposes: 1) to estimate and compare inbreeding calculated using ROH (FROH), excess of homozygosity (FHOM), correlation between uniting gametes (FUNI), and diagonal elements of the genomic relationship matrix (FGRM); 2) to identify frequently occurring ROH (i.e. ROH islands) for our selection signature and gene enrichment studies. Results: In this study, 9102 ROH were identified, with an average number of 21.2±13.1 and 33.2±15.9 segments per animal in AZ and KHZ breeds, respectively. On average in AZ, 4.35% (108.8±120.3 Mb), and in KHZ, 5.96% (149.1±107.7 Mb) of the genome was autozygous. The estimated inbreeding values based on FHOM, FUNI and FGRM were higher in AZ than they were in KHZ, which was in contrast to the FROH estimates. We identified 11 ROH islands (four in AZ and seven in KHZ). In the KHZ breed, the genes located in ROH islands were enriched for multiple Gene Ontology (GO) terms (P<0.05). The genes located in ROH islands were associated with diverse biological functions and traits such as body size and muscle development (BMP2), immune response (CYP27B1), milk production and components (MARS, ADRA1A, and KCTD16), coat colour and pigmentation (PMEL and MYO1A), reproductive traits (INHBC, INHBE, STAT6 and PCNA), and bone development (SUOX). Conclusion: The calculated FROH was in line with expected higher inbreeding in KHZ than in AZ because of the smaller effective population size of KHZ. Thus, we find that FROH can be used as a robust estimate of genomic inbreeding. 
Further, the majority of ROH peaks were overlapped with or in close proximity to the previously reported genomic regions with signatures of selection. This tells us that it is likely that the genes in the ROH islands have been subject to artificial and/or natural selection.


2020 ◽  
Vol 98 (Supplement_4) ◽  
pp. 6-7
Author(s):  
Andre Garcia ◽  
Ignacio Aguilar ◽  
Andres Legarra ◽  
Stephen P Miller ◽  
Shogo Tsuruta ◽  
...  

Abstract With an ever-increasing number of genotyped animals, there is a question of whether to include all genotypes into single-step GBLUP (ssGBLUP) evaluations or to include only genotyped animals with phenotypes and use indirect predictions (IP) for the remaining young genotyped animals. Under ssGBLUP, SNP effects can be backsolved from GEBV, and IP can be calculated as the sum of SNP effects weighted by the gene content. To publish IP, a measure of accuracy that reflects the standard error of prediction, and that is comparable to GEBV accuracy, is needed. Our first objective was to test formulas to compute accuracy of IP by backsolving prediction error covariance (PEC) of GEBV into PEC of SNP effects. The second objective was to investigate the number of genotyped animals needed to obtain robust IP accuracy. Data were provided by the American Angus Association, with 38,000 post-weaning gain phenotypes and 60,000 genotyped animals. Correlations between GEBV and IP were ≥0.99. When all genotyped animals were used for PEC computations, accuracy correlations were also ≥0.99. Additionally, GEBV and IP accuracies were compatible whether G inverse was obtained by direct inversion of the genomic relationship matrix (G) or by the algorithm for proven and young (APY). As the number of genotyped animals in PEC computations decreased to 15,000, accuracy correlations were still high (≥0.96), but IP accuracies were biased downwards. Indirect prediction accuracy can be successfully obtained from ssGBLUP without running an extra SNP-BLUP evaluation to compute SNP PEC. It is possible to reduce the number of genotyped animals in PEC computations, but accuracies may be slightly underestimated. When the amount of genomic and phenotypic data is large, the polygenic part of GEBV becomes small and IP can be very accurate. Further research is needed to approximate SNP PEC with a large number of genotyped animals.
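The statement that IP is "the sum of SNP effects weighted by the gene content" can be written as a short sketch. This version assumes gene content is coded as 0, 1 or 2 copies of the counted allele and uses the raw codes. Production software often centres gene content by allele frequencies and adds a mean term, which is omitted here. The function name and toy values are hypothetical.

```python
def indirect_prediction(gene_content, snp_effects):
    """Sum of SNP effects weighted by an animal's gene content (0/1/2 per SNP)."""
    if len(gene_content) != len(snp_effects):
        raise ValueError("gene content and SNP effects must have equal length")
    return sum(z * a for z, a in zip(gene_content, snp_effects))


# Hypothetical young animal genotyped at three SNPs.
gene_content = [2, 1, 0]
snp_effects = [0.5, -0.25, 1.0]  # hypothetical backsolved SNP effects
ip = indirect_prediction(gene_content, snp_effects)
```

Because the SNP effects are computed once and reused, each additional young animal costs only one pass over its genotype, which is why IP is attractive when the number of genotyped animals keeps growing.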


2019 ◽  
Vol 97 (12) ◽  
pp. 4761-4769 ◽  
Author(s):  
Pâmela A Alexandre ◽  
Laercio R Porto-Neto ◽  
Emre Karaman ◽  
Sigrid A Lehnert ◽  
Antonio Reverter

Abstract The growing concern with the environment is making it important for livestock producers to focus on selection for efficiency-related traits, which is a challenge for commercial cattle herds due to the lack of pedigree information. To explore a cost-effective opportunity for genomic evaluations of commercial herds, this study compared the accuracy of bulls’ genomic estimated breeding values (GEBV) using different pooled genotype strategies. We used ten replicates of previously simulated genomic and phenotypic data for one low (t1) and one moderate (t2) heritability trait of 200 sires and 2,200 progeny. Sire’s GEBV were calculated using a univariate mixed model, with a hybrid genomic relationship matrix (h-GRM) relating sires to: 1) 1,100 pools of 2 animals; 2) 440 pools of 5 animals; 3) 220 pools of 10 animals; 4) 110 pools of 20 animals; 5) 88 pools of 25 animals; 6) 44 pools of 50 animals; and 7) 22 pools of 100 animals. Pooling criteria were: at random, grouped sorting by t1, grouped sorting by t2, and grouped sorting by a combination of t1 and t2. The same criteria were used to select 110, 220, 440, and 1,100 individual genotypes for GEBV calculation to compare GEBV accuracy using the same number of individual genotypes and pools. Although the best accuracy was achieved for a given trait when pools were grouped based on that same trait (t1: 0.50–0.56, t2: 0.66–0.77), pooling by one trait impacted negatively on the accuracy of GEBV for the other trait (t1: 0.25–0.46, t2: 0.29–0.71). Therefore, the combined measure may be a feasible alternative to use the same pools to calculate GEBVs for both traits (t1: 0.45–0.57, t2: 0.62–0.76). Pools of 10 individuals were identified as representing a good compromise between loss of accuracy (~10%–15%) and cost savings (~90%) from genotype assays. 
In addition, we demonstrated that in more than 90% of the simulations, pools present higher sires’ GEBV accuracy than individual genotypes when the number of genotype assays is limited (i.e., 110 or 220) and animals are assigned to pools based on phenotype. Pools assigned at random presented the poorest results (t1: 0.07–0.45, t2: 0.14–0.70). In conclusion, pooling by phenotype is the best approach to implementing genomic evaluation using commercial herd data, particularly when pools of 10 individuals are evaluated. While combining phenotypes seems a promising strategy to allow more flexibility to the estimates made using pools, more studies are necessary in this regard.
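The "grouped sorting" pooling criterion can be sketched as ranking animals by a phenotype and cutting the ranked list into consecutive pools of equal size. The code below assumes that reading of the abstract. The phenotypes are randomly generated placeholders, and `pools_by_phenotype` is a hypothetical helper name.

```python
import random


def pools_by_phenotype(phenotypes, pool_size):
    """Rank animals by phenotype and split the ranking into consecutive pools."""
    if pool_size < 1:
        raise ValueError("pool size must be at least 1")
    ranked = sorted(phenotypes, key=phenotypes.get)
    return [ranked[i:i + pool_size] for i in range(0, len(ranked), pool_size)]


# Placeholder phenotypes for 2,200 progeny (simulated here, not the study data).
rng = random.Random(0)
phenotypes = {f"animal_{i}": rng.gauss(0.0, 1.0) for i in range(2200)}
pools = pools_by_phenotype(phenotypes, 10)  # 220 pools of 10 animals
```

With 2,200 progeny and a pool size of 10 this yields the 220 pools mentioned in the abstract, each containing animals with similar phenotypes.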


2021 ◽  
Vol 53 (1) ◽  
Author(s):  
Theo Meuwissen ◽  
Irene van den Berg ◽  
Mike Goddard

Abstract Background Whole-genome sequence (WGS) data are increasingly available on large numbers of individuals in animal and plant breeding and in human genetics through second-generation resequencing technologies, 1000 genomes projects, and large-scale genotype imputation from lower marker densities. Here, we present a computationally fast implementation of a variable selection genomic prediction method that can handle WGS data on more than 35,000 individuals, test its accuracy for across-breed predictions, and assess its quantitative trait locus (QTL) mapping precision. Methods The Monte Carlo Markov chain (MCMC) variable selection model (Bayes GC) fits simultaneously a genomic best linear unbiased prediction (GBLUP) term, i.e. a polygenic effect whose correlations are described by a genomic relationship matrix (G), and a Bayes C term, i.e. a set of single nucleotide polymorphisms (SNPs) with large effects selected by the model. Computational speed is improved by a Metropolis–Hastings sampling that directs computations to the SNPs that are, a priori, most likely to be included in the model. Speed is also improved by running many relatively short MCMC chains. Memory requirements are reduced by storing the genotype matrix in binary form. The model was tested on a WGS dataset containing Holstein, Jersey and Australian Red cattle. The data contained 4,809,520 genotypes on 35,549 individuals together with their milk, fat and protein yields, and fat and protein percentage traits. Results The prediction accuracies of the Jersey individuals improved by 1.5% when using across-breed GBLUP compared to within-breed predictions. Using WGS instead of 600 k SNP-chip data yielded on average a 3% accuracy improvement for Australian Red cows. QTL were fine-mapped by locating the SNP with the highest posterior probability of being included in the model. 
Various QTL known from the literature were rediscovered, and a new SNP affecting milk production was discovered on chromosome 20 at 34.501126 Mb. Due to the high mapping precision, it was clear that many of the discovered QTL were the same across the five dairy traits. Conclusions Across-breed Bayes GC genomic prediction improved prediction accuracies compared to GBLUP. The combination of across-breed WGS data and Bayesian genomic prediction proved remarkably effective for the fine-mapping of QTL.


2011 ◽  
Vol 93 (3) ◽  
pp. 203-219 ◽  
Author(s):  
KATHRYN E. KEMPER ◽  
DAVID L. EMERY ◽  
STEPHEN C. BISHOP ◽  
HUTTON ODDY ◽  
BENJAMIN J. HAYES ◽  
...  

Summary Genetic resistance to gastrointestinal worms is a complex trait of great importance in both livestock and humans. In order to gain insights into the genetic architecture of this trait, a mixed breed population of sheep was artificially infected with Trichostrongylus colubriformis (n=3326) and then Haemonchus contortus (n=2669) to measure faecal worm egg count (WEC). The population was genotyped with the Illumina OvineSNP50 BeadChip and 48 640 single nucleotide polymorphism (SNP) markers passed the quality controls. An independent population of 316 sires of mixed breeds with accurate estimated breeding values for WEC were genotyped for the same SNP to assess the results obtained from the first population. We used principal components from the genomic relationship matrix among genotyped individuals to account for population stratification, and a novel approach to directly account for the sampling error associated with each SNP marker regression. The largest marker effects were estimated to explain an average of 0·48% (T. colubriformis) or 0·08% (H. contortus) of the phenotypic variance in WEC. These effects are small but consistent with results from other complex traits. We also demonstrated that methods which use all markers simultaneously can successfully predict genetic merit for resistance to worms, despite the small effects of individual markers. Correlations of genomic predictions with breeding values of the industry sires reached a maximum of 0·32. We estimate that effective across-breed predictions of genetic merit with multi-breed populations will require an average marker spacing of approximately 10 kbp.

