34 Dimensionality of Genomic Information and Its Impact on GWA and Variant Selection: A Simulation Study

2021 ◽  
Vol 99 (Supplement_3) ◽  
pp. 20-20
Author(s):  
Sungbong Jang ◽  
Shogo Tsuruta ◽  
Natalia Leite ◽  
Ignacy Misztal ◽  
Daniela Lourenco

Abstract The ability to identify true-positive variants increases as more genotyped animals are available. Although thousands of animals can be genotyped, the dimensionality of the genomic information is limited. Therefore, there is a certain number of animals that represent all chromosome segments (Me) segregating in the population. The number of Me can be approximated from the eigenvalue decomposition of the genomic relationship matrix (G). Thus, the limited dimensionality may help to identify the number of animals to be used in genome-wide association (GWA). The first objective of this study was to examine different discovery set sizes for GWA, with set sizes based on the number of largest eigenvalues explaining a certain proportion of variance in G. Additionally, we investigated the impact of incorporating variants selected from different set sizes into the regular SNP chip used for genomic prediction. The simulated sequence data contained 500k SNP and 2k QTL, and the genetic variance was fully explained by the QTL. The GWA was conducted using a number of genotyped animals equal to the number of largest eigenvalues of G (EIG) explaining 50, 60, 70, 80, 90, 95, 98, and 99 percent of the variance in G. Significant SNP had a p-value lower than 0.05 with Bonferroni correction. Further, SNP with the largest effect sizes (top 10, 100, 500, 1k, 2k, and 4k) were also selected to be incorporated into the regular 50k chip. Genomic predictions using the 50k chip combined with the selected SNP were conducted using single-step GBLUP (ssGBLUP). Using the number of animals corresponding to at least EIG98 enabled the identification of the largest effect size QTL. The greatest accuracy of prediction was obtained when the top 2k SNP were combined with the 50k chip. The dimensionality of genomic information should be taken into account for variant selection in GWAS.
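The discovery set sizes described above depend on counting how many of the largest eigenvalues of G are needed to explain a given share of its variance. A minimal sketch of that count follows, assuming NumPy and a toy G built from randomly generated genotypes; the function name `n_eigenvalues_for_variance`, the centering and scaling of G, and the toy dimensions are illustrative assumptions, not the authors' simulation.

```python
import numpy as np


def n_eigenvalues_for_variance(G, proportion):
    """Smallest number of largest eigenvalues of G whose sum reaches
    `proportion` of the total (hypothetical helper, not from the abstract)."""
    eigvals = np.linalg.eigvalsh(G)[::-1]       # eigenvalues, largest first
    eigvals = np.clip(eigvals, 0.0, None)       # guard against negative round-off
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    idx = int(np.searchsorted(cumulative, proportion, side="left"))
    return min(idx + 1, len(cumulative))


# Toy genotypes (assumption): 200 animals, 1000 SNP coded 0/1/2.
rng = np.random.default_rng(1)
M = rng.binomial(2, 0.3, size=(200, 1000)).astype(float)
p = M.mean(axis=0) / 2.0                        # observed allele frequencies
Z = M - 2.0 * p                                 # centered gene content
G = Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))     # one common way to scale G

# Number of eigenvalues needed for a few of the proportions in the abstract.
counts = {prop: n_eigenvalues_for_variance(G, prop) for prop in (0.5, 0.9, 0.98)}
print(counts)
```

With unrelated toy animals the counts come out close to the number of animals; in a real population with limited effective size they would be expected to be far smaller than the number of genotyped animals, which is the point the abstract makes.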

2019 ◽  
Vol 51 (1) ◽  
Author(s):  
Ivan Pocrnic ◽  
Daniela A. L. Lourenco ◽  
Yutaka Masuda ◽  
Ignacy Misztal

Abstract Background The dimensionality of genomic information is limited by the number of independent chromosome segments (Me), which is a function of the effective population size. This dimensionality can be determined approximately by singular value decomposition of the gene content matrix, by eigenvalue decomposition of the genomic relationship matrix (GRM), or by the number of core animals in the algorithm for proven and young (APY) that maximizes the accuracy of genomic prediction. In the latter, core animals act as proxies to linear combinations of Me. Field studies indicate that a moderate accuracy of genomic selection is achieved with a small dataset, but that further improvement of the accuracy requires much more data. When only one quarter of the optimal number of core animals are used in the APY algorithm, the accuracy of genomic selection is only slightly below the optimal value. This suggests that genomic selection works on clusters of Me. Results The simulation included datasets with different population sizes and amounts of phenotypic information. Computations were done by genomic best linear unbiased prediction (GBLUP) with selected eigenvalues and corresponding eigenvectors of the GRM set to zero. About four eigenvalues in the GRM explained 10% of the genomic variation, and less than 2% of the total eigenvalues explained 50% of the genomic variation. With limited phenotypic information, the accuracy of GBLUP was close to the peak where most of the smallest eigenvalues were set to zero. With a large amount of phenotypic information, accuracy increased as smaller eigenvalues were added. Conclusions A small amount of phenotypic data is sufficient to estimate only the effects of the largest eigenvalues and the associated eigenvectors that contain a large fraction of the genomic information, and a very large amount of data is required to estimate the remaining eigenvalues that account for a limited amount of genomic information. Core animals in the APY algorithm act as proxies of almost the same number of eigenvalues. By using an eigenvalues-based approach, it was possible to explain why the moderate accuracy of genomic selection based on small datasets only increases slowly as more data are added.
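Setting selected eigenvalues and eigenvectors of the GRM to zero, as in the Results above, amounts to rebuilding the matrix from a subset of its spectrum. The sketch below shows only that reconstruction step, assuming NumPy; `truncate_grm` and the toy matrix are hypothetical, and the GBLUP step that the study ran on top of the truncated matrix is not reproduced here.

```python
import numpy as np


def truncate_grm(G, n_keep):
    """Rebuild G from its n_keep largest eigenvalues; all smaller ones
    are set to zero (hypothetical helper for illustration)."""
    eigvals, eigvecs = np.linalg.eigh(G)        # eigenvalues in ascending order
    eigvals = eigvals.copy()
    eigvals[: len(eigvals) - n_keep] = 0.0      # zero out the smallest eigenvalues
    return (eigvecs * eigvals) @ eigvecs.T      # U diag(lambda) U'


# Toy positive-definite "GRM" (assumption): 6 animals.
rng = np.random.default_rng(2)
A = rng.normal(size=(6, 20))
G = A @ A.T / 20.0

G3 = truncate_grm(G, 3)
top3 = np.sort(np.linalg.eigvalsh(G))[-3:]
print("share of variation kept:", top3.sum() / np.trace(G))
```

The truncated matrix is singular by construction, so any downstream prediction would need a formulation that does not invert it directly.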


2018 ◽  
Vol 53 (6) ◽  
pp. 717-726 ◽  
Author(s):  
Michel Marques Farah ◽  
Marina Rufino Salinas Fortes ◽  
Matthew Kelly ◽  
Laercio Ribeiro Porto-Neto ◽  
Camila Tangari Meira ◽  
...  

Abstract: The objective of this work was to evaluate the effects of genomic information on the genetic evaluation of hip height in Brahman cattle using different matrices built from genomic and pedigree data. Hip height measurements from 1,695 animals, genotyped with high-density SNP chip or imputed from 50 K high-density SNP chip, were used. The numerator relationship matrix (NRM) was compared with the H matrix, which incorporated the NRM and genomic relationship (G) matrix simultaneously. The genotypes were used to estimate three versions of G: observed allele frequency (HGOF), average minor allele frequency (HGMF), and frequency of 0.5 for all markers (HG50). For matrix comparisons, animal data were either used in full or divided into calibration (80% older animals) and validation (20% younger animals) datasets. The accuracy values for the NRM, HGOF, and HG50 were 0.776, 0.813, and 0.594, respectively. The NRM and HGOF showed similar minor variances for diagonal and off-diagonal elements, as well as for estimated breeding values. The use of genomic information resulted in relationship estimates similar to those obtained based on pedigree; however, HGOF is the best option for estimating the genomic relationship matrix and results in a higher prediction accuracy. The ranking of the top 20% animals was very similar for all matrices, but the ranking within them varies depending on the method used.
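The three versions of G compared above differ only in the allele frequencies used to center and scale the genotypes. A minimal sketch follows, assuming NumPy, genotypes coded 0/1/2, and the widely used form G = ZZ'/(2 Σ p(1 − p)) with Z = M − 2p; the function `grm`, the toy data, and the reading of "average minor allele frequency" as one mean value applied to every marker are assumptions, not details given in the abstract.

```python
import numpy as np


def grm(M, p):
    """G = ZZ' / (2 * sum(p * (1 - p))), Z = M - 2p, genotypes coded 0/1/2.
    `p` may be a per-marker vector or one scalar used for all markers."""
    p = np.broadcast_to(np.asarray(p, dtype=float), (M.shape[1],))
    Z = M - 2.0 * p
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))


# Toy genotypes (assumption): 50 animals, 500 markers.
rng = np.random.default_rng(0)
true_p = rng.uniform(0.05, 0.95, size=500)
M = rng.binomial(2, true_p, size=(50, 500)).astype(float)

p_obs = M.mean(axis=0) / 2.0                    # observed allele frequencies
maf = np.minimum(p_obs, 1.0 - p_obs)            # minor allele frequencies

G_obs = grm(M, p_obs)                           # analogous to HGOF
G_maf = grm(M, maf.mean())                      # analogous to HGMF (assumed reading)
G_50 = grm(M, 0.5)                              # analogous to HG50

for name, G in (("observed", G_obs), ("mean MAF", G_maf), ("0.5", G_50)):
    print(name, "mean diagonal:", np.diag(G).mean())
```

Only the matrix built with observed frequencies is centered so that each row sums to zero; the other two shift and rescale every relationship, which is one plausible reason for the accuracy differences reported above.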


2019 ◽  
Vol 48 (Supplement_3) ◽  
pp. iii17-iii65
Author(s):  
Louise Marron ◽  
Ricardo Segurado ◽  
Paul Claffey ◽  
Rose Anne Kenny ◽  
Triona McNicholas

Abstract Background Benzodiazepines (BZD) are associated with adverse effects, particularly in older adults. Previous research has shown an association between BZDs and falls, and BZDs have been shown to impact sleep quality. The aim of this study is to assess the association between BZD use and falls, and the impact of sleep quality on this association, in community dwelling adults aged over 50. Methods Data from the first wave of The Irish Longitudinal Study on Ageing were used. Participants were classed as BZD users or non-users and asked if they had fallen in the last year, and whether any of these falls were unexplained. Sleep quality was assessed via self-reported trouble falling asleep, daytime somnolence, and early-rising. Logistic regression assessed for an association between BZD use and falls, and the impact of sleep quality on this association was assessed by categorising based on BZD use and each sleep quality variable. Results Of 8,175 individuals, 302 (3.69%) reported taking BZDs. BZD use was associated with falls, controlling for confounders (OR 1.40; 95% CI 1.08, 1.82; p-value 0.012). There was no significant association between BZDs and unexplained falls, controlling for confounders (OR 1.41; 95% CI 0.95, 2.10; p-value 0.09); however, a similar effect size to all falls was evident. Participants who take BZDs and report daytime somnolence (OR 1.93; 95% CI 1.12, 3.31; p-value 0.017), early-rising (OR 1.93; 95% CI 1.20, 3.11; p-value 0.007) or trouble falling asleep (OR 1.83; 95% CI 1.12, 2.97; p-value 0.015) have increased odds of unexplained falls. Conclusion BZD use is associated with falls, with larger effect size in BZD users reporting poor sleep quality in community dwelling older adults. Appropriate prescription of and regular review of medications such as BZDs is an important public health issue.


Genes ◽  
2020 ◽  
Vol 11 (7) ◽  
pp. 790 ◽  
Author(s):  
Daniela Lourenco ◽  
Andres Legarra ◽  
Shogo Tsuruta ◽  
Yutaka Masuda ◽  
Ignacio Aguilar ◽  
...  

Single-step genomic evaluation has become a standard procedure in livestock breeding, mainly because it can combine all available pedigree, phenotypes, and genotypes into one single evaluation, without the need for post-analysis processing. Therefore, the incorporation of data on genotyped and non-genotyped animals in this method is straightforward. Since 2009, two main implementations of single-step have been proposed. One is called single-step genomic best linear unbiased prediction (ssGBLUP) and uses single nucleotide polymorphisms (SNP) to construct the genomic relationship matrix; the other is the single-step Bayesian regression (ssBR), which is a marker effect model. Under the same assumptions, both models are equivalent. In this review, we focus solely on ssGBLUP. The implementation of ssGBLUP into the BLUPF90 software suite was done in 2009, and since then, several changes were made to make ssGBLUP flexible to any model, number of traits, number of phenotypes, and number of genotyped animals. Single-step GBLUP from the BLUPF90 software suite has been used for genomic evaluations worldwide. In this review, we will show theoretical developments and numerical examples of ssGBLUP using SNP data from regular chips to sequence data.


2020 ◽  
Vol 98 (11) ◽  
Author(s):  
Sabrina T Amorim ◽  
Haipeng Yu ◽  
Mehdi Momen ◽  
Lúcia Galvão de Albuquerque ◽  
Angélica S Cravo Pereira ◽  
...  

Abstract An important criterion to consider in genetic evaluations is the extent of genetic connectedness across management units (MU), especially if they differ in their genetic mean. Reliable comparisons of genetic values across MU depend on the degree of connectedness: the higher the connectedness, the more reliable the comparison. Traditionally, genetic connectedness was calculated through pedigree-based methods; however, in the era of genomic selection, this can be better estimated utilizing new approaches based on genomics. Most procedures consider only additive genetic effects, which may not accurately reflect the underlying gene action of the evaluated trait, and little is known about the impact of non-additive gene action on connectedness measures. The objective of this study was to investigate the extent of genomic connectedness measures, for the first time, in Brazilian field data by applying additive and non-additive relationship matrices using a fatty acid profile data set from seven farms located in three regions of Brazil that are part of three breeding programs. Myristic acid (C14:0) was used due to its importance for human health and the reported presence of non-additive gene action. The pedigree included 427,740 animals, and 925 of them were genotyped using the Bovine high-density genotyping chip. Six relationship matrices were constructed, parametrically and non-parametrically capturing additive and non-additive genetic effects from both pedigree and genomic data. We assessed genome-based connectedness across MU using the prediction error variance of difference (PEVD) and the coefficient of determination (CD). PEVD values ranged from 0.540 to 1.707, and CD from 0.146 to 0.456. Genomic information consistently enhanced the measures of connectedness compared to the numerator relationship matrix by at least 63%. Combining additive and non-additive genomic kernel relationship matrices or a non-parametric relationship matrix increased the capture of connectedness. Overall, the Gaussian kernel yielded the largest measure of connectedness. Our findings showed that connectedness metrics can be extended to incorporate genomic information and non-additive genetic variation using field data. We propose that different genomic relationship matrices can be designed to capture additive and non-additive genetic effects, increase the measures of connectedness, and more accurately estimate the true state of connectedness in herds.


2019 ◽  
Vol 97 (Supplement_3) ◽  
pp. 50-50
Author(s):  
Daniela Lourenco ◽  
Shogo Tsuruta ◽  
Ivan Pocrnic ◽  
Ignacy Misztal

Abstract Large-scale single-step GBLUP (ssGBLUP) evaluations rely on techniques to approximate or avoid the inversion of the genomic relationship matrix (G). The algorithm for proven and young (APY) was developed to create the inverse of G without explicit inversion, and relies on the clustering of genotyped animals into two groups, namely core and non-core. Although the correlation between GEBV from regular ssGBLUP and APY ssGBLUP is greater than 0.99 when the appropriate number of core animals is used, reranking is still observed when different core groups are used. We investigated which animals are more susceptible to reranking and how the changes in GEBV can be minimized. Datasets from beef and dairy cattle, and pigs were used. The beef cattle data comprised phenotypes on 3 growth traits for up to 6.8M animals, pedigree for 8.2M, and genotypes for 66k. The dairy cattle data comprised 9M phenotypes for udder depth, 10M animals in pedigree, and 570k genotyped animals. The pig dataset had up to 770k phenotypes recorded on 4 traits, pedigree for 2.6M animals and genotypes for 54k. Investigations included using several different core groups, increasing the number of core animals beyond the optimal number obtained by the eigenvalue decomposition, and comparisons with GEBV from ssGBLUP with direct inversion (except for dairy). Additionally, observed changes were compared with possible changes based on SE of GEBV. In all datasets, larger changes in GEBV by using different core groups were observed for animals with lower accuracy. The observed changes relative to standard deviations of GEBV were, on average, 5% and ranged from 0 to 30%. Increasing the number of core animals beyond the optimal value helped to asymptotically reduce changes in GEBV. Although core-dependent changes in GEBV exist, they are small and can be reduced with larger core groups.
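The core and non-core split described above can be illustrated with the commonly cited APY form of the inverse, in which only the core block of G is inverted directly and the non-core animals contribute a diagonal matrix. The sketch below assumes NumPy and a small dense toy G; `apy_inverse` is a hypothetical helper written for illustration, not code from BLUPF90 or from the study, and it ignores the sparse storage that makes APY practical at scale.

```python
import numpy as np


def apy_inverse(G, core):
    """APY-style inverse of G given the indices of the core animals.
    Only G[core, core] is inverted; non-core animals enter through a diagonal."""
    n = G.shape[0]
    core = np.asarray(core)
    noncore = np.setdiff1d(np.arange(n), core)
    Gcc_inv = np.linalg.inv(G[np.ix_(core, core)])
    Gcn = G[np.ix_(core, noncore)]
    P = Gcc_inv @ Gcn                              # Gcc^-1 Gcn
    # Diagonal of Gnn - Gnc Gcc^-1 Gcn, one element per non-core animal.
    m = G[noncore, noncore] - np.einsum("ij,ij->j", Gcn, P)
    m_inv = 1.0 / m

    Ginv = np.zeros((n, n))
    Ginv[np.ix_(core, core)] = Gcc_inv + (P * m_inv) @ P.T
    Ginv[np.ix_(core, noncore)] = -P * m_inv
    Ginv[np.ix_(noncore, core)] = (-P * m_inv).T
    Ginv[noncore, noncore] = m_inv                 # diagonal non-core block
    return Ginv


# Toy positive-definite G (assumption): 8 animals.
rng = np.random.default_rng(3)
A = rng.normal(size=(8, 40))
G = A @ A.T / 40.0 + 0.01 * np.eye(8)

Ginv_a = apy_inverse(G, core=[0, 2, 4, 6])
Ginv_b = apy_inverse(G, core=[1, 3, 5, 7])
print("max difference between core choices:", np.abs(Ginv_a - Ginv_b).max())
```

Different core groups give different approximate inverses in this toy case, which is the mechanism behind the core-dependent changes in GEBV discussed above.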


2020 ◽  
Vol 98 (Supplement_4) ◽  
pp. 31-32
Author(s):  
Ignacy Misztal

Abstract Genetic parameters are important in animal breeding for many tasks, including as input to a model for genetic evaluation, to estimate genetic gain due to selection, and to estimate correlated response due to selection on major traits. Before the genomic era, parameter estimation was facilitated by the sparse structure of mixed model equations. Methods such as AI REML with sparse matrix inversion or MCMC via Gibbs sampling could estimate parameters for populations exceeding 1 million animals. With genomic selection (GS) and single-step GBLUP, the genomic matrices are mostly dense, and the costs of parameter estimation have increased dramatically. The estimation with 20K genotyped animals can take many days. Details in matching pedigree and genomic information influence estimated parameters. Estimation without the genomic information when GS is practiced leads to biases due to genomic preselection. Truncating data to too few generations or to only genotyped animals leads to additional biases by excluding data on which the selection was practiced. Current studies indicate strong declines in heritability due to GS. Regular models for parameter estimation compute parameters only for the base population. Models that trace changes of parameters over time, such as a random regression model on year of birth or a multiple trait model treating time slices as separate traits, are very expensive. A good compromise in parameter estimation under GS is to use slices of only 2–3 generations, with genotypes of young animals removed. When complete populations are genotyped, estimations with a large number of genotyped animals are possible either with a SNP model or with GBLUP (inversion of the genomic relationship matrix by the APY algorithm). For simple models, Method R can provide estimates for any data size. An indirect indication of changing parameters over time is reduced predictivity or lower genetic trend despite increased data. Parameter estimation in GS would benefit from new, efficient tools.


2020 ◽  
Vol 98 (Supplement_4) ◽  
pp. 32-33
Author(s):  
Daniela Lourenco ◽  
Shogo Tsuruta ◽  
Jorge Hidalgo ◽  
Ignacy Misztal

Abstract Under traditional BLUP, animals without new data in subsequent evaluations had stable EBV even when their precision was low. However, changes in EBV could still be observed for some animals when fixed effects were redefined, independently of accuracy levels. Provided that no model changes were made, the high stability of BLUP generated great confidence in the method. With the adoption of genomic evaluations by the livestock industry and the accumulation of genotypes, changes in genomic EBV (GEBV) in subsequent evaluations are being frequently reported; this is true even for animals that have no added phenotypes. Consequently, there are questions about why such changes happen. The main cause is the increase in connectivity among animals. As genomic relationships consider alleles that are identical by state, all genotyped animals can have some level of relationship, even though they do not share common ancestors in the current population. In this way, phenotypes added for a portion of the genotyped animals in the evaluation will be shared among nearly all genotyped animals. Conversely, in BLUP, those phenotypes are shared only with relatives in the pedigree; therefore, fewer changes are observed. Besides the increased connectivity, changes in blending and scaling of the genomic relationship matrix can cause changes in subsequent GEBV. Using real livestock data, we observed average changes of 0.1 additive genetic standard deviation (SDa) when newly recorded phenotypes were added to the genomic evaluation system; however, outliers had changes as high as 1 SDa, which agrees with the normal distribution theory. Although changes under genomic evaluations are more frequent than in BLUP, the observed fluctuations are small compared to the possible changes based on prediction error variance. Making selection decisions based on groups of animals with high average accuracy instead of individual animals may help to offset the impact of fluctuations in genomic evaluations.


2016 ◽  
Vol 94 (suppl_5) ◽  
pp. 138-139
Author(s):  
I. Pocrnic ◽  
D. A. L. Lourenco ◽  
Y. Masuda ◽  
A. Legarra ◽  
I. Misztal

GIS Business ◽  
2019 ◽  
Vol 14 (4) ◽  
pp. 122-129
Author(s):  
Monika Bansal ◽  
Sh. Lbs Arya Mahila

Youth mentoring is the process of matching mentors with young people who need or want a caring, responsible adult in their lives. It is defined as an ongoing relationship between a caring adult and a young person, which is required for the self-development, professional growth, and career development of both mentee and mentor, and which must be placed within a specific institutional context. The purpose of this article is to quantitatively review the three major areas of mentoring research (youth, academic, and workplace) to determine the overall effect size associated with mentoring outcomes for students.

