Improving the Efficiency of Genomic Selection in Chinese Simmental beef cattle

Genomic selection is an accurate and efficient method of estimating genetic merit using high-density genome-wide single nucleotide polymorphisms (SNPs). In this study, we investigate an approach to increase the efficiency of genomic prediction using genome-wide markers. The approach is a feature selection based on genomic best linear unbiased prediction (GBLUP), a statistical method that predicts breeding values from SNPs for selection in animal and plant breeding. The first objective of this study is the choice of kinship matrix for GBLUP; the G-matrix uses the information of genome-wide dense markers. We compare three kinds of kinship based on different combinations of centring and scaling of marker genotypes, to find a kinship approach suited to the resource population of Chinese Simmental beef cattle. SNPs can be used to estimate the kinship matrix and individual inbreeding coefficients more accurately. A genomic relationship matrix was therefore developed for 1059 Chinese Simmental beef cattle using 640,000 SNPs, and breeding values were estimated using phenotypes for carcass weight and sirloin weight. The number of SNPs needed to accurately estimate a genomic relationship matrix was evaluated in this population. Another aim of this study was to optimize the selection of markers and determine the number of SNPs required to estimate kinship in Chinese Simmental beef cattle. We find that feature selection in GBLUP using Xu’s kinship and Astle and Balding’s kinship performed similarly well, and these were the best-performing methods in our study. Inbreeding and the kinship matrix can be estimated with high accuracy using ≥12,000 SNPs in Chinese Simmental beef cattle.
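The abstract does not give the formulas for its three kinship matrices, so the following is only a minimal sketch of how different combinations of centring and scaling of marker genotypes lead to different kinship estimates. The function name `kinship`, its parameters, and the 0/1/2 genotype coding are assumptions made for illustration. The centred, per-marker-scaled variant follows the general form usually attributed to Astle and Balding, and the centred, globally scaled variant follows the widely used VanRaden form. Neither is claimed to be exactly what the study computed, and no variant here is claimed to be Xu’s kinship.

```python
import numpy as np

def kinship(genotypes, centre=True, scale="per_marker"):
    """Kinship from an (n_animals, n_markers) array of 0/1/2 allele counts.

    centre: subtract twice the sample allele frequency from each marker.
    scale:  "per_marker" divides each marker by sqrt(2p(1-p)),
            "global" divides the cross-product by sum(2p(1-p)),
            "none" only divides by the number of markers.
    """
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0                 # sample allele frequency per marker
    keep = (p > 0.0) & (p < 1.0)             # monomorphic markers carry no information
    M, p = M[:, keep], p[keep]
    Z = M - 2.0 * p if centre else M
    var = 2.0 * p * (1.0 - p)                # expected genotype variance per marker
    if scale == "per_marker":
        Z = Z / np.sqrt(var)
        return Z @ Z.T / Z.shape[1]
    if scale == "global":
        return Z @ Z.T / var.sum()
    if scale == "none":
        return Z @ Z.T / Z.shape[1]
    raise ValueError(f"unknown scale option: {scale!r}")

# Hypothetical toy data: 6 animals, 50 markers.
rng = np.random.default_rng(0)
G = rng.integers(0, 3, size=(6, 50))
K_per_marker = kinship(G, centre=True, scale="per_marker")
K_global = kinship(G, centre=True, scale="global")
K_raw = kinship(G, centre=False, scale="none")
```

Per-marker scaling gives rare alleles more influence on the kinship estimate, whereas global scaling treats all markers alike. That difference is one reason the choice can matter for a particular population.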


Genomic Prediction Using Alternative Strategies of Weighted Single-Step Genomic BLUP for Yearling Weight and Carcass Traits in Hanwoo Beef Cattle

Genes ◽

10.3390/genes12020266 ◽

2021 ◽

Vol 12 (2) ◽

pp. 266

Author(s):

Hossein Mehrban ◽

Masoumeh Naserkheil ◽

Deuk Hwan Lee ◽

Chungil Cho ◽

Taejeong Choi ◽

...

Keyword(s):

Quantitative Trait Loci ◽

Beef Cattle ◽

Genomic Prediction ◽

Quantitative Trait ◽

Carcass Traits ◽

Best Linear Unbiased Prediction ◽

Single Step ◽

Linear Unbiased Prediction ◽

Single Nucleotide ◽

Best Linear Unbiased

The weighted single-step genomic best linear unbiased prediction (GBLUP) method has been proposed to exploit information from genotyped and non-genotyped relatives, allowing the use of weights for single-nucleotide polymorphisms in the construction of the genomic relationship matrix. The purpose of this study was to investigate the accuracy of genetic prediction using the following single-trait best linear unbiased prediction methods in Hanwoo beef cattle: pedigree-based (PBLUP), un-weighted (ssGBLUP), and weighted (WssGBLUP) single-step genomic methods. We also assessed the impact of alternative single and window weighting methods according to their effects on the traits of interest. The data comprised 15,796 phenotypic records for yearling weight (YW) and 5622 records for carcass traits (backfat thickness: BFT, carcass weight: CW, eye muscle area: EMA, and marbling score: MS). The genotypic data included 6616 animals for YW and 5134 for carcass traits, genotyped for 43,950 single-nucleotide polymorphisms. The ssGBLUP showed significant improvement in genomic prediction accuracy for carcass traits (71%) and yearling weight (99%) compared to the pedigree-based method. The window weighting procedures performed better than single SNP weighting for CW (11%), EMA (11%), MS (3%), and YW (6%), whereas no gain in accuracy was observed for BFT. In addition, the improvement in accuracy of window WssGBLUP over the un-weighted method was low for BFT and MS, whereas for CW, EMA, and YW it was 22%, 15%, and 20%, respectively, which indicates the presence of relevant quantitative trait loci for these traits. These findings indicate that WssGBLUP is an appropriate method for traits with a large quantitative trait loci effect.


Genome-Wide Association Study of Body Weight Traits in Chinese Fine-Wool Sheep

Animals ◽

10.3390/ani10010170 ◽

2020 ◽

Vol 10 (1) ◽

pp. 170 ◽

Cited By ~ 2

Author(s):

Zengkui Lu ◽

Yaojing Yue ◽

Chao Yuan ◽

Jianbin Liu ◽

Zhiqiang Chen ◽

...

Keyword(s):

Body Weight ◽

Muscle Development ◽

Genome Wide Association Study ◽

Nucleotide Polymorphisms ◽

Single Nucleotide ◽

Genome Wide ◽

Economic Trait ◽

Significance Levels ◽

Genomic Regions ◽

Selection Of

Body weight is an important economic trait for sheep and it is vital for their successful production and breeding. Therefore, identifying the genomic regions and biological pathways that contribute to understanding variability in body weight traits is significant for selection purposes. In this study, the genome-wide associations of birth, weaning, yearling, and adult weights of 460 fine-wool sheep were determined using resequencing technology. The results showed that 113 single nucleotide polymorphisms (SNPs) reached the genome-wide significance levels for the four body weight traits and 30 genes were annotated effectively, including AADACL3, VGF, NPC1, and SERPINA12. The genes annotated by these SNPs significantly enriched 78 gene ontology terms and 25 signaling pathways, and were found to mainly participate in skeletal muscle development and lipid metabolism. These genes can be used as candidate genes for body weight in sheep, and provide useful information for the production and genomic selection of Chinese fine-wool sheep.


Indirect predictions with a large number of genotyped animals using the algorithm for proven and young

Journal of Animal Science ◽

10.1093/jas/skaa154 ◽

2020 ◽

Vol 98 (6) ◽

Cited By ~ 1

Author(s):

Andre L S Garcia ◽

Yutaka Masuda ◽

Shogo Tsuruta ◽

Stephen Miller ◽

Ignacy Misztal ◽

...

Keyword(s):

Random Sample ◽

Best Linear Unbiased Prediction ◽

Single Step ◽

Nucleotide Polymorphisms ◽

Linear Unbiased Prediction ◽

Single Nucleotide ◽

Large Numbers ◽

Minimum Number ◽

Best Linear Unbiased ◽

Estimated Breeding Values

Reliable single-nucleotide polymorphism (SNP) effects from genomic best linear unbiased prediction (GBLUP) and single-step GBLUP (ssGBLUP) are needed to calculate indirect predictions (IP) for young genotyped animals and animals not included in official evaluations. Obtaining reliable SNP effects and IP requires a minimum number of animals and when a large number of genotyped animals are available, the algorithm for proven and young (APY) may be needed. Thus, the objectives of this study were to evaluate IP with an increasingly larger number of genotyped animals and to determine the minimum number of animals needed to compute reliable SNP effects and IP. Genotypes and phenotypes for birth weight, weaning weight, and postweaning gain were provided by the American Angus Association. The number of animals with phenotypes was more than 3.8 million. Genotyped animals were assigned to three cumulative year-classes: born until 2013 (N = 114,937), born until 2014 (N = 183,847), and born until 2015 (N = 280,506). A three-trait model was fitted using the APY algorithm with 19,021 core animals under two scenarios: 1) core 2013 (random sample of animals born until 2013) used for all year-classes and 2) core 2014 (random sample of animals born until 2014) used for year-class 2014 and core 2015 (random sample of animals born until 2015) used for year-class 2015. GBLUP used phenotypes from genotyped animals only, whereas ssGBLUP used all available phenotypes. SNP effects were predicted using genomic estimated breeding values (GEBV) from either all genotyped animals or only core animals. The correlations between GEBV from GBLUP and IP obtained using SNP effects from core 2013 were ≥0.99 for animals born in 2013 but as low as 0.07 for animals born in 2014 and 2015. Conversely, the correlations between GEBV from ssGBLUP and IP were ≥0.99 for animals born in all years. IP predictive abilities computed with GEBV from ssGBLUP and SNP predictions based on only core animals were as high as those based on all genotyped animals. The correlations between GEBV and IP from ssGBLUP were ≥0.76, ≥0.90, and ≥0.98 when SNP effects were computed using 2k, 5k, and 15k core animals. Suitable IP based on GEBV from GBLUP can be obtained when SNP predictions are based on an appropriate number of core animals, but a considerable decline in IP accuracy can occur in subsequent years. Conversely, IP from ssGBLUP based on large numbers of phenotypes from non-genotyped animals have persistent accuracy over time.
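The abstract describes predicting SNP effects from GEBV and then using them for indirect predictions, but it does not give the equations. The sketch below uses the commonly applied back-solving relation û = Z′G⁻¹ĝ / Σ2p(1−p), with G = ZZ′ / Σ2p(1−p), followed by IP = Z_young û. The function names, the use of fixed base allele frequencies `p`, and the toy data are assumptions for illustration. The APY algorithm and the multi-trait model of the study are not implemented here.

```python
import numpy as np

def snp_effects_from_gebv(M_ref, gebv, p):
    """Back-solve SNP effects from the GEBV of reference (e.g. core) animals.

    M_ref: (n_animals, n_markers) genotypes coded 0/1/2
    gebv:  (n_animals,) genomic estimated breeding values
    p:     (n_markers,) assumed base allele frequencies
    """
    Z = np.asarray(M_ref, dtype=float) - 2.0 * p      # centred genotypes
    k = np.sum(2.0 * p * (1.0 - p))                   # global scaling constant
    G = Z @ Z.T / k                                   # genomic relationship matrix
    # G must be invertible; real pipelines usually blend G before inversion.
    return Z.T @ np.linalg.solve(G, gebv) / k

def indirect_prediction(M_young, u_hat, p):
    """Indirect prediction: centred genotypes times SNP effects."""
    return (np.asarray(M_young, dtype=float) - 2.0 * p) @ u_hat

# Hypothetical toy data: 5 reference animals, 40 markers, 3 young animals.
rng = np.random.default_rng(1)
p = np.full(40, 0.5)
M_ref = rng.integers(0, 3, size=(5, 40))
M_young = rng.integers(0, 3, size=(3, 40))
gebv_ref = rng.normal(size=5)

u_hat = snp_effects_from_gebv(M_ref, gebv_ref, p)
ip_young = indirect_prediction(M_young, u_hat, p)
```

When G is invertible, applying the back-solved effects to the reference animals themselves reproduces their GEBV. The quality of indirect predictions for other animals therefore depends on how well the reference set represents them.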


A Weighted Genomic Relationship Matrix Based on Fixation Index (FST) Prioritized SNPs for Genomic Selection

Genes ◽

10.3390/genes10110922 ◽

2019 ◽

Vol 10 (11) ◽

pp. 922

Author(s):

Ling-Yun Chang ◽

Sajjad Toghiani ◽

El Hamidi Hay ◽

Samuel E. Aggrey ◽

Romdhane Rekaya

Keyword(s):

Genomic Selection ◽

Statistical Power ◽

Fixation Index ◽

Genomic Relationship Matrix ◽

Relationship Matrix ◽

Nucleotide Polymorphisms ◽

Genomic Relationship ◽

Single Nucleotide ◽

Relative Contribution ◽

Estimation Of Variance

A dramatic increase in the density of marker panels has been expected to increase the accuracy of genomic selection (GS); unfortunately, little to no improvement has been observed. Including all variants in the association model dramatically increases the dimensionality of the problem, which can reduce statistical power. Using all single nucleotide polymorphisms (SNPs) to compute the genomic relationship matrix (G) does not necessarily increase accuracy, as the additive relationships can be accurately estimated using a much smaller number of markers. Due to these limitations, variant prioritization has become a necessity to improve accuracy. The fixation index (FST) as a measure of population differentiation has been used to identify genome segments and variants under selection pressure. Using prioritized variants has increased the accuracy of GS. Additionally, FST can be used to weight the relative contribution of prioritized SNPs in computing G. In this study, relative weights based on FST scores were developed and incorporated into the calculation of G, and their impact on the estimation of variance components and accuracy was assessed. The results showed that prioritizing SNPs based on their FST scores resulted in an increase in the genetic similarity between training and validation animals and improved the accuracy of GS by more than 5%.
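As a rough illustration of weighting SNP contributions to G by FST, the sketch below computes a simple two-population FST per SNP and uses the scores as diagonal weights in G = Z D Z′ / Σ2p(1−p). The abstract does not describe the weighting scheme the authors developed, so the FST estimator (variance of allele frequencies between two equally weighted subpopulations divided by p̄(1−p̄)), the mean-one normalisation of the weights, and all names are assumptions.

```python
import numpy as np

def fst_two_pops(p1, p2):
    """Per-SNP FST between two equally weighted subpopulations (simple estimator)."""
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    p_bar = (p1 + p2) / 2.0
    denom = 4.0 * p_bar * (1.0 - p_bar)
    out = np.zeros_like(p_bar)
    # SNPs fixed for the same allele in both populations get FST = 0.
    np.divide((p1 - p2) ** 2, denom, out=out, where=denom > 0)
    return out

def weighted_grm(M, p, weights):
    """Weighted genomic relationship matrix Z D Z' / sum(2p(1-p)).

    weights must not all be zero; they are rescaled to have mean 1 so that
    equal weights reproduce the unweighted matrix.
    """
    Z = np.asarray(M, dtype=float) - 2.0 * p
    w = np.asarray(weights, dtype=float)
    w = w / w.mean()
    return (Z * w) @ Z.T / np.sum(2.0 * p * (1.0 - p))

# Hypothetical toy data: 4 animals, 10 markers, two subpopulation frequency vectors.
rng = np.random.default_rng(2)
M = rng.integers(0, 3, size=(4, 10))
p = np.full(10, 0.5)
p_pop1 = rng.uniform(0.1, 0.9, size=10)
p_pop2 = rng.uniform(0.1, 0.9, size=10)

G_weighted = weighted_grm(M, p, fst_two_pops(p_pop1, p_pop2))
G_unweighted = weighted_grm(M, p, np.ones(10))
```

Prioritization in the sense of the abstract could also be sketched by setting the weights of low-FST SNPs to zero before building G, but the threshold would be a further assumption.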


Using population-specific add-on polymorphisms to improve genotype imputation in underrepresented populations

10.1101/2021.02.03.429542 ◽

2021 ◽

Author(s):

Zhi Ming Xu ◽

Sina Rüeger ◽

Michaela Zwyer ◽

Daniela Brites ◽

Hellen Hiza ◽

...

Keyword(s):

Association Studies ◽

Imputation Accuracy ◽

Genotype Imputation ◽

Small Subset ◽

Study Cohort ◽

Genome Wide Association Studies ◽

Nucleotide Polymorphisms ◽

Single Nucleotide ◽

Genome Wide ◽

Selection Of

Genome-wide association studies rely on the statistical inference of untyped variants, called imputation, to increase the coverage of genotyping arrays. However, the results are often suboptimal in populations underrepresented in existing reference panels and array designs, since the selected single nucleotide polymorphisms (SNPs) may fail to capture population-specific haplotype structures, hence the full extent of common genetic variation. Here, we propose to sequence the full genome of a small subset of an underrepresented study cohort to inform the selection of population-specific add-on SNPs, such that the remaining array-genotyped cohort could be more accurately imputed. Using a Tanzania-based cohort as a proof-of-concept, we demonstrate the validity of our approach by showing improvements in imputation accuracy after the addition of our designed add-on SNPs to the base H3Africa array.


Genome-wide identification of ZmHMAs and association of natural variation in ZmHMA2 and ZmHMA3 with leaf cadmium accumulation in maize

PeerJ ◽

10.7717/peerj.7877 ◽

2019 ◽

Vol 7 ◽

pp. e7877 ◽

Cited By ~ 1

Author(s):

Yanhua Cao ◽

Xiongwei Zhao ◽

Yajuan Liu ◽

Yalong Wang ◽

Wenmei Wu ◽

...

Keyword(s):

Natural Variation ◽

Developmental Stages ◽

Expression Patterns ◽

Cadmium Accumulation ◽

Functional Verification ◽

Nucleotide Polymorphisms ◽

Single Nucleotide ◽

Cd Accumulation ◽

Genome Wide ◽

Selection Of

P1B-type ATPases, known as heavy metal ATPases (HMAs), play an important role in the control of cadmium (Cd) accumulation in plants. In this study, a total of 12 ZmHMA genes were identified in the maize genome and particularly classified into six clusters based on their phylogenetic relationship and motif compositions. Furthermore, the expression patterns of different ZmHMA genes varied with developmental stages, and were tissue specific under normal conditions. ZmHMA2 and ZmHMA3 genes exhibited significant up-regulation under Cd treatment. Eventually, the association analysis between 103 inbred lines and alleles in ZmHMA2 and ZmHMA3 revealed that one insertion–deletion (InDel) in the intron from ZmHMA2 was associated with leaf Cd concentration under low Cd condition at the seedling stage. Twenty polymorphisms in ZmHMA3 were significantly associated with leaf Cd concentration under various Cd levels at seedling and maturing stages. Five single nucleotide polymorphisms (SNPs) and two InDels of these significantly associated polymorphic loci from ZmHMA3 caused the amino acid substitutions and insertion or deletion events. Importantly, the proteins encoded by ZmHMA2 and ZmHMA3 genes were located in the plasma membrane. This comprehensive analysis will provide an important theoretical basis for future functional verification of ZmHMA genes to unravel the mechanisms of Cd accumulation in leaves of maize. Additionally, the favorable alleles in ZmHMA3 will lay a foundation for the marker-assisted selection of low Cd accumulation in maize.


Changing Ploidy as a Strategy: The Irish Potato Famine Pathogen Shifts Ploidy in Relation to Its Sexuality

Molecular Plant-Microbe Interactions ◽

10.1094/mpmi-08-16-0156-r ◽

2017 ◽

Vol 30 (1) ◽

pp. 45-52 ◽

Cited By ~ 34

Author(s):

Ying Li ◽

He Shen ◽

Qian Zhou ◽

Kun Qian ◽

Theo van der Lee ◽

...

Keyword(s):

Ploidy Level ◽

Nuclear Dna ◽

Allelic Variation ◽

Nucleotide Polymorphisms ◽

Single Nucleotide ◽

Global Food Security ◽

Genome Wide ◽

Key Factor ◽

Potato Famine ◽

Selection Of

The oomycete Phytophthora infestans was the causal agent of the Irish Great Famine and is a recurring threat to global food security. The pathogen can reproduce both sexually and asexually, with high potential to adapt to various environments and a great risk of breaking disease resistance genes in potato. Like other oomycetes, P. infestans is regarded as diploid during the vegetative phase of its life cycle, although some studies have reported trisomy and polyploidy. Using microsatellite fingerprinting, genome-wide assessment of single nucleotide polymorphisms, nuclear DNA quantification, and microscopic counting of chromosome numbers, we assessed the ploidy level of a comprehensive selection of isolates. All progenies from sexual populations of P. infestans in nature were found to be diploid; in contrast, nearly all dominant asexual lineages, including the most important pandemic clonal lineages US-1 and 13_A2, were triploid. Such triploids possess significantly more allelic variation than diploids. We observed that triploid genotypes can change to a diploid genome constitution when exposed to artificial stress conditions. This study reveals that fluctuations in the ploidy level may be a key factor in the adaptation process of this notorious plant destroyer and impose an extra challenge to controlling this disease.


Future tools for association mapping in crop plants

This article is one of a selection of papers from the conference “Exploiting Genome-wide Association in Oilseed Brassicas: a model for genetic improvement of major OECD crops for sustainable farming”.

Genome ◽

10.1139/g10-057 ◽

2010 ◽

Vol 53 (11) ◽

pp. 1017-1023 ◽

Cited By ~ 36

Author(s):

Chris Duran ◽

Dominic Eales ◽

Daniel Marshall ◽

Michael Imelfort ◽

Jiri Stiller ◽

...

Keyword(s):

Association Mapping ◽

Genetic Improvement ◽

Sequence Data ◽

Nucleotide Polymorphisms ◽

Snp Discovery ◽

Single Nucleotide ◽

Genome Wide ◽

Oilseed Brassicas ◽

Paired Read ◽

Selection Of

Association mapping currently relies on the identification of genetic markers. Several technologies have been adopted for genetic marker analysis, with single nucleotide polymorphisms (SNPs) being the most popular where a reasonable quantity of genome sequence data are available. We describe several tools we have developed for the discovery, annotation, and visualization of molecular markers for association mapping. These include autoSNPdb for SNP discovery from assembled sequence data; TAGdb for the identification of gene specific paired read Illumina GAII data; CMap3D for the comparison of mapped genetic and physical markers; and BAC and Gene Annotator for the online annotation of genes and genomic sequences.


Using population-specific add-on polymorphisms to improve genotype imputation in underrepresented populations

PLoS Computational Biology ◽

10.1371/journal.pcbi.1009628 ◽

2022 ◽

Vol 18 (1) ◽

pp. e1009628

Author(s):

Zhi Ming Xu ◽

Sina Rüeger ◽

Michaela Zwyer ◽

Daniela Brites ◽

Hellen Hiza ◽

...

Keyword(s):

Association Studies ◽

Imputation Accuracy ◽

Genotype Imputation ◽

Small Subset ◽

Study Cohort ◽

Genome Wide Association Studies ◽

Nucleotide Polymorphisms ◽

Single Nucleotide ◽

Genome Wide ◽

Selection Of

Genome-wide association studies rely on the statistical inference of untyped variants, called imputation, to increase the coverage of genotyping arrays. However, the results are often suboptimal in populations underrepresented in existing reference panels and array designs, since the selected single nucleotide polymorphisms (SNPs) may fail to capture population-specific haplotype structures, hence the full extent of common genetic variation. Here, we propose to sequence the full genomes of a small subset of an underrepresented study cohort to inform the selection of population-specific add-on tag SNPs and to generate an internal population-specific imputation reference panel, such that the remaining array-genotyped cohort could be more accurately imputed. Using a Tanzania-based cohort as a proof-of-concept, we demonstrate the validity of our approach by showing improvements in imputation accuracy after the addition of our designed add-on tags to the base H3Africa array.


EpiPen: An R Package to Investigate Two-Locus Epistatic Models

Twin Research and Human Genetics ◽

10.1017/thg.2014.25 ◽

2014 ◽

Vol 17 (4) ◽

Cited By ~ 2

Author(s):

Raymond K. Walters ◽

Charles Laurin ◽

Gitta H. Lubke

Keyword(s):

Power Analysis ◽

R Package ◽

Simulation Studies ◽

Nucleotide Polymorphisms ◽

Single Nucleotide ◽

Epistatic Interactions ◽

Model Interpretation ◽

Genome Wide ◽

Using Data ◽

Power Analyses

Epistasis is a growing area of research in genome-wide studies, but the differences between alternative definitions of epistasis remain a source of confusion for many researchers. One problem is that models for epistasis are presented in a number of formats, some of which have difficult-to-interpret parameters. In addition, the relation between the different models is rarely explained. Existing software for testing epistatic interactions between single-nucleotide polymorphisms (SNPs) does not provide the flexibility to compare the available model parameterizations. For that reason we have developed an R package for investigating epistatic and penetrance models, EpiPen, to aid users who wish to easily compare, interpret, and utilize models for two-locus epistatic interactions. EpiPen facilitates research on SNP-SNP interactions by allowing the R user to easily convert between common parametric forms for two-locus interactions, generate data for simulation studies, and perform power analyses for the selected model with a continuous or dichotomous phenotype. The usefulness of the package for model interpretation and power analysis is illustrated using data on rheumatoid arthritis.
