Genotype Imputation Methods and Their Effects on Genomic Predictions in Cattle

2016 ◽  
Vol 4 (2) ◽  
pp. 79-98 ◽  
Author(s):  
Yining Wang ◽  
Guohui Lin ◽  
Changxi Li ◽  
Paul Stothard

PLoS ONE ◽  
2008 ◽  
Vol 3 (10) ◽  
pp. e3551 ◽  
Author(s):  
Yu-Fang Pei ◽  
Jian Li ◽  
Lei Zhang ◽  
Christopher J. Papasian ◽  
Hong-Wen Deng

PLoS Genetics ◽  
2020 ◽  
Vol 16 (11) ◽  
pp. e1009049
Author(s):  
Simone Rubinacci ◽  
Olivier Delaneau ◽  
Jonathan Marchini

Genotype imputation is the process of predicting unobserved genotypes in a sample of individuals using a reference panel of haplotypes. In the last 10 years reference panels have increased in size by more than 100 fold. Increasing reference panel size improves accuracy of markers with low minor allele frequencies but poses ever increasing computational challenges for imputation methods. Here we present IMPUTE5, a genotype imputation method that can scale to reference panels with millions of samples. This method continues to refine the observation made in the IMPUTE2 method, that accuracy is optimized via use of a custom subset of haplotypes when imputing each individual. It achieves fast, accurate, and memory-efficient imputation by selecting haplotypes using the Positional Burrows Wheeler Transform (PBWT). By using the PBWT data structure at genotyped markers, IMPUTE5 identifies locally best matching haplotypes and long identical by state segments. The method then uses the selected haplotypes as conditioning states within the IMPUTE model. Using the HRC reference panel, which has ∼65,000 haplotypes, we show that IMPUTE5 is up to 30x faster than MINIMAC4 and up to 3x faster than BEAGLE5.1, and uses less memory than both these methods. Using simulated reference panels we show that IMPUTE5 scales sub-linearly with reference panel size. For example, keeping the number of imputed markers constant, increasing the reference panel size from 10,000 to 1 million haplotypes requires less than twice the computation time. As the reference panel increases in size IMPUTE5 is able to utilize a smaller number of reference haplotypes, thus reducing computational cost.
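The PBWT-based selection step described in this abstract can be illustrated with a minimal sketch. It is not the IMPUTE5 implementation: the function names `pbwt_orders` and `select_states`, the choice to add the target haplotype to the panel before sorting, and the fixed neighbour count are all assumptions made for illustration. The sketch builds the positional prefix ordering at each genotyped marker and keeps the reference haplotypes adjacent to the target in that ordering, since those tend to share the longest match ending at that marker.

```python
from typing import List, Set


def pbwt_orders(haps: List[List[int]]) -> List[List[int]]:
    """For each site k, return haplotype indices sorted by the reversed
    prefix ending at site k (the positional prefix array)."""
    n_sites = len(haps[0])
    order = list(range(len(haps)))
    orders = []
    for k in range(n_sites):
        # Stable partition on the allele at site k: haplotypes carrying 0
        # come first, and ordering from earlier sites is preserved.
        zeros = [h for h in order if haps[h][k] == 0]
        ones = [h for h in order if haps[h][k] == 1]
        order = zeros + ones
        orders.append(order)
    return orders


def select_states(reference: List[List[int]],
                  target: List[int],
                  n_neighbours: int = 1) -> Set[int]:
    """Hypothetical helper: collect reference haplotypes that sit next to
    the target in the PBWT ordering at any site."""
    haps = reference + [target]      # assumption: target is sorted with the panel
    target_index = len(reference)
    selected: Set[int] = set()
    for order in pbwt_orders(haps):
        pos = order.index(target_index)
        lo = max(0, pos - n_neighbours)
        hi = min(len(order), pos + n_neighbours + 1)
        selected.update(h for h in order[lo:hi] if h != target_index)
    return selected


reference_panel = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 1, 0, 1], [1, 1, 0, 0]]
target_haplotype = [0, 1, 0, 1]
states = select_states(reference_panel, target_haplotype, n_neighbours=1)
```

In this toy panel, reference haplotype 2 is identical to the target and is therefore among the selected states. A neighbour in the ordering is not guaranteed to match the target at the current site, so a real method would also check match lengths before using a haplotype as a conditioning state.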


2019 ◽  
Vol 35 (21) ◽  
pp. 4321-4326
Author(s):  
Mark Abney ◽  
Aisha ElSherbiny

Abstract. Motivation: Genotype imputation, though generally accurate, often results in many genotypes being poorly imputed, particularly in studies where the individuals are not well represented by standard reference panels. When individuals in the study share regions of the genome identical by descent (IBD), it is possible to use this information in combination with a study-specific reference panel (SSRP) to improve the imputation results. Kinpute uses IBD information, whether due to recent familial relatedness or to distant, unknown ancestors, in conjunction with the output from linkage disequilibrium (LD) based imputation methods to compute more accurate genotype probabilities. Kinpute uses a novel method for IBD imputation, which works even in the absence of a pedigree, and results in substantially improved imputation quality. Results: Given initial estimates of average IBD between subjects in the study sample, Kinpute uses a novel algorithm to select an optimal set of individuals to sequence and use as an SSRP. Kinpute is designed to use as input both this SSRP and the genotype probabilities output from other LD-based imputation software, and uses a new method to combine the LD-imputed genotype probabilities with IBD configurations to substantially improve imputation. We tested Kinpute on a human population isolate where 98 individuals have been sequenced. In half of this sample, whose sequence data was masked, we used Impute2 to perform LD-based imputation and Kinpute was used to obtain higher accuracy genotype probabilities. Measures of imputation accuracy improved significantly, particularly for those genotypes that Impute2 imputed with low certainty. Availability and implementation: Kinpute is an open-source and freely available C++ software package that can be downloaded from. Supplementary information: Supplementary data are available at Bioinformatics online.


2019 ◽  
Author(s):  
Grazyella M. Yoshida ◽  
Jean P. Lhorente ◽  
Katharina Correa ◽  
Jose Soto ◽  
Diego Salas ◽  
...  

ABSTRACT Fillet yield (FY) and harvest weight (HW) are economically important traits in Nile tilapia production. Genetic improvement of these traits, especially FY, is lacking because there are no efficient methods to measure them without sacrificing fish, so selection must rely on information from relatives. However, genomic information could be used by genomic selection to improve traits that are difficult to measure directly in selection candidates, as in the case of FY. The objectives of this study were: (i) to perform genome-wide association studies (GWAS) to dissect the genetic architecture of FY and HW, (ii) to evaluate the accuracy of genotype imputation and (iii) to assess the accuracy of genomic selection using true and imputed low-density (LD) single nucleotide polymorphism (SNP) panels to determine a cost-effective strategy for practical implementation of genomic information in tilapia breeding programs. The data set consisted of 5,866 phenotyped animals and 1,238 genotyped animals (108 parents and 1,130 offspring) using a 50K SNP panel. The GWAS were performed using all genotyped and phenotyped animals. Genotype imputation was performed from LD panels (LD0.5K, LD1K and LD3K) to the high-density (HD) panel, with the parents and 20% of the offspring forming the reference set and the remaining 80% of the offspring forming the validation set. In addition, we tested the accuracy of genomic selection using true and imputed genotypes, comparing the accuracies obtained from pedigree-based best linear unbiased prediction (PBLUP) and from genomic predictions. The GWAS results support the polygenic nature of FY and HW. The accuracy of imputation ranged from 0.90 to 0.98 for LD0.5K and LD3K, respectively. Genomic prediction was more accurate than breeding values estimated by PBLUP. The use of imputation for genomic selection resulted in an increased relative accuracy independent of the trait and LD panel analyzed. The present results suggest that genotype imputation could be a cost-effective strategy for genomic selection in tilapia breeding programs.
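The abstract reports imputation accuracy without defining the metric. One common choice, assumed here and not confirmed by the source, is the Pearson correlation between true and imputed genotypes at the markers that were masked to create the low-density panel. The following minimal sketch uses that definition; the function names and the toy genotype matrices (coded 0/1/2) are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def imputation_accuracy(true_geno: List[List[int]],
                        imputed_geno: List[List[int]],
                        masked_sites: List[int]) -> float:
    """Pool genotypes over all animals and masked sites, then correlate
    the true values with the imputed ones (assumed accuracy metric)."""
    true_vals = [row[j] for row in true_geno for j in masked_sites]
    imputed_vals = [row[j] for row in imputed_geno for j in masked_sites]
    return pearson(true_vals, imputed_vals)


# Toy data: three animals, four markers; markers 1 and 2 were masked.
true_genotypes = [[0, 1, 2, 1], [2, 1, 0, 0], [1, 2, 2, 0]]
imputed_genotypes = [[0, 1, 2, 1], [2, 1, 1, 0], [1, 2, 2, 0]]
accuracy = imputation_accuracy(true_genotypes, imputed_genotypes, [1, 2])
```

Pooling over animals and markers is only one option. Accuracy can also be computed per marker or per animal and then averaged, which gives different values when allele frequencies vary across markers.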


2018 ◽  
Vol 29 (1) ◽  
pp. 125-134 ◽  
Author(s):  
Ehsan Ullah ◽  
Raghvendra Mall ◽  
Mostafa M. Abbas ◽  
Khalid Kunji ◽  
Alejandro Q. Nato ◽  
...  

Genes ◽  
2019 ◽  
Vol 10 (9) ◽  
pp. 652 ◽  
Author(s):  
Junjie Chen ◽  
Xinghua Shi

Genotype imputation, in which missing genotypes are computationally inferred, is an essential tool in genomic analyses ranging from genome-wide association studies to phenotype prediction. Traditional genotype imputation methods are typically based on haplotype-clustering algorithms, hidden Markov models (HMMs), and statistical inference. Deep learning-based methods have recently been reported to address missing-data problems well in various fields. To explore the performance of deep learning for genotype imputation, we propose a deep model called a sparse convolutional denoising autoencoder (SCDA) to impute missing genotypes. We constructed the SCDA model using a convolutional layer that can extract various correlation or linkage patterns in the genotype data, and applied a sparse weight matrix resulting from L1 regularization to handle high-dimensional data. We comprehensively evaluated the performance of the SCDA model in different genotype imputation scenarios on yeast and human genotype data. Our results showed that SCDA is highly robust and significantly outperforms popular reference-free imputation methods. This study thus points to another novel application of deep learning models for missing data imputation in genomic studies.
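The two ingredients named in this abstract, denoising and L1 regularization, can be sketched independently of any network architecture. The code below is not the SCDA model. It only illustrates, under stated assumptions, how a denoising objective is set up: genotypes are randomly masked, a model (not shown) reconstructs them, and the loss adds an L1 penalty on the weights. The `MISSING` code, the squared-error reconstruction term, and both function names are assumptions for illustration.

```python
import random
from typing import List, Sequence

MISSING = -1  # assumed placeholder code for a masked genotype


def corrupt(genotypes: Sequence[int], missing_rate: float,
            rng: random.Random) -> List[int]:
    """Denoising step: independently mask each genotype with probability
    `missing_rate`; the model is then trained to recover the originals."""
    return [g if rng.random() >= missing_rate else MISSING for g in genotypes]


def l1_penalised_loss(predicted: Sequence[float], truth: Sequence[float],
                      weights: Sequence[float], lam: float) -> float:
    """Mean squared reconstruction error plus lam * sum(|w|).
    The L1 term pushes many weights to zero, giving a sparse weight matrix."""
    mse = sum((p - t) ** 2 for p, t in zip(predicted, truth)) / len(truth)
    return mse + lam * sum(abs(w) for w in weights)


rng = random.Random(0)
clean = [0, 1, 2, 1, 0, 2]
noisy = corrupt(clean, missing_rate=0.5, rng=rng)
example_loss = l1_penalised_loss([1.0, 1.0], [1, 2], [0.5, -0.5], lam=0.1)
```

In a full autoencoder the `predicted` values would come from passing `noisy` through the encoder and decoder, and `weights` would be the flattened layer parameters. A categorical cross-entropy over genotype classes could replace the squared error used here.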


2016 ◽  
Vol 6 (1) ◽  
Author(s):  
Nab Raj Roshyara ◽  
Katrin Horn ◽  
Holger Kirsten ◽  
Peter Ahnert ◽  
Markus Scholz

2009 ◽  
Vol 3 (Suppl 7) ◽  
pp. S5 ◽  
Author(s):  
Joanna M Biernacka ◽  
Rui Tang ◽  
Jia Li ◽  
Shannon K McDonnell ◽  
Kari G Rabe ◽  
...  
