Plant-ImputeDB: an integrated multiple plant reference panel database for genotype imputation

2020 ◽  
Vol 49 (D1) ◽  
pp. D1480-D1488
Author(s):  
Yingjie Gao ◽  
Zhiquan Yang ◽  
Wenqian Yang ◽  
Yanbo Yang ◽  
Jing Gong ◽  
...  

Abstract Genotype imputation is a process that estimates missing genotypes from the haplotypes and genotypes in a reference panel. It can effectively increase the density of single nucleotide polymorphisms (SNPs), boost the power to identify genetic associations and promote the combination of genetic studies. However, there has been a lack of high-quality reference panels for most plants, which greatly hinders the application of genotype imputation. Here, we developed Plant-ImputeDB (http://gong_lab.hzau.edu.cn/Plant_imputeDB/), a comprehensive database with reference panels of 12 plant species for online genotype imputation, SNP and block search and free download. By integrating genotype data and whole-genome resequencing data of plants from various studies and databases, the current Plant-ImputeDB provides high-quality reference panels of 12 plant species, including ∼69.9 million SNPs from 34 244 samples. It also provides an easy-to-use online tool with the option of two popular tools specifically designed for genotype imputation. In addition, Plant-ImputeDB accepts submissions of different types of genomic variations, and provides free and open access to all publicly available data in support of related research worldwide. Overall, Plant-ImputeDB may serve as an important resource for plant genotype imputation and greatly facilitate plant genetic research.

2019 ◽  
Vol 48 (D1) ◽  
pp. D659-D667 ◽  
Author(s):  
Wenqian Yang ◽  
Yanbo Yang ◽  
Cecheng Zhao ◽  
Kun Yang ◽  
Dongyang Wang ◽  
...  

Abstract Animal-ImputeDB (http://gong_lab.hzau.edu.cn/Animal_ImputeDB/) is a public database with genomic reference panels of 13 animal species for online genotype imputation, genetic variant search, and free download. Genotype imputation is a process of estimating missing genotypes in terms of the haplotypes and genotypes in a reference panel. It can effectively increase the density of single nucleotide polymorphisms (SNPs) and thus can be widely used in large-scale genome-wide association studies (GWASs) using relatively inexpensive and low-density SNP arrays. However, most animals except humans lack high-quality reference panels, which greatly limits the application of genotype imputation in animals. To overcome this limitation, we developed Animal-ImputeDB, which is dedicated to collecting genotype data and whole-genome resequencing data of nonhuman animals from various studies and databases. A computational pipeline was developed to process different types of raw data to construct reference panels. Finally, 13 high-quality reference panels including ∼400 million SNPs from 2265 samples were constructed. In Animal-ImputeDB, an easy-to-use online tool consisting of two popular imputation tools was designed for the purpose of genotype imputation. Collectively, Animal-ImputeDB serves as an important resource for animal genotype imputation and will greatly facilitate research on animal genomic selection and genetic improvement.


Author(s):  
Simon F Lashmar ◽  
Donagh P Berry ◽  
Rian Pierneef ◽  
Farai C Muchadeyi ◽  
Carina Visser

Abstract A major obstacle in applying genomic selection (GS) to uniquely adapted local breeds in less-developed countries has been the cost of genotyping at high densities of single nucleotide polymorphisms (SNP). Cost reduction can be achieved by imputing genotypes from lower to higher densities. Locally adapted breeds tend to be admixed and exhibit a high degree of genomic heterogeneity thus necessitating the optimization of SNP selection for downstream imputation. The aim of this study was to quantify the achievable imputation accuracy for a sample of 1,135 South African (SA) Drakensberger using several custom-derived lower-density panels varying in both SNP density and how the SNP were selected. From a pool of 120,608 genotyped SNP, subsets of SNP were chosen 1) at random, 2) with even genomic dispersion, 3) by maximizing the mean minor allele frequency (MAF), 4) using a combined score of MAF and linkage disequilibrium (LD), 5) using a partitioning-around-medoids (PAM) algorithm, and finally 6) using a hierarchical LD-based clustering algorithm. Imputation accuracy to higher density improved as SNP density increased; animal-wise imputation accuracy defined as the within-animal correlation between the imputed and actual alleles ranged from 0.625 to 0.990 when 2,500 randomly selected SNP were chosen versus a range of 0.918 to 0.999 when 50,000 randomly selected SNP were used. At a panel density of 10,000 SNP, the mean (standard deviation) animal-wise allele concordance rate was 0.976 (0.018) versus 0.982 (0.014) when the worst (i.e., random) as opposed to the best (i.e., combination of MAF and LD) SNP selection strategy was employed. A difference of 0.071 units was observed between the mean correlation-based accuracy of imputed SNP categorized as low (0.01<MAF≤0.1) versus high MAF (0.4<MAF≤0.5). Greater mean imputation accuracy was achieved for SNP located on autosomal extremes when these regions were populated with more SNP. 
The presented results suggested that genotype imputation can be a practical cost-saving strategy for indigenous breeds such as the South African Drakensberger. Based on the results, a genotyping panel consisting of approximately 10,000 SNP selected based on a combination of MAF and LD would suffice in achieving a less than 3% imputation error rate for a breed characterized by genomic admixture on the condition that these SNP are selected based on breed-specific selection criteria.
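
The two animal-wise accuracy measures named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes genotypes are coded as allele counts (0, 1 or 2), takes the correlation to be a Pearson correlation, and assumes the allele concordance rate is the fraction of alleles imputed correctly. The function names and the toy genotypes are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Correlation is undefined when one vector has no variation.
        return float("nan")
    return sxy / sqrt(sxx * syy)


def allele_concordance(true_genotypes, imputed_genotypes):
    """Fraction of alleles imputed correctly (assumed definition).

    With 0/1/2 coding, each SNP carries two alleles, and the number
    of mismatched alleles at a SNP is |true - imputed|.
    """
    matched = sum(2 - abs(t - i) for t, i in zip(true_genotypes, imputed_genotypes))
    return matched / (2 * len(true_genotypes))


# Hypothetical genotypes for one animal at eight masked SNP.
true_genotypes = [0, 1, 2, 1, 0, 2, 1, 1]
imputed_genotypes = [0, 1, 2, 2, 0, 2, 1, 0]

animal_correlation = pearson(true_genotypes, imputed_genotypes)
animal_concordance = allele_concordance(true_genotypes, imputed_genotypes)
print(animal_correlation, animal_concordance)
```

Correlation penalizes errors at low-MAF SNP more heavily than concordance does, which may be one reason studies of this kind tend to report both.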


2016 ◽  
Vol 43 (6) ◽  
pp. 1045-1049 ◽  
Author(s):  
Kwangwoo Kim ◽  
So-Young Bang ◽  
Young Bin Joo ◽  
Taehyeung Kim ◽  
Hye-Soon Lee ◽  
...  

Objective. Cyclophosphamide (CYC) is an immunosuppressant drug widely used to treat various diseases including lupus nephritis, but its efficacy varies widely between individuals. This pharmacogenomics association study searched for genetic variations associated with CYC efficacy. Methods. A genome-wide association scan was performed for 109 Korean patients with systemic lupus erythematosus with lupus nephritis (classes III–V) who received intravenous CYC induction therapy. Genetic differences between responders and nonresponders were examined using Cochran–Armitage trend tests, and genotype imputation was used for defining the association locus. Results. Genetic polymorphisms in the Fcγ receptor gene (FCGR) cluster at human chromosome 1q23, previously associated with lupus nephritis susceptibility, were associated with the response to CYC treatment for lupus nephritis. Significant response association was found for 3 perfectly correlated (r² = 1) single-nucleotide polymorphisms (SNP): rs6697139, rs10917686, and rs10917688, located between the FCGR2B and FCRLA genes (p = 3.4 × 10⁻⁸). Carriage of the minor alleles in these SNP was found only in nonresponders (31%) and in no responders (0%). Conclusion. This first genome-wide association approach for CYC response yielded a robust profile of genetic associations including large-effect SNP in the FCGR2B-FCRLA locus, which may provide better insights into CYC metabolism and efficacy.


2020 ◽  
Vol 4 (Supplement_1) ◽  
Author(s):  
Afifah Azam ◽  
Mohammad Arif Shahar ◽  
Siti Liyana Saud Gany ◽  
Norlela Sukor ◽  
Nor Azmi Kamaruddin ◽  
...  

Abstract Primary aldosteronism (PA), also known as Conn’s syndrome, is a common curable cause of hypertension. Family studies of essential hypertensive patients suggest that heritable genetic factors play a role in blood pressure regulation [1]. Interestingly, single nucleotide polymorphisms (SNPs) in genes encoding enzymes involved in adrenal steroidogenesis, CYP11B2, CYP11B1 and CYP17A1, are associated with increased risk of hypertension [2]. Therefore, we analysed whether selected SNPs in these genes are associated with PA. We performed an association study using genotype imputation for selected SNPs of the steroidogenic enzyme genes CYP11B2 (rs4546, rs1799998, rs13268025), CYP11B1 (rs6410, rs149845727), and CYP17A1 (rs1004467, rs138009835, rs2150927) from a pilot genome-wide association study of Malaysian PA patients and healthy controls which was merged with the Singapore Genome Variation Project (SGVP) population dataset [3]. Genotype imputation for minor and major alleles was validated using PCR sequencing (n>10 for each genotype). Further, one SNP from each steroidogenic enzyme (CYP11B2:rs1799998, CYP11B1:rs6410 and CYP17A1:rs1004467) was validated using commercial TaqMan genotyping assays on the ABI 7000 Sequence Detection System which was performed on 149 PA patients and 78 non-hypertensive healthy individuals. Case-control genetic association analysis was performed at http://www.oege.org/software/orcalc.html and the association between genotypes and phenotypes was tested using the independent-samples Kruskal-Wallis test in SPSS (version 25). The Minor Allele Frequencies (MAFs) for rs1004467, rs6410 and rs1799998 were similar to those of East Asian populations but differed significantly from those of European, African, American and South Asian populations (rs1004467 MAF: C=0.258/298, rs6410 MAF: A=0.265/298, rs1799998 MAF: C=0.225/298).
In Chinese patients matched by gender, heterozygotes for rs6410 had significantly increased risk of PA compared to common homozygotes (OR: 3.15, 95% CI: 1.01–9.8, p=0.04). Across patients of different ethnicity, the distribution of aldosterone levels was significantly different (p=0.039). In summary, only SNP rs6410 in Chinese patients matched by gender showed association with PA in our South East Asian cohort. More functional experiments are needed to find out whether this SNP is causal for PA or whether it is in linkage disequilibrium with the actual functional causative SNPs. Once the functional SNP is known, identification of these germline variants in asymptomatic family members would allow early screening of PA to be offered and potentially provide novel drug targets to treat the disease. References: [1] Timberlake et al., Curr Opin Nephrol Hypertens. 2001 Jan;10(1):71-9. [2] MacKenzie et al., Int J Mol Sci. 2017 Mar 7;18(3). pii: E579. [3] Teo et al., Genome Res. 2009 Nov;19(11):2154-62.


2021 ◽  
Vol 12 ◽  
Author(s):  
Alison G. Nazareno ◽  
L. Lacey Knowles ◽  
Christopher W. Dick ◽  
Lúcia G. Lohmann

Seed dispersal is crucial to gene flow among plant populations. Although the effects of geographic distance and barriers to gene flow are well studied in many systems, it is unclear how seed dispersal mediates gene flow in conjunction with interacting effects of geographic distance and barriers. To test whether distinct seed dispersal modes (i.e., hydrochory, anemochory, and zoochory) have a consistent effect on the level of genetic connectivity (i.e., gene flow) among populations of riverine plant species, we used unlinked single-nucleotide polymorphisms (SNPs) for eight co-distributed plant species sampled across the Rio Branco, a putative biogeographic barrier in the Amazon basin. We found that animal-dispersed plant species exhibited higher levels of genetic diversity and a lack of inbreeding, as a result of stronger genetic connectivity, compared with plant species whose seeds are dispersed by water or wind. Interestingly, our results also indicated that the Rio Branco facilitates gene dispersal for all plant species analyzed, irrespective of their mode of dispersal. Even at a small spatial scale, our findings suggest that ecology rather than geography plays a key role in shaping the evolutionary history of plants in the Amazon basin. These results may help improve conservation and management policies in Amazonian riparian forests, where degradation and deforestation rates are high.


F1000Research ◽  
2019 ◽  
Vol 8 ◽  
pp. 318
Author(s):  
Md. Bazlur Rahman Mollah ◽  
Md. Shamsul Alam Bhuiyan ◽  
M.A.M. Yahia Khandoker ◽  
Md. Abdul Jalil ◽  
Gautam Kumar Deb ◽  
...  

The Black Bengal goat (BBG) is a dwarf-sized heritage goat (Capra hircus) breed from Bangladesh, and is well known for its high fertility and excellent meat and skin quality. Here we present the first whole genome sequence and genome-wide distributed single nucleotide polymorphisms (SNPs) of the BBG. A total of 833,469,900 raw reads consisting of 125,020,485,000 bases were obtained by sequencing one male BBG sample. The reads were aligned to the San Clemente and the Yunnan black goat genomes, which resulted in 98.65% (properly paired, 94.81%) and 98.50% (properly paired, 97.10%) of the reads aligning, respectively. Notably, the estimated sequencing coverages were 48.22X and 44.28X compared to the published San Clemente and Yunnan black goat genomes, respectively. In addition, a total of 9,497,875 high-quality SNPs (Q ≥ 20) along with 1,023,359 indels, and 8,746,849 high-quality SNPs along with 842,706 indels, were identified in BBG against the San Clemente and Yunnan black goat genomes, respectively. The dataset is publicly available from NCBI BioSample (SAMN10391846) and the Sequence Read Archive (SRR8182317, SRR8549413 and SRR8549904), with BioProject ID PRJNA504436. These data might be useful genomic resources for conducting genome-wide association studies, identification of quantitative trait loci (QTLs) and functional genomic analysis of the Black Bengal goat.
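
The read and coverage figures in the abstract above can be cross-checked with simple arithmetic. The sketch below assumes that coverage was computed as total sequenced bases divided by reference genome length, which the abstract does not state, so the implied genome lengths are only rough back-calculations. All variable names are illustrative.

```python
# Figures quoted in the abstract.
raw_reads = 833_469_900
total_bases = 125_020_485_000
coverage_san_clemente = 48.22
coverage_yunnan = 44.28

# Mean read length implied by the read and base counts.
mean_read_length = total_bases / raw_reads

# Reference lengths implied by the coverages, assuming
# coverage = total bases / reference length (an assumption).
implied_len_san_clemente = total_bases / coverage_san_clemente
implied_len_yunnan = total_bases / coverage_yunnan

print(f"mean read length: {mean_read_length:.0f} bp")
print(f"implied San Clemente reference length: {implied_len_san_clemente / 1e9:.2f} Gb")
print(f"implied Yunnan black goat reference length: {implied_len_yunnan / 1e9:.2f} Gb")
```

The base count is an exact multiple of the read count, which is consistent with fixed-length 150 bp reads.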


eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Kira Delmore ◽  
Juan Carlos Illera ◽  
Javier Pérez-Tris ◽  
Gernot Segelbacher ◽  
Juan S Lugo Ramos ◽  
...  

Seasonal migration is a taxonomically widespread behaviour that integrates across many traits. The European blackcap exhibits enormous variation in migration and is renowned for research on its evolution and genetic basis. We assembled a reference genome for blackcaps and obtained whole genome resequencing data from individuals across its breeding range. Analyses of population structure and demography suggested divergence began ~30,000 ya, with evidence for one admixture event between migrant and resident continent birds ~5000 ya. The propensity to migrate, orientation and distance of migration all map to a small number of genomic regions that do not overlap with results from other species, suggesting that there are multiple ways to generate variation in migration. Strongly associated single nucleotide polymorphisms (SNPs) were located in regulatory regions of candidate genes that may serve as major regulators of the migratory syndrome. Evidence for selection on shared variation was documented, providing a mechanism by which rapid changes may evolve.


2021 ◽  
Author(s):  
Huaxing Zhou ◽  
Tingshuang Pan ◽  
Huan Wang ◽  
He Jiang ◽  
Jun Ling ◽  
...  

Abstract Whole-genome resequencing was used to develop single nucleotide polymorphism (SNP) markers for the yellow catfish (Tachysurus fulvidraco). A total of 46 SNP markers were selected from 5550676 genotyping markers distributed across 26 chromosomes. Of the 46 SNPs analyzed, 35 conformed to Hardy-Weinberg equilibrium. The observed and expected heterozygosity of these markers ranged from 0.2519 to 0.771 and from 0.265 to 0.5018, respectively. This set of markers will be of great use for population genetics of the yellow catfish.
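
Observed heterozygosity, expected heterozygosity and a Hardy-Weinberg check for a single biallelic SNP can be sketched as below. This is an illustration under stated assumptions: the abstract does not say which test was used, so a one-degree-of-freedom chi-square goodness-of-fit test is assumed, and the genotype counts and the function name `hwe_summary` are hypothetical.

```python
from math import erfc, sqrt


def hwe_summary(n_aa, n_ab, n_bb):
    """Heterozygosity and a chi-square HWE test for one biallelic SNP."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele B
    observed_het = n_ab / n
    expected_het = 2 * p * q
    observed = (n_aa, n_ab, n_bb)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return {"Ho": observed_het, "He": expected_het, "chi2": chi2, "p": p_value}


# Hypothetical genotype counts (AA, AB, BB) for one marker.
result = hwe_summary(30, 50, 20)
print(result)
```

A marker would typically be taken to conform to Hardy-Weinberg equilibrium when the p-value exceeds a chosen threshold such as 0.05, often after correction for the number of markers tested.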


2013 ◽  
Vol 17 (6) ◽  
pp. 501-503 ◽  
Author(s):  
Steffen Bank ◽  
Bjørn Andersen Nexø ◽  
Vibeke Andersen ◽  
Ulla Vogel ◽  
Paal Skytt Andersen

2020 ◽  
Vol 18 (3) ◽  
pp. e0405
Author(s):  
Yousef Naderi ◽  
Saadat Sadeghi

Aim of study: To predict genomic accuracy of binary traits considering different rates of disease incidence. Area of study: Simulation. Material and methods: Two machine learning algorithms, Boosting and Random Forest (RF), as well as threshold BayesA (TBA) and genomic BLUP (GBLUP), were employed. The predictive ability of the methods was evaluated for different genomic architectures using imputed (i.e. 2.5K, 12.5K and 25K panels) and their original 50K genotypes. We evaluated the three strategies with different rates of disease incidence (including 16%, 50% and 84% threshold points) and their effects on genomic prediction accuracy. Main results: Genotype imputation performed poorly to estimate the predictive ability of GBLUP, RF, Boosting and TBA methods when using the low-density single nucleotide polymorphism (SNP) chip in low linkage disequilibrium (LD) scenarios. The highest predictive ability, when the rate of disease incidence in the training set was 16%, belonged to GBLUP, RF, Boosting and TBA methods. Across different genomic architectures, the Boosting method performed better than TBA, GBLUP and RF methods for all scenarios and proportions of the marker sets imputed. Regarding the changes, the RF resulted in a further reduction compared to Boosting, TBA and GBLUP, especially when the applied data set contained 2.5K panels of the imputed genotypes. Research highlights: Generally, considering the high sensitivity of the methods to imputation errors, the application of imputed genotypes using the RF method should be carefully evaluated.

