Linkage disequilibrium maps for European and African populations constructed from whole genome sequence data

2019 ◽  
Vol 6 (1) ◽  
Author(s):  
Alejandra Vergara-Lope ◽  
M. Reza Jabalameli ◽  
Clare Horscroft ◽  
Sarah Ennis ◽  
Andrew Collins ◽  
...  

Abstract: Quantification of linkage disequilibrium (LD) patterns in the human genome is essential for genome-wide association studies, selection signature mapping and studies of recombination. Whole genome sequence (WGS) data provides optimal source data for this quantification as it is free from biases introduced by the design of array genotyping platforms. The Malécot-Morton model of LD allows the creation of a cumulative map for each chromosome, analogous to an LD form of a linkage map. Here we report LD maps generated from WGS data for a large population of European ancestry, as well as populations of Baganda, Ethiopian and Zulu ancestry. We achieve high average genetic marker densities of 2.3–4.6/kb. These maps show good agreement with prior, low resolution maps and are consistent between populations. Files are provided in BED format to allow researchers to readily utilise this resource.
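To illustrate how a cumulative LD map of this kind can be assembled, here is a minimal sketch. It assumes the commonly cited form of the Malécot-Morton model, in which the association ρ between two markers declines with physical distance d as ρ = (1 − L)·M·exp(−εd) + L, and in which map length in LD units (LDU) is the running sum of ε·d over the intervals between adjacent markers. The function names, the use of kb for distance, the example values and the BED-like column layout are all hypothetical and are not taken from the published maps.

```python
import math


def malecot_rho(d_kb, epsilon, M=1.0, L=0.0):
    """Expected association at distance d_kb (assumed kb) under the model.

    M is the intercept at zero distance, L the residual association at
    large distance, epsilon the exponential decline rate per kb.
    """
    return (1.0 - L) * M * math.exp(-epsilon * d_kb) + L


def cumulative_ldu(positions_bp, epsilons):
    """Cumulative LDU at each marker.

    positions_bp: marker positions in base pairs, sorted ascending.
    epsilons: one decline rate per interval between adjacent markers.
    """
    if len(epsilons) != len(positions_bp) - 1:
        raise ValueError("need exactly one epsilon per marker interval")
    ldu = [0.0]
    for i, eps in enumerate(epsilons):
        d_kb = (positions_bp[i + 1] - positions_bp[i]) / 1000.0
        ldu.append(ldu[-1] + eps * d_kb)
    return ldu


def to_bed_lines(chrom, positions_bp, ldu):
    """Illustrative BED-like rows: chrom, 0-based start, end, cumulative LDU."""
    return [
        f"{chrom}\t{pos - 1}\t{pos}\t{value:.4f}"
        for pos, value in zip(positions_bp, ldu)
    ]


if __name__ == "__main__":
    positions = [1000, 2000, 4000]   # hypothetical marker positions (bp)
    rates = [0.5, 0.25]              # hypothetical per-interval epsilon values
    for line in to_bed_lines("chr1", positions, cumulative_ldu(positions, rates)):
        print(line)
```

Because LDU accumulate as ε·d, intervals with strong LD (small ε) add little map length and appear as plateaus, while intervals where LD breaks down add length quickly and appear as steps.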

2015 ◽  
Author(s):  
Shane McCarthy ◽  
Sayantan Das ◽  
Warren Kretzschmar ◽  
Olivier Delaneau ◽  
Andrew R. Wood ◽  
...  

We describe a reference panel of 64,976 human haplotypes at 39,235,157 SNPs constructed using whole genome sequence data from 20 studies of predominantly European ancestry. Using this resource leads to accurate genotype imputation at minor allele frequencies as low as 0.1% and a large increase in the number of SNPs tested in association studies, and it can help to discover and refine causal loci. We describe remote server resources that allow researchers to carry out imputation and phasing consistently and efficiently.


2019 ◽  
Vol 51 (1) ◽  
Author(s):  
Sanne van den Berg ◽  
Jérémie Vandenplas ◽  
Fred A. van Eeuwijk ◽  
Aniek C. Bouwman ◽  
Marcos S. Lopes ◽  
...  

2021 ◽  
Vol 53 (1) ◽  
Author(s):  
Sunduimijid Bolormaa ◽  
Andrew A. Swan ◽  
Paul Stothard ◽  
Majid Khansefid ◽  
Nasir Moghaddar ◽  
...  

Background: Imputation to whole-genome sequence is now possible in large sheep populations. It is therefore of interest to use these data in genome-wide association studies (GWAS) to investigate putative causal variants and genes that underpin economically important traits. Merino wool is globally sought after for luxury fabrics, but some key wool quality attributes are unfavourably correlated with the characteristic skin wrinkle of Merinos. In turn, skin wrinkle is strongly linked to susceptibility to “fly strike” (cutaneous myiasis), which is a major welfare issue. Here, we use whole-genome sequence data in a multi-trait GWAS to identify pleiotropic putative causal variants and genes associated with changes in key wool traits and skin wrinkle. Results: A stepwise conditional multi-trait GWAS (CM-GWAS) identified putative causal variants and related genes from 178 independent quantitative trait loci (QTL) of 16 wool and skin wrinkle traits, measured on up to 7218 Merino sheep with 31 million imputed whole-genome sequence (WGS) genotypes. Novel candidate gene findings included the MAT1A gene that encodes an enzyme involved in the sulphur metabolism pathway critical to production of wool proteins, and the ESRP1 gene. We also discovered a significant wrinkle variant upstream of the HAS2 gene, which in dogs is associated with the exaggerated skin folds in the Shar-Pei breed. Conclusions: The wool and skin wrinkle traits studied here appear to be highly polygenic with many putative candidate variants showing considerable pleiotropy. Our CM-GWAS identified many highly plausible candidate genes for wool traits as well as breech wrinkle and breech area wool cover.


2019 ◽  
Author(s):  
Pierrick Wainschtein ◽  
Deepti P. Jain ◽  
Loic Yengo ◽  
Zhili Zheng ◽  
L. Adrienne Cupples ◽  
...  

Abstract: Heritability, the proportion of phenotypic variance explained by genetic factors, can be estimated from pedigree data [1], but such estimates are uninformative with respect to the underlying genetic architecture. Analyses of data from genome-wide association studies (GWAS) on unrelated individuals have shown that for human traits and disease, approximately one-third to two-thirds of heritability is captured by common SNPs [2–5]. It is not known whether the remaining heritability is due to the imperfect tagging of causal variants by common SNPs, in particular if the causal variants are rare, or other reasons such as over-estimation of heritability from pedigree data. Here we show that pedigree heritability for height and body mass index (BMI) appears to be fully recovered from whole-genome sequence (WGS) data on 21,620 unrelated individuals of European ancestry. We assigned 47.1 million genetic variants to groups based upon their minor allele frequencies (MAF) and linkage disequilibrium (LD) with variants nearby, and estimated and partitioned variation accordingly. The estimated heritability was 0.79 (SE 0.09) for height and 0.40 (SE 0.09) for BMI, consistent with pedigree estimates. Low-MAF variants in low LD with neighbouring variants were enriched for heritability, to a greater extent for protein altering variants, consistent with negative selection thereon. Cumulatively variants in the MAF range of 0.0001 to 0.1 explained 0.54 (SE 0.05) and 0.51 (SE 0.11) of heritability for height and BMI, respectively. Our results imply that the still missing heritability of complex traits and disease is accounted for by rare variants, in particular those in regions of low LD.
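The grouping step described in this abstract, assigning each variant to a bin by MAF and by LD with nearby variants, can be sketched roughly as follows. This is only an illustration: the abstract does not give the bin boundaries or the LD metric, so the MAF edges, the per-variant `ld_score` and its cutoff, and all function names here are hypothetical.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical MAF bin edges; the abstract does not state the ones used.
MAF_EDGES = [0.0001, 0.001, 0.01, 0.1]


def minor_allele_frequency(alt_count, n_alleles):
    """MAF from the alternate-allele count and the total allele number."""
    p = alt_count / n_alleles
    return min(p, 1.0 - p)


def assign_group(maf, ld_score, ld_cutoff):
    """Return (maf_bin, ld_bin) for one variant.

    maf_bin counts how many edges in MAF_EDGES are <= maf (0 = rarest bin).
    ld_bin is "low" or "high" relative to an assumed cutoff on a
    per-variant LD score.
    """
    maf_bin = bisect_right(MAF_EDGES, maf)
    ld_bin = "low" if ld_score <= ld_cutoff else "high"
    return maf_bin, ld_bin


def count_groups(variants, ld_cutoff):
    """Tally variants per group; variants is an iterable of (maf, ld_score)."""
    return Counter(assign_group(maf, ld, ld_cutoff) for maf, ld in variants)


if __name__ == "__main__":
    toy = [(0.00005, 1.2), (0.005, 0.8), (0.005, 3.0), (0.3, 5.0)]
    for group, n in sorted(count_groups(toy, ld_cutoff=2.0).items()):
        print(group, n)
```

In an analysis of this type, each group would then receive its own variance component, so that heritability can be partitioned across the MAF and LD bins.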


2015 ◽  
Author(s):  
Hubert Pausch ◽  
Reiner Emmerling ◽  
Hermann Schwarzenbacher ◽  
Ruedi Fries

Background: The availability of whole-genome sequence data from key ancestors provides an exhaustive catalogue of polymorphic sites segregating within and across cattle breeds. Sequence variants from key ancestors can be imputed in animals that have been genotyped using medium- and high-density genotyping arrays. Association analysis with imputed sequences, particularly if applied to multiple traits simultaneously, is a very powerful approach to revealing candidate causal variants underlying complex phenotypes. Results: We used whole-genome sequence data from 157 key ancestors of the German Fleckvieh population to impute 20 561 798 sequence variants in 10 363 animals that had (partly imputed) array-derived genotypes at 634 109 SNPs. The imputed sequence data were enriched for rare variants. Association studies with imputed sequence variants were performed using seven correlated udder conformation traits as response variables. The calculation of an approximate multi-trait test statistic enabled us to detect twelve major QTL (P < 2.97 × 10⁻⁹) controlling different aspects of mammary gland morphology. Imputed sequence variants were the most significantly associated at eleven QTL, whereas the top association signal at a QTL on BTA14 resulted from an array-derived variant. Seven QTL were associated with multiple phenotypes. Most QTL were located in non-coding regions of the genome, although close to plausible candidate genes for mammary gland morphology (SP5, GC, NPFFR2, CRIM1, RXFP2, TBX5, RBM19, ADAM12). Conclusions: Association analysis with imputed sequence variants allows QTL characterization at maximum resolution. Multi-trait approaches can reveal QTL that are not detected in single-trait association studies. Most QTL for udder conformation traits were located in non-coding elements of the genome, suggesting regulatory mutations to be the major determinants of variation in mammary gland morphology in cattle.


2019 ◽  
Author(s):  
Xin Zhou ◽  
Lu Zhang ◽  
Ziming Weng ◽  
David L. Dill ◽  
Arend Sidow

Abstract: Variant discovery in personal, whole genome sequence data is critical for uncovering the genetic contributions to health and disease. We introduce a new approach, Aquila, that uses linked-read data for generating a high quality diploid genome assembly, from which it then comprehensively detects and phases personal genetic variation. Assemblies cover >95% of the human reference genome, with over 98% in a diploid state. Thus, the assemblies support detection and accurate genotyping of the most prevalent types of human genetic variation, including single nucleotide polymorphisms (SNPs), small insertions and deletions (small indels), and structural variants (SVs), in all but the most difficult regions. All heterozygous variants are phased in blocks that can approach arm-level length. The final output of Aquila is a diploid and phased personal genome sequence, and a phased VCF file that also contains homozygous and a few unphased heterozygous variants. Aquila represents a cost-effective evolution of whole-genome reconstruction that can be applied to cohorts for variation discovery or association studies, or to single individuals with rare phenotypes that could be caused by SVs or compound heterozygosity.


2019 ◽  
Author(s):  
Sara R. Rashkin ◽  
Rebecca E. Graff ◽  
Linda Kachuri ◽  
Khanh K. Thai ◽  
Stacey E. Alexeeff ◽  
...  

Abstract: Deciphering the shared genetic basis of distinct cancers has the potential to elucidate carcinogenic mechanisms and inform broadly applicable risk assessment efforts. However, no studies have investigated pan-cancer pleiotropy within single, well-defined populations unselected for phenotype. We undertook novel genome-wide association studies (GWAS) and comprehensive evaluations of heritability and pleiotropy across 18 cancer types in two large, population-based cohorts: the UK Biobank (413,870 European ancestry individuals; 48,961 cancer cases) and the Kaiser Permanente Genetic Epidemiology Research on Adult Health and Aging cohorts (66,526 European ancestry individuals; 16,001 cancer cases). The GWAS detected 21 novel genome-wide significant risk variants. In addition, numerous cancer sites exhibited clear heritability. Investigations of pleiotropy identified 12 cancer pairs exhibiting either positive or negative genetic correlations and 43 pleiotropic loci. We identified 158 pleiotropic variants, many of which were enriched for regulatory elements and influenced cross-tissue gene expression. Our findings demonstrate widespread pleiotropy and offer further insight into the complex genetic architecture of cross-cancer susceptibility.

