Linkage disequilibrium maps for European and African populations constructed from whole genome sequence data

Abstract Quantification of linkage disequilibrium (LD) patterns in the human genome is essential for genome-wide association studies, selection signature mapping and studies of recombination. Whole genome sequence (WGS) data provides optimal source data for this quantification as it is free from biases introduced by the design of array genotyping platforms. The Malécot-Morton model of LD allows the creation of a cumulative map for each chromosome, analogous to an LD form of a linkage map. Here we report LD maps generated from WGS data for a large population of European ancestry, as well as populations of Baganda, Ethiopian and Zulu ancestry. We achieve high average genetic marker densities of 2.3–4.6/kb. These maps show good agreement with prior, low resolution maps and are consistent between populations. Files are provided in BED format to allow researchers to readily utilise this resource.

A reference panel of 64,976 haplotypes for genotype imputation

10.1101/035170 ◽

2015 ◽

Cited By ~ 17

Author(s):

Shane McCarthy ◽

Sayantan Das ◽

Warren Kretzschmar ◽

Olivier Delaneau ◽

Andrew R. Wood ◽

...

Keyword(s):

Genome Sequence ◽

Sequence Data ◽

Association Studies ◽

Allele Frequencies ◽

Genotype Imputation ◽

Reference Panel ◽

Whole Genome Sequence ◽

European Ancestry ◽

Whole Genome ◽

Remote Server

We describe a reference panel of 64,976 human haplotypes at 39,235,157 SNPs constructed using whole genome sequence data from 20 studies of predominantly European ancestry. Using this resource leads to accurate genotype imputation at minor allele frequencies as low as 0.1% and to a large increase in the number of SNPs tested in association studies, and it can help to discover and refine causal loci. We describe remote server resources that allow researchers to carry out imputation and phasing consistently and efficiently.

Imputation to whole-genome sequence using multiple pig populations and its use in genome-wide association studies

Genetics Selection Evolution ◽

10.1186/s12711-019-0445-y ◽

2019 ◽

Vol 51 (1) ◽

Cited By ~ 8

Author(s):

Sanne van den Berg ◽

Jérémie Vandenplas ◽

Fred A. van Eeuwijk ◽

Aniek C. Bouwman ◽

Marcos S. Lopes ◽

...

Keyword(s):

Genome Sequence ◽

Association Studies ◽

Genome Wide Association ◽

Whole Genome Sequence ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Genome Wide

Genome-wide association studies of growth traits in three dairy cattle breeds using whole-genome sequence data

Journal of Animal Science ◽

10.2527/jas.2015-9838 ◽

2016 ◽

Vol 94 (4) ◽

pp. 1426-1437 ◽

Cited By ~ 19

Author(s):

X. Mao ◽

G. Sahana ◽

D.-J. De Koning ◽

B. Guldbrandtsen

Keyword(s):

Dairy Cattle ◽

Genome Sequence ◽

Growth Traits ◽

Association Studies ◽

Genome Wide Association ◽

Whole Genome Sequence ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Cattle Breeds ◽

Genome Wide

A conditional multi-trait sequence GWAS discovers pleiotropic candidate genes and variants for sheep wool, skin wrinkle and breech cover traits

Genetics Selection Evolution ◽

10.1186/s12711-021-00651-0 ◽

2021 ◽

Vol 53 (1) ◽

Author(s):

Sunduimijid Bolormaa ◽

Andrew A. Swan ◽

Paul Stothard ◽

Majid Khansefid ◽

Nasir Moghaddar ◽

...

Keyword(s):

Candidate Genes ◽

Genome Sequence ◽

Sequence Data ◽

Association Studies ◽

Whole Genome Sequence ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Merino Wool ◽

Causal Variants ◽

Wool Proteins

Abstract Background: Imputation to whole-genome sequence is now possible in large sheep populations. It is therefore of interest to use these data in genome-wide association studies (GWAS) to investigate putative causal variants and genes that underpin economically important traits. Merino wool is globally sought after for luxury fabrics, but some key wool quality attributes are unfavourably correlated with the characteristic skin wrinkle of Merinos. In turn, skin wrinkle is strongly linked to susceptibility to “fly strike” (cutaneous myiasis), which is a major welfare issue. Here, we use whole-genome sequence data in a multi-trait GWAS to identify pleiotropic putative causal variants and genes associated with changes in key wool traits and skin wrinkle. Results: A stepwise conditional multi-trait GWAS (CM-GWAS) identified putative causal variants and related genes from 178 independent quantitative trait loci (QTL) of 16 wool and skin wrinkle traits, measured on up to 7218 Merino sheep with 31 million imputed whole-genome sequence (WGS) genotypes. Novel candidate gene findings included the MAT1A gene that encodes an enzyme involved in the sulphur metabolism pathway critical to production of wool proteins, and the ESRP1 gene. We also discovered a significant wrinkle variant upstream of the HAS2 gene, which in dogs is associated with the exaggerated skin folds in the Shar-Pei breed. Conclusions: The wool and skin wrinkle traits studied here appear to be highly polygenic with many putative candidate variants showing considerable pleiotropy. Our CM-GWAS identified many highly plausible candidate genes for wool traits as well as breech wrinkle and breech area wool cover.

Comparing power and precision of within-breed and multibreed genome-wide association studies of production traits using whole-genome sequence data for 5 French and Danish dairy cattle breeds

Journal of Dairy Science ◽

10.3168/jds.2016-11073 ◽

2016 ◽

Vol 99 (11) ◽

pp. 8932-8945 ◽

Cited By ~ 21

Author(s):

Irene van den Berg ◽

Didier Boichard ◽

Mogens Sandø Lund

Keyword(s):

Dairy Cattle ◽

Sequence Data ◽

Association Studies ◽

Genome Wide Association ◽

Whole Genome Sequence ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Production Traits ◽

Genome Wide ◽

Danish Dairy

Recovery of trait heritability from whole genome sequence data

10.1101/588020 ◽

2019 ◽

Cited By ~ 56

Author(s):

Pierrick Wainschtein ◽

Deepti P. Jain ◽

Loic Yengo ◽

Zhili Zheng ◽

L. Adrienne Cupples ◽

...

Keyword(s):

Genome Sequence ◽

Association Studies ◽

Whole Genome Sequence ◽

European Ancestry ◽

Phenotypic Variance ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Common Snps ◽

Pedigree Data ◽

Causal Variants

Abstract Heritability, the proportion of phenotypic variance explained by genetic factors, can be estimated from pedigree data [1], but such estimates are uninformative with respect to the underlying genetic architecture. Analyses of data from genome-wide association studies (GWAS) on unrelated individuals have shown that for human traits and disease, approximately one-third to two-thirds of heritability is captured by common SNPs [2–5]. It is not known whether the remaining heritability is due to the imperfect tagging of causal variants by common SNPs, in particular if the causal variants are rare, or other reasons such as over-estimation of heritability from pedigree data. Here we show that pedigree heritability for height and body mass index (BMI) appears to be fully recovered from whole-genome sequence (WGS) data on 21,620 unrelated individuals of European ancestry. We assigned 47.1 million genetic variants to groups based upon their minor allele frequencies (MAF) and linkage disequilibrium (LD) with variants nearby, and estimated and partitioned variation accordingly. The estimated heritability was 0.79 (SE 0.09) for height and 0.40 (SE 0.09) for BMI, consistent with pedigree estimates. Low-MAF variants in low LD with neighbouring variants were enriched for heritability, to a greater extent for protein altering variants, consistent with negative selection thereon. Cumulatively, variants in the MAF range of 0.0001 to 0.1 explained 0.54 (SE 0.05) and 0.51 (SE 0.11) of heritability for height and BMI, respectively. Our results imply that the still missing heritability of complex traits and disease is accounted for by rare variants, in particular those in regions of low LD.

A multi-trait meta-analysis with imputed sequence variants reveals twelve QTL for mammary gland morphology in Fleckvieh cattle

10.1101/030981 ◽

2015 ◽

Author(s):

Hubert Pausch ◽

Reiner Emmerling ◽

Hermann Schwarzenbacher ◽

Ruedi Fries

Keyword(s):

Mammary Gland ◽

Association Analysis ◽

Genome Sequence ◽

Sequence Data ◽

Association Studies ◽

Whole Genome Sequence ◽

Sequence Variants ◽

Whole Genome ◽

Genome Sequence Data ◽

Mammary Gland Morphology

Background: The availability of whole-genome sequence data from key ancestors provides an exhaustive catalogue of polymorphic sites segregating within and across cattle breeds. Sequence variants from key ancestors can be imputed in animals that have been genotyped using medium- and high-density genotyping arrays. Association analysis with imputed sequences, particularly if applied to multiple traits simultaneously, is a very powerful approach to revealing candidate causal variants underlying complex phenotypes. Results: We used whole-genome sequence data from 157 key ancestors of the German Fleckvieh population to impute 20 561 798 sequence variants in 10 363 animals that had (partly imputed) array-derived genotypes at 634 109 SNP. The imputed sequence data were enriched for rare variants. Association studies with imputed sequence variants were performed using seven correlated udder conformation traits as response variables. The calculation of an approximate multi-trait test statistic enabled us to detect twelve major QTL (P < 2.97 × 10⁻⁹) controlling different aspects of mammary gland morphology. Imputed sequence variants were the most significantly associated at eleven QTL, whereas the top association signal at a QTL on BTA14 resulted from an array-derived variant. Seven QTL were associated with multiple phenotypes. Most QTL were located in non-coding regions of the genome in close neighborhood, however, to plausible candidate genes for mammary gland morphology (SP5, GC, NPFFR2, CRIM1, RXFP2, TBX5, RBM19, ADAM12). Conclusions: Association analysis with imputed sequence variants allows QTL characterization at maximum resolution. Multi-trait approaches can reveal QTL that are not detected in single-trait association studies. Most QTL for udder conformation traits were located in non-coding elements of the genome suggesting regulatory mutations to be the major determinants of variation in mammary gland morphology in cattle.

Aquila: diploid personal genome assembly and comprehensive variant detection based on linked reads

10.1101/660605 ◽

2019 ◽

Cited By ~ 1

Author(s):

Xin Zhou ◽

Lu Zhang ◽

Ziming Weng ◽

David L. Dill ◽

Arend Sidow

Keyword(s):

Genetic Variation ◽

Genome Sequence ◽

Genome Assembly ◽

Sequence Data ◽

Association Studies ◽

Cost Effective ◽

Whole Genome Sequence ◽

Personal Genome ◽

Whole Genome ◽

Nucleotide Polymorphisms

Abstract Variant discovery in personal, whole genome sequence data is critical for uncovering the genetic contributions to health and disease. We introduce a new approach, Aquila, that uses linked-read data for generating a high quality diploid genome assembly, from which it then comprehensively detects and phases personal genetic variation. Assemblies cover >95% of the human reference genome, with over 98% in a diploid state. Thus, the assemblies support detection and accurate genotyping of the most prevalent types of human genetic variation, including single nucleotide polymorphisms (SNPs), small insertions and deletions (small indels), and structural variants (SVs), in all but the most difficult regions. All heterozygous variants are phased in blocks that can approach arm-level length. The final output of Aquila is a diploid and phased personal genome sequence, and a phased VCF file that also contains homozygous and a few unphased heterozygous variants. Aquila represents a cost-effective evolution of whole-genome reconstruction that can be applied to cohorts for variation discovery or association studies, or to single individuals with rare phenotypes that could be caused by SVs or compound heterozygosity.

Genome-wide association study of endo-parasite phenotypes using imputed whole-genome sequence data in dairy and beef cattle

Genetics Selection Evolution ◽

10.1186/s12711-019-0457-7 ◽

2019 ◽

Vol 51 (1) ◽

Cited By ~ 10

Author(s):

Alan J. Twomey ◽

Donagh P. Berry ◽

Ross D. Evans ◽

Michael L. Doherty ◽

David A. Graham ◽

...

Keyword(s):

Beef Cattle ◽

Association Study ◽

Genome Sequence ◽

Genome Wide Association Study ◽

Sequence Data ◽

Genome Wide Association ◽

Whole Genome Sequence ◽

Whole Genome ◽

Genome Sequence Data ◽

Genome Wide

Pan-Cancer Study Detects Novel Genetic Risk Variants and Shared Genetic Basis in Two Large Cohorts

10.1101/635367 ◽

2019 ◽

Cited By ~ 3

Author(s):

Sara R. Rashkin ◽

Rebecca E. Graff ◽

Linda Kachuri ◽

Khanh K. Thai ◽

Stacey E. Alexeeff ◽

...

Keyword(s):

Genetic Basis ◽

Association Studies ◽

Large Population ◽

Regulatory Elements ◽

European Ancestry ◽

Genome Wide Association Studies ◽

Risk Variants ◽

Genome Wide ◽

Pan Cancer ◽

Shared Genetic

Abstract Deciphering the shared genetic basis of distinct cancers has the potential to elucidate carcinogenic mechanisms and inform broadly applicable risk assessment efforts. However, no studies have investigated pan-cancer pleiotropy within single, well-defined populations unselected for phenotype. We undertook novel genome-wide association studies (GWAS) and comprehensive evaluations of heritability and pleiotropy across 18 cancer types in two large, population-based cohorts: the UK Biobank (413,870 European ancestry individuals; 48,961 cancer cases) and the Kaiser Permanente Genetic Epidemiology Research on Adult Health and Aging cohorts (66,526 European ancestry individuals; 16,001 cancer cases). The GWAS detected 21 novel genome-wide significant risk variants. In addition, numerous cancer sites exhibited clear heritability. Investigations of pleiotropy identified 12 cancer pairs exhibiting either positive or negative genetic correlations and 43 pleiotropic loci. We identified 158 pleiotropic variants, many of which were enriched for regulatory elements and influenced cross-tissue gene expression. Our findings demonstrate widespread pleiotropy and offer further insight into the complex genetic architecture of cross-cancer susceptibility.
