The impact of non-additive genetic associations on age-related complex diseases

Abstract:

Genome-wide association studies (GWAS) are not fully comprehensive, as current strategies typically test only the additive model, exclude the X chromosome, and use only one reference panel for genotype imputation. We implement an extensive GWAS strategy, GUIDANCE, which improves genotype imputation by using multiple reference panels and includes the analysis of the X chromosome and non-additive models to test for association. We apply this methodology to 62,281 subjects across 22 age-related diseases and identify 94 genome-wide associated loci, including 26 previously unreported. Moreover, we observe that 27.7% of the 94 loci are missed if we use standard imputation strategies with a single reference panel, such as HRC, and only test the additive model. Among the new findings, we identify three novel low-frequency recessive variants with odds ratios larger than 4, which need at least a three-fold larger sample size to be detected under the additive model. This study highlights the benefits of applying innovative strategies to better uncover the genetic architecture of complex diseases.

The impact of non-additive genetic associations on age-related complex diseases

10.1101/2020.05.12.084608 ◽

2020 ◽

Author(s):

Marta Guindo-Martínez ◽

Ramon Amela ◽

Silvia Bonàs-Guarch ◽

Montserrat Puiggròs ◽

Cecilia Salvoro ◽

...

Keyword(s):

X Chromosome ◽

Additive Model ◽

Complex Diseases ◽

Additive Models ◽

Genotype Imputation ◽

Genome Wide Association Studies ◽

Age Related ◽

Genome Wide ◽

New Findings ◽

The Impact

Abstract:

Genome-wide association studies (GWAS) are not fully comprehensive, as current strategies typically test only the additive model, exclude the X chromosome, and use only one reference panel for genotype imputation. We implemented an extensive GWAS strategy, GUIDANCE, which improves genotype imputation by using multiple reference panels and includes the analysis of the X chromosome and non-additive models to test for association. We applied this methodology to 62,281 subjects across 22 age-related diseases and identified 94 genome-wide associated loci, including 26 previously unreported. We observed that 27.6% of the 94 loci would be missed if we only used standard imputation strategies and only tested the additive model. Among the new findings, we identified three novel low-frequency recessive variants with odds ratios larger than 4, which would need at least a three-fold larger sample size to be detected under the additive model. This study highlights the benefits of applying innovative strategies to better uncover the genetic architecture of complex diseases.
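
The distinction between additive and non-additive models in the abstract above can be illustrated with a minimal sketch of genotype recoding. The abstract does not list which non-additive models were tested, so the dominant and recessive codings below are common textbook examples rather than a description of GUIDANCE itself. The function name `encode_genotype` and the sample data are hypothetical.

```python
# Illustrative sketch only: recode minor-allele counts (0, 1, 2) under
# three textbook genetic models. Not taken from the GUIDANCE software.

def encode_genotype(minor_allele_count: int, model: str) -> int:
    """Return the predictor value used in a per-variant association test."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1, or 2")
    if model == "additive":
        # Effect scales with the number of minor alleles.
        return minor_allele_count
    if model == "dominant":
        # One copy of the minor allele is enough to carry the effect.
        return int(minor_allele_count >= 1)
    if model == "recessive":
        # Only minor-allele homozygotes carry the effect.
        return int(minor_allele_count == 2)
    raise ValueError(f"unknown model: {model}")


# Made-up genotypes for six individuals.
genotypes = [0, 1, 2, 1, 0, 2]
for model in ("additive", "dominant", "recessive"):
    print(model, [encode_genotype(g, model) for g in genotypes])
```

Under the recessive coding, heterozygotes are grouped with non-carriers, which is one way to see why a recessive low-frequency variant can be hard to detect when only the additive coding is tested.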

A one penny imputed genome from next generation reference panels

10.1101/357806 ◽

2018 ◽

Cited By ~ 1

Author(s):

Brian L. Browning ◽

Ying Zhou ◽

Sharon R. Browning

Keyword(s):

Association Studies ◽

Computational Cost ◽

Computation Time ◽

Genotype Imputation ◽

Reference Panel ◽

Genome Wide Association Studies ◽

Panel Size ◽

Genome Wide ◽

New Genotype ◽

Reference Samples

Abstract:

Genotype imputation is commonly performed in genome-wide association studies because it greatly increases the number of markers that can be tested for association with a trait. In general, one should perform genotype imputation using the largest reference panel that is available because the number of accurately imputed variants increases with reference panel size. However, one impediment to using larger reference panels is the increased computational cost of imputation. We present a new genotype imputation method, Beagle 5.0, which greatly reduces the computational cost of imputation from large reference panels. We compare Beagle 5.0 with Beagle 4.1, Impute4, Minimac3, and Minimac4 using 1000 Genomes Project data, Haplotype Reference Consortium data, and simulated data for 10k, 100k, 1M, and 10M reference samples. All methods produce nearly identical accuracy, but Beagle 5.0 has the lowest computation time and the best scaling of computation time with increasing reference panel size. For 10k, 100k, 1M, and 10M reference samples and 1000 phased target samples, Beagle 5.0’s computation time is 3× (10k), 12× (100k), 43× (1M), and 533× (10M) faster than the fastest alternative method. Cost data from the Amazon Elastic Compute Cloud show that Beagle 5.0 can perform genome-wide imputation from 10M reference samples into 1000 phased target samples at a cost of less than one US cent per sample. Beagle 5.0 is freely available from https://faculty.washington.edu/browning/beagle/beagle.html.

Advantages of genotype imputation with ethnically matched reference panel for rare variant association analyses

10.1101/579201 ◽

2019 ◽

Cited By ~ 4

Author(s):

Mart Kals ◽

Tiit Nikopensius ◽

Kristi Läll ◽

Kalle Pärn ◽

Timo Tõnis Sikka ◽

...

Keyword(s):

Rare Variant ◽

Rare Variants ◽

Association Studies ◽

Low Frequency ◽

Genotype Imputation ◽

Reference Panel ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Variant Analysis ◽

Coding Variants

Abstract:

Genotype imputation has become a standard procedure prior to genome-wide association studies (GWASs). For common and low-frequency variants, genotype imputation can be performed sufficiently accurately with publicly available and ethnically heterogeneous reference datasets like the 1000 Genomes Project (1000G) and Haplotype Reference Consortium panels. However, the imputation of rare variants has been shown to be significantly more accurate when an ethnically matched reference panel is used. Moreover, greater genetic similarity between the reference panel and target samples facilitates the detection of rare (or even population-specific) causal variants. Nevertheless, the genome-wide downstream consequences and differences of using ethnically mixed and matched reference panels have not yet been comprehensively explored. We determined and quantified these differences by performing several comparative evaluations of the discovery-driven analysis scenarios. A variant-wise GWAS was performed on seven complex diseases and body mass index by using genome-wide genotype data of ∼37,000 Estonians imputed with ethnically mixed 1000G and ethnically matched imputation reference panels. Although several previously reported common (minor allele frequency, MAF > 5%) variant associations were replicated in both resulting imputed datasets, no major differences were observed among the genome-wide significant findings or in the fine-mapping effort. In the analysis of rare (MAF < 1%) coding variants, 46 significantly associated genes were identified in the ethnically matched imputed data as compared to four genes in the 1000G panel based imputed data. All resulting genes were consequently studied in the UK Biobank data. These associations provide a solid example of how rare variants can be efficiently analysed to discover novel, potentially functional genetic variants in relevant phenotypes. Furthermore, our work serves as proof of a cost-efficient study design, demonstrating that the usage of ethnically matched imputation reference panels can enable substantially improved imputation of rare variants, facilitating novel high-confidence findings in rare variant GWAS scans.

Author summary:

Over the last decade, genome-wide association studies (GWASs) have been widely used for detecting genetic biomarkers in a wide range of traits. Typically, GWASs are carried out using chip-based genotyping data, which are then combined with a more densely genotyped reference panel to infer untyped genetic variants in chip-typed individuals. The latter method is called genotype imputation and its accuracy depends on multiple factors. Publicly available and ethnically heterogeneous imputation reference panels (IRPs) such as the 1000 Genomes Project (1000G) are sufficiently accurate for imputation of common and low-frequency variants, but custom ethnically matched IRPs outperform these for rare variants. In this work, we systematically compare downstream association analysis effects on eight complex traits in ∼37,000 Estonians imputed with ethnically mixed and ethnically matched IRPs. We do not observe major differences in the single variant analysis, where both imputed datasets replicate previously reported significant loci. But in the gene-based analysis of rare protein-coding variants we show that the ethnically matched panel clearly outperforms 1000G panel based imputation, providing a 10-fold increase in significant gene-trait associations. Our study demonstrates empirically that imputed data based on an ethnically matched panel is very promising for rare variant analysis: it captures more population-specific variants and makes it possible to efficiently identify novel findings.

A Two-State Epistasis Model Reduces Missing Heritability of Complex Traits

10.1101/017491 ◽

2015 ◽

Cited By ~ 1

Author(s):

Kerry L. Bubb ◽

Christine Queitsch

Keyword(s):

Complex Traits ◽

Association Studies ◽

Additive Model ◽

Twin Studies ◽

Additive Models ◽

Genome Wide Association Studies ◽

Missing Heritability ◽

Epistatic Interactions ◽

Epistasis Model ◽

Genome Wide

Abstract:

Despite decade-long efforts, the genetic underpinnings of many complex traits and diseases remain largely elusive. It is increasingly recognized that a purely additive model, upon which most genome-wide association studies (GWAS) rely, is insufficient. Although thousands of significant trait-associated loci have been identified, purely additive models leave much of the inferred genetic variance unexplained. Several factors have been invoked to explain the ‘missing heritability’, including epistasis. Accounting for all possible epistatic interactions is computationally complex and requires very large samples. Here, we propose a simple two-state epistasis model, in which individuals show either high or low variant penetrance with respect to a certain trait. The use of this model increases the power to detect additive trait-associated loci. We show that this model is consistent with current GWAS results and improves fit with heritability observations based on twin studies. We suggest that accounting for variant penetrance will significantly increase our power to identify underlying additive loci.

The Impact of Phenotypic and Genetic Heterogeneity on Results of Genome Wide Association Studies of Complex Diseases

PLoS ONE ◽

10.1371/journal.pone.0076295 ◽

2013 ◽

Vol 8 (10) ◽

pp. e76295 ◽

Cited By ~ 106

Author(s):

Mirko Manchia ◽

Jeffrey Cullis ◽

Gustavo Turecki ◽

Guy A. Rouleau ◽

Rudolf Uher ◽

...

Keyword(s):

Genetic Heterogeneity ◽

Association Studies ◽

Complex Diseases ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

The Impact

Genotype imputation using the Positional Burrows Wheeler Transform

10.1101/797944 ◽

2019 ◽

Cited By ~ 7

Author(s):

Simone Rubinacci ◽

Olivier Delaneau ◽

Jonathan Marchini

Keyword(s):

Association Studies ◽

Computational Cost ◽

Computation Time ◽

Fold Increase ◽

Genotype Imputation ◽

Reference Panel ◽

Genome Wide Association Studies ◽

Panel Size ◽

Genome Wide ◽

Burrows Wheeler Transform

Abstract:

Genotype imputation is the process of predicting unobserved genotypes in a sample of individuals using a reference panel of haplotypes. In the last 10 years, reference panels have increased in size by more than 100-fold. Increasing reference panel size improves accuracy of markers with low minor allele frequencies but poses ever-increasing computational challenges for imputation methods. Here we present IMPUTE5, a genotype imputation method that can scale to reference panels with millions of samples. This method continues to refine the observation made in the IMPUTE2 method, that accuracy is optimized via use of a custom subset of haplotypes when imputing each individual. It achieves fast, accurate, and memory-efficient imputation by selecting haplotypes using the Positional Burrows Wheeler Transform (PBWT). By using the PBWT data structure at genotyped markers, IMPUTE5 identifies locally best matching haplotypes and long identical-by-state segments. The method then uses the selected haplotypes as conditioning states within the IMPUTE model. Using the HRC reference panel, which has ~65,000 haplotypes, we show that IMPUTE5 is up to 30x faster than MINIMAC4 and up to 3x faster than BEAGLE5.1, and uses less memory than both these methods. Using simulated reference panels we show that IMPUTE5 scales sub-linearly with reference panel size. For example, keeping the number of imputed markers constant, increasing the reference panel size from 10,000 to 1 million haplotypes requires less than twice the computation time. As the reference panel increases in size IMPUTE5 is able to utilize a smaller number of reference haplotypes, thus reducing computational cost.

Author summary:

Genome-wide association studies (GWAS) typically use microarray technology to measure genotypes at several hundred thousand positions in the genome. However, reference panels of genetic variation consist of haplotype data at >100-fold more positions in the genome. Genotype imputation makes genotype predictions at all the reference panel sites using the GWAS data. Reference panels are continuing to grow in size and this improves accuracy of the predictions; however, methods need to be able to scale to increased size. We have developed a new version of the popular IMPUTE software that can handle reference panels with millions of haplotypes, and has better performance than other published approaches. A notable property of the new method is that it scales sub-linearly with reference panel size. Keeping the number of imputed markers constant, a 100-fold increase in reference panel size requires less than twice the computation time.
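
To give a rough idea of why a PBWT helps find closely matching haplotypes, the sketch below implements only the basic positional prefix ordering on a toy binary panel. It is not IMPUTE5's haplotype-selection procedure, which the abstract does not specify in detail. The function name `pbwt_prefix_orders` and the example panel are hypothetical.

```python
# Illustrative sketch only: the basic PBWT ordering step on a toy panel.
# This is not the IMPUTE5 implementation.

def pbwt_prefix_orders(haplotypes):
    """Return the haplotype ordering before site 0 and after each site.

    After processing site k, haplotypes are sorted by their reversed
    prefixes ending at site k, so haplotypes that share a long match
    ending at k sit next to each other in the ordering.
    """
    n_haps = len(haplotypes)
    n_sites = len(haplotypes[0]) if haplotypes else 0
    order = list(range(n_haps))
    orders = [order]
    for k in range(n_sites):
        # Stable partition: haplotypes carrying allele 0 at site k come
        # first, then those carrying allele 1, each keeping prior order.
        zeros = [h for h in order if haplotypes[h][k] == 0]
        ones = [h for h in order if haplotypes[h][k] == 1]
        order = zeros + ones
        orders.append(order)
    return orders


# Toy panel: 4 haplotypes x 3 biallelic sites (made-up data).
panel = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 0],
    [1, 0, 1],
]
for site, order in enumerate(pbwt_prefix_orders(panel)):
    print(f"ordering before site {site}: {order}")
```

In this toy panel, haplotypes 0 and 2 are identical and end up adjacent in the final ordering. Each site is processed in time linear in the number of haplotypes, and candidate best matches for a haplotype can be looked up among its neighbours in the ordering rather than by comparing it against the whole panel.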

Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program

Nature ◽

10.1038/s41586-021-03205-y ◽

2021 ◽

Vol 590 (7845) ◽

pp. 290-299 ◽

Cited By ~ 22

Author(s):

Daniel Taliun ◽

Daniel N. Harris ◽

Michael D. Kessler ◽

Jedidiah Carlson ◽

...

Keyword(s):

Rare Variants ◽

Sequence Data ◽

Association Studies ◽

Genotype Imputation ◽

Genome Wide Association Studies ◽

Phenotypic Data ◽

Treatment And Prevention ◽

Genome Wide ◽

Diverse Backgrounds ◽

Unmapped Reads

Abstract:

The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes). In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.

The Impact of Incomplete Linkage Disequilibrium and Genetic Model Choice on the Analysis and Interpretation of Genome-wide Association Studies

Annals of Human Genetics ◽

10.1111/j.1469-1809.2010.00579.x ◽

2010 ◽

Vol 74 (4) ◽

pp. 375-379 ◽

Cited By ~ 6

Author(s):

Mark M. Iles

Keyword(s):

Linkage Disequilibrium ◽

Genetic Model ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Model Choice ◽

Genome Wide ◽

The Impact

Partitioning heritability by functional category using GWAS summary statistics

10.1101/014241 ◽

2015 ◽

Cited By ~ 9

Author(s):

Hilary Kiyo Finucane ◽

Brendan Bulik-Sullivan ◽

Alexander Gusev ◽

Gosia Trynka ◽

Yakir Reshef ◽

...

Keyword(s):

Association Studies ◽

Smoking Behavior ◽

Complex Diseases ◽

New Method ◽

Age At Menarche ◽

Genome Wide Association Studies ◽

Summary Statistics ◽

Cell Type ◽

Genome Wide ◽

Cell Type Specific

Recent work has demonstrated that some functional categories of the genome contribute disproportionately to the heritability of complex diseases. Here, we analyze a broad set of functional elements, including cell-type-specific elements, to estimate their polygenic contributions to heritability in genome-wide association studies (GWAS) of 17 complex diseases and traits spanning a total of 1.3 million phenotype measurements. To enable this analysis, we introduce a new method for partitioning heritability from GWAS summary statistics while controlling for linked markers. This new method is computationally tractable at very large sample sizes, and leverages genome-wide information. Our results include a large enrichment of heritability in conserved regions across many traits; a very large immunological disease-specific enrichment of heritability in FANTOM5 enhancers; and many cell-type-specific enrichments including significant enrichment of central nervous system cell types in body mass index, age at menarche, educational attainment, and smoking behavior. These results demonstrate that GWAS can aid in understanding the biological basis of disease and provide direction for functional follow-up.

A Review on the Impact of Genetics and Genome Wide Association Studies in Autoimmunity

MOJ Proteomics & Bioinformatics ◽

10.15406/mojpb.2017.06.00203 ◽

2017 ◽

Vol 6 (4) ◽

Author(s):

Harishchander Anandaram

Keyword(s):

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

The Impact
