The flashfm approach for fine-mapping multiple quantitative traits

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
N. Hernández ◽  
J. Soenksen ◽  
P. Newcombe ◽  
M. Sandhu ◽  
I. Barroso ◽  
...  

Abstract Joint fine-mapping that leverages information between quantitative traits could improve accuracy and resolution over single-trait fine-mapping. Using summary statistics, flashfm (flexible and shared information fine-mapping) fine-maps signals for multiple traits, allowing for missing trait measurements and use of related individuals. In a Bayesian framework, prior model probabilities are formulated to favour model combinations that share causal variants to capitalise on information between traits. Simulation studies demonstrate that both approaches produce broadly equivalent results when traits have no shared causal variants. When traits share at least one causal variant, flashfm reduces the number of potential causal variants by 30% compared with single-trait fine-mapping. In a Ugandan cohort with 33 cardiometabolic traits, flashfm gave a 20% reduction in the total number of potential causal variants from single-trait fine-mapping. Here we show flashfm is computationally efficient and can easily be deployed across publicly available summary statistics for signals in up to six traits.

2021 ◽  
Author(s):  
Nicolas Hernandez ◽  
Jana Soenksen ◽  
Paul Newcombe ◽  
Manj Sandhu ◽  
Ines Barroso ◽  
...  

Joint fine-mapping that leverages information between quantitative traits could improve accuracy and resolution over single-trait fine-mapping. Using summary statistics, flashfm (FLexible And SHared information Fine-Mapping) fine-maps signals for multiple traits, allowing for missing trait measurements and use of related individuals. In a Bayesian framework, prior model probabilities are formulated to favour model combinations that share causal variants to capitalise on information between traits. Simulation studies demonstrate that both approaches produce broadly equivalent results when traits have no shared causal variants. When traits share at least one causal variant, flashfm reduces the number of potential causal variants by 30% compared with single-trait fine-mapping. In a Ugandan cohort with 33 cardiometabolic traits, flashfm gave a 20% reduction in the total number of potential causal variants from single-trait fine-mapping. Flashfm is computationally efficient and can easily be deployed across publicly available summary statistics for signals in up to six traits.


Author(s):  
Jianhua Wang ◽  
Dandan Huang ◽  
Yao Zhou ◽  
Hongcheng Yao ◽  
Huanhuan Liu ◽  
...  

Abstract Genome-wide association studies (GWASs) have revolutionized the field of complex trait genetics over the past decade, yet for most of the significant genotype-phenotype associations the true causal variants remain unknown. Identifying and interpreting how causal genetic variants confer disease susceptibility is still a big challenge. Herein we introduce a new database, CAUSALdb, to integrate the most comprehensive GWAS summary statistics to date and identify credible sets of potential causal variants using uniformly processed fine-mapping. The database has six major features: it (i) curates 3052 high-quality, fine-mappable GWAS summary statistics across five human super-populations and 2629 unique traits; (ii) estimates causal probabilities of all genetic variants in GWAS significant loci using three state-of-the-art fine-mapping tools; (iii) maps the reported traits to a powerful ontology MeSH, making it simple for users to browse studies on the trait tree; (iv) incorporates highly interactive Manhattan and LocusZoom-like plots to allow visualization of credible sets in a single web page more efficiently; (v) enables online comparison of causal relations on variant-, gene- and trait-levels among studies with different sample sizes or populations and (vi) offers comprehensive variant annotations by integrating massive base-wise and allele-specific functional annotations. CAUSALdb is freely available at http://mulinlab.org/causaldb.


Author(s):  
Karlijn A.C. Meeks ◽  
Ayo P. Doumatey ◽  
Amy R. Bentley ◽  
Mateus H. Gouveia ◽  
Guanjie Chen ◽  
...  

Background - Resistin, a protein linked with inflammation and cardiometabolic diseases, is one of few proteins for which GWAS consistently report variants within and near the coding gene (RETN). Here, we took advantage of the reduced linkage disequilibrium in African populations to infer genetic causality for circulating resistin levels by performing GWAS, whole-exome analysis, fine-mapping, Mendelian randomization and transcriptomic data analyses. Methods - GWAS and fine-mapping analyses for resistin were performed in 5621 African ancestry individuals, including 3754 continental Africans (AF) and 1867 African Americans (AA). Causal variants identified were subsequently used as an instrumental variable in Mendelian randomization analyses for homeostatic modelling (HOMA) derived insulin resistance index, BMI and type 2 diabetes. Results - The lead variant (rs3219175, in the promoter region of RETN) for the single locus detected was the same for AF (P-value 5.0×10⁻¹¹¹) and for AA (9.5×10⁻³⁸), respectively explaining 12.1% and 8.5% of variance in circulating resistin. Fine-mapping analyses and functional annotation revealed this variant as likely causal affecting circulating resistin levels as a cis-eQTL increasing RETN expression. Additional variants regulating resistin levels were upstream of RETN with genes PCP2, STXBP2 and XAB2 showing the strongest association using integrative analysis of GWAS with transcriptomic data. Mendelian randomization analyses did not provide evidence for resistin increasing insulin resistance, BMI or type 2 diabetes risk in African-ancestry populations. Conclusions - Taking advantage of the fine-mapping resolution power of African genomes, we identified a single variant (rs3219175) as the likely causal variant responsible for most of the variability in circulating resistin levels. In contrast to findings in some other ancestry populations, we showed that resistin does not seem to increase insulin resistance and related cardiometabolic traits in African-ancestry populations.


2019 ◽  
Author(s):  
Jennifer L Asimit ◽  
Daniel B Rainbow ◽  
Mary D Fortune ◽  
Nastasiya F Grinberg ◽  
Linda S Wicker ◽  
...  

Abstract Thousands of genetic variants have been associated with human disease risk, but linkage disequilibrium (LD) hinders fine-mapping the causal variants. We show that stepwise regression, and, to a lesser extent, stochastic search fine mapping can mis-identify as causal, SNPs which jointly tag distinct causal variants. Frequent sharing of causal variants between immune-mediated diseases (IMD) motivated us to develop a computationally efficient multinomial fine-mapping (MFM) approach that borrows information between diseases in a Bayesian framework. We show that MFM has greater accuracy than single disease analysis when shared causal variants exist, and negligible loss of precision otherwise. Applying MFM to data from six IMD revealed causal variants undetected in individual disease analysis, including in IL2RA where we confirm functional effects of multiple causal variants using allele-specific expression in sorted CD4+ T cells from genotype-selected individuals. MFM has the potential to increase fine-mapping resolution in related diseases enabling the identification of associated cellular and molecular phenotypes.


Genetics ◽  
2016 ◽  
Vol 204 (3) ◽  
pp. 933-958 ◽  
Author(s):  
Wenan Chen ◽  
Shannon K. McDonnell ◽  
Stephen N. Thibodeau ◽  
Lori S. Tillmans ◽  
Daniel J. Schaid

2021 ◽  
Author(s):  
Jicai Jiang

Summary statistics from genome-wide association studies (GWAS) have been widely used for fine-mapping complex traits in humans. The statistical framework was largely developed for unrelated samples. Though it is possible to apply the framework to fine-mapping with related individuals, extensive modifications are needed. Unfortunately, this has often been ignored in summary-statistics-based fine-mapping with related individuals. In this paper, we show in theory and simulation what modifications are necessary to extend the use of summary statistics to related individuals. The analysis also demonstrates that though existing summary-statistics-based fine-mapping methods can be adapted for related individuals, they appear to have no computational advantage over individual-data-based methods.


2016 ◽  
Author(s):  
Yue Li ◽  
Manolis Kellis

Genome-wide association studies (GWAS) provide a powerful approach for uncovering disease-associated variants in humans, but fine-mapping the causal variants remains a challenge. This is partly remedied by prioritization of disease-associated variants that overlap GWAS-enriched epigenomic annotations. Here, we introduce a new Bayesian model, RiVIERA-beta (Risk Variant Inference using Epigenomic Reference Annotations), for inference of driver variants by modelling summary statistics p-values with a Beta density function across multiple traits using hundreds of epigenomic annotations. In simulation, RiVIERA-beta showed promising power in detecting causal variants and causal annotations, and the multi-trait joint inference further improved the detection power. We applied RiVIERA-beta to model the existing GWAS summary statistics of 9 autoimmune diseases and Schizophrenia by jointly harnessing the potential causal enrichments among 848 tissue-specific epigenomics annotations from the ENCODE/Roadmap consortium covering 127 cell/tissue types and 8 major epigenomic marks. RiVIERA-beta identified meaningful tissue-specific enrichments for enhancer regions defined by H3K4me1 and H3K27ac for Blood T-Cell specifically in the 9 autoimmune diseases and Brain-specific enhancer activities exclusively in Schizophrenia. Moreover, the variants from the 95% credible sets exhibited high conservation and enrichments for GTEx whole-blood eQTLs located within transcription-factor-binding-sites and DNA-hypersensitive-sites. Furthermore, jointly modeling the nine immune traits by simultaneously inferring and exploiting the underlying epigenomic correlation between traits further improved the functional enrichments compared to single-trait models.


2019 ◽  
Author(s):  
Anna Hutchinson ◽  
Hope Watson ◽  
Chris Wallace

Abstract Genome Wide Association Studies (GWAS) have successfully identified thousands of loci associated with human diseases. Bayesian genetic fine-mapping studies aim to identify the specific causal variants within GWAS loci responsible for each association, reporting credible sets of plausible causal variants, which are interpreted as containing the causal variant with some “coverage probability”. Here, we use simulations to demonstrate that the coverage probabilities are over-conservative in most fine-mapping situations. We show that this is because fine-mapping data sets are not randomly selected from amongst all causal variants, but from amongst causal variants with larger effect sizes. We present a method to re-estimate the coverage of credible sets using rapid simulations based on the observed, or estimated, SNP correlation structure, we call this the “corrected coverage estimate”. This is extended to find “corrected credible sets”, which are the smallest set of variants such that their corrected coverage estimate meets the target coverage. We use our method to improve the resolution of a fine-mapping study of type 1 diabetes. We found that in 27 out of 39 associated genomic regions our method could reduce the number of potentially causal variants to consider for follow-up, and found that none of the 95% or 99% credible sets required the inclusion of more variants – a pattern matched in simulations of well powered GWAS. Crucially, our correction method requires only GWAS summary statistics and remains accurate when SNP correlations are estimated from a large reference panel. Using our method to improve the resolution of fine-mapping studies will enable more efficient expenditure of resources in the follow-up process of annotating the variants in the credible set to determine the implicated genes and pathways in human diseases.
Author summary Pinpointing specific genetic variants within the genome that are causal for human diseases is difficult due to complex correlation patterns existing between variants. Consequently, researchers typically prioritise a set of plausible causal variants for functional validation - these sets of putative causal variants are called “credible sets”. We find that the probabilistic interpretation that these credible sets do indeed contain the true causal variant is variable, in that the reported probabilities often underestimate the true coverage of the causal variant in the credible set. We have developed a method to provide researchers with a “corrected coverage estimate” that the true causal variant appears in the credible set, and this has been extended to find “corrected credible sets”, allowing for more efficient allocation of resources in the expensive follow-up laboratory experiments. We used our method to reduce the number of genetic variants to consider as causal candidates for follow-up in 27 genomic regions that are associated with type 1 diabetes.
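
For readers unfamiliar with credible sets, the following is a minimal sketch of the conventional (uncorrected) construction that the abstract takes as its starting point: rank variants by posterior probability of causality and add them until the cumulative probability reaches the target. It does not implement the authors' corrected coverage estimate. The function name `credible_set` and the variant identifiers are hypothetical, and the posterior probabilities are made-up illustrative values.

```python
def credible_set(posteriors, coverage=0.95):
    """Return (variants, cumulative probability) for a standard credible set.

    posteriors: dict mapping a variant ID to its posterior probability of
    being the causal variant (assumed to cover the whole region).
    """
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")
    total = sum(posteriors.values())  # normalise in case values do not sum to 1
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, cumulative = [], 0.0
    for variant, prob in ranked:
        chosen.append(variant)
        cumulative += prob / total
        if cumulative >= coverage:
            break
    return chosen, cumulative


# Illustrative values only; these are not results from any study.
example = {"rsA": 0.60, "rsB": 0.25, "rsC": 0.11, "rsD": 0.04}
variants, claimed = credible_set(example)
print(variants)  # ['rsA', 'rsB', 'rsC']
```

The cumulative probability returned here is the "claimed" coverage; the abstract's point is that the true coverage of such a set can differ from this number.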


2020 ◽  
Vol 29 (R1) ◽  
pp. R81-R88 ◽  
Author(s):  
Anna Hutchinson ◽  
Jennifer Asimit ◽  
Chris Wallace

Abstract Whilst thousands of genetic variants have been associated with human traits, identifying the subset of those variants that are causal requires a further ‘fine-mapping’ step. We review the basic fine-mapping approach, which is computationally fast and requires only summary data, but depends on an assumption of a single causal variant per associated region which is recognized as biologically unrealistic. We discuss different ways that the approach has been built upon to accommodate multiple causal variants in a region and to incorporate additional layers of functional annotation data. We further review methods for simultaneous fine-mapping of multiple datasets, either exploiting different linkage disequilibrium (LD) structures across ancestries or borrowing information between distinct but related traits. Finally, we look to the future and the opportunities that will be offered by increasingly accurate maps of causal variants for a multitude of human traits.
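
One common way to carry out the basic single-causal-variant approach described above is with Wakefield's approximate Bayes factors computed from per-variant effect estimates and standard errors. The sketch below assumes that formulation, equal prior probability for every variant, and a prior effect standard deviation of 0.15 (an assumed illustrative value, not taken from the review). The function names and variant IDs are hypothetical.

```python
import math


def log_abf(beta, se, prior_sd=0.15):
    """Log approximate Bayes factor (association vs. null) for one variant."""
    v = se ** 2            # sampling variance of the effect estimate
    w = prior_sd ** 2      # prior variance of the true effect (assumption)
    r = w / (v + w)        # shrinkage factor
    z = beta / se
    return 0.5 * math.log(1.0 - r) + 0.5 * z * z * r


def posterior_probs(effects, prior_sd=0.15):
    """Posterior probability that each variant is the single causal variant.

    effects: dict mapping variant ID to (beta, standard error).
    Assumes exactly one causal variant and equal priors across variants.
    """
    labfs = {k: log_abf(b, s, prior_sd) for k, (b, s) in effects.items()}
    top = max(labfs.values())  # subtract the maximum for numerical stability
    weights = {k: math.exp(v - top) for k, v in labfs.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Illustrative summary statistics only.
stats = {"rs1": (0.10, 0.02), "rs2": (0.08, 0.02), "rs3": (0.01, 0.02)}
for variant, pp in posterior_probs(stats).items():
    print(variant, round(pp, 4))
```

Because only effect estimates and standard errors are needed, this calculation requires no individual-level data, which is the property the review highlights; the price is the single-causal-variant assumption.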


2020 ◽  
Author(s):  
Cue Hyunkyu Lee ◽  
Huwenbo Shi ◽  
Bogdan Pasaniuc ◽  
Eleazar Eskin ◽  
Buhm Han

Abstract The identification of pleiotropic loci and the interpretation of the associations at these loci are essential to understand the shared etiology of related traits. A common approach to map pleiotropic loci is to use an existing meta-analysis method to combine summary statistics of multiple traits. This strategy does not take into account the complex genetic architectures of traits such as genetic correlations and heritabilities. Furthermore, the interpretation is challenging because phenotypes often have different characteristics and units. We propose PLEIO, a summary-statistic-based framework to map and interpret pleiotropic loci in a joint analysis of multiple traits. Our method maximizes power by systematically accounting for the genetic correlations and heritabilities of the traits in the association test. Any set of related phenotypes, binary or quantitative traits with differing units, can be combined seamlessly. In addition, our framework offers interpretation and visualization tools to help downstream analyses. Using our method, we combined 18 traits related to cardiovascular disease and identified 20 novel pleiotropic loci, which showed five different patterns of associations. Our method is available at https://github.com/hanlab-SNU/PLEIO.

