Cross-population Joint Analysis of eQTLs: Fine Mapping and Functional Annotation

2014 ◽  
Author(s):  
Xiaoquan Wen ◽  
Francesca Luca ◽  
Roger Pique-Regi

Mapping expression quantitative trait loci (eQTLs) has been shown to be a powerful tool to uncover the genetic underpinnings of many complex traits at the molecular level. In this paper, we present an integrative analysis approach that leverages eQTL data collected from multiple population groups. In particular, our approach effectively identifies multiple independent cis-eQTL signals that are consistently present across populations, accounting for heterogeneity in allele frequencies and patterns of linkage disequilibrium. Furthermore, our analysis framework enables the integration of high-resolution functional annotations into the analysis of eQTLs. We applied our statistical approach to analyze the GEUVADIS data consisting of samples from five population groups. From this analysis, we concluded that i) joint analysis across population groups greatly improves the power of eQTL discovery and the resolution of fine mapping of causal eQTLs; ii) many genes harbor multiple independent eQTLs in their cis regions; iii) genetic variants that disrupt transcription factor binding are significantly enriched in eQTLs (p-value = 4.93 × 10^-22).


2021 ◽  
Author(s):  
Wenmin Zhang ◽  
Hamed S Najafabadi ◽  
Yue Li

Identifying causal variants from genome-wide association studies (GWASs) is challenging due to widespread linkage disequilibrium (LD). Functional annotations of the genome may help prioritize variants that are biologically relevant and thus improve fine-mapping of GWAS results. However, classical fine-mapping methods have a high computational cost, particularly when the underlying genetic architecture and LD patterns are complex. Here, we propose a novel approach, SparsePro, to efficiently conduct functionally informed statistical fine-mapping. Our method enjoys two major innovations: First, by creating a sparse low-dimensional projection of the high-dimensional genotype, we enable a linear search of causal variants instead of an exponential search of causal configurations used in existing methods; Second, we adopt a probabilistic framework with a highly efficient variational expectation-maximization algorithm to integrate statistical associations and functional priors. We evaluate SparsePro through extensive simulations using resources from the UK Biobank. Compared to state-of-the-art methods, SparsePro achieved more accurate and well-calibrated posterior inference with greatly reduced computation time. We demonstrate the utility of SparsePro by investigating the genetic architecture of five functional biomarkers of vital organs. We identify potential causal variants contributing to the genetically encoded coordination mechanisms between vital organs and pinpoint target genes with potential pleiotropic effects. In summary, we have developed an efficient genome-wide fine-mapping method with the ability to integrate functional annotations. Our method may have wide utility in understanding the genetics of complex traits as well as in increasing the yield of functional follow-up studies of GWASs.



2020 ◽  
Author(s):  
Samuel S. Kim ◽  
Kushal K. Dey ◽  
Omer Weissbrod ◽  
Carla Marquez-Luna ◽  
Steven Gazal ◽  
...  

Abstract: Despite considerable progress on pathogenicity scores prioritizing both coding and noncoding variants for Mendelian disease, little is known about the utility of these pathogenicity scores for common disease. Here, we sought to assess the informativeness of Mendelian disease-derived pathogenicity scores for common disease, and to improve upon existing scores. We first applied stratified LD score regression to assess the informativeness of annotations defined by top variants from published Mendelian disease-derived pathogenicity scores across 41 independent common diseases and complex traits (average N = 320K). Several of the resulting annotations were informative for common disease, even after conditioning on a broad set of coding, conserved, regulatory and LD-related annotations from the baseline-LD model. We then improved upon the published pathogenicity scores by developing AnnotBoost, a gradient boosting-based framework to impute and denoise pathogenicity scores using functional annotations from the baseline-LD model. AnnotBoost substantially increased the informativeness for common disease of both previously uninformative and previously informative pathogenicity scores, implying pervasive variant-level overlap between Mendelian disease and common disease. The boosted scores also produced significant improvements in heritability model fit and in classifying disease-associated, fine-mapped SNPs. Our boosted scores have high potential to improve candidate gene discovery and fine-mapping for common disease.



2017 ◽  
Author(s):  
Farhad Hormozdiari ◽  
Steven Gazal ◽  
Bryce van de Geijn ◽  
Hilary Finucane ◽  
Chelsea J.-T. Ju ◽  
...  

Abstract: There is increasing evidence that many GWAS risk loci are molecular QTL for gene expression (eQTL), histone modification (hQTL), splicing (sQTL), and/or DNA methylation (meQTL). Here, we introduce a new set of functional annotations based on causal posterior probabilities (CPP) of fine-mapped molecular cis-QTL, using data from the GTEx and BLUEPRINT consortia. We show that these annotations are very strongly enriched for disease heritability across 41 independent diseases and complex traits (average N = 320K): 5.84x for GTEx eQTL, and 5.44x for eQTL, 4.27-4.28x for hQTL (H3K27ac and H3K4me1), 3.61x for sQTL and 2.81x for meQTL in BLUEPRINT (all P ≤ 1.39e-10), far higher than enrichments obtained using standard functional annotations that include all significant molecular cis-QTL (1.17-1.80x). eQTL annotations that were obtained by meta-analyzing all 44 GTEx tissues generally performed best, but tissue-specific blood eQTL annotations produced stronger enrichments for autoimmune diseases and blood cell traits and tissue-specific brain eQTL annotations produced stronger enrichments for brain-related diseases and traits, despite high cis-genetic correlations of eQTL effect sizes across tissues. Notably, eQTL annotations restricted to loss-of-function intolerant genes from ExAC were even more strongly enriched for disease heritability (17.09x; vs. 5.84x for all genes; P = 4.90e-17 for difference). All molecular QTL except sQTL remained significantly enriched for disease heritability in a joint analysis conditioned on each other and on a broad set of functional annotations from previous studies, implying that each of these annotations is uniquely informative for disease and complex trait architectures.



2020 ◽  
Vol 11 (1) ◽  
Author(s):  
Samuel S. Kim ◽  
Kushal K. Dey ◽  
Omer Weissbrod ◽  
Carla Márquez-Luna ◽  
Steven Gazal ◽  
...  

Abstract: Despite considerable progress on pathogenicity scores prioritizing variants for Mendelian disease, little is known about the utility of these scores for common disease. Here, we assess the informativeness of Mendelian disease-derived pathogenicity scores for common disease and improve upon existing scores. We first apply stratified linkage disequilibrium (LD) score regression to evaluate published pathogenicity scores across 41 common diseases and complex traits (average N = 320K). Several of the resulting annotations are informative for common disease, even after conditioning on a broad set of functional annotations. We then improve upon published pathogenicity scores by developing AnnotBoost, a machine learning framework to impute and denoise pathogenicity scores using a broad set of functional annotations. AnnotBoost substantially increases the informativeness for common disease of both previously uninformative and previously informative pathogenicity scores, implying that Mendelian and common disease variants share similar properties. The boosted scores also produce improvements in heritability model fit and in classifying disease-associated, fine-mapped SNPs. Our boosted scores may improve fine-mapping and candidate gene discovery for common disease.



2019 ◽  
Author(s):  
Omer Weissbrod ◽  
Farhad Hormozdiari ◽  
Christian Benner ◽  
Ran Cui ◽  
Jacob Ulirsch ◽  
...  

Abstract: Fine-mapping aims to identify causal variants impacting complex traits. Several recent methods improve fine-mapping accuracy by prioritizing variants in enriched functional annotations. However, these methods can only use information at genome-wide significant loci (or a small number of functional annotations), severely limiting the benefit of functional data. We propose PolyFun, a computationally scalable framework to improve fine-mapping accuracy using genome-wide functional data for a broad set of coding, conserved, regulatory and LD-related annotations. PolyFun prioritizes variants in enriched functional annotations by specifying prior causal probabilities for fine-mapping methods such as SuSiE or FINEMAP, employing special procedures to ensure robustness to model misspecification and winner’s curse. In simulations with in-sample LD, PolyFun + SuSiE and PolyFun + FINEMAP were well-calibrated and identified >20% more variants with posterior causal probability >0.95 than their non-functionally informed counterparts (and >33% more fine-mapped variants than previous functionally-informed fine-mapping methods). In simulations with mismatched reference LD, PolyFun + SuSiE remained well-calibrated when reducing the maximum number of assumed causal SNPs per locus, which reduces absolute power but still produces large relative improvements. In analyses of 49 UK Biobank traits (average N=318K) with in-sample LD, PolyFun + SuSiE identified 3,025 fine-mapped variant-trait pairs with posterior causal probability >0.95, a >32% improvement vs. SuSiE; 223 variants were fine-mapped for multiple genetically uncorrelated traits, indicating pervasive pleiotropy. We used posterior mean per-SNP heritabilities from PolyFun + SuSiE to perform polygenic localization, constructing minimal sets of common SNPs causally explaining 50% of common SNP heritability; these sets ranged in size from 28 (hair color) to 3,400 (height) to 2 million (number of children). In conclusion, PolyFun prioritizes variants for functional follow-up and provides insights into complex trait architectures.
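The polygenic localization step described in this abstract (a minimal set of SNPs explaining a given share of heritability) can be illustrated with a small sketch. This is not PolyFun's implementation; it only assumes that a vector of non-negative per-SNP heritability estimates is already available, and the function name `minimal_snp_set` and the toy values are hypothetical.

```python
import numpy as np

def minimal_snp_set(per_snp_h2, fraction=0.5):
    """Return indices of the smallest SNP set whose summed per-SNP
    heritability reaches `fraction` of the total.

    Hypothetical helper for illustration: SNPs are ranked by their
    per-SNP heritability and taken greedily from the top, which gives
    the smallest such set when the estimates are non-negative.
    """
    h2 = np.asarray(per_snp_h2, dtype=float)
    if h2.ndim != 1 or h2.size == 0:
        raise ValueError("per_snp_h2 must be a non-empty 1-D array")
    if np.any(h2 < 0):
        raise ValueError("per-SNP heritabilities must be non-negative")
    total = h2.sum()
    if total <= 0:
        raise ValueError("total heritability must be positive")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    # Rank SNPs from largest to smallest contribution.
    order = np.argsort(-h2, kind="stable")
    cumulative = np.cumsum(h2[order])

    # First position at which the running sum reaches the target share.
    k = int(np.searchsorted(cumulative, fraction * total, side="left")) + 1
    k = min(k, h2.size)  # guard against floating-point round-off
    return order[:k]

# Toy example with five SNPs (values are made up).
toy_h2 = [0.10, 0.40, 0.05, 0.30, 0.15]
selected = minimal_snp_set(toy_h2, fraction=0.5)
print(selected.tolist())
```

The size of the returned set is the quantity the abstract reports per trait (for example 28 for hair color); a highly polygenic trait needs many more SNPs to reach the same share.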



2022 ◽  
Author(s):  
Wenmin Zhang ◽  
Hamed Najafabadi ◽  
Yue Li

Abstract: Identifying causal variants from genome-wide association studies (GWASs) is challenging due to widespread linkage disequilibrium (LD). Functional annotations of the genome may help prioritize variants that are biologically relevant and thus improve fine-mapping of GWAS results. However, classical fine-mapping methods have a high computational cost, particularly when the underlying genetic architecture and LD patterns are complex. Here, we propose a novel approach, SparsePro, to efficiently conduct genome-wide fine-mapping. Our method enjoys two major innovations: First, by creating a sparse low-dimensional projection of the high-dimensional genotype data, we enable a linear search of causal variants instead of a combinatorial search of causal configurations used in most existing methods; Second, we adopt a probabilistic framework with a highly efficient variational expectation-maximization algorithm to integrate statistical associations and functional priors. We evaluate SparsePro through extensive simulations using resources from the UK Biobank. Compared to state-of-the-art methods, SparsePro achieved more accurate and well-calibrated posterior inference with greatly reduced computation time. We demonstrate the utility of SparsePro by investigating the genetic architecture of five functional biomarkers of vital organs. We show that, compared to other methods, the causal variants identified by SparsePro are highly enriched for expression quantitative trait loci and explain a larger proportion of trait heritability. We also identify potential causal variants contributing to the genetically encoded coordination mechanisms between vital organs, and pinpoint target genes with potential pleiotropic effects. In summary, we have developed an efficient genome-wide fine-mapping method with the ability to integrate functional annotations. Our method may have wide utility in understanding the genetics of complex traits as well as in increasing the yield of functional follow-up studies of GWASs. SparsePro software is available on GitHub at https://github.com/zhwm/SparsePro.



Author(s):  
Leah C. Solberg Woods ◽  
Abraham A. Palmer


2021 ◽  
Author(s):  
Richard F Oppong ◽  
Pau Navarro ◽  
Chris S Haley ◽  
Sara Knott

We describe a genome-wide analytical approach, SNP and Haplotype Regional Heritability Mapping (SNHap-RHM), that provides regional estimates of the heritability across locally defined regions in the genome. This approach utilises relationship matrices that are based on sharing of SNP and haplotype alleles at local haplotype blocks delimited by recombination boundaries in the genome. We implemented the approach on simulated data and show that the haplotype-based regional GRMs capture variation that is complementary to that captured by SNP-based regional GRMs, thus justifying the fitting of the two GRMs jointly in a single analysis (SNHap-RHM). SNHap-RHM captures regions in the genome contributing to the phenotypic variation that existing genome-wide analysis methods may fail to capture. We further demonstrate that there are real benefits to be gained from this approach by applying it to real data from about 20,000 individuals from the Generation Scotland: Scottish Family Health Study. We analysed height and major depressive disorder (MDD). We identified seven genomic regions that are genome-wide significant for height, and three regions significant at a suggestive threshold (p-value < 1 × 10^-5) for MDD. These significant regions have genes mapped to within 400kb of them. The genes mapped for height have been reported to be associated with height in humans, while those mapped for MDD have been reported to be associated with major depressive disorder and other psychiatric phenotypes. The results show that SNHap-RHM presents an exciting new opportunity to analyse complex traits by allowing the joint mapping of novel genomic regions tagged by either SNPs or haplotypes, potentially leading to the recovery of some of the "missing" heritability.



2020 ◽  
Author(s):  
Lactatia Matsie Motsuku ◽  
Wenlong Carl Chen ◽  
Mazvita Molleen Muchengeti ◽  
Tamlyn Mac Quene ◽  
Patricia Kellett ◽  
...  

Abstract: Background: South Africa (SA) has experienced a rapid transition in the Human Development Index (HDI) over the past decade, which had an effect on the incidence and mortality rates of colorectal cancer (CRC). This study aims to provide CRC incidence and mortality trends by population group and sex in SA from 2002 to 2014. Methods: Incidence data were extracted from the South African National Cancer Registry and mortality data obtained from Statistics South Africa (STATS SA), for the period 2002 to 2014. Age-standardised incidence rates (ASIR) and age-standardised mortality rates (ASMR) were calculated using the STATS SA mid-year population as the denominator and the Segi world standard population data for standardisation. A Joinpoint regression analysis was computed for the CRC ASIR and ASMR by population group and sex. Results: A total of 33,232 incident CRC cases and 26,836 CRC deaths were reported during the study period. Of the CRC cases reported, 54% were males and 46% were females, and among deaths reported, 47% were males and 53% were females. Overall, there was a 2.5% annual average percentage change (AAPC) increase in ASIR from 2002 to 2014 (95% CI: 0.6-4.5, p-value <0.001). For ASMR overall, there was a 1.3% increase from 2002 to 2014 (95% CI: 0.1-2.6, p-value <0.001). The ASIR and ASMR among population groups were stable, with the exception of the Black population group. The ASIR increased consistently at 4.3% for black males (95% CI: 1.9-6.7, p-value <0.001) and 3.4% for black females (95% CI: 1.5-5.3, p-value <0.001) from 2002 to 2014, respectively. Similarly, ASMR for black males and females increased by 4.2% (95% CI: 2.0-6.5, p-value <0.001) and 3.4% (95% CI: 2.0-4.8, p-value <0.01) from 2002 to 2014, respectively. Conclusions: The disparities in the CRC incidence and mortality trends may reflect socioeconomic inequalities across different population groups in SA. The rapid increase in CRC trends among the Black population group is concerning and requires further investigation and increased efforts for cancer prevention, early screening and diagnosis, as well as better access to cancer treatment.
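The age-standardised rates mentioned in the Methods follow the usual direct standardisation formula: each age band's rate is weighted by that band's share of a standard population. The sketch below is a minimal illustration under stated assumptions; the function name `age_standardised_rate` is hypothetical, and the three age bands, counts and weights are toy values, not the Segi world standard population or the study's data.

```python
def age_standardised_rate(cases, population, standard_weights, per=100_000):
    """Directly age-standardised rate per `per` persons.

    cases[i]            : events observed in age band i
    population[i]       : person-years (or mid-year population) in band i
    standard_weights[i] : size or share of band i in the standard population
    All three sequences must be aligned on the same age bands.
    """
    if not (len(cases) == len(population) == len(standard_weights)):
        raise ValueError("inputs must cover the same age bands")
    if len(cases) == 0:
        raise ValueError("at least one age band is required")
    if any(p <= 0 for p in population):
        raise ValueError("population counts must be positive")
    total_weight = sum(standard_weights)
    if total_weight <= 0:
        raise ValueError("standard weights must sum to a positive value")

    # Weighted average of the age-specific rates.
    weighted = sum(
        w * (c / p)
        for c, p, w in zip(cases, population, standard_weights)
    )
    return per * weighted / total_weight

# Toy example: three age bands with made-up counts and weights.
toy_cases = [10, 50, 200]
toy_population = [100_000, 80_000, 50_000]
toy_weights = [0.5, 0.3, 0.2]
print(age_standardised_rate(toy_cases, toy_population, toy_weights))
# approximately 103.75 per 100,000
```

Because the weights come from a fixed standard population, rates computed this way can be compared across years and population groups even when their age structures differ, which is what a trend analysis such as Joinpoint regression requires.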



2016 ◽  
Vol 2016 ◽  
pp. 1-6 ◽  
Author(s):  
Wei Wei ◽  
Paula S. Ramos ◽  
Kelly J. Hunt ◽  
Bethany J. Wolf ◽  
Gary Hardiman ◽  
...  

Genome-wide association studies (GWAS) have identified tens of thousands of genetic variants associated with hundreds of phenotypes and diseases, which have provided clinical and medical benefits to patients with novel biomarkers and therapeutic targets. Recently, there has been accumulating evidence suggesting that different complex traits share a common risk basis, namely, pleiotropy. Previously, a statistical method, namely, GPA (Genetic analysis incorporating Pleiotropy and Annotation), was developed to improve identification of risk variants and to investigate pleiotropic structure through a joint analysis of multiple GWAS datasets. While GPA provides a statistically rigorous framework to evaluate pleiotropy between phenotypes, it is still not trivial to investigate genetic relationships among a large number of phenotypes using the GPA framework. In order to address this challenge, in this paper, we propose a novel approach, GPA-MDS, to visualize genetic relationships among phenotypes using the GPA algorithm and multidimensional scaling (MDS). This tool will help researchers to investigate common etiology among diseases, which can potentially lead to development of common treatments across diseases. We evaluate the proposed GPA-MDS framework using a simulation study and apply it to jointly analyze GWAS datasets examining 18 unique phenotypes, which helps reveal the shared genetic architecture of these phenotypes.
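To make the multidimensional scaling step concrete, the following sketch implements classical (Torgerson) MDS with NumPy. It is an illustration only, not the GPA-MDS code: the abstract does not say how pairwise distances between phenotypes are derived from GPA, so the input here is simply assumed to be a symmetric dissimilarity matrix, and `classical_mds` is a hypothetical name.

```python
import numpy as np

def classical_mds(dist, n_components=2):
    """Embed items in `n_components` dimensions from pairwise distances.

    Classical MDS: double-centre the squared distance matrix and use the
    leading eigenvectors, scaled by the square roots of their eigenvalues.
    """
    d = np.asarray(dist, dtype=float)
    if d.ndim != 2 or d.shape[0] != d.shape[1]:
        raise ValueError("dist must be a square matrix")
    n = d.shape[0]

    # Centering matrix J = I - (1/n) 11^T, then B = -1/2 J D^2 J.
    j = np.eye(n) - np.ones((n, n)) / n
    b = -0.5 * j @ (d ** 2) @ j

    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(b)
    top = np.argsort(eigvals)[::-1][:n_components]

    # Negative eigenvalues (non-Euclidean input) are clipped to zero.
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale

# Toy check: distances between three points in the plane are recovered
# up to rotation, reflection and translation.
points = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0]])
toy_dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
embedding = classical_mds(toy_dist, n_components=2)
print(embedding.shape)
```

In a phenotype-level analysis, each row of the embedding would be one phenotype, and phenotypes that sit close together in the plot would be those with small pairwise dissimilarity.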


