Constraints on eQTL fine mapping in the presence of multi-site local regulation of gene expression

2016 ◽  
Author(s):  
Biao Zeng ◽  
Luke R. Lloyd-Jones ◽  
Alexander Holloway ◽  
Urko M. Marigorta ◽  
Andres Metspalu ◽  
...  

Abstract. Expression QTL (eQTL) detection has emerged as an important tool for unravelling the relationship between genetic risk factors and disease or clinical phenotypes. Most studies use single marker linear regression to discover primary signals, followed by sequential conditional modeling to detect secondary genetic variants affecting gene expression. However, this approach assumes that functional variants are sparsely distributed and that close linkage between them has little impact on estimation of their precise location and magnitude of effects. In this study, we address the prevalence of secondary signals and bias in estimation of their effects by performing multi-site linear regression on two large human cohort peripheral blood gene expression datasets (each greater than 2,500 samples) with accompanying whole genome genotypes, namely the CAGE compendium of Illumina microarray studies, and the Framingham Heart Study Affymetrix data. Stepwise conditional modeling demonstrates that multiple eQTL signals are present for ~40% of over 3,500 eGenes in both datasets, and the number of loci with additional signals reduces by approximately two-thirds with each conditioning step. However, the concordance of specific signals between the two studies is only ~30%, indicating that the expression profiling platform is a large source of variance in effect estimation. Furthermore, a series of simulation studies imply that in the presence of multi-site regulation, up to 10% of the secondary signals could be artefacts of incomplete tagging, and at least 5% but up to one quarter of credible intervals may not even include the causal site, which is thus mis-localized. Joint multi-site effect estimation recalibrates effect size estimates by just a small amount on average. Presumably similar conclusions apply to most types of quantitative trait. Given the strong empirical evidence that gene expression is commonly regulated by more than one variant, we conclude that the fine-mapping of causal variants needs to be adjusted for multi-site influences, as conditional estimates can be highly biased by interference among linked sites.
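
The sequential conditional modeling mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `stepwise_conditional`, the t-statistic cutoff, and the toy data layout are assumptions made for illustration, and a real analysis would use p-values with multiple-testing correction plus covariates. Each round picks the SNP most strongly associated with the remaining expression signal, then residualizes both the expression and all other SNPs on it, so the next round is conditional on everything selected so far.

```python
import math

def center(v):
    """Subtract the mean so no explicit intercept is needed."""
    m = sum(v) / len(v)
    return [x - m for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def residualize(v, x):
    """Remove from v its least-squares projection onto x (both centered)."""
    b = dot(v, x) / dot(x, x)
    return [vi - b * xi for vi, xi in zip(v, x)]

def stepwise_conditional(genotypes, expression, t_threshold=3.0, max_steps=5):
    """Forward stepwise conditional scan (illustrative, hypothetical helper).

    genotypes: list of per-SNP dosage lists (0/1/2), one list per SNP.
    expression: list of expression values, one per individual.
    Returns [(snp_index, conditional_beta), ...] in order of selection.
    Each beta is conditional only on the SNPs selected before it; it is
    not a joint multi-site estimate.
    """
    n = len(expression)
    y = center(expression)
    snps = [center(g) for g in genotypes]
    selected = []
    chosen = set()
    for step in range(max_steps):
        df = n - 2 - step  # intercept + tested SNP + previously selected SNPs
        if df < 1:
            break
        best = None
        yy = dot(y, y)
        for j, g in enumerate(snps):
            gg = dot(g, g)
            # Skip selected SNPs and SNPs fully tagged by selected ones.
            if j in chosen or gg < 1e-12 or yy < 1e-12:
                continue
            gy = dot(g, y)
            r2 = min(gy * gy / (gg * yy), 1 - 1e-12)
            t = math.sqrt(r2 * df / (1 - r2))  # |t| of the partial regression
            if best is None or t > best[0]:
                best = (t, j, gy / gg)
        if best is None or best[0] < t_threshold:
            break
        _, j, beta = best
        selected.append((j, beta))
        chosen.add(j)
        lead = snps[j]
        # Condition on the lead SNP: residualize the trait and all other SNPs.
        y = residualize(y, lead)
        snps = [g if k == j else residualize(g, lead) for k, g in enumerate(snps)]
    return selected
```

Because forward selection always takes the strongest marginal signal first, a SNP that merely tags two linked causal sites can be selected ahead of either of them, which is the kind of artefact the abstract warns about.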

2017 ◽  
Vol 7 (8) ◽  
pp. 2533-2544 ◽  
Author(s):  
Biao Zeng ◽  
Luke R. Lloyd-Jones ◽  
Alexander Holloway ◽  
Urko M. Marigorta ◽  
Andres Metspalu ◽  
...  

Blood ◽  
2013 ◽  
Vol 122 (21) ◽  
pp. 437-437 ◽  
Author(s):  
Daniel E. Bauer ◽  
Sophia C. Kamran ◽  
Samuel Lessard ◽  
Jian Xu ◽  
Yuko Fujiwara ◽  
...  

Abstract. Introduction: Genome-wide association studies (GWAS) have ascertained numerous trait-associated common genetic variants localized to regulatory DNA. The hypothesis that regulatory variation accounts for substantial heritability has undergone scarce experimental evaluation. Common variation at BCL11A is estimated to explain ∼15% of the trait variance in fetal hemoglobin (HbF) level but the functional variants remain unknown. Materials and Methods: We use chromatin immunoprecipitation (ChIP), DNase I sensitivity and chromosome conformation capture to evaluate the BCL11A locus in mouse and human primary erythroblasts. We extensively genotype 1,263 samples from the Collaborative Study of Sickle Cell Disease within three HbF-associated erythroid DNase I hypersensitive sites (DHSs) at BCL11A. We pyrosequence heterozygous erythroblasts to assess allele-specific transcription factor binding and gene expression. We conduct transgenic analysis by mouse zygotic microinjection and genome editing with transcription activator-like effector nucleases (TALENs) and clustered, regularly interspaced, short palindromic repeats (CRISPR)/CRISPR-associated nuclease 9 (Cas9) RNA-guided nucleases. Results: Common genetic variation at BCL11A associated with HbF level lies in noncoding sequences decorated by an erythroid enhancer chromatin signature. Fine-mapping this putative regulatory DNA uncovers a motif-disrupting common variant associated with reduced GATA1 and TAL1 transcription factor binding, modestly diminished BCL11A expression and elevated HbF. This variant, rs1427407, accounts for the HbF association of the previously reported sentinel SNPs. The composite element functions in vivo as a developmental stage-specific lineage-restricted enhancer. Genome editing reveals that the enhancer is required in erythroid but dispensable in B-lymphoid cells for expression of BCL11A. We demonstrate species-specific functional components of the composite enhancer in mouse as compared to human erythroid precursor cells. The mouse sequences homologous to the human DHS sufficient to drive reporter activity are dispensable from the mouse composite element, whereas the adjacent DHS, whose human homolog does not direct reporter activity, is absolutely required for BCL11A expression. Conclusions: We describe a comprehensive and widely applicable approach, including chromatin mapping followed by fine-mapping, allele-specific ChIP and gene expression studies, and functional analyses, to reveal causal variants and critical elements. We assert that functional validation of regulatory DNA ought to include perturbation of the endogenous genomic context by genome editing and not solely rely on in vitro or ectopic surrogate assays. These results validate the hypothesis that common variation modulates cell type-specific regulatory elements, and reveal that although functional variants themselves may be of modest impact, their harboring elements may be critical for appropriate gene expression. We speculate that species-level functional differences in components of the composite enhancer might partially account for differences in timing of globin gene expression among animals. We suggest that the GWAS-marked BCL11A enhancer represents a highly attractive target for therapeutic genome editing for the major β-hemoglobin disorders. Disclosures: No relevant conflicts of interest to declare.


Circulation ◽  
2016 ◽  
Vol 133 (suppl_1) ◽  
Author(s):  
Chani J Hodonsky ◽  
Ursula Schick ◽  
Jonathan Kocarnik ◽  
Claudia Schurmann ◽  
Steve Buyske ◽  
...  

Introduction: Variability within the normal population range of hematocrit is associated with stroke and myocardial infarction. Published GWAS of hematocrit have identified multiple loci, yet few studies have included populations of Hispanic or African descent, thereby limiting opportunities to identify population-specific variants or narrow associated regions for functional analysis. We present a fine-mapping analysis of six previously identified hematocrit loci in African American and Hispanic/Latino participants of the PAGE study. Methods: Approximately 200,000 genotyped or imputed Metabochip variants were examined for association with hematocrit (proportion of whole blood comprising red blood cells) in 19,822 Hispanic/Latino and 19,973 African American participants. SNPs were excluded on a population-specific basis if effective heterozygosity was < 30. Primary and conditional analyses were performed in Plink, ProbABEL, or SuGen; fixed-effects meta-analyses were performed in Metal. Trans-ethnic and ancestry-specific meta-analyses were performed in MANTRA to generate 99% credible intervals for previously published variants that generalized to our populations. Results: We first examined whether 8,261 variants in five previously identified hematocrit loci (HFE, ABO, HK1, SH2B3/ATXN2, and TMPRSS6) were associated with hematocrit in our study populations. Three loci generalized (p < 1.7×10⁻⁴) to Hispanic/Latino participants (ABO, HK1, and TMPRSS6) and three generalized to African Americans (HFE, ABO, and SH2B3/ATXN2). Among generalized loci, conditional analyses adjusting for published variants in European-ancestry or East Asian populations did not identify any independently associated SNPs in Hispanic/Latinos or African Americans (p < 1.3×10⁻⁵). Trans-ethnic meta-analysis for the ABO locus resulted in a 5 SNP, 13kb 99% credible interval, shorter than both the Hispanic/Latino (17kb) and African American (360kb) credible intervals. In discovery analysis, we identified one variant associated with hematocrit in Hispanic/Latinos at array-wide significance levels (PROX1 locus, p < 3.0×10⁻⁷). No novel loci were identified in African Americans. Conclusion: Our findings provide evidence that the same genomic loci influence normal variation in hematocrit values across diverse ancestral populations. Trans-ethnic fine mapping of the gene-rich ABO locus (which has been associated in GWA studies with ischemic stroke, thrombosis, and myocardial infarction in addition to hematocrit) suggests that a functional variant may reside in the first intron of the ABO coding region. Additionally, identification of previously unidentified associations in Hispanic/Latinos emphasizes the importance of including diverse populations in association studies as well as the potential to identify population-specific functional variants within known or discovery loci.
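
As an illustration of how a 99% credible interval of the kind reported in this abstract is commonly assembled, here is a minimal sketch. It assumes a single causal variant per locus and equal prior probability for every SNP, so each SNP's posterior probability is its Bayes factor divided by the sum over the locus. The function `credible_set` and the SNP labels in the usage note are hypothetical; this is not a description of MANTRA's internals.

```python
def credible_set(bayes_factors, coverage=0.99):
    """Smallest set of SNPs whose posterior probabilities sum to >= coverage.

    bayes_factors: dict mapping SNP id -> Bayes factor for association.
    Assumes one causal variant and a flat prior (illustrative assumption).
    Returns (list_of_snp_ids, cumulative_posterior).
    """
    total = sum(bayes_factors.values())
    # Rank SNPs from strongest to weakest evidence.
    ranked = sorted(bayes_factors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, cumulative = [], 0.0
    for snp, bf in ranked:
        chosen.append(snp)
        cumulative += bf / total  # posterior probability of this SNP
        if cumulative >= coverage:
            break
    return chosen, cumulative
```

For example, with made-up Bayes factors `{"snpA": 90, "snpB": 9.5, "snpC": 0.4, "snpD": 0.1}`, the 99% set contains `snpA` and `snpB`. The physical span of the interval is then the distance between the outermost SNPs in the set, which is why sharper posteriors in a trans-ethnic analysis can shorten it.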


Genes ◽  
2020 ◽  
Vol 11 (5) ◽  
pp. 504 ◽  
Author(s):  
Ruoyu Tian ◽  
Yidan Pan ◽  
Thomas H. A. Etheridge ◽  
Harshavardhan Deshmukh ◽  
Dalia Gulick ◽  
...  

The majority of genetic variants affecting complex traits map to regulatory regions of genes, and typically lie in credible intervals of 100 or more SNPs. Fine mapping of the causal variant(s) at a locus depends on assays that are able to discriminate the effects of polymorphisms or mutations on gene expression. Here, we evaluated a moderate-throughput CRISPR-Cas9 mutagenesis approach, based on replicated measurement of transcript abundance in single-cell clones, by deleting candidate regulatory SNPs, affecting four genes known to be affected by large-effect expression Quantitative Trait Loci (eQTL) in leukocytes, and using Fluidigm qRT-PCR to monitor gene expression in HL60 pro-myeloid human cells. We concluded that there were multiple constraints that rendered the approach generally infeasible for fine mapping. These included the non-targetability of many regulatory SNPs, clonal variability of single-cell derivatives, and expense. Power calculations based on the measured variance attributable to major sources of experimental error indicated that typical eQTL explaining 10% of the variation in expression of a gene would usually require at least eight biological replicates of each clone. Scanning across credible intervals with this approach is not recommended.


2021 ◽  
Vol 80 (Suppl 1) ◽  
pp. 9.1-9
Author(s):  
M. Houtman ◽  
X. Ge ◽  
A. Mcgovern ◽  
K. Klein ◽  
G. Orozco ◽  
...  

Background: Over the past decade, genome wide association studies (GWAS) have identified the JAZF1 locus as a risk locus for several autoimmune diseases, including rheumatoid arthritis (RA) [1]. However, the exact causal variants in the JAZF1 locus and their underlying regulatory events contributing to RA are still not known. Here, we focus on the effect of these variants on gene expression in synovial fibroblasts (SF). Objectives: To characterize the functional consequences of RA-causal variants in the JAZF1 locus in SF. Methods: Genetic fine-mapping of RA loci was conducted by computing sets of credible variants driving GWAS signals. These credible variant sets were integrated with DNA architecture (ChIP-seq), 3D chromatin interactions (3C, HiC and capture HiC), DNA accessibility (ATAC-seq) and gene expression (RNA-seq and CAGE-seq) datasets to select putative RA-causal variants in SF. Selected variants in the JAZF1 locus were tested for regulatory function by luciferase reporter assays and electrophoretic mobility shift assays (EMSA) in the fibrosarcoma cell line HT1080. The JASPAR2020 database was used to identify putative transcription factors (TF) binding to the selected variants. The expression of HOTTIP was measured by quantitative PCR in hand SF (n=23). Genotyping was done by pyrosequencing. Results: Genetic fine mapping revealed 47 variants in the JAZF1 locus. Integration of these variants with the chromatin datasets prioritized rs2158624, rs57585717 and rs186735625 as the top candidates (posterior probability of causality >0.1) in the JAZF1 locus. We found that rs2158624 and rs186735625 are located in the vicinity of enhancer elements in SF as determined by ATAC-seq. In addition, the region of rs2158624 exhibited strong chromatin interactions with the genomic region of HOTTIP and HOXA13. Both these transcripts were previously shown to be specifically expressed in SF isolated from hands and feet [2]. Based on this, we selected rs2158624 as the most promising candidate in the JAZF1 locus. We found that the rs2158624-C allele (risk) is associated with lower expression of HOTTIP, but not HOXA13, in hand SF compared to the rs2158624-T allele (non-risk) (p=0.02). Luciferase assays in HT1080 cells demonstrated enhancer activity with both the rs2158624-C allele (p=0.006) and T allele (p=0.04), with no significant difference in enhancer activity between the rs2158624-C and T allele. EMSAs identified stronger specific binding of HT1080-cell nuclear extract for the rs2158624-T allele than for the C allele (risk). Based on the JASPAR2020 database, we identified NFAT5 as a potential TF that can bind to rs2158624 and may regulate the expression of HOTTIP. Conclusion: We were able to substantially narrow down the potential functional variants in the JAZF1 locus using our data integration approach and functional assays. We suggest that the risk allele of rs2158624 influences the binding of TFs controlling the expression of the long non-coding RNA HOTTIP in SF, which might confer specific risk to develop RA in hands.
References: [1] Okada Y et al. Genetics of rheumatoid arthritis contributes to biology and drug discovery. Nature 2014;506:376. [2] Frank-Bertoncelj M et al. Epigenetically-driven anatomical diversity of synovial fibroblasts guides joint-specific fibroblast functions. Nat Commun 2017;8:14852.
Disclosure of Interests: Miranda Houtman: None declared, Xiangyu Ge: None declared, Amanda McGovern: None declared, Kerstin Klein: None declared, Gisela Orozco: None declared, Mojca Frank Bertoncelj: None declared, Miriam Marks: None declared, Oliver Distler Speakers bureau: Bayer, Boehringer Ingelheim, iQone, Medscape, MSD, Novartis, Pfizer and Roche, Consultant of: Abbvie, Acceleron Pharma, Amgen, AnaMar, Arxx Therapeutics, Bayer, Baecon Discovery, Boehringer, CSL Behring, ChemomAb, Corbus Pharmaceuticals, Galapagos NV, GSK, Glenmark Pharmaceuticals, Horizon Pharmaceuticals, Inventiva, Italfarmaco, iQvia, Kymera, Lilly, Medac, Medscape, Mitsubishi Tanabe Pharma, MSD, Pfizer, Roche, Roivant Sciences, Sanofi and UCB, Grant/research support from: Kymera Therapeutics and Mitsubishi Tanabe, Paul Martin: None declared, Stephen Eyre: None declared, Caroline Ospelt: None declared


2020 ◽  
Vol 477 (16) ◽  
pp. 3091-3104 ◽  
Author(s):  
Luciana E. Giono ◽  
Alberto R. Kornblihtt

Gene expression is an intricately regulated process that is at the basis of cell differentiation, the maintenance of cell identity and the cellular responses to environmental changes. Alternative splicing, the process by which multiple functionally distinct transcripts are generated from a single gene, is one of the main mechanisms that contribute to expand the coding capacity of genomes and help explain the level of complexity achieved by higher organisms. Eukaryotic transcription is subject to multiple layers of regulation both intrinsic — such as promoter structure — and dynamic, allowing the cell to respond to internal and external signals. Similarly, alternative splicing choices are affected by all of these aspects, mainly through the regulation of transcription elongation, making it a regulatory knob on a par with the regulation of gene expression levels. This review aims to recapitulate some of the history and stepping-stones that led to the paradigms held today about transcription and splicing regulation, with major focus on transcription elongation and its effect on alternative splicing.

