Putative bovine topological association domains and CTCF binding motifs can reduce the search space for causative regulatory variants of complex traits

2018 ◽  
Author(s):  
Min Wang ◽  
Timothy P Hancock ◽  
Amanda J. Chamberlain ◽  
Christy J. Vander Jagt ◽  
Jennie E Pryce ◽  
...  

Abstract
Background: Topological association domains (TADs) are chromosomal domains characterised by frequent internal DNA-DNA interactions. The transcription factor CTCF binds to conserved DNA sequence patterns called CTCF binding motifs to either prohibit or facilitate chromosomal interactions. TADs and CTCF binding motifs control gene expression, but they are not yet well defined in the bovine genome. In this paper, we sought to improve the annotation of bovine TADs and CTCF binding motifs, and assess whether the new annotation can reduce the search space for cis-regulatory variants.
Results: We used genomic synteny to map TADs and CTCF binding motifs from humans, mice, dogs and macaques to the bovine genome. We found that our mapped TADs exhibited the same hallmark properties as those sourced from experimental data, such as housekeeping genes, tRNA genes, CTCF binding motifs, SINEs, H3K4me3 and H3K27ac. Then we showed that runs of genes with the same pattern of allele-specific expression (ASE) (either favouring paternal or maternal allele) were often located in the same TAD or between the same conserved CTCF binding motifs. Analyses of variance showed that when averaged across all bovine tissues tested, TADs explained 14% of ASE variation (standard deviation, SD: 0.056), while CTCF explained 27% (SD: 0.078). Furthermore, we showed that the quantitative trait loci (QTLs) associated with gene expression variation (eQTLs) or ASE variation (aseQTLs), which were identified from mRNA transcripts from 141 lactating cows’ white blood and milk cells, were highly enriched at putative bovine CTCF binding motifs. The most significant aseQTL and eQTL for each genic target were located within the same TAD as the gene more often than expected (chi-squared test P-value ≤ 0.001).
Conclusions: Our results suggest that genomic synteny can be used to functionally annotate conserved transcriptional components, and provides a tool to reduce the search space for causative regulatory variants in the bovine genome.
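The same-TAD enrichment reported in the abstract above can be illustrated with a chi-squared goodness-of-fit test on one degree of freedom. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `same_tad_chi2`, the counts, and the expected proportion are all hypothetical placeholders chosen for illustration.

```python
import math


def same_tad_chi2(n_same, n_total, p_expected):
    """Chi-squared goodness-of-fit test (1 df) for same-TAD co-location.

    n_same     -- observed pairs where the top QTL shares a TAD with its gene
    n_total    -- all QTL-gene pairs tested
    p_expected -- proportion expected under the null (an assumption here)
    """
    if not 0.0 < p_expected < 1.0:
        raise ValueError("p_expected must lie strictly between 0 and 1")
    exp_same = n_total * p_expected
    exp_diff = n_total - exp_same
    obs_diff = n_total - n_same
    chi2 = ((n_same - exp_same) ** 2 / exp_same
            + (obs_diff - exp_diff) ** 2 / exp_diff)
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts, for illustration only (not taken from the study).
chi2, p = same_tad_chi2(n_same=60, n_total=100, p_expected=0.5)
```

The statistic itself does not indicate direction, so a claim of "more often than expected" would also require checking that the observed same-TAD count exceeds the expected count.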

BMC Genomics ◽  
2018 ◽  
Vol 19 (1) ◽  
Author(s):  
Min Wang ◽  
Timothy P. Hancock ◽  
Amanda J. Chamberlain ◽  
Christy J. Vander Jagt ◽  
Jennie E. Pryce ◽  
...  

2018 ◽  
Author(s):  
Emily C Glassberg ◽  
Ziyue Gao ◽  
Arbel Harpak ◽  
Xun Lan ◽  
Jonathan K Pritchard

Gene expression variation is a major contributor to phenotypic variation in human complex traits. Selection on complex traits may therefore be reflected in constraint on gene expression levels. Here, we explore the effects of stabilizing selection on cis-regulatory genetic variation in humans. We analyze patterns of expression variation at copy number variants and find evidence for selection against large increases in gene expression. Using allele-specific expression (ASE) data, we further show evidence of selection against smaller-effect variants. We estimate that, across all genes, singletons in a sample of 122 individuals have approximately 2.5 × greater effects on expression variance than common variants. Despite their increased effect sizes relative to common variants, we estimate that singletons in the sample studied explain, on average, only 5% of the heritability of gene expression from cis-regulatory variants. Finally, we show that genes depleted for loss-of-function variants are also depleted for cis-eQTLs and have low levels of allelic imbalance, confirming tighter constraint on the expression levels of these genes. We conclude that constraint on gene expression is present, but has relatively weak effects on most cis-regulatory variants, thus permitting high levels of gene-regulatory genetic variation.


Blood ◽  
2018 ◽  
Vol 132 (Supplement 1) ◽  
pp. 2608-2608
Author(s):  
Claudia Gebhard ◽  
Roger Mulet-Lazaro ◽  
Lucia Schwarzfischer ◽  
Dagmar Glatz ◽  
Margit Nuetzel ◽  
...  

Abstract Acute myeloid leukemia (AML) represents a highly heterogeneous myeloid stem cell disorder classified based on various genetic defects. Besides genetic alterations, epigenetic changes are recognized as an additional mechanism contributing to leukemogenesis, but insight into the latter process remains limited. Using a combination of Methyl-CpG-Immunoprecipitation (MCIp-chip) and MALDI-TOF analysis of bisulfite-treated DNA in a cohort of 196 AML patients, we previously demonstrated that (cyto)genetically defined AML subtypes, including CBFB-MYH11, AML-ETO, NPM1-mut, CEBPA-mut or IDH1/2-mut subtypes, express specific DNA-methylation profiles (Gebhard et al, Leukemia, 2018). A fraction of AML patients (5/196) displayed a unique abnormal hypermethylation profile that was completely distinct from any other AML subtype. These patients present immature leukemia (FAB M0, M1) with various chromosomal aberrations but very few mutations (e.g. no IDH1/2, KRAS, DNMT3A) that might explain the CpG island methylator phenotype (CIMP). The CIMP patients showed high resemblance with a recently reported CEBPA methylated subgroup (Wouters et al, 2007 and Figueroa et al, 2009), which we confirmed by MCIp-chip and MALDI-TOF analysis. To explore the whole range of epigenetic alterations in the CIMP-AML patients, we performed in-depth global DNA methylation and gene expression analyses (MCIp-seq and RNA-seq) in 45 AML and 12 CIMP patients from both studies. Principal component analysis and t-distributed stochastic neighbor embedding (t-SNE) revealed that CIMP patients express a unique DNA-methylation and gene-expression signature that separated them from all other AMLs. We could discriminate promoter methylation from non-promoter methylation by selecting MCIp-seq peaks within 3kb around TSS. Promoter hypermethylation was highly associated with repression of genes (PCC = -0.053, p-value = 0.00075). 
Hypermethylation of non-promoter regions was more strongly associated with upregulation of genes (PCC = 0.046, p-value = 4.613e-06). Interestingly, differentially methylated regions also showed a positive association with myeloid lineage CTCF binding sites (27% vs 18% expected, p-value < 2.2e-16 in a chi-square test of independence). Methylation of CTCF sites causes loss of CTCF binding, which has been reported to disrupt boundaries between so-called topologically associated domains (TADs), allowing enhancers located in a particular TAD to become accessible to genes in adjacent TADs and affect their transcription. Whether this is the case is under investigation. In this study we particularly focused on the role of hypermethylation of promoters in CIMP-AMLs. Promoters of many transcriptional regulators that are involved in the differentiation of myeloid lineages of which several are frequently mutated in AML were hypermethylated and repressed, including CEBPA, CEBPD, IRF8, GATA2, KLF4, MITF or MAFB. Notably, HMGA2, a critical regulator of myeloid progenitor expansion, exhibited the largest degree of CIMP promoter hypermethylation compared to the other AMLs, accompanied by a reduction in gene expression. Moreover, multiple members of the HOXB family and KLF1 (erythroid differentiation) were methylated and repressed as well. In addition, these patients frequently showed hypermethylation of many chromatin factors (e.g. LMNA, CHD7 or TET2). Hypermethylation of the TET2 promoter could result in a loss of maintenance DNA demethylation and therefore successive hypermethylation at CpG islands. We carried out regulome-capture-bisulfite sequencing on CIMP-AMLs compared to other AML samples and normal blood cell controls and confirmed methylation of the same transcription and chromatin factor promoters. 
We conclude that these leukemias represent very primitive HSCPs which are blocked in differentiation into multiple hematopoietic lineages, due to the absence of regulators of these lineages. Although the underlying cause for the extreme hypermethylation signature is still subject to ongoing studies, the consequence of promoter hypermethylation is silencing of key lineage regulators causing the differentiation arrest in these cells. We argue that these patients may particularly benefit from therapies that revert DNA methylation. Disclosures: Ehninger: Cellex Gesellschaft fuer Zellgewinnung mbH: Employment, Equity Ownership; GEMoaB Monoclonals GmbH: Employment, Equity Ownership; Bayer: Research Funding. Thiede: AgenDix: Other: Ownership; Novartis: Honoraria, Research Funding.


2019 ◽  
Author(s):  
Anna Mikhaylova ◽  
Timothy Thornton

Abstract Predicting gene expression with genetic data has garnered significant attention in recent years. PrediXcan is one of the most widely used gene-based association methods for testing imputed gene expression values with a phenotype due to the invaluable insight the method has shown into the relationship between complex traits and the component of gene expression that can be attributed to genetic variation. The prediction models for PrediXcan, however, were obtained using supervised machine learning methods and training data from the Depression and Gene Network (DGN) and the Genotype-Tissue Expression (GTEx) data, where the majority of subjects are of European descent. Many genetic studies, however, include samples from multi-ethnic populations, and in this paper we assess the accuracy of gene expression predictions with PrediXcan in diverse populations. Using transcriptomic data from the GEUVADIS (Genetic European Variation in Health and Disease) RNA sequencing project and whole genome sequencing data from the 1000 Genomes project, we evaluate and compare the predictive performance of PrediXcan in an African population (Yoruban) and four European populations. Prediction results are obtained using a range of models from PrediXcan weight databases, and Pearson’s correlation coefficient is used to measure prediction accuracy. We demonstrate that the predictive performance of PrediXcan varies across populations (F-test p-value < 0.001), where prediction accuracy is the worst in the Yoruban sample compared to European samples. Moreover, the performance of PrediXcan varies not only among distant populations, but also among closely related populations. We also find that the qualitative performance of PrediXcan for the populations considered is consistent across all weight databases used.
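The accuracy measure named in the abstract above, Pearson's correlation coefficient between predicted and observed expression, can be sketched as follows. This is an illustrative stand-alone implementation, not code from PrediXcan; the function name `pearson_r` and the toy expression values are hypothetical.

```python
import math


def pearson_r(predicted, observed):
    """Pearson's correlation between predicted and observed expression values."""
    if len(predicted) != len(observed) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    # Sums of cross-products and squared deviations about the means.
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_o)


# Toy values for one gene across four individuals (illustration only).
r = pearson_r([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

In a comparison across populations, a value like this would typically be computed per gene within each population and the resulting distributions compared.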


2020 ◽  
Vol 37 (6) ◽  
pp. 1593-1603 ◽  
Author(s):  
Erik Díaz-Valenzuela ◽  
Ruairidh H Sawers ◽  
Angélica Cibrián-Jaramillo

Abstract The process of domestication requires the rapid transformation of the wild morphology into the cultivated forms that humans select for. This process often takes place through changes in the regulation of genes, yet there is no definite pattern on the role of cis- and trans-acting regulatory variations in the domestication of the fruit among crops. Using allele-specific expression and network analyses, we characterized the regulatory patterns and the inheritance of gene expression in wild and cultivated accessions of chili pepper, a crop with remarkable fruit morphological variation. We propose that gene expression differences associated with the cultivated form are best explained by cis-regulatory hubs acting through trans-regulatory cascades. We show that in cultivated chili, the expression of genes associated with fruit morphology is partially recessive with respect to those in the wild relative, consistent with the hybrid fruit phenotype. Decreased expression of fruit maturation and growth genes in cultivated chili suggests that selection for loss-of-function took place in its domestication. Trans-regulatory changes underlie the majority of the genes showing regulatory divergence and had larger effect sizes on gene expression than cis-regulatory variants. Network analysis of selected cis-regulated genes, including ARP9 and MED25, indicated their interaction with many transcription factors involved in organ growth and fruit ripening. Differentially expressed genes linked to cis-regulatory variants and their interactions with downstream trans-acting genes have the potential to drive the morphological differences observed between wild and cultivated fruits and provide an attractive mechanism of morphological transformation during the domestication of the chili pepper.


2021 ◽  
Vol 12 ◽  
Author(s):  
Claire P. Prowse-Wilkins ◽  
Jianghui Wang ◽  
Ruidong Xiang ◽  
Josie B. Garner ◽  
Michael E. Goddard ◽  
...  

Genetic variants which affect complex traits (causal variants) are thought to be found in functional regions of the genome. Identifying causal variants would be useful for predicting complex trait phenotypes in dairy cows, however, functional regions are poorly annotated in the bovine genome. Functional regions can be identified on a genome-wide scale by assaying for post-translational modifications to histone proteins (histone modifications) and proteins interacting with the genome (e.g., transcription factors) using a method called Chromatin immunoprecipitation followed by sequencing (ChIP-seq). In this study ChIP-seq was performed to find functional regions in the bovine genome by assaying for four histone modifications (H3K4Me1, H3K4Me3, H3K27ac, and H3K27Me3) and one transcription factor (CTCF) in 6 tissues (heart, kidney, liver, lung, mammary and spleen) from 2 to 3 lactating dairy cows. Eighty-six ChIP-seq samples were generated in this study, identifying millions of functional regions in the bovine genome. Combinations of histone modifications and CTCF were found using ChromHMM and annotated by comparing with active and inactive genes across the genome. Functional marks differed between tissues highlighting areas which might be particularly important to tissue-specific regulation. Supporting the cis-regulatory role of functional regions, the read counts in some ChIP peaks correlated with nearby gene expression. The functional regions identified in this study were enriched for putative causal variants as seen in other species. Interestingly, regions which correlated with gene expression were particularly enriched for potential causal variants. This supports the hypothesis that complex traits are regulated by variants that alter gene expression. This study provides one of the largest ChIP-seq annotation resources in cattle including, for the first time, in the mammary gland of lactating cows. 
By linking regulatory regions to expression QTL and trait QTL we demonstrate a new strategy for identifying causal variants in cattle.


2014 ◽  
Author(s):  
Alfonso Buil ◽  
Andrew A Brown ◽  
Tuuli Lappalainen ◽  
Ana Viñuela ◽  
Matthew N Davies ◽  
...  

Understanding the genetic architecture of gene expression is an intermediate step toward understanding the genetic architecture of complex diseases. RNA-seq technologies have improved the quantification of gene expression and allow measurement of allele-specific expression (ASE) [1-3]. ASE is hypothesized to result from the direct effect of cis regulatory variants, but a proper estimation of the causes of ASE has not been performed to date. In this study we take advantage of a sample of twins to measure the relative contribution of genetic and environmental effects on ASE and we found substantial effects of gene x gene (GxG) and gene x environment (GxE) interactions. We propose a model where ASE requires genetic variability in cis, a difference in the sequence of both alleles, but the magnitude of the ASE effect depends on trans genetic and environmental factors that interact with the cis genetic variants. We uncover large GxG and GxE effects on gene expression and likely complex phenotypes that currently remain elusive.


2021 ◽  
Author(s):  
Ching-Hua Shih ◽  
Justin C. Fay

Evolution of cis-regulatory sequences depends on how they affect gene expression and motivates both the identification and prediction of cis-regulatory variants responsible for expression differences within and between species. While much progress has been made in relating cis-regulatory variants to expression levels, the timing of gene activation and repression may also be important to the evolution of cis-regulatory sequences. We investigated allele-specific expression (ASE) dynamics within and between Saccharomyces species during the diauxic shift and found appreciable cis-acting variation in gene expression dynamics. Within species, ASE is associated with intergenic variants, but ASE dynamics are more strongly associated with insertions and deletions than ASE levels. To refine these associations we used a high-throughput reporter assay to test promoter regions and individual variants. Within the subset of regions that recapitulated endogenous expression we identified and characterized cis-regulatory variants that affect expression dynamics. Between species, chimeric promoter regions generate novel patterns and indicate constraints on the evolution of gene expression dynamics. We conclude that changes in cis-regulatory sequences can tune gene expression dynamics and that the interplay between expression dynamics and other aspects of expression is relevant to the evolution of cis-regulatory sequences.


2018 ◽  
Author(s):  
Cynthia A. Kalita ◽  
Christopher D. Brown ◽  
Andrew Freiman ◽  
Jenna Isherwood ◽  
Xiaoquan Wen ◽  
...  

Many variants associated with complex traits are in non-coding regions, and contribute to phenotypes by disrupting regulatory sequences. To characterize these variants, we developed a streamlined protocol for a high-throughput reporter assay, BiT-STARR-seq (Biallelic Targeted STARR-seq), that identifies allele-specific expression (ASE) while accounting for PCR duplicates through unique molecular identifiers. We tested 75,501 oligos (43,500 SNPs) and identified 2,720 SNPs with significant ASE (FDR 10%). To validate disruption of binding as one of the mechanisms underlying ASE, we developed a new high throughput allele specific binding assay for NFKB-p50. We identified 2,951 SNPs with allele-specific binding (ASB) (FDR 10%); 173 of these SNPs also had ASE (OR=1.97, p-value=0.0006). Of variants associated with complex traits, 1,531 resulted in ASE and 1,662 showed ASB. For example, we found that the Crohn’s disease risk variant at rs3810936 increases NFKB binding and results in altered gene expression.


2016 ◽  
Author(s):  
François Aguet ◽  
Andrew A. Brown ◽  
Stephane E. Castel ◽  
Joe R. Davis ◽  
Pejman Mohammadi ◽  
...  

Abstract Expression quantitative trait locus (eQTL) mapping provides a powerful means to identify functional variants influencing gene expression and disease pathogenesis. We report the identification of cis-eQTLs from 7,051 post-mortem samples representing 44 tissues and 449 individuals as part of the Genotype-Tissue Expression (GTEx) project. We find a cis-eQTL for 88% of all annotated protein-coding genes, with one-third having multiple independent effects. We identify numerous tissue-specific cis-eQTLs, highlighting the unique functional impact of regulatory variation in diverse tissues. By integrating large-scale functional genomics data and state-of-the-art fine-mapping algorithms, we identify multiple features predictive of tissue-specific and shared regulatory effects. We improve estimates of cis-eQTL sharing and effect sizes using allele specific expression across tissues. Finally, we demonstrate the utility of this large compendium of cis-eQTLs for understanding the tissue-specific etiology of complex traits, including coronary artery disease. The GTEx project provides an exceptional resource that has improved our understanding of gene regulation across tissues and the role of regulatory variation in human genetic diseases.

