Systematic identification of cis-regulatory variants that cause gene expression differences in a yeast cross

eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Kaushik Renganaath ◽  
Rocky Cheung ◽  
Laura Day ◽  
Sriram Kosuri ◽  
Leonid Kruglyak ◽  
...  

Sequence variation in regulatory DNA alters gene expression and shapes genetically complex traits. However, the identification of individual, causal regulatory variants is challenging. Here, we used a massively parallel reporter assay to measure the cis-regulatory consequences of 5832 natural DNA variants in the promoters of 2503 genes in the yeast Saccharomyces cerevisiae. We identified 451 causal variants, which underlie genetic loci known to affect gene expression. Several promoters harbored multiple causal variants. In five promoters, pairs of variants showed non-additive, epistatic interactions. Causal variants were enriched at conserved nucleotides, tended to have low derived allele frequency, and were depleted from promoters of essential genes, which is consistent with the action of negative selection. Causal variants were also enriched for alterations in transcription factor binding sites. Models integrating these features provided modest, but statistically significant, ability to predict causal variants. This work revealed a complex molecular basis for cis-acting regulatory variation.
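The abstract does not say how non-additive interactions between variant pairs were scored. As an illustration only, one common way to quantify such an interaction is to compare the double-variant construct against the sum of the two single-variant effects on a log scale. The function name and the toy expression values below are hypothetical and are not taken from the study.

```python
import math


def epistasis_deviation(expr_ref, expr_a, expr_b, expr_ab):
    """Return the log2 deviation of the double-variant construct from additivity.

    Hypothetical helper: each argument is a reporter expression level (> 0)
    for the reference promoter, each single variant, and both variants together.
    A value near 0 means the two effects combine additively in log2 space.
    """
    log_ref, log_a, log_b, log_ab = (
        math.log2(x) for x in (expr_ref, expr_a, expr_b, expr_ab)
    )
    effect_a = log_a - log_ref          # effect of variant A alone
    effect_b = log_b - log_ref          # effect of variant B alone
    expected_ab = log_ref + effect_a + effect_b  # additive expectation
    return log_ab - expected_ab


# Toy values (assumed, not measured data).
additive = epistasis_deviation(1.0, 2.0, 4.0, 8.0)     # effects simply stack
interacting = epistasis_deviation(1.0, 2.0, 4.0, 4.0)  # double variant falls short
```

In a real analysis the deviation would be tested against measurement noise across replicates. This sketch only shows the quantity being compared.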


2019 ◽  
Vol 28 (17) ◽  
pp. 2976-2986 ◽  
Author(s):  
Irfahan Kassam ◽  
Yang Wu ◽  
Jian Yang ◽  
Peter M Visscher ◽  
Allan F McRae

Abstract Despite extensive sex differences in human complex traits and disease, the male and female genomes differ only in the sex chromosomes. This implies that most sex-differentiated traits are the result of differences in the expression of genes that are common to both sexes. While sex differences in gene expression have been observed in a range of different tissues, the biological mechanisms for tissue-specific sex differences (TSSDs) in gene expression are not well understood. A total of 30 640 autosomal and 1021 X-linked transcripts were tested for heterogeneity in sex difference effect sizes in n = 617 individuals across 40 tissue types in Genotype–Tissue Expression (GTEx). This identified 65 autosomal and 66 X-linked TSSD transcripts (corresponding to unique genes) at a stringent significance threshold. Results for X-linked TSSD transcripts showed mainly concordant direction of sex differences across tissues and replicate previous findings. Autosomal TSSD transcripts had mainly discordant direction of sex differences across tissues. The top cis-expression quantitative trait loci (eQTLs) across tissues for autosomal TSSD transcripts are located a similar distance away from the nearest androgen and estrogen binding motifs and the nearest enhancer, as compared to cis-eQTLs for transcripts with stable sex differences in gene expression across tissue types. Enhancer regions that overlap top cis-eQTLs for TSSD transcripts, however, were found to be more dispersed across tissues. These observations suggest that androgen and estrogen regulatory elements in a cis region may play a common role in sex differences in gene expression, but TSSD in gene expression may additionally be due to causal variants located in tissue-specific enhancer regions.


2019 ◽  
Author(s):  
Christoph D. Rau ◽  
Natalia M. Gonzales ◽  
Joshua S. Bloom ◽  
Danny Park ◽  
Julien Ayroles ◽  
...  

Abstract Background The majority of quantitative genetic models used to map complex traits assume that alleles have similar effects across all individuals. Significant evidence suggests, however, that epistatic interactions modulate the impact of many alleles. Nevertheless, identifying epistatic interactions remains computationally and statistically challenging. In this work, we address some of these challenges by developing a statistical test for polygenic epistasis that determines whether the effect of an allele is altered by the global genetic ancestry proportion from distinct progenitors. Results We applied our method to data from mice and yeast. For the mice, we observed 49 significant genotype-by-ancestry interaction associations across 14 phenotypes as well as over 1,400 Bonferroni-corrected genotype-by-ancestry interaction associations for mouse gene expression data. For the yeast, we observed 92 significant genotype-by-ancestry interactions across 38 phenotypes. Given this evidence of epistasis, we test for and observe evidence of rapid selection pressure on ancestry specific polymorphisms within one of the cohorts, consistent with epistatic selection. Conclusions Unlike our prior work in human populations, we observe widespread evidence of ancestry-modified SNP effects, perhaps reflecting the greater divergence present in crosses using mice and yeast. Author Summary Many statistical tests which link genetic markers in the genome to differences in traits rely on the assumption that the same polymorphism will have identical effects in different individuals. However, there is substantial evidence indicating that this is not the case. Epistasis is the phenomenon in which multiple polymorphisms interact with one another to amplify or negate each other’s effects on a trait. 
We hypothesized that individual SNP effects could be changed in a polygenic manner, such that the proportion of genetic ancestry, rather than specific markers, might be used to capture epistatic interactions. Motivated by this possibility, we develop a new statistical test that allows us to examine the genome to identify polymorphisms which have different effects depending on the ancestral makeup of each individual. We use our test in two different populations of inbred mice and a yeast panel and demonstrate that these sorts of variable effect polymorphisms exist in 14 different physical traits in mice and 38 phenotypes in yeast as well as in murine gene expression. We use the term “polygenic epistasis” to distinguish these interactions from the more conventional two- or multi-locus interactions.


2021 ◽  
Author(s):  
Roshni A. Patel ◽  
Shaila A. Musharoff ◽  
Jeffrey P. Spence ◽  
Harold Pimentel ◽  
Catherine Tcheandjieu ◽  
...  

Despite the growing number of genome-wide association studies (GWAS) for complex traits, it remains unclear whether effect sizes of causal genetic variants differ between populations. In principle, effect sizes of causal variants could differ between populations due to gene-by-gene or gene-by-environment interactions. However, comparing causal variant effect sizes is challenging: it is difficult to know which variants are causal, and comparisons of variant effect sizes are confounded by differences in linkage disequilibrium (LD) structure between ancestries. Here, we develop a method to assess causal variant effect size differences that overcomes these limitations. Specifically, we leverage the fact that segments of European ancestry shared between European-American and admixed African-American individuals have similar LD structure, allowing for unbiased comparisons of variant effect sizes in European ancestry segments. We apply our method to two types of traits: gene expression and low-density lipoprotein cholesterol (LDL-C). We find that causal variant effect sizes for gene expression are significantly different between European-Americans and African-Americans; for LDL-C, we observe a similar point estimate although this is not significant, likely due to lower statistical power. Cross-population differences in variant effect sizes highlight the role of genetic interactions in trait architecture and will contribute to the poor portability of polygenic scores across populations, reinforcing the importance of conducting GWAS on individuals of diverse ancestries and environments.


2022 ◽  
Vol 13 (1) ◽  
Author(s):  
Zhongzi Wu ◽  
Huanfa Gong ◽  
Zhimin Zhou ◽  
Tao Jiang ◽  
Ziqi Lin ◽  
...  

Abstract Background Short tandem repeats (STRs) were recently found to have significant impacts on gene expression and diseases in humans, but their roles in gene expression and complex traits in pigs remain unexplored. This study investigates the effects of STRs on gene expression in liver tissues based on the whole-genome sequences and RNA-Seq data of a discovery cohort of 260 F6 individuals and a validation population of 296 F7 individuals from a heterogeneous population generated from crosses among eight pig breeds. Results We identified 5203 and 5868 significant expression STRs (eSTRs, FDR < 1%) in the F6 and F7 populations, respectively, most of which could be reciprocally validated (π1 = 0.92). The eSTRs explained 27.5% of the cis-heritability of gene expression traits on average. We further identified 235 and 298 fine-mapped STRs through the Bayesian fine-mapping approach in the F6 and F7 pigs, respectively, which were significantly enriched in intron, ATAC peak, compartment A and H3K4me3 regions. We identified 20 fine-mapped STRs located in 100 kb windows upstream and downstream of published complex trait-associated SNPs, which colocalized with epigenetic markers such as H3K27ac and ATAC peaks. These included eSTRs of the CLPB, PGLS, PSMD6 and DHDH genes, which are linked with genome-wide association study (GWAS) SNPs for blood-related traits, leg conformation, growth-related traits, and meat quality traits, respectively. Conclusions This study provides insights into the effects of STRs on gene expression traits. The identified eSTRs are valuable resources for prioritizing causal STRs for complex traits in pigs.
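The abstract reports a replication statistic, π1 = 0.92, without stating how it was estimated. A widely used estimator is Storey's, which takes the p-values of discovery hits in the validation cohort, estimates the null proportion π0 from the p-values above a threshold λ, and reports π1 = 1 − π0. The sketch below assumes that estimator with a fixed λ. The function name and the toy p-values are hypothetical, not the study's data.

```python
def pi1_estimate(p_values, lam=0.5):
    """Estimate the proportion of true signals, pi1 = 1 - pi0 (Storey-style).

    Assumption: a single fixed threshold `lam` is used. Null p-values are
    uniform, so the count above `lam` divided by m * (1 - lam) estimates pi0.
    """
    if not p_values:
        raise ValueError("p_values must not be empty")
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must be in [0, 1)")
    m = len(p_values)
    n_above = sum(1 for p in p_values if p > lam)
    pi0 = min(1.0, n_above / (m * (1.0 - lam)))  # cap at 1 so pi1 is never negative
    return 1.0 - pi0


# Toy p-values (assumed): eight replicated hits and two that look null.
toy_p_values = [0.001] * 8 + [0.6, 0.9]
toy_pi1 = pi1_estimate(toy_p_values)
```

Published implementations usually smooth the estimate over a range of λ values rather than using one threshold. The fixed-λ version is kept here for brevity.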


2018 ◽  
Author(s):  
Min Wang ◽  
Timothy P Hancock ◽  
Amanda J. Chamberlain ◽  
Christy J. Vander Jagt ◽  
Jennie E Pryce ◽  
...  

Abstract Background Topological association domains (TADs) are chromosomal domains characterised by frequent internal DNA-DNA interactions. The transcription factor CTCF binds to conserved DNA sequence patterns called CTCF binding motifs to either prohibit or facilitate chromosomal interactions. TADs and CTCF binding motifs control gene expression, but they are not yet well defined in the bovine genome. In this paper, we sought to improve the annotation of bovine TADs and CTCF binding motifs, and assess whether the new annotation can reduce the search space for cis-regulatory variants. Results We used genomic synteny to map TADs and CTCF binding motifs from humans, mice, dogs and macaques to the bovine genome. We found that our mapped TADs exhibited the same hallmark properties as those sourced from experimental data, such as housekeeping genes, tRNA genes, CTCF binding motifs, SINEs, H3K4me3 and H3K27ac. Then we showed that runs of genes with the same pattern of allele-specific expression (ASE) (either favouring paternal or maternal allele) were often located in the same TAD or between the same conserved CTCF binding motifs. Analyses of variance showed that when averaged across all bovine tissues tested, TADs explained 14% of ASE variation (standard deviation, SD: 0.056), while CTCF explained 27% (SD: 0.078). Furthermore, we showed that the quantitative trait loci (QTLs) associated with gene expression variation (eQTLs) or ASE variation (aseQTLs), which were identified from mRNA transcripts from 141 lactating cows’ white blood and milk cells, were highly enriched at putative bovine CTCF binding motifs. 
The most significant aseQTL and eQTL for each genic target were located within the same TAD as the gene more often than expected (Chi-Squared test P-value ≤ 0.001). Conclusions Our results suggest that genomic synteny can be used to functionally annotate conserved transcriptional components, and provide a tool to reduce the search space for causative regulatory variants in the bovine genome.


2021 ◽  
Vol 12 ◽  
Author(s):  
Claire P. Prowse-Wilkins ◽  
Jianghui Wang ◽  
Ruidong Xiang ◽  
Josie B. Garner ◽  
Michael E. Goddard ◽  
...  

Genetic variants which affect complex traits (causal variants) are thought to be found in functional regions of the genome. Identifying causal variants would be useful for predicting complex trait phenotypes in dairy cows; however, functional regions are poorly annotated in the bovine genome. Functional regions can be identified on a genome-wide scale by assaying for post-translational modifications to histone proteins (histone modifications) and proteins interacting with the genome (e.g., transcription factors) using a method called chromatin immunoprecipitation followed by sequencing (ChIP-seq). In this study ChIP-seq was performed to find functional regions in the bovine genome by assaying for four histone modifications (H3K4Me1, H3K4Me3, H3K27ac, and H3K27Me3) and one transcription factor (CTCF) in 6 tissues (heart, kidney, liver, lung, mammary and spleen) from 2 to 3 lactating dairy cows. Eighty-six ChIP-seq samples were generated in this study, identifying millions of functional regions in the bovine genome. Combinations of histone modifications and CTCF were found using ChromHMM and annotated by comparing with active and inactive genes across the genome. Functional marks differed between tissues, highlighting areas which might be particularly important to tissue-specific regulation. Supporting the cis-regulatory role of functional regions, the read counts in some ChIP peaks correlated with nearby gene expression. The functional regions identified in this study were enriched for putative causal variants as seen in other species. Interestingly, regions which correlated with gene expression were particularly enriched for potential causal variants. This supports the hypothesis that complex traits are regulated by variants that alter gene expression. This study provides one of the largest ChIP-seq annotation resources in cattle including, for the first time, in the mammary gland of lactating cows. 
By linking regulatory regions to expression QTL and trait QTL we demonstrate a new strategy for identifying causal variants in cattle.


2017 ◽  
Author(s):  
Anna L. Tyler ◽  
Bo Ji ◽  
Daniel M. Gatti ◽  
Steven C. Munger ◽  
Gary A. Churchill ◽  
...  

Abstract Genetic studies of multidimensional phenotypes can potentially link genetic variation, gene expression, and physiological data to create multi-scale models of complex traits. Multi-parent populations provide a resource for developing methods to understand these relationships. In this study, we simultaneously modeled body composition, serum biomarkers, and liver transcript abundances from 474 Diversity Outbred mice. This population contained both sexes and two dietary cohorts. Using weighted gene co-expression network analysis (WGCNA), we summarized transcript data into functional modules which we then used as summary phenotypes representing enriched biological processes. These module phenotypes were jointly analyzed with body composition and serum biomarkers in a combined analysis of pleiotropy and epistasis (CAPE), which inferred networks of epistatic interactions between quantitative trait loci that affect one or more traits. This network frequently mapped interactions between alleles of different ancestries, providing evidence of both genetic synergy and redundancy between haplotypes. Furthermore, a number of loci interacted with sex and diet to yield sex-specific genetic effects. We were also able to identify alleles that potentially protect individuals from the effects of a high-fat diet. Although the epistatic interactions explained small amounts of trait variance, the combination of directional interactions, allelic specificity, and high genomic resolution provided context to generate hypotheses for the roles of specific genes in complex traits. Our approach moves beyond the cataloging of single loci to infer genetic networks that map genetic etiology by simultaneously modeling all phenotypes.

