The effect of X-linked dosage compensation on complex trait variation

Mapping Intimacies ◽

10.1101/433870 ◽

2018 ◽

Cited By ~ 3

Author(s):

Julia Sidorenko ◽

Irfahan Kassam ◽

Kathryn Kemper ◽

Jian Zeng ◽

Luke Lloyd-Jones ◽

...

Keyword(s):

Gene Expression ◽

Dosage Compensation ◽

Complex Traits ◽

Complex Trait ◽

Detectable Effect ◽

Phenotypic Trait ◽

Trait Variation ◽

Biologically Relevant ◽

Escape From X Inactivation ◽

The Uk

Summary: Quantitative genetics theory predicts that X-chromosome dosage compensation between sexes will have a detectable effect on the amount of genetic and therefore phenotypic trait variances at associated loci in males and females. Here, we systematically examine the role of dosage compensation in complex trait variation in humans in 20 complex traits in a sample of more than 450,000 individuals from the UK Biobank and in 1,600 gene expression traits from a sample of 2,000 individuals as well as across-tissue gene expression from the GTEx resource. We find, on average, twice as much genetic variation for complex traits due to X-linked loci in males compared to females, consistent with a negligible effect of predicted escape from X-inactivation on complex trait variation across traits and also detect biologically relevant X-linked heterogeneity between the sexes for a number of complex traits.
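The two-fold ratio reported in the summary above can be illustrated with a small calculation. This is a minimal sketch, not the authors' code: it assumes a single biallelic X-linked locus with allele frequency p, Hardy-Weinberg proportions in females, and the same per-allele effect b in both sexes. The function name and the choice of coding males as 0/2 (full dosage compensation) or 0/1 (complete escape from X-inactivation) are illustrative assumptions.

```python
def male_to_female_variance_ratio(p, b=1.0, full_dosage_compensation=True):
    """Ratio of male to female genetic variance at one X-linked locus.

    Assumptions (illustrative, not from the paper's software):
      - females carry 0, 1 or 2 copies (Binomial(2, p) allele count);
      - males carry 0 or 1 copy, scaled by 2 under full dosage
        compensation and by 1 under complete escape from inactivation;
      - the per-allele effect b is identical in both sexes.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("allele frequency p must lie strictly between 0 and 1")
    pq = p * (1.0 - p)
    male_scale = 2.0 if full_dosage_compensation else 1.0
    var_male = (male_scale * b) ** 2 * pq   # variance of a scaled Bernoulli(p)
    var_female = 2.0 * b ** 2 * pq          # variance of b * Binomial(2, p)
    return var_male / var_female
```

Under these assumptions the ratio does not depend on p or b: full dosage compensation gives 2, and complete escape gives 0.5.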

Quantifying the contribution of dominance effects to complex trait variation in biobank-scale data

10.1101/2020.11.10.376897 ◽

2020 ◽

Author(s):

Ali Pazokitoroudi ◽

Alec M. Chiu ◽

Kathryn S. Burch ◽

Bogdan Pasaniuc ◽

Sriram Sankararaman

Keyword(s):

Complex Traits ◽

Complex Trait ◽

Genetic Effects ◽

Trait Variation ◽

Wide Range ◽

Unbiased Estimates ◽

The Uk ◽

Additive Genetic Effects ◽

Intense Debate ◽

Scale Data

Abstract: The proportion of variation in complex traits that can be attributed to non-additive genetic effects has been a topic of intense debate. The availability of Biobank-scale datasets of genotype and trait data from unrelated individuals opens up the possibility of obtaining precise estimates of the contribution of non-additive genetic effects. We present an efficient method that can partition the variation in complex traits into variance that can be attributed to additive (additive heritability) and dominance (dominance heritability) effects across all genotyped SNPs in a large collection of unrelated individuals. Over a wide range of genetic architectures, our method yields unbiased estimates of heritability. We applied our method, in turn, to array genotypes as well as imputed genotypes (at common SNPs with minor allele frequency, MAF > 1%) and 50 quantitative traits measured in 291,273 unrelated white British individuals in the UK Biobank. Averaged across these 50 traits, we find that additive heritability on array SNPs is 21.86% while dominance heritability is 0.13% (about 0.48% of the additive heritability) with qualitatively similar results for imputed genotypes. We find no evidence for dominance heritability (accounting for the number of traits tested) and estimate that dominance heritability is unlikely to exceed 1% for the traits analyzed. Our analyses indicate a limited contribution of dominance heritability to complex trait variation.
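Partitioning variance into additive and dominance parts depends on how genotypes are coded. The sketch below shows one widely used orthogonal coding, in which the additive and dominance codes are uncorrelated under Hardy-Weinberg proportions. It is an assumption that this coding matches the one used by the method described above; the function names are hypothetical.

```python
def additive_dominance_codes(p):
    """Centered additive and dominance codes for genotypes 0, 1, 2.

    p is the frequency of the counted allele. The dominance code is the
    commonly used orthogonal one (0, 2p, 4p - 2) minus its mean 2p^2.
    """
    q = 1.0 - p
    return {
        0: (-2.0 * p, -2.0 * p * p),
        1: (1.0 - 2.0 * p, 2.0 * p * q),
        2: (2.0 * q, -2.0 * q * q),
    }


def hwe_moments(p):
    """Return (var_additive, var_dominance, covariance) under Hardy-Weinberg."""
    q = 1.0 - p
    freqs = {0: q * q, 1: 2.0 * p * q, 2: p * p}
    codes = additive_dominance_codes(p)
    var_a = sum(freqs[g] * codes[g][0] ** 2 for g in freqs)
    var_d = sum(freqs[g] * codes[g][1] ** 2 for g in freqs)
    cov = sum(freqs[g] * codes[g][0] * codes[g][1] for g in freqs)
    return var_a, var_d, cov
```

With this coding the additive variance is 2pq, the dominance variance is 4p²q², and the covariance is zero, which is what lets the two heritability components be estimated separately.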

Comparing allele specific expression and local expression quantitative trait loci and the influence of gene expression on complex trait variation in cattle

BMC Genomics ◽

10.1186/s12864-018-5181-0 ◽

2018 ◽

Vol 19 (1) ◽

Cited By ~ 10

Author(s):

Majid Khansefid ◽

Jennie E. Pryce ◽

Sunduimijid Bolormaa ◽

Yizhou Chen ◽

Catriona A. Millen ◽

...

Keyword(s):

Gene Expression ◽

Quantitative Trait Loci ◽

Quantitative Trait ◽

Complex Trait ◽

Expression Quantitative Trait Loci ◽

Trait Variation ◽

Specific Expression ◽

Allele Specific Expression ◽

Allele Specific ◽

Local Expression

LDpred-funct: incorporating functional priors improves polygenic prediction accuracy in UK Biobank and 23andMe data sets

10.1101/375337 ◽

2018 ◽

Cited By ~ 21

Author(s):

Carla Márquez-Luna ◽

Steven Gazal ◽

Po-Ru Loh ◽

Samuel S. Kim ◽

Nicholas Furlotte ◽

...

Keyword(s):

Complex Traits ◽

Prediction Accuracy ◽

Causal Effect ◽

Complex Trait ◽

Training Data ◽

Data Sets ◽

UK Biobank ◽

Validation Data ◽

Functional Regions ◽

The Uk

Abstract: Genetic variants in functional regions of the genome are enriched for complex trait heritability. Here, we introduce a new method for polygenic prediction, LDpred-funct, that leverages trait-specific functional priors to increase prediction accuracy. We fit priors using the recently developed baseline-LD model, which includes coding, conserved, regulatory and LD-related annotations. We analytically estimate posterior mean causal effect sizes and then use cross-validation to regularize these estimates, improving prediction accuracy for sparse architectures. LDpred-funct attained higher prediction accuracy than other polygenic prediction methods in simulations using real genotypes. We applied LDpred-funct to predict 21 highly heritable traits in the UK Biobank. We used association statistics from British-ancestry samples as training data (avg N=373K) and samples of other European ancestries as validation data (avg N=22K), to minimize confounding. LDpred-funct attained a +4.6% relative improvement in average prediction accuracy (avg prediction R²=0.144; highest R²=0.413 for height) compared to SBayesR (the best method that does not incorporate functional information). For height, meta-analyzing training data from UK Biobank and 23andMe cohorts (total N=1107K; higher heritability in UK Biobank cohort) increased prediction R² to 0.431. Our results show that incorporating functional priors improves polygenic prediction accuracy, consistent with the functional architecture of complex traits.

Reference Trait Analysis Reveals Correlations Between Gene Expression and Quantitative Traits in Disjoint Samples

Genetics ◽

10.1534/genetics.118.301865 ◽

2019 ◽

Vol 212 (3) ◽

pp. 919-929

Author(s):

Daniel A. Skelly ◽

Narayanan Raghupathy ◽

Raymond F. Robledo ◽

Joel H. Graber ◽

Elissa J. Chesler

Keyword(s):

Gene Expression ◽

Canonical Correlation ◽

Complex Traits ◽

Behavioral Genetics ◽

Association Studies ◽

Complex Trait ◽

Integrated Analysis ◽

Data Set ◽

Trait Analysis ◽

Molecular Features

Systems genetic analysis of complex traits involves the integrated analysis of genetic, genomic, and disease-related measures. However, these data are often collected separately across multiple study populations, rendering direct correlation of molecular features to complex traits impossible. Recent transcriptome-wide association studies (TWAS) have harnessed gene expression quantitative trait loci (eQTL) to associate unmeasured gene expression with a complex trait in genotyped individuals, but this approach relies primarily on strong eQTL. We propose a simple and powerful alternative strategy for correlating independently obtained sets of complex traits and molecular features. In contrast to TWAS, our approach gains precision by correlating complex traits through a common set of continuous phenotypes instead of genetic predictors, and can identify transcript–trait correlations for which the regulation is not genetic. In our approach, a set of multiple quantitative “reference” traits is measured across all individuals, while measures of the complex trait of interest and transcriptional profiles are obtained in disjoint subsamples. A conventional multivariate statistical method, canonical correlation analysis, is used to relate the reference traits and traits of interest to identify gene expression correlates. We evaluate power and sample size requirements of this methodology, as well as performance relative to other methods, via extensive simulation and analysis of a behavioral genetics experiment in 258 Diversity Outbred mice involving two independent sets of anxiety-related behaviors and hippocampal gene expression. After splitting the data set and hiding one set of anxiety-related traits in half the samples, we identified transcripts correlated with the hidden traits using the other set of anxiety-related traits and exploiting the highest canonical correlation (R = 0.69) between the trait data sets. We demonstrate that this approach outperforms TWAS in identifying associated transcripts. Together, these results demonstrate the validity, reliability, and power of reference trait analysis for identifying relations between complex traits and their molecular substrates.

Investigating tissue-relevant causal molecular mechanisms of complex traits using probabilistic TWAS analysis

10.1101/808295 ◽

2019 ◽

Cited By ~ 2

Author(s):

Yuhua Zhang ◽

Corbin Quick ◽

Ketian Yu ◽

Alvaro Barbeira ◽

Francesca Luca ◽

...

Keyword(s):

Gene Expression ◽

Complex Traits ◽

Large Scale ◽

Molecular Mechanisms ◽

Association Studies ◽

Complex Trait ◽

Causal Effects ◽

Biological Mechanisms ◽

Integrative Framework ◽

eQTL Data

Abstract: Transcriptome-wide association studies (TWAS), an integrative framework using expression quantitative trait loci (eQTLs) to construct proxies for gene expression, have emerged as a promising method to investigate the biological mechanisms underlying associations between genotypes and complex traits. However, challenges remain in interpreting TWAS results, especially regarding their causality implications. In this paper, we describe a new computational framework, probabilistic TWAS (PTWAS), to detect associations and investigate causal relationships between gene expression and complex traits. We use established concepts and principles from instrumental variables (IV) analysis to delineate and address the unique challenges that arise in TWAS. PTWAS utilizes probabilistic eQTL annotations derived from multi-variant Bayesian fine-mapping analysis, conferring higher power to detect TWAS associations than existing methods. Additionally, PTWAS provides novel functionalities to evaluate the causal assumptions and estimate tissue- or cell-type-specific causal effects of gene expression on complex traits. These features make PTWAS uniquely suited for in-depth investigations of the biological mechanisms that contribute to complex trait variation. Using eQTL data across 49 tissues from GTEx v8, we apply PTWAS to analyze 114 complex traits using GWAS summary statistics from several large-scale projects, including the UK Biobank. Our analysis reveals an abundance of genes with strong evidence of eQTL-mediated causal effects on complex traits and highlights the heterogeneity and tissue-relevance of these effects across complex traits. We distribute software and eQTL annotations to enable users to perform rigorous TWAS analysis by leveraging the full potential of the latest GTEx multi-tissue eQTL data.

Mapping short tandem repeats for liver gene expression traits helps prioritize potential causal variants for complex traits in pigs

Journal of Animal Science and Biotechnology ◽

10.1186/s40104-021-00658-z ◽

2022 ◽

Vol 13 (1) ◽

Author(s):

Zhongzi Wu ◽

Huanfa Gong ◽

Zhimin Zhou ◽

Tao Jiang ◽

Ziqi Lin ◽

...

Keyword(s):

Gene Expression ◽

Complex Traits ◽

Short Tandem Repeats ◽

Tandem Repeats ◽

Genome Wide Association Study ◽

Complex Trait ◽

Validation Population ◽

Causal Variants ◽

Liver Gene ◽

Short Tandem

Abstract. Background: Short tandem repeats (STRs) were recently found to have significant impacts on gene expression and diseases in humans, but their roles in gene expression and complex traits in pigs remain unexplored. This study investigates the effects of STRs on gene expression in liver tissues based on the whole-genome sequences and RNA-Seq data of a discovery cohort of 260 F6 individuals and a validation population of 296 F7 individuals from a heterogeneous population generated from crosses among eight pig breeds. Results: We identified 5203 and 5868 significant expression STRs (eSTRs, FDR < 1%) in the F6 and F7 populations, respectively, most of which could be reciprocally validated (π1 = 0.92). The eSTRs explained 27.5% of the cis-heritability of gene expression traits on average. We further identified 235 and 298 fine-mapped STRs through the Bayesian fine-mapping approach in the F6 and F7 pigs, respectively, which were significantly enriched in intron, ATAC peak, compartment A and H3K4me3 regions. We identified 20 fine-mapped STRs located in 100 kb windows upstream and downstream of published complex trait-associated SNPs, which colocalized with epigenetic markers such as H3K27ac and ATAC peaks. These included eSTRs of the CLPB, PGLS, PSMD6 and DHDH genes, which are linked with genome-wide association study (GWAS) SNPs for blood-related traits, leg conformation, growth-related traits, and meat quality traits, respectively. Conclusions: This study provides insights into the effects of STRs on gene expression traits. The identified eSTRs are valuable resources for prioritizing causal STRs for complex traits in pigs.

Promoter-anchored chromatin interactions predicted from genetic analysis of epigenomic data

10.1101/580993 ◽

2019 ◽

Author(s):

Yang Wu ◽

Ting Qi ◽

Huanwei Wang ◽

Futao Zhang ◽

Zhili Zheng ◽

...

Keyword(s):

DNA Methylation ◽

Complex Traits ◽

Analytical Approach ◽

Human Peripheral Blood ◽

Complex Trait ◽

Trait Variation ◽

Chromatin Interactions ◽

Level Data ◽

Trait Locus

Abstract: Promoter-anchored chromatin interactions (PAIs) play a pivotal role in transcriptional regulation. Current high-throughput technologies for detecting PAIs, such as promoter capture Hi-C, are not scalable to large cohorts. Here, we present an analytical approach that uses summary-level data from cohort-based DNA methylation (DNAm) quantitative trait locus (mQTL) studies to predict PAIs. Using mQTL data from human peripheral blood (n=1,980), we predicted 34,797 PAIs which showed strong overlap with the chromatin contacts identified by previous experimental assays. The promoter-interacting DNAm sites were enriched in enhancers or near expression QTLs. Genes whose promoters were involved in PAIs were more actively expressed, and gene pairs with promoter-promoter interactions were enriched for co-expression. Integration of the predicted PAIs with GWAS data highlighted interactions among 601 DNAm sites associated with 15 complex traits. This study demonstrates the use of mQTL data to predict PAIs and provides insights into the role of PAIs in complex trait variation.

Prediction of gene expression from regulatory sequence composition enhances transcriptome-wide association studies

10.1101/2021.05.11.443571 ◽

2021 ◽

Author(s):

Federico Marotta ◽

Reza Mozafari ◽

Elena Grassi ◽

Alessandro Lussana ◽

Elisa Mariella ◽

...

Keyword(s):

Gene Expression ◽

Regression Model ◽

Complex Traits ◽

Association Studies ◽

Regulatory Region ◽

Regulatory Sequence ◽

Genome Wide Association Studies ◽

Data Set ◽

Sequence Composition ◽

The Uk

Transcriptome-wide association studies (TWAS) can prioritize trait-associated genes by finding correlations between a trait and the genetically regulated component of gene expression. A basic ingredient of a TWAS is a regression model, typically trained in an external reference data set, used to impute the genetically-regulated expression. We devised a model that improves the accuracy of the imputation by using, as predictors, not the genotypes directly but rather the sequence composition of the proximal gene regulatory region, expressed as its profile of affinities for a set of position weight matrices. When trained on 48 tissues from GTEx, the regression model showed improved performance compared with models regressing expression directly on the genotype. We imputed the expression levels in genotyped individuals from the ADNI data set, and used the imputed expression to perform a TWAS. We also developed a method to perform the TWAS based on summary statistics from genome-wide association studies, and applied it to 11 complex traits from the UK Biobank. The greater accuracy in the prediction of gene expression allowed us to report hundreds of new gene-phenotype association candidates.

Putative Causal Variants Are Enriched in Annotated Functional Regions From Six Bovine Tissues

Frontiers in Genetics ◽

10.3389/fgene.2021.664379 ◽

2021 ◽

Vol 12 ◽

Author(s):

Claire P. Prowse-Wilkins ◽

Jianghui Wang ◽

Ruidong Xiang ◽

Josie B. Garner ◽

Michael E. Goddard ◽

...

Keyword(s):

Gene Expression ◽

Dairy Cows ◽

Histone Modifications ◽

Complex Traits ◽

Bovine Genome ◽

Complex Trait ◽

Functional Regions ◽

A Genome ◽

Causal Variants ◽

Specific Regulation

Genetic variants which affect complex traits (causal variants) are thought to be found in functional regions of the genome. Identifying causal variants would be useful for predicting complex trait phenotypes in dairy cows; however, functional regions are poorly annotated in the bovine genome. Functional regions can be identified on a genome-wide scale by assaying for post-translational modifications to histone proteins (histone modifications) and proteins interacting with the genome (e.g., transcription factors) using a method called chromatin immunoprecipitation followed by sequencing (ChIP-seq). In this study ChIP-seq was performed to find functional regions in the bovine genome by assaying for four histone modifications (H3K4Me1, H3K4Me3, H3K27ac, and H3K27Me3) and one transcription factor (CTCF) in 6 tissues (heart, kidney, liver, lung, mammary and spleen) from 2 to 3 lactating dairy cows. Eighty-six ChIP-seq samples were generated in this study, identifying millions of functional regions in the bovine genome. Combinations of histone modifications and CTCF were found using ChromHMM and annotated by comparing with active and inactive genes across the genome. Functional marks differed between tissues, highlighting areas which might be particularly important to tissue-specific regulation. Supporting the cis-regulatory role of functional regions, the read counts in some ChIP peaks correlated with nearby gene expression. The functional regions identified in this study were enriched for putative causal variants, as seen in other species. Interestingly, regions which correlated with gene expression were particularly enriched for potential causal variants. This supports the hypothesis that complex traits are regulated by variants that alter gene expression. This study provides one of the largest ChIP-seq annotation resources in cattle including, for the first time, in the mammary gland of lactating cows. By linking regulatory regions to expression QTL and trait QTL we demonstrate a new strategy for identifying causal variants in cattle.

An integrative analysis of genomic and exposomic data for complex traits and phenotypic prediction

Scientific Reports ◽

10.1038/s41598-021-00427-y ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Xuan Zhou ◽

S. Hong Lee

Keyword(s):

Complex Traits ◽

Prediction Accuracy ◽

Mixed Model ◽

Linear Mixed Model ◽

Complex Trait ◽

Great Promise ◽

Phenotypic Variance ◽

Additive Effects ◽

Mixed Model Approach ◽

The Uk

Abstract: Complementary to the genome, the concept of exposome has been proposed to capture the totality of human environmental exposures. While there has been some recent progress on the construction of the exposome, few tools exist that can integrate the genome and exposome for complex trait analyses. Here we propose a linear mixed model approach to bridge this gap, which jointly models the random effects of the two omics layers on phenotypes of complex traits. We illustrate our approach using traits from the UK Biobank (e.g., BMI and height for N ~ 35,000) with a small fraction of the exposome that comprises 28 lifestyle factors. The joint model of the genome and exposome explains substantially more phenotypic variance and significantly improves phenotypic prediction accuracy, compared to the model based on the genome alone. The additional phenotypic variance captured by the exposome includes its additive effects as well as non-additive effects such as genome–exposome (gxe) and exposome–exposome (exe) interactions. For example, 19% of variation in BMI is explained by additive effects of the genome, while an additional 7.2% is explained by additive effects of the exposome, 1.9% by exe interactions and 4.5% by gxe interactions. Correspondingly, the prediction accuracy for BMI, computed using Pearson’s correlation between the observed and predicted phenotypes, improves from 0.15 (based on the genome alone) to 0.35 (based on the genome and exposome). We also show, using established theories, that integrating genomic and exposomic data can be an effective way of attaining a clinically meaningful level of prediction accuracy for disease traits. In conclusion, the genomic and exposomic effects can contribute to phenotypic variation via their latent relationships, i.e. genome-exposome correlation, and gxe and exe interactions, and modelling these effects has the potential to improve phenotypic prediction accuracy and thus holds great promise for future clinical practice.
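The BMI variance components quoted in the abstract above can be tallied with simple arithmetic. This is only a sketch: it assumes the four reported components are non-overlapping and can be summed, which the abstract does not state explicitly, and the total below is derived here rather than reported by the authors. The dictionary keys and function name are illustrative.

```python
# Percent of BMI phenotypic variance per component, as quoted in the abstract.
BMI_VARIANCE_PCT = {
    "genome_additive": 19.0,
    "exposome_additive": 7.2,
    "exe_interaction": 1.9,
    "gxe_interaction": 4.5,
}


def total_explained_pct(components):
    """Sum the components, assuming they do not overlap (an assumption)."""
    return sum(components.values())


def share_of_total(components, key):
    """Fraction of the summed explained variance contributed by one component."""
    return components[key] / total_explained_pct(components)
```

Under that assumption the components sum to about 32.6% of BMI variance, of which the additive genomic part is a little under three fifths.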