HOPS: a quantitative score reveals pervasive horizontal pleiotropy in human genetic variation is driven by extreme polygenicity of human traits and diseases

Abstract Horizontal pleiotropy, where one variant has independent effects on multiple traits, is important for our understanding of the genetic architecture of human phenotypes. We develop a method to quantify horizontal pleiotropy using genome-wide association summary statistics and apply it to 372 heritable phenotypes measured in 361,194 UK Biobank individuals. Horizontal pleiotropy is pervasive throughout the human genome, prominent among highly polygenic phenotypes, and enriched in active regulatory regions. Our results highlight the central role horizontal pleiotropy plays in the genetic architecture of human phenotypes. The HOrizontal Pleiotropy Score (HOPS) method is available on Github at https://github.com/rondolab/HOPS.

10.1101/311332 ◽

2018 ◽

Cited By ~ 6

Author(s):

Daniel M. Jordan ◽

Marie Verbanck ◽

Ron Do

Keyword(s):

Genetic Variation ◽

Genetic Architecture ◽

Genome Wide Association ◽

Summary Statistics ◽

Human Genetic Variation ◽

UK Biobank ◽

Multiple Traits ◽

Genome Wide ◽

Quantitative Score ◽

Independent Effects

Bayesian multivariate reanalysis of large genetic studies identifies many new associations

10.1101/638882 ◽

2019 ◽

Author(s):

Michael C. Turchin ◽

Matthew Stephens

Keyword(s):

Genetic Variation ◽

Multivariate Analyses ◽

Association Studies ◽

Plasma Lipid ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Multiple Traits ◽

Association Analyses ◽

New Associations ◽

Genome Wide

Abstract Genome-wide association studies (GWAS) have now been conducted for hundreds of phenotypes of relevance to human health. Many such GWAS involve multiple closely-related phenotypes collected on the same samples. However, the vast majority of these GWAS have been analyzed using simple univariate analyses, which consider one phenotype at a time. This is despite the fact that, at least in simulation experiments, multivariate analyses have been shown to be more powerful at detecting associations. Here, we conduct multivariate association analyses on 13 different publicly-available GWAS datasets that involve multiple closely-related phenotypes. These data include large studies of anthropometric traits (GIANT), plasma lipid traits (GlobalLipids), and red blood cell traits (HaemgenRBC). Our analyses identify many new associations (433 in total across the 13 studies), many of which replicate when follow-up samples are available. Overall, our results demonstrate that multivariate analyses can help make more effective use of data from both existing and future GWAS. Author Summary Genome-wide association studies (GWAS) have become a common and powerful tool for identifying significant correlations between markers of genetic variation and physical traits of interest. Often these studies are conducted by comparing genetic variation against single traits one at a time (‘univariate’); however, it has previously been shown that it is possible to increase your power to detect significant associations by comparing genetic variation against multiple traits simultaneously (‘multivariate’). Despite this apparent increase in power though, researchers still rarely conduct multivariate GWAS, even when studies have multiple traits readily available. Here, we reanalyze 13 previously published GWAS using a multivariate method and find >400 additional associations.
Our method makes use of univariate GWAS summary statistics and is available as a software package, thus making it accessible to other researchers interested in conducting the same analyses. We also show, using studies that have multiple releases, that our new associations have high rates of replication. Overall, we argue that multivariate approaches in GWAS should no longer be overlooked and that there is often low-hanging fruit, in the form of new associations, to be gained by running these methods on data already collected.

Expanding the genetic architecture of nicotine dependence and its shared genetics with multiple traits

Nature Communications ◽

10.1038/s41467-020-19265-z ◽

2020 ◽

Vol 11 (1) ◽

Author(s):

Bryan C. Quach ◽

Michael J. Bray ◽

Nathan C. Gaddis ◽

Mengzhen Liu ◽

Teemu Palviainen ◽

...

Keyword(s):

Nicotine Dependence ◽

Genetic Architecture ◽

Genome Wide Association Study ◽

African Ancestry ◽

Genetic Knowledge ◽

UK Biobank ◽

Multiple Traits ◽

Expression Of Genes ◽

Genome Wide ◽

Smoking Index

Abstract Cigarette smoking is the leading cause of preventable morbidity and mortality. Genetic variation contributes to initiation, regular smoking, nicotine dependence, and cessation. We present a Fagerström Test for Nicotine Dependence (FTND)-based genome-wide association study in 58,000 European or African ancestry smokers. We observe five genome-wide significant loci, including previously unreported loci MAGI2/GNAI1 (rs2714700) and TENM2 (rs1862416), and extend loci reported for other smoking traits to nicotine dependence. Using the heaviness of smoking index from UK Biobank (N = 33,791), rs2714700 is consistently associated; rs1862416 is not associated, likely reflecting nicotine dependence features not captured by the heaviness of smoking index. Both variants influence nearby gene expression (rs2714700/MAGI2-AS3 in hippocampus; rs1862416/TENM2 in lung), and expression of genes spanning nicotine dependence-associated variants is enriched in cerebellum. Nicotine dependence (SNP-based heritability = 8.6%) is genetically correlated with 18 other smoking traits (rg = 0.40–1.09) and co-morbidities. Our results highlight nicotine dependence-specific loci, emphasizing the FTND as a composite phenotype that expands genetic knowledge of smoking.

Analytic and Translational Genetics

Annual Review of Biomedical Data Science ◽

10.1146/annurev-biodatasci-072018-021148 ◽

2020 ◽

Vol 3 (1) ◽

pp. 217-241

Author(s):

Konrad J. Karczewski ◽

Alicia R. Martin

Keyword(s):

Human Physiology ◽

Genetic Variation ◽

Therapeutic Potential ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Human Genetic Variation ◽

Genome Wide ◽

Human Genes ◽

Rare Genetic Variation

Understanding the influence of genetics on human disease is among the primary goals for biology and medicine. To this end, the direct study of natural human genetic variation has provided valuable insights into human physiology and disease as well as into the origins and migrations of humans. In this review, we discuss the foundations of population genetics, which provide a crucial context to the study of human genes and traits. In particular, genome-wide association studies and similar methods have revealed thousands of genetic loci associated with diseases and traits, providing invaluable information into the biology of these traits. Simultaneously, as the study of rare genetic variation has expanded, so-called human knockouts have elucidated the function of human genes and the therapeutic potential of targeting them.

O21: GENOME-WIDE ASSOCIATION STUDY OF VARICOSE VEINS IN 810,625 INDIVIDUALS IDENTIFIES 45 GENETIC RISK LOCI

British Journal of Surgery ◽

10.1093/bjs/znab117.021 ◽

2021 ◽

Vol 108 (Supplement_1) ◽

Author(s):

WUR Ahmed ◽

A Wiberg ◽

M Ng ◽

D Furniss

Keyword(s):

Association Study ◽

Genetic Architecture ◽

Genome Wide Association Study ◽

Varicose Veins ◽

Phenotypic Expression ◽

Cell Activity ◽

Genome Wide Association ◽

UK Biobank ◽

Genetic Components ◽

Genome Wide

Abstract Introduction Varicose veins (VV) impact a third of the UK adult population; 10% of patients develop lipodermatosclerosis and ulceration. VV often requires surgical management; however, there is a high risk of recurrence. VV is a complex disease, where genetic and non-genetic components contribute to overall phenotypic expression. The genetic architecture of VV is poorly understood; we aimed to uncover its genetic basis. Method We conducted the largest genome-wide association study of VV to date. In stage one, using UK Biobank, we compared 22,473 VV patients and 379,183 controls. In stage two, replication and meta-analysis were performed in an independent cohort of 113,041 VV cases and 295,928 controls from 23andMe (California, USA). In-silico analysis was conducted in FUMA, MAGMA, and XGR. Result 109 genome-wide significant (P ≤ 5×10⁻⁸) loci were identified in UK Biobank, 45 of which successfully replicated in the 23andMe cohort. Twenty-seven loci have not been previously reported. FUMA positionally-mapped 128 genes at the replicated loci, with 84 having a combined annotation-dependent depletion score (CADD) >12.37, suggesting functional, deleterious variants. MAGMA analysis implicated pathways involved in cardiovascular system development (P = 1.57×10⁻⁸) and tube morphogenesis (P = 9.35×10⁻⁸). Furthermore, XGR revealed enriched pathways in downstream signalling in naive CD8+ T cells (P = 0.0017), and encoding structural and core extracellular glycoproteins (both P = 0.007). Conclusion We identified 45 variants conferring risk of VV, which provide insights into disease biology. Implicated genes are enriched in pathways involved in vascular development, immune cell activity and extracellular matrix function, and provide new targets for therapeutic development. Take-home message Unravelling the genetic architecture of varicose veins may facilitate our understanding of the disease and guide therapeutic approaches.

Genome-Wide Association Study and Genetic Correlation Scan Provide Insights into Its Genetic Architecture of Sleep Health Score in the UK Biobank Cohort

Nature and Science of Sleep ◽

10.2147/nss.s326818 ◽

2022 ◽

Vol 14 ◽

pp. 1-12

Author(s):

Yao Yao ◽

Yumeng Jia ◽

Yan Wen ◽

Bolun Cheng ◽

Shiqiang Cheng ◽

...

Keyword(s):

Association Study ◽

Genetic Correlation ◽

Genetic Architecture ◽

Genome Wide Association Study ◽

Genome Wide Association ◽

UK Biobank ◽

Sleep Health ◽

Health Score ◽

Genome Wide ◽

The UK

Analysis of common genetic variation and rare CNVs in the Australian Autism Biobank

Molecular Autism ◽

10.1186/s13229-020-00407-5 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Chloe X. Yap ◽

Gail A. Alvares ◽

Anjali K. Henders ◽

Tian Lin ◽

Leanne Wallace ◽

...

Keyword(s):

Genetic Variation ◽

Autism Spectrum ◽

Research Centre ◽

Summary Statistics ◽

UK Biobank ◽

Common Genetic Variation ◽

Cooperative Research ◽

Rare CNVs ◽

Genome Wide ◽

Polygenic Scores

Abstract Background Autism spectrum disorder (ASD) is a complex neurodevelopmental condition whose biological basis is yet to be elucidated. The Australian Autism Biobank (AAB) is an initiative of the Cooperative Research Centre for Living with Autism (Autism CRC) to establish an Australian resource of biospecimens, phenotypes and genomic data for research on autism. Methods Genome-wide single-nucleotide polymorphism genotypes were available for 2,477 individuals (after quality control) from 546 families (436 complete), including 886 participants aged 2 to 17 years with diagnosed (n = 871) or suspected (n = 15) ASD, 218 siblings without ASD, 1,256 parents, and 117 unrelated children without an ASD diagnosis. The genetic data were used to confirm familial relationships and assign ancestry, which was majority European (n = 1,964 European individuals). We generated polygenic scores (PGS) for ASD, IQ, chronotype and height in the subset of Europeans, and in 3,490 unrelated ancestry-matched participants from the UK Biobank. We tested for group differences for each PGS, and performed prediction analyses for related phenotypes in the AAB. We called copy-number variants (CNVs) in all participants, and intersected these with high-confidence ASD- and intellectual disability (ID)-associated CNVs and genes from the public domain. Results The ASD (p = 6.1e−13), sibling (p = 4.9e−3) and unrelated (p = 3.0e−3) groups had significantly higher ASD PGS than UK Biobank controls, whereas this was not the case for height—a control trait. The IQ PGS was a significant predictor of measured IQ in undiagnosed children (r = 0.24, p = 2.1e−3) and parents (r = 0.17, p = 8.0e−7; 4.0% of variance), but not the ASD group. Chronotype PGS predicted sleep disturbances within the ASD group (r = 0.13, p = 1.9e−3; 1.3% of variance). 
In the CNV analysis, we identified 13 individuals with CNVs overlapping ASD/ID-associated CNVs, and 12 with CNVs overlapping ASD/ID/developmental delay-associated genes identified on the basis of de novo variants. Limitations This dataset is modest in size, and the publicly-available genome-wide-association-study (GWAS) summary statistics used to calculate PGS for ASD and other traits are relatively underpowered. Conclusions We report on common genetic variation and rare CNVs within the AAB. Prediction analyses using currently available GWAS summary statistics are largely consistent with expected relationships based on published studies. As the size of publicly-available GWAS summary statistics grows, the phenotypic depth of the AAB dataset will provide many opportunities for analyses of autism profiles and co-occurring conditions, including when integrated with other omics datasets generated from AAB biospecimens (blood, urine, stool, hair).
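
The polygenic scores (PGS) mentioned in the abstract above are, in their most common form, a weighted sum of effect-allele dosages. The sketch below illustrates that general idea only; it is not the AAB authors' pipeline, and the function name, weights, and dosages are hypothetical values chosen for illustration.

```python
def polygenic_score(dosages, weights):
    """Return the weighted sum of effect-allele dosages for one individual.

    dosages: copies of the effect allele per variant (0, 1, or 2).
    weights: per-variant effect sizes, e.g. taken from GWAS summary statistics.
    """
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must cover the same variants")
    return sum(d * w for d, w in zip(dosages, weights))


# Hypothetical effect sizes and genotype for three variants (not real data).
example_weights = [0.5, -0.25, 1.0]
example_dosages = [2, 1, 0]

score = polygenic_score(example_dosages, example_weights)
print(score)
```

In practice the weights come from an independent discovery GWAS, which is why the abstract notes that underpowered summary statistics limit how well the scores predict.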

Estimation of Non-null SNP Effect Size Distributions Enables the Detection of Enriched Genes Underlying Complex Traits

10.1101/597484 ◽

2019 ◽

Author(s):

Wei Cheng ◽

Sohini Ramachandran ◽

Lorin Crawford

Keyword(s):

Null Hypothesis ◽

Quantitative Traits ◽

False Positive Rate ◽

Genome Wide Association ◽

Summary Statistics ◽

UK Biobank ◽

Genome Wide ◽

Gene Level ◽

Positive Rate ◽

The UK

Abstract Traditional univariate genome-wide association studies generate false positives and negatives due to difficulties distinguishing associated variants from variants with spurious nonzero effects that do not directly influence the trait. Recent efforts have been directed at identifying genes or signaling pathways enriched for mutations in quantitative traits or case-control studies, but these can be computationally costly and hampered by strict model assumptions. Here, we present gene-ε, a new approach for identifying statistical associations between sets of variants and quantitative traits. Our key insight is that enrichment studies on the gene-level are improved when we reformulate the genome-wide SNP-level null hypothesis to identify spurious small-to-intermediate SNP effects and classify them as non-causal. gene-ε efficiently identifies enriched genes under a variety of simulated genetic architectures, achieving greater than a 90% true positive rate at 1% false positive rate for polygenic traits. Lastly, we apply gene-ε to summary statistics derived from six quantitative traits using European-ancestry individuals in the UK Biobank, and identify enriched genes that are in biologically relevant pathways. Author Summary Enrichment tests augment the standard univariate genome-wide association (GWA) framework by identifying groups of biologically interacting mutations that are enriched for associations with a trait of interest, beyond what is expected by chance. These analyses model local linkage disequilibrium (LD), allow many different mutations to be disease-causing across patients, and generate biologically interpretable hypotheses for disease mechanisms. However, existing enrichment analyses are hampered by high computational costs, and rely on GWA summary statistics despite the high false positive rate of the standard univariate GWA framework.
Here, we present the gene-level association framework gene-ε (pronounced “genie”), an empirical Bayesian approach for identifying statistical associations between sets of mutations and quantitative traits. The central innovation of gene-ε is reformulating the GWA null model to distinguish between (i) mutations that are statistically associated with the disease but are unlikely to directly influence it, and (ii) mutations that are most strongly associated with a disease of interest. We find that, with our reformulated SNP-level null hypothesis, our gene-level enrichment model outperforms existing enrichment methods in simulation studies and scales well for application to emerging biobank datasets. We apply gene-ε to six quantitative traits in the UK Biobank and recover novel and functionally validated gene-level associations.

Genome-wide association study suggests that variation at the RCOR1 locus is associated with tinnitus in UK Biobank

Scientific Reports ◽

10.1038/s41598-021-85871-6 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Helena R. R. Wells ◽

Fatin N. Zainul Abidin ◽

Maxim B. Freidin ◽

Frances M. K. Williams ◽

Sally J. Dawson

Keyword(s):

Association Study ◽

Molecular Mechanisms ◽

Genome Wide Association Study ◽

Genome Wide Association ◽

UK Biobank ◽

Significance Threshold ◽

Genome Wide ◽

Final Sample ◽

Close Proximity ◽

Repressor Complex

Abstract Tinnitus is a prevalent condition in which perception of sound occurs without an external stimulus. It is often associated with pre-existing hearing loss or noise-induced damage to the auditory system. In some individuals it occurs frequently or even continuously and leads to considerable distress and difficulty sleeping. There is little knowledge of the molecular mechanisms involved in tinnitus, which has hindered the development of treatments. Evidence suggests that tinnitus has a heritable component, although previous genetic studies have not established specific risk factors. From a total of 172,608 UK Biobank participants who answered questions on tinnitus we performed a case–control genome-wide association study for self-reported tinnitus. Final sample size used in association analysis was N = 91,424. Three variants in close proximity to the RCOR1 gene reached genome-wide significance: rs4906228 (p = 1.7E−08), rs4900545 (p = 1.8E−08) and 14:103042287_CT_C (p = 3.50E−08). RCOR1 encodes REST Corepressor 1, a component of a co-repressor complex involved in repressing neuronal gene expression in non-neuronal cells. Eleven other independent genetic loci reached a suggestive significance threshold of p < 1E−06.
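
The two significance tiers in the abstract above can be made concrete with a small sketch. It assumes the conventional genome-wide threshold of 5E−08, which this abstract does not state explicitly; the suggestive threshold of 1E−06 and the three p-values are taken from the abstract. The function and constant names are illustrative only.

```python
GENOME_WIDE = 5e-8   # conventional threshold; an assumption, not stated in the abstract
SUGGESTIVE = 1e-6    # suggestive threshold quoted in the abstract


def classify(p_value):
    """Label a p-value by the strictest threshold it passes."""
    if p_value < GENOME_WIDE:
        return "genome-wide significant"
    if p_value < SUGGESTIVE:
        return "suggestive"
    return "not significant"


# p-values reported in the abstract for the variants near RCOR1.
reported = {
    "rs4906228": 1.7e-8,
    "rs4900545": 1.8e-8,
    "14:103042287_CT_C": 3.50e-8,
}

labels = {variant: classify(p) for variant, p in reported.items()}
for variant, label in labels.items():
    print(variant, label)
```

All three reported variants fall below the assumed 5E−08 cutoff, consistent with the abstract describing them as genome-wide significant.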

A meta-analysis of genome-wide association studies for average daily gain and lean meat percentage in two Duroc pig populations

BMC Genomics ◽

10.1186/s12864-020-07288-1 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

Shenping Zhou ◽

Rongrong Ding ◽

Fanming Meng ◽

Xingwang Wang ◽

Zhanwei Zhuang ◽

...

Keyword(s):

Candidate Genes ◽

Growth And Development ◽

Genetic Architecture ◽

Association Studies ◽

Meta Analysis ◽

Average Daily Gain ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Daily Gain

Abstract Background Average daily gain (ADG) and lean meat percentage (LMP) are the main production performance indicators of pigs. Nevertheless, the genetic architecture of ADG and LMP is still elusive. Here, we conducted genome-wide association studies (GWAS) and meta-analysis for ADG and LMP in 3770 American and 2090 Canadian Duroc pigs. Results In the American Duroc pigs, one novel pleiotropic quantitative trait locus (QTL) on Sus scrofa chromosome 1 (SSC1) was identified to be associated with ADG and LMP, which spans 2.53 Mb (from 159.66 to 162.19 Mb). In the Canadian Duroc pigs, two novel QTLs on SSC1 were detected for LMP, which were situated in 3.86 Mb (from 157.99 to 161.85 Mb) and 555 kb (from 37.63 to 38.19 Mb) regions. The meta-analysis identified ten and 20 additional SNPs for ADG and LMP, respectively. Finally, four genes (PHLPP1, STC1, DYRK1B, and PIK3C2A) were detected to be associated with ADG and/or LMP. Further bioinformatics analysis showed that the candidate genes for ADG are mainly involved in bone growth and development, whereas the candidate genes for LMP mainly participated in adipose tissue and muscle tissue growth and development. Conclusions We performed GWAS and meta-analysis for ADG and LMP based on a large sample size consisting of two Duroc pig populations. One pleiotropic QTL that shared a 2.19 Mb haplotype block from 159.66 to 161.85 Mb on SSC1 was found to affect ADG and LMP in the two Duroc pig populations. Furthermore, the combination of single-population and meta-analysis of GWAS improved the efficiency of detecting additional SNPs for the analyzed traits. Our results provide new insights into the genetic architecture of ADG and LMP traits in pigs. Moreover, some significant SNPs associated with ADG and/or LMP in this study may be useful for marker-assisted selection in pig breeding.
