Genome-wide analyses of behavioural traits biased by misreports and longitudinal changes

Abstract Genome-wide association studies (GWAS) have discovered numerous genetic variants associated with human behavioural traits. However, behavioural traits are subject to misreports and longitudinal changes (MLC) which can cause biases in GWAS and follow-up analyses. Here, we demonstrate that individuals with higher disease burden in the UK Biobank (n = 455,607) are more likely to misreport or reduce their alcohol consumption (AC) levels, and propose a correction procedure to mitigate the MLC-induced biases. The AC GWAS signals removed by the MLC corrections are enriched in metabolic/cardiovascular traits. Almost all the previously reported negative estimates of genetic correlations between AC and common diseases become positive/non-significant after the MLC corrections. We also observe MLC biases for smoking and physical activities in the UK Biobank. Our findings provide a plausible explanation of the controversy about the effects of AC on health outcomes and a caution for future analyses of self-reported behavioural traits in biobank data.

Genome-wide analyses of behavioural traits are subject to bias by misreports and longitudinal changes

Nature Communications ◽

10.1038/s41467-020-20237-6 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Angli Xue ◽

Longda Jiang ◽

Zhihong Zhu ◽

Naomi R. Wray ◽

Peter M. Visscher ◽

...

Keyword(s):

Alcohol Consumption ◽

Association Studies ◽

Genetic Correlations ◽

Genome Wide Association Studies ◽

UK Biobank ◽

Longitudinal Changes ◽

Genome Wide ◽

The UK ◽

Almost All ◽

Cardiovascular Traits

Abstract Genome-wide association studies (GWAS) have discovered numerous genetic variants associated with human behavioural traits. However, behavioural traits are subject to misreports and longitudinal changes (MLC) which can cause biases in GWAS and follow-up analyses. Here, we demonstrate that individuals with higher disease burden in the UK Biobank (n = 455,607) are more likely to misreport or reduce their alcohol consumption levels, and propose a correction procedure to mitigate the MLC-induced biases. The alcohol consumption GWAS signals removed by the MLC corrections are enriched in metabolic/cardiovascular traits. Almost all the previously reported negative estimates of genetic correlations between alcohol consumption and common diseases become positive/non-significant after the MLC corrections. We also observe MLC biases for smoking and physical activities in the UK Biobank. Our findings provide a plausible explanation of the controversy about the effects of alcohol consumption on health outcomes and a caution for future analyses of self-reported behavioural traits in biobank data.

Identifying genetic variants associated with cerebellar volume in 33,265 individuals from the UK-Biobank

10.1101/2020.11.24.393249 ◽

2020 ◽

Author(s):

Tom Chambers ◽

Valentina Escott-Price ◽

Sophie Legge ◽

Emily Baker ◽

Krish D. Singh ◽

...

Keyword(s):

Association Studies ◽

Brain Structures ◽

Genome Wide Association Studies ◽

UK Biobank ◽

Protein Coding ◽

Cerebellar Volume ◽

Genome Wide ◽

Subcortical Brain Structures ◽

The UK

Abstract There is expanding interest in researching the cerebellum given accumulating evidence of its important contributions to cognitive and emotional functions, in addition to more established sensorimotor roles. While large genome-wide association studies (GWAS) have shed light on the common allele architecture of cortical and subcortical brain structures, the cerebellum remains under-investigated. We conducted a meta-GWAS of cerebellar volume in 33,265 UK Biobank European participants. Results show cerebellar volume to be moderately heritable (SNP-based h² = 50.6%). We identified 33 independent SNPs associated with total cerebellar volume at genome-wide significance, with 6 of these SNPs mapped to protein-coding genes and 5 more shown to alter cerebellar gene expression. We highlight 21 unique candidate genes for follow-up analysis. Cerebellar volume showed significant genetic correlation with brainstem, pallidum and thalamus volumes, but no significant correlations with neuropsychiatric phenotypes. Our results provide important new knowledge of the genetic architecture of cerebellar volume and its relationship with other brain phenotypes.

Reproducibility in the UK Biobank of Genome-Wide Significant Signals Discovered in Earlier Genome-wide Association Studies

10.1101/2020.06.24.20139576 ◽

2020 ◽

Cited By ~ 1

Author(s):

Jack W. O’Sullivan ◽

John P. A. Ioannidis

Keyword(s):

Effect Size ◽

Association Studies ◽

Genome Wide Association ◽

P Value ◽

Genome Wide Association Studies ◽

UK Biobank ◽

Single Nucleotide ◽

Genome Wide ◽

The UK ◽

Open Question

Abstract With the establishment of large biobanks, discovery of single nucleotide polymorphisms (SNPs) that are associated with various phenotypes has been accelerated. An open question is whether SNPs identified with genome-wide significance in earlier genome-wide association studies (GWAS) are also replicated in later GWAS conducted in biobanks. To address this question, the authors examined a publicly available GWAS database and identified pairs of independent GWAS on the same phenotype (an earlier "discovery" GWAS and a later replication GWAS done in the UK Biobank). The analysis evaluated 136,318,924 SNPs (of which 6,289 had reached p < 5e-8 in the discovery GWAS) from 4,397,962 participants across nine phenotypes. The overall replication rate was 85.0% and it was lower for binary than for quantitative phenotypes (58.1% versus 94.8%, respectively). There was an 18.0% decrease in SNP effect size for binary phenotypes, but a 12.0% increase for quantitative phenotypes. Using the discovery SNP effect size, phenotype type (binary or quantitative), and discovery p-value, we built and validated a model that predicted SNP replication with area under the receiver operating characteristic curve = 0.90. While non-replication may often reflect lack of power rather than genuine false-positive findings, these results provide insights about which discovered associations are likely to be seen again across subsequent GWAS.

Insights into the genetic basis of retinal detachment

Human Molecular Genetics ◽

10.1093/hmg/ddz294 ◽

2019 ◽

Vol 29 (4) ◽

pp. 689-702 ◽

Cited By ~ 2

Author(s):

Thibaud S Boutin ◽

David G Charteris ◽

Aman Chandra ◽

Susan Campbell ◽

Caroline Hayward ◽

...

Keyword(s):

Retinal Detachment ◽

Association Studies ◽

Genetic Correlations ◽

Self Report ◽

Cataract Operation ◽

Genome Wide Association Studies ◽

UK Biobank ◽

Genetic Associations ◽

Data Set ◽

Genome Wide

Abstract Retinal detachment (RD) is a serious and common condition, but genetic studies to date have been hampered by the small size of the assembled cohorts. In the UK Biobank data set, where RD was ascertained by self-report or hospital records, genetic correlations between RD and high myopia or cataract operation were, respectively, 0.46 (SE = 0.08) and 0.44 (SE = 0.07). These correlations are consistent with known epidemiological associations. Through meta-analysis of genome-wide association studies using UK Biobank RD cases (N = 3 977) and two cohorts, each comprising ~1 000 clinically ascertained rhegmatogenous RD patients, we uncovered 11 genome-wide significant association signals. These are near or within ZC3H11B, BMP3, COL22A1, DLG5, PLCE1, EFEMP2, TYR, FAT3, TRIM29, COL2A1 and LOXL1. Replication in the 23andMe data set, where RD is self-reported by participants, firmly establishes six RD risk loci: FAT3, COL22A1, TYR, BMP3, ZC3H11B and PLCE1. Based on the genetic associations with eye traits described to date, the first two specifically impact risk of a RD, whereas the last four point to shared aetiologies with macular condition, myopia and glaucoma. Fine-mapping prioritized the lead common missense variant (TYR S192Y) as causal variant at the TYR locus and a small set of credible causal variants at the FAT3 locus. The larger study size presented here, enabled by resources linked to health records or self-report, provides novel insights into RD aetiology and underlying pathological pathways.
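
As a rough illustration of how the reported genetic correlations and standard errors can be read, the sketch below converts an estimate and its SE into a z-score, a two-sided p-value and an approximate 95% confidence interval. It assumes the estimate is approximately normally distributed; the function name `summarize_rg` is hypothetical and is not taken from the paper.

```python
import math

def summarize_rg(rg, se, z_crit=1.96):
    """Summarise a genetic correlation estimate under a normal approximation.

    rg: point estimate, se: its standard error.
    Returns the z-score, a two-sided p-value and an approximate 95% CI.
    """
    z = rg / se
    # Two-sided tail probability of a standard normal variable.
    p = math.erfc(abs(z) / math.sqrt(2))
    ci = (rg - z_crit * se, rg + z_crit * se)
    return {"z": z, "p": p, "ci": ci}

# Values quoted in the abstract: RD vs high myopia, RD vs cataract operation.
myopia = summarize_rg(0.46, 0.08)
cataract = summarize_rg(0.44, 0.07)
print(myopia)
print(cataract)
```

Under this approximation both intervals exclude zero, which is consistent with the abstract describing the correlations as in line with known epidemiological associations.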

The evolution of skin pigmentation associated variation in West Eurasia

10.1101/2020.05.08.085274 ◽

2020 ◽

Author(s):

Dan Ju ◽

Iain Mathieson

Keyword(s):

Genetic Variants ◽

Association Studies ◽

Skin Pigmentation ◽

Directional Selection ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

UK Biobank ◽

Genome Wide ◽

Light Skin ◽

The UK

Abstract Skin pigmentation is a classic example of a polygenic trait that has experienced directional selection in humans. Genome-wide association studies have identified well over a hundred pigmentation-associated loci, and genomic scans in present-day and ancient populations have identified selective sweeps for a small number of light pigmentation-associated alleles in Europeans. It is unclear whether selection has operated on all the genetic variation associated with skin pigmentation as opposed to just a small number of large-effect variants. Here, we address this question using ancient DNA from 1158 individuals from West Eurasia covering a period of 40,000 years combined with genome-wide association summary statistics from the UK Biobank. We find a robust signal of directional selection in ancient West Eurasians on skin pigmentation variants ascertained in the UK Biobank, but find this signal is driven mostly by a limited number of large-effect variants. Consistent with this observation, we find that a polygenic selection test in present-day populations fails to detect selection with the full set of variants; rather, only the top five show strong evidence of selection. Our data allow us to disentangle the effects of admixture and selection. Most notably, a large-effect variant at SLC24A5 was introduced to Europe by migrations of Neolithic farming populations but continued to be under selection post-admixture. This study shows that the response to selection for light skin pigmentation in West Eurasia was driven by a relatively small proportion of the variants that are associated with present-day phenotypic variation.

Significance Some of the genes responsible for the evolution of light skin pigmentation in Europeans show signals of positive selection in present-day populations. Recently, genome-wide association studies have highlighted the highly polygenic nature of skin pigmentation. It is unclear whether selection has operated on all of these genetic variants or just a subset. By studying variation in over a thousand ancient genomes from West Eurasia covering 40,000 years we are able to study both the aggregate behavior of pigmentation-associated variants and the evolutionary history of individual variants. We find that the evolution of light skin pigmentation in Europeans was driven by frequency changes in a relatively small fraction of the genetic variants that are associated with variation in the trait today.

Genome-wide association study of circulating liver enzymes reveals an expanded role for manganese transporter SLC30A10 in liver health

10.1101/2020.05.19.104570 ◽

2020 ◽

Author(s):

Lucas D. Ward ◽

Ho-Chou Tu ◽

Chelsea Quenneville ◽

Alexander O. Flynn-Carroll ◽

Margaret M. Parker ◽

...

Keyword(s):

Extrahepatic Bile Duct ◽

Association Studies ◽

Genome Wide Association ◽

Detectable Effect ◽

Genome Wide Association Studies ◽

UK Biobank ◽

Extrahepatic Bile Duct Cancer ◽

Genome Wide ◽

Liver Health ◽

The UK

Abstract To better understand molecular pathways underlying liver health and disease, we performed genome-wide association studies (GWAS) on circulating levels of alanine aminotransferase (ALT) and aspartate aminotransferase (AST) across 408,300 subjects from four ethnic groups in the UK Biobank, focusing on variants associating with both enzymes. Of these variants, the strongest effect is a rare (MAF in White British = 0.12%) missense variant in the gene encoding manganese efflux transporter SLC30A10, Thr95Ile (rs188273166), associating with a 5.9% increase in ALT and a 4.2% increase in AST. Carriers have higher prevalence of all-cause liver disease (OR = 1.70; 95% CI = 1.24 to 2.34) and higher prevalence of extrahepatic bile duct cancer (OR = 23.8; 95% CI = 9.1 to 62.1) compared to non-carriers. Over 4% of the cases of extrahepatic cholangiocarcinoma in the UK Biobank carry SLC30A10 Thr95Ile. Unlike variants in SLC30A10 known to cause the recessive syndrome hypermanganesemia with dystonia-1 (HMNDYT1), the Thr95Ile variant has a detectable effect even in the heterozygous state. Also unlike HMNDYT1-causing variants, Thr95Ile results in a protein that is properly trafficked to the plasma membrane when expressed in HeLa cells. These results suggest that coding variation in SLC30A10 impacts liver health in more individuals than the small population of HMNDYT1 patients.
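
For readers who want to relate the odds ratios above to their confidence intervals, the following sketch back-calculates the standard error of the log odds ratio from a reported 95% CI and derives an approximate z-score. It assumes the interval is a symmetric Wald interval on the log scale, which the abstract does not state; the helper names are hypothetical.

```python
import math

def log_or_se_from_ci(lower, upper, z_crit=1.96):
    """Approximate SE of log(OR), assuming a symmetric Wald CI on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)

def or_z_score(odds_ratio, lower, upper):
    """Approximate z-score for an odds ratio given its 95% CI bounds."""
    return math.log(odds_ratio) / log_or_se_from_ci(lower, upper)

# All-cause liver disease in carriers, as quoted in the abstract.
se = log_or_se_from_ci(1.24, 2.34)
z = or_z_score(1.70, 1.24, 2.34)
print(f"SE(log OR) ~ {se:.3f}, z ~ {z:.2f}")
```

Because reported bounds are rounded, the recovered SE and z-score are approximations only.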

Genetic stratification of depression by neuroticism: revisiting a diagnostic tradition

Psychological Medicine ◽

10.1017/s0033291719002629 ◽

2019 ◽

Vol 50 (15) ◽

pp. 2526-2535 ◽

Cited By ~ 8

Author(s):

Mark J. Adams ◽

David M. Howard ◽

Michelle Luciano ◽

Toni-Kim Clarke ◽

Gail Davies ◽

...

Keyword(s):

Association Studies ◽

Genetic Correlations ◽

Genome Wide Association Studies ◽

UK Biobank ◽

Genome Wide ◽

Artery Disease ◽

Specific Association ◽

Genetic Stratification ◽

Genomic Regions ◽

Genetic Contributions

Abstract Background: Major depressive disorder and neuroticism (Neu) share a large genetic basis. We sought to determine whether this shared basis could be decomposed to identify genetic factors that are specific to depression. Methods: We analysed summary statistics from genome-wide association studies (GWAS) of depression (from the Psychiatric Genomics Consortium, 23andMe and UK Biobank) and compared them with GWAS of Neu (from UK Biobank). First, we used a pairwise GWAS analysis to classify variants as associated with only depression, with only Neu or with both. Second, we estimated partial genetic correlations to test whether depression's genetic link with other phenotypes was explained by shared overlap with Neu. Results: We found evidence that most genomic regions (25/37) associated with depression are likely to be shared with Neu. The overlapping common genetic variance of depression and Neu was genetically correlated primarily with psychiatric disorders. We found that the genetic contributions to depression that were not shared with Neu were positively correlated with metabolic phenotypes and cardiovascular disease, and negatively correlated with the personality trait conscientiousness. After removing shared genetic overlap with Neu, depression still had a specific association with schizophrenia, bipolar disorder, coronary artery disease and age of first birth. Independent of depression, Neu had specific genetic correlates in ulcerative colitis, pubertal growth, anorexia and education. Conclusion: Our findings demonstrate that, while genetic risk factors for depression are largely shared with Neu, there are also non-Neu-related features of depression that may be useful for further patient or phenotypic stratification.

Body size and composition and site-specific cancers in UK Biobank: a Mendelian randomisation study

10.1101/2020.02.28.970459 ◽

2020 ◽

Cited By ~ 1

Author(s):

Mathew Vithayathil ◽

Paul Carter ◽

Siddhartha Kar ◽

Amy M. Mason ◽

Stephen Burgess ◽

...

Keyword(s):

Instrumental Variables ◽

Association Studies ◽

Genome Wide Association ◽

Mendelian Randomisation ◽

Genome Wide Association Studies ◽

UK Biobank ◽

Site Specific ◽

Genome Wide ◽

Increased Risk ◽

The UK

Abstract Objectives: To investigate the causal role of body mass index, body fat composition and height in cancer. Design: Two-stage Mendelian randomisation study. Setting: Previous genome-wide association studies and the UK Biobank. Participants: Genetic instrumental variables for body mass index (BMI), fat mass index (FMI), fat free mass index (FFMI) and height from previous genome-wide association studies and UK Biobank. Cancer outcomes from 367 586 participants of European descent from the UK Biobank. Main outcome measures: Overall cancer risk and risk of 22 site-specific cancers for genetic instrumental variables for BMI, FMI, FFMI and height. Results: Genetically predicted BMI (per 1 kg/m²) was not associated with overall cancer risk (OR 0.99; 95% confidence interval (CI) 0.98-1.00; p=0.105). Elevated BMI was associated with increased risk of stomach cancer (OR 1.15, 95% CI 1.05-1.26; p=0.003) and melanoma (OR 0.96, 95% CI 0.92-1.00; p=0.044). For sex-specific cancers, BMI was positively associated with uterine cancer (OR 1.08, 95% CI 1.01-1.14; p=0.015) but inversely associated with breast (OR 0.95, 95% CI 0.92-0.98; p=0.001), prostate (OR 0.95, 95% CI 0.92-0.99; p=0.007) and testicular cancer (OR 0.89, 95% CI 0.81-0.98; p=0.017). Elevated FMI (per 1 kg/m²) was associated with gastrointestinal cancer (stomach cancer OR 4.23, 95% CI 1.18-15.13; p=0.027; colorectal cancer OR 1.94, 95% CI 1.23-3.07; p=0.004). Increased height (per 1 standard deviation, approximately 6.5 cm) was associated with increased risk of overall cancer (OR 1.06; 95% CI 1.04-1.09; p = 2.97×10⁻⁸) and most site-specific cancers, with the strongest estimates for kidney, non-Hodgkin lymphoma, colorectal, lung, melanoma and breast cancer. Conclusions: There is little evidence for BMI as a causal risk factor for cancer. BMI may have a causal role for sex-specific cancers, although with inconsistent directions of effect, and FMI for gastrointestinal malignancies. Elevated height is a risk factor for overall cancer and multiple site cancers.
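
To make the idea of combining genetic instruments concrete, here is a minimal sketch of an inverse-variance weighted (IVW) Mendelian randomisation estimate built from per-variant summary statistics. The abstract does not say which estimator the authors used, so IVW is an assumption chosen because it is a common default. The function name and all numbers are hypothetical and are not taken from the study.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate of the causal effect of exposure on outcome.

    beta_exposure: per-variant effects on the exposure (e.g. BMI)
    beta_outcome:  per-variant effects on the outcome (log odds of cancer)
    se_outcome:    standard errors of the outcome effects
    Returns (estimate, standard error) on the log-odds scale.
    """
    num = sum(bx * by / s ** 2
              for bx, by, s in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / s ** 2
              for bx, s in zip(beta_exposure, se_outcome))
    return num / den, math.sqrt(1.0 / den)

# Hypothetical summary statistics for three variants (illustration only).
bx = [0.1, 0.2, 0.3]
by = [0.05, 0.10, 0.15]
se = [0.01, 0.02, 0.01]

est, est_se = ivw_estimate(bx, by, se)
print(f"log OR per unit exposure ~ {est:.3f} (SE {est_se:.3f})")
print(f"OR per unit exposure ~ {math.exp(est):.3f}")
```

Exponentiating the estimate gives an odds ratio per unit of the exposure, which is the scale on which the results above are reported.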

The transferability of lipid loci across African, Asian and European cohorts

Nature Communications ◽

10.1038/s41467-019-12026-7 ◽

2019 ◽

Vol 10 (1) ◽

Cited By ~ 7

Author(s):

Karoline Kuchenbaecker ◽

Nikita Telkar ◽

Theresa Reiker ◽

Robin G. Walters ◽

...

Keyword(s):

Association Studies ◽

Genetic Risk Score ◽

Genetic Correlations ◽

Serum Levels ◽

European Ancestry ◽

Genome Wide Association Studies ◽

Genetic Determinants ◽

Gene Environment ◽

Genome Wide ◽

The UK

Abstract Most genome-wide association studies are based on samples of European descent. We assess whether the genetic determinants of blood lipids, a major cardiovascular risk factor, are shared across populations. Genetic correlations for lipids between European-ancestry and Asian cohorts are not significantly different from 1. A genetic risk score based on LDL-cholesterol-associated loci has consistent effects on serum levels in samples from the UK, Uganda and Greece (r = 0.23–0.28, p < 1.9 × 10⁻¹⁴). Overall, there is evidence of reproducibility for ~75% of the major lipid loci from European discovery studies, except triglyceride loci in the Ugandan samples (10% of loci). Individual transferable loci are identified using trans-ethnic colocalization. Ten of fourteen loci not transferable to the Ugandan population have pleiotropic associations with BMI in Europeans; none of the transferable loci do. The non-transferable loci might affect lipids by modifying food intake in environments rich in certain nutrients, which suggests a potential role for gene-environment interactions.

Fine-scale population structure in the UK Biobank: implications for genome-wide association studies

Human Molecular Genetics ◽

10.1093/hmg/ddaa157 ◽

2020 ◽

Vol 29 (16) ◽

pp. 2803-2811

Author(s):

James P Cook ◽

Anubha Mahajan ◽

Andrew P Morris

Keyword(s):

Population Structure ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Fine Scale ◽

UK Biobank ◽

Genome Wide ◽

Scale Population ◽

The UK ◽

The Impact

Abstract The UK Biobank is a prospective study of more than 500 000 participants, which has aggregated data from questionnaires, physical measures, biomarkers, imaging and follow-up for a wide range of health-related outcomes, together with genome-wide genotyping supplemented with high-density imputation. Previous studies have highlighted fine-scale population structure in the UK on a North-West to South-East cline, but the impact of unmeasured geographical confounding on genome-wide association studies (GWAS) of complex human traits in the UK Biobank has not been investigated. We considered 368 325 white British individuals from the UK Biobank and performed GWAS of their birth location. We demonstrate that widely used approaches to adjust for population structure, including principal component analysis and mixed modelling with a random effect for a genetic relationship matrix, cannot fully account for the fine-scale geographical confounding in the UK Biobank. We observe significant genetic correlation of birth location with a range of lifestyle-related traits, including body-mass index and fat mass, hypertension and lung function, even after adjustment for population structure. Variants driving associations with birth location are also strongly associated with many of these lifestyle-related traits after correction for population structure, indicating that there could be environmental factors that are confounded with geography that have not been adequately accounted for. Our findings highlight the need for caution in the interpretation of lifestyle-related trait GWAS in UK Biobank, particularly in loci demonstrating strong residual association with birth location.
