The sequences of 150,119 genomes in the UK biobank

2021 ◽  
Author(s):  
Bjarni V. Halldorsson ◽  
Hannes P. Eggertsson ◽  
Kristjan H.S. Moore ◽  
Hannes Hauswedell ◽  
Ogmundur Eiriksson ◽  
...  

We describe the analysis of whole genome sequencing (WGS) of 150,119 individuals from the UK biobank (UKB). This yielded a set of high quality variants, including 585,040,410 SNPs, representing 7.0% of all possible human SNPs, and 58,707,036 indels. The large set of variants allows us to characterize selection based on sequence variation within a population through a Depletion Rank (DR) score for windows along the genome. DR analysis shows that coding exons represent a small fraction of regions in the genome subject to strong sequence conservation. We define three cohorts within the UKB, a large British Irish cohort (XBI) and smaller African (XAF) and South Asian (XSA) cohorts. A haplotype reference panel is provided that allows reliable imputation of most variants carried by three or more sequenced individuals. We identified 895,055 structural variants and 2,536,688 microsatellites, groups of variants typically excluded from large scale WGS studies. Using this formidable new resource, we provide several noteworthy examples of trait associations with rare variants with large effects not found previously through studies based on exome sequencing and/or imputation.

2021 ◽  
Author(s):  
Abhishek Nag ◽  
Lawrence Middleton ◽  
Ryan S Dhindsa ◽  
Dimitrios Vitsios ◽  
Eleanor M Wigmore ◽  
...  

Genome-wide association studies have established the contribution of common and low frequency variants to metabolic biomarkers in the UK Biobank (UKB); however, the role of rare variants remains to be assessed systematically. We evaluated rare coding variants for 198 metabolic biomarkers, including metabolites assayed by Nightingale Health, using exome sequencing in participants from four genetically diverse ancestries in the UKB (N=412,394). Gene-level collapsing analysis, which evaluated a range of genetic architectures, identified a total of 1,303 significant relationships between genes and metabolic biomarkers (p < 1×10⁻⁸), encompassing 207 distinct genes. These include associations between rare non-synonymous variants in GIGYF1 and glucose and lipid biomarkers, SYT7 and creatinine, and others, which may provide insights into novel disease biology. Compared with a previous microarray-based genotyping study in the same cohort, we observed that 40% of gene-biomarker relationships identified in the collapsing analysis were novel. Finally, we applied Gene-SCOUT, a novel tool that utilises the gene-biomarker association statistics from the collapsing analysis to identify genes having similar biomarker fingerprints and thus expand our understanding of gene networks.


2020 ◽  
Author(s):  
David Curtis

Rare genetic variants in LDLR, APOB and PCSK9 are known causes of familial hypercholesterolaemia and it is expected that rare variants in other genes will also have effects on hyperlipidaemia risk although such genes remain to be identified. The UK Biobank consists of a sample of 500,000 volunteers and exome sequence data is available for 50,000 of them. 11,490 of these were classified as hyperlipidaemia cases on the basis of having a relevant diagnosis recorded and/or taking lipid-lowering medication while the remaining 38,463 were treated as controls. Variants in each gene were assigned weights according to rarity and predicted impact and overall weighted burden scores were compared between cases and controls, including population principal components as covariates. One biologically plausible gene, HUWE1, produced statistically significant evidence for association after correction for testing 22,028 genes with a signed log10 p value (SLP) of -6.15, suggesting a protective effect of variants in this gene. Other genes with uncorrected p<0.001 are arguably also of interest, including LDLR (SLP=3.67), RBP2 (SLP=3.14), NPFFR1 (SLP=3.02) and ACOT9 (SLP=-3.19). Gene set analysis indicated that rare variants in genes involved in metabolism and energy can influence hyperlipidaemia risk. Overall, the results provide some leads which might be followed up with functional studies and which could be tested in additional data sets as these become available. This research has been conducted using the UK Biobank Resource.
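The significance claim for HUWE1 can be checked with a little arithmetic. The sketch below assumes a Bonferroni correction at alpha = 0.05 over the 22,028 genes; the abstract says only that results were corrected for testing 22,028 genes, so the correction method and alpha are assumptions. The sign convention (negative SLP for a protective effect) follows the abstract, and the function names are illustrative.

```python
import math


def signed_log_p(p_value: float, risk_increasing: bool) -> float:
    """Return the signed log10 p value (SLP).

    The magnitude is -log10(p); the sign is positive when variants are
    associated with higher risk and negative when they appear protective.
    """
    magnitude = -math.log10(p_value)
    return magnitude if risk_increasing else -magnitude


def bonferroni_slp_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Absolute SLP needed to pass a Bonferroni correction (assumed method)."""
    return -math.log10(alpha / n_tests)


N_GENES = 22028  # number of genes tested, from the abstract
threshold = bonferroni_slp_threshold(N_GENES)

# SLP values as reported in the abstract.
reported = {"HUWE1": -6.15, "LDLR": 3.67, "RBP2": 3.14, "NPFFR1": 3.02, "ACOT9": -3.19}
significant = {gene: abs(slp) > threshold for gene, slp in reported.items()}

for gene, slp in reported.items():
    print(f"{gene}: SLP={slp:+.2f}, passes correction: {significant[gene]}")
```

Under these assumptions the threshold is an absolute SLP of about 5.64, so only HUWE1 passes, which agrees with the abstract describing the other genes as having uncorrected p<0.001.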


2020 ◽  
Author(s):  
Roni Rasnic ◽  
Nathan Linial ◽  
Michal Linial

It is estimated that up to 10% of cancer incidents are attributed to inherited genetic alterations. Despite extensive research, there are still gaps in our understanding of genetic predisposition to cancer. It was theorized that ultra-rare variants partially account for the missing heritable component. We harness the UK BioBank dataset of ∼500,000 individuals, 14% of whom were diagnosed with cancer, to detect ultra-rare, possibly high-penetrance cancer predisposition variants. We report on 115 cancer-exclusive ultra-rare variations (CUVs) and nominate 26 variants with additional independent evidence as cancer predisposition variants. We conclude that population cohorts are a valuable source for expanding the collection of novel cancer predisposition genes.


2019 ◽  
Vol 29 ◽  
pp. S125-S126
Author(s):  
Amanda Gentry ◽  
Roseann Peterson ◽  
Alexis Edwards ◽  
Brien Riley ◽  
B. Todd Webb

2018 ◽  
Vol 115 (43) ◽  
pp. 11018-11023 ◽  
Author(s):  
Eric Jorgenson ◽  
Navneet Matharu ◽  
Melody R. Palmer ◽  
Jie Yin ◽  
Jun Shan ◽  
...  

Erectile dysfunction affects millions of men worldwide. Twin studies support the role of genetic risk factors underlying erectile dysfunction, but no specific genetic variants have been identified. We conducted a large-scale genome-wide association study of erectile dysfunction in 36,649 men in the multiethnic Kaiser Permanente Northern California Genetic Epidemiology Research in Adult Health and Aging cohort. We also undertook replication analyses in 222,358 men from the UK Biobank. In the discovery cohort, we identified a single locus (rs17185536-T) on chromosome 6 near the single-minded family basic helix-loop-helix transcription factor 1 (SIM1) gene that was significantly associated with the risk of erectile dysfunction (odds ratio = 1.26, P = 3.4 × 10⁻²⁵). The association replicated in the UK Biobank sample (odds ratio = 1.25, P = 6.8 × 10⁻¹⁴), and the effect is independent of known erectile dysfunction risk factors, including body mass index (BMI). The risk locus resides on the same topologically associating domain as SIM1 and interacts with the SIM1 promoter, and the rs17185536-T risk allele showed differential enhancer activity. SIM1 is part of the leptin–melanocortin system, which has an established role in body weight homeostasis and sexual function. Because the variants associated with erectile dysfunction are not associated with differences in BMI, our findings suggest a mechanism that is specific to sexual function.


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Julie A. Fitzpatrick ◽  
Nicolas Basty ◽  
Madeleine Cule ◽  
Yi Liu ◽  
Jimmy D. Bell ◽  
...  

Psoas muscle measurements are frequently used as markers of sarcopenia and predictors of health. Manually measured cross-sectional areas are most commonly used, but there is a lack of consistency regarding the position of the measurement and manual annotations are not practical for large population studies. We have developed a fully automated method to measure iliopsoas muscle volume (comprising the psoas and iliacus muscles) using a convolutional neural network. Magnetic resonance images were obtained from the UK Biobank for 5000 participants, balanced for age, gender and BMI. Ninety manual annotations were available for model training and validation. The model showed excellent performance against out-of-sample data (average Dice score coefficient of 0.9046 ± 0.0058 for six-fold cross-validation). Iliopsoas muscle volumes were successfully measured in all 5000 participants. Iliopsoas volume was greater in male compared with female subjects. There was a small but significant asymmetry between left and right iliopsoas muscle volumes. We also found that iliopsoas volume was significantly related to height, BMI and age, and that there was an acceleration in muscle volume decrease in men with age. Our method provides a robust technique for measuring iliopsoas muscle volume that can be applied to large cohorts.
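For readers unfamiliar with the evaluation metric, the Dice coefficient between an automated and a manual segmentation is twice the number of overlapping voxels divided by the total number of labelled voxels in both masks. The sketch below is a minimal illustration on toy binary masks; the function name and the toy data are hypothetical and are not taken from the study.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks given as flat 0/1 sequences.

    Dice = 2 * |pred AND truth| / (|pred| + |truth|), ranging from 0
    (no overlap) to 1 (identical masks).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Both masks are empty; treat them as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# Toy example: 8 voxels, each mask labels 4 of them, 3 in common.
manual = [0, 1, 1, 1, 0, 0, 1, 0]
automated = [0, 1, 1, 0, 0, 0, 1, 1]
score = dice_coefficient(automated, manual)
print(f"Dice = {score:.2f}")
```

A real evaluation would flatten the 3D label volumes first and average the score over the cross-validation folds.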


2012 ◽  
Vol 30 (15_suppl) ◽  
pp. TPS10633-TPS10633 ◽  
Author(s):  
Emily Shaw ◽  
Alice Tuff ◽  
Rowena Sharpe ◽  
Louise K. Jones ◽  
Tarita Turtiaien ◽  
...  

TPS10633 Background: Molecular analysis of tumours may be used to identify those predicted to benefit from novel targeted therapies. The Cancer Research UK programme is piloting plans to apply such testing broadly across the UK healthcare system, linking molecular phenotype to clinical outcomes. Methods: The Stratified Medicine Programme (SMP) aims to develop a model for high quality, large-scale molecular characterization of cancer specimens through an initiative developed in partnership with AstraZeneca, Pfizer, the UK Department of Health and academic researchers. Phase One of the SMP is a two year feasibility study. It aims to demonstrate the submission of consented blood samples and sections of surplus diagnostic formalin-fixed paraffin-embedded tumour tissue from 9,000 patients at centres across the UK to one of three ‘technology hubs’ for mutation testing of genes of potential clinical interest (KRAS, BRAF, NRAS, PIK3CA, TP53, PTEN, TMPRSS2-ERG, EGFR, EML4-ALK and KIT) in six selected tumour types. The tests are technically validated and will be completed in clinically relevant timescales. Data including pathological and treatment information and clinical outcome is also collected for the recruited patients, linked to the genetic data and stored in a central data repository hosted within the National Cancer Registration Service. The study opened in September 2011 at 7 sites across the UK and by the end of 2011, 760 patients with breast, lung, prostate, colorectal, ovarian cancer or metastatic malignant melanoma had consented to participate. 142 sets of molecular results had been returned to clinical teams. Updated figures will be presented at the meeting, by which time the programme is projected to have accrued 4000 subjects.
By 2013, we hope to have developed a scalable model for routine, high quality, prospective molecular characterisation of tumours for NHS cancer patients, with consent for the collection, storage and research use of population-scale genetic and clinical outcome data. We will report the emerging results from the Stratified Medicine Programme and early insights into implications for wider implementation across the UK healthcare system.


2021 ◽  
Author(s):  
Jonathan Sulc ◽  
Jenny Sjaarda ◽  
Zoltan Kutalik

Causal inference is a critical step in improving our understanding of biological processes and Mendelian randomisation (MR) has emerged as one of the foremost methods to efficiently interrogate diverse hypotheses using large-scale, observational data from biobanks. Although many extensions have been developed to address the three core assumptions of MR-based causal inference (relevance, exclusion restriction, and exchangeability), most approaches implicitly assume that any putative causal effect is linear. Here we propose PolyMR, an MR-based method which provides a polynomial approximation of an (arbitrary) causal function between an exposure and an outcome. We show that this method infers the shape and magnitude of causal functions with greater accuracy than existing methods. We applied this method to data from the UK Biobank, testing for effects between anthropometric traits and continuous health-related phenotypes and found most of these (84%) to have causal effects which deviate significantly from linear. These deviations ranged from slight attenuation at the extremes of the exposure distribution, to large changes in the magnitude of the effect across the range of the exposure (e.g. a 1 kg/m² change in BMI having stronger effects on glucose levels if the initial BMI was higher), to non-monotonic causal relationships (e.g. the effects of BMI on cholesterol forming an inverted U shape). Finally, we show that the linearity assumption of the causal effect may lead to the misinterpretation of health risks at the individual level or heterogeneous effect estimates when using cohorts with differing average exposure levels.
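The practical consequence of a non-linear causal function can be illustrated without any estimation machinery. The sketch below does not implement PolyMR; it only evaluates a polynomial causal function and shows that the effect of a one-unit increase in exposure depends on the starting value. The coefficients, the centred exposure scale, and the function names are all hypothetical.

```python
def causal_function(x, coeffs):
    """Evaluate f(x) = c1*x + c2*x**2 + ... (no intercept term)."""
    return sum(c * x ** (k + 1) for k, c in enumerate(coeffs))


def unit_change_effect(x0, coeffs, delta=1.0):
    """Change in outcome when the exposure moves from x0 to x0 + delta."""
    return causal_function(x0 + delta, coeffs) - causal_function(x0, coeffs)


# Hypothetical coefficient sets on a centred exposure scale.
linear = [0.5]            # same effect everywhere
convex = [0.5, 0.1]       # stronger effect at higher exposure
inverted_u = [1.0, -0.5]  # effect changes sign past the peak

for name, coeffs in [("linear", linear), ("convex", convex), ("inverted U", inverted_u)]:
    low = unit_change_effect(-1.0, coeffs)
    high = unit_change_effect(1.0, coeffs)
    print(f"{name}: effect from -1 is {low:+.2f}, effect from +1 is {high:+.2f}")
```

With the convex coefficients the same unit increase has twice the effect at the higher starting point, and with the inverted-U coefficients the sign flips, which mirrors the two kinds of deviation the abstract describes. A cohort centred at a different average exposure would therefore yield a different linear estimate.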


Author(s):  
G David Batty ◽  
Bamba Gaye ◽  
Catharine R Gale ◽  
Mark Hamer ◽  
Camille Lassale

Ethnic inequalities in coronavirus disease 2019 (COVID-19) hospitalizations and mortality have been widely reported but there is scant understanding of how they are embodied. The UK Biobank prospective cohort study comprises around half a million people who were aged 40-69 years at study induction between 2006 and 2010 when information on ethnic background and potential explanatory factors was captured. Study members were prospectively linked to a national mortality registry. In an analytical sample of 448,664 individuals (248,820 women), 705 deaths were ascribed to COVID-19 between 5th March, 2020 and 24th January, 2021. In age- and sex-adjusted analyses, relative to White participants, Black study members experienced around five times the risk of COVID-19 mortality (odds ratio; 95% confidence interval: 4.81; 3.28, 7.05), while there was a doubling in the South Asian group (2.05; 1.30, 3.25). Controlling for baseline comorbidities, social factors (including socioeconomic circumstances), and lifestyle indices attenuated this risk differential by 34% in Black study members (2.84; 1.91, 4.23) and 37% in South Asian individuals (1.57; 0.97, 2.55). The residual risk of COVID-19 deaths in ethnic minority groups may be ascribed to a range of unmeasured characteristics and requires further exploration.
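The abstract does not say how the attenuation percentages were derived. One common convention, assumed here, is to compare odds ratios on the log scale; under that assumption the reported 34% and 37% can be reproduced from the odds ratios in the abstract. The function name is illustrative.

```python
import math


def percent_attenuation(or_basic: float, or_adjusted: float) -> float:
    """Percentage reduction in the log odds ratio after further adjustment.

    Assumes attenuation is measured on the log-odds scale:
    100 * (ln(OR_basic) - ln(OR_adjusted)) / ln(OR_basic).
    """
    log_basic = math.log(or_basic)
    log_adjusted = math.log(or_adjusted)
    return 100.0 * (log_basic - log_adjusted) / log_basic


# Odds ratios as reported: age- and sex-adjusted, then fully adjusted.
black = percent_attenuation(4.81, 2.84)
south_asian = percent_attenuation(2.05, 1.57)

print(f"Black study members: {black:.1f}% attenuation")
print(f"South Asian individuals: {south_asian:.1f}% attenuation")
```

Both values round to the figures in the abstract (34% and 37%). Measuring attenuation on the excess odds ratio instead, (OR_basic - OR_adjusted) / (OR_basic - 1), would give noticeably larger percentages, so the log-scale reading appears to be the one consistent with the reported numbers.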

