Rare Non-coding Variation Identified by Large-Scale Whole Genome Sequencing Reveals Unexplained Heritability of Type 2 Diabetes
Type 2 diabetes (T2D) is increasing in all ancestry groups¹. Part of its genetic basis may reside among the rare (minor allele frequency <0.1%) variants that make up the vast majority of human genetic variation². We analyzed high-coverage (mean depth 38.2×) whole genome sequencing from 9,639 individuals with T2D and 34,994 controls in the NHLBI’s Trans-Omics for Precision Medicine (TOPMed) program². In the largest (European) ancestry subset, rare non-coding variants that are poorly captured by genotyping arrays or imputation panels contribute h² = 53% (P = 4.2×10⁻⁵) to the genetic component of risk. We coupled sequence variation with islet epigenomic signatures³ to annotate and group rare variants with respect to gene expression⁴, chromatin state⁵ and three-dimensional chromatin architecture⁶, and show that pancreatic islet regulatory elements contribute to T2D genetic risk (h² = 8%, P = 2.4×10⁻³). We then used islet annotation to create a non-coding framework for rare variant aggregation testing. This approach identified five loci containing rare alleles in islet regulatory elements that suggest novel biological mechanisms readily linked to hypotheses about variant-to-function. Large-scale whole genome sequence analysis reveals the substantial contribution of rare, non-coding variation to the genetic architecture of T2D and highlights the value of tissue-specific regulatory annotation for variant-to-function discovery.