Subtle stratification confounds estimates of heritability from rare variants

2016 ◽  
Author(s):  
Gaurav Bhatia ◽  
Alexander Gusev ◽  
Po-Ru Loh ◽  
Hilary Finucane ◽  
Bjarni J. Vilhjálmsson ◽  
...  

Abstract: Genome-wide significant associations generally explain only a small proportion of the narrow-sense heritability of complex disease (h²). While considerably more heritability is explained by all genotyped SNPs (h²_g), for most traits, much heritability remains missing (h²_g < h²). Rare variants, poorly tagged by genotyped SNPs, are a major potential source of the gap between h²_g and h². Recent efforts to assess the contribution of both sequenced and imputed rare variants to phenotypes suggest that substantial heritability may lie in these variants. Here we analyze sequenced SNPs, imputed SNPs and haploSNPs (haplotype variants constructed from within a sample, without using a reference panel) and show that studies of heritability from these variants may be strongly confounded by subtle population stratification. For example, when meta-analyzing heritability estimates from 22 randomly ascertained case-control traits from the GERA cohort, we observe a statistically significant increase in heritability explained by imputed SNPs even after correcting for principal components (PCs) from genotyped (or imputed) SNPs. However, this increase is eliminated when correcting for stratification using PCs from a larger number of haploSNPs. We note that subtle stratification may also impact estimates of heritability from array SNPs, although we find that this is generally a less severe problem. Overall, our results suggest that estimating the heritability explained by rare variants for case-control traits requires exquisite control for population stratification, but current methods may not provide this level of control.

2015 ◽  
Author(s):  
Gaurav Bhatia ◽  
Alexander Gusev ◽  
Po-Ru Loh ◽  
Bjarni J Vilhjálmsson ◽  
Stephan Ripke ◽  
...  

While genome-wide significant associations generally explain only a small proportion of the narrow-sense heritability of complex disease (h²), recent work has shown that more heritability is explained by all genotyped SNPs (h²_g). However, much of the heritability is still missing (h²_g < h²). For example, for schizophrenia, h² is estimated at 0.7-0.8 but h²_g is estimated at ~0.3. Efforts at increasing coverage through accurately imputed variants have yielded only small increases in the heritability explained, and poorly imputed variants can lead to assay artifacts for case-control traits. We propose to estimate the heritability explained by a set of haplotype variants (haploSNPs) constructed directly from the study sample (h²_hap). Our method constructs a set of haplotypes from phased genotypes by extending shared haplotypes subject to the 4-gamete test. In a large schizophrenia data set (PGC2-SCZ), haploSNPs with MAF > 0.1% explained substantially more phenotypic variance (h²_hap = 0.64 (S.E. 0.084)) than genotyped SNPs alone (h²_g = 0.32 (S.E. 0.029)). These estimates were based on cross-cohort comparisons, ensuring that cohort-specific assay artifacts did not contribute to our estimates. In a large multiple sclerosis data set (WTCCC2-MS), we observed an even larger difference between h²_hap and h²_g, though data from other cohorts will be required to validate this result. Overall, our results suggest that haplotypes of common SNPs can explain a large fraction of missing heritability of complex disease, shedding light on genetic architecture and informing disease mapping strategies.
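The abstract above says only that shared haplotypes are extended "subject to the 4-gamete test". As a minimal sketch of that test alone (not of the haploSNP construction, which the abstract does not detail), the Python below checks whether two biallelic sites carry all four possible gametes across a set of phased haplotypes. The function names and the 0/1 allele encoding are assumptions made for illustration.

```python
from itertools import combinations


def passes_four_gamete_test(site_a, site_b):
    """Return True if the two sites show at most three of the four gametes.

    site_a and site_b are equal-length sequences of 0/1 alleles, one entry
    per phased haplotype. Observing all of 00, 01, 10 and 11 implies either
    recombination between the sites or a recurrent mutation.
    """
    if len(site_a) != len(site_b):
        raise ValueError("sites must cover the same haplotypes")
    observed_gametes = set(zip(site_a, site_b))
    return len(observed_gametes) < 4


def first_incompatible_pair(haplotypes):
    """Return the first pair of site indices that fails the test, or None.

    haplotypes is a list of equal-length 0/1 lists (rows are chromosomes).
    """
    n_sites = len(haplotypes[0])
    columns = [[hap[j] for hap in haplotypes] for j in range(n_sites)]
    for i, j in combinations(range(n_sites), 2):
        if not passes_four_gamete_test(columns[i], columns[j]):
            return (i, j)
    return None


# Hypothetical toy data: sites 0 and 1 show all four gametes.
toy_haplotypes = [
    [0, 0, 0],
    [0, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
]
print(first_incompatible_pair(toy_haplotypes))
```

In this toy input, sites 0 and 2 are identical and therefore compatible, while sites 0 and 1 are not, so a haplotype spanning site 0 could not be extended across site 1 under this criterion.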


2011 ◽  
Vol 5 (S9) ◽  
Author(s):  
Libo Wang ◽  
Vitara Pungpapong ◽  
Yanzhu Lin ◽  
Min Zhang ◽  
Dabao Zhang

2020 ◽  
Vol 29 (5) ◽  
pp. 859-863 ◽  
Author(s):  
Genevieve H L Roberts ◽  
Stephanie A Santorico ◽  
Richard A Spritz

Abstract: Autoimmune vitiligo is a complex disease involving polygenic risk from at least 50 loci previously identified by genome-wide association studies. The objectives of this study were to estimate and compare vitiligo heritability in European-derived patients using both family-based and ‘deep imputation’ genotype-based approaches. We estimated family-based heritability (h²_FAM) by vitiligo recurrence among a total of 8034 first-degree relatives (3776 siblings, 4258 parents or offspring) of 2122 unrelated vitiligo probands. We estimated genotype-based heritability (h²_SNP) by deep imputation to Haplotype Reference Consortium and the 1000 Genomes Project data in 2812 unrelated vitiligo cases and 37 079 controls genotyped genome wide, achieving high-quality imputation from markers with minor allele frequency (MAF) as low as 0.0001. Heritability estimated by both approaches was exceedingly high; h²_FAM = 0.75–0.83 and h²_SNP = 0.78. These estimates are statistically identical, indicating there is essentially no remaining ‘missing heritability’ for vitiligo. Overall, ~70% of h²_SNP is represented by common variants (MAF > 0.01) and 30% by rare variants. These results demonstrate that essentially all vitiligo heritable risk is captured by array-based genotyping and deep imputation. These findings suggest that vitiligo may provide a particularly tractable model for investigation of complex disease genetic architecture and predictive aspects of personalized medicine.


2019 ◽  
Author(s):  
Weihua Meng ◽  
Mark J Adams ◽  
Colin NA Palmer ◽  
Jingchunzi Shi ◽  
Adam Auton ◽  
...  

Summary. Objective: Knee pain is one of the most common musculoskeletal complaints that brings people to medical attention. We sought to identify the genetic variants associated with knee pain in 171,516 subjects from the UK Biobank cohort and replicate them using cohorts from 23andMe, the Osteoarthritis Initiative (OAI), and the Johnston County Osteoarthritis Study (JoCo). Methods: We performed a genome-wide association study of knee pain in the UK Biobank, where knee pain was ascertained through self-report and defined as “knee pain in the last month interfering with usual activities”. A total of 22,204 cases and 149,312 controls were included in the discovery analysis. We tested our top and independent SNPs (P < 5 × 10⁻⁸) for replication in 23andMe, OAI, and JoCo, then performed a joint meta-analysis between discovery and replication cohorts using GWAMA. We calculated the narrow-sense heritability of knee pain using Genome-wide Complex Trait Analysis (GCTA). Results: We identified 2 loci that reached genome-wide significance: rs143384, located in GDF5 (P = 1.32 × 10⁻¹²), a gene previously implicated in osteoarthritis, and rs2808772, located near COL27A1 (P = 1.49 × 10⁻⁸). These findings were subsequently replicated in independent cohorts and increased in significance in the joint meta-analysis (rs143384: P = 4.64 × 10⁻¹⁸; rs2808772: P = 2.56 × 10⁻¹¹). The narrow-sense heritability of knee pain was 0.08. Conclusion: In this first reported genome-wide association meta-analysis of knee pain, we identified and replicated two loci in or near GDF5 and COL27A1 that are associated with knee pain.


2021 ◽  
Vol 9 ◽  
Author(s):  
Anwarul Karim ◽  
Clara Sze-Man Tang ◽  
Paul Kwong-Hang Tam

Hirschsprung disease (HSCR) is the leading cause of neonatal functional intestinal obstruction. It is a rare congenital disease with an incidence of one in 3,500–5,000 live births. HSCR is characterized by the absence of enteric ganglia in the distal colon, plausibly due to genetic defects perturbing the normal migration, proliferation, differentiation, and/or survival of the enteric neural crest cells as well as impaired interaction with the enteric progenitor cell niche. Early linkage analyses in Mendelian and syndromic forms of HSCR uncovered variants with large effects in major HSCR genes including RET, EDNRB, and their interacting partners in the same biological pathways. With the advances in genome-wide genotyping and next-generation sequencing technologies, there has been remarkable progress in understanding the genetic basis of HSCR in the past few years, with common and rare variants with small to moderate effects being uncovered. The discovery of new HSCR genes such as neuregulin and BACE2, as well as the deeper understanding of the roles and mechanisms of known HSCR genes, provided solid evidence that many HSCR cases take the form of a complex polygenic/oligogenic disorder in which rare variants act in the sensitized background of HSCR-associated common variants. This review summarizes the roadmap of genetic discoveries of HSCR from the earlier family-based linkage analyses to the recent population-based genome-wide analyses coupled with functional genomics, and how these discoveries facilitated our understanding of the genetic architecture of this complex disease and provide the foundation of clinical translation for precision and stratified medicine.


2020 ◽  
Author(s):  
Gaspard Kerner ◽  
Matthieu Bouaziz ◽  
Aurélie Cobat ◽  
Benedetta Bigio ◽  
Andrew T Timberlake ◽  
...  

Abstract: Whole-exome sequencing (WES) has facilitated the discovery of genetic lesions underlying monogenic disorders. Incomplete penetrance and variable expressivity suggest a contribution of additional genetic lesions to clinical manifestations and outcome. Some monogenic disorders may therefore actually be digenic. However, only a few digenic disorders have been reported, all discovered by candidate gene approaches applied to at least one locus. We propose here a novel two-locus genome-wide test for detecting digenic inheritance in WES data. This approach uses the gene as the unit of analysis and tests all pairs of genes to detect pairwise gene x gene interactions underlying disease. It is a case-only method, which has several advantages over classic case-control tests, in particular by avoiding recruitment and bias of controls. Our simulation studies based on real WES data identified two major sources of type I error inflation in this case-only test: linkage disequilibrium and population stratification. Both were corrected by specific procedures. Moreover, our case-only approach is more powerful than the corresponding case-control test for detecting digenic interactions in various population stratification scenarios. Finally, we validated our unbiased, genome-wide approach by successfully identifying a previously reported digenic lesion in patients with craniosynostosis. Our case-only test is a powerful and timely tool for detecting digenic inheritance in WES data from patients.
Significance statement: Despite a growing number of reports of rare disorders not fully explained by monogenic lesions, digenic inheritance has been reported for only 54 diseases to date. The very few existing methods for detecting gene x gene interactions from next-generation sequencing data were generally studied in rare-variant association studies with limited simulation analyses for short genomic regions, under a case-control design. We describe the first case-only approach designed specifically to search for digenic inheritance, which avoids recruitment and bias related to controls. We show, through both extensive simulation studies on real WES datasets and application to a real example of craniosynostosis, that our method is robust and powerful for the genome-wide identification of digenic lesions.
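The general idea behind a case-only interaction test can be sketched as follows: if two unlinked genes do not interact, carrier status at one gene should be independent of carrier status at the other among cases, so an excess of double carriers is evidence of interaction. The abstract does not give the authors' test statistic or their corrections for linkage disequilibrium and stratification, so the Pearson chi-square used here is a stand-in, and the function name and input encoding are hypothetical.

```python
import math


def case_only_chi2(carrier_gene_a, carrier_gene_b):
    """Pearson chi-square test of independence on a 2x2 table of cases.

    Each argument is a list of booleans, one per case, marking whether the
    case carries a qualifying variant in that gene. Returns (chi2, p_value)
    with one degree of freedom.
    """
    if len(carrier_gene_a) != len(carrier_gene_b):
        raise ValueError("both lists must describe the same cases")
    pairs = list(zip(carrier_gene_a, carrier_gene_b))
    both = sum(1 for a, b in pairs if a and b)
    a_only = sum(1 for a, b in pairs if a and not b)
    b_only = sum(1 for a, b in pairs if b and not a)
    neither = sum(1 for a, b in pairs if not a and not b)
    n = len(pairs)
    margins = (
        (both + a_only) * (b_only + neither)
        * (both + b_only) * (a_only + neither)
    )
    if margins == 0:
        # One gene has no carriers (or only carriers): nothing to test.
        return 0.0, 1.0
    chi2 = n * (both * neither - a_only * b_only) ** 2 / margins
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical toy cohort of 20 cases with perfectly co-occurring carriers.
gene_a = [True] * 10 + [False] * 10
gene_b = [True] * 10 + [False] * 10
chi2, p_value = case_only_chi2(gene_a, gene_b)
print(chi2, p_value)
```

For rare variants the expected cell counts are small, so an exact test would normally be preferred over the chi-square approximation; the sketch keeps to the standard library for brevity.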


2018 ◽  
Vol 45 (1-2) ◽  
pp. 1-17 ◽  
Author(s):  
Elizabeth E. Blue ◽  
Joshua C. Bis ◽  
Michael O. Dorschner ◽  
Debby W. Tsuang ◽  
Sandra M. Barral ◽  
...  

Background/Aims: The Alzheimer’s Disease Sequencing Project (ADSP) aims to identify novel genes influencing Alzheimer’s disease (AD). Variants within genes known to cause dementias other than AD have previously been associated with AD risk. We describe evidence of co-segregation and associations between variants in dementia genes and clinically diagnosed AD within the ADSP. Methods: We summarize the properties of known pathogenic variants within dementia genes, describe the co-segregation of variants annotated as “pathogenic” in ClinVar and new candidates observed in ADSP families, and test for associations between rare variants in dementia genes in the ADSP case-control study. The participants were clinically evaluated for AD, and they represent European, Caribbean Hispanic, and isolate Dutch populations. Results/Conclusions: Pathogenic variants in dementia genes were predominantly rare and conserved coding changes. Pathogenic variants within ARSA, CSF1R, and GRN were observed, and candidate variants in GRN and CHMP2B were nominated in ADSP families. An independent case-control study provided evidence of an association between variants in TREM2, APOE, ARSA, CSF1R, PSEN1, and MAPT and risk of AD. Variants in genes which cause dementing disorders may influence the clinical diagnosis of AD in a small proportion of cases within the ADSP.


2018 ◽  
Vol 25 (7) ◽  
pp. 909-917 ◽  
Author(s):  
Julia Y Mescheriakova ◽  
Annemieke JMH Verkerk ◽  
Najaf Amin ◽  
André G Uitterlinden ◽  
Cornelia M van Duijn ◽  
...  

Background: Multiple sclerosis (MS) is a complex disease resulting from the joint effect of many genes. It has been speculated that rare variants might explain part of the missing heritability of MS. Objective: To identify rare coding genetic variants by analyzing a large MS pedigree with 11 affected individuals in several generations. Methods: Genome-wide linkage screen and whole exome sequencing (WES) were performed to identify novel coding variants in the shared region(s) and in the known 110 MS risk loci. The candidate variants were then assessed in 591 MS patients and 3169 controls. Results: Suggestive evidence for linkage was obtained to 7q11.22-q11.23. In WES data, a rare missense variant p.R183C in FKBP6 was identified that segregated with the disease in this family. The minor allele frequency was higher in an independent cohort of MS patients than in healthy controls (1.27% vs 0.95%), but not significant (odds ratio (OR) = 1.33 (95% confidence interval (CI): 0.8–2.4), p = 0.31). Conclusion: The rare missense variant in FKBP6 was identified in a large Dutch MS family segregating with the disease. This association to MS was not found in an independent MS cohort. Overall, genome-wide studies in larger cohorts are needed to adequately investigate the role of rare variants in MS risk.
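As a rough check on the figures above, an allelic odds ratio can be recomputed from the two rounded minor allele frequencies. This is only a sketch: the abstract does not state how the published odds ratio was derived, and the function name is hypothetical.

```python
def allele_odds_ratio(freq_cases, freq_controls):
    """Odds ratio for carrying the minor allele, from two allele frequencies."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# Minor allele frequencies quoted in the abstract: 1.27% vs 0.95%.
or_estimate = allele_odds_ratio(0.0127, 0.0095)
print(round(or_estimate, 2))
```

The result is about 1.34, close to the reported 1.33; the small gap is what one would expect from working with rounded frequencies. The reported confidence interval (0.8–2.4) spans 1, which is consistent with the non-significant p-value.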


Genome ◽  
2017 ◽  
Vol 60 (7) ◽  
pp. 572-580 ◽  
Author(s):  
Rong-Cai Yang

Narrow-sense heritability (the portion of the total phenotypic variation attributable to additive genetic effects, h²) is a critical parameter in plant breeding and genetics, but its estimation is difficult for populations with unknown pedigree information. This study applied a marker-based linear mixed model (LMM) analysis to estimate narrow-sense heritability and its seven functional components corresponding to SNPs in coding and noncoding regions for each of 107 flowering, defense, ionomics, and developmental traits in an Arabidopsis (Arabidopsis thaliana) population of 199 inbred lines with unknown genetic relatedness. A genetic relationship matrix (GRM) based on 214 051 SNPs and component GRMs based on seven subsets of SNPs were computed for LMM estimation of h² and of the functional components contributing to h², respectively. The h² estimates for flowering traits were higher than those for defense, ionomics, and developmental traits, supporting a general view that fitness-related traits have lower heritabilities than other traits. The functional component owing to SNPs in coding (exon) regions was the smallest contributor to h². Our LMM analysis provides an opportunity to gain a comprehensive view of heritability and its functional components for populations with unknown structure but with genome-wide DNA markers.
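A GRM of the kind mentioned above can be sketched in a few lines. The abstract does not give the exact formula used in the study, so this example assumes a widely used standardization for diploid genotypes coded as 0/1/2 allele counts, in which each SNP is centered by 2p and scaled by sqrt(2p(1-p)); for fully inbred lines a different scaling may be more appropriate. The function name is hypothetical.

```python
def genetic_relationship_matrix(genotypes):
    """Compute an n x n GRM from allele counts.

    genotypes is a list of n individuals, each a list of m allele counts
    (0, 1 or 2). Monomorphic SNPs are skipped because they carry no
    information and would cause a division by zero.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    standardized = [[] for _ in range(n)]
    for j in range(m):
        column = [genotypes[i][j] for i in range(n)]
        p = sum(column) / (2.0 * n)  # sample allele frequency
        if p == 0.0 or p == 1.0:
            continue
        scale = (2.0 * p * (1.0 - p)) ** 0.5
        for i in range(n):
            standardized[i].append((column[i] - 2.0 * p) / scale)
    n_used = len(standardized[0])
    if n_used == 0:
        raise ValueError("no polymorphic SNPs available")
    return [
        [
            sum(a * b for a, b in zip(standardized[i], standardized[k])) / n_used
            for k in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical toy data: three individuals, two SNPs.
toy_genotypes = [
    [0, 2],
    [2, 0],
    [1, 1],
]
grm = genetic_relationship_matrix(toy_genotypes)
for row in grm:
    print([round(value, 3) for value in row])
```

A component GRM for a functional category would be obtained the same way, restricting the columns to the SNPs in that category before calling the function.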

