A Saturated Map of Common Genetic Variants Associated with Human Height from 5.4 Million Individuals of Diverse Ancestries

2022 ◽  
Author(s):  
Loic Yengo ◽  
Sailaja Vedantam ◽  
Eirini Marouli ◽  
Julia Sidorenko ◽  
Eric Bartell ◽  
...  

Common SNPs are predicted to collectively explain 40-50% of phenotypic variation in human height, but identifying the specific variants and associated regions requires huge sample sizes. Here we show, using GWAS data from 5.4 million individuals of diverse ancestries, that 12,111 independent SNPs that are significantly associated with height account for nearly all of the common SNP-based heritability. These SNPs are clustered within 7,209 non-overlapping genomic segments with a median size of ~90 kb, covering ~21% of the genome. The density of independent associations varies across the genome and the regions of elevated density are enriched for biologically relevant genes. In out-of-sample estimation and prediction, the 12,111 SNPs account for 40% of phenotypic variance in European ancestry populations but only ~10%-20% in other ancestries. Effect sizes, associated regions, and gene prioritization are similar across ancestries, indicating that reduced prediction accuracy is likely explained by linkage disequilibrium and allele frequency differences within associated regions. Finally, we show that the relevant biological pathways are detectable with smaller sample sizes than needed to implicate causal genes and variants. Overall, this study, the largest GWAS to date, provides an unprecedented saturated map of specific genomic regions containing the vast majority of common height-associated variants.

2016 ◽  
Vol 46 (10) ◽  
pp. 2059-2069 ◽  
Author(s):  
L. C. Bidwell ◽  
R. H. C. Palmer ◽  
L. Brick ◽  
J. E. McGeary ◽  
V. S. Knopik

Background: Heritability estimates from twin studies of the multi-faceted phenotype of nicotine dependence (ND) range from moderate to high (31–60%), but vary substantially based on the specific ND-related construct examined. The current study estimated the aggregate role of common genetic variants on key ND constructs. Method: Genomic-relationship-matrix restricted maximum likelihood (GREML) was used to decompose phenotypic variance across multiple ND indices using 796 125 polymorphisms from 2346 unrelated ‘lifetime ever smokers’ of European ancestry. Measures included DSM-IV ND and Fagerström Test for Nicotine Dependence (FTND) summary measures and constituent constructs (e.g. withdrawal severity, tolerance, heaviness of smoking and time spent smoking). Exploratory and confirmatory factor models were used to describe the covariance structure across ND measures; resulting factor(s) were the subject(s) of GREML analyses. Results: Factor models indicated highly correlated DSM-IV and FTND factors for ND (0.545, 95% confidence interval 0.50–0.60) that could be represented as a higher-order factor (NIC DEP). Additive genetic influence on NIC DEP was 33% (s.e. = 0.14, p = 0.009). Post-hoc analyses indicated moderate genetic effects on the DSM-IV (34%, s.e. = 0.14, p = 0.008) and FTND (26%, s.e. = 0.14, p = 0.032) factors, both of which were influenced by the same genetic effects (rG-SNP = 1.00, s.e. = 0.09, p < 0.00001). Conclusions: Overall, common single nucleotide polymorphisms accounted for a large proportion of the genetic influences on ND-related phenotypes that have been observed in twin studies. Genetic contributions across distinct ND scales were largely influenced by shared genetic factors.


2017 ◽  
Author(s):  
Louis Lello ◽  
Steven G. Avery ◽  
Laurent Tellier ◽  
Ana I. Vazquez ◽  
Gustavo de los Campos ◽  
...  

Abstract: We construct genomic predictors for heritable and extremely complex human quantitative traits (height, heel bone density, and educational attainment) using modern methods in high dimensional statistics (i.e., machine learning). Replication tests show that these predictors capture, respectively, ∼40, 20, and 9 percent of total variance for the three traits. For example, predicted heights correlate ∼0.65 with actual height; actual heights of most individuals in validation samples are within a few cm of the prediction. The variance captured for height is comparable to the estimated SNP heritability from GCTA (GREML) analysis, and seems to be close to its asymptotic value (i.e., as sample size goes to infinity), suggesting that we have captured most of the heritability for the SNPs used. Thus, our results resolve the common SNP portion of the “missing heritability” problem – i.e., the gap between prediction R-squared and SNP heritability. The ∼20k activated SNPs in our height predictor reveal the genetic architecture of human height, at least for common SNPs. Our primary dataset is the UK Biobank cohort, comprised of almost 500k individual genotypes with multiple phenotypes. We also use other datasets and SNPs found in earlier GWAS for out-of-sample validation of our results.
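The two height figures in the abstract above can be checked against each other: the squared Pearson correlation between predicted and observed values gives the fraction of variance the predictor captures. The sketch below is illustrative only. The function name `variance_explained` is hypothetical, and the only input taken from the abstract is the reported correlation of about 0.65.

```python
def variance_explained(correlation: float) -> float:
    """Return R-squared given the predictor-phenotype Pearson correlation."""
    if not -1.0 <= correlation <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    return correlation ** 2


# Correlation between predicted and actual height reported in the abstract (approximate).
height_r = 0.65
height_r2 = variance_explained(height_r)

# Print the implied share of variance captured by the predictor.
print(f"r = {height_r:.2f} -> R^2 = {height_r2:.3f}")
```

The result is a little over 0.42, which agrees with the roughly 40 percent of variance the abstract reports for height.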


2018 ◽  
Author(s):  
Max Lam ◽  
Chia-Yen Chen ◽  
Zhiqiang Li ◽  
Alicia R. Martin ◽  
Julien Bryois ◽  
...  

Author summary: Schizophrenia is a severe psychiatric disorder with a lifetime risk of about 1% world-wide. Most large schizophrenia genetic studies have studied people of primarily European ancestry, potentially missing important biological insights. Here we present a study of East Asian participants (22,778 schizophrenia cases and 35,362 controls), identifying 21 genome-wide significant schizophrenia associations in 19 genetic loci. Over the genome, the common genetic variants that confer risk for schizophrenia have highly similar effects in those of East Asian and European ancestry (rg=0.98), indicating for the first time that the genetic basis of schizophrenia and its biology are broadly shared across these world populations. A fixed-effect meta-analysis including individuals from East Asian and European ancestries revealed 208 genome-wide significant schizophrenia associations in 176 genetic loci (53 novel). Trans-ancestry fine-mapping more precisely isolated schizophrenia causal alleles in 70% of these loci. Despite consistent genetic effects across populations, polygenic risk models trained in one population have reduced performance in the other, highlighting the importance of including all major ancestral groups with sufficient sample size to ensure the findings have maximum relevance for all populations.


2014 ◽  
Vol 46 (16) ◽  
pp. 571-582 ◽  
Author(s):  
P. Carbonetto ◽  
R. Cheng ◽  
J. P. Gyekis ◽  
C. C. Parker ◽  
D. A. Blizard ◽  
...  

The genes underlying variation in skeletal muscle mass are poorly understood. Although many quantitative trait loci (QTLs) have been mapped in crosses of mouse strains, the limited resolution inherent in these conventional studies has made it difficult to reliably pinpoint the causal genetic variants. The accumulated recombination events in an advanced intercross line (AIL), in which mice from two inbred strains are mated at random for several generations, can improve mapping resolution. We demonstrate these advancements in mapping QTLs for hindlimb muscle weights in an AIL (n = 832) of the C57BL/6J (B6) and DBA/2J (D2) strains, generations F8–F13. We mapped muscle weight QTLs using the high-density MegaMUGA SNP panel. The QTLs highlight the shared genetic architecture of four hindlimb muscles and suggest that the genetic contributions to muscle variation are substantially different in males and females, at least in the B6D2 lineage. Out of the 15 muscle weight QTLs identified in the AIL, nine overlapped the genomic regions discovered in an earlier B6D2 F2 intercross. Mapping resolution, however, was substantially improved in our study to a median QTL interval of 12.5 Mb. Subsequent sequence analysis of the QTL regions revealed 20 genes with nonsense or potentially damaging missense mutations. Further refinement of the muscle weight QTLs using additional functional information, such as gene expression differences between alleles, will be important for discerning the causal genes.


2021 ◽  
Author(s):  
Robin N Beaumont ◽  
Isabelle K Mayne ◽  
Rachel M Freathy ◽  
Caroline F Wright

Abstract: Birth weight is an important factor in newborn survival; both low and high birth weights are associated with adverse later-life health outcomes. Genome-wide association studies (GWAS) have identified 190 loci associated with maternal or fetal effects on birth weight. Knowledge of the underlying causal genes is crucial to understand how these loci influence birth weight and the links between infant and adult morbidity. Numerous monogenic developmental syndromes are associated with birth weights at the extreme ends of the distribution. Genes implicated in those syndromes may provide valuable information to prioritize candidate genes at the GWAS loci. We examined the proximity of genes implicated in developmental disorders (DDs) to birth weight GWAS loci using simulations to test whether they fall disproportionately close to the GWAS loci. We found that birth weight GWAS single nucleotide polymorphisms (SNPs) fall closer to such genes than expected, both when the DD gene is the nearest gene to the birth weight SNP and when examining all genes within 258 kb of the SNP. This enrichment was driven by genes causing monogenic DDs with dominant modes of inheritance. We found examples of SNPs in the intron of one gene marking plausible effects via different nearby genes, highlighting that the closest gene to the SNP is not necessarily the functionally relevant gene. This is the first application of this approach to birth weight, which has helped identify GWAS loci likely to have direct fetal effects on birth weight, which could not previously be classified as fetal or maternal owing to insufficient statistical power.
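The simulation-based proximity test described in the abstract above can be sketched in general terms as follows. This is a simplified illustration, not the study's procedure: the abstract does not describe the simulation design, so the one-dimensional genome, the uniform placement of simulated SNPs, the mean nearest-gene distance statistic, the toy coordinates, and the names `nearest_distance` and `proximity_p_value` are all assumptions made for this example.

```python
import bisect
import random


def nearest_distance(position, sorted_genes):
    """Distance from a position to the closest entry in a sorted, non-empty list of gene positions."""
    i = bisect.bisect_left(sorted_genes, position)
    candidates = []
    if i < len(sorted_genes):
        candidates.append(sorted_genes[i] - position)   # nearest gene at or to the right
    if i > 0:
        candidates.append(position - sorted_genes[i - 1])  # nearest gene to the left
    return min(candidates)


def proximity_p_value(snps, genes, genome_length, n_sim=1000, seed=0):
    """Empirical one-sided p-value: how often is a random SNP set at least as
    close to the genes (by mean nearest-gene distance) as the observed SNPs?"""
    rng = random.Random(seed)
    sorted_genes = sorted(genes)

    def mean_dist(positions):
        return sum(nearest_distance(p, sorted_genes) for p in positions) / len(positions)

    observed = mean_dist(snps)
    hits = 0
    for _ in range(n_sim):
        # Assumption: simulated SNPs are placed uniformly at random along the genome.
        simulated = [rng.randrange(genome_length) for _ in snps]
        if mean_dist(simulated) <= observed:
            hits += 1
    # Add-one correction so the empirical p-value is never exactly zero.
    return (hits + 1) / (n_sim + 1)


# Toy coordinates (hypothetical): three genes and three SNPs lying close to them.
genes = [100_000, 400_000, 750_000]
snps_near = [101_000, 399_500, 751_200]
p = proximity_p_value(snps_near, genes, genome_length=1_000_000)
print(f"empirical p-value: {p:.4f}")
```

A realistic analysis would probably draw simulated SNPs so that they match the observed ones on properties such as local gene density or allele frequency, rather than uniformly; uniform placement is used here only to keep the sketch short.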


2006 ◽  
Vol 04 (03) ◽  
pp. 639-647 ◽  
Author(s):  
ELEAZAR ESKIN ◽  
RODED SHARAN ◽  
ERAN HALPERIN

The common approaches for haplotype inference from genotype data are targeted toward phasing short genomic regions. Longer regions are often tackled in a heuristic manner, due to the high computational cost. Here, we describe a novel approach for phasing genotypes over long regions, which is based on combining information from local predictions on short, overlapping regions. The phasing is done in a way that maximizes a natural maximum likelihood criterion. Among other things, this criterion takes into account the physical length between neighboring single nucleotide polymorphisms. The approach is very efficient; it is applied to several large-scale datasets and is shown to be successful in two recent benchmarking studies (Zaitlen et al., in press; Marchini et al., in preparation). Our method is publicly available via a webserver at .


Circulation ◽  
2007 ◽  
Vol 116 (suppl_16) ◽  
Author(s):  
Allison B Lehtinen ◽  
Christopher Newton-Cheh ◽  
Julie T Ziegler ◽  
Carl D Langefeld ◽  
Barry I Freedman ◽  
...  

Background: Prolongation of the electrocardiographic QT interval is a risk factor for sudden cardiac death (SCD) in unselected samples as well as in post-myocardial infarction patients or those with diabetes. Common genetic variants in the nitric oxide synthase 1 adaptor protein (NOS1AP) gene have been reported to be associated with QT interval duration in individuals of European ancestry. We sought to replicate the association of NOS1AP variants with QT interval duration in pedigrees enriched for type 2 diabetes mellitus (T2DM). Methods and Results: Two single nucleotide polymorphisms (SNPs) in the NOS1AP gene, rs10494366 and rs10918594, were genotyped in a collection of 937 European Americans (EAs) and 177 African Americans (AAs) in 450 pedigrees containing at least two siblings with T2DM. An additive genetic model was tested for each SNP in ancestry-specific analyses using SOLAR in the total sample and in the diabetic subset (EA n=778, AA n=159), with and without exclusion of QT-altering medications. In the EA individuals, rs10494366 minor allele homozygotes had an 8.9 msec longer mean QT interval compared to major homozygotes (additive model p=4.4×10⁻³); rs10918594 minor homozygotes had a 12.9 msec longer mean QT interval compared to major homozygotes (p=9.9×10⁻⁵). Excluding users of QT-altering medications in the diabetic-only EA sample (n=514) strengthened the association despite the reduction in sample size (20.6 msec difference, p=2.0×10⁻⁵; 23.4 msec difference, p=8.9×10⁻⁷, respectively). No association between the NOS1AP SNPs and QT interval duration was observed in the limited number of AA individuals examined. Conclusions: Two NOS1AP SNPs are strongly associated with QT interval duration in a predominately diabetic EA sample. Stronger effects of NOS1AP variants in diabetic individuals compared to previously reported unselected samples suggest that this patient subset may be particularly susceptible to genetic variants that influence myocardial depolarization and repolarization as manifest in the QT interval.


2012 ◽  
Vol 78 (7) ◽  
pp. 2435-2442 ◽  
Author(s):  
Marie Foulongne-Oriol ◽  
Anne Rodier ◽  
Jean-Michel Savoie

Abstract: Dry bubble, caused by Lecanicillium fungicola, is one of the most detrimental diseases affecting button mushroom cultivation. In a previous study, we demonstrated that breeding for resistance to this pathogen is quite challenging due to its quantitative inheritance. A second-generation hybrid progeny derived from an intervarietal cross between a wild strain and a commercial cultivar was characterized for L. fungicola resistance under artificial inoculation in three independent experiments. Analysis of quantitative trait loci (QTL) was used to determine the locations, numbers, and effects of genomic regions associated with dry-bubble resistance. Four traits related to resistance were analyzed. Two to four QTL were detected per trait, depending on the experiment. Two genomic regions, on linkage group X (LGX) and LGVIII, were consistently detected in the three experiments. The genomic region on LGX was detected for three of the four variables studied. The total phenotypic variance accounted for by all QTL ranged from 19.3% to 42.1% over all traits in all experiments. For most of the QTL, the favorable allele for resistance came from the wild parent, but for some QTL, the allele that contributed to a higher level of resistance was carried by the cultivar. Comparative mapping with QTL for yield-related traits revealed five colocations between resistance and yield component loci, suggesting that the resistance results from both genetic factors and fitness expression. The consequences for mushroom breeding programs are discussed.


2019 ◽  
Author(s):  
Matthew Dapas ◽  
Frederick T. J. Lin ◽  
Girish N. Nadkarni ◽  
Ryan Sisk ◽  
Richard S. Legro ◽  
...  

Background: Polycystic ovary syndrome (PCOS) is a common, complex genetic disorder affecting up to 15% of reproductive age women worldwide, depending on the diagnostic criteria applied. These diagnostic criteria are based on expert opinion and have been the subject of considerable controversy. The phenotypic variation observed in PCOS is suggestive of an underlying genetic heterogeneity, but a recent meta-analysis of European ancestry PCOS cases found that the genetic architecture of PCOS defined by different diagnostic criteria was generally similar, suggesting that the criteria do not identify biologically distinct disease subtypes. We performed this study to test the hypothesis that there are biologically relevant subtypes of PCOS. Methods and Findings: Unsupervised hierarchical cluster analysis was performed on quantitative anthropometric, reproductive, and metabolic traits in a genotyped discovery cohort of 893 PCOS cases and an ungenotyped validation cohort of 263 PCOS cases. We identified two PCOS subtypes: a “reproductive” group (21-23%) characterized by higher luteinizing hormone (LH) and sex hormone binding globulin (SHBG) levels with relatively low body mass index (BMI) and insulin levels; and a “metabolic” group (37-39%), characterized by higher BMI, glucose, and insulin levels with lower SHBG and LH levels. We performed a GWAS on the genotyped cohort, limiting the cases to either the reproductive or metabolic subtypes. We identified alleles in four novel loci that were associated with the reproductive subtype at genome-wide significance (PRDM2/KAZN1, P=2.2×10⁻¹⁰; IQCA1, P=2.8×10⁻⁹; BMPR1B/UNC5C, P=9.7×10⁻⁹; CDH10, P=1.2×10⁻⁸) and one locus that was significantly associated with the metabolic subtype (KCNH7/FIGN, P=1.0×10⁻⁸). We have previously reported that rare variants in DENND1A, a gene regulating androgen biosynthesis, were associated with PCOS quantitative traits in a family-based whole genome sequencing analysis. We classified the reproductive and metabolic subtypes in this family-based PCOS cohort and found that the subtypes tended to cluster in families and that carriers of rare DENND1A variants were significantly more likely to have the reproductive subtype of PCOS. Limitations of our study were that only PCOS cases of European ancestry diagnosed by NIH criteria were included, the sample sizes for the subtype GWAS were small, and the GWAS findings were not replicated. Conclusions: In conclusion, we have found stable reproductive and metabolic subtypes of PCOS. Further, these subtypes were associated with novel susceptibility loci. Our results suggest that these subtypes are biologically relevant since they have distinct genetic architectures. This study demonstrates how precise phenotypic delineation can be more powerful than increases in sample size for genetic association studies.


2020 ◽  
Author(s):  
Zena Rawandoozi ◽  
Timothy Hartmann ◽  
Silvia Carpenedo ◽  
Ksenija Gasic ◽  
Cassia da Silva Linge ◽  
...  

Background: Environmental adaptation and expanding harvest seasons are primary goals of most peach [Prunus persica (L.) Batsch] breeding programs. Breeding perennial crops is a challenging task due to their long breeding cycles and large tree size. Pedigree-based analysis using pedigreed families followed by haplotype construction creates a platform for QTL and marker identification, validation, and the use of marker-assisted selection in breeding programs. Results: Phenotypic data of seven F1 low to medium chill full-sib families were collected over two years at two locations and genotyped using the 9K SNP Illumina array. Three QTLs were discovered for bloom date (BD) and mapped on linkage group 1 (LG1) (172 – 182 cM), LG4 (48 – 54 cM), and LG7 (62 – 70 cM), explaining 17-54%, 11-55%, and 11-18% of the phenotypic variance, respectively. The QTL for ripening date (RD) and fruit development period (FDP) on LG4 was co-localized at the central part of LG4 (40 - 46 cM) and explained between 40-75% of the phenotypic variance. Haplotype analyses revealed SNP haplotypes and predictive SNP marker(s) associated with desired QTL alleles and the presence of multiple functional alleles with different effects for a single locus for RD and FDP. Conclusions: A multiple pedigree-linked families approach validated major QTLs for the three key phenological traits which were reported in previous studies across diverse materials, geographical distributions, and QTL mapping methods. Haplotype characterization of these genomic regions differentiates this study from the previous QTL studies. Our results will provide the peach breeder with the haplotypes for three BD QTLs and one RD/FDP QTL for the creation of predictive DNA-based molecular marker tests to select parents and/or seedlings that have desired QTL alleles and cull unwanted genotypes in early seedling stages.

