Integrative eQTL-weighted hierarchical Cox models for SNP-set based time-to-event association studies

2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Haojie Lu ◽  
Yongyue Wei ◽  
Zhou Jiang ◽  
Jinhui Zhang ◽  
Ting Wang ◽  
...  

Abstract. Background: Integrating functional annotations into SNP-set association studies has proven to be a powerful analysis strategy. Statistical methods for such integration have been developed for continuous and binary phenotypes; however, SNP-set integrative approaches for time-to-event or survival outcomes are lacking. Methods: We propose IEHC, an integrative eQTL (expression quantitative trait loci) hierarchical Cox regression, for SNP-set based survival association analysis; it models the effect sizes of genetic variants as a function of eQTL in a hierarchical manner. Three p-value combination tests are developed to examine the joint effects of eQTL and genetic variants after a novel decorrelated modification of the statistics for the two components. An omnibus test (IEHC-ACAT) is further adapted to aggregate the strengths of all available tests. Results: Simulations demonstrated that the IEHC joint tests were more powerful when both eQTL and genetic variants contributed to the association signal, while IEHC-ACAT was robust and often outperformed other approaches across various simulation scenarios. When applying IEHC to ten TCGA cancers by incorporating eQTL from relevant tissues of GTEx, we found substantial correlations between the two types of effect sizes of genetic variants from TCGA and GTEx, and identified 21 (9 unique) cancer-associated genes that would otherwise be missed by approaches not incorporating eQTL. Conclusion: IEHC represents a flexible, robust, and powerful approach to integrating functional omics information to enhance the power of identifying association signals for the survival risk of complex human cancers.
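The omnibus step named in the abstract (ACAT) can be illustrated with the generic Cauchy combination rule. The sketch below is not the authors' implementation: the function name `acat_pvalue`, the default of equal weights, and the example p-values are assumptions made for illustration only.

```python
import math

def acat_pvalue(pvalues, weights=None):
    """Combine p-values with the Cauchy combination (ACAT) rule.

    Assumption: equal weights unless given; inputs must lie strictly in (0, 1).
    """
    n = len(pvalues)
    if n == 0:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p < 1.0) for p in pvalues):
        raise ValueError("p-values must lie strictly between 0 and 1")
    if weights is None:
        weights = [1.0] * n
    if len(weights) != n or any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative, match pvalues, and not all be zero")
    total = sum(weights)
    # Map each p-value to a standard Cauchy variate; a weighted mean of
    # standard Cauchy variates is again standard Cauchy under the null.
    stat = sum(w / total * math.tan((0.5 - p) * math.pi)
               for p, w in zip(pvalues, weights))
    # Upper-tail probability of the standard Cauchy distribution.
    return 0.5 - math.atan(stat) / math.pi

if __name__ == "__main__":
    # Hypothetical p-values from three separate tests of one SNP set.
    print(acat_pvalue([0.001, 0.8, 0.3]))
```

A single small p-value dominates the combined result, which is why this kind of omnibus test tends to stay robust when only one of the component tests carries signal.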

2018 ◽  
Author(s):  
Niccolò Tesi ◽  
Sven J. van der Lee ◽  
Marc Hulsman ◽  
Iris E. Jansen ◽  
Najada Stringa ◽  
...  

Abstract. The detection of genetic loci associated with Alzheimer’s disease (AD) requires large numbers of cases and controls because variant effect-sizes are mostly small. We hypothesized that variant effect-sizes should increase when individuals who represent the extreme ends of a disease spectrum are considered, as their genomes are assumed to be maximally enriched or depleted with disease-associated genetic variants. We used 1,073 extensively phenotyped AD cases with relatively young age at onset as extreme cases (66.3±7.9 years), 1,664 age-matched controls (66.0±6.5 years) and 255 cognitively healthy centenarians as extreme controls (101.4±1.3 years). We estimated the effect-size of 29 variants that were previously associated with AD in genome-wide association studies. Comparing extreme AD-cases with centenarian-controls increased the variant effect-size relative to published effect-sizes by on average 1.90-fold (SE = 0.29, p = 9.0×10−4). The effect-size increase was largest for the rare high-impact TREM2 (R74H) variant (6.5-fold), and significant for variants in/near ECHDC3 (4.6-fold), SLC24A4-RIN3 (4.5-fold), NME8 (3.8-fold), PLCG2 (3.3-fold), APOE-ε2 (2.2-fold) and APOE-ε4 (2.0-fold). Comparing extreme phenotypes enabled us to replicate the AD association for 10 variants (p < 0.05) in relatively small samples. The increase in effect-sizes depended mainly on using centenarians as extreme controls: the average variant effect-size was not increased in a comparison of extreme AD cases and age-matched controls (0.94-fold, p = 6.8×10−1), suggesting that on average the tested genetic variants did not explain the extremity of the AD-cases. In conclusion, using centenarians as extreme controls in AD case-control studies boosts the variant effect-size by on average two-fold, allowing the replication of disease-association in relatively small samples.
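The abstract does not state how the fold change in effect-size was computed. One natural reading, assumed here, is the ratio of log odds ratios between the new comparison and the published estimate. The function name and the odds ratios below are hypothetical and are not values from the study.

```python
import math

def effect_size_fold_change(or_observed, or_published):
    """Ratio of log odds ratios (assumed definition of 'fold change').

    A value above 1 means the effect is larger in the new comparison.
    """
    if or_observed <= 0 or or_published <= 0:
        raise ValueError("odds ratios must be positive")
    log_published = math.log(or_published)
    if log_published == 0:
        raise ValueError("published odds ratio of 1 gives an undefined ratio")
    return math.log(or_observed) / log_published

if __name__ == "__main__":
    # Hypothetical risk variant: published OR 1.2, extreme-comparison OR 1.44.
    print(effect_size_fold_change(1.44, 1.2))
    # Hypothetical protective variant: published OR 0.8, extreme-comparison OR 0.64.
    print(effect_size_fold_change(0.64, 0.8))
```

Working on the log scale treats risk and protective variants symmetrically, so both hypothetical examples show the same doubling of the effect.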


2020 ◽  
Vol 38 (15_suppl) ◽  
pp. 10554-10554
Author(s):  
Cindy Im ◽  
Nan Li ◽  
Wonjong Moon ◽  
Lindsay M. Morton ◽  
Wendy M. Leisenring ◽  
...  

10554 Background: Recent genome-wide association studies (GWAS) have reported substantial sex differences in the genetic architectures of bone-related phenotypes. We investigated sex-specific genetic determinants of FFR in survivors of childhood cancer. Methods: We performed sex-combined and sex-stratified GWAS for FFR using Cox regression models fitted on follow-up age in 2,453 long-term (≥5 years) survivors in CCSS with ~5.4 million imputed SNPs (minor allele frequency, MAF≥5%), with self-reported FFR defined by first fracture at any site after diagnosis. Replication analyses were conducted in an independent sample of 1,417 SJLIFE survivors with whole-genome sequencing and clinician-assessed FFR. All models were adjusted for relevant genetic (e.g., ancestry) and clinical (e.g., height, weight, treatment) factors. Results: Sex-combined and male-specific analyses yielded no associations with P < 10−7. Among female CCSS survivors (N = 1,289, 33% ≥1 fractures), we discovered 7 genome-wide significant (P < 5x10−8) SNP-FFR associations with strong evidence of sex effect heterogeneity (P < 7x10−6) across 2 independent loci with no known associations with bone phenotypes. We replicated these associations in SJLIFE (P≤0.05) for 3 coding SNPs in the HAGHL gene (16p13.3), among which rs1406815 showed the strongest association (MAF = 20%, meta-analysis HR = 1.43, P = 8.2x10−9; N = 1,935 women, 35% ≥1 fractures). We observed increased HAGHL SNP effects on FFR that corresponded with increasing head/neck (HN) radiation therapy (RT) dose (Table). Public omics data show replicated SNPs are associated with differential HAGHL expression in sex gland and musculoskeletal tissues (GTEx) and in osteoblasts treated with dexamethasone or prostaglandins (GRASP), suggesting sex-/therapy-specific biological pathways involving HAGHL SNPs for fracture are plausible. 
Conclusions: Novel associations between HAGHL genetic variants and FFR potentially reveal new sex- and therapy-specific biological mechanisms underlying bone-related health conditions in survivors of childhood cancer. [Table: see text]
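The abstract reports a meta-analysis hazard ratio across the discovery and replication cohorts but does not say which pooling method was used. A minimal sketch, assuming a fixed-effect inverse-variance meta-analysis of per-cohort log hazard ratios, is given below. The function name and the inputs are hypothetical.

```python
import math

def fixed_effect_meta(log_hrs, standard_errors):
    """Inverse-variance fixed-effect pooling of log hazard ratios.

    Returns (pooled hazard ratio, pooled standard error, two-sided p-value).
    """
    if len(log_hrs) != len(standard_errors) or not log_hrs:
        raise ValueError("need matching, non-empty inputs")
    if any(se <= 0 for se in standard_errors):
        raise ValueError("standard errors must be positive")
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * b for w, b in zip(weights, log_hrs)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled / pooled_se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return math.exp(pooled), pooled_se, p_value

if __name__ == "__main__":
    # Hypothetical per-cohort estimates (log HR, SE), not values from the study.
    hr, se, p = fixed_effect_meta([math.log(1.5), math.log(1.3)], [0.08, 0.12])
    print(hr, se, p)
```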


2017 ◽  
Author(s):  
Jingjing Yang ◽  
Lars G. Fritsche ◽  
Xiang Zhou ◽  
Gonçalo Abecasis ◽  

Abstract. Although genome-wide association studies (GWASs) have identified many risk loci for complex traits and common diseases, most of the identified associations reside in noncoding regions and have unknown biological functions. Recent genomic sequencing studies have produced a rich resource of annotations that help characterize the function of genetic variants. Integrative analysis that incorporates these functional annotations into GWAS can help elucidate the biological mechanisms underlying the identified associations and help prioritize causal variants. Here, we develop a novel, flexible Bayesian variable selection model with efficient computational techniques for such integrative analysis. Unlike previous approaches, our method models the effect-size distribution and probability of causality for variants with different annotations and jointly models genome-wide variants to account for linkage disequilibrium (LD), thus prioritizing associations based on the quantification of the annotations and allowing for multiple causal variants per locus. Our efficient computational algorithm dramatically improves both computational speed and posterior sampling convergence by taking advantage of the block-wise LD structures of human genomes. With simulations, we show that our method accurately quantifies the functional enrichment and is more powerful in identifying true causal variants than several competing methods. The power gain brought about by our method is especially apparent when multiple causal variants in LD reside in the same locus. We also apply our method to an in-depth GWAS of age-related macular degeneration with 33,976 individuals and 9,857,286 variants.
We find the strongest enrichment for causality among non-synonymous variants (54x more likely to be causal, 1.4x larger effect-sizes) and variants in active promoters (7.8x more likely, 1.4x larger effect-sizes), and identify 5 potentially novel loci in addition to the 32 known AMD risk loci. In conclusion, our method is shown to efficiently integrate functional information in GWASs, helping identify causal variants and underlying biology. Author summary: We propose a novel Bayesian hierarchical model to account for linkage disequilibrium (LD) and multiple functional annotations in GWAS, paired with an expectation-maximization Markov chain Monte Carlo (EM-MCMC) computational algorithm to jointly analyze genome-wide variants. Our method improves the MCMC convergence property to ensure accurate Bayesian inference of the quantifications of the functional enrichment pattern and fine-mapped association results. By applying our method to the real GWAS of age-related macular degeneration (AMD) with various functional annotations (i.e., gene-based, regulatory, and chromatin states), we find that the variants with non-synonymous, coding, and active promoter annotations have the highest causal probability and the largest effect-sizes. In addition, our method produces fine-mapped association results in the identified risk loci, two of which are shown as examples (C2/CFB/SKIV2L and C3) with justifications by haplotype analysis, model comparison, and conditional analysis. Therefore, we believe our integrative method will be useful for quantifying the enrichment pattern of functional annotations in GWAS, and then prioritizing associations with respect to the learned functional enrichment pattern.


2021 ◽  
pp. 096228022199841
Author(s):  
Yingrui Yang ◽  
Molin Wang

In epidemiology, identifying the effect of exposure variables in relation to a time-to-event outcome is a classical research area of practical importance. Incorporating the propensity score in the Cox regression model, as a measure to control for confounding, has certain advantages when the outcome is rare. However, in situations involving exposure measured with moderate to substantial error, identifying the exposure effect using the propensity score in Cox models remains a challenging and unresolved problem. In this paper, we propose an estimating equation method to correct for the bias caused by exposure misclassification in the estimation of exposure-outcome associations. We also discuss the asymptotic properties and derive the asymptotic variances of the proposed estimators. We conduct a simulation study to evaluate the performance of the proposed estimators in various settings. As an illustration, we apply our method to correct for the misclassification-caused bias in estimating the association of PM2.5 level with lung cancer mortality using a nationwide prospective cohort, the Nurses’ Health Study. The proposed methodology can be applied using our user-friendly R program published online.


2019 ◽  
Vol 26 (34) ◽  
pp. 6207-6221 ◽  
Author(s):  
Innocenzo Rainero ◽  
Alessandro Vacca ◽  
Flora Govone ◽  
Annalisa Gai ◽  
Lorenzo Pinessi ◽  
...  

Migraine is a common, chronic neurovascular disorder caused by a complex interaction between genetic and environmental risk factors. In the last two decades, the molecular genetics of migraine has been intensively investigated. In a few cases, migraine is transmitted as a monogenic disorder, and the disease phenotype cosegregates with mutations in different genes such as CACNA1A, ATP1A2, SCN1A, KCNK18, and NOTCH3. In the common forms of migraine, candidate gene studies as well as genome-wide association studies have shown that a large number of genetic variants may increase the risk of developing migraine. At present, few studies have investigated the genotype-phenotype correlation in patients with migraine. The purpose of this review was to discuss recent studies investigating the relationship between different genetic variants and the clinical characteristics of migraine. Analysis of genotype-phenotype correlations in migraineurs is complicated by several confounding factors and, to date, only polymorphisms of the MTHFR gene have been shown to have an effect on migraine phenotype. Additional genomic studies and network analyses are needed to clarify the complex pathways underlying migraine and its clinical phenotypes.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Feng Jiang ◽  
Chuyan Wu ◽  
Ming Wang ◽  
Ke Wei ◽  
Jimei Wang

Abstract. Breast cancer (BC) is one of the most frequently identified tumors and a contributing cause of death in women. Many biomarkers associated with survival and prognosis were identified in previous studies through database mining. Nevertheless, the predictive capabilities of single-gene biomarkers are not accurate enough. Genetic signatures can be an enhanced prediction method. This research analyzed data from The Cancer Genome Atlas (TCGA) to detect a new genetic signature for predicting BC prognosis. Profiling of mRNA expression was carried out in TCGA samples from patients with BC (n = 1222). Gene set enrichment analysis was undertaken to classify gene sets that vary greatly between BC tissues and normal tissues. Cox models for additive hazards regression were used to classify genes that were strongly linked to overall survival. A subsequent multivariate Cox regression analysis was used to construct a predictive risk parameter model. Kaplan–Meier survival predictions and log-rank validation were used to verify the value of the risk prediction parameters. Seven genes (PGK1, CACNA1H, IL13RA1, SDC1, AK3, NUP43, SDC3) correlated with glycolysis were shown to be strongly linked to overall survival. Based on the seven-gene signature, 1222 BC patients were classified into high-risk and low-risk subgroups. Certain variables have not impaired the prognostic potential of the seven-gene signature. A seven-gene signature correlated with cellular glycolysis was developed to predict the survival of BC patients. The results provide insight into cellular glycolysis mechanisms and the detection of patients with poor BC prognosis.
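A gene-signature risk model of this kind is commonly applied by computing, for each patient, a weighted sum of expression values using the Cox regression coefficients, and then splitting patients into high-risk and low-risk groups. The sketch below assumes a median cutoff, which the abstract does not specify. The gene labels, coefficients, and expression values are hypothetical placeholders, not results from the study.

```python
from statistics import median

def risk_scores(expression, coefficients):
    """Linear predictor per sample: sum of coefficient * expression over signature genes."""
    return [sum(coefficients[gene] * sample[gene] for gene in coefficients)
            for sample in expression]

def split_by_median(scores):
    """Label samples 'high' if strictly above the median score, else 'low' (assumed cutoff)."""
    cutoff = median(scores)
    return ["high" if score > cutoff else "low" for score in scores]

if __name__ == "__main__":
    # Hypothetical two-gene signature and three patients.
    coefficients = {"gene_a": 0.5, "gene_b": -0.2}
    expression = [
        {"gene_a": 2.0, "gene_b": 1.0},
        {"gene_a": 1.0, "gene_b": 1.0},
        {"gene_a": 4.0, "gene_b": 0.0},
    ]
    scores = risk_scores(expression, coefficients)
    print(list(zip(scores, split_by_median(scores))))
```

The resulting groups would then be compared with Kaplan–Meier curves and a log-rank test, as the abstract describes.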


2020 ◽  
Vol 07 (03) ◽  
pp. 075-079
Author(s):  
Mahamad Irfanulla Khan ◽  
Prashanth CS

Abstract. Cleft lip with or without cleft palate (CL/P) is one of the most common congenital malformations in humans involving various genetic and environmental risk factors. The prevalence of CL/P varies according to geographical location, ethnicity, race, gender, and socioeconomic status, affecting approximately 1 in 800 live births worldwide. Genetic studies aim to understand the mechanisms contributing to a phenotype by measuring the association between genetic variants and also between genetic variants and phenotype population. Genome-wide association studies are standard tools used to discover genetic loci related to a trait of interest. Genetic association studies are generally divided into two main design types: population-based studies and family-based studies. Epidemiological population-based studies comprise unrelated individuals and directly compare the frequency of genetic variants between (usually independent) cases and controls. The alternative to population-based studies (case–control designs) includes various family-based study designs that comprise related individuals. An example of such a study is the case–parent trio design, which is commonly employed in genetics to identify the variants underlying complex human disease and in which transmission of alleles from parents to offspring is studied. This article describes the fundamentals of the case–parent trio study, the trio design and its significance, statistical methods, and the limitations of trio studies.
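The abstract does not name the statistical methods it covers. One widely used test for case–parent trios is the transmission disequilibrium test (TDT), which compares how often heterozygous parents transmit versus do not transmit a given allele to affected offspring. The sketch below assumes that test. The function name and the counts are hypothetical.

```python
import math

def tdt(transmitted, untransmitted):
    """McNemar-type TDT statistic and p-value (chi-square, 1 degree of freedom).

    transmitted / untransmitted: counts of the allele of interest passed on or
    not passed on by heterozygous parents to affected offspring.
    """
    b, c = transmitted, untransmitted
    if b < 0 or c < 0 or b + c == 0:
        raise ValueError("need non-negative counts with at least one informative transmission")
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

if __name__ == "__main__":
    # Hypothetical counts from heterozygous parents.
    print(tdt(60, 40))
```

Because only within-family transmissions are counted, the test is not distorted by population stratification, which is a main reason trio designs are used.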


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Shuquan Rao ◽  
Yao Yao ◽  
Daniel E. Bauer

Abstract. Genome-wide association studies (GWAS) have uncovered thousands of genetic variants that influence risk for human diseases and traits. Yet understanding the mechanisms by which these genetic variants, mainly noncoding, have an impact on associated diseases and traits remains a significant hurdle. In this review, we discuss emerging experimental approaches that are being applied for functional studies of causal variants and translational advances from GWAS findings to disease prevention and treatment. We highlight the use of genome editing technologies in GWAS functional studies to modify genomic sequences, with proof-of-principle examples. We discuss the challenges in interrogating causal variants, points for consideration in experimental design and interpretation of GWAS locus mechanisms, and the potential for novel therapeutic opportunities. With the accumulation of knowledge of functional genetics, therapeutic genome editing based on GWAS discoveries will become increasingly feasible.


Author(s):  
Fernando Pires Hartwig ◽  
Kate Tilling ◽  
George Davey Smith ◽  
Deborah A Lawlor ◽  
Maria Carolina Borges

Abstract. Background: Two-sample Mendelian randomization (MR) allows the use of freely accessible summary association results from genome-wide association studies (GWAS) to estimate causal effects of modifiable exposures on outcomes. Some GWAS adjust for heritable covariables in an attempt to estimate direct effects of genetic variants on the trait of interest. One, both or neither of the exposure GWAS and outcome GWAS may have been adjusted for covariables. Methods: We performed a simulation study comprising different scenarios that could motivate covariable adjustment in a GWAS and analysed real data to assess the influence of using covariable-adjusted summary association results in two-sample MR. Results: In the absence of residual confounding between exposure and covariable, between exposure and outcome, and between covariable and outcome, using covariable-adjusted summary associations for two-sample MR eliminated bias due to horizontal pleiotropy. However, covariable adjustment led to bias in the presence of residual confounding (especially between the covariable and the outcome), even in the absence of horizontal pleiotropy (when the genetic variants would be valid instruments without covariable adjustment). In an analysis using real data from the Genetic Investigation of ANthropometric Traits (GIANT) consortium and UK Biobank, the causal effect estimate of waist circumference on blood pressure changed direction upon adjustment of waist circumference for body mass index. Conclusions: Our findings indicate that using covariable-adjusted summary associations in MR should generally be avoided. When that is not possible, careful consideration of the causal relationships underlying the data (including potentially unmeasured confounders) is required to direct sensitivity analyses and interpret results with appropriate caution.
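The abstract does not say which MR estimator was used. As an illustration of how two-sample MR turns summary association results into a causal estimate, the sketch below assumes the standard inverse-variance weighted (IVW) estimator with first-order weights. The function name and the summary statistics are hypothetical.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Inverse-variance weighted MR estimate from per-variant summary statistics.

    beta_exposure: variant-exposure associations (from the exposure GWAS)
    beta_outcome:  variant-outcome associations (from the outcome GWAS)
    se_outcome:    standard errors of the variant-outcome associations
    Returns (causal effect estimate, standard error).
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome)) or not beta_exposure:
        raise ValueError("need matching, non-empty inputs")
    if any(se <= 0 for se in se_outcome):
        raise ValueError("standard errors must be positive")
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    if denominator == 0:
        raise ValueError("all variant-exposure associations are zero")
    return numerator / denominator, math.sqrt(1.0 / denominator)

if __name__ == "__main__":
    # Hypothetical summary statistics for two genetic instruments.
    print(ivw_estimate([0.1, 0.2], [0.05, 0.1], [0.01, 0.02]))
```

If either set of input associations comes from a covariable-adjusted GWAS, the estimator itself is unchanged but its inputs differ, which is the situation the abstract cautions about.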

