Identifying Pleiotropic Effects: A Two-Stage Approach Using Genome-Wide Association Meta-Analysis Data

2017 ◽  
Author(s):  
Xing Chen ◽  
Yi-Hsiang Hsu

Abstract: Pleiotropic effects occur when a single genetic variant independently influences multiple phenotypes. In genetic epidemiological studies, multiple endo-phenotypes or correlated traits are commonly tested separately in a univariate statistical framework to identify associations with genetic determinants. Subsequently, a simple look-up of overlapping univariate results is applied to identify pleiotropic genetic effects. However, this strategy offers limited power to detect pleiotropy. In contrast, combining correlated traits into a composite test provides a powerful approach for detecting pleiotropic genes. Here, we propose a two-stage approach to identify potential pleiotropic effects by utilizing aggregated results from large-scale genome-wide association (GWAS) meta-analyses. In the first stage, we developed two novel approaches (direct linear combining, dLC; and empirical combining, eLC) that combine correlated univariate test statistics to screen potential pleiotropic variants on a genome-wide scale, using either individual-level or aggregated data. Our simulations indicated that dLC and eLC outperform other popular multivariate approaches (such as principal component analysis (PCA), multivariate analysis of variance (MANOVA), canonical correlation analysis (CCA), generalized estimating equations (GEE), linear mixed effects models (LME) and the O’Brien combining approach). In particular, eLC provides a notable increase in power when the genetic variant exhibits both protective and deleterious effects. In the second stage, we developed a unique approach, conditional pleiotropy testing (cPLT), to examine pleiotropic effects using individual-level data for candidate variants identified in Stage 1. Simulations demonstrated reduced type 1 error for cPLT in identifying pleiotropic genetic variants compared to the typical conditional strategy. 
We validated our two-stage approach by performing a bivariate GWA study on two correlated quantitative traits, high-density lipoprotein (HDL) and triglycerides (TG), in the Genetic Analysis Workshop 16 (GAW16) simulation dataset. In summary, the proposed two-stage approach allows us to leverage aggregated summary statistics from univariate GWAS and improves the power to identify potential pleiotropy while maintaining valid false-positive rates.

Author Summary: Pleiotropy, occurring when a single genetic variant contributes to multiple phenotypes, remains difficult to identify in genome-wide association studies (GWAS). To leverage data for multiple phenotypes and incorporate univariate GWAS summary results, we propose a novel two-stage approach for discovering potential pleiotropic variants. In the first stage, two novel combining approaches were developed to screen potential pleiotropic variants on a genome-wide scale. Simulations demonstrated the superior statistical power of these approaches over other multivariate methods. In the second stage, our approach was used to identify potential pleiotropy in the candidate marker sets generated from the first stage. The proposed two-stage approach was applied to the GAW16 simulation dataset to discover pleiotropic variants associated with high-density lipoprotein and triglycerides. In summary, we demonstrate that the proposed two-stage approach can be applied as a viable and robust strategy to accommodate phenotypic and genetic heterogeneity for discovering potential pleiotropy on a genome-wide scale.
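The idea of combining correlated univariate test statistics can be illustrated with a generic weighted sum of z-scores, standardized by the trait correlation matrix. This is a minimal sketch under stated assumptions, not the dLC or eLC statistic from the paper, whose exact definitions are not given in the abstract. The function name `combine_z`, the equal default weights, and the example correlation of 0.5 are all hypothetical.

```python
import math

def combine_z(z, corr, weights=None):
    """Combine correlated z-scores into one standardized statistic.

    z       : per-trait z-scores for one variant
    corr    : correlation matrix of the z-scores under the null (assumed known)
    weights : optional per-trait weights (assumption: equal weights by default)
    Returns (combined z, two-sided normal p-value).
    """
    k = len(z)
    w = weights if weights is not None else [1.0] * k
    numerator = sum(wi * zi for wi, zi in zip(w, z))
    # Variance of the weighted sum: w' R w
    variance = sum(w[i] * w[j] * corr[i][j] for i in range(k) for j in range(k))
    t = numerator / math.sqrt(variance)
    p = math.erfc(abs(t) / math.sqrt(2.0))  # two-sided p-value
    return t, p

# Hypothetical example: two traits with null correlation 0.5
corr = [[1.0, 0.5], [0.5, 1.0]]
same_direction = combine_z([2.0, 2.0], corr)
opposite_direction = combine_z([3.0, -3.0], corr)
```

In this sketch, effects in opposite directions cancel in the plain sum, so `opposite_direction` carries no signal. That is the kind of situation in which the abstract reports eLC gaining power over simpler combinations.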

2012 ◽  
Vol 15 (6) ◽  
pp. 691-699 ◽  
Author(s):  
Ida Surakka ◽  
John B. Whitfield ◽  
Markus Perola ◽  
Peter M. Visscher ◽  
Grant W. Montgomery ◽  
...  

Genome-wide association analysis on monozygotic twin-pairs offers a route to discovery of gene–environment interactions through testing for variability loci associated with sensitivity to individual environment/lifestyle. We present a genome-wide scan of loci associated with intra-pair differences in serum lipid and apolipoprotein levels. We report data for 1,720 monozygotic female twin-pairs from the GenomEUtwin project with 2.5 million SNPs, imputed or genotyped, and measured serum lipid fractions for both twins. We found one locus associated with intra-pair differences in high-density lipoprotein cholesterol, rs2483058 in an intron of SRGAP2, where twins carrying the C allele are more sensitive to environmental factors (P = 3.98 × 10⁻⁸). We followed up the association in further genotyped monozygotic twins (N = 1,261), which showed a moderate association for the variant (P = 0.200, same direction of effect). In addition, we report a new association on the level of apolipoprotein A-II (P = 4.03 × 10⁻⁸).


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Peitao Wu ◽  
Biqi Wang ◽  
Steven A. Lubitz ◽  
Emelia J. Benjamin ◽  
James B. Meigs ◽  
...  

Abstract: Because single genetic variants may have pleiotropic effects, one trait can be a confounder in a genome-wide association study (GWAS) that aims to identify loci associated with another trait. A typical approach to address this issue is to perform an additional analysis adjusting for the confounder. However, obtaining conditional results can be time-consuming. We propose an approximate conditional phenotype analysis based on GWAS summary statistics, the covariance between outcome and confounder, and the variant minor allele frequency (MAF). GWAS summary statistics and MAF are taken from GWAS meta-analysis results, while the trait covariance may be estimated by two strategies: (i) estimates from a subset of the phenotypic data; or (ii) estimates from published studies. We compare our two strategies with estimates using individual-level data from the full GWAS sample (gold standard). A simulation study for both binary and continuous traits demonstrates that our approximate approach is accurate. We apply our method to the Framingham Heart Study (FHS) GWAS and to large-scale cardiometabolic GWAS results. We observed a high consistency of genetic effect size estimates between our method and individual-level data analysis. Our approach leads to an efficient way to perform approximate conditional analysis using large-scale GWAS summary statistics.
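To make the ingredients listed in the abstract concrete (summary effect sizes, the outcome-confounder covariance, and MAF), the sketch below applies the standard least-squares identity for the SNP coefficient in a regression of the outcome on the SNP and one covariate. It is an illustration for continuous traits only and is not claimed to be the estimator used in the paper. The function name `conditional_beta`, the Hardy-Weinberg genotype variance 2p(1-p), and the example values are assumptions.

```python
def conditional_beta(beta_y, beta_c, maf, var_c, cov_cy):
    """Approximate SNP effect on outcome Y after adjusting for confounder C.

    beta_y : marginal SNP effect on Y (from GWAS summary statistics)
    beta_c : marginal SNP effect on C (from GWAS summary statistics)
    maf    : minor allele frequency of the variant
    var_c  : phenotypic variance of C
    cov_cy : phenotypic covariance between C and Y
    """
    # Assumption: additive genotype coding under Hardy-Weinberg equilibrium
    var_g = 2.0 * maf * (1.0 - maf)
    # Two-predictor least-squares coefficient, with cov(G, Y) = beta_y * var_g
    # and cov(G, C) = beta_c * var_g substituted in and var_g cancelled
    numerator = beta_y * var_c - beta_c * cov_cy
    denominator = var_c - beta_c ** 2 * var_g
    if denominator <= 0.0:
        raise ValueError("inputs imply the SNP explains all variance in C")
    return numerator / denominator

# Hypothetical inputs for illustration only
adjusted = conditional_beta(beta_y=0.5, beta_c=0.2, maf=0.3, var_c=1.0, cov_cy=0.4)
```

When the variant has no effect on the confounder (`beta_c = 0`), the adjusted estimate equals the marginal one, which is a useful sanity check on any implementation of this kind.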


Author(s):  
Nicola Santoro ◽  
Ling Chen ◽  
Jennifer Todd ◽  
Jasmin Divers ◽  
Amy S Shah ◽  
...  

Abstract: Context: Dyslipidemia is highly prevalent in youth with type 2 diabetes (T2D), yet the pathogenic components of dyslipidemia in youth with T2D are poorly understood. Objective: To evaluate the genetic determinants of lipid traits in youth with T2D through a genome-wide association study (GWAS). Design, participants and main outcome measures: We genotyped 206,928 variants and imputed 17,642,824 variants in 1,076 youth (mean age 15.0 ± 2.48 years) with T2D from the Treatment Options for Type 2 Diabetes in Adolescents and Youth (TODAY) and SEARCH for Diabetes in Youth (SEARCH) studies as part of the Progress in Diabetes Genetics in Youth (ProDiGY) consortium. We performed association testing for triglyceride, low-density lipoprotein (LDL-c) and high-density lipoprotein (HDL-c) concentrations adjusted for the genetic relationship matrix within each sub-study, followed by meta-analyses for each trait. Results: We identified a novel association between a deletion on chromosome 3 (3:67817380_AT/A_Deletion:RP11-81N13.1) and triglyceride levels at the genome-wide level of significance (P = 2.3 × 10⁻⁸), with each risk allele increasing triglycerides by 20%. We also identified a genome-wide significant signal at rs247617 (P = 5.1 × 10⁻⁹) between HERPUD1 and CETP associated with HDL-c, with carriers of one copy of the risk allele having twice higher HDL-c. Conclusions: Our genetic analyses of lipid traits in youth with T2D have identified one novel and one previously known locus. Additional studies are needed to further characterize the genetic architecture of dyslipidemia in youth with T2D.
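The abstract describes per-study association testing followed by a meta-analysis for each trait, without naming the meta-analysis model. A common choice is fixed-effect inverse-variance weighting, sketched below as an assumption rather than as the method the authors used. The function name `ivw_meta` and the two example sub-study estimates are hypothetical.

```python
import math

def ivw_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis for one variant.

    betas : per-study effect estimates
    ses   : per-study standard errors
    Returns (pooled beta, pooled SE, two-sided normal p-value).
    """
    weights = [1.0 / se ** 2 for se in ses]  # precision of each study
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, p

# Hypothetical estimates from two sub-studies
pooled_beta, pooled_se, pooled_p = ivw_meta([0.2, 0.4], [0.1, 0.1])
```

With equal standard errors the pooled estimate is the simple average of the study estimates, and the pooled standard error shrinks by a factor of the square root of the number of studies.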


2018 ◽  
Author(s):  
Jing Yuan ◽  
Sharon A. Kessler

Abstract: Ovules contain the female gametophytes which are fertilized during pollination to initiate seed development. Thus, the number of ovules that are produced during flower development is an important determinant of seed crop yield and plant fitness. Mutants with pleiotropic effects on development often alter the number of ovules, but specific regulators of ovule number have been difficult to identify in traditional mutant screens. We used natural variation in Arabidopsis accessions to identify new genes involved in the regulation of ovule number. The ovule numbers per flower of 189 Arabidopsis accessions were determined and found to have broad phenotypic variation that ranged from 39 ovules to 84 ovules per pistil. Genome-Wide Association tests revealed several genomic regions that are associated with ovule number. T-DNA insertion lines in candidate genes from the most significantly associated loci were screened for ovule number phenotypes. The NEW ENHANCER of ROOT DWARFISM (NERD1) gene was found to have pleiotropic effects on plant fertility that include regulation of ovule number and both male and female gametophyte development. Overexpression of NERD1 increased ovule number per fruit in a background-dependent manner and more than doubled the total number of flowers produced in all backgrounds tested, indicating that manipulation of NERD1 levels can be used to increase plant productivity.

Author Summary: Ovules are the precursors of seeds in flowering plants. Each ovule contains an egg cell and a central cell that fuse with two sperm cells during double fertilization to generate seeds containing an embryo and endosperm. The number of ovules produced during flower development determines the maximum number of seeds that can be produced by a flower. In this paper, we used natural variation in Arabidopsis thaliana accessions to identify regions of the genome that are associated with ovule number. 
Polymorphisms in the plant-specific NERD1 gene on chromosome 3 were significantly associated with ovule number. Mutant and overexpression analyses revealed that NERD1 is a positive regulator of ovule number, lateral branching, and flower number in Arabidopsis. Manipulation of NERD1 expression levels could potentially be used to increase yield in crop plants.


Author(s):  
Andrew W George ◽  
Arunas Verbyla ◽  
Joshua Bowden

Abstract: Eagle is an R package for multi-locus association mapping on a genome-wide scale. It is unlike other multi-locus packages in that it is easy to use for R users and non-users alike. It has two modes of use, command line and GUI. Eagle is fully documented and has its own supporting website, http://eagle.r-forge.r-project.org/index.html. Eagle is a significant improvement over the method of choice, single-locus association mapping. It has greater power to detect SNP-trait associations. It is based on model selection, linear mixed models, and a clever idea on how random effects can be used to identify SNP-trait associations. Through an example with real mouse data, we demonstrate Eagle’s ability to bring clarity and increased insight to single-locus findings. Initially, we see Eagle complementing single-locus analyses. However, over time, we hope the community will increasingly make multi-locus association mapping their method of choice for the analysis of genome-wide association study data.


Thorax ◽  
2020 ◽  
pp. thoraxjnl-2019-214430
Author(s):  
Jaeyoung Cho ◽  
Kyungtaek Park ◽  
Sun Mi Choi ◽  
Jinwoo Lee ◽  
Chang-Hoon Lee ◽  
...  

Background: The prevalence of non-tuberculous mycobacterial pulmonary disease (NTM-PD) is increasing in South Korea and many parts of the world. However, the genetic factors underlying susceptibility to this disease remain elusive. Methods: To identify genetic variants in patients with NTM-PD, we performed a genome-wide association study with 403 Korean patients with NTM-PD and 306 healthy controls from the Healthy Twin Study, Korea cohort. Candidate variants from the discovery cohort were subsequently validated in an independent cohort. The Genotype-Tissue Expression (GTEx) database was used to identify expression quantitative trait loci (eQTL) and to conduct Mendelian randomisation (MR). Results: We identified a putatively significant locus on chromosome 7p13, rs849177 (OR, 2.34; 95% CI, 1.71 to 3.21; p = 1.36 × 10⁻⁷), as the candidate genetic variant associated with NTM-PD susceptibility. Its association was subsequently replicated and the combined p value was 4.92 × 10⁻⁸. The eQTL analysis showed that a risk allele at rs849177 was associated with lower expression levels of STK17A, a proapoptotic gene. In the MR analysis, a causal effect of STK17A on NTM-PD development was identified (β, −4.627; 95% CI, −8.768 to −0.486; p = 0.029). Conclusions: The 7p13 genetic variant might be associated with susceptibility to NTM-PD in the Korean population by altering the expression level of STK17A.
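The reported estimates and 95% confidence intervals can be checked against their p values with a normal (Wald) approximation. The sketch below recovers a standard error from the interval width and converts it to a two-sided p value. It assumes symmetric Wald intervals (on the log scale for the odds ratio), which the abstract does not state, so small differences from the published p values are expected. The function name `p_from_ci` is hypothetical.

```python
import math

def p_from_ci(estimate, lower, upper, z_crit=1.959964):
    """Recover SE, z and a two-sided p-value from an estimate and its 95% CI.

    Assumes a symmetric Wald interval: estimate +/- z_crit * SE.
    For odds ratios, pass log-transformed values.
    """
    se = (upper - lower) / (2.0 * z_crit)
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p

# Odds ratio for rs849177, analysed on the log scale
or_se, or_z, or_p = p_from_ci(math.log(2.34), math.log(1.71), math.log(3.21))

# Mendelian randomisation estimate for STK17A, already on a linear scale
mr_se, mr_z, mr_p = p_from_ci(-4.627, -8.768, -0.486)
```

Under these assumptions both recomputed p values land close to the reported ones (on the order of 10⁻⁷ for the odds ratio and just under 0.03 for the MR estimate), which suggests the reported intervals and p values are mutually consistent.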


PeerJ ◽  
2019 ◽  
Vol 7 ◽  
pp. e7983 ◽  
Author(s):  
Nicolas Granger ◽  
Alejandro Luján Feliu-Pascual ◽  
Charlotte Spicer ◽  
Sally Ricketts ◽  
Rebekkah Hitti ◽  
...  

Background: Charcot-Marie-Tooth (CMT) disease is the most common neuromuscular disorder in humans, affecting 40 out of 100,000 individuals. In 2008, we described the clinical, electrophysiological and pathological findings of a demyelinating motor and sensory neuropathy in Miniature Schnauzer dogs, with a suspected autosomal recessive mode of inheritance based on pedigree analysis. The discovery of additional cases has followed this work and led to a genome-wide association mapping approach to search for the underlying genetic cause of the disease. Methods: For genome-wide association screening, genomic DNA samples from affected and unaffected dogs were genotyped using the Illumina CanineHD SNP genotyping array. SBF2 and its variant were sequenced using primers and PCRs. RNA was extracted from muscle of an unaffected and an affected dog and RT-PCR performed. Immunohistochemistry for myelin basic protein was performed on peripheral nerve section specimens. Results: The genome-wide association study gave an indicative signal on canine chromosome 21. Although the signal was not of genome-wide significance due to the small number of cases, the SBF2 (also known as MTMR13) gene within the region of shared case homozygosity was a strong positional candidate, as 22 genetic variants in the gene have been associated with demyelinating forms of Charcot-Marie-Tooth disease in humans. Sequencing of SBF2 in cases revealed a splice donor site genetic variant, resulting in cryptic splicing and predicted early termination of the protein based on RNA sequencing results. Conclusions: This study reports the first genetic variant in Miniature Schnauzer dogs responsible for the occurrence of a demyelinating peripheral neuropathy with abnormally folded myelin. This discovery establishes a genotype/phenotype correlation in affected Miniature Schnauzers that can be used for the diagnosis of these dogs. 
It further supports the dog as a natural model of a human disease; in this instance, Charcot-Marie-Tooth disease. It opens avenues to search the biological mechanisms responsible for the disease and to test new therapies in a non-rodent large animal model. In particular, recent gene editing methods that led to the restoration of dystrophin expression in a canine model of muscular dystrophy could be applied to other canine models such as this before translation to humans.

