Approximate conditional phenotype analysis based on genome wide association summary statistics

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Peitao Wu ◽  
Biqi Wang ◽  
Steven A. Lubitz ◽  
Emelia J. Benjamin ◽  
James B. Meigs ◽  
...  

Abstract Because single genetic variants may have pleiotropic effects, one trait can be a confounder in a genome-wide association study (GWAS) that aims to identify loci associated with another trait. A typical approach to address this issue is to perform an additional analysis adjusting for the confounder. However, obtaining conditional results can be time-consuming. We propose an approximate conditional phenotype analysis based on GWAS summary statistics, the covariance between outcome and confounder, and the variant minor allele frequency (MAF). GWAS summary statistics and MAF are taken from GWAS meta-analysis results, while the trait covariance may be estimated by two strategies: (i) estimates from a subset of the phenotypic data; or (ii) estimates from published studies. We compare our two strategies with estimates using individual level data from the full GWAS sample (gold standard). A simulation study for both binary and continuous traits demonstrates that our approximate approach is accurate. We apply our method to the Framingham Heart Study (FHS) GWAS and to large-scale cardiometabolic GWAS results. We observed a high consistency of genetic effect size estimates between our method and individual level data analysis. Our approach leads to an efficient way to perform approximate conditional analysis using large-scale GWAS summary statistics.
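
The abstract does not give the estimator itself, so the following is only an illustrative sketch of how the three listed inputs (marginal GWAS effect sizes, the outcome-confounder covariance, and the MAF) can combine into an adjusted effect. It uses the standard partial-regression identity for a linear model with two predictors and assumes a continuous outcome, additive genotype coding, Hardy-Weinberg equilibrium (so the genotype variance is 2·MAF·(1−MAF)), and marginal effects estimated in comparable samples. The function name and all parameter names are hypothetical, and this should not be read as the authors' published method.

```python
def approx_conditional_beta(beta_y, beta_c, cov_yc, var_c, maf):
    """Sketch: variant effect on outcome Y after adjusting for confounder C.

    Hypothetical helper based on the textbook two-predictor regression
    identity; it is not taken from the paper.

    beta_y : marginal effect of the variant on the outcome Y
    beta_c : marginal effect of the variant on the confounder C
    cov_yc : covariance between Y and C
    var_c  : variance of C
    maf    : minor allele frequency of the variant
    """
    if not 0.0 < maf < 1.0:
        raise ValueError("maf must lie strictly between 0 and 1")
    # Assumption: genotype variance under Hardy-Weinberg equilibrium.
    var_g = 2.0 * maf * (1.0 - maf)
    # Residual variance of C after removing the variant's contribution.
    denom = var_c - beta_c ** 2 * var_g
    if denom <= 0.0:
        raise ValueError("inputs imply non-positive residual variance of C")
    # Partial regression coefficient of Y on the variant, given C.
    return (var_c * beta_y - beta_c * cov_yc) / denom


if __name__ == "__main__":
    # Made-up inputs: the variant acts on Y only through C
    # (beta_y = 0.5 * beta_c, with cov_yc / var_c = 0.5).
    print(approx_conditional_beta(0.1, 0.2, 0.5, 1.0, 0.25))
```

With these made-up inputs the variant's marginal effect on the outcome is fully explained by the confounder, so the adjusted effect comes out at essentially zero. When the variant has no effect on the confounder (beta_c = 0), the function returns the marginal effect unchanged.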

2016 ◽  
Author(s):  
Xiang Zhu ◽  
Matthew Stephens

Bayesian methods for large-scale multiple regression provide attractive approaches to the analysis of genome-wide association studies (GWAS). For example, they can estimate heritability of complex traits, allowing for both polygenic and sparse models; and by incorporating external genomic data into the priors they can increase power and yield new biological insights. However, these methods require access to individual genotypes and phenotypes, which are often not easily available. Here we provide a framework for performing these analyses without individual-level data. Specifically, we introduce a “Regression with Summary Statistics” (RSS) likelihood, which relates the multiple regression coefficients to univariate regression results that are often easily available. The RSS likelihood requires estimates of correlations among covariates (SNPs), which also can be obtained from public databases. We perform Bayesian multiple regression analysis by combining the RSS likelihood with previously-proposed prior distributions, sampling posteriors by Markov chain Monte Carlo. In a wide range of simulations RSS performs similarly to analyses using the individual data, both for estimating heritability and detecting associations. We apply RSS to a GWAS of human height that contains 253,288 individuals typed at 1.06 million SNPs, for which analyses of individual-level data are practically impossible. Estimates of heritability (52%) are consistent with, but more precise than, previous results using subsets of these data. We also identify many previously-unreported loci that show evidence for association with height in our analyses. Software is available at https://github.com/stephenslab/rss.


2021 ◽  
Vol 12 ◽  
Author(s):  
Kelechi Uchendu ◽  
Damian Ndubuisi Njoku ◽  
Agre Paterne ◽  
Ismail Yusuf Rabbi ◽  
Daniel Dzidzienyo ◽  
...  

Cassava breeders have made significant progress in developing new genotypes with improved agronomic characteristics such as improved root yield and resistance against biotic and abiotic stresses. However, these new and improved cassava (Manihot esculenta Crantz) varieties in cultivation in Nigeria have undergone little or no improvement in their culinary qualities; hence, there is a paucity of genetic information regarding the texture of boiled cassava, particularly with respect to its mealiness, the principal sensory quality attribute of boiled cassava roots. The current study aimed at identifying genomic regions and polymorphisms associated with natural variation for root mealiness and other texture-related attributes of boiled cassava roots, which include fibre, adhesiveness (ADH), taste, aroma, colour, and firmness. We performed a genome-wide association (GWAS) analysis using phenotypic data from a panel of 142 accessions obtained from the National Root Crops Research Institute (NRCRI), Umudike, Nigeria, and a set of 59,792 high-quality single nucleotide polymorphisms (SNPs) distributed across the cassava genome. Through genome-wide association mapping, we identified 80 SNPs that were significantly associated with root mealiness, fibre, adhesiveness, taste, aroma, colour and firmness on chromosomes 1, 4, 5, 6, 10, 13, 17 and 18. We also identified relevant candidate genes that are co-located with peak SNPs linked to these traits in M. esculenta. A survey of the cassava reference genome v6.1 positioned the SNPs on chromosome 13 in the vicinity of Manes.13G026900, a gene recognized as being responsible for cell adhesion and for the mealiness or crispness of vegetables and fruits, and also known to play an important role in cooked potato texture. This study provides the first insights into understanding the underlying genetic basis of boiled cassava root texture. After validation, the markers and candidate genes identified in this novel work could provide important genomic resources for use in marker-assisted selection (MAS) and genomic selection (GS) to accelerate genetic improvement of root mealiness and other culinary qualities in cassava breeding programmes in West Africa, especially in Nigeria, where the consumption of boiled and pounded cassava is low.


2021 ◽  
Author(s):  
José Luis Zepeda-Batista ◽  
Rafael Nuñez-Domínguez ◽  
Rodolfo Ramírez-Valverde ◽  
Agustín Ruíz-Flores ◽  
Francisco Joel Jahuey-Martínez ◽  
...  

Abstract A genome-wide association study (GWAS) for liveweight traits of Braunvieh cattle was performed. The study included 300 animals genotyped with the GeneSeek® Genomic Profiler Bovine LDv.4 panel; after quality control, 22,734 SNPs and 276 animals were retained in the analysis. The phenotypic data examined were birth, weaning, and yearling weights. The association analysis was performed using the principal components method via the egscore function of the GenABEL version 1.8-0 package in the R environment. The marker rs133262280, located on BTA 22, was associated with birth weight, and two SNPs were associated with weaning weight: rs43668789 (BTA 11) and rs136155567 (BTA 27). New QTL associated with these liveweight traits and four positional and functional candidate genes potentially involved in variations of the analyzed traits were identified. The most important genes in these genomic regions were MCM2 (minichromosome maintenance complex component 2), TPRA1 (transmembrane protein adipocyte associated 1), GALM (galactose mutarotase), and NRG1 (neuregulin 1), with relationships with embryonic cleavage, bone and tissue growth, cell adhesion, and organic development. This study is the first to present a GWAS conducted in Braunvieh cattle in Mexico and represents a basis for future research. Further analyses of the associated regions will clarify their contribution to the genetic basis of growth-related traits.


Rice ◽  
2019 ◽  
Vol 12 (1) ◽  
Author(s):  
Szu-Yu Chen ◽  
Ming-Hsin Lai ◽  
Chih-Wei Tung ◽  
Dong-Hong Wu ◽  
Fang-Yu Chang ◽  
...  

Abstract Background Rice bakanae disease has emerged as a new threat to rice production. In recent years, an increase in the occurrence and severity of bakanae disease has been reported in several areas in Asia. Although bakanae disease affects rice yield and quality, little is known about the genetics of bakanae resistance in rice. The lack of large-scale screens for bakanae resistance in rice germplasm has also limited the development and deployment of resistant varieties. Results A genome-wide association study (GWAS) was conducted to identify genes/loci conferring bakanae resistance in rice. A total of 231 diverse accessions from Rice Diversity Panel 1 (RDP1) were inoculated with a highly virulent Taiwanese Fusarium fujikuroi isolate and assessed for resistance using two parameters: (1) disease severity index based on visual rating and (2) colonization rate determined by reisolation of F. fujikuroi from the basal stems of infected rice seedlings. We identified 14 quantitative trait loci (QTLs) (10 for disease severity and 4 for colonization rate), including 1 mapped for both parameters. A total of 206 candidate genes were identified within the 14 QTLs, including genes encoding leucine-rich repeat (LRR)-containing and NB-ARC (nucleotide-binding adaptor shared by APAF-1, R proteins, and CED-4) proteins, hormone-related genes, transcription factor genes, ubiquitination-related genes, and oxidase/oxidoreductase genes. In addition, a candidate QTL (qBK1.7) that co-localized with the previously identified QTLs qBK1 and qFfR1, was verified by linkage analysis using a population of 132 recombinant inbred lines derived from IR64 x Nipponbare. GWAS delineated qBK1.7 to a region of 8239 bp containing three genes. Full-length sequencing across qBK1.7 in 20 rice accessions revealed that the coding regions of two LRR-containing genes Os01g0601625 and Os01g0601675 may be associated with bakanae resistance. 
Conclusions This study facilitates the exploitation of bakanae resistance resources in RDP1. The information on the resistance performance of 231 rice accessions and 14 candidate QTLs will aid efforts to breed resistance to bakanae and uncover resistance mechanisms. Quantification of the level of F. fujikuroi colonization in addition to the conventional rating of visual symptoms offers new insights into the genetics of bakanae resistance.


Author(s):  
Daigo Okada ◽  
Naotoshi Nakamura ◽  
Kazuya Setoh ◽  
Takahisa Kawaguchi ◽  
Koichiro Higasa ◽  
...  

Abstract Human immune systems are very complex, and the basis for individual differences in immune phenotypes is largely unclear. One reason is that the phenotype of the immune system is so complex that it is very difficult to describe its features and quantify differences between samples. To identify the genetic factors that cause individual differences in whole lymphocyte profiles and their changes after vaccination without having to rely on biological assumptions, we performed a genome-wide association study (GWAS), using cytometry data. Here, we applied computational analysis to the cytometry data of 301 people before receiving an influenza vaccine, and 1, 7, and 90 days after the vaccination to extract the feature statistics of the lymphocyte profiles in a nonparametric and data-driven manner. We analyzed two types of cytometry data: measurements of six markers for B cell classification and seven markers for T cell classification. The coordinate values calculated by this method can be treated as feature statistics of the lymphocyte profile. Next, we examined the genetic basis of individual differences in human immune phenotypes with a GWAS for the feature statistics, and we newly identified seven significant and 36 suggestive single-nucleotide polymorphisms associated with the individual differences in lymphocyte profiles and their change after vaccination. This study provides a new workflow for performing combined analyses of cytometry data and other types of genomics data.


Author(s):  
Mary Hoekstra ◽  
Hao Yu Chen ◽  
Jian Rong ◽  
Line Dufresne ◽  
Jie Yao ◽  
...  

Objective: Lp(a) (lipoprotein[a]) is an independent risk factor for cardiovascular diseases and plasma levels are primarily determined by variation at the LPA locus. We performed a genome-wide association study in the UK Biobank to determine whether additional loci influence Lp(a) levels. Approach and Results: We included 293 274 White British individuals in the discovery analysis. Approximately 93 095 623 variants were tested for association with natural log-transformed Lp(a) levels using linear regression models adjusted for age, sex, genotype batch, and 20 principal components of genetic ancestry. After quality control, 131 independent variants were associated at genome-wide significance (P ≤ 5×10⁻⁸). In addition to validating previous associations at LPA, APOE, and CETP, we identified a novel variant at the APOH locus, encoding β2GPI (beta2-glycoprotein I). The APOH variant rs8178824 was associated with increased Lp(a) levels (β [95% CI] [ln nmol/L], 0.064 [0.047–0.081]; P = 2.8×10⁻¹³) and demonstrated a stronger effect after adjustment for variation at the LPA locus (β [95% CI] [ln nmol/L], 0.089 [0.076–0.10]; P = 3.8×10⁻⁴²). This association was replicated in a meta-analysis of 5465 European-ancestry individuals from the Framingham Offspring Study and Multi-Ethnic Study of Atherosclerosis (β [95% CI] [ln mg/dL], 0.16 [0.044–0.28]; P = 0.0071). Conclusions: In a large-scale genome-wide association study of Lp(a) levels, we identified APOH as a novel locus for Lp(a) in individuals of European ancestry. Additional studies are needed to determine the precise role of β2GPI in influencing Lp(a) levels as well as its potential as a therapeutic target.

