A Multiple-Trait Bayesian Variable Selection Regression Method for Integrating Phenotypic Causal Networks in Genome-Wide Association Studies

2020 ◽  
Vol 10 (12) ◽  
pp. 4439-4448
Author(s):  
Zigui Wang ◽  
Deborah Chapman ◽  
Gota Morota ◽  
Hao Cheng

Bayesian regression methods that incorporate different mixture priors for marker effects are used in multi-trait genomic prediction. These methods can also be extended to genome-wide association studies (GWAS). In multiple-trait GWAS, incorporating the underlying causal structures among traits is essential for comprehensively understanding the relationship between genotypes and traits of interest. Therefore, we develop a GWAS methodology, SEM-Bayesian alphabet, which, by applying the structural equation model (SEM), can be used to incorporate causal structures into multi-trait Bayesian regression methods. SEM-Bayesian alphabet provides a more comprehensive understanding of the genotype-phenotype mapping than multi-trait GWAS by performing GWAS based on indirect, direct and overall marker effects. The superior performance of SEM-Bayesian alphabet was demonstrated by comparing its GWAS results with other similar multi-trait GWAS methods on real and simulated data. The software tool JWAS offers open-source routines to perform these analyses.
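The distinction between direct, indirect and overall marker effects can be illustrated with a small numerical sketch. It assumes the standard recursive SEM form y = Λy + Xβ + e, under which the overall effect of a marker is (I − Λ)⁻¹ times its direct effect and the indirect effect is the difference between the two. The matrix `Lambda`, the vector `direct`, and the function name are hypothetical illustration values, not taken from the paper, and this is not the JWAS API.

```python
import numpy as np


def decompose_marker_effects(Lambda, direct):
    """Split one marker's effects on t traits into overall and indirect parts.

    Lambda : (t, t) matrix of assumed causal path coefficients among traits;
             Lambda[i, j] is the effect of trait j on trait i.
    direct : (t,) vector of the marker's direct effects on each trait.
    """
    Lambda = np.asarray(Lambda, dtype=float)
    direct = np.asarray(direct, dtype=float)
    n_traits = Lambda.shape[0]
    # Overall effects solve (I - Lambda) @ overall = direct.
    overall = np.linalg.solve(np.eye(n_traits) - Lambda, direct)
    # Indirect effects are whatever reaches a trait through other traits.
    indirect = overall - direct
    return overall, indirect


# Hypothetical causal chain: trait 1 -> trait 2 (0.5), trait 2 -> trait 3 (0.4).
Lambda = np.array([
    [0.0, 0.0, 0.0],
    [0.5, 0.0, 0.0],
    [0.0, 0.4, 0.0],
])
# Hypothetical direct effects of a single marker on traits 1, 2 and 3.
direct = np.array([1.0, 0.0, 0.2])

overall, indirect = decompose_marker_effects(Lambda, direct)
print("overall: ", overall)
print("indirect:", indirect)
```

In this toy case the marker has no direct effect on trait 2, yet its overall effect on trait 2 is nonzero because it acts through trait 1. A GWAS based only on direct effects would miss that signal, which is the kind of information the abstract attributes to analysing all three effect types.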

2019 ◽  
Author(s):  
Zigui Wang ◽  
Deborah Chapman ◽  
Gota Morota ◽  
Hao Cheng

Bayesian regression methods that incorporate different mixture priors for marker effects are used in multi-trait genomic prediction. These methods can also be extended to genome-wide association studies (GWAS). In multiple-trait GWAS, incorporating the underlying causal structures among traits is essential for comprehensively understanding the relationship between genotypes and traits of interest. Therefore, we develop a GWAS methodology, SEM-BayesCΠ, which, by applying the structural equation model (SEM), can be used to incorporate causal structures into a multi-trait Bayesian regression method using mixture priors. The performance of SEM-BayesCΠ was demonstrated by comparing its GWAS results with those from multi-trait BayesCΠ. Through the inductive causation (IC) algorithm, three potential causal structures were inferred using a 0.9 highest posterior density (HPD) interval. SEM-BayesCΠ provides a more comprehensive understanding of the genotype-phenotype mapping than multi-trait BayesCΠ by performing GWAS based on indirect, direct and overall marker effects. The software tool JWAS offers open-source routines to perform these analyses.


Animals ◽  
2020 ◽  
Vol 10 (11) ◽  
pp. 2009
Author(s):  
Ellen Lai ◽  
Alexa L. Danner ◽  
Thomas R. Famula ◽  
Anita M. Oberbauer

Digital dermatitis (DD) causes lameness in dairy cattle. To detect the quantitative trait loci (QTL) associated with DD, genome-wide association studies (GWAS) were performed using high-density single nucleotide polymorphism (SNP) genotypes and binary case/control, quantitative (average number of FW per hoof trimming record) and recurrent (cases with ≥2 DD episodes vs. controls) phenotypes from cows across four dairies (controls n = 129 vs. FW n = 85). Linear mixed model (LMM) and random forest (RF) approaches identified the top SNPs, which were used as predictors in Bayesian regression models to assess the SNP predictive value. The LMM and RF analyses identified QTL regions containing candidate genes on Bos taurus autosome (BTA) 2 for the binary and recurrent phenotypes and BTA7 and 20 for the quantitative phenotype that related to epidermal integrity, immune function, and wound healing. Although larger sample sizes are necessary to reaffirm these small effect loci amidst a strong environmental effect, the sample cohort used in this study was sufficient for estimating SNP effects with a high predictive value.


Author(s):  
Yingjie Guo ◽  
Chenxi Wu ◽  
Zhian Yuan ◽  
Yansu Wang ◽  
Zhen Liang ◽  
...  

Among the many statistical methods for identifying gene–gene interactions in qualitative genome-wide association studies, gene-based methods are both statistically powerful and biologically interpretable. However, their detection power is limited by the assumptions they make about the association between traits and single nucleotide polymorphisms. We therefore propose a gene-based method, GGInt-XGBoost, built on XGBoost. Assuming that the log odds ratio of a disease trait is additive when a pair of genes does not interact, the difference in error between XGBoost models fitted with and without the additive constraint can indicate a gene–gene interaction; we then used a permutation-based statistical test to assess this difference and to provide a p-value for the significance of the interaction. Experimental results on both simulated and real data showed that our approach detected gene–gene interactions better than previous methods.


2018 ◽  
Author(s):  
Jianan Zhana ◽  
Jessica van Setten ◽  
Jennifer Brody ◽  
Brenton Swenson ◽  
Anne M. Butler ◽  
...  

Motivation: Genome-wide association studies have had great success in identifying human genetic variants associated with disease, disease risk factors, and other biomedical phenotypes. Many variants are associated with multiple traits, even after correction for trait-trait correlation. Discovering subsets of variants associated with a shared subset of phenotypes could help reveal disease mechanisms, suggest new therapeutic options, and increase the power to detect additional variants with similar pattern of associations. Here we introduce two methods based on a Bayesian framework, SNP And Pleiotropic PHenotype Organization (SAPPHO), one modeling independent phenotypes (SAPPHO-I) and the other incorporating a full phenotype covariance structure (SAPPHO-C). These two methods learn patterns of pleiotropy from genotype and phenotype data, using identified associations to discover additional associations with shared patterns.
Results: The SAPPHO methods, along with other recent approaches for pleiotropic association tests, were assessed using data from the Atherosclerotic Risk in Communities (ARIC) study of 8,000 individuals, whose gold-standard associations were provided by meta-analysis of 40,000 to 100,000 individuals from the CHARGE consortium. Using power to detect gold-standard associations at genome-wide significance (0.05 family-wise error rate) as a metric, SAPPHO performed best. The SAPPHO methods were also uniquely able to select the most significant variants in a parsimonious model, excluding other less likely variants within a linkage disequilibrium block. For meta-analysis, the SAPPHO methods implement summary modes that use sufficient statistics rather than full phenotype and genotype data. Meta-analysis applied to CHARGE detected 16 additional associations to the gold-standard loci, as well as 124 novel loci, at 0.05 false discovery rate. Reasons for the superior performance were explored by performing simulations over a range of scenarios describing different genetic architectures. With SAPPHO we were able to learn genetic structures that were hidden using the traditional univariate tests.
Availability: https://bitbucket.org/baderlab/fast/wiki/Home. SAPPHO software is available under the GNU General Public License, v2.


2019 ◽  
Vol 5 (1) ◽  
Author(s):  
Benazir Rowe ◽  
Xiangning Chen ◽  
Zuoheng Wang ◽  
Jingchun Chen ◽  
Amei Amei

Genome-wide association studies (GWAS) have identified over 100 loci associated with schizophrenia. Most of these studies test genetic variants for association one at a time. In this study, we performed a GWAS of the molecular genetics of schizophrenia (MGS) dataset with 5334 subjects using the multivariate Bayesian variable selection (BVS) method Posterior Inference via Model Averaging and Subset Selection (piMASS) and compared our results with the previous univariate analysis of the MGS dataset. We showed that piMASS can improve the power of detecting schizophrenia-associated SNPs, potentially leading to new discoveries from existing data without increasing the sample size. We tested SNPs in groups to allow for local additive effects and used a permutation test to determine statistical significance so that our results could be compared with those of the univariate method. The previous univariate analysis of the MGS dataset revealed no genome-wide significant loci. Using the same dataset, we identified a single region that exceeded genome-wide significance. The result was replicated using an independent Swedish Schizophrenia Case–Control Study (SSCCS) dataset. Based on the SZGR 2.0 database, we found 63 SNPs from the best-performing regions that map to 27 genes known to be associated with schizophrenia. Overall, we demonstrated that piMASS could discover association signals that would otherwise need a much larger sample size. An important implication of our study is that reanalyzing published datasets with BVS methods like piMASS might provide more power to discover new risk variants for many diseases without new sample collection, ascertainment, and genotyping.


F1000Research ◽  
2014 ◽  
Vol 3 ◽  
pp. 200 ◽  
Author(s):  
Diego Fabregat-Traver ◽  
Sodbo Zh. Sharapov ◽  
Caroline Hayward ◽  
Igor Rudan ◽  
Harry Campbell ◽  
...  

To raise the power of genome-wide association studies (GWAS) and avoid false-positive results in structured populations, one can rely on mixed-model-based tests. When large samples are used, and when multiple traits are to be studied in the ’omics’ context, this approach becomes computationally challenging. Here we consider the problem of mixed-model-based GWAS for an arbitrary number of traits, and demonstrate that different computational algorithms are optimal for the single-trait and multiple-trait scenarios. We implement these optimal algorithms in a high-performance computing framework that uses state-of-the-art linear algebra kernels, incorporates optimizations, and avoids redundant computations, increasing throughput while reducing memory usage and energy consumption. We show that, compared to existing libraries, our algorithms and software achieve considerable speed-ups. The OmicABEL software described in this manuscript is available under the GNU GPL v. 3 license as part of the GenABEL project for statistical genomics at http://www.genabel.org/packages/OmicABEL.

