GWASpro: a high-performance genome-wide association analysis server

Bongsong Kim; Xinbin Dai; Wenchao Zhang; Zhaohong Zhuang; Darlene L Sanchez; Thomas Lübberstedt; Yun Kang; Michael K Udvardi; William D Beavis; Shizhong Xu; Patrick X Zhao

doi:10.1093/bioinformatics/bty989

Bioinformatics ◽

10.1093/bioinformatics/bty989 ◽

2018 ◽

Vol 35 (14) ◽

pp. 2512-2514 ◽

Cited By ~ 4

Author(s):

Bongsong Kim ◽

Xinbin Dai ◽

Wenchao Zhang ◽

Zhaohong Zhuang ◽

Darlene L Sanchez ◽

...

Keyword(s):

High Performance ◽

Large Scale ◽

Linear Mixed Model ◽

Association Studies ◽

Learning Curves ◽

Experimental Designs ◽

Genome Wide Association ◽

Supplementary Information ◽

Genome Wide Association Studies ◽

Genome Wide

Abstract. Summary: We present GWASpro, a high-performance web server for the analysis of large-scale genome-wide association studies (GWAS). GWASpro was developed to provide data analyses for large-scale molecular genetic data coupled with complex replicated experimental designs, such as those found in plant science investigations, and to overcome the steep learning curves of existing GWAS software tools. GWASpro supports building complex design matrices, by which complex experimental designs that may include replications, treatments, locations and times can be accounted for in the linear mixed model. GWASpro is optimized to handle GWAS data that may consist of up to 10 million markers and 10 000 samples from replicable lines or hybrids. GWASpro provides an interface that significantly reduces the learning curve for new GWAS investigators. Availability and implementation: GWASpro is freely available at https://bioinfo.noble.org/GWASPRO. Supplementary information: Supplementary data are available at Bioinformatics online.
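As an illustration of what "building design matrices" for a replicated experiment can mean, the following is a minimal sketch assuming the standard linear mixed model form y = Xb + Zu + e, with fixed effects for location and replicate and a random effect per line. The records, the factor names and the `one_hot` helper are hypothetical and are not taken from GWASpro itself.

```python
import numpy as np

def one_hot(labels):
    """Return (matrix, levels): one indicator column per distinct label."""
    levels = sorted(set(labels))
    index = {level: j for j, level in enumerate(levels)}
    matrix = np.zeros((len(labels), len(levels)))
    for i, label in enumerate(labels):
        matrix[i, index[label]] = 1.0
    return matrix, levels

# Hypothetical records: 2 lines x 2 locations x 2 replicates (made-up design).
lines = ["A"] * 4 + ["B"] * 4
locations = ["loc1", "loc1", "loc2", "loc2"] * 2
reps = ["r1", "r2"] * 4

# Fixed-effect design matrix X: intercept plus location and replicate
# indicators, dropping the first level of each factor to avoid collinearity.
loc_matrix, loc_levels = one_hot(locations)
rep_matrix, rep_levels = one_hot(reps)
X = np.hstack([np.ones((len(lines), 1)), loc_matrix[:, 1:], rep_matrix[:, 1:]])

# Random-effect incidence matrix Z: maps each observation to its line,
# so replicated observations of one line share one genetic effect.
Z, line_levels = one_hot(lines)

print(X.shape, Z.shape)
```

Here each row of Z contains a single 1, which is how repeated observations of the same line are tied to one genetic effect; additional factors such as treatment or time would add further columns to X in the same way.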

Efficient multivariate analysis algorithms for longitudinal genome-wide association studies

Bioinformatics ◽

10.1093/bioinformatics/btz304 ◽

2019 ◽

Vol 35 (23) ◽

pp. 4879-4885 ◽

Cited By ~ 4

Author(s):

Chao Ning ◽

Dan Wang ◽

Lei Zhou ◽

Julong Wei ◽

Yuanxin Liu ◽

...

Keyword(s):

Longitudinal Data ◽

Software Package ◽

Mixed Model ◽

Linear Mixed Model ◽

Association Studies ◽

Genome Wide Association ◽

Supplementary Information ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Computational Speed

Abstract. Motivation: Current dynamic phenotyping systems introduce time as an extra dimension to genome-wide association studies (GWAS), which helps to explore the mechanism of dynamic genetic control for complex longitudinal traits. However, existing methods for longitudinal GWAS either ignore the covariance among observations at different time points or encounter computational efficiency issues. Results: We herein developed efficient genome-wide multivariate association algorithms for longitudinal data. In contrast to existing univariate linear mixed model analyses, the proposed method has improved statistical power for association detection and improved computational speed. In addition, the new method can analyze unbalanced longitudinal data with thousands of individuals and more than ten thousand records within a few hours. The corresponding time for balanced longitudinal data is just a few minutes. Availability and implementation: A software package implementing the efficient algorithm, named GMA (https://github.com/chaoning/GMA), is freely available to interested users in relevant fields. Supplementary information: Supplementary data are available at Bioinformatics online.

Current knowledge in hypertension genetics: mosaic theory, candidate genes and genome-wide association studies

Arterial’naya Gipertenziya (Arterial Hypertension) ◽

10.18705/1607-419x-2020-26-5-490-500 ◽

2020 ◽

Vol 26 (5) ◽

pp. 490-500

Author(s):

A. O. Konradi

Keyword(s):

Candidate Genes ◽

High Performance ◽

New Technologies ◽

Current Knowledge ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Modern Approach ◽

Genome Wide

The article reviews monogenic forms of hypertension, data on the role of heredity in essential hypertension and candidate genes, as well as genome-wide association studies. The modern approach to the role of genetics is driven by the implementation of new technologies and their productivity. The high performance of new technologies such as genome-wide association studies provides data for better knowledge of the genetic markers of hypertension. The major goal of research nowadays is to reveal the molecular pathways of blood pressure regulation, which can help to move from a population-level to an individual-level understanding of pathogenesis and treatment targets.

GWAS-Flow: A GPU accelerated framework for efficient permutation based genome-wide association studies

10.1101/783100 ◽

2019 ◽

Cited By ~ 2

Author(s):

Jan A. Freudenthal ◽

Markus J. Ankenbrand ◽

Dominik G. Grimm ◽

Arthur Korte

Keyword(s):

Complex Traits ◽

Mixed Model ◽

Linear Mixed Model ◽

Association Studies ◽

Large Datasets ◽

Genome Wide Association ◽

Small Data ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Non Gaussian

Abstract. Motivation: Genome-wide association studies (GWAS) are one of the most commonly used methods to detect associations between complex traits and genomic polymorphisms. As both genotyping and phenotyping of large populations have become easier, typical modern GWAS have to cope with massive amounts of data. Thus, the computational demand for these analyses has grown remarkably during the last decades. This is especially true if one wants to implement permutation-based significance thresholds instead of using the naïve Bonferroni threshold. Permutation-based methods have the advantage of providing an adjusted multiple hypothesis correction threshold that takes the underlying phenotypic distribution into account and thus removes the need to find the correct transformation for non-Gaussian phenotypes. To enable efficient analyses of large datasets and the possibility to compute permutation-based significance thresholds, we used the machine learning framework TensorFlow to develop a linear mixed model (GWAS-Flow) that can make use of the available CPU or GPU infrastructure to decrease the time of the analyses, especially for large datasets. Results: We were able to show that our application GWAS-Flow outperforms custom GWAS scripts in terms of speed without losing accuracy. Apart from p-values, GWAS-Flow also computes summary statistics, such as the effect size and its standard error for each individual marker. The CPU-based version is the default choice for small data, while the GPU-based version of GWAS-Flow is especially suited for the analyses of big data. Availability: GWAS-Flow is freely available on GitHub (https://github.com/Joyvalley/GWAS_Flow) and is released under the terms of the MIT License.
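The idea of a permutation-based significance threshold can be sketched in a few lines. The example below is a simplified illustration, not GWAS-Flow's implementation: it uses plain marker-phenotype correlation instead of a linear mixed model (no kinship correction), NumPy instead of TensorFlow, and simulated data. The function names and parameters are hypothetical.

```python
import numpy as np

def max_abs_correlation(genotypes, phenotype):
    """Largest absolute marker-phenotype correlation across all markers."""
    g = genotypes - genotypes.mean(axis=0)
    y = phenotype - phenotype.mean()
    denom = np.sqrt((g ** 2).sum(axis=0) * (y ** 2).sum())
    return np.max(np.abs(g.T @ y) / denom)

def permutation_threshold(genotypes, phenotype, n_perm=200, alpha=0.05, seed=0):
    """Estimate a genome-wide threshold from the permutation null.

    Each permutation shuffles the phenotype, which breaks any true
    association, and records the strongest statistic over all markers.
    The (1 - alpha) quantile of those maxima is the threshold.
    """
    rng = np.random.default_rng(seed)
    null_max = np.array([
        max_abs_correlation(genotypes, rng.permutation(phenotype))
        for _ in range(n_perm)
    ])
    return np.quantile(null_max, 1.0 - alpha)

# Simulated data: 100 individuals, 500 markers coded 0/1/2, and a
# deliberately non-Gaussian (exponential) phenotype.
rng = np.random.default_rng(1)
genotypes = rng.integers(0, 3, size=(100, 500)).astype(float)
phenotype = rng.exponential(size=100)

threshold = permutation_threshold(genotypes, phenotype)
print(f"|r| threshold at alpha=0.05: {threshold:.3f}")
```

Because the null distribution is built from the observed phenotype values themselves, the threshold adapts to their distribution, which is the property the abstract contrasts with a fixed Bonferroni cutoff. The cost is one full genome scan per permutation, which is why the computational speed of each scan matters.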

Variable selection in heterogeneous datasets: A truncated-rank sparse linear mixed model with applications to genome-wide association studies

Methods ◽

10.1016/j.ymeth.2018.04.021 ◽

2018 ◽

Vol 145 ◽

pp. 2-9 ◽

Cited By ~ 1

Author(s):

Haohan Wang ◽

Bryon Aragam ◽

Eric P. Xing

Keyword(s):

Variable Selection ◽

Mixed Model ◽

Linear Mixed Model ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Heterogeneous Datasets

bGWAS: an R package to perform Bayesian genome wide association studies

Bioinformatics ◽

10.1093/bioinformatics/btaa549 ◽

2020 ◽

Vol 36 (15) ◽

pp. 4374-4376

Author(s):

Ninon Mounier ◽

Zoltán Kutalik

Keyword(s):

Mendelian Randomization ◽

Causal Effect ◽

Association Studies ◽

R Package ◽

Genome Wide Association ◽

Supplementary Information ◽

Genome Wide Association Studies ◽

Biological Mechanisms ◽

Genome Wide ◽

Related Risk

Abstract. Summary: Increasing sample size is not the only strategy to improve discovery in Genome Wide Association Studies (GWASs), and we propose here an approach that leverages published studies of related traits to improve inference. Our Bayesian GWAS method derives informative prior effects by leveraging GWASs of related risk factors and their causal effect estimates on the focal trait using multivariable Mendelian randomization. These prior effects are combined with the observed effects to yield Bayes Factors, posterior effects and direct effects. The approach not only increases power, but also has the potential to dissect direct and indirect biological mechanisms. Availability and implementation: The bGWAS package is freely available under a GPL-2 License and can be accessed, alongside user guides and tutorials, from https://github.com/n-mounier/bGWAS. Supplementary information: Supplementary data are available at Bioinformatics online.

Genome-Wide Association Studies Reveal Susceptibility Loci for Digital Dermatitis in Holstein Cattle

Animals ◽

10.3390/ani10112009 ◽

2020 ◽

Vol 10 (11) ◽

pp. 2009

Author(s):

Ellen Lai ◽

Alexa L. Danner ◽

Thomas R. Famula ◽

Anita M. Oberbauer

Keyword(s):

Predictive Value ◽

Mixed Model ◽

Linear Mixed Model ◽

Bos Taurus ◽

Association Studies ◽

Bayesian Regression ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Digital Dermatitis ◽

Genome Wide

Digital dermatitis (DD) causes lameness in dairy cattle. To detect the quantitative trait loci (QTL) associated with DD, genome-wide association studies (GWAS) were performed using high-density single nucleotide polymorphism (SNP) genotypes and binary case/control, quantitative (average number of FW per hoof trimming record) and recurrent (cases with ≥2 DD episodes vs. controls) phenotypes from cows across four dairies (controls n = 129 vs. FW n = 85). Linear mixed model (LMM) and random forest (RF) approaches identified the top SNPs, which were used as predictors in Bayesian regression models to assess the SNP predictive value. The LMM and RF analyses identified QTL regions containing candidate genes on Bos taurus autosome (BTA) 2 for the binary and recurrent phenotypes and BTA7 and 20 for the quantitative phenotype that related to epidermal integrity, immune function, and wound healing. Although larger sample sizes are necessary to reaffirm these small effect loci amidst a strong environmental effect, the sample cohort used in this study was sufficient for estimating SNP effects with a high predictive value.

Secure large-scale genome-wide association studies using homomorphic encryption

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1918257117 ◽

2020 ◽

Vol 117 (21) ◽

pp. 11608-11613 ◽

Cited By ~ 1

Author(s):

Marcelo Blatt ◽

Alexander Gusev ◽

Yuriy Polyakov ◽

Shafi Goldwasser

Keyword(s):

Large Scale ◽

Homomorphic Encryption ◽

Association Studies ◽

Genome Wide Association ◽

Single Server ◽

Genome Wide Association Studies ◽

Nucleotide Polymorphisms ◽

User Interactions ◽

Individual Level ◽

Genome Wide

Genome-wide association studies (GWASs) seek to identify genetic variants associated with a trait, and have been a powerful approach for understanding complex diseases. A critical challenge for GWASs has been the dependence on individual-level data that typically have strict privacy requirements, creating an urgent need for methods that preserve the individual-level privacy of participants. Here, we present a privacy-preserving framework based on several advances in homomorphic encryption and demonstrate that it can perform an accurate GWAS analysis for a real dataset of more than 25,000 individuals, keeping all individual data encrypted and requiring no user interactions. Our extrapolations show that it can evaluate GWASs of 100,000 individuals and 500,000 single-nucleotide polymorphisms (SNPs) in 5.6 h on a single server node (or in 11 min on 31 server nodes running in parallel). Our performance results are more than one order of magnitude faster than prior state-of-the-art results using secure multiparty computation, which requires continuous user interactions, with the accuracy of both solutions being similar. Our homomorphic encryption advances can also be applied to other domains where large-scale statistical analyses over encrypted data are needed.
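The two extrapolated runtimes quoted in this abstract are consistent with near-linear scaling across server nodes. A quick check, assuming the work divides evenly with no communication overhead (an idealization that the abstract does not state), is sketched below; the function name is hypothetical.

```python
def ideal_parallel_minutes(single_node_hours, n_nodes):
    """Runtime in minutes if the work splits evenly across nodes."""
    return single_node_hours * 60.0 / n_nodes

# 5.6 h on one node, spread over 31 nodes.
minutes = ideal_parallel_minutes(5.6, 31)
print(f"{minutes:.1f} min")  # prints "10.8 min"
```

The result, about 10.8 minutes, rounds to the 11 minutes reported for 31 nodes.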

A critical evaluation of results from genome-wide association studies of micronutrient status and their utility in the practice of precision nutrition

British Journal Of Nutrition ◽

10.1017/s0007114519001119 ◽

2019 ◽

Vol 122 (2) ◽

pp. 121-130 ◽

Cited By ~ 2

Author(s):

Marie-Joe Dib ◽

Ruan Elliott ◽

Kourosh R. Ahmadi

Keyword(s):

Large Scale ◽

Association Studies ◽

Critical Evaluation ◽

Water Soluble ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Micronutrient Deficiencies ◽

Micronutrient Status ◽

Genome Wide ◽

Fat Soluble Vitamins

Abstract. Rapid advances in ‘omics’ technologies have paved the way forward to an era where more ‘precise’ approaches – ‘precision’ nutrition – which leverage data on genetic variability alongside the traditional indices, have been put forth as the state-of-the-art solution to redress the effects of malnutrition across the life course. We purport that this inference is premature and that it is imperative to first review and critique the existing evidence from large-scale epidemiological findings. We set out to provide a critical evaluation of findings from genome-wide association studies (GWAS) in the roadmap to precision nutrition, focusing on GWAS of micronutrient disposition. We found that a large number of loci associated with biomarkers of micronutrient status have been identified. Mean estimates of heritability of micronutrient status ranged between 20 and 35 % for minerals, 56–59 % for water-soluble and 30–70 % for fat-soluble vitamins. With some exceptions, the majority of the identified genetic variants explained little of the overall variance in status for each micronutrient, ranging between 1·3 and 8 % (minerals), <0·1–12 % (water-soluble) and 1·7–2·3 % (fat-soluble) for vitamins. However, GWAS have provided some novel insight into mechanisms that underpin variability in micronutrient status. Our findings highlight obvious gaps that need to be addressed if the full scope of precision nutrition is ever to be realised, including research aimed at (i) dissecting the genetic basis of micronutrient deficiencies or ‘response’ to intake/supplementation; (ii) identifying trans-ethnic and ethnic-specific effects; and (iii) identifying gene–nutrient interactions for the purpose of unravelling molecular ‘behaviour’ in a range of environmental contexts.

Variable selection in heterogeneous datasets: A truncated-rank sparse linear mixed model with applications to genome-wide association studies

2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) ◽

10.1109/bibm.2017.8217687 ◽

2017 ◽

Cited By ~ 9

Author(s):

Haohan Wang ◽

Bryon Aragam ◽

Eric P. Xing

Keyword(s):

Variable Selection ◽

Mixed Model ◽

Linear Mixed Model ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Heterogeneous Datasets

High performance computing enabling exhaustive analysis of higher order single nucleotide polymorphism interaction in Genome Wide Association Studies

Health Information Science and Systems ◽

10.1186/2047-2501-3-s1-s3 ◽

2015 ◽

Vol 3 (S1) ◽

Cited By ~ 14

Author(s):

Benjamin Goudey ◽

Mani Abedini ◽

John L Hopper ◽

Michael Inouye ◽

Enes Makalic ◽

...

Keyword(s):

Single Nucleotide Polymorphism ◽

High Performance Computing ◽

High Performance ◽

Association Studies ◽

Genome Wide Association ◽

Genome Wide Association Studies ◽

Nucleotide Polymorphism ◽

Single Nucleotide ◽

Genome Wide ◽

Performance Computing
