High-throughput and efficient multilocus genome-wide association study on longitudinal outcomes

2020 ◽  
Vol 36 (10) ◽  
pp. 3004-3010
Author(s):  
Huang Xu ◽  
Xiang Li ◽  
Yaning Yang ◽  
Yi Li ◽  
Jose Pinheiro ◽  
...  

Abstract Motivation With the emergence of high-dimensional genomic data, genetic analyses such as genome-wide association studies (GWAS) have played an important role in identifying disease-related genetic variants and novel treatments. Complex longitudinal phenotypes are commonly collected in medical studies. However, since limited analytical approaches are available for longitudinal traits, these data are often underutilized. In this article, we develop a high-throughput machine learning approach for multilocus GWAS using longitudinal traits by coupling Empirical Bayesian Estimates from mixed-effects modeling with a novel ℓ0-norm algorithm. Results Extensive simulations demonstrated that the proposed approach not only provided accurate selection of single nucleotide polymorphisms (SNPs) with comparable or higher power but also robustly controlled false positives. More importantly, this novel approach is highly scalable and could be more than 1000 times faster than recently published approaches, making genome-wide multilocus analysis of longitudinal traits possible. In addition, our proposed approach can simultaneously analyze millions of SNPs if the computer memory allows, thereby potentially allowing a true multilocus analysis for high-dimensional genomic data. With application to the data from the Alzheimer's Disease Neuroimaging Initiative, we confirmed that our approach can identify well-known SNPs associated with AD and was much faster than recently published approaches (≥6000 times). Availability and implementation The source code and the testing datasets are available at https://github.com/Myuan2019/EBE_APML0. Supplementary information Supplementary data are available at Bioinformatics online.

2018 ◽  
Vol 35 (14) ◽  
pp. 2512-2514 ◽  
Author(s):  
Bongsong Kim ◽  
Xinbin Dai ◽  
Wenchao Zhang ◽  
Zhaohong Zhuang ◽  
Darlene L Sanchez ◽  
...  

Abstract Summary We present GWASpro, a high-performance web server for the analyses of large-scale genome-wide association studies (GWAS). GWASpro was developed to provide data analyses for large-scale molecular genetic data, coupled with complex replicated experimental designs such as found in plant science investigations and to overcome the steep learning curves of existing GWAS software tools. GWASpro supports building complex design matrices, by which complex experimental designs that may include replications, treatments, locations and times, can be accounted for in the linear mixed model. GWASpro is optimized to handle GWAS data that may consist of up to 10 million markers and 10 000 samples from replicable lines or hybrids. GWASpro provides an interface that significantly reduces the learning curve for new GWAS investigators. Availability and implementation GWASpro is freely available at https://bioinfo.noble.org/GWASPRO. Supplementary information Supplementary data are available at Bioinformatics online.
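To make the idea of a design matrix for replicated experiments concrete, here is a minimal sketch of how factors such as location, treatment and replication could be coded as columns. It is not GWASpro's interface: the function name `build_design_matrix`, the record fields and the reference-level (treatment) coding are all assumptions chosen for illustration.

```python
def build_design_matrix(records, factors):
    """Return (column_names, rows): an intercept plus dummy columns per factor.

    The first (sorted) level of each factor is used as the reference level,
    so it gets no column of its own. This coding is an illustrative assumption.
    """
    columns = ["intercept"]
    levels = {}
    for factor in factors:
        factor_levels = sorted({record[factor] for record in records})
        levels[factor] = factor_levels
        columns += [f"{factor}={level}" for level in factor_levels[1:]]

    rows = []
    for record in records:
        row = [1]  # intercept
        for factor in factors:
            row += [1 if record[factor] == level else 0 for level in levels[factor][1:]]
        rows.append(row)
    return columns, rows


# Hypothetical field trial: two locations, two treatments, two replicates.
records = [
    {"location": "A", "treatment": "control", "rep": 1},
    {"location": "A", "treatment": "drought", "rep": 2},
    {"location": "B", "treatment": "control", "rep": 1},
]
columns, rows = build_design_matrix(records, ["location", "treatment", "rep"])
for row in rows:
    print(dict(zip(columns, row)))
```

In a linear mixed model, a matrix like this would typically carry the fixed effects, while genetic relatedness enters through a separate random-effect term.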


2020 ◽  
Vol 36 (15) ◽  
pp. 4374-4376
Author(s):  
Ninon Mounier ◽  
Zoltán Kutalik

Abstract Summary Increasing sample size is not the only strategy to improve discovery in Genome Wide Association Studies (GWASs) and we propose here an approach that leverages published studies of related traits to improve inference. Our Bayesian GWAS method derives informative prior effects by leveraging GWASs of related risk factors and their causal effect estimates on the focal trait using multivariable Mendelian randomization. These prior effects are combined with the observed effects to yield Bayes Factors, posterior and direct effects. The approach not only increases power, but also has the potential to dissect direct and indirect biological mechanisms. Availability and implementation The bGWAS package is freely available under a GPL-2 License, and can be accessed, along with user guides and tutorials, from https://github.com/n-mounier/bGWAS. Supplementary information Supplementary data are available at Bioinformatics online.
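The step of combining a prior effect with an observed effect can be illustrated with a textbook normal-normal calculation. The sketch below is a generic conjugate-normal example, not the bGWAS package's actual formulas or API: it assumes the true effect has prior N(prior_mean, prior_sd²), the observed estimate has standard error `se`, and the Bayes factor compares that prior against a point null of zero effect. All function and parameter names are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a normal distribution with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def posterior_effect(beta_hat, se, prior_mean, prior_sd):
    """Precision-weighted posterior mean and sd under a normal prior (assumed model)."""
    w_obs = 1 / se**2
    w_prior = 1 / prior_sd**2
    post_var = 1 / (w_obs + w_prior)
    post_mean = post_var * (w_obs * beta_hat + w_prior * prior_mean)
    return post_mean, math.sqrt(post_var)


def bayes_factor(beta_hat, se, prior_mean, prior_sd):
    """Marginal likelihood under the informative prior divided by that under beta = 0."""
    marginal_h1 = normal_pdf(beta_hat, prior_mean, prior_sd**2 + se**2)
    marginal_h0 = normal_pdf(beta_hat, 0.0, se**2)
    return marginal_h1 / marginal_h0


# Illustrative numbers only: an observed effect that agrees with its prior.
post_mean, post_sd = posterior_effect(beta_hat=0.2, se=0.1, prior_mean=0.0, prior_sd=0.1)
bf = bayes_factor(beta_hat=0.2, se=0.1, prior_mean=0.2, prior_sd=0.1)
print(post_mean, post_sd, bf)
```

Under this toy model, an observed effect that agrees with an informative prior yields a Bayes factor above 1, which is the intuition behind gaining power from related-trait GWASs.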


2021 ◽  
Author(s):  
Robert J. Loughnan ◽  
Alexey A. Shadrin ◽  
Oleksandr Frei ◽  
Dennis van der Meer ◽  
Weiqi Zhao ◽  
...  

Abstract Genome-Wide Association studies have typically been limited to single phenotypes, given that high dimensional phenotypes incur a large multiple comparisons burden: ~1 million tests across the genome times the number of phenotypes. Recent work demonstrates that a Multivariate Omnibus Statistic Test (MOSTest) is well powered to discover genomic effects distributed across multiple phenotypes. Applied to cortical brain MRI morphology measures, MOSTest has resulted in a drastic improvement in power to discover loci – a 10-fold increase in discovered loci compared to established approaches (min-P). One question that arises is how well these discovered loci replicate in independent data. Here we perform 10-times cross validation within 35,644 individuals from UK Biobank for imaging measures of cortical area, thickness and sulcal depth (>1,000 dimensionality for each). By deploying a replication method that aggregates discovered effects distributed across multiple phenotypes, termed PolyVertex Score (PVS), we demonstrate a higher replication yield and comparable replication rate of discovered loci for MOSTest (# replicated loci: 428-1,037, replication rate: 95-96%) in independent data when compared with the established min-P approach (# replicated loci: 30-71, replication rate: 70-84%). An out-of-sample generalization of discovered loci was conducted with a sample of 8,336 individuals from the Adolescent Brain Cognitive Development® (ABCD) study, who are on average 50 years younger than UK Biobank individuals. We observe a higher replication yield and comparable replication rate of MOSTest compared to min-P. This finding underscores the importance of using multivariate techniques for both discovery and replication of high dimensional phenotypes in Genome-Wide Association studies.
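The multiple-comparisons burden mentioned in this abstract can be put in numbers with a simple Bonferroni-style calculation. The sketch below is only arithmetic on the figures quoted above (about 1 million genomic tests, about 1,000 phenotype dimensions); the 0.05 family-wise error rate and the function name are assumptions for illustration, and this is not the MOSTest or min-P implementation.

```python
def bonferroni_threshold(n_variants: int, n_phenotypes: int, alpha: float = 0.05) -> float:
    """Per-test p-value threshold when correcting over every variant-phenotype pair."""
    n_tests = n_variants * n_phenotypes
    return alpha / n_tests


# One phenotype: the familiar genome-wide threshold of about 5e-8.
single = bonferroni_threshold(1_000_000, 1)

# About 1,000 phenotype dimensions: the threshold tightens a further 1,000-fold.
many = bonferroni_threshold(1_000_000, 1_000)

print(f"single phenotype: {single:.1e}")
print(f"1,000 phenotypes: {many:.1e}")
```

A threshold that shrinks with every added phenotype is what motivates a single omnibus test per variant in place of one test per variant-phenotype pair.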


2018 ◽  
Author(s):  
John A Lees ◽  
Marco Galardini ◽  
Stephen D Bentley ◽  
Jeffrey N Weiser ◽  
Jukka Corander

Abstract Summary Genome-wide association studies (GWAS) in microbes face different challenges to eukaryotes and have been addressed by a number of different methods. pyseer brings these techniques together in one package tailored to microbial GWAS, allows greater flexibility of the input data used, and adds new methods to interpret the association results. Availability and Implementation pyseer is written in python and is freely available at https://github.com/mgalardini/pyseer, or can be installed through pip. Documentation and a tutorial are available at http://[email protected] and [email protected] Supplementary information Supplementary data are available online.


2019 ◽  
Vol 35 (17) ◽  
pp. 3046-3054 ◽  
Author(s):  
Anastasia Gurinovich ◽  
Harold Bae ◽  
John J Farrell ◽  
Stacy L Andersen ◽  
Stefano Monti ◽  
...  

Abstract Motivation Over the last decade, more diverse populations have been included in genome-wide association studies. If a genetic variant has a varying effect on a phenotype in different populations, genome-wide association studies applied to a dataset as a whole may not pinpoint such differences. It is especially important to be able to identify population-specific effects of genetic variants in studies that would eventually lead to development of diagnostic tests or drug discovery. Results In this paper, we propose PopCluster: an algorithm to automatically discover subsets of individuals in which the genetic effects of a variant are statistically different. PopCluster provides a simple framework to directly analyze genotype data without prior knowledge of subjects’ ethnicities. PopCluster combines logistic regression modeling, principal component analysis, hierarchical clustering and a recursive bottom-up tree parsing procedure. The evaluation of PopCluster suggests that the algorithm has a stable low false positive rate (∼4%) and high true positive rate (>80%) in simulations with large differences in allele frequencies between cases and controls. Application of PopCluster to data from genetic studies of longevity discovers ethnicity-dependent heterogeneity in the association of rs3764814 (USP42) with the phenotype. Availability and implementation PopCluster was implemented using the R programming language, PLINK and Eigensoft software, and can be found at the following GitHub repository: https://github.com/gurinovich/PopCluster with instructions on its installation and usage. Supplementary information Supplementary data are available at Bioinformatics online.


2019 ◽  
Vol 35 (23) ◽  
pp. 4879-4885 ◽  
Author(s):  
Chao Ning ◽  
Dan Wang ◽  
Lei Zhou ◽  
Julong Wei ◽  
Yuanxin Liu ◽  
...  

Abstract Motivation Current dynamic phenotyping systems introduce time as an extra dimension to genome-wide association studies (GWAS), which helps to explore the mechanism of dynamic genetic control for complex longitudinal traits. However, existing methods for longitudinal GWAS either ignore the covariance among observations at different time points or encounter computational efficiency issues. Results We herein developed efficient genome-wide multivariate association algorithms for longitudinal data. In contrast to existing univariate linear mixed model analyses, the proposed method has improved statistical power for association detection and improved computational speed. In addition, the new method can analyze unbalanced longitudinal data with thousands of individuals and more than ten thousand records within a few hours. The corresponding time for balanced longitudinal data is just a few minutes. Availability and implementation A software package implementing the efficient algorithm, named GMA (https://github.com/chaoning/GMA), is freely available for interested users in relevant fields. Supplementary information Supplementary data are available at Bioinformatics online.


2012 ◽  
Vol 28 (15) ◽  
pp. 1957-1964 ◽  
Author(s):  
Attila Gyenesei ◽  
Jonathan Moody ◽  
Colin A.M. Semple ◽  
Chris S. Haley ◽  
Wen-Hua Wei
