Pleiotropic mapping for genome-wide association studies using group variable selection

2019 ◽  
Vol 67 ◽  
pp. S134-S135
Author(s):  
B. Liquet
2016 ◽  
Vol 32 (10) ◽  
pp. 1493-1501 ◽  
Author(s):  
Christian Benner ◽  
Chris C.A. Spencer ◽  
Aki S. Havulinna ◽  
Veikko Salomaa ◽  
Samuli Ripatti ◽  
...  

2017 ◽  
Author(s):  
Haohan Wang ◽  
Bryon Aragam ◽  
Eric P. Xing

Abstract: A fundamental challenge in modern datasets of ever-increasing dimensionality is variable selection, which has attracted renewed interest due to the growth of biological and medical datasets with complex, non-i.i.d. structures. Naïvely applying classical variable selection methods such as the Lasso to such datasets may lead to a large number of false discoveries. Motivated by genome-wide association studies in genetics, we study the problem of variable selection for datasets arising from multiple subpopulations, when this underlying population structure is unknown to the researcher. We propose a unified framework for sparse variable selection that adaptively corrects for population structure via a low-rank linear mixed model. Most importantly, the proposed method does not require prior knowledge of sample structure in the data and adaptively selects a covariance structure of the correct complexity. Through extensive experiments, we illustrate the effectiveness of this framework over existing methods. Further, we test our method on three different genomic datasets from plants, mice, and humans, and discuss the knowledge we discover with our method.


2019 ◽  
Vol 5 (1) ◽  
Author(s):  
Benazir Rowe ◽  
Xiangning Chen ◽  
Zuoheng Wang ◽  
Jingchun Chen ◽  
Amei Amei

Abstract: Genome-wide association studies (GWAS) have identified over 100 loci associated with schizophrenia. Most of these studies test genetic variants for association one at a time. In this study, we performed a GWAS of the Molecular Genetics of Schizophrenia (MGS) dataset, with 5334 subjects, using the multivariate Bayesian variable selection (BVS) method Posterior Inference via Model Averaging and Subset Selection (piMASS), and compared our results with the previous univariate analysis of the MGS dataset. We showed that piMASS can improve the power of detecting schizophrenia-associated SNPs, potentially leading to new discoveries from existing data without increasing the sample size. We tested SNPs in groups to allow for local additive effects and used a permutation test to determine statistical significance, so that our results could be compared with those of the univariate method. The previous univariate analysis of the MGS dataset revealed no genome-wide significant loci. Using the same dataset, we identified a single region that exceeded genome-wide significance. The result was replicated using an independent Swedish Schizophrenia Case–Control Study (SSCCS) dataset. Based on the SZGR 2.0 database, we found 63 SNPs from the best-performing regions that are mapped to 27 genes known to be associated with schizophrenia. Overall, we demonstrated that piMASS could discover association signals that would otherwise need a much larger sample size. An important implication of our study is that reanalyzing published datasets with BVS methods like piMASS might have more power to discover new risk variants for many diseases without new sample collection, ascertainment, and genotyping.
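
The abstract above mentions testing SNPs in groups and using a permutation test for significance. As a rough illustration of that general idea only, the sketch below permutes the phenotype to build a null distribution for a region-level statistic. It does not implement piMASS: the statistic (sum of squared marginal correlations), the function names `region_stat` and `permutation_pvalue`, and the simulated data are all assumptions made for this example.

```python
import numpy as np

def region_stat(G, y):
    """Stand-in region statistic (an assumption, not the piMASS statistic):
    sum of squared marginal correlations between each SNP in the region and y."""
    Gc = G - G.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Gc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom[denom == 0] = np.inf  # monomorphic SNPs contribute zero
    return float((((Gc.T @ yc) / denom) ** 2).sum())

def permutation_pvalue(G, y, n_perm=999, seed=0):
    """Permute the phenotype to break any SNP-phenotype association while
    keeping the correlation among SNPs in the region intact."""
    rng = np.random.default_rng(seed)
    observed = region_stat(G, y)
    exceed = sum(
        region_stat(G, rng.permutation(y)) >= observed for _ in range(n_perm)
    )
    # Add-one correction so the estimated p-value is never exactly zero.
    return (exceed + 1) / (n_perm + 1)

# Simulated case-control data (hypothetical): 400 subjects, 10 SNPs in a
# region, with only the first SNP influencing disease liability.
rng = np.random.default_rng(1)
G = rng.binomial(2, 0.4, size=(400, 10)).astype(float)
liability = 1.0 * G[:, 0] + rng.normal(size=400)
y = (liability > np.median(liability)).astype(float)

p_region = permutation_pvalue(G, y)
print(f"permutation p-value for the region: {p_region:.3f}")
```

With 999 permutations the smallest attainable p-value is 0.001, so reaching a genome-wide threshold this way would need far more permutations than this toy example uses.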


2014 ◽  
Vol 13s7 ◽  
pp. CIN.S16350 ◽  
Author(s):  
Sungyeon Hong ◽  
Yongkang Kim ◽  
Taesung Park

Variable selection methods play an important role in high-dimensional statistical modeling and analysis. Computational cost and estimation accuracy are the two main concerns for statistical inference from ultrahigh-dimensional data. In particular, genome-wide association studies (GWAS), which focus on identifying single nucleotide polymorphisms (SNPs) associated with a disease of interest, have produced ultrahigh-dimensional data. Numerous methods have been proposed to handle GWAS data. Most statistical methods have adopted a two-stage approach: pre-screening for dimension reduction, followed by variable selection to identify causal SNPs. The pre-screening step ranks SNPs by their P-values or by the absolute values of their regression coefficients in single-SNP analysis. Penalized regressions, such as the ridge, lasso, adaptive lasso, and elastic-net regressions, are commonly used for the variable selection step. In this paper, we investigate which combination of pre-screening method and penalized regression performs best on a quantitative phenotype using two real GWAS datasets.
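
The two-stage approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the pre-screen ranks SNPs by absolute marginal correlation (a stand-in for single-SNP P-values), the second stage is a plain lasso fitted by coordinate descent, and the function names `prescreen` and `lasso_cd`, the penalty value, and the simulated genotypes are all hypothetical.

```python
import numpy as np

def prescreen(X, y, keep):
    """Stage 1: rank SNPs by absolute marginal correlation with the
    phenotype and return the (sorted) column indices of the top `keep`."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom[denom == 0] = np.inf  # monomorphic SNPs get a score of zero
    score = np.abs(Xc.T @ yc) / denom
    return np.sort(np.argsort(score)[::-1][:keep])

def lasso_cd(X, y, lam, n_iter=200):
    """Stage 2: lasso by cyclic coordinate descent on standardized columns,
    minimizing (1/2n) * ||y - Xb||^2 + lam * ||b||_1."""
    n, p = X.shape
    sd = X.std(axis=0)
    sd[sd == 0] = 1.0  # avoid division by zero for constant columns
    Xs = (X - X.mean(axis=0)) / sd
    r = y - y.mean()  # residual for the current coefficients (all zero)
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            r += Xs[:, j] * beta[j]  # remove SNP j's current contribution
            rho = Xs[:, j] @ r / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0)  # soft-threshold
            r -= Xs[:, j] * beta[j]  # add the updated contribution back
    return beta

# Simulated quantitative phenotype (hypothetical): 200 subjects, 1000 SNPs,
# three of which have a true effect.
rng = np.random.default_rng(0)
n, p = 200, 1000
X = rng.binomial(2, 0.3, size=(n, p)).astype(float)
causal = np.array([10, 200, 750])
y = X[:, causal] @ np.array([1.0, -1.0, 1.5]) + rng.normal(size=n)

kept = prescreen(X, y, keep=50)          # reduce 1000 SNPs to 50
beta = lasso_cd(X[:, kept], y, lam=0.2)  # select among the survivors
selected = kept[beta != 0]
print("SNPs selected after both stages:", selected)
```

Swapping the soft-threshold update for a ridge or elastic-net update would give the other penalized regressions named in the abstract; in practice the penalty would be chosen by cross-validation rather than fixed.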

