Controlling for off-target genetic effects using polygenic scores improves the power of genome-wide association studies
ABSTRACT
Ongoing increases in the size of human genotype and phenotype collections offer the promise of improved understanding of the genetics of complex diseases. In addition to the biological insights that can be gained from the nature of the variants that contribute to the genetic component of complex trait variability, these data have brought forward the prospect of predicting complex traits and the risk of complex genetic diseases from genotype data. Optimal realization of these objectives requires ongoing methodological development aimed at identifying true trait-associated variants and accurately predicting phenotype from genotype. These methods must be computationally efficient in order to remain tractable in the context of high variant densities and very large sample sizes. Here we show that the power of the linear mixed models widely used for GWAS can be increased significantly by modeling off-target genetic effects using polygenic scores derived from SNPs that are not on the same chromosome as the target SNP. Using simulated and real data, we find that this can result in a substantial increase in the number of variants passing genome-wide significance thresholds. This increase in power to detect trait-associated variants also translates into an increase in the accuracy with which the resulting polygenic score predicts the phenotype from genotype data. Our results suggest that advances in methods for phenotype prediction can be exploited to improve the control of off-target genetic effects, leading to more accurate GWAS results and further improvements in phenotype prediction.
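The central idea, adjusting the test of a target SNP for a polygenic score built only from SNPs on other chromosomes, can be illustrated with a minimal simulation. This is a sketch under strong simplifying assumptions and not the authors' pipeline: it replaces the linear mixed model with ordinary least squares, derives the score weights by ridge regression in a separate simulated training sample, and treats all SNPs as independent. All names and parameter values (`loco_pgs`, `snp_t_stat`, the sample sizes, effect sizes and ridge penalty) are hypothetical and chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, m = 5000, 5000, 200
chrom = np.repeat(np.arange(1, 5), m // 4)  # 4 simulated chromosomes, 50 SNPs each
target, target_chrom = 0, 1                 # SNP 0 lies on chromosome 1

def simulate_genotypes(n):
    """Independent SNPs with allele frequency 0.3, standardized per column."""
    g = rng.binomial(2, 0.3, size=(n, m)).astype(float)
    return (g - g.mean(axis=0)) / g.std(axis=0)

# Assumed architecture: every SNP has a small effect, the target SNP a larger one.
effects = rng.normal(0.0, np.sqrt(0.5 / m), size=m)
effects[target] = 0.2

G_train, G_test = simulate_genotypes(n_train), simulate_genotypes(n_test)
y_train = G_train @ effects + rng.normal(0.0, np.sqrt(0.5), n_train)
y_test = G_test @ effects + rng.normal(0.0, np.sqrt(0.5), n_test)

def loco_pgs(G_fit, y_fit, G_apply, exclude_chrom, ridge=100.0):
    """Polygenic score that uses only SNPs NOT on exclude_chrom."""
    keep = chrom != exclude_chrom
    X = G_fit[:, keep]
    # Ridge weights: (X'X + ridge * I)^-1 X'(y - mean(y))
    w = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]),
                        X.T @ (y_fit - y_fit.mean()))
    return G_apply[:, keep] @ w

def snp_t_stat(g, y, covariate=None):
    """t-statistic for g in an OLS fit of y on [intercept, g, covariate]."""
    cols = [np.ones_like(y), g]
    if covariate is not None:
        cols.append(covariate)
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se

pgs = loco_pgs(G_train, y_train, G_test, target_chrom)
t_raw = snp_t_stat(G_test[:, target], y_test)        # no off-target adjustment
t_adj = snp_t_stat(G_test[:, target], y_test, pgs)   # adjusted for off-chromosome PGS
print(f"t without PGS covariate: {t_raw:.2f}")
print(f"t with PGS covariate:    {t_adj:.2f}")
```

In this toy setting the score absorbs trait variance contributed by the other chromosomes, so the residual variance in the test of the target SNP shrinks and its test statistic tends to grow. Excluding the target SNP's own chromosome keeps the target SNP, and anything in linkage disequilibrium with it, out of the covariate.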