Polygenic risk prediction and SNCA haplotype analysis in a Latino Parkinson's disease cohort
Background: Large-scale Parkinson's disease (PD) genome-wide association studies (GWAS) and meta-analyses have, until recently, been conducted only in subjects of European ancestry. Consequently, polygenic risk scores (PRS) constructed using PD GWAS data are likely to be less predictive when applied to non-European cohorts.

Methods: Using GWAS data from Nalls et al. 2019, we constructed a PD PRS for a Latino PD cohort (LARGE-PD) and tested it for association with PD status. We validated PRS performance by testing the PD PRS in an independent cohort of Latino PD patients and by repeating the PRS analysis in LARGE-PD with the addition of 440 external Peruvian controls. To explore the global distribution of the PD PRS, we used 1000 Genomes Project (1KGP) and Peruvian Genome Project (PGP) data to estimate PD risk allele frequencies. We also tested SNCA haplotypes for association with PD risk using logistic regression in LARGE-PD and in a European-ancestry PD cohort from the International Parkinson Disease Genomics Consortium (IPDGC).

Results: The GWAS-significant PD PRS had an area under the receiver operating characteristic curve (AUC) of 0.668 (95% CI: 0.640-0.695) and explained 2.8% of the phenotypic variance in LARGE-PD, as determined via pseudo-R². Including the external Peruvian controls attenuated this result, lowering the AUC to 0.632 (95% CI: 0.607-0.657). In 1KGP Latinos, the PD PRS was biased by ancestry. At the SNCA locus, haplotypes differ by ancestry. Ancestry-specific SNCA haplotypes are significantly associated with PD status in both LARGE-PD and the IPDGC cohort (p < 0.05). Apart from rs356182, these haplotypes share as little as 14% of their variants.

Conclusion: The PD PRS has potential for PD risk prediction in Latinos, but variability caused by admixture patterns and bias in a PD PRS calculated using only European-ancestry data limit its utility. Including diverse subjects can help elucidate PD risk loci and improve risk prediction in non-European cohorts. In the case of the SNCA locus, by leveraging a Latino cohort, we provide orthogonal evidence for rs356182 causality.
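For readers unfamiliar with the two quantities reported above, the following is a minimal sketch of how a PRS and its AUC are commonly computed. It is not the pipeline used in this study. It assumes that each individual's score is the sum of risk-allele dosages weighted by per-variant GWAS effect sizes, and that the AUC is the probability that a randomly chosen case scores higher than a randomly chosen control. The function names, weights, and genotypes are all hypothetical and chosen only for illustration.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of risk-allele dosages (0, 1, or 2) for one individual."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def auc(case_scores, control_scores):
    """Fraction of case/control pairs in which the case scores higher.

    Ties count as half a win. This pairwise definition is equivalent to
    the area under the ROC curve.
    """
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy, made-up inputs for illustration only; NOT data from the study.
weights = [0.25, 0.5, 0.125]  # hypothetical per-variant effect sizes (log odds ratios)
cases = [[2, 1, 0], [1, 2, 1], [2, 2, 2]]      # risk-allele dosages per case
controls = [[0, 0, 1], [1, 0, 0], [2, 1, 1]]   # risk-allele dosages per control

case_scores = [polygenic_score(g, weights) for g in cases]
control_scores = [polygenic_score(g, weights) for g in controls]
toy_auc = auc(case_scores, control_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

Because the weights in this kind of score come from a European-ancestry GWAS, any difference in risk allele frequencies between ancestries shifts the score distribution, which is one way the ancestry bias described in the Results can arise.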