Estimating cross-population LD decay and its effect on trans-ethnic polygenic scores

2020 ◽  
Author(s):  
Davide Piffer

Differing patterns of linkage disequilibrium (LD) across populations (LD decay) potentially reduce the trans-ethnic validity of GWAS results. In this study, I compute a measure of LD decay to test the hypothesis that it mediates the difference in polygenic scores computed using GWAS summary statistics between Europeans and Africans. There was a significant PGS difference between CEU and YRI (50.312% vs. 47.995%, p = 0.001). In total, there were 1,680,781 SNPs in LD with 2,596 GWAS SNPs within a 500 kb window. There was no correlation between the measure of LD decay and population differences in polygenic scores of educational attainment. This study provides no evidence that LD decay affects the trans-ethnic predictive validity of polygenic scores.

2018 ◽  
Author(s):  
Timothy Shin Heng Mak ◽  
Robert Milan Porsch ◽  
Shing Wan Choi ◽  
Pak Chung Sham

Abstract. Polygenic scores (PGS) are estimated scores representing the genetic tendency of an individual for a disease or trait and have become an indispensable tool in a variety of analyses. Typically they are a linear combination of the genotypes of a large number of SNPs, with the weights calculated from an external source, such as summary statistics from large meta-analyses. Recently cohorts with genetic data have become very large, such that it would be wasteful not to use the raw data in constructing PGS. Making use of raw data in calculating PGS, however, presents us with problems of overfitting. Here we discuss the essence of overfitting as applied in PGS calculations and highlight the difference between overfitting due to the overlap between the target and the discovery data (OTD), and overfitting due to the overlap between the target and the validation data (OTV). We propose two methods, cross prediction and split validation, to overcome OTD and OTV respectively. Using these two methods, PGS can be calculated using raw data without overfitting. We show that PGSs thus calculated have better predictive power than those using summary statistics alone for six phenotypes in the UK Biobank data.
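
The "linear combination of genotypes" described in this abstract can be illustrated with a minimal sketch. It assumes genotypes are coded as allele counts (0, 1, or 2) and that per-SNP weights are already available; the function name `polygenic_score` and the toy numbers are hypothetical and are not taken from the paper.

```python
def polygenic_score(genotypes, weights):
    """Return the weighted sum of allele counts for one individual.

    genotypes: allele counts per SNP (assumed coding: 0, 1, or 2)
    weights:   per-SNP effect sizes from an external source (toy values here)
    """
    if len(genotypes) != len(weights):
        raise ValueError("genotypes and weights must have the same length")
    return sum(g * w for g, w in zip(genotypes, weights))


# Toy data for illustration only, not real effect sizes.
weights = [0.5, -0.25, 0.125]
person = [2, 1, 0]
score = polygenic_score(person, weights)
```

This sketch covers only the scoring step; it does not address how the weights are estimated or the overfitting issues (OTD, OTV) that the abstract discusses.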


2020 ◽  
Author(s):  
John E. McGeary ◽  
Chelsie Benca-Bachman ◽  
Victoria Risner ◽  
Christopher G Beevers ◽  
Brandon Gibb ◽  
...  

Twin studies indicate that 30-40% of the disease liability for depression can be attributed to genetic differences. Here, we assess the explanatory ability of polygenic scores (PGS) based on broad- (PGSBD) and clinical- (PGSMDD) depression summary statistics from the UK Biobank using independent cohorts of adults (N=210; 100% European Ancestry) and children (N=728; 70% European Ancestry) who have been extensively phenotyped for depression and related neurocognitive phenotypes. PGS associations with depression severity and diagnosis were generally modest, and larger in adults than children. Polygenic prediction of depression-related phenotypes was mixed and varied by PGS. Higher PGSBD, in adults, was associated with a higher likelihood of having suicidal ideation, increased brooding and anhedonia, and lower levels of cognitive reappraisal; PGSMDD was positively associated with brooding and negatively related to cognitive reappraisal. Overall, PGS based on both broad and clinical depression phenotypes have modest utility in adult and child samples of depression.


2016 ◽  
Vol 283 (1826) ◽  
pp. 20152340 ◽  
Author(s):  
Chih-Ming Hung ◽  
Sergei V. Drovetski ◽  
Robert M. Zink

Although mitochondrial DNA (mtDNA) has long been used for assessing genetic variation within and between populations, its workhorse role in phylogeography has been criticized owing to its single-locus nature. The only choice for testing mtDNA results is to survey nuclear loci, which brings into contrast the difference in locus effective size and coalescence times. Thus, it remains unclear how erroneous mtDNA-based estimates of species history might be, especially for evolutionary events in the recent past. To test the robustness of mtDNA and nuclear sequences in phylogeography, we provide one of the largest paired comparisons of summary statistics and demographic parameters estimated from mitochondrial, five Z-linked and 10 autosomal genes of 30 avian species co-distributed in the Caucasus and Europe. The results suggest that mtDNA is robust in estimating inter-population divergence but not in intra-population diversity, which is sensitive to population size change. Here, we provide empirical evidence showing that mtDNA was more likely to detect population divergence than any other single locus owing to its smaller Ne and thus faster coalescent time. Therefore, at least in birds, numerous studies that have based their inferences of phylogeographic patterns solely on mtDNA should not be readily dismissed.


2016 ◽  
Author(s):  
Timothy Shin Heng Mak ◽  
Robert Milan Porsch ◽  
Shing Wan Choi ◽  
Xueya Zhou ◽  
Pak Chung Sham

Abstract. Polygenic scores (PGS) summarize the genetic contribution of a person’s genotype to a disease or phenotype. They can be used to group participants into different risk categories for diseases, and are also used as covariates in epidemiological analyses. A number of possible ways of calculating polygenic scores have been proposed, and recently there has been much interest in methods that incorporate information available in published summary statistics. As there is no inherent information on linkage disequilibrium (LD) in summary statistics, a pertinent question is how we can make use of LD information available elsewhere to supplement such analyses. To answer this question we propose a method for constructing PGS using summary statistics and a reference panel in a penalized regression framework, which we call lassosum. We also propose a general method, pseudovalidation, for choosing the value of the tuning parameter in the absence of validation data. In our simulations, we showed that pseudovalidation often resulted in prediction accuracy that was comparable to using a dataset with validation phenotype and was clearly superior to the conservative option of setting the tuning parameter of lassosum to its lowest value. We also showed that lassosum achieved better prediction accuracy than simple clumping and p-value thresholding in almost all scenarios. It was also substantially faster and more accurate than the recently proposed LDpred.


2020 ◽  
Vol 2 (2) ◽  
Author(s):  
Qing Cheng ◽  
Yi Yang ◽  
Xingjie Shi ◽  
Kar-Fu Yeung ◽  
Can Yang ◽  
...  

Abstract. The proliferation of genome-wide association studies (GWAS) has prompted the use of two-sample Mendelian randomization (MR) with genetic variants as instrumental variables (IVs) for drawing reliable causal relationships between health risk factors and disease outcomes. However, the unique features of GWAS demand that MR methods account for both linkage disequilibrium (LD) and ubiquitously existing horizontal pleiotropy among complex traits, which is the phenomenon wherein a variant affects the outcome through mechanisms other than exclusively through the exposure. Therefore, statistical methods that fail to consider LD and horizontal pleiotropy can lead to biased estimates and false-positive causal relationships. To overcome these limitations, we proposed a probabilistic model for MR analysis (MR-LDP) that identifies the causal effects between risk factors and disease outcomes using GWAS summary statistics in the presence of LD while properly accounting for horizontal pleiotropy among genetic variants, and we developed a computationally efficient algorithm to make the causal inference. We then conducted comprehensive simulation studies to demonstrate the advantages of MR-LDP over the existing methods. Moreover, we used two real exposure–outcome pairs to validate the results from MR-LDP compared with alternative methods, showing that our method is more efficient in using all-instrumental variants in LD. By further applying MR-LDP to lipid traits and body mass index (BMI) as risk factors for complex diseases, we identified multiple pairs of significant causal relationships, including a protective effect of high-density lipoprotein cholesterol on peripheral vascular disease and a positive causal effect of BMI on hemorrhoids.


2017 ◽  
Vol 20 (3) ◽  
pp. 257-259 ◽  
Author(s):  
Julian Hecker ◽  
Anna Maaser ◽  
Dmitry Prokopenko ◽  
Heide Loehlein Fier ◽  
Christoph Lange

VEGAS (versatile gene-based association study) is a popular methodological framework to perform gene-based tests based on summary statistics from single-variant analyses. The approach incorporates linkage disequilibrium information from reference panels to account for the correlation of test statistics. The gene-based test can utilize three different types of tests. In 2015, the improved framework VEGAS2, using more detailed reference panels, was published. Both versions provide user-friendly web- and offline-based tools for the analysis. However, the implementation of the popular top-percentage test is erroneous in both versions. The p values provided by VEGAS2 are deflated/anti-conservative. Based on real data examples, we demonstrate that this can substantially increase the rate of false-positive findings and can lead to inconsistencies between different test options. We also provide code that allows the user of VEGAS to compute correct p values.


2012 ◽  
Vol 30 (5_suppl) ◽  
pp. 41-41 ◽  
Author(s):  
Daniella J. Perlroth ◽  
Stephen F. Thompson ◽  
Yesenia Luna ◽  
Dana P. Goldman ◽  
Essy Mozaffari ◽  
...  

Background: Use of androgen deprivation therapy (ADT) and chemotherapy in men with metastatic prostate cancer (mPC) may differ across regions in community practice. The extent of variation could indicate whether men with mPC have appropriate access to effective treatments. Methods: We identified 16,024 men diagnosed with mPC in the Surveillance, Epidemiology, and End Results (SEER) database from 2000-2005 linked to their Medicare claims. Patients were excluded if they had a second cancer or disenrolled from Medicare Parts A or B (n=6,155), or failed to initiate therapy with ADT (n=3,400). We identified demographic and clinical information from SEER and treatments and comorbidities from J-codes and ICD-9 codes in the Medicare claims. We used regression models to estimate the probability of advancement to chemotherapy, the time from diagnosis to first ADT use, and time from first ADT to chemotherapy. Then the patient-level predicted results from these models were used to generate summary statistics by hospital service area (HSA). Results: There were 6,469 patients remaining after exclusion who were treated with ADT, and 1,198 of those received chemotherapy (19%). The median age was 76 years old, most were white (77%), married (62%), and 50% had one other major comorbidity (most frequent was diabetes, 21%). Men who were younger, married, with fewer comorbidities, and higher Gleason scores were statistically more likely to both receive chemotherapy and use it earlier. After adjusting for clinical and sociodemographic factors, the average time to ADT by referral region was 2.7 months but varied from 1.3 to 5.6 months; probability of progression to chemotherapy averaged 19% but varied from 6% to 30%, and the time from first ADT to chemotherapy averaged 19.7 months but varied from 12.9 to 25.7 months. The difference in time to ADT between regions in the 10th and 90th percentiles of use was 2.6 months, whereas for chemotherapy initiation, it was 12.4 months.
Conclusions: Our results suggest that living in different parts of the country has a substantial impact on how clinically similar patients are treated. There was substantial variation across regions in use of and time to initiation of chemotherapy for men with mPC, but not in ADT use.


1983 ◽  
Vol 102 (1) ◽  
pp. 57-61 ◽  
Author(s):  
H. Allannic ◽  
R. Fauchet ◽  
Y. Lorcy ◽  
H. Phengsavath ◽  
M. Gueguen ◽  
...  

Abstract. Patients with Graves' disease were phenotyped for properdin factor B (Bf) and glyoxalase, which are coded for by genes mapping close to the HLA region on the sixth chromosome. Frequency data were analysed in relation to HLA-A, -B and -DR typing data. Diagnosis of Graves' disease was based on the usual criteria including elevated T3 and T4 levels and free T4 index and a homogeneous thyroid scan. Ninety-four patients with Graves' disease were phenotyped for properdin factor B (Bf) and 37 for red cell glyoxalase (GLO). HLA-A, -B and -DR antigens were typed in 94 patients using a lymphocyte microcytotoxicity assay. The frequency distribution of Bf and GLO alleles showed no significant differences from control subjects. This finding contrasts with the reports of an increased frequency of BfF1 in insulin-dependent diabetes mellitus. The difference between the two diseases, which are both associated with an increased frequency of the antigen combination B8-DR3, is accounted for by linkage disequilibrium between B18 and BfF1.

