Reconstructing Genotypes in Private Genomic Databases from Genetic Risk Scores

2020 ◽  
Author(s):  
Brooks Paige ◽  
James Bell ◽  
Aurélien Bellet ◽  
Adrià Gascón ◽  
Daphne Ezer

Abstract Some organisations like 23andMe and the UK Biobank have large genomic databases that they re-use for multiple different genome-wide association studies (GWAS). Even research studies that compile smaller genomic databases often utilise these databases to investigate many related traits. It is common for the study to report a genetic risk score (GRS) model for each trait within the publication. Here we show that under some circumstances, these GRS models can be used to recover the genetic variants of individuals in these genomic databases, which constitutes a reconstruction attack. In particular, if two GRS models are trained using a largely overlapping set of participants, then it is often possible to determine the genotype for each of the individuals who were used to train one GRS model, but not the other. We demonstrate this theoretically and experimentally by analysing the Cornell Dog Genome database. The accuracy of our reconstruction attack depends on how accurately we can estimate the rate of co-occurrence of pairs of SNPs within the private database, so if this aggregate information is ever released, it would drastically reduce the security of a private genomic database. Caution should be applied when using the same database for multiple analyses, especially when a small number of individuals are included or excluded from one part of the study.
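The intuition behind such an attack can be illustrated with a toy example. This is a minimal sketch and not the authors' method: it assumes both GRS models are ordinary least-squares fits without an intercept for the same trait, that the two training sets differ by exactly one participant, and that the attacker knows the SNP co-occurrence matrix X^T X of the shared participants exactly. All data and helper names (`solve`, `gram`, `fit`) are hypothetical. Under these assumptions, (X^T X) multiplied by the difference of the two effect-size vectors equals the extra participant's genotype times an unknown scalar.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination over exact Fractions."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(b[i])] for i, row in enumerate(A)]
    for col in range(n):
        # Pick a row with a nonzero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

def gram(X):
    """Return X^T X, the dosage-weighted co-occurrence matrix of SNP pairs."""
    p = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]

def fit(X, y):
    """Least-squares effect sizes (no intercept): solve (X^T X) beta = X^T y."""
    p = len(X[0])
    Xty = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(p)]
    return solve(gram(X), Xty)

# Hypothetical private data: allele dosages (0/1/2) at 3 SNPs and a trait value.
X_shared = [[0, 1, 2], [1, 0, 1], [2, 1, 0], [1, 2, 1], [0, 0, 1], [2, 2, 1]]
y_shared = [1, 3, 2, 5, 0, 4]
x_target, y_target = [2, 0, 1], 7  # the one extra participant

beta_without = fit(X_shared, y_shared)
beta_with = fit(X_shared + [x_target], y_shared + [y_target])

# Attacker's view: both published models plus the shared co-occurrence matrix.
A = gram(X_shared)
diff = [bw - bo for bw, bo in zip(beta_with, beta_without)]
# signal equals c * x_target, where c is the target's residual under beta_with.
signal = [sum(A[i][j] * diff[j] for j in range(3)) for i in range(3)]

# Resolve the unknown scale c by assuming the target's largest dosage is 2.
peak = max(signal, key=abs)
recovered = [int(2 * s / peak) for s in signal]
print(recovered)
```

In this toy setting the recovered vector matches `x_target`. With only an estimate of the co-occurrence matrix the recovery would be approximate, which is consistent with the abstract's remark that attack accuracy depends on how well SNP co-occurrence can be estimated.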

2020 ◽  
Vol 21 (16) ◽  
pp. 5835
Author(s):  
Maria-Ancuta Jurj ◽  
Mihail Buse ◽  
Alina-Andreea Zimta ◽  
Angelo Paradiso ◽  
Schuyler S. Korban ◽  
...  

Genome-wide association studies (GWAS) are useful in assessing and analyzing either differences or variations in DNA sequences across the human genome to detect genetic risk factors of diseases prevalent within a target population under study. The ultimate goal of GWAS is to predict either disease risk or disease progression by identifying genetic risk factors. These risk factors will define the biological basis of disease susceptibility for the purposes of developing innovative, preventative, and therapeutic strategies. As single nucleotide polymorphisms (SNPs) are often used in GWAS, their relevance for triple negative breast cancer (TNBC) will be assessed in this review. Furthermore, as there are different levels and patterns of linkage disequilibrium (LD) present within different human subpopulations, a plausible strategy to evaluate known SNPs associated with incidence of breast cancer in ethnically different patient cohorts will be presented and discussed. Additionally, a description of GWAS for TNBC will be presented, involving various identified SNPs correlated with miRNA sites to determine their efficacies on either prognosis or progression of TNBC in patients. Although GWAS have identified multiple common breast cancer susceptibility variants that individually would result in minor risks, it is their combined effects that would likely result in major risks. Thus, one approach to quantify synergistic effects of such common variants is to utilize polygenic risk scores. Therefore, studies utilizing predictive risk scores (PRSs) based on known breast cancer susceptibility SNPs will be evaluated. Such PRSs are potentially useful in improving stratification for screening, particularly when combining family history, other risk factors, and risk prediction models. In conclusion, although interpretation of the results from GWAS remains a challenge, the use of SNPs associated with TNBC may elucidate and better contextualize these studies.


2012 ◽  
Vol 32 (suppl_1) ◽  
Author(s):  
Themistocles L Assimes ◽  
Benjamin Goldstein ◽  

Genome wide association studies (GWAS) to date have identified 30 CAD susceptibility loci but the ability to use this information to improve risk prediction remains limited. A meta-analysis of the GWAS and Cardio Metabochip data produced by the CARDIoGRAM+C4D consortium representing 63,253 cases and 126,820 controls has identified 1885 SNPs passing a False Discovery Rate (FDR) threshold of 0.5%. We hypothesized that an expanded multi locus genetic risk score (GRS) incorporating genotype information at all loci below an FDR of 0.5% would perform better than a GRS restricted to 42 loci reaching genome wide significance and tested this hypothesis in subjects of European ancestry participating in the Atherosclerosis Risk in the Community (ARIC) study. Models testing the GRS were either minimally (age and sex) or fully adjusted for traditional risk factors (TRFs). The Figure shows the hazard ratio (HR) and 95% CI for incident events comparing each quintile of GRS to the middle quintile. The GRS including genotype information at all loci with an FDR of 0.5% noticeably improves risk prediction over the GRS restricted to genome wide significant loci in both the minimally and fully adjusted models based on several metrics including i) the HR per GRS quintile, ii) the HR per SD of the GRS, iii) the logistic regression pseudo R2, and iv) the c statistic. The HR per GRS quintile and per SD of GRS were all lower in the fully adjusted models compared to the respective minimally adjusted models but the reduction of the HR was more striking for the models that tested the more expansive GRS. These findings suggest that a larger proportion of novel GWAS CAD loci are mediating their effects through TRFs. While these findings demonstrate some progress in risk prediction using GWAS loci, both the limited and the expanded GRS continue to explain a relatively small proportion of the overall variance compared to TRFs. Thus, the clinical utility of a CAD GRS remains to be determined.


Thorax ◽  
2021 ◽  
pp. thoraxjnl-2020-215624
Author(s):  
Sinjini Sikdar ◽  
Annah B Wyss ◽  
Mi Kyeong Lee ◽  
Thanh T Hoang ◽  
Marie Richards ◽  
...  

Rationale: Genome-wide association studies (GWASs) have identified numerous loci associated with lower pulmonary function. Pulmonary function is strongly related to smoking and has also been associated with asthma and dust endotoxin. At the individual SNP level, genome-wide analyses of pulmonary function have not identified appreciable evidence for gene by environment interactions. Genetic Risk Scores (GRSs) may enhance power to identify gene–environment interactions, but studies are few. Methods: We analysed 2844 individuals of European ancestry with 1000 Genomes imputed GWAS data from a case–control study of adult asthma nested within a US agricultural cohort. Pulmonary function traits were FEV1, FVC and FEV1/FVC. Using data from a recent large meta-analysis of GWAS, we constructed a weighted GRS for each trait by combining the top (p value < 5×10^-9) genetic variants, after clumping based on distance (±250 kb) and linkage disequilibrium (r^2 = 0.5). We used linear regression, adjusting for relevant covariates, to estimate associations of each trait with its GRS and to assess interactions. Results: Each trait was highly significantly associated with its GRS (all three p values < 8.9×10^-8). The inverse association of the GRS with FEV1/FVC was stronger for current smokers (p_interaction = 0.017) or former smokers (p_interaction = 0.064) when compared with never smokers and among asthmatics compared with non-asthmatics (p_interaction = 0.053). No significant interactions were observed between any GRS and house dust endotoxin. Conclusions: Evaluation of interactions using GRSs supports a greater impact of increased genetic susceptibility on reduced pulmonary function in the presence of smoking or asthma.
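A weighted GRS of the kind mentioned in the Methods is commonly computed as a weighted sum of allele dosages. The following is a minimal sketch under that assumption; the abstract does not state the exact weighting scheme, and the function name `weighted_grs`, the weights, and the dosages are hypothetical illustration values.

```python
def weighted_grs(dosages, weights):
    """Weighted genetic risk score: sum over SNPs of effect weight times allele dosage."""
    if len(dosages) != len(weights):
        raise ValueError("one weight is required per SNP")
    return sum(w * d for w, d in zip(weights, dosages))

# Hypothetical per-SNP effect sizes and one person's effect-allele dosages (0, 1 or 2).
weights = [0.5, -0.25, 1.0]
dosages = [2, 1, 0]
score = weighted_grs(dosages, weights)
print(score)  # 0.75
```

In practice the weights would typically be the per-allele effect estimates taken from an external meta-analysis, restricted to the variants that survive clumping.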


Neurosurgery ◽  
2013 ◽  
Vol 73 (4) ◽  
pp. 705-708 ◽  
Author(s):  
Rachel Kleinloog ◽  
Femke N.G. van 't Hof ◽  
Franciscus J. Wolters ◽  
Ingeborg Rasing ◽  
Irene C. van der Schaaf ◽  
...  

Abstract BACKGROUND: Genetic risk factors for intracranial aneurysms may influence the size of aneurysms. OBJECTIVE: To assess the association between genetic risk factors and the size of aneurysms at the time of rupture. METHODS: Genotypes of 7 independent single-nucleotide polymorphisms (SNPs) of the 6 genetic risk loci identified in genome-wide association studies of patients with intracranial aneurysms were obtained from 700 Dutch patients with an aneurysmal subarachnoid hemorrhage (1997-2007) previously genotyped in the genome-wide association studies; 255 additional Dutch patients with an aneurysmal subarachnoid hemorrhage (2007-2011) were genotyped for these SNPs. Aneurysms were measured on computerized tomography angiography or digital subtraction angiography. The mean aneurysm size (with standard error) was compared between patients with and without a genetic risk factor by the use of linear regression. The association between SNPs and size was assessed for single SNPs and for the combined effect of SNPs by using a weighted genetic risk score. RESULTS: Single SNPs showed no association with aneurysm size, nor did the genetic risk score. CONCLUSION: The 6 genetic risk loci have no major influence on the size of aneurysms at the time of rupture. Because these risk loci explain no more than 5% of the genetic risk, other genetic factors for intracranial aneurysms may influence aneurysm size and thereby proneness to rupture.


2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Karoline Kuchenbaecker ◽  
Nikita Telkar ◽  
Theresa Reiker ◽  
Robin G. Walters ◽  
...  

Abstract Most genome-wide association studies are based on samples of European descent. We assess whether the genetic determinants of blood lipids, a major cardiovascular risk factor, are shared across populations. Genetic correlations for lipids between European-ancestry and Asian cohorts are not significantly different from 1. A genetic risk score based on LDL-cholesterol-associated loci has consistent effects on serum levels in samples from the UK, Uganda and Greece (r = 0.23–0.28, p < 1.9 × 10^-14). Overall, there is evidence of reproducibility for ~75% of the major lipid loci from European discovery studies, except triglyceride loci in the Ugandan samples (10% of loci). Individual transferable loci are identified using trans-ethnic colocalization. Ten of fourteen loci not transferable to the Ugandan population have pleiotropic associations with BMI in Europeans; none of the transferable loci do. The non-transferable loci might affect lipids by modifying food intake in environments rich in certain nutrients, which suggests a potential role for gene-environment interactions.


2017 ◽  
Vol 35 (6_suppl) ◽  
pp. 1-1
Author(s):  
Rosalind Eeles ◽  
Ali Amin Al Olama ◽  
Sonja Berndt ◽  
Fredrik Wiklund ◽  
David V Conti ◽  
...  

Background: Currently genome-wide association studies (GWAS) have identified over 100 prostate cancer (PrCa) susceptibility loci, capturing 33% of the PrCa familial relative risk (FRR) in Europeans. To identify further susceptibility variants, we conducted a PrCa GWAS, larger than previous studies, comprising ~49,000 cases and ~29,000 controls among individuals of European and Asian descent using the OncoArray, a platform consisting of a 260K GWAS backbone and 310K custom content selected from previous GWAS and fine-mapping studies of multiple cancers (http://epi.grants.cancer.gov/oncoarray/). Methods: Genotypes from the OncoArray were used to impute genotypes for ~70M variants using the October 2014 release of the 1000 genomes project as a reference, and then combined with several previous PrCa GWAS of European ancestry: UK stage 1 (1,906 cases/1,934 controls) and stage 2 (3,888 cases/3,956 controls); CaPS 1 (498 cases/502 controls) and CaPS 2 (1,483 cases/519 controls); BPC3 (2,137 cases/3,101 controls); NCI PEGASUS (4,622 cases/2,954 controls); and iCOGS (21,209 cases/20,440 controls). Risk analyses for overall PrCa risk, aggressive PrCa (several definitions defined by PrCa clinical characteristics), and Gleason score were performed. Logistic and linear regression summary statistics were meta-analysed using an inverse variance fixed effect approach. Results: We identified novel loci significantly associated (P < 5.0 × 10^-8) with overall PrCa (N = 65). Our novel findings include several missense variants, including a SNP in the ATM gene, a key member of the DNA repair pathway. When combined multiplicatively, the 65 novel PrCa loci identified here increase the captured heritability of PrCa, explaining 38.5% of the FRR when combining novel and previously identified PrCa loci. Conclusions: In risk stratification, men in the top 1% of the genetic risk score group have a relative risk of 5.6 fold for developing PrCa compared with the median risk group. These results will improve the utility of genetic risk scores for targeted screening and prevention for prostate cancer.
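The inverse variance fixed effect approach named in the Methods can be sketched with the standard textbook formulas: each study's effect estimate is weighted by the inverse of its squared standard error. This is an illustrative sketch only; the function name `fixed_effect_meta` and the input values are hypothetical and do not come from the study.

```python
import math

def fixed_effect_meta(betas, ses):
    """Pool per-study effect estimates with inverse-variance weights.

    Returns the pooled effect and its standard error.
    """
    if len(betas) != len(ses):
        raise ValueError("one standard error is required per estimate")
    weights = [1.0 / se ** 2 for se in ses]  # w_i = 1 / se_i^2
    total = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled_beta, pooled_se

# Hypothetical log-odds ratios for one SNP from two studies of equal precision.
betas = [0.10, 0.20]
ses = [0.05, 0.05]
pooled_beta, pooled_se = fixed_effect_meta(betas, ses)
print(round(pooled_beta, 4), round(pooled_se, 4))
```

With two equally precise studies the pooled effect is the simple average of the two estimates, and the pooled standard error shrinks by a factor of the square root of two.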


2019 ◽  
Vol 10 (1) ◽  
Author(s):  
Gad Abraham ◽  
Rainer Malik ◽  
Ekaterina Yonova-Doing ◽  
Agus Salim ◽  
Tingting Wang ◽  
...  

Abstract Recent genome-wide association studies in stroke have enabled the generation of genomic risk scores (GRS) but their predictive power has been modest compared to established stroke risk factors. Here, using a meta-scoring approach, we develop a metaGRS for ischaemic stroke (IS) and analyse this score in the UK Biobank (n = 395,393; 3075 IS events by age 75). The metaGRS hazard ratio for IS (1.26, 95% CI 1.22–1.31 per metaGRS standard deviation) doubles that of a previous GRS, identifying a subset of individuals at monogenic levels of risk: the top 0.25% of metaGRS have three-fold risk of IS. The metaGRS is similarly or more predictive compared to several risk factors, such as family history, blood pressure, body mass index, and smoking. We estimate the reductions needed in modifiable risk factors for individuals with different levels of genomic risk and suggest that, for individuals with high metaGRS, achieving risk factor levels recommended by current guidelines may be insufficient to mitigate risk.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Siew-Kee Low ◽  
Yoon Ming Chin ◽  
Hidemi Ito ◽  
Keitaro Matsuo ◽  
Chizu Tanikawa ◽  
...  

Abstract Genome-wide association studies (GWAS) have successfully identified about 70 genomic loci associated with breast cancer. Owing to the complexity of linkage disequilibrium and environmental exposures in different populations, it is essential to perform regional GWAS for better risk prediction. This study aimed to investigate the genetic architecture and to assess a common genetic risk model of breast cancer with 6,669 breast cancer patients and 21,930 female controls in the Japanese population. This GWAS identified 11 genomic loci that surpass the genome-wide significance threshold of P < 5.0 × 10^-8, with nine previously reported loci and two novel loci that include rs9862599 on 3q13.11 (ALCAM) and rs75286142 on 21q22.12 (CLIC6-RUNX1). A validation study was carried out with 981 breast cancer cases and 1,394 controls from the Aichi Cancer Center. Pathway analyses of GWAS signals identified association of dopamine receptor mediated signaling and protein amino acid deacetylation with breast cancer. A weighted genetic risk score showed that individuals who were categorized in the highest risk group are approximately 3.7 times more likely to develop breast cancer compared to individuals in the lowest risk group. This well-powered GWAS is a representative study to identify SNPs that are associated with breast cancer in the Japanese population.


Depression ◽  
2019 ◽  
pp. 33-50
Author(s):  
Thorhildur Halldorsdottir ◽  
Hildur Ýr Hilmarsdottir

Research on the genetic underpinnings of depression has rapidly advanced in the past decade. This field of research provides a promising avenue toward improving the diagnosis of, prevention of, and treatment for this devastating disorder. The goal of this chapter is to review the main genetic and gene-by-environment interaction findings on depression. We first describe family and twin studies used to empirically study the familial aggregation of depression. Second, we provide a review of the genome-wide association studies (GWAS) published to date. Building on GWAS findings, we will discuss the use of polygenic risk scores in predicting depression. We also review the most robust candidate gene studies and gene-by-environment interaction studies. Finally, we discuss the clinical implications of the findings and promising strategies for making further progress within this field.


2014 ◽  
Vol 99 (9) ◽  
pp. E1814-E1818 ◽  
Author(s):  
Yann C. Klimentidis ◽  
Nathan E. Wineinger ◽  
Ana I. Vazquez ◽  
Gustavo de los Campos

Context/Rationale: Meta-analyses of genome-wide association studies have identified many single-nucleotide polymorphisms associated with various metabolic and cardiovascular traits, offering us the opportunity to learn about and capitalize on the links between cardiometabolic traits and type 2 diabetes (T2D). Design: In multiple datasets comprising over 30 000 individuals and 3 ethnic/racial groups, we calculated 17 genetic risk scores (GRSs) for glycemic, anthropometric, lipid, hemodynamic, and other traits, based on the results of recent trait-specific meta-analyses of genome-wide association studies, and examined associations with T2D risk. Using a training-testing procedure, we evaluated whether additional GRSs could contribute to risk prediction. Results: In European Americans, we find that GRSs for T2D, fasting glucose, fasting insulin, and body mass index are associated with T2D risk. In African Americans, GRSs for T2D, fasting insulin, and waist-to-hip ratio are associated with T2D. In Hispanic Americans, GRSs for T2D and body mass index are associated with T2D. We observed a trend among European Americans suggesting that genetic risk for hyperlipidemia is inversely associated with T2D risk. The use of additional GRSs resulted in only small changes in prediction accuracy in multiple independent validation datasets. Conclusions: The analysis of multiple GRSs can shed light on T2D etiology and how it varies across ethnic/racial groups. Our findings using multiple GRSs are consistent with what is known about the differences in T2D pathogenesis across racial/ethnic groups. However, further work is needed to understand the putative inverse correlation of genetic risk for hyperlipidemia and T2D risk and to develop ethnic-specific GRSs.

