Re-Identification of Individuals in Genomic Data-Sharing Beacons via Allele Inference

2017 ◽  
Author(s):  
Nora von Thenen ◽  
Erman Ayday ◽  
A. Ercument Cicek

Abstract: Genomic datasets are often associated with sensitive phenotypes. Therefore, the leak of membership information is a major privacy risk. Genomic beacons aim to provide a secure, easy-to-implement, and standardized interface for data sharing by only allowing yes/no queries on the presence of specific alleles in the dataset. Previously deemed secure against re-identification attacks, beacons were shown to be vulnerable despite their stringent policy. Recent studies have demonstrated that it is possible to determine whether the victim is in the dataset by repeatedly querying the beacon for his/her single nucleotide polymorphisms (SNPs). In this work, we propose a novel re-identification attack and show that the privacy risk is more serious than previously thought. Using the proposed attack, even if the victim systematically hides informative SNPs (i.e., SNPs with very low minor allele frequency, MAF), it is possible to infer the alleles at positions of interest as well as the beacon query results with very high confidence. Our method is based on the fact that alleles at different loci are not necessarily independent. We use linkage disequilibrium and a high-order Markov chain-based algorithm for the inference. We show that in a simulated beacon with 65 individuals from the CEU population, we can infer membership of individuals with 95% confidence with only 5 queries, even when SNPs with MAF less than 0.05 are hidden. This means that we need less than 0.5% of the number of queries that existing works require to determine beacon membership under the same conditions. We further show that countermeasures such as hiding certain parts of the genome or setting a query budget for the user would fail to protect the privacy of the participants under our adversary model.
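The two ideas in this abstract, yes/no beacon queries and inference of a hidden allele from a correlated neighbouring locus, can be illustrated with a minimal Python sketch. It is not the authors' method: the abstract describes a high-order Markov chain combined with linkage disequilibrium, whereas this sketch conditions on a single neighbouring locus. The class `ToyBeacon`, the helper names, and the position labels are all hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional, Sequence, Tuple


class ToyBeacon:
    """Toy beacon: reveals only whether any member carries a given allele."""

    def __init__(self, genomes: Iterable[dict]):
        # Each genome maps a (hypothetical) position label to the allele carried there.
        self._present = {(pos, allele) for g in genomes for pos, allele in g.items()}

    def query(self, position: str, allele: str) -> bool:
        # Yes/no answer only; no counts or frequencies are exposed.
        return (position, allele) in self._present


def yes_fraction(beacon: ToyBeacon, victim: dict, positions: Sequence[str]) -> float:
    """Fraction of the victim's alleles at `positions` that the beacon confirms."""
    if not positions:
        raise ValueError("at least one position is required")
    answers = [beacon.query(p, victim[p]) for p in positions]
    return sum(answers) / len(answers)


def infer_hidden_allele(
    reference: Sequence[str], known_idx: int, known_allele: str, hidden_idx: int
) -> Optional[Tuple[str, float]]:
    """First-order simplification (an assumption of this sketch): among reference
    haplotypes carrying `known_allele` at `known_idx`, return the most common
    allele at `hidden_idx` together with its conditional frequency."""
    counts = Counter(h[hidden_idx] for h in reference if h[known_idx] == known_allele)
    if not counts:
        return None  # the known allele never occurs in the reference panel
    allele, n = counts.most_common(1)[0]
    return allele, n / sum(counts.values())
```

A member of the beacon yields a `yes_fraction` of 1.0 in this toy setting, while an outsider's fraction drops with every allele the beacon does not contain. An attacker who cannot observe a hidden SNP directly could call `infer_hidden_allele` on a public reference panel and then query the beacon for the inferred allele.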

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
William Stone ◽  
Abraham Nunes ◽  
Kazufumi Akiyama ◽  
Nirmala Akula ◽  
Raffaella Ardau ◽  
...  

Abstract: Predicting lithium response prior to treatment could both expedite therapy and avoid exposure to side effects. Since lithium responsiveness may be heritable, its predictability based on genomic data is of interest. We thus evaluate the degree to which lithium response can be predicted with a machine learning (ML) approach using genomic data. Using the largest existing genomic dataset in the lithium response literature (n = 2210 across 14 international sites; 29% responders), we evaluated the degree to which lithium response could be predicted based on 47,465 genotyped single nucleotide polymorphisms using a supervised ML approach. Under appropriate cross-validation procedures, lithium response could be predicted to above-chance levels in two constituent sites (Halifax, Cohen’s kappa 0.15, 95% confidence interval, CI [0.07, 0.24]; and Würzburg, kappa 0.2 [0.1, 0.3]). Variants with shared importance in these models showed over-representation of postsynaptic membrane-related genes. Lithium response was not predictable in the pooled dataset (kappa 0.02 [−0.01, 0.04]), although non-trivial performance was achieved within a restricted dataset including only those patients followed prospectively (kappa 0.09 [0.04, 0.14]). Genomic classification of lithium response remains a promising but difficult task. Classification performance could potentially be improved by further harmonization of data collection procedures.
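Cohen's kappa, the metric reported above, measures agreement beyond what chance alone would produce: kappa = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected from the marginal totals. The following Python sketch computes it from a 2×2 confusion matrix; the function name and the example counts are illustrative and are not taken from the study.

```python
def cohens_kappa(tp: int, fn: int, fp: int, tn: int) -> float:
    """Cohen's kappa for a binary classifier from confusion-matrix counts.

    tp/fn: true responders predicted as responder / non-responder
    fp/tn: true non-responders predicted as responder / non-responder
    """
    n = tp + fn + fp + tn
    if n == 0:
        raise ValueError("confusion matrix is empty")
    p_observed = (tp + tn) / n
    # Chance agreement from the marginal totals of truth and prediction.
    p_expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / (n * n)
    if p_expected == 1.0:
        return 0.0  # kappa is undefined here; this sketch returns 0.0 by convention
    return (p_observed - p_expected) / (1 - p_expected)
```

With 29% responders, a classifier that always predicts "non-responder" reaches 71% accuracy but a kappa of 0 (`cohens_kappa(0, 29, 0, 71)`). Small positive kappa values such as those reported above can therefore still indicate above-chance prediction.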


2011 ◽  
Vol 2 (1) ◽  
pp. 5 ◽  
Author(s):  
Rachel Marie Raia ◽  
George Adrian Calin

Non-coding RNAs were previously thought to have little importance because they are not directly translated into a protein like their coding counterparts. However, it was recently found that non-coding RNAs do in fact have a much bigger role than previously thought. They are involved in cancer predisposition, development and progression. MicroRNAs, very short non-coding RNAs, are abnormally expressed in cancer and some harbor mutations that affect expression levels. MicroRNA alterations have been observed in all forms of cancer that have been researched to date. MicroRNAs are also located in cancer-associated genomic regions, which have been previously shown to affect gene expression leading to the activation or inhibition of cancer growth. Single-nucleotide polymorphisms within microRNAs can predispose someone to cancer. MicroRNAs have been shown to target both tumor suppressors, inhibiting cancer development, as well as oncogenes, stimulating cancer development. Some microRNAs can switch between these two functions and behave as a tumor suppressor at one time and an oncogene at another time. MicroRNAs can be used for diagnostic purposes as well as prognostic evaluations. Outside of microRNAs, ultraconserved genes, another group of non-coding RNAs, are also expressed differently in cancer patients. Large intervening non-coding RNAs, specifically one termed HOTAIR, have been quantified at very high levels in cancer cells and have been implicated in metastasis. Further research into non-coding RNAs may allow for the development of therapies that target non-coding RNAs, creating better treatment options for cancer patients and improving their prognosis. This review discusses the most current discoveries about non-coding RNAs, revealing their associations with cancer.


2016 ◽  
Author(s):  
Gad Abraham ◽  
Michael Inouye

Summary: Sparse canonical correlation analysis (SCCA) is a useful approach for correlating one set of measurements, such as single nucleotide polymorphisms (SNPs), with another set of measurements, such as gene expression levels. We present a fast implementation of SCCA, enabling rapid analysis of hundreds of thousands of SNPs together with thousands of phenotypes. Our approach is implemented both as an R package flashpcaR and within the standalone command-line tool flashpca. Availability and implementation: https://github.com/gabraham/[email protected]
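To give a rough idea of what SCCA computes, the sketch below finds one pair of sparse canonical vectors by alternating soft-thresholded power iterations on a cross-covariance matrix. This is one common formulation of sparse CCA and is an assumption of the sketch; it does not reproduce the flashpca implementation, and the function names and penalty values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def soft_threshold(values: Sequence[float], lam: float) -> List[float]:
    """Shrink each entry towards zero by lam; entries with |x| <= lam become 0."""
    return [math.copysign(max(abs(x) - lam, 0.0), x) for x in values]


def normalize(values: Sequence[float]) -> List[float]:
    """Scale to unit Euclidean length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in values))
    return [x / norm for x in values] if norm > 0 else list(values)


def sparse_cca_rank1(
    cross_cov: Sequence[Sequence[float]], lam_u: float, lam_v: float, iterations: int = 50
) -> Tuple[List[float], List[float]]:
    """Return sparse weight vectors (u, v) for the first canonical pair.

    cross_cov[i][j] is the covariance between feature i of the first data set
    (e.g. SNP i) and feature j of the second (e.g. phenotype j), computed from
    standardized data. lam_u and lam_v control sparsity (hypothetical values).
    """
    n_rows, n_cols = len(cross_cov), len(cross_cov[0])
    v = normalize([1.0] * n_cols)
    u = [0.0] * n_rows
    for _ in range(iterations):
        # u <- normalized soft-thresholded C v
        cv = [sum(cross_cov[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        u = normalize(soft_threshold(cv, lam_u))
        # v <- normalized soft-thresholded C^T u
        ctu = [sum(cross_cov[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        v = normalize(soft_threshold(ctu, lam_v))
    return u, v
```

The soft-thresholding step is what makes the solution sparse: features whose association with the other data set is weaker than the penalty receive a weight of exactly zero, so each canonical vector selects only a subset of the SNPs or phenotypes.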


2016 ◽  
Vol 13 (4) ◽  
pp. 33-42 ◽  
Author(s):  
Alexander Mulik ◽  
Valery Novochadov ◽  
Alexander Bondarev ◽  
Sofya Lipnitskaya ◽  
Irina Ulesikova ◽  
...  

Summary: The objective of the study was to investigate the genetic basis of general non-specific reactivity of an organism. A systematic search in PubMed Central, PDB, KEGG and SNP databases identified a set of genes and their polymorphisms that can determine pain sensitivity and therefore the level of general non-specific reactivity of the human organism. Six SNPs were selected for genotyping kit design; 230 healthy volunteers were enrolled in the study. A very high pain threshold was associated with allele A in rs1851048 and allele C in rs6777055. A high level of general non-specific reactivity of an organism was associated with allele G in rs2562456 (OR=1.804, CI=1.139-2.857, p=0.011) and allele C in rs6923492 (OR=1.582, CI=1.071-2.335, p=0.021). A low level of general non-specific reactivity of an organism was associated with allele T in rs6923492 (OR=0.351, CI=0.154-0.799, p=0.010). A set of genes and single-nucleotide polymorphisms associated with pain sensitivity, and indirectly with the level of general non-specific reactivity of the human organism, was determined. The identified correlations reveal some molecular mechanisms underlying variability in the general non-specific reactivity of an organism and can guide further research in this area.
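The odds ratios and confidence intervals quoted above are the kind of figures obtained from a 2×2 allele-count table. As an illustration, the Python sketch below computes an odds ratio with a Wald-type 95% confidence interval on the log scale. The abstract does not state which interval method the authors used, so that choice is an assumption, and the counts in the usage note are made up.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: allele present, trait present    b: allele present, trait absent
    c: allele absent,  trait present    d: allele absent,  trait absent
    z = 1.96 corresponds to a 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this estimate")
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper
```

For example, `odds_ratio_ci(20, 10, 10, 20)` gives an odds ratio of 4.0. An interval that excludes 1, as in each association reported above, is what supports calling an allele associated with the trait.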


2020 ◽  
Author(s):  
Tom Druet ◽  
Kamil Oleński ◽  
Laurence Flori ◽  
Amandine R Bertrand ◽  
Wanda Olech ◽  
...  

Abstract: After extinction in the wild at the beginning of the 20th century, the European bison has been successfully recovered in 2 distinct genetic lines from only 12 and 7 captive founders. We here aimed at characterizing the levels of realized inbreeding in these 2 restored lines to provide empirical insights into the genomic footprints left by population recovery from a small number of founders. To that end, we genotyped 183 European bison born over the last 40 years with the Illumina BovineHD beadchip that contained 22 602 informative autosomal single-nucleotide polymorphisms after data filtering. We then identified homozygous-by-descent (HBD) segments and classified them into different age-related classes relying on a model-based approach. As expected, we observed that the strong and recent founder effect experienced by the 2 lines resulted in very high levels of recent inbreeding and in the presence of long HBD tracks (up to 120 Mb). These long HBD tracks were associated with ancestors living approximately from 4 to 32 generations in the past, suggesting that inbreeding accumulated over multiple generations after the bottleneck. The contribution to inbreeding of the most recent groups of ancestors was, however, found to be decreasing in both lines. In addition, comparison of Lowland individuals born at different time periods showed that the levels of inbreeding tended to stabilize, HBD segments being shorter in animals born more recently, which indicates efficient control of inbreeding. Monitoring HBD segment lengths over generations may thus be viewed as a valuable genomic diagnostic tool for populations in conservation or recovery programs.
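As a much simpler stand-in for the model-based HBD detection described above, the following Python sketch scans genotypes for uninterrupted runs of homozygous SNPs and reports their start and end positions. It is a naive rule-based scan, not the authors' approach, and it ignores genotyping errors, marker density and allele frequencies. The function name, the genotype coding and the `min_snps` cut-off are assumptions.

```python
from typing import List, Sequence, Tuple


def homozygous_runs(
    positions: Sequence[int], genotypes: Sequence[int], min_snps: int = 3
) -> List[Tuple[int, int]]:
    """Return (start_position, end_position) for each run of homozygous SNPs.

    genotypes are coded as alternate-allele counts: 0 or 2 = homozygous,
    1 = heterozygous. Runs shorter than min_snps markers are discarded.
    """
    if len(positions) != len(genotypes):
        raise ValueError("positions and genotypes must have the same length")
    runs = []
    start = None  # index of the first SNP in the current run, if any
    for i, g in enumerate(genotypes):
        if g in (0, 2):
            if start is None:
                start = i
        else:
            # A heterozygous SNP ends the current run.
            if start is not None and i - start >= min_snps:
                runs.append((positions[start], positions[i - 1]))
            start = None
    if start is not None and len(genotypes) - start >= min_snps:
        runs.append((positions[start], positions[-1]))
    return runs
```

Segment length is simply the end position minus the start position. Tracking how these lengths change across birth cohorts is the kind of monitoring the abstract proposes, since longer segments point to more recent common ancestors.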


2018 ◽  
Author(s):  
Darrell O. Ricke ◽  
James Watkins ◽  
Philip Fremont-Smith ◽  
Tara Boettcher ◽  
Eric Schwoebel

Abstract: High-throughput sequencing (HTS) of complex DNA mixtures with single nucleotide polymorphism (SNP) panels can identify multiple individuals in forensic DNA mixture samples. SNP mixture analysis relies upon the exclusion of non-contributing individuals using the subset of SNP loci with no detected minor alleles in the mixture. Few, if any, individuals are anticipated to be detectable in saturated mixtures by this mixture analysis approach because of the increased probability of matching random individuals. Being able to identify a subset of the contributors in saturated HTS SNP mixtures is valuable for forensic investigations. A desaturated mixture can be created by treating a set of SNPs with the lowest minor allele ratios as having no minor alleles. Leveraging differences in DNA contributor concentrations in saturated mixtures, we introduce TranslucentID for the identification of a subset of individuals with high confidence who contributed DNA to saturated mixtures by desaturating the mixtures.
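The exclusion logic and the desaturation step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the TranslucentID algorithm: the abstract speaks of zeroing a set of SNPs with the lowest minor allele ratios, which is modelled here with a fixed cut-off. The function names, locus labels and threshold are hypothetical.

```python
from typing import Dict, Iterable


def desaturate(minor_allele_ratios: Dict[str, float], threshold: float) -> Dict[str, float]:
    """Treat loci whose minor allele ratio is below `threshold` as having none."""
    return {
        locus: (0.0 if ratio < threshold else ratio)
        for locus, ratio in minor_allele_ratios.items()
    }


def mismatches(mixture: Dict[str, float], carried_minor_loci: Iterable[str]) -> int:
    """Count loci where the person carries a minor allele but the mixture shows none.

    A person with mismatches is a candidate for exclusion; loci that are not
    part of the mixture profile are ignored.
    """
    return sum(
        1 for locus in carried_minor_loci
        if locus in mixture and mixture[locus] == 0.0
    )
```

In a saturated mixture nearly every locus shows some minor allele, so almost nobody produces mismatches and random individuals cannot be excluded. After `desaturate`, only people whose minor alleles sit at the high-ratio loci (the higher-concentration contributors) remain free of mismatches.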


2019 ◽  
Author(s):  
Dhas D Benet Bosco ◽  
K Rajalakshmi ◽  
S Suganya ◽  
P Pavani ◽  
K Yaswanth

Abstract: Cytochrome P450 oxidoreductase (POR) is a highly polymorphic gene which is involved in the metabolism of drugs and steroids through the transfer of electrons from NADPH to all CYP enzymes. In this study, we attempt to identify the very high risk single nucleotide polymorphisms in the POR gene that would affect the phenotype of the enzyme. The genetic variants in the POR gene were retrieved from databases and analyzed with appropriate online computation tools. Very high risk non-synonymous SNPs were identified with 12 different sequence and structure homology based tools and an evolutionary conservation tool (Consurf). Further, the phenotypic effect of each variant was assessed with MutPred2 and LigPlot. The very high risk non-coding variants were predicted with the HaploReg V4 and RegulomeDB tools. The very high risk SNPs that may affect miRNA target sites were screened using PolymiRTs v3.0, miRNA SNP v2.0 and MirSNP. Among 4,601 variants in the POR gene, 58 missense variants, 8 non-coding variants and three SNPs in miRNA target sites were found to be very high risk. These very high risk variants may regulate the expression and activity of the cytochrome P450 oxidoreductase enzyme, leading to differential drug and steroid metabolism by CYP enzymes.


2004 ◽  
Vol 40 ◽  
pp. 157-167 ◽  
Author(s):  
Maria Nilsson ◽  
Karin Dahlman-Wright ◽  
Jan-Åke Gustafsson

For several decades, it has been known that oestrogens are essential for human health. The discovery that there are two oestrogen receptors (ERs), ERalpha and ERbeta, has facilitated our understanding of how the hormone exerts its physiological effects. The ERs belong to the family of ligand-activated nuclear receptors, which act by modulating the expression of target genes. Studies of ER-knockout (ERKO) mice have been instrumental in defining the relevance of a given receptor subtype in a certain tissue. Phenotypes displayed by ERKO mice suggest diseases in which dysfunctional ERs might be involved in aetiology and pathology. Associations between single-nucleotide polymorphisms (SNPs) in ER genes and disease have been demonstrated in several cases. Selective ER modulators (SERMs), which are selective with regard to their effects in a certain cell type, already exist. Since oestrogen has effects in many tissues, the goal with a SERM is to provide beneficial effects in one target tissue while avoiding side effects in others. Refined SERMs will, in the future, provide improved therapeutic strategies for existing and novel indications.

