Paving the path toward genomic privacy with secure imputation

Cell Systems ◽  
2021 ◽  
Vol 12 (10) ◽  
pp. 950-952
Author(s):  
Maxwell A. Sherman
Keyword(s):  
2021 ◽  
Vol 24 (2) ◽  
pp. 1-35
Author(s):  
Isabel Wagner ◽  
Iryna Yevseyeva

The ability to measure privacy accurately and consistently is key in the development of new privacy protections. However, recent studies have uncovered weaknesses in existing privacy metrics, as well as weaknesses caused by the use of only a single privacy metric. Metrics suites, or combinations of privacy metrics, are a promising mechanism to alleviate these weaknesses, if we can solve two open problems: which metrics should be combined and how. In this article, we tackle the first problem, i.e., the selection of metrics for strong metrics suites, by formulating it as a knapsack optimization problem with both single and multiple objectives. Because solving this problem exactly is difficult due to the large number of combinations and many qualities/objectives that need to be evaluated for each metrics suite, we apply 16 existing evolutionary and metaheuristic optimization algorithms. We solve the optimization problem for three privacy application domains: genomic privacy, graph privacy, and vehicular communications privacy. We find that the resulting metrics suites have better properties, i.e., higher monotonicity, diversity, evenness, and shared value range, than previously proposed metrics suites.
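As a rough illustration of the selection problem this abstract describes, the sketch below treats each candidate privacy metric as a knapsack item and picks the subset with the highest total score under a budget, using textbook dynamic programming. The scores, costs, and budget are hypothetical values invented for the example. The article itself reports using evolutionary and metaheuristic algorithms over several objectives, so this single-objective solver is only a stand-in for the problem formulation, not the authors' method.

```python
from typing import List, Tuple

# Each item is (metric name, quality score, evaluation cost).
# All numbers are hypothetical and only serve the illustration.
Item = Tuple[str, int, int]


def select_metrics(items: List[Item], budget: int) -> Tuple[int, List[str]]:
    """Single-objective 0/1 knapsack via dynamic programming.

    Returns the best total score and the names of the chosen metrics,
    subject to the total cost not exceeding `budget`.
    """
    # best[b] holds (score, chosen names) for total cost at most b.
    best: List[Tuple[int, List[str]]] = [(0, []) for _ in range(budget + 1)]
    for name, score, cost in items:
        # Iterate budgets downward so each metric is used at most once.
        for b in range(budget, cost - 1, -1):
            candidate = best[b - cost][0] + score
            if candidate > best[b][0]:
                best[b] = (candidate, best[b - cost][1] + [name])
    return best[budget]


candidates = [
    ("entropy", 6, 3),
    ("anonymity_set_size", 5, 2),
    ("information_gain", 4, 2),
    ("success_rate", 3, 1),
]
total, suite = select_metrics(candidates, budget=5)
print(total, suite)
# 12 ['anonymity_set_size', 'information_gain', 'success_rate']
```

A multi-objective variant would replace the single score with a vector (for example monotonicity, diversity, evenness, and shared value range) and keep a set of non-dominated suites instead of one best value.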


2005 ◽  
Vol 44 (05) ◽  
pp. 687-692 ◽  
Author(s):  
B. A. Malin

Summary. Objectives: Current genomic privacy technologies assume the identity of genomic sequence data is protected if personal information, such as demographics, is obscured, removed, or encrypted. While demographic features can directly compromise an individual’s identity, recent research demonstrates such protections are insufficient because sequence data itself is susceptible to re-identification. To counteract this problem, we introduce an algorithm for anonymizing a collection of person-specific DNA sequences. Methods: The technique is termed DNA lattice anonymization (DNALA), and is based upon the formal privacy protection schema of k-anonymity. Under this model, it is impossible to observe or learn features that distinguish one genetic sequence from k-1 other entries in a collection. To maximize information retained in protected sequences, we incorporate a concept generalization lattice to learn the distance between two residues in a single nucleotide region. The lattice provides the most similar generalized concept for two residues (e.g. adenine and guanine are both purines). Results: The method is tested and evaluated with several publicly available human population datasets ranging in size from 30 to 400 sequences. Our findings imply the anonymization schema is feasible for the protection of sequence privacy. Conclusions: The DNALA method is the first computational disclosure control technique for general DNA sequences. Given the computational nature of the method, guarantees of anonymity can be formally proven. There is room for improvement and validation, though this research provides the groundwork from which future researchers can construct genomics anonymization schemas tailored to specific data-sharing scenarios.
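The generalization step mentioned in this abstract can be sketched with a small three-level lattice: a concrete base, its chemical class (purine or pyrimidine), and "any base". The sketch assumes IUPAC-style codes (R, Y, N) for the generalized concepts and uses the lattice level as a cost, which is a hypothetical stand-in for the distance the paper defines. It merges two aligned sequences so that they become indistinguishable, i.e. the k = 2 case.

```python
from typing import Tuple

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Height of each symbol in the assumed lattice: base < class < any.
LEVEL = {"A": 0, "C": 0, "G": 0, "T": 0, "R": 1, "Y": 1, "N": 2}


def generalize(a: str, b: str) -> str:
    """Return the most specific concept in the lattice covering both bases."""
    if a == b:
        return a
    if {a, b} <= PURINES:
        return "R"  # both purines
    if {a, b} <= PYRIMIDINES:
        return "Y"  # both pyrimidines
    return "N"  # one purine and one pyrimidine: only "any base" covers both


def generalize_pair(s1: str, s2: str) -> Tuple[str, int]:
    """Merge two aligned sequences into one generalized sequence.

    The returned cost sums the lattice levels climbed per position;
    lower cost means more information retained.
    """
    if len(s1) != len(s2):
        raise ValueError("sequences must be aligned to the same length")
    merged = "".join(generalize(x, y) for x, y in zip(s1, s2))
    cost = sum(LEVEL[c] for c in merged)
    return merged, cost


print(generalize_pair("ACGT", "GCTA"))
# ('RCNN', 5)
```

Pairing each sequence with the partner that gives the lowest cost is one plausible way to keep the released data informative, though the abstract does not spell out the pairing procedure.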


2019 ◽  
Vol 21 (2) ◽  
pp. 511-526 ◽  
Author(s):  
Abukari Mohammed Yakubu ◽  
Yi-Ping Phoebe Chen

In recent times, the reduced cost of DNA sequencing has resulted in a plethora of genomic data that is being used to advance biomedical research and improve clinical procedures and healthcare delivery. These advances are revolutionizing areas in genome-wide association studies (GWASs), diagnostic testing, personalized medicine and drug discovery. This, however, comes with security and privacy challenges as the human genome is sensitive in nature and uniquely identifies an individual. In this article, we discuss the genome privacy problem and review relevant privacy attacks, classified into identity tracing, attribute disclosure and completion attacks, which have been used to breach the privacy of an individual. We then classify state-of-the-art genomic privacy-preserving solutions that have been proposed to mitigate these attacks, based on their application and computational domains (genomic aggregation, GWASs and statistical analysis, sequence comparison and genetic testing), and compare them in terms of their underlying cryptographic primitives, security goals and complexities (computation and transmission overheads). Finally, we identify and discuss the open issues, research challenges and future directions in the field of genomic privacy. We believe this article will provide researchers with the current trends and insights on the importance and challenges of privacy and security issues in the area of genomics.


2013 ◽  
Vol 3 (8) ◽  
pp. vii-vii
Author(s):  
Brenda J. Andrews ◽  
Tracey DePellegrin

2021 ◽  
Author(s):  
Arif Ozgun Harmanci ◽  
Miran Kim ◽  
Su Wang ◽  
Wentao Li ◽  
Yongsoo Song ◽  
...  

As DNA sequencing data is available for personal use, genomic privacy is becoming a major challenge. Nevertheless, outsourced high-throughput genomic data analysis relies on pipelines that tend to overlook this challenge. Results: We present a client-server outsourcing framework for genotype imputation, an important step in genomic data analyses. The client encrypts the genotype data, and the server operates only on the encrypted data and never observes it in plaintext. The cloud-based framework can benefit from virtually unlimited computational resources while providing provable confidentiality. Availability: The server is publicly available at https://www.secureomics.org/OpenImpute. Users can test and use the imputation server anonymously, without registration.
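The client-server flow in this abstract can be illustrated with a toy additively homomorphic scheme. The abstract does not name the encryption scheme or the imputation model, so the sketch swaps in a textbook Paillier construction with tiny, insecure primes and a hypothetical integer-weighted linear predictor. It only shows the shape of the protocol: the client encrypts genotypes, the server combines ciphertexts without decrypting, and the client decrypts the result.

```python
import math
import random

# Toy Paillier cryptosystem. The primes are far too small to be secure
# and are an assumption made purely to keep the example readable.


def keygen(p: int = 1009, q: int = 1013):
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # valid because the generator is g = n + 1
    return n, (lam, mu)


def encrypt(n: int, m: int) -> int:
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    # With g = n + 1, g^m mod n^2 equals 1 + m*n.
    return (1 + (m % n) * n) * pow(r, n, n2) % n2


def decrypt(n: int, secret, c: int) -> int:
    lam, mu = secret
    x = pow(c, lam, n * n)
    m = (x - 1) // n * mu % n
    return m - n if m > n // 2 else m  # map back to signed integers


def server_weighted_sum(n: int, ciphertexts, weights) -> int:
    """Runs on the server: combines ciphertexts without any secret key."""
    n2 = n * n
    acc = 1  # trivial encryption of 0
    for c, w in zip(ciphertexts, weights):
        # Raising a ciphertext to w multiplies the hidden plaintext by w;
        # multiplying ciphertexts adds the hidden plaintexts.
        acc = acc * pow(c, w % n, n2) % n2
    return acc


# Client side: encrypt genotypes coded as 0, 1, or 2 alternate alleles.
n, secret = keygen()
genotypes = [0, 1, 2, 1]
encrypted = [encrypt(n, g) for g in genotypes]

# Server side: hypothetical integer-scaled model weights.
weights = [3, -1, 2, 5]
encrypted_score = server_weighted_sum(n, encrypted, weights)

# Client side: only the key holder can read the result.
print(decrypt(n, secret, encrypted_score))
# 8
```

A production system would use a vetted homomorphic encryption library and realistic key sizes; the point here is only that the server's computation never touches a plaintext genotype.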


2017 ◽  
Vol 15 (5) ◽  
pp. 29-37 ◽  
Author(s):  
Erman Ayday ◽  
Mathias Humbert

2012 ◽  
Vol 91 (3) ◽  
pp. 577-578 ◽  
Author(s):  
Bartha M. Knoppers ◽  
Edward S. Dove ◽  
Jan-Eric Litton ◽  
J.J. Nietfeld
