Attacks on genetic privacy via uploads to genealogical databases

eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Michael D Edge ◽  
Graham Coop

Direct-to-consumer (DTC) genetics services are increasingly popular, with tens of millions of customers. Several DTC genealogy services allow users to upload genetic data to search for relatives, identified as people with genomes that share identical by state (IBS) regions. Here, we describe methods by which an adversary can learn database genotypes by uploading multiple datasets. For example, an adversary who uploads approximately 900 genomes could recover at least one allele at SNP sites across up to 82% of the genome of a median person of European ancestries. In databases that detect IBS segments using unphased genotypes, approximately 100 falsified uploads can reveal enough genetic information to allow genome-wide genetic imputation. We provide a proof-of-concept demonstration in the GEDmatch database, and we suggest countermeasures that will prevent the exploits we describe.

2019 ◽  
Author(s):  
Michael D. Edge ◽  
Graham Coop

Abstract Direct-to-consumer (DTC) genetics services are increasingly popular for genetic genealogy, with tens of millions of customers as of 2019. Several DTC genealogy services allow users to upload their own genetic datasets in order to search for genetic relatives. A user and a target person in the database are identified as genetic relatives if the user’s uploaded genome shares one or more sufficiently long segments in common with that of the target person, that is, if the two genomes share one or more long regions identical by state (IBS). IBS matches reveal some information about the genotypes of the target person, particularly if the chromosomal locations of IBS matches are shared with the uploader. Here, we describe several methods by which an adversary who wants to learn the genotypes of people in the database can do so by uploading multiple datasets. Depending on the methods used for IBS matching and the information about IBS segments returned to the user, substantial information about users’ genotypes can be revealed with a few hundred uploaded datasets. For example, using a method we call IBS tiling, we estimate that an adversary who uploads approximately 900 publicly available genomes could recover at least one allele at SNP sites across up to 82% of the genome of a median person of European ancestries. In databases that detect IBS segments using unphased genotypes, approximately 100 uploads of falsified datasets can reveal enough genetic information to allow accurate genome-wide imputation of every person in the database. Different DTC services use different methods for identifying and reporting IBS segments, leading to differences in vulnerability to the attacks we describe. We provide a proof-of-concept demonstration that the GEDmatch database in particular uses unphased genotypes to detect IBS and is vulnerable to genotypes being revealed by artificial datasets. We offer simple-to-implement suggestions that will prevent the exploits we describe and discuss our results in light of recent trends in genetic privacy, including the recent use of uploads to DTC genetic genealogy services by law enforcement.
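The coverage idea behind IBS tiling, as summarized in the abstract above, can be illustrated with a toy calculation. This is only a sketch under stated assumptions, not the authors' method: it assumes each upload returns the chromosomal coordinates of its IBS segments with the target, and it simply measures what fraction of a chromosome the union of those segments covers. The function names, the half-open coordinate convention, and the example segments are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: a half-open interval [start, end) on one chromosome.
Segment = Tuple[int, int]


def merge_segments(segments: List[Segment]) -> List[Segment]:
    """Merge overlapping or touching segments into a sorted, disjoint list."""
    merged: List[Segment] = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous segment: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def tiled_fraction(segments: List[Segment], chrom_length: int) -> float:
    """Fraction of the chromosome covered by at least one reported IBS segment."""
    covered = sum(end - start for start, end in merge_segments(segments))
    return covered / chrom_length


# Made-up IBS segments reported between one target and three uploaded genomes.
reported = [(0, 30), (20, 50), (70, 90)]
print(merge_segments(reported))       # [(0, 50), (70, 90)]
print(tiled_fraction(reported, 100))  # 0.7
```

In this toy setting, each additional upload can only add to the covered fraction, which is why coverage of a target grows with the number of uploaded genomes.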


2019 ◽  
Vol 7 (1) ◽  
pp. 269-297 ◽  
Author(s):  
Kristi Harbord

The intersection of healthcare and technology is a rapidly growing area. One thriving field at this intersection involves obtaining, processing, and storing genetic data. While the benefits have been great, genetic information can reveal a great deal about individuals and their families. And the information that can be conveyed from genetic data appears limitless and is constantly growing and changing. Many entities have begun storing, processing, and sharing genetic data on a very large scale. This creates many privacy concerns that the current regulatory framework does not account for. The line between patient data and consumer data is blurred; many entities are interested in obtaining genetic data with varied interests. In the direct-to-consumer genetic testing market, consumers pay to send private companies their DNA samples in exchange for a trivial amount of information about their ancestry and health risks. But health data obtained and processed by a company are subjected to far less stringent privacy regulations than health data obtained and processed at a doctor’s office or hospital. This Comment summarizes some of the current genetic privacy problems in United States laws and examines the EU’s recently adopted GDPR for a possible solution. A GDPR-style regulation could provide more consistency, give individuals more control, and protect against future unknown uses.


2013 ◽  
Author(s):  
Yaniv Erlich ◽  
Arvind Narayanan

We are entering the era of ubiquitous genetic information for research, clinical care, and personal curiosity. Sharing these datasets is vital for rapid progress in understanding the genetic basis of human diseases. However, one growing concern is the ability to protect the genetic privacy of the data originators. Here, we technically map threats to genetic privacy and discuss potential mitigation strategies for privacy-preserving dissemination of genetic data.


2014 ◽  
Vol 42 (1) ◽  
pp. 1-32
Author(s):  
Dianne Nicol ◽  
Meredith Hagger ◽  
Nola Ries ◽  
Johnathon Liddicoat

Genetic information is widely recognised as being particularly sensitive personal information about an individual and his or her family. This article presents an analysis of the privacy policies of Australian companies that were offering direct-to-consumer genetic testing services in 2012–13. The results of this analysis indicate that many of these companies do not comply with the Privacy Act 1988 (Cth), and will need to significantly reassess their privacy policies now that significant new amendments to the Act have come into force. Whilst the Privacy Commissioner has increased powers under the new amendments, the extent to which these will mitigate the deficiencies of the current regime in relation to privacy practices of direct-to-consumer genetic testing companies remains unclear. Accordingly, it may be argued that a privacy code for the direct-to-consumer genetic testing industry would provide clearer standards. Alternatively, it may be time to rethink whether a sui generis approach to protecting genetic information is warranted.


Medical Law ◽  
2019 ◽  
pp. 470-505
Author(s):  
Emily Jackson

All books in this flagship series contain carefully selected substantial extracts from key cases, legislation, and academic debate, providing students with a stand-alone resource. This chapter examines the regulation of access to genetic information. It first discusses various third parties’ interests in genetic test results and DNA profiles, and the extent to which genetic privacy is protected by the law. The chapter then considers the issue of whether genetic discrimination should be treated in the same way as other illegitimate discriminatory practices and also discusses recent developments in the field of genetics, namely direct-to-consumer genetic testing and pharmacogenetics.


2019 ◽  
Vol 6 (1) ◽  
pp. 1-36 ◽  
Author(s):  
Ellen Wright Clayton ◽  
Barbara J Evans ◽  
James W Hazel ◽  
Mark A Rothstein

Abstract Recent advances in technology have significantly improved the accuracy of genetic testing and analysis, and substantially reduced its cost, resulting in a dramatic increase in the amount of genetic information generated, analysed, shared, and stored by diverse individuals and entities. Given the diversity of actors and their interests, coupled with the wide variety of ways genetic data are held, it has been difficult to develop broadly applicable legal principles for genetic privacy. This article examines the current landscape of genetic privacy to identify the roles that the law does or should play, with a focus on federal statutes and regulations, including the Health Insurance Portability and Accountability Act (HIPAA) and the Genetic Information Nondiscrimination Act (GINA). After considering the many contexts in which issues of genetic privacy arise, the article concludes that few, if any, applicable legal doctrines or enactments provide adequate protection or meaningful control to individuals over disclosures that may affect them. The article describes why it may be time to shift attention from attempting to control access to genetic information to considering the more challenging question of how these data can be used and under what conditions, explicitly addressing trade-offs between individual and social goods in numerous applications.


2021 ◽  
Vol 10 ◽  
pp. 204800402110236
Author(s):  
Julia Ramírez ◽  
Stefan van Duijvenboden ◽  
William J Young ◽  
Michele Orini ◽  
Aled R Jones ◽  
...  

The electrocardiogram (ECG) is a commonly used clinical tool that reflects cardiac excitability and disease. Many parameters can be measured and, with improved methodology, can now be quantified in an automated fashion, accurately and at scale. Furthermore, these measurements can be heritable, and thus genome-wide association studies inform the underpinning biological mechanisms. In this review we describe how we have used the resources in UK Biobank to undertake such work. In particular, we focus on a substudy uniquely describing the response to exercise performed at scale with accompanying genetic information.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Athea Vichas ◽  
Amanda K. Riley ◽  
Naomi T. Nkinsi ◽  
Shriya Kamlapurkar ◽  
Phoebe C. R. Parrish ◽  
...  

Abstract CRISPR-based cancer dependency maps are accelerating advances in cancer precision medicine, but adequate functional maps are limited to the most common oncogenes. To identify opportunities for therapeutic intervention in other rarer subsets of cancer, we investigate the oncogene-specific dependencies conferred by the lung cancer oncogene, RIT1. Here, genome-wide CRISPR screening in KRAS, EGFR, and RIT1-mutant isogenic lung cancer cells identifies shared and unique vulnerabilities of each oncogene. Combining this genetic data with small-molecule sensitivity profiling, we identify a unique vulnerability of RIT1-mutant cells to loss of spindle assembly checkpoint regulators. Oncogenic RIT1M90I weakens the spindle assembly checkpoint and perturbs mitotic timing, resulting in sensitivity to Aurora A inhibition. In addition, we observe synergy between mutant RIT1 and activation of YAP1 in multiple models and frequent nuclear overexpression of YAP1 in human primary RIT1-mutant lung tumors. These results provide a genome-wide atlas of oncogenic RIT1 functional interactions and identify components of the RAS pathway, spindle assembly checkpoint, and Hippo/YAP1 network as candidate therapeutic targets in RIT1-mutant lung cancer.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Yiqing Zhao ◽  
Saravut J. Weroha ◽  
Ellen L. Goode ◽  
Hongfang Liu ◽  
Chen Wang

Abstract Background Next-generation sequencing provides comprehensive information about individuals’ genetic makeup and is commonplace in oncology clinical practice. However, the utility of genetic information in the clinical decision-making process has not been examined extensively from a real-world, data-driven perspective. Through mining real-world data (RWD) from clinical notes, we could extract patients’ genetic information and further associate treatment decisions with genetic information. Methods We proposed a real-world evidence (RWE) study framework that incorporates context-based natural language processing (NLP) methods and data quality examination before final association analysis. The framework was demonstrated in a Foundation-tested women cancer cohort (N = 196). Upon retrieval of patients’ genetic information using NLP system, we assessed the completeness of genetic data captured in unstructured clinical notes according to a genetic data-model. We examined the distribution of different topics regarding BRCA1/2 throughout patients’ treatment process, and then analyzed the association between BRCA1/2 mutation status and the discussion/prescription of targeted therapy. Results We identified seven topics in the clinical context of genetic mentions including: Information, Evaluation, Insurance, Order, Negative, Positive, and Variants of unknown significance. Our rule-based system achieved a precision of 0.87, recall of 0.93 and F-measure of 0.91. Our machine learning system achieved a precision of 0.901, recall of 0.899 and F-measure of 0.9 for four-topic classification and a precision of 0.833, recall of 0.823 and F-measure of 0.82 for seven-topic classification. We found in result-containing sentences, the capture of BRCA1/2 mutation information was 75%, but detailed variant information (e.g. variant types) is largely missing. Using cleaned RWD, significant associations were found between BRCA1/2 positive mutation and targeted therapies. 
Conclusions We demonstrated a framework to generate RWE using RWD from different clinical sources. The rule-based NLP system achieved the best performance for resolving contextual variability when extracting RWD from unstructured clinical notes. Data quality issues such as incompleteness and discrepancies exist; thus, manual data cleaning is needed before further analysis can be performed. Finally, we were able to use cleaned RWD to evaluate the real-world utility of genetic information to initiate a prescription of targeted therapy.


2013 ◽  
Vol 41 (S1) ◽  
pp. 65-68 ◽  
Author(s):  
Michelle Huckaby Lewis

Human biological tissue samples are an invaluable resource for biomedical research designed to find causes of diseases and their treatments. Controversy has arisen, however, when research has been conducted with laboratory specimens either without the consent of the source of the specimen or when the research conducted with the specimen has expanded beyond the scope of the original consent agreement. Moreover, disputes have arisen regarding which party, the researcher or the source of the specimen, has control over who may use the specimens and for what purposes. The purposes of this article are: (1) to summarize the most important litigation regarding the use of laboratory specimens, and (2) to demonstrate how legal theory regarding control of laboratory specimens has evolved from arguments based upon property interests in biological samples to claims that the origins of laboratory specimens have privacy interests in their genetic information that should be protected.

