Truly privacy-preserving federated analytics for precision medicine with multiparty homomorphic encryption

2021 ◽  
Vol 12 (1) ◽  
Author(s):  
David Froelicher ◽  
Juan R. Troncoso-Pastoriza ◽  
Jean Louis Raisaro ◽  
Michel A. Cuendet ◽  
Joao Sa Sousa ◽  
...  

Abstract: Using real-world evidence in biomedical research, an indispensable complement to clinical trials, requires access to large quantities of patient data that are typically held separately by multiple healthcare institutions. We propose FAMHE, a novel federated analytics system that, based on multiparty homomorphic encryption (MHE), enables privacy-preserving analyses of distributed datasets by yielding highly accurate results without revealing any intermediate data. We demonstrate the applicability of FAMHE to essential biomedical analysis tasks, including Kaplan-Meier survival analysis in oncology and genome-wide association studies in medical genetics. Using our system, we accurately and efficiently reproduce two published centralized studies in a federated setting, enabling biomedical insights that are not possible from individual institutions alone. Our work represents a necessary key step towards overcoming the privacy hurdle in enabling multi-centric scientific collaborations.
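To make the federated Kaplan-Meier task concrete, the following is a minimal plaintext sketch, not the FAMHE implementation. It assumes each site reports only per-time-point counts of events and censorings, which are additive across sites; in a system such as FAMHE these aggregates would be encrypted, a step omitted here. The function names and the toy records are hypothetical.

```python
from collections import Counter


def local_counts(records):
    """Per-site summary: number of events and censorings at each time point.

    `records` is a list of (time, observed) pairs, where observed is True
    for an event and False for a censored observation.
    """
    events, censored = Counter(), Counter()
    for time, observed in records:
        if observed:
            events[time] += 1
        else:
            censored[time] += 1
    return events, censored


def federated_kaplan_meier(site_summaries):
    """Combine per-site counts and return the Kaplan-Meier curve.

    Only the summed counts are used, so no patient-level record is needed
    at the aggregator. (Assumption: plaintext sums stand in for the
    encrypted aggregation a real deployment would use.)
    """
    events, censored = Counter(), Counter()
    for site_events, site_censored in site_summaries:
        events.update(site_events)      # Counter.update adds counts
        censored.update(site_censored)

    at_risk = sum(events.values()) + sum(censored.values())
    survival = 1.0
    curve = []
    for t in sorted(set(events) | set(censored)):
        d = events[t]
        if d > 0:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Subjects with an event or censoring at t leave the risk set.
        at_risk -= d + censored[t]
    return curve


# Hypothetical toy data for two sites.
site_a = [(1, True), (2, False), (3, True)]
site_b = [(1, True), (3, True), (4, False)]

curve = federated_kaplan_meier([local_counts(site_a), local_counts(site_b)])
pooled = federated_kaplan_meier([local_counts(site_a + site_b)])
```

Because the counts are additive, `curve` matches `pooled`, the curve computed from the pooled records; this is the sense in which a federated computation can reproduce a centralized analysis.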

2020 ◽  
Vol 117 (21) ◽  
pp. 11608-11613 ◽  
Author(s):  
Marcelo Blatt ◽  
Alexander Gusev ◽  
Yuriy Polyakov ◽  
Shafi Goldwasser

Genome-wide association studies (GWASs) seek to identify genetic variants associated with a trait, and have been a powerful approach for understanding complex diseases. A critical challenge for GWASs has been the dependence on individual-level data that typically have strict privacy requirements, creating an urgent need for methods that preserve the individual-level privacy of participants. Here, we present a privacy-preserving framework based on several advances in homomorphic encryption and demonstrate that it can perform an accurate GWAS analysis for a real dataset of more than 25,000 individuals, keeping all individual data encrypted and requiring no user interactions. Our extrapolations show that it can evaluate GWASs of 100,000 individuals and 500,000 single-nucleotide polymorphisms (SNPs) in 5.6 h on a single server node (or in 11 min on 31 server nodes running in parallel). Our performance results are more than one order of magnitude faster than prior state-of-the-art results using secure multiparty computation, which requires continuous user interactions, with the accuracy of both solutions being similar. Our homomorphic encryption advances can also be applied to other domains where large-scale statistical analyses over encrypted data are needed.
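As an illustration of the kind of per-variant statistic a GWAS evaluates, the sketch below computes the standard allelic chi-square test on a 2x2 table of allele counts. It is a plaintext example under stated assumptions: the abstract does not specify which test statistic the framework computes, nothing here is encrypted, and the function name and counts are hypothetical.

```python
def allelic_chi_square(case_minor, case_major, control_minor, control_major):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 table
    of allele counts: rows are cases/controls, columns are minor/major."""
    a, b, c, d = case_minor, case_major, control_minor, control_major
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column of the table must be non-empty")
    return n * (a * d - b * c) ** 2 / denominator


# Hypothetical counts for one SNP: 100 case alleles and 100 control alleles.
stat = allelic_chi_square(case_minor=30, case_major=70,
                          control_minor=20, control_major=80)

# Identical allele frequencies in cases and controls give a statistic of 0.
null_stat = allelic_chi_square(25, 75, 25, 75)
```

A full study repeats a computation of this kind for every SNP, which is why throughput over hundreds of thousands of variants is the figure of merit quoted above.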


Author(s):  
Caroline Uhler ◽  
Aleksandra B. Slavkovic ◽  
Stephen E. Fienberg

Traditional statistical methods for confidentiality protection of statistical databases do not scale well to GWAS databases, especially in terms of guarantees regarding protection from linkage to external information. The more recent concept of differential privacy, introduced by the cryptographic community, provides a rigorous definition of privacy with meaningful guarantees in the presence of arbitrary external information, although these guarantees may come at a serious price in terms of data utility. Building on such notions, we propose new methods to release aggregate GWAS data without compromising an individual’s privacy. We present methods for releasing differentially private minor allele frequencies, chi-square statistics and p-values. We compare these approaches on simulated data and on a GWAS of canine hair length involving 685 dogs. We also propose a privacy-preserving method for finding genome-wide associations based on a differentially private approach to penalized logistic regression.
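The idea of releasing a differentially private allele frequency can be sketched with the generic Laplace mechanism. This is not necessarily the authors' method; it assumes a fixed cohort size N in which replacing one diploid individual changes the minor allele count by at most 2, so the frequency count/(2N) has sensitivity 1/N. The function name, the allele count, and the epsilon value are hypothetical; only the cohort size of 685 comes from the abstract.

```python
import random


def dp_allele_frequency(minor_allele_count, n_individuals, epsilon, rng):
    """Release an allele frequency with epsilon-differential privacy
    using the Laplace mechanism (assumed sensitivity: 1 / n_individuals)."""
    if n_individuals <= 0 or epsilon <= 0:
        raise ValueError("n_individuals and epsilon must be positive")
    true_frequency = minor_allele_count / (2 * n_individuals)
    scale = (1.0 / n_individuals) / epsilon  # sensitivity / epsilon
    # The difference of two independent Exp(1) draws is Laplace(0, 1).
    noise = scale * (rng.expovariate(1.0) - rng.expovariate(1.0))
    # Clamping is post-processing and does not weaken the guarantee.
    return min(1.0, max(0.0, true_frequency + noise))


rng = random.Random(0)  # seeded only to make the example repeatable
released = dp_allele_frequency(minor_allele_count=274, n_individuals=685,
                               epsilon=0.5, rng=rng)
```

Smaller values of `epsilon` give stronger privacy but larger noise, which is the utility cost the abstract refers to; with 685 individuals and epsilon = 0.5 the noise scale is about 0.003.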


2021 ◽  
Author(s):  
David Froelicher ◽  
Juan R. Troncoso-Pastoriza ◽  
Jean Louis Raisaro ◽  
Michel A. Cuendet ◽  
Joao Sa Sousa ◽  
...  

Abstract: In biomedical research, real-world evidence, which is emerging as an indispensable complement to clinical trials, relies on access to large quantities of patient data that typically reside at separate healthcare institutions. Conventional approaches for centralizing those data are often not feasible due to privacy and security requirements. As a result, more privacy-friendly solutions based on federated analytics are emerging. They enable the simultaneous analysis of medical data distributed across a group of connected institutions. However, these techniques do not inherently protect patients’ privacy, as they require institutions to share intermediate results that can reveal patient-level information. To address this issue, state-of-the-art solutions use additional privacy-preserving measures based on data obfuscation, which often introduce noise into the computation, so that the final result can become too inaccurate for precision medicine use cases. We propose FAMHE, a modular system based on multiparty homomorphic encryption that enables the privacy-preserving execution of federated analytics workflows, yielding exact results without leaking any intermediate information. To demonstrate the maturity of our approach, we reproduce the results of two published state-of-the-art centralized biomedical studies, and we demonstrate that FAMHE enables the efficient, privacy-preserving and decentralized execution of analyses that range from low computational complexity, such as Kaplan-Meier overall survival curves used in oncology, to high computational complexity, such as genome-wide association studies on millions of variants.

