A Federated Record Linkage Algorithm for Secure Medical Data Sharing

Author(s):  
Christian M. Heidt ◽  
Hauke Hund ◽  
Christian Fegeler

The process of consolidating medical records from multiple institutions into one data set makes privacy-preserving record linkage (PPRL) a necessity. Most PPRL approaches, however, are designed to link records from only two institutions, and existing multi-party approaches tend to discard non-matching records, leading to incomplete result sets. In this paper, we propose a new algorithm for federated record linkage between multiple parties by a trusted third party, using record-level Bloom filters to preserve patient data privacy. We conduct a study to find optimal weights for linkage-relevant data fields and achieve 99.5% linkage accuracy when testing on the Febrl record linkage dataset. This approach is integrated into an end-to-end pseudonymization framework for medical data sharing.
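To make the idea of a weighted record-level Bloom filter concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a field's weight is the number of hash functions applied to each of its character bigrams, that the filter has 1024 bits, and that sites share a secret key; the function names, field names, and weights are all hypothetical.

```python
import hashlib


def bigrams(value: str) -> set[str]:
    """Split a padded, lower-cased string into character bigrams."""
    cleaned = value.strip().lower()
    if not cleaned:
        return set()
    padded = f"_{cleaned}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def record_bloom_filter(record: dict[str, str], weights: dict[str, int],
                        size: int = 1024,
                        secret: bytes = b"shared-secret") -> set[int]:
    """Encode a whole record as one Bloom filter (the set of set bit positions).

    Assumption: a field with weight w hashes each bigram w times, so
    higher-weighted fields contribute more bits to the filter.
    """
    bits: set[int] = set()
    for field, weight in weights.items():
        for gram in bigrams(record.get(field, "")):
            for k in range(weight):
                msg = f"{field}|{k}|{gram}".encode()
                digest = hashlib.blake2b(msg, key=secret, digest_size=8).digest()
                bits.add(int.from_bytes(digest, "big") % size)
    return bits


def dice(a: set[int], b: set[int]) -> float:
    """Dice coefficient of two filters given as sets of bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical weights and records, for illustration only.
WEIGHTS = {"first": 2, "last": 3, "dob": 4}
original = {"first": "christine", "last": "miller", "dob": "1980-02-14"}
with_typo = {"first": "christina", "last": "miller", "dob": "1980-02-14"}
unrelated = {"first": "robert", "last": "nguyen", "dob": "1957-11-03"}

bf_original = record_bloom_filter(original, WEIGHTS)
bf_typo = record_bloom_filter(with_typo, WEIGHTS)
bf_unrelated = record_bloom_filter(unrelated, WEIGHTS)
```

Under these assumptions, a trusted third party that sees only the bit positions can compare `dice(bf_original, bf_typo)` with `dice(bf_original, bf_unrelated)` against a threshold without access to the plaintext fields.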

2021 ◽  
Author(s):  
Christopher Hampf ◽  
Martin Bialke ◽  
Hauke Hund ◽  
Christian Fegeler ◽  
Stefan Lang ◽  
...  

Abstract Background The Federal Ministry of Research and Education funded the Network of University Medicine for establishing an infrastructure for pandemic research. This includes the development of a COVID-19 Data Exchange Platform (CODEX) that provides standardised and harmonised data sets for COVID-19 research. Nearly all university hospitals in Germany are part of the project and transmit medical data from the local data integration centres to the CODEX platform. The medical data on a person that has been collected at several sites is to be made available on the CODEX platform in a merged form. To enable this, a federated trusted third party (fTTP) will be established, which will allow the pseudonymised merging of the medical data. The fTTP implements privacy-preserving record linkage based on Bloom filters and assigns pseudonyms to enable re-pseudonymisation during data transfer to the CODEX platform. Results The fTTP was implemented conceptually and technically. For this purpose, the processes that are necessary for data delivery were modelled. The resulting communication relationships were identified and corresponding interfaces were specified. These were developed according to the specifications in FHIR and validated with the help of external partners. Existing tools such as the identity management system E-PIX® were further developed so that sites can generate Bloom filters based on person-identifying information. An extension for the comparison of Bloom filters was implemented for the federated trusted third party. The correct implementation was shown in the form of a demonstrator and the connection of two data integration centres. Conclusions This article describes how the fTTP was modelled and implemented. In a first expansion stage, the fTTP was connected to two sites as an example and its functionality was demonstrated. 
Further expansion stages, which are already planned, have been technically specified and will be implemented in the future in order to also handle cases in which the privacy-preserving record linkage achieves ambiguous results. The first expansion stage of the fTTP is available in the University Medicine network and will be connected by all participating sites in the ongoing test phase.


Author(s):  
Mahmoud Barhamgi ◽  
Djamal Benslimane ◽  
Chirine Ghedira ◽  
Brahim Medjahed

Recent years have witnessed a growing interest in using Web services as a reliable means for medical data sharing inside and across healthcare organizations. In such service-based data sharing environments, Web service composition emerged as a viable approach to query data scattered across independent locations. Patient data privacy preservation is an important aspect that must be considered when composing medical Web services. In this paper, the authors show how data privacy can be preserved when composing and executing Web services. Privacy constraints are expressed in the form of RDF queries over a mediated ontology. Query rewriting algorithms are defined to process those queries while preserving users’ privacy.


Author(s):  
Hauke Hund ◽  
Reto Wettstein ◽  
Christian M. Heidt ◽  
Christian Fegeler

Several standards and frameworks have been described in existing literature and technical manuals that contribute to solving the interoperability problem. Their data models usually focus on clinical data and only support healthcare delivery processes. Research processes including cross-organizational cohort size estimation, approvals and reviews of research proposals, consent checks, record linkage and pseudonymization need to be supported within the HiGHmed medical informatics consortium. The open source HiGHmed Data Sharing Framework implements a distributed business process engine for executing arbitrary biomedical research and healthcare processes modeled in BPMN 2.0 while exchanging information using FHIR R4 resources. The proposed reference implementation is currently being rolled out to eight university hospitals in Germany as well as a trusted third party, and is available as open source under the Apache 2.0 license.


2019 ◽  
Vol 11 (11) ◽  
pp. 225 ◽  
Author(s):  
Yuling Chen ◽  
Jinyi Guo ◽  
Changlou Li ◽  
Wei Ren

In the big data era, data are envisioned as critical resources with various values, e.g., business intelligence, management efficiency, and financial evaluations. Data sharing is always mandatory for value exchanges and profit promotion. Currently, certain big data markets have been created for facilitating data dissemination and coordinating data transaction, but we have to assume that such centralized management of data sharing must be trustworthy for data privacy and sharing fairness, which very likely imposes limitations such as joining admission, sharing efficiency, and extra costly commissions. To avoid these weaknesses, in this paper, we propose a blockchain-based fair data exchange scheme, called FaDe. FaDe can enable de-centralized data sharing in an autonomous manner, especially guaranteeing trade fairness, sharing efficiency, data privacy, and exchanging automation. A fairness protocol based on bit commitment is proposed. An algorithm based on blockchain script architecture for a smart contract, e.g., by a bitcoin virtual machine, is also proposed and implemented. Extensive analysis justifies that the proposed scheme can guarantee data exchanging without a trusted third party fairly, efficiently, and automatically.
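As a generic illustration of bit commitment, and not the FaDe protocol itself, the sketch below uses a salted SHA-256 hash. The construction and the function names `commit` and `verify` are assumptions chosen for brevity; the abstract does not specify how FaDe builds its commitments.

```python
import hashlib
import hmac
import secrets


def commit(bit: int) -> tuple[bytes, bytes]:
    """Commit to a bit; returns (commitment, nonce).

    The commitment can be published at once, while the nonce stays
    private until the reveal phase.
    """
    if bit not in (0, 1):
        raise ValueError("bit must be 0 or 1")
    nonce = secrets.token_bytes(32)  # random salt hides the committed bit
    commitment = hashlib.sha256(nonce + bytes([bit])).digest()
    return commitment, nonce


def verify(commitment: bytes, nonce: bytes, bit: int) -> bool:
    """Check that a revealed (nonce, bit) pair opens the commitment."""
    expected = hashlib.sha256(nonce + bytes([bit])).digest()
    return hmac.compare_digest(commitment, expected)


commitment, nonce = commit(1)
opened_honestly = verify(commitment, nonce, 1)
opened_with_other_bit = verify(commitment, nonce, 0)
```

The committing party cannot later claim the other bit, because the same nonce does not open the commitment to a different value, and the receiving party learns nothing about the bit before the nonce is revealed.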


2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Florens Rohde ◽  
Martin Franke ◽  
Ziad Sehili ◽  
Martin Lablans ◽  
Erhard Rahm

Abstract Background Data analysis for biomedical research often requires a record linkage step to identify records from multiple data sources referring to the same person. Due to the lack of unique personal identifiers across these sources, record linkage relies on the similarity of personal data such as first and last names or birth dates. However, the exchange of such identifying data with a third party, as is the case in record linkage, is generally subject to strict privacy requirements. This problem is addressed by privacy-preserving record linkage (PPRL) and pseudonymization services. Mainzelliste is an open-source record linkage and pseudonymization service used to carry out PPRL processes in real-world use cases. Methods We evaluate the linkage quality and performance of the linkage process using several real and near-real datasets with different properties w.r.t. size and error-rate of matching records. We conduct a comparison between (plaintext) record linkage and PPRL based on encoded records (Bloom filters). Furthermore, since the Mainzelliste software offers no blocking mechanism, we extend it by phonetic blocking as well as novel blocking schemes based on locality-sensitive hashing (LSH) to improve runtime for both standard and privacy-preserving record linkage. Results The Mainzelliste achieves high linkage quality for PPRL using field-level Bloom filters due to the use of an error-tolerant matching algorithm that can handle variances in names, in particular missing or transposed name compounds. However, due to the absence of blocking, the runtimes are unacceptable for real use cases with larger datasets. The newly implemented blocking approaches improve runtimes by orders of magnitude while retaining high linkage quality. Conclusion We conduct the first comprehensive evaluation of the record linkage facilities of the Mainzelliste software and extend it with blocking methods to improve its runtime. 
We observed very high linkage quality for both plaintext and encoded data, even in the presence of errors. The blocking methods improve runtime by orders of magnitude, thus facilitating use in research projects with large datasets and many participants.
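The following sketch shows one common way to block Bloom filters with locality-sensitive hashing, namely Hamming LSH, where each hash table samples a few bit positions and records agreeing on all of them share a bucket. It is an illustration under assumed parameters and does not reproduce the Mainzelliste extension; the function name, table count, and key length are hypothetical.

```python
import random
from collections import defaultdict
from itertools import combinations


def lsh_candidate_pairs(filters: dict[str, set[int]], size: int,
                        num_tables: int = 10, key_bits: int = 8,
                        seed: int = 42) -> set[tuple[str, ...]]:
    """Return record-id pairs that share a bucket in at least one table.

    Each filter is given as the set of its set bit positions. Only the
    returned candidate pairs need a full similarity comparison.
    """
    rng = random.Random(seed)
    pairs: set[tuple[str, ...]] = set()
    for _ in range(num_tables):
        # One table = one random sample of bit positions used as the key.
        positions = rng.sample(range(size), key_bits)
        buckets: dict[tuple[bool, ...], list[str]] = defaultdict(list)
        for rec_id, bits in filters.items():
            key = tuple(p in bits for p in positions)
            buckets[key].append(rec_id)
        for ids in buckets.values():
            pairs.update(tuple(sorted(pair)) for pair in combinations(ids, 2))
    return pairs


# Toy filters: "a" and "b" are identical, "c" is the bitwise complement of "a".
SIZE = 100
toy_filters = {
    "a": {1, 2, 3},
    "b": {1, 2, 3},
    "c": set(range(SIZE)) - {1, 2, 3},
}
candidates = lsh_candidate_pairs(toy_filters, SIZE)
```

Similar filters collide in at least one table with high probability, while dissimilar ones rarely do, which is what reduces the number of pairwise comparisons. More tables raise recall, and longer keys make buckets smaller.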


Cyber Crime ◽  
2013 ◽  
pp. 310-324
Author(s):  
Mahmoud Barhamgi ◽  
Djamal Benslimane ◽  
Chirine Ghedira ◽  
Brahim Medjahed

Recent years have witnessed a growing interest in using Web services as a reliable means for medical data sharing inside and across healthcare organizations. In such service-based data sharing environments, Web service composition emerged as a viable approach to query data scattered across independent locations. Patient data privacy preservation is an important aspect that must be considered when composing medical Web services. In this paper, the authors show how data privacy can be preserved when composing and executing Web services. Privacy constraints are expressed in the form of RDF queries over a mediated ontology. Query rewriting algorithms are defined to process those queries while preserving users’ privacy.


2020 ◽  
Vol 18 (1) ◽  
Author(s):  
Thomas Bahls ◽  
Johannes Pung ◽  
Stephanie Heinemann ◽  
Johannes Hauswaldt ◽  
Iris Demmer ◽  
...  

Abstract Background Medical data from family doctors are of great importance to health care researchers but seem to be locked in German practices and, thus, are underused in research. The RADAR project (Routine Anonymized Data for Advanced Health Services Research) aims at designing, implementing and piloting a generic research architecture, technical software solutions as well as procedures and workflows to unlock data from family doctors' practices. A long-term medical data repository for research that takes legal requirements into account is established. RADAR thereby helps to close the gap between European countries and to contribute data from primary care in Germany. Methods The RADAR project comprises three phases: (1) analysis phase, (2) design phase, and (3) pilot. First, interdisciplinary workshops were held to list prerequisites and requirements. Second, an architecture diagram with building blocks and functions, and an ordered list of process steps (workflow) for data capture and storage were designed. Third, technical components and workflows were piloted. The pilot was extended by a data integration workflow using patient-reported outcomes (paper-based questionnaires). Results The analysis phase resulted in a list of 17 essential prerequisites and guiding requirements for data management compliant with the General Data Protection Regulation (GDPR). Based on this list, existing approaches to fulfil the RADAR tasks were evaluated, for example re-using the BDT interface for data exchange and a Trusted Third Party approach for consent management and record linkage. Consented data sets of 100 patients were successfully exported, separated into person-identifying and medical data, pseudonymised and saved. Record linkage and data integration workflows for patient-reported outcomes in the RADAR research database were successfully piloted for 63 responders. 
Conclusion The RADAR project successfully developed a generic architecture together with a technical framework of tools, interfaces, and workflows for a complete infrastructure for practicable and secure processing of patient data from family doctors. All technical components and workflows can be reused for further research projects. Additionally, a Trusted Third Party approach can be used as a core element to implement data privacy protection in such heterogeneous family doctors' settings. Optimisations identified include fully electronic consent recording using tablet computers, which is part of the project's extension phase.


Author(s):  
Hesam Izakian

Introduction Because of a lack of unique identifiers among datasets, and different data collection standards, record linkage is challenging. Thus, despite the importance of record linkage in unleashing the power of data, there are few software applications built for this purpose. Each software application has unique strengths and weaknesses. Objectives and Approach Data linkage comprises various steps such as selecting linkage identifiers, data cleaning, data pre-processing, calculating the linkage weights for identifiers, and estimating similarity thresholds to decide if two records are true matches. These steps require expertise and are costly for organizations interested in data sharing. Although data linkage software applications have been developed, these applications have drawbacks. They are either costly, difficult to use, not able to preserve the privacy of individuals, not able to handle big datasets, or perform poorly in terms of specificity and sensitivity. LinkWise is a software application developed to resolve these issues. Results LinkWise is modern probabilistic linkage software implemented using Microsoft C#.Net. The following features are implemented in this software: automation of all data linkage steps, a simple and user-friendly interface, ability to link both unencrypted and encrypted data (privacy-preserving record linkage), a transparent linkage algorithm (not a black box), ability to perform incremental linkage (linking new data to previously linked data), ability to handle millions of records, ability to run on multiple processors to reduce run time, and high specificity and sensitivity. The software was tested on many datasets with various characteristics (e.g., different data fields, data formats, numbers of records, varying amounts of noise, etc.). Results show that it is able to link data with high specificity and sensitivity in a reasonable time. 
Conclusion/Implications LinkWise is a software application designed to address many issues arising in the process of data linkage. The software automates all steps of data linkage and preserves the privacy of individuals. It is very easy to use, and no technical background knowledge is required to work with it.
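The steps of calculating linkage weights and applying thresholds can be sketched with the classical Fellegi-Sunter model, a widely used probabilistic approach. The abstract does not say that LinkWise uses this model, so the code is an assumption-laden illustration only: the m- and u-probabilities, field names, and thresholds are hypothetical, and real systems estimate them from data.

```python
import math


def field_weights(m: float, u: float) -> tuple[float, float]:
    """Agreement and disagreement weights (log2 likelihood ratios).

    m: probability that the field agrees for a true match.
    u: probability that the field agrees for a non-match.
    """
    agree = math.log2(m / u)
    disagree = math.log2((1 - m) / (1 - u))
    return agree, disagree


def match_score(rec_a: dict[str, str], rec_b: dict[str, str],
                probs: dict[str, tuple[float, float]]) -> float:
    """Sum the per-field weights over all linkage identifiers."""
    score = 0.0
    for field, (m, u) in probs.items():
        agree, disagree = field_weights(m, u)
        score += agree if rec_a.get(field) == rec_b.get(field) else disagree
    return score


def classify(score: float, upper: float, lower: float) -> str:
    """Apply two thresholds; scores in between go to manual review."""
    if score >= upper:
        return "match"
    if score <= lower:
        return "non-match"
    return "possible match"


# Hypothetical probabilities and records, for illustration only.
PROBS = {"last_name": (0.95, 0.01), "birth_year": (0.98, 0.02)}
rec_1 = {"last_name": "miller", "birth_year": "1980"}
rec_2 = {"last_name": "miller", "birth_year": "1980"}
rec_3 = {"last_name": "nguyen", "birth_year": "1957"}

score_same = match_score(rec_1, rec_2, PROBS)
score_diff = match_score(rec_1, rec_3, PROBS)
```

Identifiers that rarely agree by chance (small u) receive a large agreement weight, which is why a rare surname contributes more evidence than a common attribute such as sex.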


2021 ◽  
Vol 1 (1) ◽  
pp. 1-8
Author(s):  
Almas Ummi Fatharina ◽  
Sri Sugiarsi ◽  
Trismianto Asmo Sutrisno

Abstract Release of medical information must follow applicable procedures and requires the patient's consent. Patients must make a stamped written statement authorizing a third party to request medical data from a doctor. The purpose of this study is to determine the policies for releasing medical information and the flow of procedures for releasing medical information to insurers. The study uses a literature review design, that is, it examines research articles on the release of medical information to insurers by comparing, summarizing, and drawing conclusions. The search strategy used the keywords "medical records", "information release", and "insurers", combined with the Boolean operator OR. The results show that some hospitals release medical information under policies in the form of SOPs, cooperation agreements with insurers, or oral agreements. In addition, some hospitals have differing procedures for releasing medical information because they serve not just one insurer but several, such as BPJS, Jasa Raharja, and Askes. However, some hospitals do not yet follow the release procedures that the hospital itself has defined. Therefore, the hospitals conduct outreach on the flow of procedures for releasing medical information so that the responsible officers better understand the release of medical information. Keywords: medical records, information release, insurers


2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Liang Huang ◽  
Hyung-Hyo Lee

With the features of decentralization and trustlessness and through distributed data storage, point-to-point transmission, and encryption algorithms, blockchain has shed new light on the security and protection of medical data, and it can resolve the contradiction between data sharing and privacy protection with proper security strategies. In this paper, we integrate the strengths of both blockchain and cloud computing and build the privacy protection scheme for medical data based on blockchain and cloud computing. This scheme introduces cloud computing and provides services to blockchain nodes with cloud server computing; meanwhile, it collects, analyzes, processes, and maintains medical data in the identity authentication interface and solves the insufficient computing abilities of some nodes in blockchain so as to verify the authenticity and reliability of data. The simulation experiment proves that the proposed scheme is effective. It can achieve the secure protection and integrity verification of medical data and address the problems of high computing complexity, data sharing, and privacy protection.

