Biomedical Data Privacy Enhancement Architecture Based on Multi-Keyword Search Technique

Background: Patient privacy is a ubiquitous problem around the world. Many existing studies have demonstrated the potential privacy risks associated with sharing of biomedical data. Owing to the increasing need for data sharing and analysis, health care data privacy is drawing more attention. However, to better protect biomedical data privacy, it is essential to assess the privacy risk in the first place.

Objective: In China, there is no clear regulation for health systems to deidentify data. It is also not known whether a mechanism such as the Health Insurance Portability and Accountability Act (HIPAA) safe harbor policy will achieve sufficient protection. This study aimed to conduct a pilot study using patient data from Chinese hospitals to understand and quantify the privacy risks of Chinese patients.

Methods: We used g-distinct analysis to evaluate the reidentification risks with regard to the HIPAA safe harbor approach when applied to Chinese patients’ data. More specifically, we estimated the risks based on the HIPAA safe harbor and limited dataset policies by assuming an attacker has background knowledge of the patient from the public domain.

Results: The experiments were conducted on 0.83 million patients (with data fields of date of birth, gender, and surrogate ZIP codes generated based on home address) across 33 provincial-level administrative divisions in China. Under the Limited Dataset policy, 19.58% (163,262/833,235) of the population could be uniquely identified under the g-distinct metric (ie, 1-distinct). In contrast, the Safe Harbor policy significantly reduces privacy risk: only 0.072% (601/833,235) of individuals are uniquely identifiable, and the majority of the population is 3000-indistinguishable (ie, the population is expected to share common attributes with 3000 or fewer people).

Conclusions: Through experiments based on real-world patient data, this work illustrates that the results of g-distinct analysis of Chinese patient privacy risk are similar to those from a previous US study, in which data from different organizations/regions might be vulnerable to different reidentification risks under different policies. This work provides a reference for Chinese health care entities estimating patients’ privacy risk during data sharing, and lays a foundation for future privacy risk studies of Chinese patients’ data.
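
The g-distinct measurement described in this abstract can be illustrated with a minimal sketch. It treats a record as g-distinct when at most g records share its combination of quasi-identifiers, so g = 1 means uniquely identifiable. The four toy records, the field names, and the `safe_harbor_view` helper are invented for illustration and are not the study's data or code; the helper also simplifies Safe Harbor to two generalizations only (year of birth, three-digit ZIP prefix).

```python
from collections import Counter


def g_distinct_fraction(records, keys, g=1):
    """Fraction of records whose quasi-identifier combination
    is shared by at most g records (g=1 means unique)."""
    groups = Counter(tuple(r[k] for k in keys) for r in records)
    at_risk = sum(size for size in groups.values() if size <= g)
    return at_risk / len(records)


def safe_harbor_view(record):
    """Simplified, hypothetical Safe Harbor generalization:
    keep only the birth year and the first three ZIP digits."""
    return {
        "birth": record["birth"][:4],
        "gender": record["gender"],
        "zip": record["zip"][:3],
    }


# Toy records invented for this example (not real patient data).
records = [
    {"birth": "1980-01-02", "gender": "F", "zip": "100081"},
    {"birth": "1980-05-17", "gender": "F", "zip": "100084"},
    {"birth": "1980-01-02", "gender": "M", "zip": "100081"},
    {"birth": "1975-11-30", "gender": "M", "zip": "200030"},
]
keys = ("birth", "gender", "zip")

# Full dates and ZIP codes retained, as a limited dataset would allow.
limited = g_distinct_fraction(records, keys, g=1)
# Generalized view: fewer distinct combinations, so fewer unique records.
safe = g_distinct_fraction([safe_harbor_view(r) for r in records], keys, g=1)

print(limited, safe)
```

On these toy records every full-date record is unique, while generalization merges two of them into one group; the study's percentages come from the same kind of count applied to the real cohort.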

Privacy-Preserving Outsourced Similarity Search

Journal of Database Management ◽

10.4018/jdm.2014070103 ◽

2014 ◽

Vol 25 (3) ◽

pp. 48-71 ◽

Cited By ~ 1

Author(s):

Stepan Kozak ◽

David Novak ◽

Pavel Zezula

Keyword(s):

Similarity Search ◽

Data Privacy ◽

System Analysis ◽

Keyword Search ◽

Evaluation Criteria ◽

Similarity Index ◽

Data Retrieval ◽

Privacy Preserving ◽

Scientific Data ◽

Sensitive Data

The general trend in data management is to outsource data to third-party systems that provide data retrieval as a service. This approach naturally raises privacy concerns about the (potentially sensitive) data. Recently, extensive research has been done on privacy-preserving outsourcing of traditional exact-match and keyword search. However, little attention has been paid to outsourcing of similarity search, which is essential for content-based retrieval in current multimedia, sensor, or scientific data. In this paper, the authors propose a scheme for outsourcing similarity search. They define evaluation criteria for these systems with an emphasis on usability, privacy, and efficiency in real applications. These criteria can serve as a general guideline for practical system analysis, and the authors use them to survey and compare existing approaches. As the main result, the authors propose a novel dynamic similarity index, EM-Index, that works for an arbitrary metric space and ensures data privacy, and is thus suitable for search systems outsourced, for example, to a cloud environment. In comparison with other approaches, the index is fully dynamic (update operations are efficient), and it aims to transfer as much load from clients to the server as possible.

Biomedical data privacy: problems, perspectives, and recent advances

Journal of the American Medical Informatics Association ◽

10.1136/amiajnl-2012-001509 ◽

2013 ◽

Vol 20 (1) ◽

pp. 2-6 ◽

Cited By ~ 64

Author(s):

B. A. Malin ◽

K. E. Emam ◽

C. M. O'Keefe

Keyword(s):

Data Privacy ◽

Biomedical Data ◽

Recent Advances

Public Key Encryption with Keyword Search from Lattices in Multiuser Environments

Mathematical Problems in Engineering ◽

10.1155/2016/6549570 ◽

2016 ◽

Vol 2016 ◽

pp. 1-7 ◽

Cited By ~ 1

Author(s):

Daini Wu ◽

Xiaoming Wang ◽

Qingqing Gan

Keyword(s):

Data Privacy ◽

Keyword Search ◽

Random Oracle Model ◽

Random Oracle ◽

Public Key ◽

Public Key Encryption ◽

Encrypted Data ◽

Cloud Server ◽

Learning With Errors ◽

Multiuser Environments

A public key encryption scheme with keyword search capabilities is proposed using lattices for applications in multiuser environments. The proposed scheme enables a cloud server to check if any given encrypted data contains certain keywords specified by multiple users, but the server would not have knowledge of the keywords specified by the users or the contents of the encrypted data, which provides data privacy as well as privacy for user queries in multiuser environments. It can be proven secure under the standard learning with errors assumption in the random oracle model.

USING DIFFERENT SEARCHING SCHEMAS FOR FUZZY KEYWORD SEARCH OVER CLOUD DATA

Graduate Research in Engineering and Technology ◽

10.47893/gret.2013.1027 ◽

2013 ◽

pp. 41-44

Author(s):

SYEDA FARHA SHAZMEEN ◽

RANGARAJU DEEPIKA

Keyword(s):

Cloud Computing ◽

Data Privacy ◽

Keyword Search ◽

Text Search ◽

Sensitive Data ◽

Cloud Data ◽

Fuzzy Search ◽

Wild Card ◽

Fuzzy Keyword Search ◽

Search And Retrieval

Cloud computing is a construct that allows you to access applications that actually reside at a location other than your computer or other internet-connected device. It uses the internet and central remote servers to maintain data and applications; the data is stored off-premises and accessed through keyword search. This makes search over encrypted cloud data important. Traditional keyword search operates on plaintext keywords, but to protect data privacy, sensitive data should be encrypted before outsourcing. Fuzzy keyword search greatly enhances system usability by returning the matching files; the fuzzy technique uses approximate full-text search and retrieval. Three fuzzy search schemes, the wildcard method, the gram-based method, and the tree-traverse search scheme, are discussed, and the efficiency of these algorithms is analyzed.
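
To make the wildcard method mentioned in this abstract concrete, here is a minimal plaintext sketch for edit distance 1. It assumes the usual construction in which a `*` either replaces one character or is inserted at one position, and two keywords match when their variant sets intersect. The function names are hypothetical, and the encryption of each variant into a search token, which a real scheme would require, is deliberately left out.

```python
def wildcard_set(word):
    """Wildcard variants of `word` covering edit distance 1.

    A '*' replacing a character covers substitution and deletion;
    a '*' inserted between characters covers insertion.
    """
    variants = {word}
    for i in range(len(word)):
        variants.add(word[:i] + "*" + word[i + 1:])  # replace position i
    for i in range(len(word) + 1):
        variants.add(word[:i] + "*" + word[i:])      # insert at position i
    return variants


def fuzzy_match(stored_keyword, query):
    """True when the two keywords share at least one wildcard variant."""
    return not wildcard_set(stored_keyword).isdisjoint(wildcard_set(query))


print(len(wildcard_set("castle")))      # 6 replacements + 7 insertions + the word itself
print(fuzzy_match("castle", "castl"))   # one deletion apart
print(fuzzy_match("castle", "cattle"))  # one substitution apart
print(fuzzy_match("castle", "cast"))    # two deletions apart
```

The appeal of the wildcard representation is that the variant set grows linearly with keyword length instead of enumerating every concrete misspelling.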

Implementing Machine Learning in Interventional Cardiology: The Benefits are Worth the Trouble

10.31219/osf.io/dfwz3 ◽

2021 ◽

Author(s):

Walid Ben Ali ◽

Ahmad Pesaranghader ◽

Robert Avram ◽

Reda Ibrahim ◽

Thomas Modine ◽

...

Keyword(s):

Machine Learning ◽

Data Privacy ◽

High Performance ◽

Interventional Cardiology ◽

Daily Practice ◽

Automated Analysis ◽

Coronary Intervention ◽

Biomedical Data ◽

Clinical Implementation ◽

Automatic Assessment

Driven by recent innovations and technological progress, the increasing quality and amount of biomedical data coupled with the advances in computing power allowed for much progress in artificial intelligence (AI) approaches for health and biomedical research. In interventional cardiology, the hope is for AI to provide automated analysis and deeper interpretation of data from electrocardiography, computed tomography, magnetic resonance imaging, and electronic health records, among others. Furthermore, high-performance predictive models supporting decision-making hold the potential to improve safety, diagnostic and prognostic prediction in patients undergoing interventional cardiology procedures. These applications include robotic-assisted percutaneous coronary intervention procedures and automatic assessment of coronary stenosis during diagnostic coronary angiograms. Machine learning (ML) has been used in these innovations that have improved the field of interventional cardiology, and more recently, deep learning (DL) has emerged as one of the most successful branches of ML in many applications. It remains to be seen if DL approaches will have a major impact on current and future practice. DL-based predictive systems also have several limitations, including lack of interpretability and lack of generalizability due to cohort heterogeneity and low sample sizes. There are also challenges for the clinical implementation of these systems, such as ethical limits and data privacy. This review is intended to bring the attention of health practitioners and interventional cardiologists to the broad and helpful applications of ML and DL algorithms to date in the field. Their implementation challenges in daily practice and future applications in the field of interventional cardiology are also discussed.

Building a Secure Biomedical Data Sharing Decentralized App (DApp): Tutorial

Journal of Medical Internet Research ◽

10.2196/13601 ◽

2019 ◽

Vol 21 (10) ◽

pp. e13601 ◽

Cited By ~ 2

Author(s):

Matthew Johnson ◽

Michael Jones ◽

Mark Shervey ◽

Joel T Dudley ◽

Noah Zimmerman

Keyword(s):

Data Privacy ◽

Computing System ◽

Use Case ◽

Biomedical Data ◽

New Paradigm ◽

Instructional Resources ◽

Software Developers ◽

Client Server ◽

Smart Contract ◽

Server Architecture

Decentralized apps (DApps) are computer programs that run on a distributed computing system, such as a blockchain network. Unlike the client-server architecture that powers most internet apps, DApps that are integrated with a blockchain network can execute app logic that is guaranteed to be transparent, verifiable, and immutable. This new paradigm has a number of unique properties that are attractive to the biomedical and health care communities. However, instructional resources are scarcely available for biomedical software developers to begin building DApps on a blockchain. Such apps require new ways of thinking about how to build, maintain, and deploy software. This tutorial serves as a complete working prototype of a DApp, motivated by a real use case in biomedical research requiring data privacy. We describe the architecture of a DApp, the implementation details of a smart contract, a sample iPhone operating system (iOS) DApp that interacts with the smart contract, and the development tools and libraries necessary to get started. The code necessary to recreate the app is publicly available.

Privacy-Preserving and Efficient Public Key Encryption with Keyword Search Based on CP-ABE in Cloud

Cryptography ◽

10.3390/cryptography4040028 ◽

2020 ◽

Vol 4 (4) ◽

pp. 28

Author(s):

Yunhong Zhou ◽

Shihui Zheng ◽

Licheng Wang

Keyword(s):

Access Control ◽

Data Privacy ◽

Keyword Search ◽

Good Method ◽

Privacy Preserving ◽

Public Key ◽

Public Key Encryption ◽

Encrypted Data ◽

Fine Grained ◽

Attribute Based Encryption

In the area of searchable encryption, public key encryption with keyword search (PEKS) has been a critically important and promising technique which provides secure search over encrypted data in cloud computing. PEKS can protect user data privacy without affecting the usage of the data stored in the untrusted cloud server environment. However, most of the existing PEKS schemes concentrate on data users’ rich search functionalities, regardless of their search permission. Attribute-based encryption technology is a good method to solve the security issues, which provides fine-grained access control to the encrypted data. In this paper, we propose a privacy-preserving and efficient public key encryption with keyword search scheme by using the ciphertext-policy attribute-based encryption (CP-ABE) technique to support both fine-grained access control and keyword search over encrypted data simultaneously. We formalize the security definition, and prove that our scheme achieves selective indistinguishability security against an adaptive chosen keyword attack. Finally, we present the performance analysis in terms of theoretical analysis and experimental analysis, and demonstrate the efficiency of our scheme.

Privacy Preserving Classification of Biomedical Data With Secure Removing of Duplicate Records

Research Anthology on Privatizing and Securing Data ◽

10.4018/978-1-7998-8954-0.ch026 ◽

2021 ◽

pp. 569-588

Author(s):

Boudheb Tarik ◽

Elberrichi Zakaria

Keyword(s):

Data Mining ◽

Data Privacy ◽

Privacy Preserving ◽

Third Party ◽

Distributed Data ◽

Biomedical Data ◽

Collaborative Models ◽

Highly Sensitive ◽

Complete Access

Classification automatically assigns predefined classes to data. It is one of the main applications of data mining. Having complete access to all data is critical for building accurate models. Data can be highly sensitive, such as biomedical data, which cannot be disclosed or shared with third parties because doing so can harm individuals and organizations. The challenge is how to preserve both the privacy and the usefulness of data. Privacy-preserving classification addresses this problem. Collaborative models are constructed over networks without violating the data owners' privacy. In this article, the authors address two problems: privacy-preserving deduplication of identical records and privacy-preserving classification. They propose a randomized hash technique for deduplication and an enhanced privacy-preserving classification of biomedical data over horizontally distributed data based on two homomorphic encryption schemes. No private, intermediate, or final results are disclosed. Experiments show that their solution is efficient and secure without loss of accuracy.
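
The abstract does not spell out the randomized hash technique, so the following is only a generic sketch of hash-based deduplication across sites, not the authors' method. It assumes the data owners share a secret key and exchange keyed hashes (HMAC-SHA256) of canonicalized records instead of the records themselves. The key, the record fields, and the function names are all hypothetical.

```python
import hashlib
import hmac


def record_token(record, secret):
    """Keyed hash of a canonicalized record; equal records give equal tokens."""
    # Canonicalize so formatting differences do not hide duplicates.
    canonical = "|".join(
        f"{key}={str(record[key]).strip().lower()}" for key in sorted(record)
    )
    return hmac.new(secret, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def first_occurrences(tokens):
    """Indices of the first occurrence of each token, in order."""
    seen = set()
    keep = []
    for index, token in enumerate(tokens):
        if token not in seen:
            seen.add(token)
            keep.append(index)
    return keep


secret = b"hypothetical-shared-secret"  # assumption: agreed on by the data owners
site_a = [{"age": 54, "sex": "F", "dx": "I10"}]
site_b = [{"age": 54, "sex": "f ", "dx": "i10"}, {"age": 61, "sex": "M", "dx": "E11"}]

tokens = [record_token(r, secret) for r in site_a + site_b]
kept = first_occurrences(tokens)
print(kept)
```

Because the hash is keyed, a party without the secret cannot test guessed records against the tokens, which a plain unsalted hash would allow.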

An Adaptive Biomedical Data Managing Scheme Based on the Blockchain Technique

Applied Sciences ◽

10.3390/app9122494 ◽

2019 ◽

Vol 9 (12) ◽

pp. 2494 ◽

Cited By ~ 2

Author(s):

Ahmed Faeq Hussein ◽

Abbas K. ALZubaidi ◽

Qais Ahmed Habash ◽

Mustafa Musa Jaber

Keyword(s):

Data Privacy ◽

Privacy Preservation ◽

Biomedical Data ◽

Data Types ◽

Health Records ◽

Multiple Data ◽

Biomedical Systems ◽

Unified View ◽

Keyword Searching ◽

Hash Key

Personal biomedical data play a crucial role in maintaining proficient access to health records for patients as well as health professionals. However, it is difficult to get a unified view of health data that have been scattered across various health centers/hospital sections. To be specific, health records are distributed across many places and cannot be integrated easily. In recent years, blockchain has arisen as a promising solution that helps to achieve the sharing of individual biomedical information in a secure way, whilst also having the benefit of privacy preservation because of its immutability. This research puts forward a blockchain-based managing scheme that helps to establish interpretation improvements pertaining to electronic biomedical systems. In this scheme, two blockchains were employed to construct the base, whereby the second blockchain algorithm was used to generate a secure sequence for the hash key that was generated in the first blockchain algorithm. This adaptive feature enables the algorithm to use multiple data types and also to combine various biomedical images and text records. All data, including keywords, digital records, and the identity of patients, are private key encrypted with a keyword searching function so as to maintain data privacy, access control, and a protected search function. The obtained results, which show a low latency (less than 750 ms) at 400 requests/second, indicate the possibility of its use within several health care units such as hospitals and clinics.
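
The immutability this abstract relies on comes from hash-linking blocks, which the sketch below shows in its simplest form. It is a generic single-chain illustration and does not reproduce the paper's two-blockchain scheme or its keyword search; the block layout, payload fields, and function names are assumptions made for the example.

```python
import hashlib
import json

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, prev_hash, payload):
    """SHA-256 over a deterministic JSON encoding of the block contents."""
    body = json.dumps(
        {"index": index, "prev": prev_hash, "payload": payload}, sort_keys=True
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a block that commits to the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_PREV
    index = len(chain)
    chain.append({
        "index": index,
        "prev": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    })


def is_valid(chain):
    """Recompute every hash and check every link to the previous block."""
    prev_hash = GENESIS_PREV
    for index, block in enumerate(chain):
        if block["index"] != index or block["prev"] != prev_hash:
            return False
        if block["hash"] != block_hash(index, prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True


chain = []
append_block(chain, {"record_id": "r1", "digest": "abc"})  # hypothetical payloads
append_block(chain, {"record_id": "r2", "digest": "def"})

ok_before = is_valid(chain)
chain[0]["payload"]["digest"] = "tampered"  # modify an earlier block in place
ok_after = is_valid(chain)
print(ok_before, ok_after)
```

Editing any earlier payload changes its recomputed hash, so validation fails unless every later block is rewritten as well.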
