sensitive data
Recently Published Documents

Driven by the evolution of the Internet of Things and the use of smart technologies, clinical data is growing exponentially. The resulting big biomedical data is confidential, as it contains patients’ personal information and findings. It is usually stored in the cloud, which makes it convenient to access and share. Data shared for research purposes helps to reveal useful and previously unexposed aspects. Unfortunately, sharing such sensitive data also creates privacy threats. Clinical data is generally available in textual format (e.g., perception reports). In natural language processing, many studies have been published on mitigating privacy breaches in textual clinical data. However, current studies still have limitations and shortcomings that need to be addressed. In this article, a novel framework for textual medical data privacy, Deep-Confidentiality, is proposed. The proposed framework improves Medical Entity Recognition (MER) using deep neural networks and sanitization compared to the current state-of-the-art techniques. Moreover, a new and generic utility metric is proposed, which overcomes the shortcomings of the existing utility metric. It provides a true representation of sanitized documents as compared to the original documents. To check the proposed framework’s effectiveness, it is evaluated on the i2b2-2010 NLP challenge dataset, which is considered one of the complex medical datasets for MER. The proposed framework improves MER by 7.8% in recall, 7% in precision, and 3.8% in F1-score compared to the existing deep learning models. It also improves the data utility of sanitized documents by up to 13.79%, where the value of k is 3.

Formal Modelling and Automated Trade-off Analysis of Enforcement Architectures for Cryptographic Access Control in the Cloud

ACM Transactions on Privacy and Security ◽

10.1145/3474056 ◽

2022 ◽

Vol 25 (1) ◽

pp. 1-37

Author(s):

Stefano Berlato ◽

Roberto Carbone ◽

Adam J. Lee ◽

Silvio Ranise

Keyword(s):

Access Control ◽

Data Storage ◽

Service Providers ◽

Cloud Service ◽

Sensitive Data ◽

Trade Offs ◽

Malicious Insiders ◽

Lock In Effect ◽

Lock In ◽

Cryptographic Access Control

To facilitate the adoption of the cloud by organizations, Cryptographic Access Control (CAC) is the obvious solution to control data sharing among users while preventing partially trusted Cloud Service Providers (CSP) from accessing sensitive data. Indeed, several CAC schemes have been proposed in the literature. Despite their differences, available solutions are based on a common set of entities (e.g., a data storage service or a proxy mediating the access of users to encrypted data) that operate in different (security) domains (e.g., on-premise or the CSP). However, the majority of these CAC schemes assume a fixed assignment of entities to domains; this has security and usability implications that are not made explicit and can make the use of a CAC scheme inappropriate in certain scenarios with specific trust assumptions and requirements. For instance, assuming that the proxy runs at the premises of the organization avoids the vendor lock-in effect but may give rise to other security concerns (e.g., malicious insider attackers). To the best of our knowledge, no previous work considers how to select the best possible architecture (i.e., the assignment of entities to domains) to deploy a CAC scheme for the trust assumptions and requirements of a given scenario. In this article, we propose a methodology to assist administrators in exploring different architectures for the enforcement of CAC schemes in a given scenario. We do this by identifying the possible architectures underlying the CAC schemes available in the literature and formalizing them in simple set theory. This allows us to reduce the problem of selecting the most suitable architectures satisfying a heterogeneous set of trust assumptions and requirements arising from the considered scenario to a decidable Multi-objective Combinatorial Optimization Problem (MOCOP) for which state-of-the-art solvers can be invoked. Finally, we show how we use the capability of solving the MOCOP to build a prototype tool that assists administrators in performing a preliminary “What-if” analysis to explore the trade-offs among the various architectures and then in using available standards and tools (such as TOSCA and Cloudify) for automated deployment in multiple CSPs.
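
The reduction described in the abstract above, from architecture selection to a multi-objective combinatorial problem, can be illustrated with a minimal sketch. Everything here is a hypothetical illustration: the entity names, domain names, and the risk and cost scores are invented for the example, and the brute-force enumeration stands in for the state-of-the-art solvers mentioned in the abstract. It is not the paper's formalization.

```python
from itertools import product

# Hypothetical entities and domains, for illustration only.
ENTITIES = ("proxy", "storage")
DOMAINS = ("on_premise", "csp")

# Hypothetical per-placement scores (lower is better for both objectives).
RISK = {("proxy", "on_premise"): 2, ("proxy", "csp"): 3,
        ("storage", "on_premise"): 1, ("storage", "csp"): 3}
COST = {("proxy", "on_premise"): 3, ("proxy", "csp"): 1,
        ("storage", "on_premise"): 2, ("storage", "csp"): 2}


def score(assignment):
    """Return (risk, cost) for a dict mapping entity -> domain."""
    risk = sum(RISK[(e, d)] for e, d in assignment.items())
    cost = sum(COST[(e, d)] for e, d in assignment.items())
    return (risk, cost)


def dominates(a, b):
    """True if a is no worse than b on every objective and differs from b."""
    return all(x <= y for x, y in zip(a, b)) and a != b


def pareto_front():
    """Enumerate all assignments and keep the non-dominated ones."""
    candidates = [dict(zip(ENTITIES, choice))
                  for choice in product(DOMAINS, repeat=len(ENTITIES))]
    scored = [(arch, score(arch)) for arch in candidates]
    return [(arch, s) for arch, s in scored
            if not any(dominates(other, s) for _, other in scored)]


if __name__ == "__main__":
    for arch, (risk, cost) in pareto_front():
        print(arch, "risk:", risk, "cost:", cost)
```

The non-dominated assignments are the candidates an administrator would compare in a "What-if" analysis. Exhaustive enumeration only works for a handful of entities and domains, so larger instances would need a dedicated solver.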

Secure Selections on Encrypted Multi-writer Streams

ACM Transactions on Privacy and Security ◽

10.1145/3485470 ◽

2022 ◽

Vol 25 (1) ◽

pp. 1-33

Author(s):

Angelo Massimo Perillo ◽

Giuseppe Persiano ◽

Alberto Trombetta

Keyword(s):

Efficient Solutions ◽

Direct Access ◽

Security Model ◽

Sensitive Data ◽

Encrypted Data ◽

The Public ◽

Data Owner ◽

Hard Problems ◽

Pairing Based Cryptography ◽

Access Policies

Searching over encrypted data is a very active research area. Several efficient solutions have been provided for the single-writer scenario, in which all sensitive data originate with one party (the Data Owner) that encrypts and uploads the data to a public repository. Subsequently, the Data Owner accesses the encrypted data through a Query Processor, which has direct access to the public encrypted repository. Motivated by the recent trend in pervasive data collection, we depart from this model and consider a multi-writer scenario in which the data originate with several mutually untrusted parties, the Data Sources. In this new scenario, the Data Owner provides public parameters so that each Data Source can add encrypted items to the public encrypted stream; moreover, the Data Owner keeps some related secret information needed to generate tokens so that different Query Sources can decrypt different subsets of the encrypted stream, as specified by corresponding access policies. We propose a security model for this problem, which we call Secure Selective Stream (SSS), and give a secure construction for it based on hard problems in Pairing-Based Cryptography. The cryptographic core of our construction is a new primitive, Amortized Orthogonality Encryption, that is crucial for the efficiency of the proposed implementation of SSS.

C3PO: Cloud-based Confidentiality-preserving Continuous Query Processing

ACM Transactions on Privacy and Security ◽

10.1145/3472717 ◽

2022 ◽

Vol 25 (1) ◽

pp. 1-36

Author(s):

Savvas Savvides ◽

Seema Kumar ◽

Julian James Stephen ◽

Patrick Eugster

Keyword(s):

Low Cost ◽

Empirical Evaluation ◽

Raspberry Pi ◽

Sensitive Data ◽

Cloud Resource ◽

Small Device ◽

Computationally Intensive ◽

Iot Devices ◽

Cloud Infrastructures ◽

Memory Resources

With the advent of the Internet of Things (IoT), billions of devices are expected to continuously collect and process sensitive data (e.g., location, personal health factors). Due to the limited computational capacity available on IoT devices, the current de facto model for building IoT applications is to send the gathered data to the cloud for computation. While building private cloud infrastructures for handling large amounts of data streams can be expensive, using low-cost public (untrusted) cloud infrastructures for processing continuous queries that include sensitive data leads to strong concerns over data confidentiality. This article presents C3PO, a confidentiality-preserving, continuous query processing engine that leverages the public cloud. The key idea is to intelligently utilize partially homomorphic and property-preserving encryption to perform as many computationally intensive operations as possible in the untrusted cloud without revealing plaintext. C3PO provides simple abstractions to the developer to hide the complexities of applying complex cryptographic primitives, reasoning about the performance of such primitives, deciding which computations can be executed in an untrusted tier, and optimizing cloud resource usage. An empirical evaluation with several benchmarks and case studies shows the feasibility of our approach. We consider different classes of IoT devices that differ in their computational and memory resources (from a Raspberry Pi 3 to a very small device with a Cortex-M3 microprocessor) and, through the use of optimizations, we demonstrate the feasibility of using partially homomorphic and property-preserving encryption on IoT devices.
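
To make "partially homomorphic encryption" concrete, here is a minimal sketch of an additively homomorphic scheme. It uses the Paillier cryptosystem as an assumed example; the abstract does not say which schemes C3PO uses. The primes are toy-sized and insecure, and all function names are hypothetical. The point is only that an untrusted party can add ciphertexts, or scale one by a public constant, without ever seeing the plaintext.

```python
import math
import random


def keygen(p=293, q=433):
    """Build a toy Paillier key pair from two small primes (insecure sizes)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                # common simple choice of generator
    mu = pow(lam, -1, n)     # inverse of lam mod n; valid because g = n + 1
    return (n, g), (lam, mu)


def encrypt(pub, m):
    """Encrypt an integer 0 <= m < n under the public key."""
    n, g = pub
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(pub, priv, c):
    """Recover the plaintext using the private key."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n) * mu % n


def add_encrypted(pub, c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)


def scale_encrypted(pub, c, k):
    """Homomorphic scaling: raising a ciphertext to k multiplies the plaintext by k."""
    n, _ = pub
    return pow(c, k, n * n)


if __name__ == "__main__":
    pub, priv = keygen()
    total = add_encrypted(pub, encrypt(pub, 20), encrypt(pub, 22))
    print(decrypt(pub, priv, total))
```

In a deployment along the lines the abstract describes, the device would hold the private key and the cloud would only ever call operations like `add_encrypted` and `scale_encrypted`. Results wrap around modulo `n`, so plaintext sums must stay below the modulus.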

Computing Blindfolded on Data Homomorphically Encrypted under Multiple Keys: A Survey

ACM Computing Surveys ◽

10.1145/3477139 ◽

2022 ◽

Vol 54 (9) ◽

pp. 1-37

Author(s):

Asma Aloufi ◽

Peizhao Hu ◽

Yongsoo Song ◽

Kristin Lauter

Keyword(s):

Homomorphic Encryption ◽

Lessons Learned ◽

Secret Key ◽

Sensitive Data ◽

Encrypted Data ◽

Outsourced Computation ◽

Secret Keys ◽

Comprehensive Survey ◽

Multiple Keys ◽

Cryptographic Techniques

With the capability of performing computations on encrypted data without needing the secret key, homomorphic encryption (HE) is a promising cryptographic technique that makes outsourced computations secure and privacy-preserving. In the decade since Gentry’s breakthrough discovery of how we might support arbitrary computations on encrypted data, many studies have followed and improved various aspects of HE, such as faster bootstrapping and ciphertext packing. However, the topic of how to support secure computations on ciphertexts encrypted under multiple keys has not received enough attention. This capability is crucial in many application scenarios where data owners want to engage in joint computations and prefer to protect their sensitive data under their own secret keys. Enabling this capability is a non-trivial task. In this article, we present a comprehensive survey of the state-of-the-art multi-key techniques and schemes that target different systems and threat models. In particular, we review recent constructions based on Threshold Homomorphic Encryption (ThHE) and Multi-Key Homomorphic Encryption (MKHE). We analyze these cryptographic techniques and schemes based on a new secure outsourced computation model and examine their complexities. We share lessons learned and draw observations for designing better schemes with reduced overheads.

Evolutionary tree-based quasi identifier and federated gradient privacy preservations over big healthcare data

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v12i1.pp903-913 ◽

2022 ◽

Vol 12 (1) ◽

pp. 903

Author(s):

Sujatha Krishna ◽

Udayarani Vinayaka Murthy

Keyword(s):

Privacy Preservation ◽

Learning Model ◽

Information Loss ◽

Evolutionary Tree ◽

Sensitive Data ◽

Protection Mechanism ◽

Healthcare Data ◽

Numerical Attributes ◽

Optimal Balance ◽

Influence Of Noise

Big data has remodeled the way organizations supervise, examine and leverage data in any industry. To safeguard sensitive data from public contraventions, several countries have investigated this issue and implemented privacy protection mechanisms. Relying on quasi-identifiers, however, does not preserve privacy to a great extent. This paper proposes a method called evolutionary tree-based quasi-identifier and federated gradient (ETQI-FD) for privacy preservation over big healthcare data. The first step in ETQI-FD is learning quasi-identifiers. Learning quasi-identifiers by employing an information loss function separately for categorical and numerical attributes achieves both the largest dissimilarities and partitioning without a comprehensive exploration between tuples of features or attributes. Next, with the learnt quasi-identifiers, privacy preservation of each data item is achieved by applying a federated gradient arbitrary privacy preservation learning model. This model attains an optimal balance between privacy and accuracy. In the federated gradient privacy preservation learning model, we evaluate the determinant of each attribute to the outputs. Then, by injecting adaptive Lorentz noise into the data attributes, ETQI-FD significantly reduces the influence of noise on the final results, thereby contributing to both privacy and accuracy. An experimental evaluation shows that the ETQI-FD method achieves better accuracy and privacy than the existing methods.

Oblivious algebraic data types

Proceedings of the ACM on Programming Languages ◽

10.1145/3498713 ◽

2022 ◽

Vol 6 (POPL) ◽

pp. 1-29

Author(s):

Qianchuan Ye ◽

Benjamin Delaware

Keyword(s):

Programming Languages ◽

Type System ◽

Secure Computation ◽

Data Types ◽

Sensitive Data ◽

Private Data ◽

Algebraic Data Types ◽

Recursive Data Structures ◽

Recursive Data ◽

Cryptographic Techniques

Secure computation allows multiple parties to compute joint functions over private data without leaking any sensitive data, typically using powerful cryptographic techniques. Writing secure applications using these techniques directly can be challenging, resulting in the development of several programming languages and compilers that aim to make secure computation accessible. Unfortunately, many of these languages either lack or have limited support for rich recursive data structures, like trees. In this paper, we propose a novel representation of structured data types, which we call oblivious algebraic data types, and a language for writing secure computations using them. This language combines dependent types with constructs for oblivious computation, and provides a security-type system which ensures that adversaries can learn nothing more than the result of a computation. Using this language, authors can write a single function over private data, and then easily build an equivalent secure computation according to a desired public view of their data.

A Survey of Privacy Vulnerabilities of Mobile Device Sensors

ACM Computing Surveys ◽

10.1145/3510579 ◽

2022 ◽

Author(s):

Paula Delgado-Santos ◽

Giuseppe Stragapede ◽

Ruben Tolosana ◽

Richard Guest ◽

Farzin Deravi ◽

...

Keyword(s):

Mobile Devices ◽

Mobile Device ◽

State Of The Art ◽

The State ◽

Sensitive Data ◽

Data Utility ◽

Open Research ◽

Research Questions ◽

Behaviour Recognition ◽

Critical Aspects

The number of mobile devices, such as smartphones and smartwatches, is relentlessly increasing to almost 6.8 billion by 2022, and along with it, the amount of personal and sensitive data captured by them. This survey overviews the state of the art of what personal and sensitive user attributes can be extracted from mobile device sensors, emphasising critical aspects such as demographics, health and body features, activity and behaviour recognition, etc. In addition, we review popular metrics in the literature to quantify the degree of privacy, and discuss powerful privacy methods to protect the sensitive data while preserving data utility for analysis. Finally, open research questions are presented for further advancements in the field.

pShare: Privacy-Preserving Ride-Sharing System with Minimum-Detouring Route

Applied Sciences ◽

10.3390/app12020842 ◽

2022 ◽

Vol 12 (2) ◽

pp. 842

Author(s):

Junxin Huang ◽

Yuchuan Luo ◽

Ming Xu ◽

Bowen Hu ◽

Jian Long

Keyword(s):

Service Provider ◽

Time Estimation ◽

Privacy Preserving ◽

Travel Time Estimation ◽

Sensitive Data ◽

Transportation Services ◽

Privacy Leakage ◽

Ride Sharing ◽

Multiple Data ◽

Privacy Problem

Online ride-hailing (ORH) services allow people to enjoy on-demand transportation services through their mobile devices with a short response time. Despite the great convenience, users need to submit their location information to the ORH service provider, which may cause unexpected privacy problems. In this paper, we mainly study the privacy and utility of the ride-sharing system, which enables multiple riders to share one driver. To solve the privacy problem and reduce the ride-sharing detouring waste, we propose a privacy-preserving ride-sharing system named pShare. To hide users’ precise locations from the service provider, we apply a zone-based travel time estimation approach to privately compute over sensitive data while cloaking each rider’s location in a zone area. To compute the matching results along with the least-detouring route, the service provider first computes the shortest path for each eligible rider combination, then compares the additional traveling time (ATT) of all combinations, and finally selects the combination with minimum ATT. We designed a secure comparison protocol based on garbled circuits, which enables the ORH server to execute the protocol with a crypto server without privacy leakage. Moreover, we apply a data packing technique, by which multiple data items can be packed as one to reduce the communication and computation overhead. Through theoretical analysis and evaluation results, we prove that pShare is a practical ride-sharing scheme that can find the sharing riders with minimum ATT at acceptable accuracy while protecting users’ privacy.
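
The matching step in the abstract above (shortest paths per rider combination, then selecting the minimum ATT) can be sketched in plaintext, leaving out the garbled-circuit comparison and data packing entirely. Several things here are assumptions made for illustration: the zone graph, the rider identifiers, the restriction to routes that pick up both riders before any drop-off, and the definition of ATT as the shared route time minus the first rider's solo trip time. The paper may define these differently.

```python
from itertools import permutations

INF = float("inf")


def all_pairs_times(n_zones, edges):
    """Floyd-Warshall over an undirected zone graph; edges are (u, v, minutes)."""
    t = [[0 if i == j else INF for j in range(n_zones)] for i in range(n_zones)]
    for u, v, w in edges:
        t[u][v] = min(t[u][v], w)
        t[v][u] = min(t[v][u], w)
    for k in range(n_zones):
        for i in range(n_zones):
            for j in range(n_zones):
                if t[i][k] + t[k][j] < t[i][j]:
                    t[i][j] = t[i][k] + t[k][j]
    return t


def best_shared_time(t, rider_a, rider_b):
    """Shortest route that picks up both riders before dropping either off."""
    (pa, da), (pb, db) = rider_a, rider_b
    best = INF
    for first, second in permutations((pa, pb)):
        for third, fourth in permutations((da, db)):
            total = t[first][second] + t[second][third] + t[third][fourth]
            best = min(best, total)
    return best


def select_min_att(t, rider_a, candidates):
    """Return (candidate_id, ATT) with the smallest additional travelling time."""
    solo = t[rider_a[0]][rider_a[1]]
    scored = {cid: best_shared_time(t, rider_a, trip) - solo
              for cid, trip in candidates.items()}
    best_id = min(scored, key=scored.get)
    return best_id, scored[best_id]


if __name__ == "__main__":
    # Hypothetical zones 0..4 on a line, 5 minutes between neighbours.
    times = all_pairs_times(5, [(0, 1, 5), (1, 2, 5), (2, 3, 5), (3, 4, 5)])
    riders = {"b1": (1, 3), "b2": (3, 1)}  # (pickup zone, drop-off zone)
    print(select_min_att(times, (0, 4), riders))
```

Working on zone identifiers rather than exact coordinates mirrors the cloaking idea in the abstract. In the described system, the comparison inside `select_min_att` is what would run under the secure protocol instead of in the clear.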

Sensing pedestrian flows for real-time assessment of non-pharmaceutical policy interventions during the COVID-19 pandemic

International Journal for Population Data Science ◽

10.23889/ijpds.v5i4.1688 ◽

2022 ◽

Vol 5 (4) ◽

Author(s):

Jonas Klingwort ◽

Sofie Myriam Marcel Gabrielle De Broe ◽

Sven Alexander Brocker

Keyword(s):

Real Time ◽

Pharmaceutical Policy ◽

Mobility Patterns ◽

Sensitive Data ◽

Fast Evaluation ◽

Social Contacts ◽

Behavioural Changes ◽

Mobility Data ◽

Policy Interventions ◽

Time Assessment

Introduction: To combat and mitigate the transmission of the SARS-CoV-2 virus, reducing the number of social contacts within a population is highly effective. Non-pharmaceutical policy interventions, e.g. stay-at-home orders, closing schools, universities, and (non-essential) businesses, are expected to decrease pedestrian flows in public areas, leading to reduced social contacts. The extent to which such interventions show the targeted effect is often measured retrospectively by surveying behavioural changes. Approaches that use data generated through mobile phones are hindered by data confidentiality and privacy regulations and complicated by selection effects. Furthermore, access to such sensitive data is limited. However, a complex pandemic situation requires a fast evaluation of the effectiveness of the introduced interventions aiming to reduce social contacts. Location-based sensor systems installed in cities, providing objective measurements of spatial mobility in the form of pedestrian flows, are suited for such a purpose. These devices record changes in a population’s behaviour in real-time, do not have privacy problems as they do not identify persons, and have no selection problems due to ownership of a device. Objective: This work aimed to analyse location-based sensor measurements of pedestrian flows in 49 metropolitan areas at 100 locations in Germany to study whether such technology is suitable for the real-time assessment of behavioural changes during a phase of several different pandemic-related policy interventions. Methods: Spatial mobility data of pedestrian flows was linked with policy interventions using the date as a unique linkage key. Data was visualised to observe potential changes in pedestrian flows before or after interventions. Furthermore, differences in time series of pedestrian counts between the pandemic and the pre-pandemic year were analysed. Results: The sensors detected changes in mobility patterns even before policy interventions were enacted. Compared to the pre-pandemic year, pedestrian counts were 85% lower. Conclusions: The study illustrated the practical value of sensor-based real-time measurements when linked with non-pharmaceutical policy intervention data. This study’s core contribution is that the sensors detected behavioural changes before enacting or loosening non-pharmaceutical policy interventions. Therefore, such technologies should be considered in the future by policymakers for crisis management and policy evaluation.
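
The linkage step described in the Methods (joining pedestrian counts to policy interventions with the date as the key, then comparing against a baseline year) can be sketched as follows. The function names, the counts, and the intervention label are all hypothetical and synthetic; they are not the study's data or code.

```python
from datetime import date


def link_by_date(counts, interventions):
    """Join daily counts with interventions, using the date as the linkage key."""
    return [{"date": day, "count": n, "intervention": interventions.get(day)}
            for day, n in sorted(counts.items())]


def percent_change(current, baseline):
    """Relative change of current versus baseline, in percent."""
    return 100.0 * (current - baseline) / baseline


if __name__ == "__main__":
    # Synthetic example values, not measurements from the study.
    counts = {date(2020, 3, 21): 400, date(2020, 3, 22): 250}
    interventions = {date(2020, 3, 22): "hypothetical stay-at-home order"}
    for row in link_by_date(counts, interventions):
        print(row)
    print(percent_change(250, 1000))
```

Days with no intervention carry `None`, which keeps the full time series intact so that changes occurring before an intervention date remain visible.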

Deep-Confidentiality: An IoT-Enabled Privacy-Preserving Framework for Unstructured Big Biomedical Data
