sensitive data
Recently Published Documents





2022 ◽  
Vol 22 (2) ◽  
pp. 1-21
Syed Atif Moqurrab ◽  
Adeel Anjum ◽  
Abid Khan ◽  
Mansoor Ahmed ◽  
Awais Ahmad ◽  

With the evolution of the Internet of Things, clinical data is growing exponentially and making use of smart technologies. The resulting big biomedical data is confidential, as it contains patients' personal information and findings. Such data is usually stored in the cloud, which makes it convenient to access and share. Data shared for research purposes helps to reveal useful and previously unexposed aspects. Unfortunately, sharing such sensitive data also creates privacy threats. Clinical data is generally available in textual format (e.g., perception reports). Within natural language processing, many studies have been published on mitigating privacy breaches in textual clinical data. However, current studies still have limitations and shortcomings that need to be addressed. In this article, a novel framework for textual medical data privacy, Deep-Confidentiality, is proposed. The proposed framework improves Medical Entity Recognition (MER) using deep neural networks, and improves sanitization, compared to the current state-of-the-art techniques. Moreover, a new and generic utility metric is proposed, which overcomes the shortcomings of the existing utility metric. It provides a true representation of sanitized documents as compared to the original documents. To check the framework's effectiveness, it is evaluated on the i2b2-2010 NLP challenge dataset, which is considered one of the more complex medical datasets for MER. The proposed framework improves MER by 7.8% in recall, 7% in precision, and 3.8% in F1-score compared to existing deep learning models. It also improves the data utility of sanitized documents by up to 13.79%, where the value of k is 3.

2022 ◽  
Vol 25 (1) ◽  
pp. 1-37
Stefano Berlato ◽  
Roberto Carbone ◽  
Adam J. Lee ◽  
Silvio Ranise

To facilitate the adoption of cloud by organizations, Cryptographic Access Control (CAC) is the obvious solution to control data sharing among users while preventing partially trusted Cloud Service Providers (CSP) from accessing sensitive data. Indeed, several CAC schemes have been proposed in the literature. Despite their differences, available solutions are based on a common set of entities (e.g., a data storage service or a proxy mediating the access of users to encrypted data) that operate in different (security) domains (e.g., on-premise or the CSP). However, the majority of these CAC schemes assume a fixed assignment of entities to domains; this has security and usability implications that are not made explicit and can make the use of a CAC scheme inappropriate in certain scenarios with specific trust assumptions and requirements. For instance, assuming that the proxy runs at the premises of the organization avoids the vendor lock-in effect but may give rise to other security concerns (e.g., malicious insider attackers). To the best of our knowledge, no previous work considers how to select the best possible architecture (i.e., the assignment of entities to domains) to deploy a CAC scheme for the trust assumptions and requirements of a given scenario. In this article, we propose a methodology to assist administrators in exploring different architectures for the enforcement of CAC schemes in a given scenario. We do this by identifying the possible architectures underlying the CAC schemes available in the literature and formalizing them in simple set theory. This allows us to reduce the problem of selecting the most suitable architectures, satisfying a heterogeneous set of trust assumptions and requirements arising from the considered scenario, to a decidable Multi-objective Combinatorial Optimization Problem (MOCOP) for which state-of-the-art solvers can be invoked. Finally, we show how we use the capability of solving the MOCOP to build a prototype tool that assists administrators in performing a preliminary "What-if" analysis to explore the trade-offs among the various architectures, and then use available standards and tools (such as TOSCA and Cloudify) for automated deployment in multiple CSPs.
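The idea of treating architecture selection as a multi-objective combinatorial problem can be illustrated with a minimal brute-force sketch. It is not the authors' formalization or solver: the entity names, domain names, the two objectives (risk and cost) and all scores below are hypothetical placeholders, and a real instance would use a dedicated solver rather than enumeration.

```python
from itertools import product

# Hypothetical entities and security domains (illustrative names only).
ENTITIES = ("proxy", "storage")
DOMAINS = ("on_premise", "csp")

# Hypothetical scores per placement, lower is better: (risk, cost).
SCORES = {
    ("proxy", "on_premise"): (1, 3),
    ("proxy", "csp"): (3, 1),
    ("storage", "on_premise"): (1, 4),
    ("storage", "csp"): (2, 1),
}

def evaluate(arch):
    """Sum the per-placement scores of one architecture into (risk, cost)."""
    risk = sum(SCORES[(e, d)][0] for e, d in arch.items())
    cost = sum(SCORES[(e, d)][1] for e, d in arch.items())
    return (risk, cost)

def dominates(a, b):
    """True if score a is no worse than b on every objective and differs from b."""
    return all(x <= y for x, y in zip(a, b)) and a != b

def pareto_front():
    """Enumerate every assignment of entities to domains and keep the non-dominated ones."""
    archs = [dict(zip(ENTITIES, combo))
             for combo in product(DOMAINS, repeat=len(ENTITIES))]
    scored = [(arch, evaluate(arch)) for arch in archs]
    return [(arch, s) for arch, s in scored
            if not any(dominates(t, s) for _, t in scored)]

if __name__ == "__main__":
    for arch, score in pareto_front():
        print(arch, score)
```

The surviving architectures are the trade-off candidates an administrator would inspect in a "What-if" analysis; enumeration is only viable here because the toy search space has four points.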

2022 ◽  
Vol 25 (1) ◽  
pp. 1-33
Angelo Massimo Perillo ◽  
Giuseppe Persiano ◽  
Alberto Trombetta

Searching over encrypted data is a very active research area. Several efficient solutions have been provided for the single-writer scenario, in which all sensitive data originate with one party (the Data Owner) that encrypts and uploads the data to a public repository. Subsequently, the Data Owner accesses the encrypted data through a Query Processor, which has direct access to the public encrypted repository. Motivated by the recent trend in pervasive data collection, we depart from this model and consider a multi-writer scenario in which the data originate with several mutually untrusted parties, the Data Sources. In this new scenario, the Data Owner provides public parameters so that each Data Source can add encrypted items to the public encrypted stream; moreover, the Data Owner keeps some related secret information needed to generate tokens so that different Query Sources can decrypt different subsets of the encrypted stream, as specified by corresponding access policies. We propose a security model for this problem, which we call Secure Selective Stream (SSS), and give a secure construction for it based on hard problems in Pairing-Based Cryptography. The cryptographic core of our construction is a new primitive, Amortized Orthogonality Encryption, that is crucial for the efficiency of the proposed implementation of SSS.

2022 ◽  
Vol 25 (1) ◽  
pp. 1-36
Savvas Savvides ◽  
Seema Kumar ◽  
Julian James Stephen ◽  
Patrick Eugster

With the advent of the Internet of Things (IoT), billions of devices are expected to continuously collect and process sensitive data (e.g., location, personal health factors). Due to the limited computational capacity available on IoT devices, the current de facto model for building IoT applications is to send the gathered data to the cloud for computation. While building private cloud infrastructures for handling large amounts of data streams can be expensive, using low-cost public (untrusted) cloud infrastructures for processing continuous queries that include sensitive data raises strong concerns over data confidentiality. This article presents C3PO, a confidentiality-preserving, continuous query processing engine that leverages the public cloud. The key idea is to intelligently utilize partially homomorphic and property-preserving encryption to perform as many computationally intensive operations as possible in the untrusted cloud without revealing plaintext. C3PO provides simple abstractions to the developer to hide the complexities of applying complex cryptographic primitives, reasoning about the performance of such primitives, deciding which computations can be executed in an untrusted tier, and optimizing cloud resource usage. An empirical evaluation with several benchmarks and case studies shows the feasibility of our approach. We consider different classes of IoT devices that differ in their computational and memory resources (from a Raspberry Pi 3 to a very small device with a Cortex-M3 microprocessor) and, through the use of optimizations, demonstrate the feasibility of using partially homomorphic and property-preserving encryption on IoT devices.
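To make "partially homomorphic encryption" concrete, here is a minimal sketch of the Paillier cryptosystem, a well-known additively homomorphic scheme. The abstract does not say which schemes C3PO uses, so Paillier is an assumption chosen for illustration, and the tiny hard-coded primes make this toy completely insecure. The function names are hypothetical.

```python
import math
import secrets

def keygen(p=1789, q=1861):
    """Toy key generation. p and q are tiny demo primes, NOT secure."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; valid because the generator is g = n + 1
    return n, (lam, mu)

def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under public modulus n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random r in 1..n-1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n, priv, c):
    """Recover the plaintext from ciphertext c using the private key."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)

if __name__ == "__main__":
    n, priv = keygen()
    total = add_encrypted(n, encrypt(n, 20), encrypt(n, 22))
    print(decrypt(n, priv, total))
```

An untrusted server holding only `n` and the ciphertexts could call `add_encrypted` to aggregate sensor readings without ever seeing a plaintext; only the key holder can decrypt the sum.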

2022 ◽  
Vol 54 (9) ◽  
pp. 1-37
Asma Aloufi ◽  
Peizhao Hu ◽  
Yongsoo Song ◽  
Kristin Lauter

With the capability of performing computations on encrypted data without needing the secret key, homomorphic encryption (HE) is a promising cryptographic technique that makes outsourced computations secure and privacy-preserving. In the decade after Gentry's breakthrough discovery of how we might support arbitrary computations on encrypted data, many studies followed and improved various aspects of HE, such as faster bootstrapping and ciphertext packing. However, the topic of how to support secure computations on ciphertexts encrypted under multiple keys has not received enough attention. This capability is crucial in many application scenarios where data owners want to engage in joint computations and prefer to protect their sensitive data under their own secret keys. Enabling this capability is a non-trivial task. In this article, we present a comprehensive survey of the state-of-the-art multi-key techniques and schemes that target different systems and threat models. In particular, we review recent constructions based on Threshold Homomorphic Encryption (ThHE) and Multi-Key Homomorphic Encryption (MKHE). We analyze these cryptographic techniques and schemes based on a new secure outsourced computation model and examine their complexities. We share lessons learned and draw observations for designing better schemes with reduced overheads.

Sujatha Krishna ◽  
Udayarani Vinayaka Murthy

Big data has remodeled the way organizations supervise, examine and leverage data in any industry. To safeguard sensitive data from public contraventions, several countries have investigated this issue and implemented privacy protection mechanisms. With the aid of quasi-identifiers alone, however, privacy is not preserved to a great extent. This paper proposes a method called evolutionary tree-based quasi-identifier and federated gradient (ETQI-FD) for privacy preservation over big healthcare data. The first step in ETQI-FD is learning quasi-identifiers. Learning quasi-identifiers by employing an information loss function separately for categorical and numerical attributes achieves both the largest dissimilarities and a partition without a comprehensive exploration between tuples of features or attributes. Next, with the learnt quasi-identifiers, privacy preservation of each data item is achieved by applying a federated gradient arbitrary privacy preservation learning model. This model attains an optimal balance between privacy and accuracy. In the federated gradient privacy preservation learning model, we evaluate the determinant of each attribute to the outputs. Then, by injecting adaptive Lorentz noise into data attributes, ETQI-FD significantly reduces the influence of noise on the final results, thereby contributing to both privacy and accuracy. An experimental evaluation shows that the ETQI-FD method achieves better accuracy and privacy than existing methods.

2022 ◽  
Vol 6 (POPL) ◽  
pp. 1-29
Qianchuan Ye ◽  
Benjamin Delaware

Secure computation allows multiple parties to compute joint functions over private data without leaking any sensitive data, typically using powerful cryptographic techniques. Writing secure applications using these techniques directly can be challenging, resulting in the development of several programming languages and compilers that aim to make secure computation accessible. Unfortunately, many of these languages either lack or have limited support for rich recursive data structures, like trees. In this paper, we propose a novel representation of structured data types, which we call oblivious algebraic data types, and a language for writing secure computations using them. This language combines dependent types with constructs for oblivious computation, and provides a security-type system which ensures that adversaries can learn nothing more than the result of a computation. Using this language, authors can write a single function over private data, and then easily build an equivalent secure computation according to a desired public view of their data.

2022 ◽  
Paula Delgado-Santos ◽  
Giuseppe Stragapede ◽  
Ruben Tolosana ◽  
Richard Guest ◽  
Farzin Deravi ◽  

The number of mobile devices, such as smartphones and smartwatches, is relentlessly increasing to almost 6.8 billion by 2022, and along with it, the amount of personal and sensitive data captured by them. This survey overviews the state of the art of what personal and sensitive user attributes can be extracted from mobile device sensors, emphasising critical aspects such as demographics, health and body features, activity and behaviour recognition, etc. In addition, we review popular metrics in the literature to quantify the degree of privacy, and discuss powerful privacy methods to protect the sensitive data while preserving data utility for analysis. Finally, open research questions are presented for further advancements in the field.

2022 ◽  
Vol 12 (2) ◽  
pp. 842
Junxin Huang ◽  
Yuchuan Luo ◽  
Ming Xu ◽  
Bowen Hu ◽  
Jian Long

Online ride-hailing (ORH) services allow people to enjoy on-demand transportation through their mobile devices with short response times. Despite the great convenience, users need to submit their location information to the ORH service provider, which may cause unexpected privacy problems. In this paper, we mainly study the privacy and utility of the ride-sharing system, which enables multiple riders to share one driver. To solve the privacy problem and reduce ride-sharing detours, we propose a privacy-preserving ride-sharing system named pShare. To hide users' precise locations from the service provider, we apply a zone-based travel time estimation approach to privately compute over sensitive data while cloaking each rider's location in a zone area. To compute the matching results along with the least-detouring route, the service provider first computes the shortest path for each eligible rider combination, then compares the additional traveling time (ATT) of all combinations, and finally selects the combination with minimum ATT. We designed a secure comparison protocol based on garbled circuits, which enables the ORH server to execute the protocol with a crypto server without privacy leakage. Moreover, we apply a data packing technique, by which multiple data items can be packed as one to reduce the communication and computation overhead. Through theoretical analysis and evaluation results, we show that pShare is a practical ride-sharing scheme that can identify the sharing riders with minimum ATT at acceptable accuracy while protecting users' privacy.
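The matching step (compute the shortest shared route per rider combination, compare ATT, pick the minimum) can be sketched in plaintext as follows. This is only an illustration: pShare performs the comparison under garbled circuits, which this sketch omits entirely. The abstract does not define ATT precisely, so the definition used here (shared route time minus the longer of the two solo trips) is an assumption, as are the zone travel-time matrix and rider data.

```python
from itertools import combinations

def route_time(T, stops):
    """Total travel time along a sequence of zones, using zone-to-zone times T."""
    return sum(T[a][b] for a, b in zip(stops, stops[1:]))

def shared_time(T, r1, r2):
    """Shortest of the four valid pickup/drop-off orders for two riders."""
    (o1, d1), (o2, d2) = r1, r2
    routes = [(o1, o2, d1, d2), (o1, o2, d2, d1),
              (o2, o1, d1, d2), (o2, o1, d2, d1)]
    return min(route_time(T, r) for r in routes)

def att(T, r1, r2):
    """Assumed ATT: shared route time minus the longer solo trip."""
    solo = max(T[r1[0]][r1[1]], T[r2[0]][r2[1]])
    return shared_time(T, r1, r2) - solo

def best_pair(T, riders):
    """Return the pair of rider ids with minimum ATT."""
    return min(combinations(sorted(riders), 2),
               key=lambda p: att(T, riders[p[0]], riders[p[1]]))

if __name__ == "__main__":
    # Hypothetical travel times (minutes) between four zones on a line.
    T = [[0, 5, 9, 14],
         [5, 0, 4, 9],
         [9, 4, 0, 5],
         [14, 9, 5, 0]]
    # Hypothetical riders: id -> (origin zone, destination zone).
    riders = {"A": (0, 3), "B": (1, 2), "C": (3, 0)}
    print(best_pair(T, riders))
```

Because each rider is represented only by a zone index, the service provider in this sketch never sees a precise location, which mirrors the zone-cloaking idea in the abstract.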

Jonas Klingwort ◽  
Sofie Myriam Marcel Gabrielle De Broe ◽  
Sven Alexander Brocker

Introduction: To combat and mitigate the transmission of the SARS-CoV-2 virus, reducing the number of social contacts within a population is highly effective. Non-pharmaceutical policy interventions, e.g. stay-at-home orders, closing schools, universities, and (non-essential) businesses, are expected to decrease pedestrian flows in public areas, leading to reduced social contacts. The extent to which such interventions show the targeted effect is often measured retrospectively by surveying behavioural changes. Approaches that use data generated through mobile phones are hindered by data confidentiality and privacy regulations and complicated by selection effects. Furthermore, access to such sensitive data is limited. However, a complex pandemic situation requires a fast evaluation of the effectiveness of the introduced interventions aiming to reduce social contacts. Location-based sensor systems installed in cities, providing objective measurements of spatial mobility in the form of pedestrian flows, are suited for such a purpose. These devices record changes in a population's behaviour in real-time, do not have privacy problems as they do not identify persons, and have no selection problems due to ownership of a device. Objective: This work aimed to analyse location-based sensor measurements of pedestrian flows in 49 metropolitan areas at 100 locations in Germany to study whether such technology is suitable for the real-time assessment of behavioural changes during a phase of several different pandemic-related policy interventions. Methods: Spatial mobility data of pedestrian flows was linked with policy interventions using the date as a unique linkage key. Data was visualised to observe potential changes in pedestrian flows before or after interventions. Furthermore, differences in time series of pedestrian counts between the pandemic and the pre-pandemic year were analysed. Results: The sensors detected changes in mobility patterns even before policy interventions were enacted. Compared to the pre-pandemic year, pedestrian counts were 85% lower. Conclusions: The study illustrated the practical value of sensor-based real-time measurements when linked with non-pharmaceutical policy intervention data. This study's core contribution is that the sensors detected behavioural changes before enacting or loosening non-pharmaceutical policy interventions. Therefore, such technologies should be considered in the future by policymakers for crisis management and policy evaluation.
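The linkage described in the methods (joining pedestrian counts to interventions on the date, then comparing against the same calendar day in the previous year) can be sketched as below. All counts, dates and intervention labels are invented placeholders for illustration, not data from the study, and the function names are hypothetical.

```python
from datetime import date

def link_by_date(counts, interventions):
    """Join daily counts with interventions, using the date as the linkage key."""
    return [(d, c, interventions.get(d)) for d, c in sorted(counts.items())]

def relative_change(counts, baseline):
    """Relative change versus the same calendar day one year earlier.

    Days without a baseline value are skipped. Note that date.replace raises
    ValueError for 29 February, so leap days need separate handling.
    """
    changes = {}
    for d, c in counts.items():
        ref = baseline.get(d.replace(year=d.year - 1))
        if ref:
            changes[d] = (c - ref) / ref
    return changes

if __name__ == "__main__":
    # Hypothetical values for a single sensor location.
    counts_2020 = {date(2020, 3, 16): 1500, date(2020, 3, 23): 500}
    counts_2019 = {date(2019, 3, 16): 2000, date(2019, 3, 23): 2000}
    interventions = {date(2020, 3, 23): "contact restrictions"}  # placeholder label

    for row in link_by_date(counts_2020, interventions):
        print(row)
    print(relative_change(counts_2020, counts_2019))
```

Because the sensors only produce aggregate counts per day and location, this kind of join needs no personal identifiers.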
