Towards a Systematic Analysis of Privacy Definitions

Author(s):  
Bing-Rong Lin ◽  
Dan Kifer

In statistical privacy, a privacy definition is regarded as a set of algorithms that are allowed to process sensitive data. It is often helpful to consider the complementary view that privacy definitions are also contracts that guide the behavior of algorithms that take in sensitive data and produce sanitized data. Historically, data privacy breaches have been the result of fundamental misunderstandings about what a particular privacy definition guarantees. Privacy definitions are often analyzed using a highly targeted approach: a specific attack strategy is evaluated to determine if a specific type of information can be inferred. If the attack works, one can conclude that the privacy definition is too weak. If it doesn't work, one often gains little information about its security (perhaps a slightly different attack would have worked?). Furthermore, these strategies will not identify cases where a privacy definition protects unnecessary pieces of information. On the other hand, technical results concerning generalizable and systematic analyses of privacy are few in number, but such results have significantly advanced our understanding of the design of privacy definitions. We add to this literature with a novel methodology for analyzing the Bayesian properties of a privacy definition. Its goal is to identify precisely the type of information being protected, hence making it easier to identify (and later remove) unnecessary data protections. Using privacy building blocks (which we refer to as axioms), we turn questions about semantics into mathematical problems -- the construction of a consistent normal form and the subsequent construction of the row cone (which is a geometric object that encapsulates Bayesian guarantees provided by a privacy definition). We apply these ideas to study randomized response, FRAPP/PRAM, and several algorithms that add integer-valued noise to their inputs; we show that their privacy properties can be stated in terms of the protection of various notions of parity of a dataset. Randomized response, in particular, provides unnecessarily strong protections for parity, and so we also show how our methodology can be used to relax privacy definitions.
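For intuition about the kind of mechanism being analyzed, the following is a minimal sketch of classical Warner-style randomized response, not the paper's row-cone or consistent-normal-form machinery: each respondent reports their true bit with probability p and the flipped bit otherwise, and the analyst can still recover the population proportion while no single response is trustworthy.

```python
import random

def randomized_response(bit: int, p: float = 0.75) -> int:
    """Report the true bit with probability p, otherwise flip it."""
    return bit if random.random() < p else 1 - bit

def estimate_proportion(responses, p: float = 0.75) -> float:
    """Unbiased estimate of the true proportion t of 1s.
    E[observed] = t*p + (1-t)*(1-p) = t*(2p-1) + (1-p)."""
    observed = sum(responses) / len(responses)
    return (observed - (1 - p)) / (2 * p - 1)

# Example: 10,000 respondents, 30% of whom hold the sensitive attribute.
truth = [1 if random.random() < 0.3 else 0 for _ in range(10_000)]
noisy = [randomized_response(b) for b in truth]
print(estimate_proportion(noisy))  # close to 0.3; no single response is reliable
```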

2022 ◽  
Vol 22 (2) ◽  
pp. 1-21
Author(s):  
Syed Atif Moqurrab ◽  
Adeel Anjum ◽  
Abid Khan ◽  
Mansoor Ahmed ◽  
Awais Ahmad ◽  
...  

Due to the evolution of the Internet of Things, clinical data is growing exponentially and increasingly passes through smart technologies. The resulting big biomedical data is confidential, as it contains patients' personal information and findings. Big biomedical data is usually stored in the cloud, making it convenient to access and share. Data shared for research purposes helps reveal useful and previously unexposed insights. Unfortunately, sharing such sensitive data also leads to privacy threats. Clinical data is generally available in textual format (e.g., perception reports). Within the domain of natural language processing, many studies have been published to mitigate privacy breaches in textual clinical data. However, the current studies still have limitations and shortcomings that need to be addressed. In this article, a novel framework for textual medical data privacy, Deep-Confidentiality, is proposed. The framework improves Medical Entity Recognition (MER) using deep neural networks and sanitization compared to current state-of-the-art techniques. Moreover, a new, generic utility metric is proposed that overcomes the shortcomings of existing utility metrics and gives a true representation of sanitized documents relative to the originals. To check the framework's effectiveness, it is evaluated on the i2b2-2010 NLP challenge dataset, which is considered one of the most complex medical datasets for MER. The proposed framework improves MER by 7.8% in recall, 7% in precision, and 3.8% in F1-score compared to existing deep learning models, and improves the data utility of sanitized documents by up to 13.79% when the value of k is 3.
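As a rough illustration of the sanitization step only: recognize_entities below is a toy dictionary matcher standing in for the deep MER model, and the bracketed labels are illustrative, not the paper's exact pipeline or utility metric.

```python
# Toy sketch: entity recognition followed by sanitization of a clinical note.
def recognize_entities(text, vocabulary):
    """Stand-in for a deep MER model: dictionary lookup of known terms.
    Returns (start, end, label) character spans."""
    spans = []
    for term, label in vocabulary.items():
        idx = text.find(term)
        if idx >= 0:
            spans.append((idx, idx + len(term), label))
    return sorted(spans)

def sanitize(text, spans):
    """Replace each recognized medical entity with its generalized label."""
    out, last = [], 0
    for start, end, label in spans:
        out.append(text[last:start])
        out.append(f"[{label}]")    # e.g. "[TREATMENT]" instead of a drug name
        last = end
    out.append(text[last:])
    return "".join(out)

vocab = {"metformin": "TREATMENT", "type 2 diabetes": "PROBLEM"}
note = "Patient with type 2 diabetes was started on metformin."
print(sanitize(note, recognize_entities(note, vocab)))
# Patient with [PROBLEM] was started on [TREATMENT].
```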


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Qi Dou ◽  
Tiffany Y. So ◽  
Meirui Jiang ◽  
Quande Liu ◽  
Varut Vardhanabhuti ◽  
...  

Data privacy mechanisms are essential for rapidly scaling medical training databases to capture the heterogeneity of patient data distributions toward robust and generalizable machine learning systems. In the current COVID-19 pandemic, a major focus of artificial intelligence (AI) is interpreting chest CT, which can be readily used in the assessment and management of the disease. This paper demonstrates the feasibility of a federated learning method for detecting COVID-19 related CT abnormalities, with external validation on patients from a multinational study. We recruited 132 patients from seven centers in multiple countries, with three internal hospitals from Hong Kong for training and testing, and four external, independent datasets from Mainland China and Germany for validating model generalizability. We also conducted case studies on longitudinal scans for automated estimation of lesion burden in hospitalized COVID-19 patients. We explore federated learning algorithms to develop a privacy-preserving AI model for COVID-19 medical image diagnosis with good generalization capability on unseen multinational datasets. Federated learning could provide an effective mechanism during pandemics to rapidly develop clinically useful AI across institutions and countries, overcoming the burden of centrally aggregating large amounts of sensitive data.
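A minimal sketch of the aggregation step in federated averaging (FedAvg), the canonical federated learning algorithm; the paper's exact training protocol may differ. Each hospital trains locally, and only model weights, never CT images, leave the institution.

```python
import numpy as np

def fed_avg(client_weights, client_sizes):
    """Server-side federated averaging: each client's model is weighted by
    the size of its local dataset. client_weights is a list (one entry per
    client) of lists of np.ndarray layers."""
    total = sum(client_sizes)
    averaged = [np.zeros_like(layer, dtype=float) for layer in client_weights[0]]
    for weights, n in zip(client_weights, client_sizes):
        for i, layer in enumerate(weights):
            averaged[i] += layer * (n / total)
    return averaged

# One communication round (local_train and hospitals are placeholders):
# client_weights = [local_train(global_weights, data_k) for data_k in hospitals]
# global_weights = fed_avg(client_weights, [len(data_k) for data_k in hospitals])
```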


2018 ◽  
Vol 2018 ◽  
pp. 1-10
Author(s):  
Hua Dai ◽  
Hui Ren ◽  
Zhiye Chen ◽  
Geng Yang ◽  
Xun Yi

Outsourcing data to clouds is being adopted by more and more companies and individuals due to the benefits of data sharing and parallel, elastic, on-demand computing. However, it forces data owners to lose control of their own data, which raises privacy-preserving problems for sensitive data. Sorting is a common operation in many areas, such as machine learning, service recommendation, and data query, and it is a challenge to implement privacy-preserving sorting over encrypted data without leaking the privacy of sensitive data. In this paper, we propose privacy-preserving sorting algorithms based on the logistic map. Secure comparable codes are constructed by logistic map functions and can be used to compare the corresponding encrypted data items even without knowing their plaintext values. Data owners first encrypt their data and generate the corresponding comparable codes, and then outsource them to clouds. Cloud servers can sort the outsourced encrypted data according to the corresponding comparable codes using the proposed privacy-preserving sorting algorithms. Security analysis and experimental results show that the proposed algorithms protect data privacy while providing efficient sorting over encrypted data.
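The comparable-code idea can be illustrated with a toy order-preserving encoding (a sketch under simplifying assumptions, not the authors' exact construction): the logistic map, seeded by a secret key, serves as a keyed pseudo-random source of strictly positive gaps, and the code of an integer is the cumulative sum of its first gaps, so codes compare exactly like plaintexts while the key-dependent spacing hides the values.

```python
def logistic_stream(seed: float, r: float = 3.99):
    """Chaotic logistic map x -> r*x*(1-x) on (0,1), used as a keyed PRNG."""
    x = seed
    while True:
        x = r * x * (1 - x)
        yield x

def comparable_code(value: int, secret_seed: float) -> float:
    """Monotone code: v1 < v2 implies code(v1) < code(v2) under the same seed."""
    stream = logistic_stream(secret_seed)
    code = 0.0
    for _ in range(value + 1):
        code += 0.01 + next(stream)   # strictly positive gap keeps the code monotone
    return code

seed = 0.3141592653          # the data owner's secret
codes = {v: comparable_code(v, seed) for v in [17, 3, 8]}
# The cloud sorts by code without learning the plaintexts themselves:
print(sorted(codes, key=codes.get))   # [3, 8, 17]
```

A real scheme must also handle repeated values and resist frequency analysis, which this toy ignores.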


2018 ◽  
Vol 2018 ◽  
pp. 1-7 ◽  
Author(s):  
Run Xie ◽  
Chanlian He ◽  
Dongqing Xie ◽  
Chongzhi Gao ◽  
Xiaojun Zhang

With the advent of cloud computing, data privacy has become one of the critical security issues and has attracted much attention as more and more mobile devices rely on cloud services. To protect data privacy, users usually encrypt their sensitive data before uploading it to cloud servers, which makes data utilization difficult. Ciphertext retrieval enables the utilization of encrypted data, and searchable public-key encryption is an effective way to construct encrypted data retrieval. However, previous related works have not paid much attention to the design of ciphertext retrieval schemes that are secure against inside keyword-guessing attacks (KGAs). In this paper, we first construct a new architecture to resist inside KGAs. Moreover, we present an efficient ciphertext retrieval instance with a designated tester (dCRKS) based on this architecture; the instance is secure under inside KGAs. Finally, security analysis and efficiency comparison show that the proposal is effective for the retrieval of encrypted data in cloud computing.
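To see why keyword guessing fails when the tester lacks the right key material, here is a deliberately simplified symmetric sketch (not the paper's public-key dCRKS construction): keyword tags and trapdoors are HMACs under a key the adversary never holds, so guessed keywords cannot be tested against observed tags.

```python
import hashlib, hmac, os

def keyword_tag(key: bytes, keyword: str) -> bytes:
    """Deterministic keyword tag; equal tags mean equal keywords."""
    return hmac.new(key, keyword.lower().encode(), hashlib.sha256).digest()

tag_key = os.urandom(32)   # shared by sender and receiver, never by the adversary

# Data owner attaches opaque tags to the encrypted document:
index = {keyword_tag(tag_key, w): "doc-42" for w in ("diabetes", "insulin")}

trapdoor = keyword_tag(tag_key, "insulin")   # receiver authorizes one search
print(index.get(trapdoor))                   # the tester finds: doc-42

# A keyword-guessing adversary without tag_key learns nothing from guesses:
print(keyword_tag(os.urandom(32), "insulin") in index)   # False
```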


2016 ◽  
Vol 13 (1) ◽  
pp. 204-211
Author(s):  
Baghdad Science Journal

The internet is a basic source of information for many specialities and uses, including sensitive data whose retrieval has long been one of its basic functions. To protect information from falling into the hands of an intruder, a VPN can be established; through a VPN, data privacy and security can be provided. Two main VPN technologies are discussed: IPSec and OpenVPN. The complexity of IPSec makes OpenVPN the better choice, owing to the latter's portability and flexibility across many operating systems. In the LAN, a VPN can be implemented through OpenVPN to establish a double privacy layer (privacy inside privacy). A specific subnet is used in this paper. The key and certificate are generated by the server, and authentication and key exchange are based on the standard SSL/TLS protocol. Various operating systems, both open source and Windows, are used, each with a different hardware specification. Tools such as tcpdump and jperf are used to verify and measure connectivity and performance. The choice of OpenVPN in the LAN rests on the type of operating system, portability, and straightforward implementation. The bandwidth captured in this experiment is influenced by the operating system rather than by the memory or hard disk capacity. The relationship and interoperability between each peer and the server are discussed, and privacy for the user in the LAN can be introduced with a minimal specification.
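For concreteness, a typical OpenVPN server configuration for the kind of LAN deployment described might look like the following; the port, subnet, and file names are common defaults and placeholders, not the paper's exact values.

```
port 1194
proto udp
dev tun
ca   ca.crt            # certificate authority generated on the server
cert server.crt        # server certificate
key  server.key        # server private key
dh   dh2048.pem        # Diffie-Hellman parameters for the TLS handshake
server 10.8.0.0 255.255.255.0   # private VPN subnet handed out to LAN peers
keepalive 10 120
persist-key
persist-tun
```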


1986 ◽  
Vol 21 (1) ◽  
pp. 5-14
Author(s):  
Benjamin G. Walker

The protection of data in computer-based systems is a serious and growing problem, and one of the most challenging technical problems in the field of computer science today. The objective of this paper is to provide a technical overview of the problem and to suggest some steps that need to be taken to assure progress in the field toward cost-effective systems that provide adequate protection.

The Problem: Protecting the privacy of data in computer systems involves establishing safeguards against accidental disclosure as well as protection against deliberate attack. During system failures and restart procedures, errors in coding procedures often cause data to be stored in the wrong files or send sensitive data out to the printer along with diagnostic information intended for maintenance personnel. You have probably had the experience at some time of being wired into someone else's telephone conversation.


Author(s):  
Eva Hudlicka

Computational affective models are being developed both to elucidate affective mechanisms and to enhance the believability of synthetic agents and robots. Yet in spite of the rapid growth of computational affective modeling, no systematic guidelines exist for model design and analysis. The lack of systematic guidelines contributes to ad hoc design practices, hinders model sharing and re-use, and makes systematic comparison of existing models and theories challenging. The lack of a common computational terminology also hinders the cross-disciplinary communication that is essential to advance our understanding of emotions. In this chapter, the author proposes a computational analytical framework to provide a basis for systematizing affective model design by: (1) viewing emotion models in terms of two core types, emotion generation and emotion effects, and (2) identifying the generic computational tasks necessary to implement these processes. The chapter then discusses how these computational 'building blocks' can support the development of design guidelines and a systematic analysis of distinct emotion theories and alternative means of their implementation.


Sensors ◽  
2020 ◽  
Vol 20 (15) ◽  
pp. 4110
Author(s):  
Matei-Sorin Axente ◽  
Ciprian Dobre ◽  
Radu-Ioan Ciobanu ◽  
Raluca Purnichescu-Purtan

With the rate at which smartphones are currently evolving, more and more of human life will be contained in these devices. At a time when data privacy is extremely important, it is crucial to protect one's mobile device. In this paper, we propose a new non-intrusive gait-recognition-based mechanism that can enhance the security of smartphones by rapidly identifying users with a high degree of confidence and securing sensitive data in case of an attack, with a focus on a potential architecture for such an algorithm in the Android environment. The motion sensors on an Android device are used to create a statistical model of a user's gait, which is later used for identification. Through experimental testing, we demonstrate the capability of our proposed solution by correctly classifying individuals with an accuracy upwards of 90% when tested on data recorded during multiple activities. The experiments, conducted at a low sampling rate and over short time intervals, show the benefits of our solution and highlight the feasibility of an efficient gait recognition mechanism on modern smartphones.
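A minimal sketch of what such a statistical gait model could look like; the feature set and distance threshold here are illustrative assumptions, not the paper's exact model.

```python
import numpy as np

def gait_features(accel: np.ndarray) -> np.ndarray:
    """Statistical descriptor of one gait window.
    accel: (n_samples, 3) readings from the phone's accelerometer."""
    magnitude = np.linalg.norm(accel, axis=1)   # orientation-independent signal
    return np.array([
        magnitude.mean(),
        magnitude.std(),
        np.percentile(magnitude, 25),
        np.percentile(magnitude, 75),
        np.abs(np.diff(magnitude)).mean(),      # rough step-to-step variability
    ])

def matches_owner(window: np.ndarray, owner_profile: np.ndarray,
                  threshold: float = 2.0) -> bool:
    """Accept the user if the window's features lie close to the enrolled
    profile; otherwise lock down sensitive data."""
    return bool(np.linalg.norm(gait_features(window) - owner_profile) < threshold)
```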


2014 ◽  
Vol 25 (3) ◽  
pp. 48-71 ◽  
Author(s):  
Stepan Kozak ◽  
David Novak ◽  
Pavel Zezula

The general trend in data management is to outsource data to third-party systems that provide data retrieval as a service. This approach naturally raises privacy concerns about the (potentially sensitive) data. Recently, quite extensive research has been done on privacy-preserving outsourcing of traditional exact-match and keyword search. However, not much attention has been paid to outsourcing of similarity search, which is essential to content-based retrieval in current multimedia, sensor, or scientific data. In this paper, the authors propose a scheme for outsourcing similarity search. They define evaluation criteria for these systems with an emphasis on usability, privacy, and efficiency in real applications; these criteria can be used as a general guideline for practical system analysis, and the authors use them to survey and mutually compare existing approaches. As the main result, the authors propose a novel dynamic similarity index, EM-Index, which works for an arbitrary metric space and ensures data privacy, and is thus suitable for search systems outsourced, for example, to a cloud environment. In comparison with other approaches, the index is fully dynamic (update operations are efficient), and its aim is to transfer as much load as possible from clients to the server.
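The EM-Index internals are beyond the scope of an abstract, but the pivot-based filtering that metric-space indexes generally rely on can be sketched as follows: by the triangle inequality, |d(q,p) - d(o,p)| lower-bounds d(q,o), so distances to pivots precomputed at insertion time let the server discard most objects without evaluating the (expensive) metric.

```python
def range_search(query, radius, objects, pivots, pivot_dist, d):
    """pivot_dist[o][i] = d(o, pivots[i]), precomputed when o was inserted."""
    q_dist = [d(query, p) for p in pivots]
    results = []
    for o in objects:
        # Best (largest) triangle-inequality lower bound on d(query, o):
        lower = max(abs(qd - od) for qd, od in zip(q_dist, pivot_dist[o]))
        if lower <= radius and d(query, o) <= radius:   # filter, then verify
            results.append(o)
    return results

# Toy metric space: integers with absolute difference as the distance.
d = lambda a, b: abs(a - b)
objs, pivots = [1, 5, 9, 14], [0, 10]
pdist = {o: [d(o, p) for p in pivots] for o in objs}
print(range_search(7, 3, objs, pivots, pdist, d))   # [5, 9]
```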


Author(s):  
Divya Asok ◽  
Chitra P. ◽  
Bharathiraja Muthurajan

In past years, the use of the internet and the quantity of digital data generated by large organizations, firms, and governments have paved the way for researchers to focus on the security of private data. The collected data is usually tied to a definite need. For example, in the medical field, health record systems are used for the exchange of medical data. In addition to services based on users' current location, many potential services rely on users' location history or their spatial-temporal provenance. However, most of the collected data contains individually identifying information, which is sensitive. With machine learning applications reaching into every corner of society, machine learning could contribute significantly to preserving the privacy of both individuals and institutions. This chapter gives a wide perspective on the current literature on privacy-preserving machine learning and deep learning techniques, along with the non-cryptographic differential privacy approach for ensuring sensitive data privacy.
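As one concrete instance of the non-cryptographic differential privacy approach mentioned above, here is a minimal sketch of the Laplace mechanism for a counting query, which has sensitivity 1 because adding or removing one person changes the count by at most 1.

```python
import numpy as np

def laplace_count(true_count: int, epsilon: float) -> float:
    """epsilon-differentially private release of a count: Laplace noise with
    scale sensitivity/epsilon = 1/epsilon masks any individual's presence."""
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

# Example: privately release how many records in a cohort have a condition.
print(laplace_count(342, epsilon=0.5))   # roughly 342, +/- a few
```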

