CGM

2021 ◽  
Vol 14 (11) ◽  
pp. 2258-2270
Author(s):  
Ergute Bao ◽  
Yin Yang ◽  
Xiaokui Xiao ◽  
Bolin Ding

Local differential privacy (LDP) is a well-established privacy protection scheme for collecting sensitive data, which has been integrated into major platforms such as iOS, Chrome, and Windows. The main idea is that each individual randomly perturbs her data on her local device, and only uploads the noisy version to an untrusted data aggregator. This paper focuses on the collection of streaming data consisting of regular updates, e.g., daily app usage. Such streams, when aggregated over a large population, often exhibit strong autocorrelations; e.g., the average usage of an app usually does not change dramatically from one day to the next. To our knowledge, this property has been largely neglected in existing LDP mechanisms. Consequently, data collected with current LDP methods often exhibit unrealistically violent fluctuations due to the added noise, drowning out the overall trend, as shown in our experiments. This paper proposes a novel correlated Gaussian mechanism (CGM) for enforcing (ϵ, δ)-LDP on streaming data collection, which reduces noise by exploiting publicly known autocorrelation patterns of the aggregated data. This is done through non-trivial modifications to the core of the underlying Gaussian mechanism; in particular, CGM injects temporally correlated noise, computed through an optimization program that takes into account the given autocorrelation pattern, data value range, and utility metric. CGM comes with a formal proof of correctness and consumes negligible computational resources. Extensive experiments using real datasets from different application domains demonstrate that CGM achieves consistent and significant utility gains compared to the baseline method of repeatedly running the underlying one-shot LDP mechanism.
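As a rough, hedged illustration of the idea (not the paper's algorithm), the sketch below contrasts the one-shot Gaussian mechanism with a simplified correlated-noise release in which each step's perturbation is an AR(1) mixture of the previous step's noise and a fresh draw. The correlation parameter `rho` is a hypothetical stand-in for the structure CGM derives from its optimization program, and the privacy accounting of the correlated variant is only schematic here.

```python
import numpy as np

def gaussian_mechanism(value, sensitivity, epsilon, delta, rng):
    """One-shot Gaussian mechanism: noise calibrated to (eps, delta)-DP
    via the classic sigma = S * sqrt(2 ln(1.25/delta)) / eps bound."""
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return value + rng.normal(0.0, sigma)

def correlated_stream_release(stream, sensitivity, epsilon, delta,
                              rho=0.8, rng=None):
    """Simplified correlated-noise release: each step's noise is an
    AR(1) mixture of the previous step's noise and a fresh draw, so the
    per-step marginal matches the one-shot mechanism while successive
    errors move together. CGM instead derives the correlation structure
    from an optimization program over the autocorrelation pattern, value
    range, and utility metric, with matching privacy accounting; this
    sketch does not reproduce that accounting."""
    rng = rng or np.random.default_rng()
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    noisy, prev = [], None
    for x in stream:
        fresh = rng.normal(0.0, sigma)
        prev = fresh if prev is None else rho * prev + np.sqrt(1 - rho**2) * fresh
        noisy.append(x + prev)
    return noisy
```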

Author(s):  
Scott A. McDonald ◽  
Fuminari Miura ◽  
Eric R. A. Vos ◽  
Michiel van Boven ◽  
Hester E. de Melker ◽  
...  

Background: The proportion of SARS-CoV-2-positive persons who are asymptomatic, and whether this proportion is age-dependent, are still open research questions. Because an unknown proportion of reported symptoms among SARS-CoV-2 positives will be attributable to another infection or affliction, the observed, or 'crude', proportion without symptoms may underestimate the proportion of persons without symptoms caused by SARS-CoV-2 infection. Methods: Based on two rounds of a large population-based serological study comprising test results on seropositivity and self-reported symptom history conducted in April/May and June/July 2020 in the Netherlands (n = 7517), we estimated the proportion of reported symptoms among those persons infected with SARS-CoV-2 that is attributable to this infection, where the set of relevant symptoms fulfills the ECDC case definition of COVID-19, using inferential methods for the attributable risk (AR). Generalised additive regression modelling was used to estimate the age-dependent relative risk (RR) of reported symptoms, and the AR and asymptomatic proportion (AP) were calculated from the fitted RR. Results: Using age-aggregated data, the 'crude' AP was 37%, but the model-estimated AP was 65% (95% CI 63–68%). The estimated AP varied with age, from 74% (95% CI 65–90%) for those aged <20 years, to 61% (95% CI 57–65%) for the 50–59 years age group. Conclusion: Whereas the 'crude' AP represents a lower bound for the proportion of persons infected with SARS-CoV-2 without COVID-19 symptoms, the AP as estimated via an attributable risk approach represents an upper bound. Age-specific AP estimates can inform the implementation of public health actions such as targeted virological testing and therefore enhance containment strategies.
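A minimal sketch of the attributable-risk arithmetic, assuming the standard attributable fraction among the exposed, AF = (RR - 1)/RR. The RR value used below is a hypothetical back-calculation that reproduces the reported figures, not a number from the paper, which fits an age-dependent RR with generalised additive models.

```python
def attributable_fraction(rr):
    """Attributable fraction among the exposed: the share of reported
    symptoms in seropositives attributable to the infection itself."""
    return (rr - 1.0) / rr

def estimated_ap(crude_symptomatic, rr):
    """Model-based asymptomatic proportion: one minus the crude
    symptomatic proportion scaled down by the attributable fraction."""
    return 1.0 - crude_symptomatic * attributable_fraction(rr)

# Illustrative back-calculation only: with a crude symptomatic
# proportion of 0.63 (crude AP = 37%), an RR of about 2.25 reproduces
# the reported model-estimated AP of roughly 65%.
print(round(estimated_ap(0.63, 2.25), 2))  # 0.65
```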


2021 ◽  
Vol 21 (2) ◽  
pp. 1-22
Author(s):  
Abhinav Kumar ◽  
Sanjay Kumar Singh ◽  
K Lakshmanan ◽  
Sonal Saxena ◽  
Sameer Shrivastava

The advancements in the Internet of Things (IoT) and cloud services have enabled the availability of smart e-healthcare services in a distant and distributed environment. However, this has also raised major privacy and efficiency concerns that need to be addressed. Privacy is a major challenge when sharing clinical data, which often contains sensitive patient-related information, across the cloud. Adequate protection of patients' privacy helps to increase public trust in medical research. Additionally, deep learning (DL)-based models are complex, and in a cloud-based approach, efficient data processing in such models is complicated. To address these challenges, we propose an efficient and secure cancer diagnostic framework for histopathological image classification by utilizing both differential privacy and secure multi-party computation. For efficient computation, instead of performing the whole operation on the cloud, we decouple the layers into two modules: one for feature extraction using the VGGNet module at the user side, and the remaining layers for private prediction over the cloud. The efficacy of the framework is validated on two datasets composed of histopathological images of the canine mammary tumor and human breast cancer. Applying differential privacy to the proposed model makes it secure and capable of preserving the privacy of sensitive data from any adversary, without significantly compromising model accuracy. Extensive experiments show that the proposed model efficiently achieves the trade-off between privacy and model performance.
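A minimal sketch of the decoupling idea, assuming a torchvision VGG16 and an arbitrary split point; the split, the noise scale, and the names are illustrative, not the paper's exact design.

```python
import torch
import torchvision.models as models

# Split a VGG16 network: convolutional feature extraction stays on the
# user's device; only the extracted (optionally noised) features travel
# to the cloud, where the remaining classifier layers run.
vgg = models.vgg16(weights=None)  # pretrained weights omitted for brevity
vgg.eval()

device_module = torch.nn.Sequential(  # runs on the user's device
    vgg.features,
    vgg.avgpool,
    torch.nn.Flatten(),
)
cloud_module = vgg.classifier         # runs as the cloud-side service

image = torch.randn(1, 3, 224, 224)   # stand-in for a histopathology tile
with torch.no_grad():
    features = device_module(image)               # local feature extraction
    features += torch.randn_like(features) * 0.1  # illustrative DP-style noise
    logits = cloud_module(features)               # private prediction in the cloud
```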


2021 ◽  
Author(s):  
Jude TCHAYE-KONDI ◽  
Yanlong Zhai ◽  
Liehuang Zhu

We address privacy and latency issues in the edge/cloud computing environment while training a centralized AI model. In our particular case, the edge devices are the only data source for the model trained on the central server. Current solutions for preserving privacy and reducing network latency rely on a pre-trained feature extractor deployed on the devices to extract only important features from the sensitive dataset. However, finding a pre-trained model or a public dataset with which to build a feature extractor for certain tasks can be very challenging. With the large amount of data generated by edge devices, the edge environment does not really lack data, but improper access to it may lead to privacy concerns. In this paper, we present DeepGuess, a new privacy-preserving and latency-aware deep learning framework. DeepGuess uses a new learning mechanism enabled by the AutoEncoder (AE) architecture, called inductive learning, which makes it possible to train a central neural network using the data produced by end devices while preserving their privacy. With inductive learning, sensitive data remains on devices and is not explicitly involved in any backpropagation process. The AE's encoder is deployed on devices to extract and transfer important features to the server. To enhance privacy, we propose a new locally differentially private algorithm that allows the edge devices to apply random noise to the features extracted from their sensitive data before they are transferred to an untrusted server. The experimental evaluation of DeepGuess demonstrates its effectiveness and ability to converge over a series of experiments.
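A hedged sketch of the device-side perturbation step: clip the encoder's output to bound sensitivity, then add Laplace noise locally before upload. The clipping norm, calibration, and function name are assumptions for illustration; DeepGuess's actual local-DP algorithm may differ.

```python
import numpy as np

def ldp_perturb_features(features, epsilon, clip=1.0, rng=None):
    """Illustrative device-side step in the spirit of DeepGuess: clip
    the encoder's feature vector to bound its L1 sensitivity, then add
    Laplace noise before uploading to the untrusted server."""
    rng = rng or np.random.default_rng()
    norm = np.abs(features).sum()
    if norm > clip:                        # bound each user's contribution
        features = features * (clip / norm)
    scale = 2.0 * clip / epsilon           # replacing one user's vector
                                           # changes the L1 norm by <= 2*clip
    return features + rng.laplace(0.0, scale, size=features.shape)
```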


2014 ◽  
Vol 8 (2) ◽  
pp. 13-24 ◽  
Author(s):  
Arkadiusz Liber

Introduction: Medical documentation ought to be accessible with the preservation of its integrity as well as the protection of personal data. One of the manners of its protection against disclosure is anonymization. Contemporary methods ensure anonymity without the possibility of sensitive data access control. It seems that the future of sensitive data processing systems belongs to the personalized method. In the first part of the paper the k-Anonymity, (X,Y)-Anonymity, (α,k)-Anonymity, and (k,e)-Anonymity methods were discussed. These methods belong to the well-known elementary methods which are the subject of a significant number of publications. As the source papers for this part, the works of Samarati, Sweeney, Wang, Wong and Zhang were credited. The selection of these publications is justified by the wider research review work led, for instance, by Fung, Wang, Fu and Yu. However, it should be noted that the methods of anonymization derive from the methods of statistical database protection of the 1970s. Due to their interrelated content and literature references, the first and the second part of this article constitute an integral whole. Aim of the study: The analysis of the methods of anonymization, the analysis of the methods of protection of anonymized data, and the study of a new type of privacy protection enabling the entity whom the data concerns to control the disclosure of sensitive data. Material and methods: Analytical methods, algebraic methods. Results: Material supporting the choice and analysis of ways of anonymizing medical data, and a new privacy protection solution enabling the control of sensitive data by the entities whom this data concerns. Conclusions: In the paper, an analysis of solutions for data anonymization to ensure privacy protection in medical data sets was conducted. The methods of k-Anonymity, (X,Y)-Anonymity, (α,k)-Anonymity, (k,e)-Anonymity, (X,Y)-Privacy, LKC-Privacy, l-Diversity, (X,Y)-Linkability, t-Closeness, Confidence Bounding and Personalized Privacy were described, explained and analyzed. An analysis of solutions for controlling sensitive data by their owner was also conducted. Apart from the existing methods of anonymization, an analysis of methods for protecting anonymized data was included; in particular, the methods of δ-Presence, ε-Differential Privacy, (d,γ)-Privacy, (α,β)-Distributing Privacy and protection against (c,t)-isolation were analyzed. Moreover, the author introduced a new solution for the controlled protection of privacy. The solution is based on marking a protected field and the multi-key encryption of the sensitive value. The suggested way of marking fields conforms to the XML standard. For the encryption, an (n,p) different-keys cipher was selected: to decipher the content, p of the n keys are used. The proposed solution makes it possible to apply brand-new methods to control the privacy of disclosed sensitive data.
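One standard way to realise a "p keys of n" decryption requirement is Shamir's threshold secret sharing over a prime field; the sketch below is offered as a concrete instance of the idea, not as the author's construction.

```python
import random

PRIME = 2**127 - 1  # Mersenne prime, large enough for short secrets

def split_secret(secret, n, p, rng=random.SystemRandom()):
    """Split `secret` into n shares, any p of which reconstruct it;
    fewer than p shares reveal nothing about the secret."""
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(p - 1)]
    evaluate = lambda x: sum(c * pow(x, k, PRIME)
                             for k, c in enumerate(coeffs)) % PRIME
    return [(x, evaluate(x)) for x in range(1, n + 1)]

def recover_secret(shares):
    """Lagrange-interpolate the sharing polynomial at x = 0."""
    total = 0
    for xi, yi in shares:
        num = den = 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        total = (total + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return total

shares = split_secret(123456789, n=5, p=3)
assert recover_secret(shares[:3]) == 123456789  # any 3 of the 5 suffice
```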


2019 ◽  
Vol 1 (1) ◽  
pp. 483-491 ◽  
Author(s):  
Makhamisa Senekane

The ubiquity of data, including multimedia data such as images, enables easy mining and analysis of such data. However, such analysis might involve the use of sensitive data such as medical records (including radiological images) and financial records. Privacy-preserving machine learning is an approach aimed at analyzing such data in a way that does not compromise privacy. There are various privacy-preserving data analysis approaches, such as k-anonymity, l-diversity, t-closeness and Differential Privacy (DP). Currently, DP is the gold standard of privacy-preserving data analysis due to its robustness against background-knowledge attacks. In this paper, we report a scheme for privacy-preserving image classification using the Support Vector Machine (SVM) and DP. SVM is chosen as the classification algorithm because, unlike variants of artificial neural networks, it converges to a global optimum. The SVM kernels used are linear and Radial Basis Function (RBF), while ϵ-differential privacy was the DP framework used. The proposed scheme achieved an accuracy of up to 98%. The results obtained underline the utility of using SVM and DP for privacy-preserving image classification.
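As one concrete, hedged reading of ϵ-DP SVM training, the sketch below applies output perturbation in the spirit of Chaudhuri et al.: fit a linear SVM, then add a random vector with Gamma-distributed L2 norm to the weights. The mapping of sklearn's C to a regulariser and the unit-norm data assumption are assumptions; the abstract does not specify the paper's actual mechanism.

```python
import numpy as np
from sklearn.svm import LinearSVC

def dp_linear_svm(X, y, epsilon, C=1.0, rng=None):
    """Output-perturbation sketch of epsilon-DP linear SVM training.
    Assumes rows of X have L2 norm <= 1."""
    rng = rng or np.random.default_rng()
    n, d = X.shape
    clf = LinearSVC(C=C).fit(X, y)
    lam = 1.0 / (n * C)               # sklearn's C <-> regulariser lambda
    beta = 2.0 / (n * lam * epsilon)  # L2 sensitivity / epsilon
    direction = rng.normal(size=d)
    direction /= np.linalg.norm(direction)
    clf.coef_ = (clf.coef_.ravel()
                 + direction * rng.gamma(shape=d, scale=beta)).reshape(1, -1)
    return clf
```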


Author(s):  
Divya Asok ◽  
Chitra P. ◽  
Bharathiraja Muthurajan

In recent years, the usage of the internet and the quantity of digital data generated by large organizations, firms, and governments have paved the way for researchers to focus on the security of private data. Such data is usually collected for a definite need. For example, in the medical field, health record systems are used for the exchange of medical data. In addition to services based on users' current location, many potential services rely on users' location history or their spatio-temporal provenance. However, most of the collected data contains sensitive information identifying individuals. With machine learning applications reaching every corner of society, privacy-preserving techniques could significantly contribute to protecting the privacy of both individuals and institutions. This chapter gives a wider perspective on the current literature on privacy-preserving machine learning and deep learning techniques, along with the non-cryptographic differential privacy approach for ensuring the privacy of sensitive data.
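The non-cryptographic differential privacy approach the chapter refers to is typically instantiated by the Laplace mechanism; a minimal sketch:

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """The canonical non-cryptographic DP primitive: release the query
    answer plus Laplace noise with scale sensitivity/epsilon."""
    rng = rng or np.random.default_rng()
    return true_value + rng.laplace(0.0, sensitivity / epsilon)

# e.g. a counting query (sensitivity 1) released under epsilon = 0.5
noisy_count = laplace_mechanism(true_value=1024, sensitivity=1.0, epsilon=0.5)
```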


2019 ◽  
Vol 3 (Supplement_1) ◽  
pp. S621-S621
Author(s):  
Sudha Seshadri

Abstract Apolipoprotein E is a glycoprotein mediator and regulator of lipid transport and uptake. The APOE-ε4 allele has been associated with higher risk of Alzheimer's disease and of mortality, but the effect of the less prevalent APOE-ε2 allele on survival remains elusive. We aggregated data on 38,537 individuals of European ancestry (mean age 65.5 years; 55.6% women) from six large population-based cohorts to determine the association of APOE-ε2 with survival in the general population. During a mean follow-up of 11.7 years, 17,021 individuals died. Compared with homozygous APOE-ε3 carriers, APOE-ε2 carriers were at lower risk of death (hazard ratio, 95% confidence interval: 0.94, 0.90–0.99; P=1.1×10⁻²), whereas APOE-ε4 carriers were at increased risk (HR 1.17, 1.12–1.21; P=2.8×10⁻¹⁶). Risk was lowest for homozygous APOE-ε2 (HR 0.89, 0.74–1.08) and highest for homozygous APOE-ε4 (HR 1.52, 1.37–1.70). Results did not differ by sex. The association was unaltered after adjustment for baseline LDL or cardiovascular disease. Larger, multiethnic collaborations are ongoing.

