A Sampling-Based Method for Highly Efficient Privacy-Preserving Data Publication

2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Guoming Lu ◽  
Xu Zheng ◽  
Jingyuan Duan ◽  
Ling Tian ◽  
Xia Wang

The data publication from multiple contributors has long been considered a fundamental task for data processing in various domains. It has been treated as one prominent prerequisite for enabling AI techniques in wireless networks. With the emergence of diversified smart devices and applications, data held by individuals become more pervasive and nontrivial to publish. First, the data are more private and sensitive, as they cover every aspect of daily life, from income data to fitness data. Second, publishing such data is also bandwidth-consuming, as they are likely to be stored on mobile devices. Local differential privacy has been considered a novel paradigm for such distributed data publication. However, existing works mostly require encoding contents into a vector space for publication, which is still costly in network resources. Therefore, this work proposes a novel framework for highly efficient privacy-preserving data publication. Specifically, two sampling-based algorithms are proposed for histogram publication, an important statistic for data analysis. The first algorithm applies a bit-level sampling strategy to both reduce the overall bandwidth and balance the cost among contributors. The second algorithm allows consumers to adjust their focus on different intervals and can properly allocate the sampling ratios to optimize the overall performance. Both the analysis and the validation on real-world data traces demonstrate the advantages of our work.
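As a hedged sketch of the bit-level sampling idea (not the paper's exact algorithm; the one-hot encoding, flip probability, and estimator below are illustrative assumptions): each contributor one-hot encodes its histogram bin, samples a single bit position, perturbs that bit with randomized response, and sends one (position, bit) pair instead of the full vector.

```python
import math
import random
from collections import Counter

def perturb_bit(bit, eps):
    """Randomized response: keep the bit with probability e^eps / (e^eps + 1)."""
    p_keep = math.exp(eps) / (math.exp(eps) + 1.0)
    return bit if random.random() < p_keep else 1 - bit

def contribute(bin_index, k, eps):
    """One contributor: one-hot encode the bin, sample ONE bit position,
    perturb it, and report a single (position, bit) pair."""
    j = random.randrange(k)              # sampled bit position
    bit = 1 if j == bin_index else 0     # one-hot encoding of the bin
    return j, perturb_bit(bit, eps)

def estimate_histogram(reports, k, eps, n):
    """Aggregator: debias randomized response per position, then rescale
    from the sampled sub-population to all n contributors."""
    p = math.exp(eps) / (math.exp(eps) + 1.0)
    ones, seen = Counter(), Counter()
    for j, b in reports:
        seen[j] += 1
        ones[j] += b
    est = []
    for j in range(k):
        if seen[j] == 0:
            est.append(0.0)
            continue
        frac = ones[j] / seen[j]                        # observed 1-rate at position j
        est.append(n * (frac - (1 - p)) / (2 * p - 1))  # unbiased debiasing
    return est
```

A full one-hot report costs k bits per contributor; the sampled report costs roughly log2(k) + 1 bits, and the uniform sampling also spreads the reporting cost evenly across bit positions, matching the bandwidth-balancing goal described above.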

2021 ◽  
Vol 2021 (1) ◽  
pp. 64-84
Author(s):  
Ashish Dandekar ◽  
Debabrota Basu ◽  
Stéphane Bressan

The calibration of noise for a privacy-preserving mechanism depends on the sensitivity of the query and the prescribed privacy level. A data steward must make the non-trivial choice of a privacy level that balances the requirements of users and the monetary constraints of the business entity.

Firstly, we analyse the roles of the sources of randomness, namely the explicit randomness induced by the noise distribution and the implicit randomness induced by the data-generation distribution, that are involved in the design of a privacy-preserving mechanism. This finer analysis enables us to provide stronger privacy guarantees with quantifiable risks. Thus, we propose privacy at risk, a probabilistic calibration of privacy-preserving mechanisms. We provide a composition theorem that leverages privacy at risk. We instantiate the probabilistic calibration for the Laplace mechanism by providing analytical results.

Secondly, we propose a cost model that bridges the gap between the privacy level and the compensation budget estimated by a GDPR-compliant business entity. The convexity of the proposed cost model leads to a unique fine-tuning of the privacy level that minimises the compensation budget. We show its effectiveness by illustrating a realistic scenario that avoids overestimation of the compensation budget by using privacy at risk for the Laplace mechanism. We quantitatively show that composition using the cost-optimal privacy at risk provides a stronger privacy guarantee than the classical advanced composition. Although the illustration is specific to the chosen cost model, it naturally extends to any convex cost model. We also provide realistic illustrations of how a data steward uses privacy at risk to balance the trade-off between utility and privacy.
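The calibration the abstract refers to can be seen concretely in the standard Laplace mechanism, sketched below (the generic textbook form, not the paper's privacy-at-risk variant): the noise scale is sensitivity/ε, so lowering the privacy level ε forces proportionally more noise, which is what the data steward must price against the compensation budget.

```python
import random
import statistics

def laplace_mechanism(true_value, sensitivity, eps):
    """Release true_value + Laplace(0, sensitivity/eps) noise.
    The difference of two i.i.d. exponentials with mean `scale`
    is exactly Laplace-distributed with that scale."""
    scale = sensitivity / eps
    noise = random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)
    return true_value + noise
```

With sensitivity 1, moving ε from 2 down to 0.5 quadruples the noise scale; that direct proportionality is the quantitative core of the utility-versus-privacy trade-off discussed above.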


2015 ◽  
Vol 2015 ◽  
pp. 1-8 ◽  
Author(s):  
Katsuhiro Honda ◽  
Toshiya Oda ◽  
Daiji Tanaka ◽  
Akira Notsu

In many real-world data analysis tasks, it is expected that we can obtain much more useful knowledge by utilizing multiple databases stored in different organizations, such as cooperation groups, state organs, and allied countries. However, many such organizations hesitate to publish their databases because of privacy and security issues, even though they recognize the advantages of collaborative analysis. This paper proposes a novel collaborative framework for utilizing vertically partitioned co-occurrence matrices in fuzzy co-cluster structure estimation, in which co-occurrence information among objects and items is stored separately at several sites. In order to utilize such distributed data sets without fear of information leaks, a privacy-preserving procedure is introduced into fuzzy clustering for categorical multivariate data (FCCM). Each element of the co-occurrence matrices is withheld; only object memberships are shared by the multiple sites, and their (implicit) joint co-cluster structures are revealed through an iterative clustering process. Several experimental results demonstrate that collaborative analysis can contribute to revealing the global intrinsic co-cluster structures of separate matrices better than individual site-wise analysis. The novel framework makes it possible for many private and public organizations to share common data-structural knowledge without fear of information leaks.


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Xu Zheng ◽  
Ke Yan ◽  
Jingyuan Duan ◽  
Wenyi Tang ◽  
Ling Tian

Local differential privacy has been considered the standard measurement for privacy preservation in distributed data collection. Corresponding mechanisms have been designed for multiple types of tasks, such as frequency estimation for categorical values and mean value estimation for numerical values. However, the histogram publication of numerical values, which contains abundant and crucial clues about the whole dataset, has not been thoroughly considered under this measurement. Simply encoding data into different intervals upon each query would soon exhaust the bandwidth and the privacy budgets, which is infeasible in real scenarios. Therefore, this paper proposes a highly efficient framework for differentially private histogram publication of numerical values in a distributed environment. The proposed algorithms efficiently exploit the correlations among multiple queries and achieve optimal resource consumption. We also conduct extensive experiments on real-world data traces, and the results validate the improvements of the proposed algorithms.
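One way to see why correlations among queries matter (an illustrative sketch, not the paper's algorithm): if overlapping interval queries are each answered with fresh noise, budget is spent per query, whereas answering once over the finest bin partition and deriving every interval from that single noisy release spends the budget exactly once.

```python
import random

def laplace(scale):
    """Sample Laplace(0, scale) as a difference of two exponentials."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def publish_base_histogram(counts, eps):
    """Spend eps ONCE on per-bin counts (add/remove-one-record sensitivity 1)."""
    return [c + laplace(1.0 / eps) for c in counts]

def interval_count(noisy_hist, lo, hi):
    """Any interval query is then derived for free from the published bins."""
    return sum(noisy_hist[lo:hi])
```

Every subsequent overlapping query reuses the same release, so neither the bandwidth nor the privacy budget grows with the number of queries; only the estimation error does, by the accumulated per-bin noise.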


Measurement ◽  
2020 ◽  
pp. 108675
Author(s):  
Muhammad Arif ◽  
Jianer Chen ◽  
Guojun Wang ◽  
Oana Geman ◽  
Valentina Emilia Balas

Author(s):  
Dan Wang ◽  
Ju Ren ◽  
Zhibo Wang ◽  
Xiaoyi Pang ◽  
Yaoxue Zhang ◽  
...  

2021 ◽  
Vol 15 (3) ◽  
pp. 1-28
Author(s):  
Xueyan Liu ◽  
Bo Yang ◽  
Hechang Chen ◽  
Katarzyna Musial ◽  
Hongxu Chen ◽  
...  

The stochastic blockmodel (SBM) is a widely used statistical network representation model with good interpretability, expressiveness, generalization, and flexibility, which has become prevalent and important in the field of network science over recent years. However, learning an optimal SBM for a given network is an NP-hard problem. This results in significant limitations for applications of SBMs to large-scale networks, because of the significant computational overhead of existing SBM models as well as their learning methods. Reducing the cost of SBM learning and making it scalable for handling large-scale networks, while maintaining the good theoretical properties of SBM, remains an unresolved problem. In this work, we address this challenging task from the novel perspective of model redefinition. We propose a redefined SBM with Poisson distribution and its block-wise learning algorithm that can efficiently analyse large-scale networks. Extensive validation conducted on both artificial and real-world data shows that our proposed method significantly outperforms state-of-the-art methods in terms of a reasonable trade-off between accuracy and scalability.
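The redefinition the abstract mentions models edge counts as Poisson rather than Bernoulli. A minimal hedged sketch of the resulting objective (the generic Poisson SBM log-likelihood, not the authors' block-wise learner) scores a block assignment z against a block-rate matrix B:

```python
import math

def poisson_sbm_loglik(A, z, B):
    """Log-likelihood of edge counts A[i][j] ~ Poisson(B[z[i]][z[j]])
    for a block assignment z and block-rate matrix B (self-loops skipped)."""
    n = len(A)
    ll = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            lam = B[z[i]][z[j]]
            ll += A[i][j] * math.log(lam) - lam - math.lgamma(A[i][j] + 1)
    return ll
```

A block-wise learner would repeatedly move nodes between blocks to increase this objective; the sketch only scores given assignments, which is enough to see that the correct partition of a clean two-block graph beats a scrambled one.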


Author(s):  
Varsha R ◽  
Meghna Manoj Nair ◽  
Siddharth M. Nair ◽  
Amit Kumar Tyagi

The Internet of Things (smart things) is used in many sectors and applications due to recent technological advances. One such application is the transportation system, which people primarily use to move from one place to another. Smart devices embedded in vehicles help passengers resolve their queries, and future vehicles will be fully automated to an advanced stage, i.e., driverless cars. These autonomous cars will help people save time and increase productivity in their respective (associated) businesses. Today and in the near future, privacy preservation and trust will be major concerns between users and autonomous vehicles, and this paper provides clarity on both. Many attempts in the previous decade have produced efficient mechanisms, but they all work only for vehicles with a driver; these mechanisms are not valid or useful for future vehicles. In this paper, we use deep learning techniques for building trust through recommender systems, and blockchain technology for privacy preservation. We also maintain a certain level of trust by maintaining the highest level of privacy among users living in a particular environment. In this research, we developed a framework that offers maximum trust, or reliable communication, to users over the road network. With this, we also preserve users' privacy during travel, i.e., without revealing their identities to Trusted Third Parties or even Location-Based Services while reaching a destination. Thus, a Deep Learning based Blockchain Solution (DLBS) is illustrated for providing an efficient recommendation system.

