A Lightweight Matrix Factorization for Recommendation with Local Differential Privacy in Big Data

Author(s):  
Hao Zhou ◽  
Geng Yang ◽  
Yang Xiang ◽  
Yunlu Bai ◽  
Weiya Wang
2020 ◽  
Vol 2020 ◽  
pp. 1-29 ◽  
Author(s):  
Xingxing Xiong ◽  
Shubo Liu ◽  
Dan Li ◽  
Zhaohui Cai ◽  
Xiaoguang Niu

With the advent of the big data era, privacy has become a topic of wide public concern. Local differential privacy (LDP) is a state-of-the-art privacy preservation technique that enables big data analysis (e.g., statistical estimation, statistical learning, and data mining) while guaranteeing each individual participant’s privacy. In this paper, we present a comprehensive survey of LDP. We first give an overview of the fundamentals of LDP and its frameworks. We then introduce the mainstream privatization mechanisms and methods in detail from the perspective of frequency oracles and give insights into recent studies on private basic statistical estimation (e.g., frequency estimation and mean estimation) and complex statistical estimation (e.g., multivariate distribution estimation and private estimation over complex data) under LDP. Furthermore, we present the current state of research on LDP, including private statistical learning/inference, private statistical data analysis, privacy amplification techniques for LDP, and some application fields of LDP. Finally, we identify future research directions and open challenges for LDP. This survey can serve as a reference for LDP research that addresses the various privacy-related scenarios encountered in practice.
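To make the frequency-oracle idea concrete, the following is a minimal sketch of k-ary (generalized) randomized response, one common LDP frequency oracle. It is an illustration only and is not taken from the survey; the function names `grr_perturb` and `grr_estimate` and the toy domain are assumptions made for this example.

```python
import math
import random
from collections import Counter

def grr_perturb(value, domain, epsilon, rng):
    """Client side: keep the true value with probability p,
    otherwise report a uniformly chosen different value."""
    k = len(domain)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p:
        return value
    others = [v for v in domain if v != value]
    return rng.choice(others)

def grr_estimate(reports, domain, epsilon):
    """Server side: unbiased frequency estimates from perturbed reports."""
    k = len(domain)
    n = len(reports)
    e = math.exp(epsilon)
    p = e / (e + k - 1)      # probability of reporting the true value
    q = 1.0 / (e + k - 1)    # probability of reporting one specific other value
    counts = Counter(reports)
    return {v: (counts[v] / n - q) / (p - q) for v in domain}

if __name__ == "__main__":
    rng = random.Random(42)
    domain = ["a", "b", "c", "d"]          # hypothetical item domain
    truth = ["a"] * 5000 + ["b"] * 3000 + ["c"] * 1500 + ["d"] * 500
    reports = [grr_perturb(v, domain, 1.0, rng) for v in truth]
    print(grr_estimate(reports, domain, 1.0))
```

Each user perturbs locally, so the server never sees raw values; the estimator inverts the known perturbation probabilities, and its variance grows as epsilon shrinks or the domain grows.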


2018 ◽  
Vol 36 (3) ◽  
pp. 458-481 ◽  
Author(s):  
Yezheng Liu ◽  
Lu Yang ◽  
Jianshan Sun ◽  
Yuanchun Jiang ◽  
Jinkun Wang

Purpose: Academic groups are designed specifically for researchers. A group recommendation procedure is essential to support scholars’ research-based social activities. However, group recommendation methods are rarely applied in online libraries, and they often suffer from scalability problems in big data contexts. The purpose of this paper is to facilitate academic group activities in big data-based library systems by recommending satisfying articles for academic groups.
Design/methodology/approach: The authors propose a collaborative matrix factorization (CoMF) mechanism and implement a parallelized CoMF under the Hadoop framework. Its rationale is to collaboratively decompose the researcher-article interaction matrix and the group-article interaction matrix. Furthermore, three extended models of CoMF are proposed.
Findings: Empirical studies on the CiteULike data set demonstrate that CoMF and its three variants outperform baseline algorithms in terms of accuracy and robustness. The scalability evaluation of the parallelized CoMF shows its potential value in scholarly big data environments.
Research limitations/implications: The proposed methods fill the gap in group-article recommendation in the online library domain. They enrich group recommendation methods by considering the interaction effects between groups and members, and they are the first attempt to implement group recommendation methods in big data contexts.
Practical implications: The proposed methods can improve group activity effectiveness and information shareability in academic groups, which benefits membership retention and enhances the service quality of online library systems. Furthermore, the proposed methods are applicable to big data contexts and make library system services more efficient.
Social implications: The proposed methods have potential value for improving scientific collaboration and research innovation.
Originality/value: The proposed CoMF method is a novel group recommendation method based on the collaborative decomposition of the researcher-article matrix and the group-article matrix. The process indirectly reflects the interaction between groups and members, which accords with actual library environments and provides an interpretable recommendation result.
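As a rough illustration of what jointly decomposing two interaction matrices can look like, the sketch below factorizes a researcher-article matrix `R` and a group-article matrix `G` with a shared article factor matrix `V`. The shared-factor coupling, the squared-error loss, the plain gradient descent, and all names (`comf_sketch`, `U`, `W`, `V`) are assumptions made for this example; the abstract does not specify the paper's actual model or its Hadoop implementation.

```python
import numpy as np

def comf_sketch(R, G, rank=2, lr=0.02, reg=0.01, steps=500, seed=0):
    """Jointly fit R ~ U V^T and G ~ W V^T with shared article factors V.
    Hypothetical sketch, not the published CoMF algorithm."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((R.shape[0], rank))  # researcher factors
    W = 0.1 * rng.standard_normal((G.shape[0], rank))  # group factors
    V = 0.1 * rng.standard_normal((R.shape[1], rank))  # shared article factors
    losses = []
    for _ in range(steps):
        E_r = R - U @ V.T   # residual on researcher-article matrix
        E_g = G - W @ V.T   # residual on group-article matrix
        losses.append(float((E_r ** 2).sum() + (E_g ** 2).sum()))
        # Gradients of 0.5 * squared error + 0.5 * reg * squared norms
        grad_U = -E_r @ V + reg * U
        grad_W = -E_g @ V + reg * W
        grad_V = -E_r.T @ U - E_g.T @ W + reg * V
        U -= lr * grad_U
        W -= lr * grad_W
        V -= lr * grad_V
    return U, W, V, losses

if __name__ == "__main__":
    # Toy binary interactions: 4 researchers, 2 groups, 5 articles
    R = np.array([[1, 1, 0, 0, 0],
                  [1, 0, 1, 0, 0],
                  [0, 0, 0, 1, 1],
                  [0, 0, 1, 1, 1]], dtype=float)
    G = np.array([[1, 1, 1, 0, 0],
                  [0, 0, 1, 1, 1]], dtype=float)
    U, W, V, losses = comf_sketch(R, G)
    print(losses[0], losses[-1])
    print(np.round(W @ V.T, 2))  # predicted group-article scores
```

Sharing `V` is one simple way for group-level and member-level signals to inform each other: an article's latent factors must explain both who reads it individually and which groups engage with it.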


Author(s):  
Trupti Vishwambhar Kenekar ◽  
Ajay R. Dani

As Big Data is a collection of structured, unstructured, and semi-structured data gathered from various sources, it is important to mine the data while preserving the privacy of individual records. Differential privacy is one of the best measures, providing a strong privacy guarantee. The chapter proposes differentially private frequent itemset mining using MapReduce, which requires less time to privately mine large datasets. The chapter discusses the problem of preserving data privacy, the challenges of preserving data privacy in a big data environment, data privacy techniques, and their applications to unstructured data. Analyses of experimental results on structured and unstructured data sets are also presented.
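One standard building block for differentially private itemset mining is adding Laplace noise to support counts. The sketch below covers single-item supports only and runs on one machine; it is not the chapter's MapReduce algorithm. The function name `noisy_item_counts`, the truncation rule, and the toy data are assumptions made for this example.

```python
import numpy as np
from collections import Counter

def noisy_item_counts(transactions, items, epsilon, max_len, rng):
    """Release Laplace-noised support counts for a fixed, public item list.

    Each transaction is truncated to at most `max_len` distinct items, so
    adding or removing one transaction changes the count vector by at most
    `max_len` in L1 norm. Laplace noise with scale max_len / epsilon then
    gives epsilon-differential privacy for the whole vector.
    """
    counts = Counter()
    for t in transactions:
        # Deterministic truncation (hypothetical rule for this sketch)
        for item in sorted(set(t))[:max_len]:
            counts[item] += 1
    scale = max_len / epsilon
    return {item: counts[item] + rng.laplace(0.0, scale) for item in items}

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    transactions = [["milk", "bread"], ["milk", "eggs"], ["milk"], ["bread", "eggs"]]
    items = ["bread", "eggs", "milk"]   # public item domain
    print(noisy_item_counts(transactions, items, epsilon=1.0, max_len=2, rng=rng))
```

Items whose noisy support exceeds a threshold would be reported as frequent; extending this to larger itemsets requires splitting the privacy budget across mining rounds.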


Author(s):  
Bharat Singh ◽  
Om Prakash Vyas

Nowadays, applications that deal with Big Data are used extensively in many popular areas. To tackle such data, researchers have developed various approaches over the last few decades. One recently investigated technique, matrix factorization, factors the data matrix into latent factors in a lower-dimensional space. One problem with nonnegative matrix factorization (NMF) approaches is that their random initial values do not yield a global optimum within a limited number of iterations, only a local optimum. For this reason, the authors propose a new approach that chooses the initial values of the decomposition to address the computational expense. They devise an algorithm for initializing the values of the decomposed matrices based on PSO. In this paper, the authors present a genetic algorithm based technique that incorporates nonnegative matrix factorization. Through experimental results, they show that the proposed method converges very fast in comparison to other low-rank approximation techniques such as simple multiplicative NMF and ACLS.
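For reference, the baseline the abstract calls simple multiplicative NMF can be sketched as the classic multiplicative update rules with random initialization. This is the baseline only, not the authors' PSO or genetic-algorithm initialization; the function name `nmf_multiplicative` and the toy matrix are assumptions made for this example.

```python
import numpy as np

def nmf_multiplicative(X, rank, steps=200, seed=0, eps=1e-9):
    """Approximate a nonnegative matrix X as W @ H using multiplicative
    updates for the Frobenius-norm objective, from a random start."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank)) + eps
    H = rng.random((rank, X.shape[1])) + eps
    errors = []
    for _ in range(steps):
        # Elementwise updates keep W and H nonnegative at every step
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        errors.append(float(np.linalg.norm(X - W @ H)))
    return W, H, errors

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.random((6, 3)) @ rng.random((3, 8))   # toy nonnegative data of rank 3
    W, H, errors = nmf_multiplicative(X, rank=3)
    print(errors[0], errors[-1])
```

Because the result depends on the random starting point, running the function with different `seed` values can end at different local optima, which is the motivation the abstract gives for a better initialization strategy.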


2020 ◽  
Vol 36 (4) ◽  
pp. 1067-1074
Author(s):  
James Bailie

Differential privacy (DP) has emerged in the computer science literature as a measure of the impact on an individual’s privacy resulting from the publication of a statistical output such as a frequency table. This paper provides an introduction to DP for official statisticians and discusses its relevance, benefits and challenges from a National Statistical Organisation (NSO) perspective. We motivate our study by examining how privacy is evolving in the era of big data and how this might prompt a shift from traditional statistical disclosure techniques used in official statistics – which are generally applied on a cell-by-cell or table-by-table basis – to formal privacy methods, like DP, which are applied from a perspective encompassing the totality of the outputs generated from a given dataset. We identify an important interplay between DP’s holistic privacy risk measure and the difficulty for NSOs in implementing DP, showing that DP’s major advantage is also DP’s major challenge. This paper provides new work addressing two key DP research areas for NSOs: DP’s application to survey data and its incorporation within the Five Safes framework.
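For readers unfamiliar with the formal measure, the standard definition of ε-differential privacy and its basic sequential composition property can be written as follows. The notation (mechanism M, neighbouring datasets D and D', output set S) is the usual textbook convention and is assumed here rather than taken from the paper.

```latex
% epsilon-differential privacy: for all neighbouring datasets D, D'
% (differing in one individual's record) and every set of outputs S,
\[
  \Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S].
\]

% Basic sequential composition: releasing k outputs from the same dataset,
% where the i-th mechanism is epsilon_i-differentially private, is
% differentially private with total privacy loss
\[
  \varepsilon_{\text{total}} \;=\; \sum_{i=1}^{k} \varepsilon_i .
\]
```

The composition property is what makes the measure holistic: the privacy loss is accounted for across all outputs from a dataset rather than table by table.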

