An accuracy-assured privacy-preserving recommender system for internet commerce

2015 ◽  
Vol 12 (4) ◽  
pp. 1307-1326 ◽  
Author(s):  
Zhigang Lu ◽  
Hong Shen

Recommender systems, tools for predicting users' potential preferences from historical data and users' interests, are increasingly important in Internet applications such as online shopping. As a well-known recommendation method, neighbourhood-based collaborative filtering has attracted considerable attention recently. The risk of revealing users' private information during the filtering process has also attracted noticeable research interest. Among the current solutions, probabilistic techniques have shown a powerful privacy-preserving effect. The existing probabilistic methods fall into three categories: one [19] adds differential privacy noise to the covariance matrix; one [1] introduces randomisation into the neighbour selection process; the other [29] applies differential privacy to both the neighbour selection process and the covariance matrix. When facing the k Nearest Neighbour (kNN) attack, none of the existing methods provides a data utility guarantee, because of the global randomness they introduce. In this paper, to overcome the problem of recommendation accuracy loss, we propose a novel approach, Partitioned Probabilistic Neighbour Selection, to ensure a required prediction accuracy while maintaining high security against the kNN attack. We define the sum of the k neighbours' similarities as the accuracy metric α, and the number of user partitions across which we select the k neighbours as the security metric β. We generalise the k Nearest Neighbour attack to the βk Nearest Neighbours attack. Differing from the existing approach that selects neighbours randomly across the entire candidate list, our method selects neighbours from each exclusive partition of size k with a decreasing probability. Theoretical and experimental analyses show that, to provide an accuracy-assured recommendation, our Partitioned Probabilistic Neighbour Selection method yields a better trade-off between recommendation accuracy and system security.
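The partitioned selection idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not give the exact probabilities, so the geometric `decay` weighting, the function names, and the choice to draw uniformly inside a partition are all assumptions made for illustration.

```python
import random

def partitioned_neighbour_selection(similarities, k, beta, decay=0.5, rng=None):
    """Pick k neighbours from the top beta*k candidates, favouring earlier partitions.

    similarities: dict mapping candidate id -> similarity to the target user.
    decay: hypothetical parameter; partition i is drawn with weight decay**i.
    """
    rng = rng or random.Random()
    # Rank candidates by similarity, most similar first, and keep beta*k of them.
    ranked = sorted(similarities, key=similarities.get, reverse=True)[: beta * k]
    # Split the ranked list into exclusive partitions of size k.
    partitions = [ranked[i:i + k] for i in range(0, len(ranked), k)]
    chosen = []
    while len(chosen) < k and any(partitions):
        # Decreasing weight per partition; exhausted partitions get weight 0.
        weights = [decay ** i if part else 0.0 for i, part in enumerate(partitions)]
        idx = rng.choices(range(len(partitions)), weights=weights)[0]
        part = partitions[idx]
        # Assumption: uniform draw without replacement inside the partition.
        chosen.append(part.pop(rng.randrange(len(part))))
    return chosen

def accuracy_metric(similarities, neighbours):
    """Sum of the selected neighbours' similarities (the abstract's alpha)."""
    return sum(similarities[n] for n in neighbours)
```

With `beta=1` the sketch reduces to plain kNN selection; larger `beta` spreads the choice over more partitions, which lowers the similarity sum in exchange for making the selected neighbours harder to guess.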

Author(s):  
Artrim Kjamilji

Nowadays many different entities collect data of the same nature, but in slightly different environments. For example, different hospitals collect data about their patients’ symptoms and corresponding disease diagnoses, different banks collect transactions of their customers’ bank accounts, multiple cyber-security companies collect data about log files and corresponding attacks, etc. It has been shown that if those entities merged their privately collected data into a single dataset and used it to train a machine learning (ML) model, the trained model would often outperform the human experts of the corresponding fields in prediction accuracy. However, there is a drawback. Due to privacy concerns, reinforced by laws and ethical considerations, no entity is willing to share its privately collected data with others. The same problem appears during classification over an already trained ML model. On one hand, a user who has an unclassified query (record) wants to share with the server that owns the trained model neither the content of the query (which might contain private data such as a credit card number, IP address, etc.) nor the final prediction (classification) of the query. On the other hand, the owner of the trained model doesn’t want to leak any parameter of the trained model to the user. To overcome those shortcomings, several cryptographic and probabilistic techniques have been proposed during the last few years to enable both privacy-preserving training and privacy-preserving classification schemes. They include anonymization and k-anonymity, differential privacy, secure multiparty computation (MPC), federated learning, Private Information Retrieval (PIR), Oblivious Transfer (OT), garbled circuits and/or homomorphic encryption, to name a few.
Theoretical analyses and experimental results show that the current privacy-preserving schemes are suitable for real-world deployment, while the accuracy of most of them differs little or not at all from that of schemes that work in a non-privacy-preserving fashion.


2021 ◽  
Vol 2022 (1) ◽  
pp. 481-500
Author(s):  
Xue Jiang ◽  
Xuebing Zhou ◽  
Jens Grossklags

Abstract Business intelligence and AI services often involve the collection of copious amounts of multidimensional personal data. Since these data usually contain sensitive information about individuals, direct collection can lead to privacy violations. Local differential privacy (LDP) is currently considered a state-of-the-art solution for privacy-preserving data collection. However, existing LDP algorithms are not applicable to high-dimensional data, not only because of the increase in computation and communication cost, but also because of poor data utility. In this paper, we aim at addressing the curse-of-dimensionality problem in LDP-based high-dimensional data collection. Based on the idea of machine learning and data synthesis, we propose DP-Fed-Wae, an efficient privacy-preserving framework for collecting high-dimensional categorical data. With the combination of a generative autoencoder, federated learning, and differential privacy, our framework is capable of privately learning the statistical distributions of local data and generating high-utility synthetic data on the server side without revealing users’ private information. We have evaluated the framework in terms of data utility and privacy protection on a number of real-world datasets containing 68–124 classification attributes. We show that our framework outperforms the LDP-based baseline algorithms in capturing joint distributions and correlations of attributes and generating high-utility synthetic data. With a local privacy guarantee of ε = 8, machine learning models trained with the synthetic data generated by the baseline algorithm suffer an accuracy loss of 10% to 30%, whereas the accuracy loss is reduced to less than 3%, and at best even less than 1%, with our framework. Extensive experimental results demonstrate the capability and efficiency of our framework in synthesizing high-dimensional data while striking a satisfactory utility-privacy balance.
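To make the dimensionality problem concrete, here is a minimal Python sketch of generalized randomized response, a textbook LDP mechanism for one categorical attribute. It is an assumption that this resembles the baselines; the abstract does not name them, and this is not DP-Fed-Wae itself. The function names are hypothetical.

```python
import math
import random

def keep_probability(epsilon, domain_size):
    """Probability of reporting the true value under generalized randomized response."""
    return math.exp(epsilon) / (math.exp(epsilon) + domain_size - 1)

def randomized_response(value, domain, epsilon, rng=None):
    """Report `value` truthfully with keep_probability, else a uniformly chosen other value."""
    rng = rng or random.Random()
    if rng.random() < keep_probability(epsilon, len(domain)):
        return value
    # Otherwise lie: pick any other value in the domain uniformly at random.
    others = [v for v in domain if v != value]
    return rng.choice(others)

# Illustration: a total budget of 8 spent on one 5-valued attribute versus
# split evenly across 100 attributes (simple sequential composition).
single = keep_probability(8.0, 5)
split = keep_probability(8.0 / 100, 5)
```

When the budget is split evenly over 100 attributes, `split` ends up only slightly above the 1/5 chance of a uniform guess, so each report carries very little signal. This is one way to see why per-attribute LDP loses utility as the number of attributes grows.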


2020 ◽  
Author(s):  
Hyunghoon Cho ◽  
Sean Simmons ◽  
Ryan Kim ◽  
Bonnie Berger

Abstract Sharing data across research groups is an essential driver of biomedical research. In particular, biomedical databases with interactive query-answering systems allow users to retrieve information from the database using restricted types of queries (e.g. number of subjects satisfying certain criteria). While these systems aim to facilitate the sharing of aggregate biomedical insights without divulging sensitive individual-level data, they can still leak private information about the individuals in the database through the query answers. Existing strategies to mitigate such risks either provide insufficient levels of privacy or greatly diminish the usefulness of the database. Here, we draw upon recent advances in differential privacy to introduce privacy-preserving query-answering mechanisms for biomedical databases that provably maximize the expected utility of the system while achieving formal privacy guarantees. We demonstrate the accuracy improvement of our methods over existing approaches for a range of use cases, including count, membership, and association queries. Notably, our new theoretical results extend the proof of optimality of the underlying mechanism, previously known only for count queries with symmetric utility functions, to asymmetric utility functions needed for count queries in cohort discovery workflows as well as membership queries, a core functionality of the Beacon Project recently launched by the Global Alliance for Genomics and Health (GA4GH). Our work presents a path towards biomedical query-answering systems that achieve the best privacy-utility trade-offs permitted by the theory of differential privacy.
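For readers unfamiliar with differentially private count queries, the following Python sketch shows the standard Laplace mechanism. It is a generic textbook mechanism, swapped in for illustration; it is not the utility-maximizing mechanism the abstract refers to, and the function name is hypothetical.

```python
import random

def laplace_count(true_count, epsilon, rng=None):
    """Answer a count query with epsilon-differential privacy via Laplace noise.

    A count query has sensitivity 1 (adding or removing one individual changes
    the answer by at most 1), so the noise scale is 1 / epsilon.
    """
    rng = rng or random.Random()
    # The difference of two i.i.d. Exponential(rate=epsilon) draws is
    # Laplace-distributed with mean 0 and scale 1 / epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise
```

A smaller `epsilon` gives stronger privacy but noisier answers. The released value here is real-valued, so a deployment would need to decide how to round it and how to handle negative results.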


2018 ◽  
Vol 94 (3) ◽  
pp. 1-26 ◽  
Author(s):  
Dichu Bao ◽  
Yongtae Kim ◽  
G. Mujtaba Mian ◽  
Lixin (Nancy) Su

ABSTRACT Prior studies provide conflicting evidence as to whether managers have a general tendency to disclose or withhold bad news. A key challenge for this literature is that researchers cannot observe the negative private information that managers possess. We tackle this challenge by constructing a proxy for managers' private bad news (residual short interest) and then perform a series of tests to validate this proxy. Using management earnings guidance and 8-K filings as measures of voluntary disclosure, we find a negative relation between bad-news disclosure and residual short interest, suggesting that managers withhold bad news in general. This tendency is tempered when firms are exposed to higher litigation risk, and it is strengthened when managers have greater incentives to support the stock price. Based on a novel approach to identifying the presence of bad news, our study adds to the debate on whether managers tend to withhold or release bad news. Data Availability: Data used in this study are available from public sources identified in the study.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Fu Jie Tey ◽  
Tin-Yu Wu ◽  
Chiao-Ling Lin ◽  
Jiann-Liang Chen

Abstract Recent advances in Internet applications have facilitated the spread of information and, thanks to a wide variety of mobile devices and the burgeoning 5G networks, users can gain access to information easily and quickly. The large amount of digital information has moreover contributed to the emergence of recommender systems that help to filter information. As the rise of mobile networks has driven the growth of social media networks and users have become used to posting whatever they do and wherever they visit on the Web, such rapid social media updates make it difficult for users to find historical data. For this reason, this paper presents a social network-based recommender system. Our purpose is to build a user-centered recommender system that excludes the products users are not interested in, according to user preferences and their friends' shopping experiences, so as to make recommendations effective. Since there might be no corresponding reference value for new products or services, we use indirect relations between friends and “friends’ friends” as well as sentinel friends to improve the recommendation accuracy. The simulation results show that our proposed mechanism is effective in enhancing recommendation accuracy.


Author(s):  
Dan Wang ◽  
Ju Ren ◽  
Zhibo Wang ◽  
Xiaoyi Pang ◽  
Yaoxue Zhang ◽  
...  

2021 ◽  
Vol 18 (11) ◽  
pp. 42-60
Author(s):  
Ting Bao ◽  
Lei Xu ◽  
Liehuang Zhu ◽  
Lihong Wang ◽  
Ruiguang Li ◽  
...  

Author(s):  
Shushu Liu ◽  
An Liu ◽  
Zhixu Li ◽  
Guanfeng Liu ◽  
Jiajie Xu ◽  
...  

Author(s):  
Cheng Huang ◽  
Rongxing Lu ◽  
Hui Zhu ◽  
Jun Shao ◽  
Abdulrahman Alamer ◽  
...  
