A Utility-Theoretic Approach to Privacy in Online Services

2010 ◽  
Vol 39 ◽  
pp. 633-662 ◽  
Author(s):  
A. Krause ◽  
E. Horvitz

Online offerings such as web search, news portals, and e-commerce applications face the challenge of providing high-quality service to a large, heterogeneous user base. Recent efforts have highlighted the potential to improve performance by introducing methods to personalize services based on special knowledge about users and their context. For example, a user's demographics, location, and past search and browsing may be useful in enhancing the results offered in response to web search queries. However, reasonable concerns about privacy held by users, providers, and government agencies acting on behalf of citizens may limit access by services to such information. We introduce and explore an economics of privacy in personalization, where people can opt to share personal information, in a standing or on-demand manner, in return for expected enhancements in the quality of an online service. We focus on the example of web search and formulate realistic objective functions for search efficacy and privacy. We demonstrate how we can efficiently find a provably near-optimal solution to the utility-privacy tradeoff. We evaluate our methodology on data drawn from a log of the search activity of volunteer participants. We separately assess users' preferences about privacy and utility via a large-scale survey aimed at eliciting people's willingness to trade the sharing of personal data in return for gains in search efficiency. We show that a significant level of personalization can be achieved using a relatively small amount of information about users.
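The tradeoff described in this abstract can be illustrated with a small sketch. It assumes a utility function with diminishing returns and an additive privacy cost, and it greedily adds whichever attribute gives the largest positive net gain. The attribute names, weights, costs, and the function `greedy_select` are hypothetical and are not taken from the paper. Plain greedy selection like this does not by itself carry the near-optimality guarantee the abstract mentions, which depends on properties of the paper's actual objective functions.

```python
import math
from typing import Callable, FrozenSet, Iterable, Set


def greedy_select(
    attributes: Iterable[str],
    utility: Callable[[FrozenSet[str]], float],
    privacy_cost: Callable[[FrozenSet[str]], float],
    tradeoff: float = 1.0,
) -> Set[str]:
    """Greedily pick attributes while utility - tradeoff * cost keeps rising."""
    chosen: Set[str] = set()
    remaining = set(attributes)

    def net(subset: Set[str]) -> float:
        frozen = frozenset(subset)
        return utility(frozen) - tradeoff * privacy_cost(frozen)

    while remaining:
        current = net(chosen)
        best_attr, best_gain = None, 0.0
        # Sorted iteration keeps tie-breaking deterministic.
        for attr in sorted(remaining):
            gain = net(chosen | {attr}) - current
            if gain > best_gain:
                best_attr, best_gain = attr, gain
        if best_attr is None:  # no attribute improves the net objective
            break
        chosen.add(best_attr)
        remaining.remove(best_attr)
    return chosen


# Hypothetical toy numbers: utility has diminishing returns (square root),
# privacy cost is the sum of per-attribute costs.
WEIGHTS = {"location": 4.0, "demographics": 1.0, "search_history": 9.0}
COSTS = {"location": 0.5, "demographics": 1.5, "search_history": 1.0}


def toy_utility(subset: FrozenSet[str]) -> float:
    return math.sqrt(sum(WEIGHTS[a] for a in subset))


def toy_cost(subset: FrozenSet[str]) -> float:
    return sum(COSTS[a] for a in subset)


selected = greedy_select(WEIGHTS, toy_utility, toy_cost)
```

With these toy numbers, `search_history` is picked first, `location` second, and `demographics` is rejected because its cost outweighs its marginal utility.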

Author(s):  
Anastasia Kozyreva ◽  
Philipp Lorenz-Spreen ◽  
Ralph Hertwig ◽  
Stephan Lewandowsky ◽  
Stefan M. Herzog

People rely on data-driven AI technologies nearly every time they go online, whether they are shopping, scrolling through news feeds, or looking for entertainment. Yet despite their ubiquity, personalization algorithms and the associated large-scale collection of personal data have largely escaped public scrutiny. Policy makers who wish to introduce regulations that respect people’s attitudes towards privacy and algorithmic personalization on the Internet would greatly benefit from knowing how people perceive personalization and personal data collection. To contribute to an empirical foundation for this knowledge, we surveyed public attitudes towards key aspects of algorithmic personalization and people’s data privacy concerns and behavior using representative online samples in Germany (N = 1065), Great Britain (N = 1092), and the United States (N = 1059). Our findings show that people object to the collection and use of sensitive personal information and to the personalization of political campaigning and, in Germany and Great Britain, to the personalization of news sources. Encouragingly, attitudes are independent of political preferences: People across the political spectrum share the same concerns about their data privacy and show similar levels of acceptance regarding personalized digital services and the use of private data for personalization. We also found an acceptability gap: People are more accepting of personalized services than of the collection of personal data and information required for these services. A large majority of respondents rated, on average, personalized services as more acceptable than the collection of personal information or data. The acceptability gap can be observed at both the aggregate and the individual level. Across countries, between 64% and 75% of respondents showed an acceptability gap.
Our findings suggest a need for transparent algorithmic personalization that minimizes use of personal data, respects people’s preferences on personalization, is easy to adjust, and does not extend to political advertising.
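One way to read the individual-level acceptability gap is as a per-respondent difference between two averages. The sketch below assumes each respondent gives binary acceptability ratings (1 = acceptable, 0 = not) for a set of services and for a set of data types. This operationalization, the function names, and the sample ratings are illustrative assumptions, not the authors' published procedure or data.

```python
from statistics import mean
from typing import List, Sequence, Tuple

Respondent = Tuple[Sequence[int], Sequence[int]]  # (service ratings, data ratings)


def acceptability_gap(service_ratings: Sequence[int], data_ratings: Sequence[int]) -> float:
    """Mean acceptability of services minus mean acceptability of data collection."""
    return mean(service_ratings) - mean(data_ratings)


def share_with_gap(respondents: List[Respondent]) -> float:
    """Fraction of respondents who rate services as more acceptable than data collection."""
    gaps = [acceptability_gap(services, data) for services, data in respondents]
    return sum(gap > 0 for gap in gaps) / len(gaps)


# Hypothetical ratings for four respondents.
sample = [
    ([1, 1, 0], [0, 0, 1]),  # gap > 0
    ([0, 0], [1, 1]),        # gap < 0
    ([1], [0]),              # gap > 0
    ([1, 0], [1, 0]),        # gap == 0
]
share = share_with_gap(sample)
```

Here two of the four hypothetical respondents show a positive gap, so `share` is 0.5.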


2019 ◽  
Vol 9 (19) ◽  
pp. 3997
Author(s):  
Md Mehedi Hassan Onik ◽  
Chul-Soo Kim ◽  
Nam-Yong Lee ◽  
Jinhong Yang

Application publishers offer millions of Android apps on the Google Play store. However, those publishers have a parent organization and share information with it. Through the ‘Android permission system’, a user permits an app to access sensitive personal data. Large-scale personal data integration can reveal user identity, enable new insights, and earn revenue for the organizations. Similarly, aggregation of Android app permissions by the parent organizations that own the apps can cause privacy leakage by revealing the user profile. This work classifies risky personal data by proposing a threat model for large-scale app permission aggregation by app publishers and their associated owners. A web app assisted by a Google Play application programming interface (API) is developed that visualizes all the permissions an app owner can collectively gather through multiple apps released via several publishers. The work empirically validates the performance of the risk model with two case studies. The top two Korean app owners, seven publishers, 108 apps and 720 sets of permissions are studied. With reasonable accuracy, the study finds the contact number, biometric ID, address, social graph, human behavior, email, location and unique ID to be frequently exposed data. Finally, the work concludes that real-time tracking of aggregated permissions can limit the odds of user profiling.
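The aggregation step at the core of this threat model can be sketched as a set union per owner. The owner, publisher, and app names below are made up, the permission strings are shortened Android permission names used only for illustration, and `aggregate_permissions` is a hypothetical helper, not the authors' tool.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

AppRecord = Tuple[str, str, str, Iterable[str]]  # (owner, publisher, app, permissions)


def aggregate_permissions(apps: Iterable[AppRecord]) -> Dict[str, Set[str]]:
    """Union the permissions of every app that ultimately belongs to the same owner."""
    by_owner: Dict[str, Set[str]] = defaultdict(set)
    for owner, _publisher, _app, permissions in apps:
        by_owner[owner] |= set(permissions)
    return dict(by_owner)


# Hypothetical catalogue: one owner releasing apps through two publishers.
catalogue = [
    ("OwnerA", "PublisherX", "chat_app", ["READ_CONTACTS", "GET_ACCOUNTS"]),
    ("OwnerA", "PublisherY", "maps_app", ["ACCESS_FINE_LOCATION"]),
    ("OwnerA", "PublisherY", "pay_app", ["USE_BIOMETRIC", "GET_ACCOUNTS"]),
    ("OwnerB", "PublisherZ", "game_app", ["READ_PHONE_STATE"]),
]
aggregated = aggregate_permissions(catalogue)
```

No single app in this toy catalogue holds contacts, location, and biometric access together, but the union for `OwnerA` does, which is the profiling risk the abstract describes.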


Data Science ◽  
2020 ◽  
Vol 3 (2) ◽  
pp. 79-106
Author(s):  
Vero Estrada-Galiñanes ◽  
Katarzyna Wac

New, multi-channel personal data sources (like heart rate, sleep patterns, travel patterns, or social activities) are enabled by the ever-increasing availability of miniaturised technologies embedded within smartphones and wearables. These data sources enable personal self-management of lifestyle choices (e.g., exercise, move to a bike-friendly area) and, on a large scale, scientific discoveries to improve health and quality of life. However, there are no simple and reliable ways for individuals to securely collect, explore and share these sources. Additionally, much data is wasted, especially when the technology provider ceases to exist, leaving users without any opportunity to retrieve their own datasets from “dead” devices or systems. Our research reveals evidence of what we term human data bleeding and offers guidance on how to address current issues by reasoning upon five core aspects, namely technological, financial, legal, institutional and cultural factors. To this end, we present preliminary specifications of an open platform for personal data storage and quality of life research. The Open Health Archive (OHA) is a platform that would support individual, community and societal needs by facilitating the collection, exploration and sharing of personal health and QoL data.


2018 ◽  
Vol 7 (2.24) ◽  
pp. 353
Author(s):  
Nishant Pal ◽  
Akshat Chawla ◽  
A Meena Priyadharsini

In large-scale information systems where retrieval of information is an essential operation, such as search engines, users are concerned not only with the quality of results but also with the time taken to query the data. These aspects lead to a natural tradeoff: approaches that lead to an increase in data have a correspondingly larger response time, and vice versa. Hence, as the requirement for faster search query processing along with high-quality results increases, we need to identify other ways of increasing efficiency. This work proposes an application of the meta-heuristic Grey Wolf Optimization (GWO) algorithm to improve query processing time in search engines. The GWO algorithm mimics the way in which grey wolves are organised and their hunting techniques. A single pack contains four categories of grey wolves: alpha, beta, delta, and omega. These categories are used to simulate the pack hierarchy. Together, they help achieve better search results with decreased query response times.
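The grey-wolf hierarchy described in this abstract can be sketched as a generic optimizer. The code below follows the commonly described GWO update, in which the three best wolves (alpha, beta, delta) steer the rest of the pack (omega). The abstract does not say how the algorithm is coupled to query processing, so the toy objective `sphere`, the function name `gwo_minimize`, and all parameter values are illustrative assumptions.

```python
import random
from typing import Callable, List, Sequence, Tuple


def gwo_minimize(
    objective: Callable[[Sequence[float]], float],
    dim: int,
    bounds: Tuple[float, float],
    n_wolves: int = 20,
    n_iter: int = 100,
    seed: int = 0,
) -> List[float]:
    """Minimize `objective` with a basic Grey Wolf Optimization loop."""
    rng = random.Random(seed)
    lo, hi = bounds
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    best = list(min(wolves, key=objective))

    for t in range(n_iter):
        # The three fittest wolves lead; every other wolf acts as an omega.
        ranked = sorted(wolves, key=objective)
        leaders = [list(w) for w in ranked[:3]]  # alpha, beta, delta
        a = 2.0 * (1 - t / n_iter)  # decreases linearly from 2 towards 0

        new_wolves = []
        for wolf in wolves:
            position = []
            for j in range(dim):
                estimate = 0.0
                for leader in leaders:
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    distance = abs(C * leader[j] - wolf[j])
                    estimate += leader[j] - A * distance
                # Average the three leader-guided moves and clamp to the bounds.
                position.append(min(hi, max(lo, estimate / 3)))
            new_wolves.append(position)
        wolves = new_wolves

        candidate = min(wolves, key=objective)
        if objective(candidate) < objective(best):
            best = list(candidate)
    return best


def sphere(x: Sequence[float]) -> float:
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)


solution = gwo_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
```

In a search-engine setting the objective would have to encode query cost in some way; that mapping is left open here because the abstract does not describe it.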


2020 ◽  
Author(s):  
Anastasia Kozyreva ◽  
Philipp Lorenz-Spreen ◽  
Ralph Hertwig ◽  
Stephan Lewandowsky ◽  
Stefan Michael Herzog

Despite their ubiquity online, personalization algorithms and the associated large-scale collection of personal data have largely escaped public scrutiny. Yet policy makers who wish to introduce regulations that respect people's attitudes towards privacy and algorithmic personalization on the Internet would greatly benefit from knowing how people perceive different aspects of personalization and data collection. To contribute to an empirical foundation for this knowledge, we surveyed public attitudes using representative online samples in Germany, Great Britain, and the United States on key aspects of algorithmic personalization and on people's data privacy concerns and behavior. Our findings show that people object to the collection and use of sensitive personal information and to the personalization of political campaigning and, in Germany and Great Britain, to the personalization of news sources. Encouragingly, attitudes are independent of political preferences: People across the political spectrum share the same concerns about their data privacy and the effects of personalization on news and politics. We also found that people are more accepting of personalized services than of the collection of personal data and information currently collected for these services. This acceptability gap (the difference between the acceptability of personalized online services and the acceptability of the collection and use of data and information) in people's attitudes can be observed at both the aggregate and the individual level. Our findings suggest a need for transparent algorithmic personalization that respects people's data privacy, can be easily adjusted, and does not extend to political advertising.


Abakós ◽  
2012 ◽  
Vol 1 (1) ◽  
pp. 28-49
Author(s):  
Kaio Wagner ◽  
Edleno Silva de Moura ◽  
David Fernandes ◽  
Marco Cristo ◽  
Altigran Soares da Silva

Previous work in the literature has indicated that templates of web pages represent noisy information in web collections, and advocates that the simple removal of templates results in improvements in the quality of results provided by Web search systems. In this paper, we study the impact of template removal in two distinct scenarios: large-scale web search collections, which consist of several distinct websites, and intrasite web collections, involving searches inside individual websites. Our work is the first in the literature to study the impact of template removal on search systems in large-scale Web collections. The study was carried out using an automatic template detection method previously proposed by us. As contributions, we present statistics about the application of this automatic template detection method to the well-known GOV2 reference collection, a large-scale Web collection. We also present experiments comparing the amount of template detected by our automatic method to the amount obtained when humans select templates. Finally, we present experiments which indicate that, in both scenarios, template removal does not improve the quality of results provided by search systems, but can play the role of an effective lossy compression method by reducing the size of their indexes.


2020 ◽  
Vol 16 (5) ◽  
pp. 155014772091211 ◽  
Author(s):  
Tomás Robles ◽  
Borja Bordel ◽  
Ramón Alcarria ◽  
Diego Sánchez-de-Rivera

Users are increasingly aware of their privacy and data protection. Although this problem cuts across every digital service, it is especially relevant when critical and personal information is managed, as in eHealth and well-being services. In recent years, many different innovative services in this area have been proposed. However, data management challenges are still in need of a solution. In general, data are sent directly to services, but no trustworthy instruments to recover these data or remove them from services are available. In this scheme, services become the owners of the users’ data, although users keep the rights to access, modify, and be forgotten. Nevertheless, the adequate implementation of these rights is not guaranteed, as services use the received data for commercial purposes. In order to address this situation, we propose a new trustworthy personal data protection mechanism for well-being services, based on privacy-by-design technologies. This new mechanism is based on Blockchain networks, indirection functions, and tokens. Blockchain networks execute transparent smart contracts, where users’ rights are codified, and store the users’ personal data, which are never sent or given to external services. In addition, the permissions and privacy restrictions that users define for their data and for the services consuming them are also implemented in these smart contracts. Finally, an experimental validation is described that evaluates the Quality of Experience (in terms of user satisfaction) and Quality of Service (in terms of processing delay) compared to traditional service provision solutions.


2016 ◽  
Vol 12 (8) ◽  
pp. 737-744 ◽  
Author(s):  
John Paparrizos ◽  
Ryen W. White ◽  
Eric Horvitz

Introduction: People’s online activities can yield clues about their emerging health conditions. We performed an intensive study to explore the feasibility of using anonymized Web query logs to screen for the emergence of pancreatic adenocarcinoma. The methods used statistical analyses of large-scale anonymized search logs considering the symptom queries from millions of people, with the potential application of warning individual searchers about the value of seeking attention from health care professionals. Methods: We identified searchers in logs of online search activity who issued special queries that are suggestive of a recent diagnosis of pancreatic adenocarcinoma. We then went back many months before these landmark queries were made, to examine patterns of symptoms, which were expressed as searches about concerning symptoms. We built statistical classifiers that predicted the future appearance of the landmark queries based on patterns of signals seen in search logs. Results: We found that signals about patterns of queries in search logs can predict the future appearance of queries that are highly suggestive of a diagnosis of pancreatic adenocarcinoma. We showed specifically that we can identify 5% to 15% of cases, while preserving extremely low false-positive rates (0.00001 to 0.0001). Conclusion: Signals in search logs show the possibilities of predicting a forthcoming diagnosis of pancreatic adenocarcinoma from combinations of subtle temporal signals revealed in the queries of searchers.
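The results above pair a detection rate with a cap on the false-positive rate. A minimal sketch of that kind of evaluation follows: given classifier scores and true labels, it finds the highest fraction of true cases that can be flagged while keeping the false-positive rate at or below a chosen limit. The function `recall_at_fpr`, the scores, and the labels are hypothetical and unrelated to the study's data or classifiers.

```python
from typing import Sequence


def recall_at_fpr(scores: Sequence[float], labels: Sequence[int], max_fpr: float) -> float:
    """Best true-positive rate over thresholds whose false-positive rate <= max_fpr."""
    positives = sum(1 for y in labels if y == 1)
    negatives = sum(1 for y in labels if y == 0)
    best_recall = 0.0
    # Try each distinct score as a threshold; flag items scoring at or above it.
    for threshold in sorted(set(scores), reverse=True):
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        false_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        if false_pos / negatives <= max_fpr:
            best_recall = max(best_recall, true_pos / positives)
    return best_recall


# Hypothetical scores for six searchers; label 1 marks a later landmark query.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
toy_labels = [1, 0, 1, 0, 0, 1]
strict = recall_at_fpr(toy_scores, toy_labels, max_fpr=0.0)
```

With no false positives allowed, only the top-scoring searcher can be flagged in this toy data, so one of the three true cases is caught.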


Author(s):  
A. Babirad

Cerebrovascular diseases are a global problem today and, according to forecasts, will remain one in the near future. The main risk factors for the development of ischemic disorders of the cerebral circulation include advanced age, arterial hypertension, smoking, diabetes mellitus and heart disease. An effective strategy for the prevention of cerebrovascular events is based on the implementation of large-scale risk control measures, including the use of antiplatelet and anticoagulant therapy and invasive interventions such as atherectomy, angioplasty and stenting. In this connection, the combined efforts of neurologists, cardiologists, angiosurgeons, endocrinologists and other specialists are the basis for achieving an acceptable clinical outcome. A review of the SF-36 method for assessing the quality of life in patients with the effects of transient ischemic stroke is presented. The assessment of quality of life is recognized in world medical practice and research as an indicator that is also used to assess the quality of the health system and in general sociological research.

