Releasing Microdata: Disclosure Risk Estimation, Data Masking and Assessing Utility

Author(s):  
Natalie Shlomo

Statistical agencies need to make informed decisions when releasing sample microdata from social surveys, both about the level of protection required in the data and about the mode of access. These decisions should be based on objective, quantitative measures of disclosure risk and data utility. This paper reviews recent developments in disclosure risk assessment and discusses how these can be integrated with established methods of data masking and utility assessment for releasing microdata. We illustrate the disclosure risk–data utility approach on samples drawn from a census in which the population is known and can therefore be used to investigate sample-based methods and validate results.
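The census setting makes the standard file-level risk measures directly checkable: with key variables observed in both the sample and the population, one can count how many sample-unique records are also population unique, and how many correct matches an intruder could expect. The sketch below is a minimal illustration of these two measures, assuming (hypothetically) that the sample and the population are available as pandas DataFrames sharing the key columns; it is not the estimator developed in the paper.

```python
import pandas as pd

def risk_measures(sample: pd.DataFrame, population: pd.DataFrame, keys: list) -> dict:
    """File-level disclosure risk from sample counts f_k and population counts F_k."""
    f = sample.groupby(keys).size().rename("f")
    F = population.groupby(keys).size().rename("F")
    cells = pd.concat([f, F], axis=1).dropna()   # key combinations seen in the sample
    uniques = cells[cells["f"] == 1]             # sample-unique records
    return {
        "sample_uniques": len(uniques),
        # sample uniques that are also population unique: re-identification is certain
        "population_uniques": int((uniques["F"] == 1).sum()),
        # expected number of correct matches when each sample unique is linked
        # to a randomly chosen individual in its population cell
        "expected_matches": float((1.0 / uniques["F"]).sum()),
    }
```

Typical keys here would be coarse identifiers such as age band, sex and region; masking aims to drive both counts down while the utility measures stay acceptable.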

2015 ◽  
Vol 31 (4) ◽  
pp. 737-761
Author(s):  
Matthias Templ

Abstract Scientific- or public-use files are typically produced by applying anonymisation methods to the original data. Anonymised data should have both low disclosure risk and high data utility. Data utility is often measured by comparing well-known estimates from the original and the anonymised data, such as their means, covariances or eigenvalues. However, not every estimate can be preserved. The aim is therefore to preserve the most important estimates: instead of calculating generically defined utility measures, evaluation on context- and data-dependent indicators is proposed. In this article we define such indicators and utility measures for the Structure of Earnings Survey (SES) microdata, and give guidelines for selecting indicators and models and for evaluating the resulting estimates. For this purpose, hundreds of publications in journals and from national statistical agencies were reviewed to gain insight into how the SES data are used for research and which indicators are relevant for policy making. Besides the mathematical description of the indicators and a brief description of the most common models applied to the SES, four different anonymisation procedures are applied, and the resulting indicators and models are compared to those obtained from the unmodified data. The disclosure risk is reported, and the data utility is evaluated for each anonymised data set based on the most important indicators and a model that is often used in practice.
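The generic utility measures mentioned above (comparing means, covariances and eigenvalues) have a compact form. The following sketch is an illustrative, generic implementation, assuming the continuous survey variables are held in two equally shaped NumPy arrays; the SES-specific indicators developed in the article would replace or complement such measures.

```python
import numpy as np

def generic_utility(original: np.ndarray, anonymised: np.ndarray) -> dict:
    """Relative discrepancies between original and anonymised continuous variables.

    Both arrays are (records x variables); rows need not correspond one-to-one.
    """
    mu_o, mu_a = original.mean(axis=0), anonymised.mean(axis=0)
    rel_mean = np.abs(mu_o - mu_a) / np.maximum(np.abs(mu_o), 1e-12)
    cov_o = np.cov(original, rowvar=False)
    cov_a = np.cov(anonymised, rowvar=False)
    # Eigenvalues of the covariance matrices, largest first
    eig_o = np.linalg.eigvalsh(cov_o)[::-1]
    eig_a = np.linalg.eigvalsh(cov_a)[::-1]
    return {
        "max_relative_mean_error": float(rel_mean.max()),
        "relative_cov_error": float(np.linalg.norm(cov_o - cov_a) / np.linalg.norm(cov_o)),
        "max_relative_eigenvalue_error": float(
            np.max(np.abs(eig_o - eig_a) / np.maximum(eig_o, 1e-12))
        ),
    }
```

Values near zero indicate that the anonymisation left the corresponding estimate essentially intact.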


Author(s):  
Josep Domingo-Ferrer ◽  
Vicenç Torra

In statistical disclosure control of tabular data, sensitivity rules are commonly used to decide whether a table cell is sensitive and should therefore not be published. The most popular sensitivity rules are the dominance rule, the p%-rule and the pq-rule. The dominance rule has been criticised on the basis of specific numerical examples and is gradually being abandoned by leading statistical agencies. In this paper, we construct general counterexamples showing that none of the above rules adequately reflects disclosure risk if cell contributors, or coalitions of them, behave as intruders: in that case, releasing a cell declared non-sensitive can imply a higher disclosure risk than releasing a cell declared sensitive. As a possible solution, we propose an alternative sensitivity rule based on the concentration of relative contributions. More generally, we suggest complementing a priori risk assessment based on sensitivity rules with a posteriori risk assessment that takes into account tables after they have been protected.
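For reference, the three rules have simple closed forms: with contributions x1 ≥ x2 ≥ … and cell total T, the (n, k)-dominance rule flags a cell if its n largest contributions exceed k% of T, the p%-rule if T − x1 − x2 < (p/100)·x1, and the pq-rule if T − x1 − x2 < (p/q)·x1. The sketch below restates these textbook definitions; it does not reproduce the paper's counterexamples or the proposed concentration-based rule.

```python
def is_sensitive(contributions, rule="p%", p=10.0, q=100.0, n=2, k=85.0):
    """Flag a magnitude-table cell as sensitive under a common sensitivity rule."""
    x = sorted(contributions, reverse=True)
    total = sum(x)
    if len(x) < 2:
        return True                      # a lone contributor is disclosed exactly
    if rule == "dominance":              # (n, k)-rule: n largest exceed k% of total
        return sum(x[:n]) > (k / 100.0) * total
    if rule == "p%":                     # remainder lets x2 bound x1 within p%
        return total - x[0] - x[1] < (p / 100.0) * x[0]
    if rule == "pq":                     # prior q% knowledge sharpened to p%
        return total - x[0] - x[1] < (p / q) * x[0]
    raise ValueError(f"unknown rule: {rule}")
```

For example, is_sensitive([90, 5, 3, 2], rule="p%", p=10) returns True: the remaining contributions sum to 5, so the second-largest contributor could estimate the largest contribution to within 10% of its value.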


2014 ◽  
Vol 9 (1) ◽  
pp. 12-24
Author(s):  
Michael Comerford

The plethora of new data sources, combined with growing interest in increased access to previously unpublished data, poses a set of ethical challenges regarding individual privacy. This paper sets out one aspect of those challenges: the need to anonymise data in a form that protects the privacy of individuals while providing sufficient utility for data users. The issue is discussed using a case study of the Scottish Government's administrative data, in which disclosure risk is examined and data utility is assessed through a potential 'real-world' analysis.


Author(s):  
Deborah G. Mayo

In this chapter I shall discuss what seems to me to be a systematic ambiguity running through the large and complex risk-assessment literature. The ambiguity concerns the question of separability: can (and ought) risk assessment be separated from the policy values of risk management? Roughly, risk assessment is the process of estimating the risks associated with a practice or substance, and risk management is the process of deciding what to do about such risks. The separability question asks whether the empirical, scientific, and technical questions in estimating the risks either can or should be separated (conceptually or institutionally) from the social, political, and ethical questions of how the risks should be managed. For example, is it possible (advisable) for risk-estimation methods to be separated from social or policy values? Can (should) risk analysts work independently of policymakers (or at least of policy pressures)? The preponderant answer to the variants of the separability question in recent risk-research literature is no. Such denials of either the possibility or desirability of separation may be termed nonseparatist positions. What needs to be recognized, however, is that advocating a nonseparatist position masks radically different views about the nature of risk-assessment controversies and of how best to improve risk assessment. These nonseparatist views, I suggest, may be divided into two broad camps (although individuals in each camp differ in degree), which I label the sociological view and the metascientific view. The difference between the two may be found in what each finds to be problematic about any attempt to separate assessment and management. Whereas the former (sociological) view argues against separatist attempts on the grounds that they give too small a role to societal (and other nonscientific) values, the latter (metascientific) view does so on the grounds that they give too small a role to scientific and methodological understanding. Examples of those I place under the sociological view are the cultural reductionists discussed in the preceding chapter by Shrader-Frechette. Examples of those I place under the metascientific view are the contributors to this volume themselves. A major theme running through this volume is that risk assessment cannot and should not be separated from societal and policy values (e.g., Silbergeld's uneasy divorce).


Author(s):  
George Dragoi ◽  
Anca Draghici ◽  
Sebastian Marius Rosu ◽  
Alexandru Radovici ◽  
Costel Emil Cotet

The article presents research results based on the concept of a collaborative infrastructure (the virtual enterprise network PREMINV e-platform at the “Politehnica” University of Bucharest, Romania), intended to unify existing standards for supply chain management and to support various decision-making processes in manufacturing supply networks. The intent is to facilitate and enhance the knowledge management processes linked with business process management. The virtual enterprise network is expected to reduce the effort required for small and medium-sized enterprises to participate in networking, to enable better and faster decision processes, and to promote the development of business services. In addition, the new product development paradigm requires software tools for risk estimation and assessment. For this purpose, the authors describe a knowledge-based method built and used for professional risk assessment as part of the risk management process. The risk level is established from the probability of an event and the severity of its consequences.
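The final mapping from probability and severity to a risk level is conventionally done with a risk matrix. The sketch below is a generic, hypothetical example of such a mapping; the 5-point classes, score thresholds and labels are illustrative assumptions, not those of the PREMINV platform.

```python
def risk_level(probability: int, severity: int) -> str:
    """Combine probability and severity classes (1 = lowest .. 5 = highest)."""
    if not (1 <= probability <= 5 and 1 <= severity <= 5):
        raise ValueError("probability and severity classes must lie in 1..5")
    score = probability * severity   # risk score in 1..25
    if score <= 4:
        return "negligible"
    if score <= 8:
        return "low"
    if score <= 12:
        return "medium"
    if score <= 20:
        return "high"
    return "critical"
```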

