Releasing Microdata: Disclosure Risk Estimation, Data Masking and Assessing Utility

Author(s):  
Natalie Shlomo

Statistical agencies need to make informed decisions when releasing sample microdata from social surveys, both about the level of protection required in the data and about the mode of access. These decisions should be based on objective, quantitative measures of disclosure risk and data utility. This paper reviews recent developments in disclosure risk assessment and discusses how these can be integrated with established methods of data masking and utility assessment for releasing microdata. We illustrate the disclosure risk–data utility approach on samples drawn from a census in which the population is known and can therefore be used to investigate sample-based methods and validate results.
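The census setting makes the standard file-level risk measures directly checkable: with key variables observed in both the sample and the population, one can count how many sample-unique records are also population unique, and how many correct matches an intruder could expect. The sketch below is a minimal illustration of these two measures, assuming (hypothetically) that the sample and the population are available as pandas DataFrames sharing the key columns; it is not the estimator developed in the paper.

```python
import pandas as pd

def risk_measures(sample: pd.DataFrame, population: pd.DataFrame, keys: list) -> dict:
    """File-level disclosure risk from sample counts f_k and population counts F_k."""
    f = sample.groupby(keys).size().rename("f")
    F = population.groupby(keys).size().rename("F")
    cells = pd.concat([f, F], axis=1).dropna()   # key combinations seen in the sample
    uniques = cells[cells["f"] == 1]             # sample-unique records
    return {
        "sample_uniques": len(uniques),
        # sample uniques that are also population unique: re-identification is certain
        "population_uniques": int((uniques["F"] == 1).sum()),
        # expected number of correct matches when each sample unique is linked
        # to a randomly chosen individual in its population cell
        "expected_matches": float((1.0 / uniques["F"]).sum()),
    }
```

Typical keys here would be coarse identifiers such as age band, sex and region; masking aims to drive both counts down while the utility measures stay acceptable.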

2015 ◽  
Vol 31 (4) ◽  
pp. 737-761
Author(s):  
Matthias Templ

Abstract Scientific- or public-use files are typically produced by applying anonymisation methods to the original data. Anonymised data should have both low disclosure risk and high data utility. Data utility is often measured by comparing well-known estimates from the original and the anonymised data, such as their means, covariances or eigenvalues. However, not every estimate can be preserved. The aim is therefore to preserve the most important estimates: instead of calculating generically defined utility measures, evaluation on context- and data-dependent indicators is proposed. In this article we define such indicators and utility measures for the Structure of Earnings Survey (SES) microdata, and give guidelines for selecting indicators and models and for evaluating the resulting estimates. For this purpose, hundreds of publications in journals and from national statistical agencies were reviewed to gain insight into how the SES data are used for research and which indicators are relevant for policy making. Besides the mathematical description of the indicators and a brief description of the most common models applied to the SES, four different anonymisation procedures are applied, and the resulting indicators and models are compared to those obtained from the unmodified data. The disclosure risk is reported, and the data utility is evaluated for each anonymised data set based on the most important indicators and a model that is often used in practice.
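The generic utility measures mentioned above (comparing means, covariances and eigenvalues) have a compact form. The following sketch is an illustrative, generic implementation, assuming the continuous survey variables are held in two equally shaped NumPy arrays; the SES-specific indicators developed in the article would replace or complement such measures.

```python
import numpy as np

def generic_utility(original: np.ndarray, anonymised: np.ndarray) -> dict:
    """Relative discrepancies between original and anonymised continuous variables.

    Both arrays are (records x variables); rows need not correspond one-to-one.
    """
    mu_o, mu_a = original.mean(axis=0), anonymised.mean(axis=0)
    rel_mean = np.abs(mu_o - mu_a) / np.maximum(np.abs(mu_o), 1e-12)
    cov_o = np.cov(original, rowvar=False)
    cov_a = np.cov(anonymised, rowvar=False)
    # Eigenvalues of the covariance matrices, largest first
    eig_o = np.linalg.eigvalsh(cov_o)[::-1]
    eig_a = np.linalg.eigvalsh(cov_a)[::-1]
    return {
        "max_relative_mean_error": float(rel_mean.max()),
        "relative_cov_error": float(np.linalg.norm(cov_o - cov_a) / np.linalg.norm(cov_o)),
        "max_relative_eigenvalue_error": float(
            np.max(np.abs(eig_o - eig_a) / np.maximum(eig_o, 1e-12))
        ),
    }
```

Values near zero indicate that the anonymisation left the corresponding estimate essentially intact.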


Author(s):  
Josep Domingo-Ferrer ◽  
Vicenç Torra

In statistical disclosure control of tabular data, sensitivity rules are commonly used to decide whether a table cell is sensitive and should therefore not be published. The most popular sensitivity rules are the dominance rule, the p%-rule and the pq-rule. The dominance rule has been criticised on the basis of specific numerical examples and is gradually being abandoned by leading statistical agencies. In this paper, we construct general counterexamples showing that none of the above rules adequately reflects disclosure risk if cell contributors, or coalitions of them, behave as intruders: in that case, releasing a cell declared non-sensitive can imply a higher disclosure risk than releasing a cell declared sensitive. As a possible solution, we propose an alternative sensitivity rule based on the concentration of relative contributions. More generally, we suggest complementing a priori risk assessment based on sensitivity rules with a posteriori risk assessment that takes into account tables after they have been protected.
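For reference, the three rules have simple closed forms: with contributions x1 ≥ x2 ≥ … and cell total T, the (n, k)-dominance rule flags a cell if its n largest contributions exceed k% of T, the p%-rule if T − x1 − x2 < (p/100)·x1, and the pq-rule if T − x1 − x2 < (p/q)·x1. The sketch below restates these textbook definitions; it does not reproduce the paper's counterexamples or the proposed concentration-based rule.

```python
def is_sensitive(contributions, rule="p%", p=10.0, q=100.0, n=2, k=85.0):
    """Flag a magnitude-table cell as sensitive under a common sensitivity rule."""
    x = sorted(contributions, reverse=True)
    total = sum(x)
    if len(x) < 2:
        return True                      # a lone contributor is disclosed exactly
    if rule == "dominance":              # (n, k)-rule: n largest exceed k% of total
        return sum(x[:n]) > (k / 100.0) * total
    if rule == "p%":                     # remainder lets x2 bound x1 within p%
        return total - x[0] - x[1] < (p / 100.0) * x[0]
    if rule == "pq":                     # prior q% knowledge sharpened to p%
        return total - x[0] - x[1] < (p / q) * x[0]
    raise ValueError(f"unknown rule: {rule}")
```

For example, is_sensitive([90, 5, 3, 2], rule="p%", p=10) returns True: the remaining contributions sum to 5, so the second-largest contributor could estimate the largest contribution to within 10% of its value.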


2014 ◽  
Vol 9 (1) ◽  
pp. 12-24
Author(s):  
Michael Comerford

The plethora of new data sources, combined with growing interest in increased access to previously unpublished data, poses a set of ethical challenges regarding individual privacy. This paper sets out one aspect of those challenges: the need to anonymise data in a form that protects the privacy of individuals while providing sufficient utility for data users. The issue is discussed using a case study of the Scottish Government's administrative data, in which disclosure risk is examined and data utility is assessed through a potential 'real-world' analysis.


Author(s):  
Deborah G. Mayo

In this chapter I shall discuss what seems to me to be a systematic ambiguity running through the large and complex risk-assessment literature. The ambiguity concerns the question of separability: can (and ought) risk assessment be separated from the policy values of risk management? Roughly, risk assessment is the process of estimating the risks associated with a practice or substance, and risk management is the process of deciding what to do about such risks. The separability question asks whether the empirical, scientific, and technical questions in estimating the risks either can or should be separated (conceptually or institutionally) from the social, political, and ethical questions of how the risks should be managed. For example, is it possible (advisable) for risk-estimation methods to be separated from social or policy values? Can (should) risk analysts work independently of policymakers (or at least of policy pressures)? The preponderant answer to the variants of the separability question in recent risk-research literature is no. Such denials of either the possibility or desirability of separation may be termed nonseparatist positions. What needs to be recognized, however, is that advocating a nonseparatist position masks radically different views about the nature of risk-assessment controversies and of how best to improve risk assessment. These nonseparatist views, I suggest, may be divided into two broad camps (although individuals in each camp differ in degree), which I label the sociological view and the metascientific view. The difference between the two may be found in what each finds to be problematic about any attempt to separate assessment and management. Whereas the former (sociological) view argues against separatist attempts on the grounds that they give too small a role to societal (and other nonscientific) values, the latter (metascientific) view does so on the grounds that they give too small a role to scientific and methodological understanding. Examples of those I place under the sociological view are the cultural reductionists discussed in the preceding chapter by Shrader-Frechette. Examples of those I place under the metascientific view are the contributors to this volume themselves. A major theme running through this volume is that risk assessment cannot and should not be separated from societal and policy values (e.g., Silbergeld's uneasy divorce).


Author(s):  
George Dragoi ◽  
Anca Draghici ◽  
Sebastian Marius Rosu ◽  
Alexandru Radovici ◽  
Costel Emil Cotet

The article presents research results based on the concept of a collaborative infrastructure (the virtual enterprise network PREMINV e-platform at the “Politehnica” University of Bucharest, Romania), intended to unify existing standards for supply chain management and to support various decision-making processes in manufacturing supply networks. The intent is to facilitate and enhance the knowledge management processes linked with business process management. The virtual enterprise network is expected to reduce the effort required for small and medium-sized enterprises to participate in networking, to enable better and faster decision processes, and to promote the development of business services. In addition, the new product development paradigm requires software tools for risk estimation and assessment. For this purpose, the authors describe a knowledge-based method built and used for professional risk assessment as part of the risk management process. The risk level is established from the probability of an event and the severity of its consequences.
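The final mapping from probability and severity to a risk level is conventionally done with a risk matrix. The sketch below is a generic, hypothetical example of such a mapping; the 5-point classes, score thresholds and labels are illustrative assumptions, not those of the PREMINV platform.

```python
def risk_level(probability: int, severity: int) -> str:
    """Combine probability and severity classes (1 = lowest .. 5 = highest)."""
    if not (1 <= probability <= 5 and 1 <= severity <= 5):
        raise ValueError("probability and severity classes must lie in 1..5")
    score = probability * severity   # risk score in 1..25
    if score <= 4:
        return "negligible"
    if score <= 8:
        return "low"
    if score <= 12:
        return "medium"
    if score <= 20:
        return "high"
    return "critical"
```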

