Random disclosure in confidential statistical databases

2020 ◽  
pp. 1-13
Author(s):  
Rainer Lenz ◽  
Tim Hochgürtel

As part of statistical disclosure control, National Statistical Offices may release confidential data only if the data are sufficiently protected to comply with national legislation. When releasing confidential microdata to users, data holders usually apply so-called anonymisation methods to the data. To check that privacy requirements are met, the level of privacy of a confidential data file can be measured by simulating potential data intrusion scenarios, in which publicly or commercially available data are matched against the entire set of confidential data, with both sources sharing a non-empty set of variables (quasi-identifiers). In real-world microdata, incompatibilities between data sets and non-unique combinations of quasi-identifiers are very likely. In this situation, it is nearly impossible to decide whether or not two records refer to the same underlying statistical unit. Even a correct assignment of records may be a fruitless disclosure attempt if a rational data intruder would refrain from relying on that match. The paper points out that disclosure risks estimated thus far are overstated, in the sense that revealed information is always a combination of systematically derived results and a non-negligible share of random assignment.
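The role of non-unique quasi-identifier combinations can be illustrated with a small sketch. It is not the authors' method: the toy records, the variable names (`region`, `sector`, `size`) and both helper functions are hypothetical, and it assumes exact matching on all quasi-identifiers with the true record always among the candidates.

```python
def match_candidates(external, confidential, quasi_identifiers):
    """For each external record, list the indices of confidential records
    that agree with it on every quasi-identifier (exact matching)."""
    index = {}
    for i, rec in enumerate(confidential):
        key = tuple(rec[q] for q in quasi_identifiers)
        index.setdefault(key, []).append(i)
    return [index.get(tuple(rec[q] for q in quasi_identifiers), [])
            for rec in external]


def expected_correct_by_chance(candidates):
    """Expected number of correct assignments if the intruder picks one
    candidate uniformly at random (assumes the true record is a candidate)."""
    return sum(1.0 / len(c) for c in candidates if c)


# Hypothetical toy data: two confidential records share the same key.
confidential = [
    {"region": "N", "sector": "A", "size": "small"},
    {"region": "N", "sector": "A", "size": "small"},
    {"region": "S", "sector": "B", "size": "large"},
    {"region": "S", "sector": "A", "size": "small"},
]
external = [
    {"region": "N", "sector": "A", "size": "small"},  # ambiguous: two candidates
    {"region": "S", "sector": "B", "size": "large"},  # unique: one candidate
    {"region": "E", "sector": "C", "size": "large"},  # incompatible: no candidate
]

candidates = match_candidates(external, confidential, ["region", "sector", "size"])
print(candidates)
print(expected_correct_by_chance(candidates))
```

In this toy case only one of the three external records has a unique candidate. A correct assignment for the ambiguous record would be partly a matter of chance, which is the kind of random component the abstract refers to.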

2012 ◽  
Vol 9 (1) ◽  
Author(s):  
Neeraj Tiwari

The most common method of providing data to the public is through statistical tables. The problem of protecting confidentiality in statistical tables containing sensitive information has been of great concern in recent years. Rounding methods are perturbation techniques widely used by statistical agencies to protect confidential data. Random rounding is one of these methods. In this paper, using the technique of random rounding and quadratic programming, we introduce a new methodology for protecting the confidential information of tabular data with minimum loss of information. The tables obtained through the proposed method consist of unbiasedly rounded values, are additive and have a specified level of confidentiality protection. Some numerical examples are also discussed to demonstrate the superiority of the proposed procedure over the existing procedures.
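As a rough illustration of unbiased random rounding only, the following sketch rounds a single cell value to a multiple of a base. It does not reproduce the paper's quadratic programming step, so the rounded cells are not guaranteed to be additive; the function name `random_round` and the base of 5 are assumptions chosen for the example.

```python
import random


def random_round(value, base=5, rng=random):
    """Round `value` to an adjacent multiple of `base`.

    The value is rounded up with probability remainder / base and down
    otherwise, so the expected rounded value equals the original value.
    """
    remainder = value % base
    if remainder == 0:
        return value  # already a multiple of the base
    lower = value - remainder
    if rng.random() < remainder / base:
        return lower + base
    return lower


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [random_round(7, base=5, rng=rng) for _ in range(20000)]
print(sorted(set(samples)))         # the two adjacent multiples of 5
print(sum(samples) / len(samples))  # close to the original value 7
```

Because each cell is rounded independently here, row and column totals of a rounded table will generally not add up, which is the gap the additional optimisation step described in the abstract is meant to close.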


2010 ◽  
Vol 37 (4) ◽  
pp. 3256-3263 ◽  
Author(s):  
Jun-Lin Lin ◽  
Tsung-Hsien Wen ◽  
Jui-Chien Hsieh ◽  
Pei-Chann Chang

2020 ◽  
Vol 3 (348) ◽  
pp. 7-24
Author(s):  
Michał Pietrzak

The aim of this article is to analyse the possibility of applying selected perturbative masking methods of Statistical Disclosure Control to microdata, i.e. unit‑level data from the Labour Force Survey. In the first step, the author assessed to what extent the confidentiality of information was protected in the original dataset. In the second step, selected methods implemented in the sdcMicro package for R were applied, and their impact on the disclosure risk, the loss of information and the quality of estimation of population quantities was assessed. The conclusion highlights some problematic aspects of the use of Statistical Disclosure Control methods which were observed during the analysis.
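One simple perturbative masking method is additive noise. The sketch below is a generic Python illustration and does not use or mirror the sdcMicro API; the function names, the `noise_share` parameter, the wage values and the choice of the relative change in the mean as an information-loss measure are all assumptions made for the example.

```python
import random
import statistics


def add_noise(values, noise_share=0.1, seed=0):
    """Mask a numeric variable by adding Gaussian noise whose standard
    deviation is `noise_share` times the variable's standard deviation."""
    rng = random.Random(seed)
    sd = statistics.pstdev(values)
    return [v + rng.gauss(0.0, noise_share * sd) for v in values]


def relative_mean_change(original, masked):
    """A crude information-loss measure: relative change in the mean."""
    m = statistics.mean(original)
    return abs(statistics.mean(masked) - m) / abs(m)


# Hypothetical monthly wages for eight respondents.
wages = [2100, 2500, 2800, 3200, 3600, 4100, 4700, 5000]
masked = add_noise(wages)
print([round(v) for v in masked])
print(relative_mean_change(wages, masked))
```

Raising `noise_share` makes individual values harder to re-identify but increases the distortion of estimates, which is the trade-off between disclosure risk and information loss that the article assesses.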

