Maintaining Analytic Utility while Protecting Confidentiality of Survey and Nonsurvey Data

Author(s):  
Avinash C. Singh

Consider a complete rectangular database at the micro (or unit) level from a survey (sample or census) or nonsurvey (administrative source) in which potential identifying variables (IVs) are suitably categorized (so that the analytic utility is essentially maintained) for reducing the pretreatment disclosure risk to the extent possible. The pretreatment risk is due to the presence of unique records (with respect to IVs) or nonuniques (i.e., more than one record having a common IV profile) with similar values of at least one sensitive variable (SV). This setup covers macro (or aggregate) level data including tabular data because a common mean value (of 1 in the case of count data) to all units in the aggregation or cell can be assigned. Our goal is to create a public use file with simultaneous control of disclosure risk and information loss after disclosure treatment by perturbation (i.e., substitution of IVs and not SVs) and suppression (i.e., subsampling-out of records). In this paper, an alternative framework of measuring information loss and disclosure risk under a nonsynthetic approach as proposed by Singh (2002, 2006) is considered which, in contrast to the commonly used deterministic treatment, is based on a stochastic selection of records for disclosure treatment in the sense that all records are subject to treatment (with possibly different probabilities), but only a small proportion of them are actually treated. We also propose an extension of the above alternative framework of Singh with the goal of generalizing risk measures to allow partial risk scores for unique and nonunique records. Survey sampling techniques of sample allocation are used to assign substitution and subsampling rates to risk strata defined by unique and nonunique records such that bias due to substitution and variance due to subsampling for main study variables (functions of SVs and IVs) are minimized. 
This is followed by calibration to controls based on original estimates of main study variables so that these estimates are preserved, and bias and variance for other study variables may also be reduced. The above alternative framework leads to the method of disclosure treatment known as MASSC (signifying micro-agglomeration, substitution, subsampling, and calibration) and to an enhanced method (denoted GenMASSC) which uses generalized risk measures. The GenMASSC method is illustrated through a simple example followed by a discussion of relative merits and demerits of nonsynthetic and synthetic methods of disclosure treatment.
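The subsampling and calibration steps described above can be illustrated with a minimal sketch. It is not the MASSC or GenMASSC procedure itself: the stratum names, drop rates, record fields (`stratum`, `weight`, `y`), and the single-control ratio calibration are all assumptions made for illustration, and the substitution step and the optimal allocation of rates are omitted.

```python
import random

def subsample_and_calibrate(records, drop_rate, seed=0):
    """Randomly subsample records out by risk stratum, then ratio-calibrate.

    records   : list of dicts with hypothetical keys 'stratum', 'weight', 'y'
    drop_rate : dict mapping stratum -> probability of subsampling a record out
    Returns (treated_records, original_weighted_total_of_y).
    """
    rng = random.Random(seed)
    original_total = sum(r["weight"] * r["y"] for r in records)

    kept = []
    for r in records:
        p_keep = 1.0 - drop_rate[r["stratum"]]
        # Every record is subject to treatment, but only some are dropped.
        if rng.random() < p_keep:
            # Inflate the weight by the inverse retention probability.
            kept.append({**r, "weight": r["weight"] / p_keep})

    treated_total = sum(r["weight"] * r["y"] for r in kept)
    if treated_total == 0:
        raise ValueError("no usable records retained; cannot calibrate")

    # Ratio calibration to one control: the original estimate of y is preserved.
    factor = original_total / treated_total
    calibrated = [{**r, "weight": r["weight"] * factor} for r in kept]
    return calibrated, original_total

# Hypothetical data: uniques get a higher subsampling rate than nonuniques.
records = [
    {"stratum": "unique", "weight": 10.0, "y": 3.0},
    {"stratum": "unique", "weight": 12.0, "y": 1.0},
    {"stratum": "nonunique", "weight": 8.0, "y": 2.0},
    {"stratum": "nonunique", "weight": 9.0, "y": 4.0},
    {"stratum": "nonunique", "weight": 11.0, "y": 2.5},
    {"stratum": "nonunique", "weight": 7.0, "y": 1.5},
]
treated, control = subsample_and_calibrate(
    records, {"unique": 0.4, "nonunique": 0.1}
)
print(len(treated), control, sum(r["weight"] * r["y"] for r in treated))
```

After calibration the weighted total of `y` in the treated file equals the original total up to rounding, which is the sense in which the estimate of a main study variable is preserved; estimates of other variables are only approximately preserved.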

1957 ◽  
Vol 8 (4) ◽  
pp. 359 ◽  
Author(s):  
R Milford

For each of four subtropical grasses there is a significant correlation between daily dry matter intake and total nitrogen in faeces per day. The data have been tested for homogeneity under two hypotheses. In the first, a test of the difference in slope between the four regression lines showed that they were not statistically different. It was shown that for a common mean value for total faecal nitrogen, the calculated mean daily dry matter intakes of Paspalum commersonii Lam., Urochloa pullulans Stapf, and Chloris gayana Kunth were similar and the relationship for these three could be expressed by one regression line. However, the calculated mean daily dry matter intake for Panicum maximum var. trichoglume (K. Schum.) Eyles was significantly different from those for the other three grasses and P. maximum var. trichoglume cannot be included in a general regression. In the second hypothesis it was shown that all regression lines could pass through the origin. However, as in the first hypothesis, P. commersonii, U. pullulans, and C. gayana could be represented by a common regression line whilst the regression line for P. maximum differed significantly in slope from those of the other three grasses. The results indicate that species can be grouped for this relationship, and that it could be used to measure intake of the free grazing animal on monospecific swards or on mixed swards of species with similar relationships. Lancaster's technique for determining digestibility is discussed in the light of these relationships. Neither percentage faecal nitrogen nor faecal crude fibre was found to be satisfactorily correlated with dry matter digestibility.
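A regression line constrained to pass through the origin, as in the second hypothesis, has the least-squares slope b = Σxy / Σx². The sketch below shows that calculation and how a fitted slope could be used to estimate intake from faecal nitrogen. The function names and all numbers are hypothetical and are not taken from the paper.

```python
def slope_through_origin(x, y):
    """Least-squares slope of y = b * x (no intercept)."""
    sxx = sum(xi * xi for xi in x)
    if sxx == 0:
        raise ValueError("all x values are zero; slope is undefined")
    return sum(xi * yi for xi, yi in zip(x, y)) / sxx

def predict_intake(slope, faecal_n):
    """Estimated daily dry matter intake for a given daily faecal nitrogen."""
    return slope * faecal_n

# Hypothetical observations pooled across species that share one line:
# x = total faecal nitrogen per day, y = daily dry matter intake.
faecal_n = [4.0, 5.5, 6.0, 7.5]
intake = [410.0, 540.0, 615.0, 760.0]
b = slope_through_origin(faecal_n, intake)
print(b, predict_intake(b, 6.5))
```

Pooling observations is only reasonable for species whose individual slopes do not differ significantly, which is why a species with a different slope would need its own line.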


2020 ◽  
Vol 12 (7) ◽  
pp. 3005
Author(s):  
Xi Yang ◽  
Ke Song ◽  
Fuan Pu

This study collected and analyzed dynamic spatial data of eight traditional villages scattered in different regions of China. A multi-temporal analysis of morphological metrics of spatial patterns and a regression analysis of the morphological evolution were used to analyze and contrast the historical spatial processes of different villages. These were then compared using patch texture and rural macro-morphology perspectives. This led to an assessment of the general laws and trends associated with rural spatial processes. (1) There has been a significant shift in the stability of rural spatial development since the founding of the People’s Republic of China (PRC). (2) Most small and medium-sized villages have maintained a relatively stable spatial texture, while large villages have changed significantly. (3) The mean and variance of the patch area, and the Euclidean nearest-neighbor distance, are correlated in some cases. (4) The mode of rural expansion may be relevant to limitations in the total area of growth. (5) The fractal dimension of the rural macro-morphology may follow a morphological order of oscillation around the equilibrium level. (6) The common mean value of the projected area of rural building patches is expected to be 100 m².


2011 ◽  
Author(s):  
Marc Goovaerts ◽  
Daniël Linders ◽  
Koen Van Weert ◽  
Fatih Tank

2012 ◽  
Vol 51 (1) ◽  
pp. 10-18 ◽  
Author(s):  
Marc Goovaerts ◽  
Daniël Linders ◽  
Koen Van Weert ◽  
Fatih Tank

Data ◽  
2021 ◽  
Vol 6 (5) ◽  
pp. 53
Author(s):  
Ebaa Fayyoumi ◽  
Omar Alhuniti

This research investigates the micro-aggregation problem in secure statistical databases by integrating the divide and conquer concept with a genetic algorithm. This is achieved by recursively dividing a micro-data set into two subsets based on the proximity distance similarity. On each subset the genetic operation “crossover” is performed until the convergence condition is satisfied. The recursion terminates once the generated subset satisfies the size constraint. Finally, the genetic operation “mutation” is performed over all generated subsets that satisfy the variable group size constraint in order to maximize the objective function. Experimentally, the proposed micro-aggregation technique was applied to recommended real-life data sets. Results demonstrated a remarkable reduction in the computational time, which sometimes exceeded 70% compared to the state-of-the-art. Furthermore, a good equilibrium value of the Scoring Index (SI) was achieved by involving a linear combination of the General Information Loss (GIL) and the General Disclosure Risk (GDR).
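The recursive division step can be sketched as follows. This is only an illustration of divide-and-conquer micro-aggregation with a minimum group size k. The genetic crossover and mutation operations are left out, the seed-selection rule is an assumption, and information loss is measured with the commonly used ratio SSE/SST rather than the paper's GIL.

```python
import math

def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def split_by_proximity(points):
    """Split into two subsets around two far-apart seed records."""
    c = centroid(points)
    s1 = max(points, key=lambda p: math.dist(p, c))   # farthest from centre
    s2 = max(points, key=lambda p: math.dist(p, s1))  # farthest from s1
    a, b = [], []
    for p in points:
        (a if math.dist(p, s1) <= math.dist(p, s2) else b).append(p)
    return a, b

def microaggregate(points, k):
    """Recursively divide until no further split keeps both halves >= k."""
    if len(points) < 2 * k:
        return [points]
    a, b = split_by_proximity(points)
    if len(a) < k or len(b) < k:
        return [points]
    return microaggregate(a, k) + microaggregate(b, k)

def information_loss(groups):
    """SSE / SST: within-group over total squared distance (0 = no loss)."""
    all_points = [p for g in groups for p in g]
    overall = centroid(all_points)
    sst = sum(math.dist(p, overall) ** 2 for p in all_points)
    sse = sum(math.dist(p, centroid(g)) ** 2 for g in groups for p in g)
    return sse / sst if sst > 0 else 0.0

# Hypothetical two-cluster micro-data set with minimum group size k = 3.
data = [(1, 1), (1, 2), (2, 1), (10, 10), (10, 11), (11, 10)]
groups = microaggregate(data, k=3)
print([len(g) for g in groups], information_loss(groups))
```

Each record would then be published with its group centroid in place of its own values, so every released profile is shared by at least k records.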


2020 ◽  
Vol 8 (1) ◽  
pp. 19-37
Author(s):  
Kamaldeen Ibraheem Nageri ◽  
Umar Gunu

Corruption has a major impact on growth in low-income economies, while ease of doing business has a major impact on growth in developed countries. The study empirically examines the effect of corruption on ease of doing business. The study analyses unbalanced panel data of corruption rank, corruption score, control of corruption, and inflation, together with other economic and financial institutional factors and ease of doing business score for the period of 2004–2017. Results indicate that: corruption rank, inflation, and imports have a negative and significant effect on ease of doing business; corruption score, control of corruption, lending rate spread, and education (skill level) have a positive and significant effect on ease of doing business; gross capital formation and population have an insignificant negative effect on ease of doing business; exports and gross domestic product have an insignificant positive effect on ease of doing business. The random effect model is a consistent and most efficient model, indicating a common mean value for ease of doing business for the dataset. The study recommends improved corruption scores, control of corruption, and ranks to encourage ease of doing business through monetary policy and infrastructural facilities.


Author(s):  
Thomas R. Beck ◽  
Andrei Antohe ◽  
Francesco Cardellini ◽  
Alexandra Cucoş ◽  
Eliska Fialova ◽  
...  

An interlaboratory comparison for European radon calibration facilities was conducted to evaluate the establishment of a harmonized quality level for the activity concentration of radon in air and to demonstrate the performance of the facilities when calibrating measurement instruments for radon. Fifteen calibration facilities from 13 different European countries participated. They represented different levels in the metrological hierarchy: national metrology institutes and designated institutes, national authorities for radiation protection and participants from universities. The interlaboratory comparison was conducted by the German Federal Office for Radiation Protection (BfS) and took place from 2018 to 2020. Participants were requested to measure radon in atmospheres of their own facilities according to their own procedures and requirements for metrological traceability. A measurement device with suitable properties was used to determine the comparison values. The results of the comparison showed that the radon activity concentrations that were determined by European calibration facilities complying with metrological traceability requirements were consistent with each other and had common mean values. The deviations from these values were normally distributed. The range of variation of the common mean value was a measure of the degree of agreement between the participants. For exposures above 1000 Bq/m³, the variation was about 4% for a level of confidence of approximately 95% (k=2). For lower exposure levels, the variation increased to about 6%.
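One simple way to express a variation around a common mean at a coverage factor of k = 2 is sketched below. The abstract does not state how the comparison was evaluated, so the unweighted mean, the sample standard deviation, and the ratio values are assumptions for illustration and are not results from the study.

```python
import statistics

def common_mean_variation(values, k=2.0):
    """Return (common mean, relative variation in percent at coverage factor k).

    values: ratios of each facility's result to the comparison value
            (hypothetical input format).
    """
    mean = statistics.fmean(values)
    rel_sd = statistics.stdev(values) / mean  # relative sample standard deviation
    return mean, k * rel_sd * 100.0

# Hypothetical ratios from five facilities.
ratios = [0.98, 1.00, 1.02, 1.01, 0.99]
mean, variation_pct = common_mean_variation(ratios)
print(round(mean, 4), round(variation_pct, 2))
```

For normally distributed deviations, k = 2 corresponds to a level of confidence of approximately 95%, which is the convention quoted in the abstract.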

