Aggregation Methods to Evaluate Multiple Protected Versions of the Same Confidential Data Set

Author(s):  
Aïda Valls
Vicenç Torra
Josep Domingo-Ferrer


2004 · Vol 36 (3) · pp. 627-638
Author(s):
Lynn Hunnicutt
Dee Von Bailey
Michelle Crook

Concentration in beef packing has risen dramatically in the past 25 years. We develop two measures to describe feedlot-packer relations: (1) a statistic based on the proportion of its sales a feedlot makes to a given packer, and (2) a measure of the switching behavior of feedlots. The measures are calculated using a confidential data set from the USDA Grain Inspection, Packers, and Stockyards Administration. Relationships are found to be both exclusive and stable. Causes for this rigidity are then examined using regression analysis. Transaction costs are shown to help explain why this market differs from a perfectly competitive one.
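As a rough illustration of the two relationship measures, the sketch below computes an exclusivity share and a switching rate from hypothetical transaction records; the column names, the toy data, and the exact functional forms are assumptions, since the paper defines its statistics formally on the confidential GIPSA data.

```python
import pandas as pd

# Hypothetical transaction records: one row per feedlot-to-packer sale.
sales = pd.DataFrame({
    "feedlot": ["F1", "F1", "F1", "F2", "F2", "F2"],
    "packer":  ["P1", "P1", "P2", "P1", "P2", "P1"],
    "head":    [500, 300, 200, 100, 400, 150],
    "month":   [1, 2, 3, 1, 2, 3],
})

# (1) Exclusivity: share of each feedlot's sales going to its largest packer.
shares = sales.groupby(["feedlot", "packer"])["head"].sum()
top_share = shares.groupby("feedlot").apply(lambda s: s.max() / s.sum())

# (2) Switching: fraction of month-to-month transitions in which a
# feedlot's primary packer changes.
primary = (sales.groupby(["feedlot", "month", "packer"])["head"].sum()
                .groupby(["feedlot", "month"]).idxmax()
                .map(lambda t: t[-1]))  # keep only the packer label
switch_rate = primary.groupby("feedlot").apply(
    lambda p: (p != p.shift()).iloc[1:].mean())

print(top_share)    # near 1.0 => exclusive relationship
print(switch_rate)  # near 0.0 => stable relationship
```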


2021 · Vol ahead-of-print (ahead-of-print)
Author(s):
Zachary Hornberger
Bruce Cox
Raymond R. Hill

Purpose: Large/stochastic spatiotemporal demand data sets can prove intractable for location optimization problems, motivating the need for aggregation. However, demand aggregation induces errors. Significant theoretical research has been performed on the modifiable areal unit problem and the zone definition problem, but minimal research has addressed the issues specific to spatiotemporal demand data, such as search and rescue (SAR) data. This study provides a quantitative comparison of various aggregation methodologies and their relation to distance- and volume-based aggregation errors.

Design/methodology/approach: This paper introduces and applies a framework for comparing both deterministic and stochastic aggregation methods using distance- and volume-based aggregation error metrics. It additionally applies weighted versions of these metrics to account for the reality that demand events are nonhomogeneous. These metrics are applied to a large, highly variable, spatiotemporal demand data set of SAR events in the Pacific Ocean. Comparisons using these metrics are conducted between six quadrat aggregations of varying scales and two zonal distribution models using hierarchical clustering.

Findings: As quadrat fidelity increases, the distance-based aggregation error decreases, while the two deliberate zonal approaches reduce this error further while using fewer zones. However, the higher-fidelity aggregations detrimentally affect volume error. Additionally, by splitting the SAR data set into training and test sets, this paper shows the stochastic zonal distribution aggregation method is effective at simulating actual future demands.

Originality/value: This study indicates that no single best aggregation method exists; by quantifying the trade-offs in aggregation-induced errors, practitioners can use the method that minimizes the errors most relevant to their study. The study also quantifies the ability of a stochastic zonal distribution method to effectively simulate future demand data.
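A minimal sketch of the weighted, distance-based error metric under quadrat aggregation, assuming a simple definition (weighted mean distance from each event to its cell's weighted centroid); the study region, the toy demand data, and the grid scales are illustrative assumptions, not the paper's metrics.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical SAR event locations (x, y) with per-event weights,
# standing in for nonhomogeneous demand.
events = rng.uniform(0.0, 10.0, size=(1000, 2))
weights = rng.exponential(1.0, size=1000)

def quadrat_distance_error(points, w, n_cells):
    """Weighted mean distance from each event to its cell's weighted centroid."""
    edges = np.linspace(0.0, 10.0, n_cells + 1)
    ix = np.clip(np.digitize(points[:, 0], edges) - 1, 0, n_cells - 1)
    iy = np.clip(np.digitize(points[:, 1], edges) - 1, 0, n_cells - 1)
    err = 0.0
    for cx in range(n_cells):
        for cy in range(n_cells):
            mask = (ix == cx) & (iy == cy)
            if not mask.any():
                continue
            centroid = np.average(points[mask], axis=0, weights=w[mask])
            d = np.linalg.norm(points[mask] - centroid, axis=1)
            err += np.sum(w[mask] * d)
    return err / w.sum()

# Higher quadrat fidelity (more, smaller cells) lowers the distance error,
# mirroring the paper's finding; volume error would be tracked separately.
for n in (2, 4, 8, 16):
    print(n, quadrat_distance_error(events, weights, n))
```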


2013 · Vol 5 (1)
Author(s):
Joo Ho Lee
In Yong Kim
Christine M. O'Keefe

This paper concerns the use of synthetic data for protecting the confidentiality of business data during statistical analysis. Synthetic data sets are traditionally constructed by replacing sensitive values in a confidential data set with draws from statistical models estimated on the confidential data set. Unfortunately, the process of generating effective statistical models can be a difficult and labour-intensive task. Recently, it has been proposed to use easily-implemented methods from machine learning instead of statistical model estimation in the data synthesis task. J. Drechsler and J.P. Reiter (2011) conducted an evaluation of four such methods and found that regression trees could give rise to synthetic data sets which provide reliable analysis results as well as low disclosure risks. Their conclusion was based on simulations using a subset of the 2002 Uganda census public use file. It is an interesting question whether the same conclusion applies to other types of data with different characteristics, for example business data, which differ markedly from population census and survey data. In particular, business data generally have few variables that are mostly categorical, and often have highly skewed distributions with outliers. In this paper we investigate the applicability of regression-tree-based methods for constructing synthetic business data. We give a detailed example comparing exploratory data analysis and linear regression results under two variants of a regression-tree-based synthetic data approach. We also include an evaluation of the analysis results with respect to the results of analysis of the original data. We further investigate the impact of different stopping criteria on performance. While it is certainly true that any method designed to protect confidentiality introduces error, and may indeed give misleading conclusions, our analysis of the results for synthesisers based on CART models has provided some evidence that this error is not random but is due to the particular characteristics of business data. We conclude that more careful analysis needs to be done when applying these methods, and end users certainly need to be aware of possible discrepancies.
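The sketch below illustrates the general CART-synthesis idea the paper evaluates: fit a regression tree for a sensitive variable, then replace each record's value with a draw from the observed values in its leaf. The variables, the skew model, and the min_samples_leaf stopping criterion are illustrative assumptions, not the paper's actual specification.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
# Hypothetical skewed business data: turnover predicted from staff count,
# with the heavy right tail typical of business variables.
n = 500
df = pd.DataFrame({"staff": rng.integers(1, 200, n)})
df["turnover"] = df["staff"] * rng.lognormal(3.0, 1.0, n)

# Fit a CART model for the sensitive variable; min_samples_leaf plays the
# role of the stopping criterion whose impact the paper investigates.
tree = DecisionTreeRegressor(min_samples_leaf=25, random_state=0)
tree.fit(df[["staff"]], df["turnover"])

# Synthesis: each record's sensitive value is replaced by a draw from the
# observed values in the leaf that record falls into.
leaves = tree.apply(df[["staff"]])
synthetic = df.copy()
for leaf in np.unique(leaves):
    idx = np.where(leaves == leaf)[0]
    synthetic.loc[synthetic.index[idx], "turnover"] = rng.choice(
        df["turnover"].to_numpy()[idx], size=len(idx), replace=True)

# Compare a simple analysis on the original versus the synthetic data.
print(df["turnover"].describe())
print(synthetic["turnover"].describe())
```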


2014 · Vol 66 (1) · pp. 31-35
Author(s):
Berthold Koletzko
Mary Fewtrell
Robert Gibson
Johannes B. van Goudoever
Olle Hernell
...

This paper presents an updated and revised summary of the ‘core data set’ that has been proposed to be recorded and reported in all clinical trials on infant nutrition by the recently formed Consensus Group on Outcome Measures Made in Paediatric Enteral Nutrition Clinical Trials (COMMENT). This core data set was developed based on a previous proposal by the European Society for Paediatric Gastroenterology, Hepatology and Nutrition (ESPGHAN) Committee on Nutrition in 2003. It comprises confidential data to identify subjects and facilitate contact for further follow-up, data to characterize the cohort studied, data on withdrawals from the study, and some additional core data for all nutrition studies on preterm infants. We recommend that all studies on nutrition in infancy collect and report this core data set to facilitate interpretation and comparison of results from clinical studies, as well as systematic data evaluation and meta-analyses. Editors of journals publishing such reports are encouraged to require the reporting of the minimum data set described here, either in the main body of the publication or as supplementary online material. © 2014 S. Karger AG, Basel
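As a purely illustrative sketch of how such a core data set might be structured in code, the record below groups fields into the categories the summary names (confidential identifiers, cohort characterisation, withdrawals); the specific field names are assumptions, not the COMMENT consensus items.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CoreTrialRecord:
    # Confidential data: identify the subject and enable follow-up contact.
    subject_id: str
    contact_reference: str
    # Data characterising the cohort studied (illustrative fields only).
    gestational_age_weeks: float
    birth_weight_g: int
    sex: str
    feeding_group: str
    # Data on withdrawals from the study.
    withdrawn: bool = False
    withdrawal_reason: Optional[str] = None

# Example record, as a trial database might store it.
record = CoreTrialRecord("S001", "site-03/contact-17", 39.5, 3400,
                         "female", "intervention formula")
print(record)
```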


1994 · Vol 144 · pp. 139-141
Author(s):
J. Rybák
V. Rušin
M. Rybanský

Fe XIV 530.3 nm coronal emission line observations have been used to estimate the rotation of the green solar corona. A homogeneous data set, created from measurements of the world-wide coronagraphic network, has been examined with the help of correlation analysis to reveal the averaged synodic rotation period as a function of latitude and time over the epoch from 1947 to 1991.

The values of the synodic rotation period obtained for this epoch are 27.52±0.12 days for the whole range of latitudes and 26.95±0.21 days for the latitude band ±30°. A differential rotation of the green solar corona, with local period maxima around ±60° and a minimum of the rotation period at the equator, was confirmed. No clear cyclic variation of the rotation was found over the examined epoch, but monotonic trends are present in some time intervals.

A detailed investigation of the original data and their correlation functions has shown that the existence of sufficiently reliable tracers is not evident for the whole set of examined data. This should be taken into account in future, more precise estimations of the green corona rotation period.
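A minimal sketch of the correlation approach: for a daily intensity series at one latitude, the lag of the autocorrelation peak within a plausible window is read off as the synodic period. The synthetic series, the window bounds, and the peak criterion are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical daily Fe XIV 530.3 nm intensity series at one latitude:
# a ~27.5-day rotational modulation plus noise.
days = np.arange(2000)
series = np.sin(2 * np.pi * days / 27.5) + 0.8 * rng.normal(size=days.size)

def synodic_period(x, min_lag=20, max_lag=35):
    """Lag of the autocorrelation peak, taken as the rotation period in days."""
    x = x - x.mean()
    acf = np.correlate(x, x, mode="full")[x.size - 1:]
    acf /= acf[0]
    lag = min_lag + np.argmax(acf[min_lag:max_lag + 1])
    return lag, acf[lag]

lag, strength = synodic_period(series)
# A weak correlation peak would signal unreliable tracers, the caveat the
# authors raise about parts of the data set.
print(lag, strength)
```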


Author(s):  
Jules S. Jaffe
Robert M. Glaeser

Although difference Fourier techniques are standard in X-ray crystallography, it is only very recently that electron crystallographers have been able to take advantage of this method. We have combined a high-resolution data set for frozen, glucose-embedded Purple Membrane (PM) with a data set collected from PM prepared in the frozen hydrated state in order to visualize any differences in structure due to the different methods of preparation. The increased contrast between protein and ice, versus protein and glucose, may prove to be an advantage of the frozen hydrated technique for visualizing those parts of bacteriorhodopsin that are embedded in glucose. In addition, surface groups of the protein may be disordered in glucose and ordered in the frozen state. The sensitivity of the difference Fourier technique to small changes in structure provides an ideal method for testing this hypothesis.
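A schematic of the difference Fourier computation: subtract the two amplitude sets and invert with a single set of phases, so peaks in the resulting map mark where the two preparations differ. The grid size and the random stand-in data are assumptions; real work would use the measured amplitudes and refined phases.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical structure factors on a small 2D grid: amplitudes from the
# glucose-embedded and frozen hydrated data sets, with one common set of
# phases, as in a standard difference Fourier calculation.
shape = (64, 64)
F_glucose = rng.rayleigh(1.0, shape)
F_hydrated = F_glucose + 0.1 * rng.normal(size=shape)  # small differences
phases = rng.uniform(-np.pi, np.pi, shape)

# Difference Fourier: invert (|F_hydrated| - |F_glucose|) * exp(i*phi).
delta_F = (F_hydrated - F_glucose) * np.exp(1j * phases)
diff_map = np.fft.ifft2(delta_F).real

# Peaks in the map flag where the two preparations disagree in structure.
print(diff_map.std())
```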


Author(s):  
D. E. Becker

An efficient, robust, and widely-applicable technique is presented for computational synthesis of high-resolution, wide-area images of a specimen from a series of overlapping partial views. This technique can also be used to combine the results of various forms of image analysis, such as segmentation, automated cell counting, deblurring, and neuron tracing, to generate representations that are equivalent to processing the large wide-area image rather than the individual partial views. This can be a first step towards quantitation of the higher-level tissue architecture. The computational approach overcomes mechanical limitations of microscope stages, such as hysteresis and backlash, and automates a procedure that is currently done manually. One application is the high-resolution visualization and/or quantitation of large batches of specimens that are much wider than the field of view of the microscope.

The automated montage synthesis begins by computing a concise set of landmark points for each partial view. The type of landmarks used can vary greatly depending on the images of interest. In many cases, image analysis performed on each data set can provide useful landmarks. Even when no such “natural” landmarks are available, image processing can often provide useful landmarks.
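The sketch below shows one way the landmark step could drive registration: estimate the offset between two overlapping views as the median displacement of matched landmarks, then paste the views into a shared canvas. The coordinates, the known correspondences, and the translation-only model are simplifying assumptions.

```python
import numpy as np

# Hypothetical landmark coordinates (x, y) for the same features seen in
# two overlapping partial views; correspondences are assumed known here,
# whereas in practice they come from image analysis of each view.
landmarks_a = np.array([[120.0, 80.0], [200.0, 150.0], [90.0, 210.0]])
landmarks_b = landmarks_a - np.array([100.0, 60.0])  # stage-shifted view

# Robust translation estimate: median displacement of matched landmarks.
offset = np.median(landmarks_a - landmarks_b, axis=0)

def place(canvas, tile, origin):
    """Paste a partial view into the wide-area canvas at an integer origin."""
    y, x = int(round(origin[1])), int(round(origin[0]))
    canvas[y:y + tile.shape[0], x:x + tile.shape[1]] = tile

canvas = np.zeros((600, 600))
view_a = np.ones((256, 256))
view_b = np.full((256, 256), 2.0)
place(canvas, view_a, (0, 0))
place(canvas, view_b, offset)  # registered by the landmark offset
print(offset)  # (100.0, 60.0): the mechanical stage error, recovered
```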


Author(s):  
Jaap Brink
Wah Chiu

Crotoxin complex is the principal neurotoxin of the South American rattlesnake, Crotalus durissus terrificus, and has a molecular weight of 24 kDa. The protein is a heterodimer, with subunit A assigned a chaperone function. Subunit B carries the lethal activity, which is exerted on both sides of the neuro-muscular junction and which is thought to involve binding to the acetylcholine receptor. Insight into the crotoxin complex's mode of action can be gained from a 3 Å resolution structure obtained by electron crystallography. This abstract communicates our progress in merging the electron diffraction amplitudes into a 3-dimensional (3D) intensity data set, now close to completion. Since the thickness of crotoxin complex crystals varies from one crystal to the other, we chose to collect tilt series of electron diffraction patterns after determining their thickness. Furthermore, by making use of the symmetry present in these tilt data, only intensities collected from similar crystals will be merged.

Suitable crystals of glucose-embedded crotoxin complex were searched for in the defocussed diffraction mode, with the goniometer tilted to 55° or higher, in a JEOL4000 electron cryo-microscope operated at 400 kV with the crystals kept at -120°C in a Gatan 626 cryo-holder. The crystal thickness was measured using the local contrast of the crystal relative to the supporting film from search-mode images acquired using a 1024 x 1024 slow-scan CCD camera (model 679, Gatan Inc.).
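A toy sketch of the merging step: bring each crystal's intensities onto a common scale against a reference crystal, then average symmetry-equivalent measurements. The reflections, the linear scaling model, and the R-merge-style residual are illustrative assumptions; the symmetry reduction to a unique set of indices is omitted.

```python
import numpy as np
import pandas as pd

# Hypothetical measured intensities from two crystals; (h, k) are assumed
# already reduced to symmetry-unique indices before merging.
obs = pd.DataFrame({
    "crystal": [0, 0, 0, 1, 1, 1],
    "h": [1, 2, 3, 1, 2, 3],
    "k": [0, 1, 1, 0, 1, 1],
    "I": [100.0, 250.0, 60.0, 55.0, 130.0, 33.0],
})

# Per-crystal least-squares scale factor against crystal 0 as reference.
ref = obs[obs.crystal == 0].set_index(["h", "k"])["I"]
def scale_factor(group):
    common = group.set_index(["h", "k"])["I"].reindex(ref.index).dropna()
    return (ref.loc[common.index] * common).sum() / (common ** 2).sum()

scales = obs.groupby("crystal").apply(scale_factor)
obs["I_scaled"] = obs["I"] * obs["crystal"].map(scales)

# Merge equivalent measurements and report an R-merge-style residual.
merged = obs.groupby(["h", "k"])["I_scaled"].mean()
obs = obs.merge(merged.rename("I_merged").reset_index(), on=["h", "k"])
r_merge = (obs["I_scaled"] - obs["I_merged"]).abs().sum() / obs["I_scaled"].sum()
print(merged)
print(r_merge)  # low value => the crystals merge consistently
```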


Author(s):  
J. K. Samarabandu
R. Acharya
D. R. Pareddy
P. C. Cheng

In the study of cell organization in a maize meristem, direct viewing of confocal optical sections in 3D (by means of 3D projection of the volumetric data set, Figure 1) becomes very difficult and confusing because of the large number of nuclei involved. Numerical description of the cellular organization (e.g. position, size and orientation of each structure) and computer graphic presentation are some of the solutions to effectively study the structure of such a complex system. An attempt at data reduction by means of manually contouring cell nuclei in 3D was reported (Summers et al., 1990). Apart from being labour intensive, this 3D digitization technique suffers from the inaccuracies of manual 3D tracing related to the depth perception of the operator. However, it does demonstrate that reducing a stack of confocal images to a 3D graphic representation helps to visualize and analyze complex tissues (Figure 2). This procedure also significantly reduces the computational burden in an interactive operation.
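As an illustration of automating that data reduction, the sketch below labels bright blobs in a synthetic 3D stack and reduces each to a centroid and voxel count; the threshold, the blob model, and the volume size are assumptions standing in for real confocal data.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(3)
# Hypothetical confocal volume: a few bright blobs ("nuclei") on a noisy
# background, in place of a real image stack.
vol = rng.normal(0.0, 0.1, (40, 128, 128))
zz, yy, xx = np.ogrid[:40, :128, :128]
for z, y, x in [(10, 30, 40), (20, 80, 60), (30, 50, 100)]:
    vol += np.exp(-((zz - z)**2 + (yy - y)**2 + (xx - x)**2) / 18.0)

# Segment and label nuclei, then reduce the stack to numerical descriptors
# (position and size) instead of hand-contouring each nucleus.
labels, n = ndimage.label(vol > 0.6)
centroids = ndimage.center_of_mass(vol, labels, index=range(1, n + 1))
sizes = ndimage.sum_labels(vol > 0.6, labels, index=range(1, n + 1))
for c, s in zip(centroids, sizes):
    print(np.round(c, 1), int(s))  # (z, y, x) centroid and voxel count
```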


Author(s):  
M. Shlepr
C. M. Vicroy

The microelectronics industry is heavily tasked with minimizing contaminants at all steps of the manufacturing process. Particles are generated by physical and/or chemical fragmentation from a mother source. The tools and macrovolumes of chemicals used for processing, the environment surrounding the process, and the circuits themselves are all potential particle sources. A first step in eliminating these contaminants is to identify their source. Elemental analysis of the particles often proves useful toward this goal, and energy dispersive spectroscopy (EDS) is a commonly used technique. However, the large variety of source materials and process-induced changes in the particles often make it difficult to discern whether the particles are from a common source.

Ordination is commonly used in ecology to understand community relationships. This technique uses pair-wise measures of similarity. Separation of the data set is based on discrimination functions. The end product is a spatial representation of the data, with the distance between points equaling the degree of dissimilarity.
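A compact sketch of the ordination idea applied to EDS data: build a pair-wise dissimilarity matrix from elemental compositions and project it to 2D with classical multidimensional scaling, so particles from a common source cluster together. The compositions and the Euclidean dissimilarity measure are illustrative assumptions.

```python
import numpy as np

# Hypothetical EDS compositions (rows: particles, columns: element fractions).
X = np.array([
    [0.70, 0.20, 0.10],   # particle 1
    [0.68, 0.22, 0.10],   # particle 2: likely same source as particle 1
    [0.10, 0.30, 0.60],   # particle 3: different source
    [0.12, 0.28, 0.60],   # particle 4
])

# Pair-wise dissimilarity matrix (Euclidean distance between compositions).
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

# Classical MDS / principal coordinates: double-centre the squared
# distances and take the leading eigenvectors as ordination axes.
n = D.shape[0]
J = np.eye(n) - np.ones((n, n)) / n
B = -0.5 * J @ (D ** 2) @ J
vals, vecs = np.linalg.eigh(B)
order = np.argsort(vals)[::-1]
coords = vecs[:, order[:2]] * np.sqrt(np.maximum(vals[order[:2]], 0.0))

# Distance between points now approximates compositional dissimilarity;
# tight clusters suggest particles from a common source.
print(np.round(coords, 3))
```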

