Re-Identification risk in anonymized data sets with parent-child information

Privacy preservation in data mining and publishing plays a major role in today's networked world. It is important to preserve the privacy of the vital information in a data set. This can be achieved with a k-anonymization solution for classification. Along with preserving privacy through anonymization, producing optimized data sets with a cost-effective approach is equally important. In this paper, a Top-Down Refinement algorithm is proposed that yields optimum results in a cost-effective manner. Bayesian classification is also proposed to predict class membership probabilities for a data tuple whose class label is unknown.
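As a rough illustration of the k-anonymity property this abstract relies on, the following Python sketch checks whether every combination of quasi-identifier values occurs at least k times. It is not the paper's Top-Down Refinement algorithm; the function name `is_k_anonymous`, the field names, and the sample records are all hypothetical.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

# Hypothetical records that have already been generalized
# (age bands and truncated ZIP codes); not data from the paper.
records = [
    {"age": "30-39", "zip": "476**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "476**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "479**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "479**", "diagnosis": "diabetes"},
]

# Each (age, zip) group holds exactly two records.
print(is_k_anonymous(records, ["age", "zip"], k=2))
print(is_k_anonymous(records, ["age", "zip"], k=3))
```

A refinement-style anonymizer would presumably call a check like this after each generalization step, but that loop is beyond what the abstract describes.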

An Improved Classification Analysis on Utility Aware K-Anonymized Dataset

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2019.7748 ◽

2019 ◽

Vol 16 (2) ◽

pp. 445-452

Author(s):

Kishore S. Verma ◽

A. Rajesh ◽

Adeline J. S. Johnsana

Keyword(s):

Data Mining ◽

Analytical Approach ◽

Value Added ◽

Data Sets ◽

Data Set ◽

Privacy Preserving Data Mining ◽

Privacy Leakage ◽

Anonymized Data ◽

Null Values ◽

The Individual

K-anonymization is one of the most widely used approaches for protecting individual records against privacy leakage attacks in the Privacy Preserving Data Mining (PPDM) arena. Typically, an anonymized data set reduces the effectiveness of data mining results. PPDM researchers therefore continue to direct their efforts toward finding the optimum trade-off between privacy and utility. This work aims to identify the optimum classifier, from a set of the best data mining classifiers, that can generate value-added classification results on utility-aware k-anonymized data sets. We performed the analysis on data sets anonymized with regard to anonymity utility factors such as null-value count and transformation pattern loss. The experiments use three widely used classifiers, HNB, PART and J48, which are analysed with Accuracy, F-measure, and ROC-AUC, measures that are well established in the literature for evaluating classification. Our experimental analysis reveals the best classifiers on the utility-aware anonymized data sets of Cell-oriented Anonymization (CoA), Attribute-oriented Anonymization (AoA) and Record-oriented Anonymization (RoA).
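For readers unfamiliar with the three evaluation measures named above, here is a minimal pure-Python sketch of Accuracy, F-measure, and ROC-AUC for a binary task. The labels, scores, and the 0.5 threshold are made up for illustration and do not come from the paper's experiments; the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f_measure(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def roc_auc(y_true, scores, positive=1):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Illustrative labels and classifier scores (assumed values).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(accuracy(y_true, y_pred), f_measure(y_true, y_pred), roc_auc(y_true, scores))
```

Accuracy and F-measure depend on the chosen threshold, whereas ROC-AUC is computed from the ranking of the scores alone.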

ZELDA: a 3D Image Segmentation and Parent-Child relation plugin for microscopy image analysis in napari

10.1101/2021.10.24.465596 ◽

2021 ◽

Author(s):

Rocco D'Antuono ◽

Giuseppina Pisignano

Keyword(s):

Image Analysis ◽

Image Segmentation ◽

Open Source ◽

Cell Number ◽

Cell Segmentation ◽

Data Sets ◽

3D Analysis ◽

3D Segmentation ◽

Analysis Workflow ◽

Parent Child

Bioimage analysis workflows allow the measurement of sample properties such as fluorescence intensity and polarization, cell number, and vesicle distribution, but often require the integration of multiple software tools. Furthermore, it is increasingly appreciated that, to overcome the limitations of 2D-view-based image analysis approaches and to correctly understand and interpret biological processes, a 3D segmentation of microscopy data sets becomes imperative. Despite the availability of numerous algorithms for 2D and 3D segmentation, the latter still poses challenges for end-users, who often have neither extensive knowledge of the existing software nor the coding skills to link the output of multiple tools. While several commercial packages are available on the market, fewer open-source solutions are able to execute a complete 3D analysis workflow. Here we present ZELDA, a new napari plugin that easily integrates the cutting-edge solutions offered by the Python ecosystem, such as scikit-image for image segmentation, matplotlib for data visualization, and the napari multi-dimensional image viewer for 3D rendering. This plugin aims to provide interactive and zero-scripting customizable workflows for cell segmentation, vesicle counting, parent-child relation between objects, signal quantification, and results presentation; all included in the same open-source napari viewer, and 'few clicks away'.
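To make the "parent-child relation between objects" concrete, the sketch below assigns each child object (for example, a vesicle) to the parent object (for example, a cell) it overlaps most, given two 3D label images of the same shape. This is an assumed, simplified reading of the idea in plain Python with nested lists; it is not ZELDA's implementation, and the function names and toy label volumes are hypothetical.

```python
from collections import Counter, defaultdict

def relate_children_to_parents(parent_labels, child_labels):
    """Map each child label to the parent label it overlaps most (0 = background)."""
    votes = defaultdict(Counter)
    for parent_plane, child_plane in zip(parent_labels, child_labels):
        for parent_row, child_row in zip(parent_plane, child_plane):
            for parent_id, child_id in zip(parent_row, child_row):
                if parent_id != 0 and child_id != 0:
                    votes[child_id][parent_id] += 1
    return {child: counts.most_common(1)[0][0] for child, counts in votes.items()}

def count_children(child_to_parent):
    """Number of child objects assigned to each parent label."""
    return Counter(child_to_parent.values())

# Toy label volumes indexed as [z][y][x]; values are object labels.
parent_labels = [
    [[1, 1, 2, 2], [1, 1, 2, 2]],
    [[1, 1, 2, 2], [1, 1, 0, 0]],
]
child_labels = [
    [[1, 0, 2, 0], [0, 0, 0, 3]],
    [[1, 0, 0, 0], [0, 4, 0, 0]],
]

child_to_parent = relate_children_to_parents(parent_labels, child_labels)
print(dict(count_children(child_to_parent)))
```

Children that never overlap a parent are simply left out of the mapping here; a real workflow would need to decide how to report them.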

Learning Realistic Patterns from Visually Unrealistic Stimuli: Generalization and Data Anonymization

Journal of Artificial Intelligence Research ◽

10.1613/jair.1.13252 ◽

2021 ◽

Vol 72 ◽

pp. 1163-1214

Author(s):

Konstantinos Nikolaidis ◽

Stein Kristiansen ◽

Thomas Plagemann ◽

Vera Goebel ◽

Knut Liestøl ◽

...

Keyword(s):

Sleep Stage ◽

Original Data ◽

Training Data ◽

Data Sets ◽

Classification Models ◽

Data Anonymization ◽

Sleep Stage Classification ◽

Machine Learning Applications ◽

Anonymized Data ◽

Accuracy Difference

Good training data is a prerequisite to develop useful Machine Learning applications. However, in many domains existing data sets cannot be shared due to privacy regulations (e.g., from medical studies). This work investigates a simple yet unconventional approach for anonymized data synthesis to enable third parties to benefit from such anonymized data. We explore the feasibility of learning implicitly from visually unrealistic, task-relevant stimuli, which are synthesized by exciting the neurons of a trained deep neural network. As such, neuronal excitation can be used to generate synthetic stimuli. The stimuli data is used to train new classification models. Furthermore, we extend this framework to inhibit representations that are associated with specific individuals. We use sleep monitoring data from both an open and a large closed clinical study, and Electroencephalogram sleep stage classification data, to evaluate whether (1) end-users can create and successfully use customized classification models, and (2) the identity of participants in the study is protected. Extensive comparative empirical investigation shows that different algorithms trained on the stimuli are able to generalize successfully on the same task as the original model. Architectural and algorithmic similarity between new and original models plays an important role in performance. For similar architectures, the performance is close to that of using the original data (e.g., Accuracy difference of 0.56%-3.82%, Kappa coefficient difference of 0.02-0.08). Further experiments show that the stimuli can provide state-of-the-art resilience against adversarial association and membership inference attacks.
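The Kappa coefficient reported above is presumably Cohen's kappa, which corrects raw agreement for the agreement expected by chance. A minimal sketch under that assumption follows; the sleep-stage labels are invented for illustration and are not taken from the study.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both label sequences use a single identical class
    return (observed - expected) / (1.0 - expected)

# Hypothetical expert scoring versus model predictions for eight epochs.
scored = ["W", "W", "N1", "N2", "N2", "REM", "REM", "N2"]
predicted = ["W", "N1", "N1", "N2", "N2", "REM", "N2", "N2"]

print(cohen_kappa(scored, predicted))
```

In this toy case the raw agreement is 0.75, but kappa is lower because part of that agreement would be expected by chance given how often each stage occurs.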

ZELDA: A 3D Image Segmentation and Parent-Child Relation Plugin for Microscopy Image Analysis in napari

Frontiers in Computer Science ◽

10.3389/fcomp.2021.796117 ◽

2022 ◽

Vol 3 ◽

Author(s):

Rocco D’Antuono ◽

Giuseppina Pisignano

Keyword(s):

Image Analysis ◽

Image Segmentation ◽

Open Source ◽

Cell Number ◽

Cell Segmentation ◽

Data Sets ◽

3D Analysis ◽

3D Segmentation ◽

Analysis Workflow ◽

Parent Child

Bioimage analysis workflows allow the measurement of sample properties such as fluorescence intensity and polarization, cell number, and vesicle distribution, but often require the integration of multiple software tools. Furthermore, it is increasingly appreciated that, to overcome the limitations of 2D-view-based image analysis approaches and to correctly understand and interpret biological processes, a 3D segmentation of microscopy data sets becomes imperative. Despite the availability of numerous algorithms for 2D and 3D segmentation, the latter still poses challenges for end-users, who often have neither extensive knowledge of the existing software nor the coding skills to link the output of multiple tools. While several commercial packages are available on the market, fewer open-source solutions are able to execute a complete 3D analysis workflow. Here we present ZELDA, a new napari plugin that easily integrates the cutting-edge solutions offered by the Python ecosystem, such as scikit-image for image segmentation, matplotlib for data visualization, and the napari multi-dimensional image viewer for 3D rendering. This plugin aims to provide interactive and zero-scripting customizable workflows for cell segmentation, vesicle counting, parent-child relation between objects, signal quantification, and results presentation; all included in the same open-source napari viewer, and “few clicks away”.

The Role of Information Theory in the Field of Big Data Privacy

Mathematical Problems of Computer Science ◽

10.51408/1963-0071 ◽

2021 ◽

Vol 55 ◽

pp. 35-43

Author(s):

Mariam Haroutunian ◽

Karen Mastoyan

Keyword(s):

Information Theory ◽

Big Data ◽

Data Privacy ◽

Differential Privacy ◽

Research Area ◽

Published Data ◽

Data Sets ◽

Information Theoretic ◽

Anonymized Data ◽

Current Article

Protecting privacy in Big Data is a rapidly growing research area. The first approach towards privacy assurance was the anonymity method. However, recent research has indicated that simply anonymized data sets can be easily attacked. Later, differential privacy was proposed, which proved to be the most promising approach. The trade-off between privacy and the usefulness of published data, as well as other problems, such as the availability of metrics to compare different ways of achieving anonymity, are in the realm of Information Theory. Although a number of review articles are available in the literature, the capabilities of information-theoretic methods have not received due attention. In the current article, an overview of state-of-the-art methods from Information Theory for ensuring privacy is provided.
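As background for the differential privacy approach mentioned above, the following sketch shows the standard Laplace mechanism for a counting query: noise with scale sensitivity/epsilon is added to the true answer. It is a generic textbook illustration, not a method from the article; the function names, the count of 128, and the epsilon value are assumptions.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential variates."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)

def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count with epsilon-differential privacy via the Laplace mechanism."""
    return true_count + laplace_noise(sensitivity / epsilon, rng)

# Hypothetical query result; a smaller epsilon means more noise and more privacy.
rng = random.Random(42)
print(private_count(128, epsilon=0.5, rng=rng))
```

The single parameter epsilon makes the privacy-utility trade-off discussed in the abstract explicit: halving epsilon doubles the noise scale.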

An example of spectrum imaging used for comparison of EELS quantitative analysis techniques on Al-Li

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s042482010008794x ◽

1991 ◽

Vol 49 ◽

pp. 726-727

Author(s):

John A. Hunt

Keyword(s):

Quantitative Analysis ◽

Large Data ◽

Difference Spectrum ◽

Large Data Sets ◽

Foil Thickness ◽

Data Sets ◽

Analysis Techniques ◽

Spectrum Imaging ◽

Normal Spectrum ◽

Electron Energy Loss

Spectrum-imaging is a useful technique for comparing different processing methods on very large data sets which are identical for each method. This paper is concerned with comparing methods of electron energy-loss spectroscopy (EELS) quantitative analysis on the Al-Li system. The spectrum-image analyzed here was obtained from an Al-10at%Li foil aged to produce δ' precipitates that can span the foil thickness. Two 1024-channel EELS spectra offset in energy by 1 eV were recorded and stored at each pixel in the 80x80 spectrum-image (25 Mbytes). An energy range of 39-89 eV (20 channels/eV) is represented. During processing, the spectra are either subtracted to create an artifact-corrected difference spectrum, or the energy offset is numerically removed and the spectra are added to create a normal spectrum. The spectrum-images are processed into 2D floating-point images using methods and software described in [1].
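The two processing routes described above can be sketched per pixel as follows. At 20 channels/eV, the 1 eV offset corresponds to a 20-channel shift. The direction of the shift, the 2-byte channel depth (chosen because it reproduces the stated 25 Mbytes), and the synthetic ramp spectra are assumptions made for illustration, not details from the paper.

```python
CHANNELS = 1024
CHANNELS_PER_EV = 20
OFFSET_EV = 1
SHIFT = CHANNELS_PER_EV * OFFSET_EV  # 1 eV offset = 20 channels

def difference_spectrum(first, second):
    """Channel-by-channel subtraction of the two offset spectra."""
    return [a - b for a, b in zip(first, second)]

def normal_spectrum(first, second, shift=SHIFT):
    """Remove the energy offset, then add the spectra over their overlap.

    Assumption: channel i of `second` covers the same energy as
    channel i + shift of `first`.
    """
    return [first[i + shift] + second[i] for i in range(len(second) - shift)]

# Synthetic ramp spectra standing in for one pixel's two recordings.
first = [float(i) for i in range(CHANNELS)]
second = [float(i + SHIFT) for i in range(CHANNELS)]

diff = difference_spectrum(first, second)
normal = normal_spectrum(first, second)

# Storage check: 80 x 80 pixels, 2 spectra each, 1024 channels,
# assuming 2 bytes per channel.
BYTES_PER_CHANNEL = 2
size_mib = 80 * 80 * 2 * CHANNELS * BYTES_PER_CHANNEL / 2**20
print(len(normal), diff[0], size_mib)
```

Note that the summed spectrum is 20 channels shorter than the originals, since only the overlapping energy range can be added.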

Computer-aided methods for 3-D visualization of serial sections and thick biological specimens

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100129930 ◽

1992 ◽

Vol 50 (2) ◽

pp. 1060-1061

Author(s):

Mark Ellisman ◽

Maryann Martone ◽

Gabriel Soto ◽

Eleizer Masliah ◽

David Hessler ◽

...

Keyword(s):

Alzheimer's Disease ◽

Three Dimensional ◽

Neuritic Plaque ◽

Dimensional Structure ◽

Data Sets ◽

Molecular Physiology ◽

Research Activities ◽

Computer Aided ◽

Dimensional Reconstruction

Structurally-oriented biologists examine cells, tissues, organelles and macromolecules in order to gain insight into cellular and molecular physiology by relating structure to function. The understanding of these structures can be greatly enhanced by the use of techniques for the visualization and quantitative analysis of three-dimensional structure. Three projects from current research activities will be presented in order to illustrate both the present capabilities of computer-aided techniques as well as their limitations and future possibilities. The first project concerns the three-dimensional reconstruction of the neuritic plaques found in the brains of patients with Alzheimer's disease. We have developed a software package “Synu” for investigation of 3D data sets which has been used in conjunction with laser confocal light microscopy to study the structure of the neuritic plaque. Tissue sections of autopsy samples from patients with Alzheimer's disease were double-labeled for tau, a cytoskeletal marker for abnormal neurites, and synaptophysin, a marker of presynaptic terminals.

Direct phase determination in electron crystallography: small organic molecules

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100130468 ◽

1992 ◽

Vol 50 (2) ◽

pp. 1166-1167

Author(s):

Douglas L. Dorset

Keyword(s):

Organic Molecules ◽

Data Sets ◽

Temperature Structure ◽

3D Analysis ◽

Intensity Data ◽

Electron Crystallography ◽

Phase Determination ◽

Measured Intensity ◽

3D Data

The quantitative use of electron diffraction intensity data for the determination of crystal structures represents the pioneering achievement in the electron crystallography of organic molecules, an effort largely begun by B. K. Vainshtein and his co-workers. However, despite numerous representative structure analyses yielding results consistent with X-ray determination, this entire effort was viewed with considerable mistrust by many crystallographers. This was no doubt due to the rather high crystallographic R-factors reported for some structures and, more importantly, the failure to convince many skeptics that the measured intensity data were adequate for ab initio structure determinations. We have recently demonstrated the utility of these data sets for structure analyses by direct phase determination based on the probabilistic estimate of three- and four-phase structure invariant sums. Examples include the structure of diketopiperazine using Vainshtein's 3D data, a similar 3D analysis of the room temperature structure of thiourea, and a zonal determination of the urea structure, the latter also based on data collected by the Moscow group.

Automated cell counting of astrocytes on patterned substrates containing aliphatic and charged properties

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s042482010014124x ◽

1995 ◽

Vol 53 ◽

pp. 974-975

Author(s):

W. Shain ◽

H. Ancin ◽

H.C. Craighead ◽

M. Isaacson ◽

L. Kam ◽

...

Keyword(s):

Cell Culture ◽

Cell Attachment ◽

Culture Method ◽

Cell Counting ◽

Data Sets ◽

Nuclear Staining ◽

Double Positive ◽

A Cell ◽

Wafer Test ◽

Cell Densities

Neural prostheses have the potential to restore nervous system functions lost by trauma or disease. Nanofabrication extends this approach to implants for stimulating and recording from single or small groups of neurons in the spinal cord and brain; however, tissue compatibility is a major limitation to their practical application. We are using a cell culture method for quantitatively measuring cell attachment to surfaces designed for nanofabricated neural prostheses. Silicon wafer test surfaces composed of 50-μm bars separated by aliphatic regions were fabricated using methods similar to a procedure described by Kleinfeld et al. Test surfaces contained either a single or double positive charge/residue. Cyanine dyes (diIC18(3)) stained the background and cell membranes (Fig 1); however, identification of individual cells at higher densities was difficult (Fig 2). Nuclear staining with acriflavine allowed discrimination of individual cells and permitted automated counting of nuclei using 3-D data sets from the confocal microscope (Fig 3). For cell attachment assays, LRM55 astroglial cells and astrocytes in primary cell culture were plated at increasing cell densities on test substrates, incubated for 24 hr, fixed, stained, mounted on coverslips, and imaged with a 10x objective.
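One simple way to count nuclei in a thresholded 3-D data set is to count connected components of foreground voxels. The sketch below does this with a breadth-first flood fill and 6-connectivity on nested lists. It is an assumed illustration of the general idea, not the authors' counting procedure; the function name and the toy binary volume are hypothetical.

```python
from collections import deque

# Face-adjacent neighbours in (z, y, x) order: 6-connectivity.
NEIGHBOURS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def count_objects_3d(mask):
    """Count 6-connected foreground components in a binary volume mask[z][y][x]."""
    depth, height, width = len(mask), len(mask[0]), len(mask[0][0])
    seen = set()
    count = 0
    for z in range(depth):
        for y in range(height):
            for x in range(width):
                if not mask[z][y][x] or (z, y, x) in seen:
                    continue
                count += 1  # found an unvisited object; flood-fill it
                seen.add((z, y, x))
                queue = deque([(z, y, x)])
                while queue:
                    cz, cy, cx = queue.popleft()
                    for dz, dy, dx in NEIGHBOURS:
                        nz, ny, nx = cz + dz, cy + dy, cx + dx
                        inside = 0 <= nz < depth and 0 <= ny < height and 0 <= nx < width
                        if inside and mask[nz][ny][nx] and (nz, ny, nx) not in seen:
                            seen.add((nz, ny, nx))
                            queue.append((nz, ny, nx))
    return count

# Toy thresholded stack: two planes of 3 x 4 voxels (1 = stained nucleus).
mask = [
    [[1, 1, 0, 0], [0, 0, 0, 1], [0, 0, 0, 1]],
    [[1, 0, 0, 0], [0, 0, 0, 0], [0, 1, 0, 1]],
]
print(count_objects_3d(mask))
```

Touching nuclei would be merged into one component by this approach, which is one reason dense cultures are harder to count automatically.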
