A Partial Optimization Approach for Privacy Preserving Frequent Itemset Mining

Shibnath Mukherjee; Aryya Gangopadhyay; Zhiyuan Chen

doi:10.4018/jcmam.2010072002

A Partial Optimization Approach for Privacy Preserving Frequent Itemset Mining

Cyber Crime ◽

10.4018/978-1-61350-323-2.ch208 ◽

2013 ◽

pp. 325-340

Author(s):

Shibnath Mukherjee ◽

Aryya Gangopadhyay ◽

Zhiyuan Chen

Keyword(s):

Low Cost ◽

Synthetic Data ◽

Frequent Itemset ◽

Optimization Approach ◽

Data Generator ◽

Hidden Cost ◽

Potential Benefits ◽

The Difference ◽

The Given ◽

Optimal Set

While data mining has been widely acclaimed as a technology that can bring potential benefits to organizations, such efforts may be negatively impacted by the possibility of discovering sensitive patterns, particularly in patient data. In this article, the authors present an approach to identify the optimal set of transactions that, if sanitized, would hide sensitive patterns while reducing, as much as possible, both the accidental hiding of legitimate patterns and the damage done to the database. Their methodology allows users to adjust the weights assigned to benefit, measured as the number of restrictive patterns hidden; cost, measured as the number of legitimate patterns hidden; and damage to the database, measured as the difference between the marginal frequencies of items in the original and sanitized databases. Most approaches to this problem in the literature are purely heuristic, with no formal treatment of optimality. Although a few works have previously used integer linear programming (ILP) as a formal optimization approach, the novelty of this method is its extremely low cost-complexity model in contrast to the others. The authors implemented their methodology in C and C++ and ran several experiments with synthetic data generated with the IBM synthetic data generator. The experiments show excellent results when compared to those in the literature.
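As a rough illustration of the weighted trade-off described in the abstract, the sketch below scores one candidate set of transactions to sanitize. It is not the authors' ILP model: the function names (`sanitization_score` and its helpers), the default weights, the sanitization rule (removing a chosen "victim" item from each selected transaction), and the toy database are all assumptions made for illustration.

```python
from collections import Counter

def support(db, itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)

def sanitize(db, selected, victim_items):
    """Hypothetical sanitization: drop `victim_items` from the selected transactions."""
    return [t - victim_items if i in selected else t for i, t in enumerate(db)]

def marginal_damage(db_a, db_b):
    """Sum of absolute differences between the item frequencies of two databases."""
    fa = Counter(x for t in db_a for x in t)
    fb = Counter(x for t in db_b for x in t)
    return sum(abs(fa[x] - fb[x]) for x in set(fa) | set(fb))

def sanitization_score(db, selected, victim_items, sensitive, legitimate,
                       min_support, w_benefit=1.0, w_cost=1.0, w_damage=0.1):
    """Weighted benefit minus weighted cost and damage (weights are assumed values)."""
    new_db = sanitize(db, selected, victim_items)

    def hidden(pattern):
        # A pattern is hidden if it was frequent before and is infrequent after.
        return support(db, pattern) >= min_support > support(new_db, pattern)

    benefit = sum(hidden(p) for p in sensitive)    # sensitive patterns hidden
    cost = sum(hidden(p) for p in legitimate)      # legitimate patterns hidden
    damage = marginal_damage(db, new_db)           # change in item frequencies
    return w_benefit * benefit - w_cost * cost - w_damage * damage

# Toy database of four transactions over items a, b, c.
db = [frozenset("abc"), frozenset("ab"), frozenset("bc"), frozenset("ac")]
sensitive = [frozenset("ab")]
legitimate = [frozenset("bc"), frozenset("ac")]
victim = frozenset("a")

score_t1 = sanitization_score(db, {1}, victim, sensitive, legitimate, min_support=2)
score_t0 = sanitization_score(db, {0}, victim, sensitive, legitimate, min_support=2)
print(score_t1, score_t0)
```

In this toy case, sanitizing transaction 1 hides the sensitive pattern without hiding any legitimate pattern, so it scores higher than sanitizing transaction 0, which also hides the legitimate pattern {a, c}. An optimization approach would search over candidate sets rather than score them one at a time.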


A Partial Optimization Approach for Privacy Preserving Frequent Itemset Mining

Innovations in Data Methodologies and Computational Algorithms for Medical Applications ◽

10.4018/978-1-4666-0282-3.ch002 ◽

2012 ◽

pp. 19-32

Author(s):

Shibnath Mukherjee ◽

Aryya Gangopadhyay ◽

Zhiyuan Chen

Keyword(s):

Low Cost ◽

Synthetic Data ◽

Frequent Itemset ◽

Optimization Approach ◽

Data Generator ◽

Hidden Cost ◽

Potential Benefits ◽

The Difference ◽

The Given ◽

Optimal Set

While data mining has been widely acclaimed as a technology that can bring potential benefits to organizations, such efforts may be negatively impacted by the possibility of discovering sensitive patterns, particularly in patient data. In this article, the authors present an approach to identify the optimal set of transactions that, if sanitized, would hide sensitive patterns while reducing, as much as possible, both the accidental hiding of legitimate patterns and the damage done to the database. Their methodology allows users to adjust the weights assigned to benefit, measured as the number of restrictive patterns hidden; cost, measured as the number of legitimate patterns hidden; and damage to the database, measured as the difference between the marginal frequencies of items in the original and sanitized databases. Most approaches to this problem in the literature are purely heuristic, with no formal treatment of optimality. Although a few works have previously used integer linear programming (ILP) as a formal optimization approach, the novelty of this method is its extremely low cost-complexity model in contrast to the others. The authors implemented their methodology in C and C++ and ran several experiments with synthetic data generated with the IBM synthetic data generator. The experiments show excellent results when compared to those in the literature.


Variety resistance of winter barley to powdery mildew in the field in 1976−2005

Czech Journal of Genetics and Plant Breeding ◽

10.17221/2067-cjgpb ◽

2008 ◽

Vol 43 (No. 3) ◽

pp. 87-96 ◽

Cited By ~ 5

Author(s):

A. Dreiseitl

Keyword(s):

Powdery Mildew ◽

Disease Severity ◽

Powdery Mildew Resistance ◽

Specific Resistance ◽

Winter Barley ◽

Mildew Resistance ◽

Average Resistance ◽

The Difference ◽

The Given ◽

Resistance Of Varieties

The results of evaluating powdery mildew resistance in winter barley varieties in 285 Czech Official Trials conducted at 20 locations were analysed. Over the period, the number of varieties tested per year increased from four to seven in 1976−1985 to 53−61 in 2002−2005. To assess the resistance of varieties, only trials with sufficient disease severity were used. In 1976−2000, six varieties (1.7% of the varieties tested in the given years) ranked among the resistant ones (average resistance of a variety in a year > 7.5), including NR-468 possessing the gene Mla13, KM-2099 with mlo, and Marinka with the genes Mla7 and MlaMu2. In 2001−2005, as many as 33 varieties (16.9%) ranked among the resistant ones, of which Traminer, possessing the genes Ml(St) and Ml(IM9), dominated. The proportion of susceptible varieties (average resistance ≤ 5.5) did not change between the two periods. Two-rowed varieties were first tested only in 1990 (the first variety was Danilo); however, no difference was found in the resistance of two- and six-rowed varieties. Using the example of two pairs of varieties (Dura-Miraj and Marinka-Tiffany) with identical genes for specific resistance but different resistance in the field, the efficiency of non-specific resistance is discussed. The resistance of domestic and foreign varieties was similar in 1994−2000; however, in 2001−2005 the difference was 0.75 points to the disadvantage of domestic ones.


G-Tric: generating three-way synthetic datasets with triclustering solutions

BMC Bioinformatics ◽

10.1186/s12859-020-03925-4 ◽

2021 ◽

Vol 22 (1) ◽

Author(s):

João Lobo ◽

Rui Henriques ◽

Sara C. Madeira

Keyword(s):

State Of The Art ◽

Synthetic Data ◽

Ground Truth ◽

Real Data ◽

Three Dimensions ◽

Additional Advantage ◽

Urban Dynamics ◽

Data Generator ◽

Real World Datasets ◽

Synthetic Datasets

Abstract Background: Three-way data started to gain popularity due to their increasing capacity to describe inherently multivariate and temporal events, such as biological responses, social interactions along time, urban dynamics, or complex geophysical phenomena. Triclustering, subspace clustering of three-way data, enables the discovery of patterns corresponding to data subspaces (triclusters) with values correlated across the three dimensions (observations × features × contexts). With an increasing number of algorithms being proposed, effectively comparing them with state-of-the-art algorithms is paramount. These comparisons are usually performed using real data, without a known ground truth, thus limiting the assessments. In this context, we propose a synthetic data generator, G-Tric, allowing the creation of synthetic datasets with configurable properties and the possibility to plant triclusters. The generator is prepared to create datasets resembling real three-way data from biomedical and social data domains, with the additional advantage of further providing the ground truth (triclustering solution) as output. Results: G-Tric can replicate real-world datasets and create new ones that match researchers' needs across several properties, including data type (numeric or symbolic), dimensions, and background distribution. Users can tune the patterns and structure that characterize the planted triclusters (subspaces) and how they interact (overlapping). Data quality can also be controlled by defining the amount of missing values, noise, or errors. Furthermore, a benchmark of datasets resembling real data is made available, together with the corresponding triclustering solutions (planted triclusters) and generating parameters. Conclusions: Triclustering evaluation using G-Tric provides the possibility to combine both intrinsic and extrinsic metrics to compare solutions that produce more reliable analyses. A set of predefined datasets, mimicking widely used three-way data and exploring crucial properties, was generated and made available, highlighting G-Tric’s potential to advance the triclustering state of the art by easing the process of evaluating the quality of new triclustering approaches.


Some remarks on the stability of the Cauchy equation and completeness

Aequationes Mathematicae ◽

10.1007/s00010-021-00804-y ◽

2021 ◽

Author(s):

Harald Fripertinger ◽

Jens Schwaiger

Keyword(s):

Normed Space ◽

Additive Function ◽

Functional Equations ◽

Constant Function ◽

Cauchy Equation ◽

Cauchy Difference ◽

International Conference ◽

The Difference ◽

The Stability ◽

The Given

Abstract It was proved in Forti and Schwaiger (C R Math Acad Sci Soc R Can 11(6):215–220, 1989), Schwaiger (Aequ Math 35:120–121, 1988) and, with different methods, in Schwaiger (Developments in functional equations and related topics. Selected papers based on the presentations at the 16th international conference on functional equations and inequalities, ICFEI, Bȩdlewo, Poland, May 17–23, 2015, Springer, Cham, pp 275–295, 2017) that the normed space must necessarily be complete under the following assumption: every function defined on suitable abelian semigroups with values in a normed space, such that the norm of its Cauchy difference is bounded by a constant (function), is close to some additive function, i.e., the norm of the difference between the given function and that additive function is also bounded by a constant. By Schwaiger (Ann Math Sil 34:151–163, 2020) this is also true in the non-archimedean case. Here we discuss the situation when the bound is a suitable non-constant function.
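The stability assumption in this abstract can be written out symbolically. The notation below is assumed for illustration and is not taken from the paper: S is an abelian semigroup, X a normed space, f a function from S to X, and ε, δ are non-negative constants.

```latex
% Assumed notation: S abelian semigroup, X normed space, f : S \to X,
% \varepsilon, \delta \ge 0 constants.
\[
  \|f(x+y) - f(x) - f(y)\| \le \varepsilon \quad \text{for all } x, y \in S
\]
\[
  \Longrightarrow\quad \exists\, a : S \to X \text{ additive with }
  \|f(x) - a(x)\| \le \delta \quad \text{for all } x \in S .
\]
```

Here f(x+y) − f(x) − f(y) is the Cauchy difference of f, and a is additive if a(x+y) = a(x) + a(y). The cited results state that if this implication holds for every such f on suitable semigroups, then X must be complete.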


Molecularly imprinted electrospun fiber membrane for colorimetric detection of hexanoic acid

e-Polymers ◽

10.1515/epoly-2021-0049 ◽

2021 ◽

Vol 21 (1) ◽

pp. 500-510

Author(s):

Xiaoguang Ying ◽

Jieyuan He ◽

Xiao Li

Keyword(s):

Low Cost ◽

Electrospun Fiber ◽

Colorimetric Detection ◽

Hexanoic Acid ◽

Bromocresol Purple ◽

Fiber Membrane ◽

Color Changes ◽

Before And After ◽

The Difference ◽

Imagej Software

Abstract An imprinted electrospun fiber membrane was developed for the detection of volatile organic acids, which are key components of human body odor. In this study, hexanoic acid (HA) was selected as the target, polymethyl methacrylate (PMMA) was used as the substrate, and colorimetric detection of HA was achieved by a bromocresol purple (BCP) chromogenic agent. The results showed that the morphology of the fiber membrane was uniform and continuous, and it showed excellent selectivity and specificity to HA. Photographs of the color changes before and after fiber membrane adsorption were recorded by a camera and quantified by ImageJ software by the difference in gray value (ΔGray). This method is simple, intuitive, and low cost and has great potential for application in human odor analysis.
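The ΔGray quantity mentioned in the abstract can be sketched as the change in mean gray value between the two photographs. The snippet below is only an illustration: the function names, the unweighted RGB average used as the gray value, and the pixel values are assumptions, and it does not reproduce the authors' ImageJ workflow.

```python
def mean_gray(pixels):
    """Mean gray value of RGB pixels, using the unweighted average of R, G and B."""
    return sum((r + g + b) / 3 for r, g, b in pixels) / len(pixels)

def delta_gray(before, after):
    """Absolute difference in mean gray value between two pixel regions."""
    return abs(mean_gray(after) - mean_gray(before))

# Made-up pixel samples from the same membrane region before and after adsorption.
before = [(120, 60, 150), (120, 60, 150)]
after = [(210, 180, 60), (210, 180, 60)]

print(delta_gray(before, after))
```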


Soft Spherical Tensegrity Robot Design Using Rod-Centered Actuation and Control

Volume 5A: 40th Mechanisms and Robotics Conference ◽

10.1115/detc2016-60550 ◽

2016 ◽

Cited By ~ 11

Author(s):

Lee-Huang Chen ◽

Kyunam Kim ◽

Ellande Tang ◽

Kevin Li ◽

Richard House ◽

...

Keyword(s):

Degrees Of Freedom ◽

Low Cost ◽

Space Exploration ◽

Robot Design ◽

The Novel ◽

Flexible Design ◽

Spherical Robot ◽

Potential Benefits ◽

And Performance ◽

And Control

This paper presents the design, analysis and testing of a fully actuated modular spherical tensegrity robot for co-robotic and space exploration applications. Robots built from tensegrity structures (composed of pure tensile and compression elements) have many potential benefits, including high robustness through redundancy, many degrees of freedom in movement and flexible design. However, to take full advantage of these properties, a significant fraction of the tensile elements should be active, leading to a potential increase in complexity, messy cable and power routing systems and increased design difficulty. Here we describe an elegant solution to a fully actuated tensegrity robot: the TT-3 (version 3) tensegrity robot, developed at UC Berkeley in collaboration with NASA Ames, is a lightweight, low-cost, modular, and rapidly prototyped spherical tensegrity robot. This robot is based on a ball-shaped six-bar tensegrity structure and features a unique modular rod-centered distributed actuation and control architecture. This paper presents the novel mechanism design, architecture and simulations of TT-3, the first untethered, fully actuated cable-driven six-bar tensegrity spherical robot ever built and tested for mobility. Furthermore, this paper discusses the controls and preliminary testing performed to observe the system’s behavior and performance.


Quantitative Analysis of Fluorescence Detection Using a Smartphone Camera for a PCR Chip

Sensors ◽

10.3390/s21113917 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3917

Author(s):

Jong-Dae Kim ◽

Chan-Young Park ◽

Yu-Seop Kim ◽

Ji-Soo Hwang

Keyword(s):

Quantitative Analysis ◽

Fluorescence Detection ◽

High Performance ◽

Low Cost ◽

Dna Amplification ◽

Fluorescent Detection ◽

Rt Pcr ◽

Commercial System ◽

Intensity Changes ◽

The Difference

Most existing commercial real-time polymerase chain reaction (RT-PCR) instruments are bulky because they contain expensive fluorescent detection sensors or complex optical structures. In this paper, we propose an RT-PCR system that uses a smartphone camera module as an ultra-small, high-performance and low-cost sensor for fluorescence detection. The proposed system provides stable DNA amplification. A quantitative analysis of fluorescence intensity changes shows the camera’s performance compared with that of commercial instruments. Changes in the performance between the experiments and the sets were also observed based on the threshold cycle values in a commercial RT-PCR system. The overall difference in the measured threshold cycles between the commercial system and the proposed camera was only 0.76 cycles, verifying the performance of the proposed system. The set calibration even reduced the difference to 0.41 cycles, which was less than the experimental variation in the commercial system, and there was no difference in performance.


SIMULATION SCHEME MODELING OF THE SUPER-SPEED TIME BUFFER

Электросвязь ◽

10.34832/elsv.2020.9.8.011 ◽

2020 ◽

Author(s):

А.М. САЖНЕВ ◽

Л.Г. РОГУЛИНА

Keyword(s):

High Speed ◽

High Efficiency ◽

Low Cost ◽

Circuit Modeling ◽

Clock Signal ◽

Behavioral Models ◽

Software Environment ◽

Time Buffer ◽

The Given ◽

Simulation Scheme

The results of simulating an ultra-high-speed clock signal buffer in OrCAD are presented. The buffer is based on gallium arsenide n-channel transistors and fully meets the following requirements: high technical characteristics, application flexibility, low cost, small size, high frequency, and high efficiency. The given behavioral models allow the use of any software environment for circuit modeling.


Lead-lag series and staged parallel operational strategies improve the performance and cost-effectiveness of bonechar for control of fluoride in groundwater

Journal of Water Sanitation and Hygiene for Development ◽

10.2166/washdev.2018.111 ◽

2018 ◽

Vol 8 (4) ◽

pp. 777-784 ◽

Cited By ~ 1

Author(s):

J. Kearns ◽

A. Krupp ◽

E. Diek ◽

S. Mitchell ◽

S. Dossi ◽

...

Keyword(s):

Low Cost ◽

Operational Strategies ◽

Decentralized Treatment ◽

Column Testing ◽

Long Run ◽

Resource Poor ◽

Potential Benefits ◽

Treatment Technologies ◽

Operational Modes ◽

Local Materials

Abstract Affordable, locally managed, decentralized treatment technologies are needed to protect health in resource-poor regions where communities consume groundwater containing elevated levels of fluoride (F). Bonechar is a promising low-cost sorbent for F that can be produced using local materials and simple pyrolysis technology. However, the sorption capacity of bonechar is low relative to the quantities of F that must be removed to meet health criteria (typically several mg/L), especially at pH typical of groundwaters containing high levels of geogenic F. This necessitates large bonechar contactors and/or frequent sorbent replacement, which could be prohibitively costly in materials and labor. One strategy for improving the feasibility of bonechar water treatment is to utilize lead-lag series or staged parallel configurations of two or more contactors. This study used column testing to quantify potential benefits to bonechar use rate, replacement frequency, and long-run average F concentration in treated water of lead-lag series and staged parallel operational modes compared with single contactor mode. Lead-lag series operation exhibited the largest reduction in bonechar use rate (46% reduction over single contactor mode compared with 29% reduction for staged parallel) and lowest long-run average F levels when treating central Mexican groundwater at pH 8.2 containing 8.5 mg/L F.
