A Partial Optimization Approach for Privacy Preserving Frequent Itemset Mining

Author(s):  
Shibnath Mukherjee ◽  
Aryya Gangopadhyay ◽  
Zhiyuan Chen

While data mining has been widely acclaimed as a technology that can bring potential benefits to organizations, such efforts may be negatively impacted by the possibility of discovering sensitive patterns, particularly in patient data. In this article the authors present an approach to identify the optimal set of transactions that, if sanitized, would hide sensitive patterns while minimizing both the accidental hiding of legitimate patterns and the damage done to the database. Their methodology allows the user to adjust the weights assigned to the benefit (the number of restrictive patterns hidden), the cost (the number of legitimate patterns hidden), and the damage to the database (the difference between marginal frequencies of items in the original and sanitized databases). Most approaches to this problem in the literature are purely heuristic, without a formal treatment of optimality. While a few works have used integer linear programming (ILP) as a formal optimization approach, the novelty of this method is its extremely low-complexity model in contrast to the others. The authors implemented the methodology in C and C++ and ran several experiments with synthetic data generated with the IBM synthetic data generator. The experiments show excellent results when compared to those in the literature.
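The abstract leaves the formulation implicit; as a rough illustration of how such a transaction-selection ILP can look, here is a minimal sketch using the open-source PuLP solver on a toy database. The weights, constraint form, and damage proxy are illustrative assumptions, not the authors' exact model.

```python
# Sketch: choose transactions to sanitize so that sensitive itemsets fall
# below the support threshold, at minimum weighted cost. Not the paper's model.
import pulp

# Toy database: transaction id -> set of items.
db = {
    0: {"a", "b", "c"},
    1: {"a", "b"},
    2: {"b", "c"},
    3: {"a", "b", "c"},
    4: {"a", "c"},
}
sensitive = [{"a", "b"}]   # patterns that must be hidden
legitimate = [{"b", "c"}]  # patterns we would like to preserve
min_count = 2              # absolute support threshold of the miner
w_cost, w_damage = 1.0, 0.1  # user-adjustable preference weights

prob = pulp.LpProblem("hide_sensitive_itemsets", pulp.LpMinimize)
x = {t: pulp.LpVariable(f"x_{t}", cat="Binary") for t in db}  # 1 = sanitize t

# Hiding constraints: enough supporting transactions of each sensitive
# pattern must be sanitized to push its support below min_count.
for s in sensitive:
    supporters = [t for t, items in db.items() if s <= items]
    prob += pulp.lpSum(x[t] for t in supporters) >= len(supporters) - (min_count - 1)

# Objective: penalize sanitizing transactions that support legitimate
# patterns (cost), plus the total number of sanitized transactions (a crude
# proxy for damage to marginal item frequencies).
cost = pulp.lpSum(x[t] for l in legitimate for t, items in db.items() if l <= items)
damage = pulp.lpSum(x.values())
prob += w_cost * cost + w_damage * damage

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("sanitize:", [t for t in db if x[t].value() == 1])
```

On this toy instance the solver sanitizes the supporter that carries no legitimate pattern first, which is exactly the benefit/cost/damage trade-off the abstract describes.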


2008 ◽  
Vol 43 (No. 3) ◽  
pp. 87-96 ◽  
Author(s):  
A. Dreiseitl

The results of evaluations of powdery mildew resistance in winter barley varieties in 285 Czech Official Trials conducted at 20 locations were analysed. Over the period, the number of varieties tested per year increased from four to seven in 1976−1985 to 53−61 in 2002−2005. To assess the resistance of varieties, only trials with sufficient disease severity were used. In 1976−2000, six varieties (1.7% of the varieties tested in the given years) ranked as resistant (average resistance of a variety in a year > 7.5), including NR-468 possessing the gene Mla13, KM-2099 with mlo, and Marinka with the genes Mla7 and MlaMu2. In 2001−2005, 33 varieties (16.9%) already ranked as resistant, dominated by Traminer, which possesses the genes Ml(St) and Ml(IM9). The proportion of susceptible varieties (average resistance ≤ 5.5) did not change between the two periods. Two-rowed varieties began to be tested only in 1990 (the first was Danilo); however, no difference was found between the resistance of two- and six-rowed varieties. Using the example of two pairs of varieties (Dura-Miraj and Marinka-Tiffany) with identical genes for specific resistance but different resistance in the field, the efficiency of non-specific resistance is discussed. The resistance of domestic and foreign varieties was similar in 1994−2000; in 2001−2005, however, the difference was 0.75 points to the disadvantage of the domestic ones.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
João Lobo ◽  
Rui Henriques ◽  
Sara C. Madeira

Abstract. Background: Three-way data have gained popularity due to their increasing capacity to describe inherently multivariate and temporal events, such as biological responses, social interactions over time, urban dynamics, or complex geophysical phenomena. Triclustering, the subspace clustering of three-way data, enables the discovery of patterns corresponding to data subspaces (triclusters) with values correlated across the three dimensions (observations × features × contexts). With an increasing number of algorithms being proposed, effectively comparing them with the state of the art is paramount. These comparisons are usually performed on real data without a known ground truth, thus limiting the assessment. In this context, the authors propose a synthetic data generator, G-Tric, allowing the creation of synthetic datasets with configurable properties and the possibility to plant triclusters. The generator is prepared to create datasets resembling real three-way data from biomedical and social domains, with the additional advantage of providing the ground truth (the triclustering solution) as output. Results: G-Tric can replicate real-world datasets and create new ones that match researchers' needs across several properties, including data type (numeric or symbolic), dimensions, and background distribution. Users can tune the patterns and structure that characterize the planted triclusters (subspaces) and how they interact (overlap). Data quality can also be controlled by defining the amount of missing values, noise, or errors. Furthermore, a benchmark of datasets resembling real data is made available, together with the corresponding triclustering solutions (planted triclusters) and generating parameters. Conclusions: Triclustering evaluation using G-Tric makes it possible to combine intrinsic and extrinsic metrics, yielding more reliable comparisons of solutions. A set of predefined datasets, mimicking widely used three-way data and exploring crucial properties, was generated and made available, highlighting G-Tric's potential to advance the triclustering state of the art by easing the evaluation of new triclustering approaches.
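As a concrete (if much simplified) illustration of what planting a tricluster with a known ground truth means, the following sketch builds a numeric three-way dataset with numpy. G-Tric itself offers many more pattern types, structures, and quality controls; the sizes, constant pattern, and noise levels below are arbitrary assumptions.

```python
# Sketch: plant one constant tricluster in a 3-way numeric dataset and keep
# the planted subspace as the ground-truth triclustering solution.
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_feat, n_ctx = 100, 30, 8                          # obs x features x contexts
data = rng.normal(0.0, 1.0, size=(n_obs, n_feat, n_ctx))   # background distribution

# Pick the subspace to plant.
rows = rng.choice(n_obs, size=10, replace=False)
cols = rng.choice(n_feat, size=5, replace=False)
ctxs = rng.choice(n_ctx, size=3, replace=False)

# Plant a constant pattern, then perturb it with noise and missing values.
data[np.ix_(rows, cols, ctxs)] = 5.0
data[np.ix_(rows, cols, ctxs)] += rng.normal(0.0, 0.1, size=(10, 5, 3))
missing = rng.random((n_obs, n_feat, n_ctx)) < 0.01
data[missing] = np.nan

ground_truth = {"rows": sorted(rows.tolist()),
                "cols": sorted(cols.tolist()),
                "ctxs": sorted(ctxs.tolist())}
```

With the ground truth in hand, an extrinsic metric (e.g., subspace recovery) can be computed for any triclustering algorithm run on `data`, which is the evaluation workflow the abstract argues for.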


Author(s):  
Harald Fripertinger ◽  
Jens Schwaiger

Abstract. It was proved in Forti and Schwaiger (C R Math Acad Sci Soc R Can 11(6):215–220, 1989), in Schwaiger (Aequ Math 35:120–121, 1988), and with different methods in Schwaiger (Developments in functional equations and related topics. Selected papers based on the presentations at the 16th international conference on functional equations and inequalities, ICFEI, Bȩdlewo, Poland, May 17–23, 2015, Springer, Cham, pp 275–295, 2017) that a normed space must necessarily be complete if it has the following stability property: every function defined on a suitable abelian semigroup with values in the space whose Cauchy difference is bounded in norm by a constant (function) is close to some additive function, i.e., the norm of the difference between the given function and that additive function is also bounded by a constant. By Schwaiger (Ann Math Sil 34:151–163, 2020) this is also true in the non-archimedean case. Here the situation when the bound is a suitable non-constant function is discussed.
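For orientation, the constant-bound stability property in question can be written as follows; this is the standard Hyers-type statement for a Banach-valued target (the paper concerns the converse direction, that the stability property forces completeness, and the generalization in which the constant bound is replaced by a function):

```latex
% S an abelian semigroup, Y a Banach space, f : S \to Y, \varepsilon \ge 0:
% a bounded Cauchy difference forces f to be uniformly close to an additive map.
\forall x, y \in S:\ \|f(x+y) - f(x) - f(y)\| \le \varepsilon
\quad\Longrightarrow\quad
\exists\, a : S \to Y \text{ additive},\ \forall x \in S:\ \|f(x) - a(x)\| \le \varepsilon .
```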


e-Polymers ◽  
2021 ◽  
Vol 21 (1) ◽  
pp. 500-510
Author(s):  
Xiaoguang Ying ◽  
Jieyuan He ◽  
Xiao Li

Abstract An imprinted electrospun fiber membrane was developed for the detection of volatile organic acids, which are key components of human body odor. In this study, hexanoic acid (HA) was selected as the target, polymethyl methacrylate (PMMA) was used as the substrate, and colorimetric detection of HA was achieved with a bromocresol purple (BCP) chromogenic agent. The results showed that the morphology of the fiber membrane was uniform and continuous and that it exhibited excellent selectivity and specificity toward HA. Photographs of the color changes before and after adsorption by the fiber membrane were recorded with a camera and quantified in ImageJ software via the difference in gray value (ΔGray). The method is simple, intuitive, and low in cost, and it has great potential for application in human odor analysis.
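As a sketch of the quantification step (with hypothetical file names; the authors used ImageJ rather than a script), the ΔGray measure amounts to comparing the mean 8-bit gray levels of the two photographs:

```python
# Sketch: compute ΔGray as the difference of mean gray levels between the
# photos taken before and after adsorption. File names are placeholders.
import numpy as np
from PIL import Image

before = np.asarray(Image.open("membrane_before.jpg").convert("L"), dtype=float)
after = np.asarray(Image.open("membrane_after.jpg").convert("L"), dtype=float)

delta_gray = abs(after.mean() - before.mean())  # larger change = more HA adsorbed
print(f"ΔGray = {delta_gray:.1f}")
```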


Author(s):  
Lee-Huang Chen ◽  
Kyunam Kim ◽  
Ellande Tang ◽  
Kevin Li ◽  
Richard House ◽  
...  

This paper presents the design, analysis, and testing of a fully actuated modular spherical tensegrity robot for co-robotic and space exploration applications. Robots built from tensegrity structures (composed of pure tension and compression elements) have many potential benefits, including high robustness through redundancy, many degrees of freedom in movement, and flexible design. However, to take full advantage of these properties, a significant fraction of the tensile elements should be active, leading to a potential increase in complexity, messy cable and power routing, and greater design difficulty. Here we describe an elegant solution for a fully actuated tensegrity robot: the TT-3 (version 3) tensegrity robot, developed at UC Berkeley in collaboration with NASA Ames, is a lightweight, low-cost, modular, and rapidly prototyped spherical tensegrity robot. The robot is based on a ball-shaped six-bar tensegrity structure and features a unique modular rod-centered distributed actuation and control architecture. This paper presents the novel mechanism design, architecture, and simulations of TT-3, the first untethered, fully actuated, cable-driven six-bar tensegrity spherical robot ever built and tested for mobility. Furthermore, the paper discusses the controls and preliminary testing performed to observe the system's behavior and performance.


Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3917
Author(s):  
Jong-Dae Kim ◽  
Chan-Young Park ◽  
Yu-Seop Kim ◽  
Ji-Soo Hwang

Most existing commercial real-time polymerase chain reaction (RT-PCR) instruments are bulky because they contain expensive fluorescence detection sensors or complex optical structures. In this paper, we propose an RT-PCR system that uses a smartphone camera module, an ultra-small, high-performance, low-cost sensor, for fluorescence detection. The proposed system provides stable DNA amplification. A quantitative analysis of fluorescence intensity changes shows the camera's performance compared with that of commercial instruments. Variation between experiments and between sets was also examined, based on the threshold cycle values of a commercial RT-PCR system. The overall difference in measured threshold cycles between the commercial system and the proposed camera was only 0.76 cycles, verifying the performance of the proposed system. Set calibration reduced the difference further to 0.41 cycles, which is less than the experimental variation of the commercial system, indicating no practical difference in performance.
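For readers unfamiliar with threshold cycles: a Ct value is the (fractional) cycle at which a fluorescence curve first crosses a fixed threshold, and comparing instruments amounts to comparing their Ct values on the same samples. The following is a minimal sketch of such an extraction by linear interpolation, not the algorithm used by the commercial system or the authors:

```python
# Sketch: extract a fractional threshold cycle (Ct) from a
# fluorescence-vs-cycle curve by interpolating the first crossing.
import numpy as np

def threshold_cycle(fluorescence, threshold):
    """Return the fractional (0-indexed) cycle of the first threshold crossing."""
    f = np.asarray(fluorescence, dtype=float)
    above = np.nonzero(f >= threshold)[0]
    if above.size == 0:
        return None          # threshold never crossed
    i = above[0]
    if i == 0:
        return 0.0
    # Linear interpolation between cycles i-1 and i.
    return (i - 1) + (threshold - f[i - 1]) / (f[i] - f[i - 1])

curve = [1.0, 1.1, 1.3, 2.0, 4.5, 9.0, 15.0]
print(threshold_cycle(curve, 3.0))  # ~3.4; |Ct_camera - Ct_commercial| compares devices
```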


Author(s):  
А.М. САЖНЕВ ◽  
Л.Г. РОГУЛИНА

Simulation results are presented for an ultra-high-speed clock-signal buffer built from gallium-arsenide n-channel transistors and modeled in OrCAD; the design fully meets the following requirements: high technical characteristics, application flexibility, low cost, small size, high frequency, and high efficiency. The behavioral models given allow the use of any circuit-simulation software environment.


2018 ◽  
Vol 8 (4) ◽  
pp. 777-784 ◽  
Author(s):  
J. Kearns ◽  
A. Krupp ◽  
E. Diek ◽  
S. Mitchell ◽  
S. Dossi ◽  
...  

Abstract Affordable, locally managed, decentralized treatment technologies are needed to protect health in resource-poor regions where communities consume groundwater containing elevated levels of fluoride (F). Bonechar is a promising low-cost sorbent for F that can be produced using local materials and simple pyrolysis technology. However, the sorption capacity of bonechar is low relative to the quantities of F that must be removed to meet health criteria (typically several mg/L), especially at the pH values typical of groundwaters containing high levels of geogenic F. This necessitates large bonechar contactors and/or frequent sorbent replacement, which could be prohibitively costly in materials and labor. One strategy for improving the feasibility of bonechar water treatment is to operate two or more contactors in a lead-lag series or staged parallel configuration. This study used column testing to quantify the potential benefits of lead-lag series and staged parallel operation, compared with single-contactor operation, in terms of bonechar use rate, replacement frequency, and long-run average F concentration in treated water. Lead-lag series operation exhibited the largest reduction in bonechar use rate (46% over single-contactor mode, compared with 29% for staged parallel) and the lowest long-run average F levels when treating central Mexican groundwater at pH 8.2 containing 8.5 mg/L F.
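A toy model can make the lead-lag intuition concrete: the lead bed is run past its own breakthrough while the lag bed polishes the leakage, so each bed treats more bed volumes before replacement. The logistic breakthrough curve, its parameters, and the same-age simplification below are assumptions for illustration only, not data or models from the study:

```python
# Sketch: compare bed life (in bed volumes) for a single contactor versus a
# lead-lag pair, under a toy logistic breakthrough model.
import math

def effluent_fraction(bed_volumes, bv50=300.0, slope=0.02):
    """Fraction of influent F passing one bed after `bed_volumes` treated."""
    return 1.0 / (1.0 + math.exp(-slope * (bed_volumes - bv50)))

c_in, c_goal = 8.5, 1.5  # mg/L F: influent and treatment goal

# Single contactor: replace when its own effluent exceeds the goal.
bv = 0
while c_in * effluent_fraction(bv) <= c_goal:
    bv += 1
print("single-bed life:", bv, "bed volumes")

# Lead-lag: replace only when the pair's combined effluent exceeds the goal
# (both beds crudely assumed to age together).
bv = 0
while c_in * effluent_fraction(bv) ** 2 <= c_goal:
    bv += 1
print("lead-lag life:", bv, "bed volumes")
```

Under these toy parameters the pair runs roughly a quarter longer per bed, qualitatively matching the direction (though not the magnitude) of the reported 46% reduction in bonechar use rate.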

