function dependency
Recently Published Documents



Author(s):  
Andriy Koval ◽  
Kate Smolina ◽  
Anthony Leamon

Introduction: When reporting disease rates to the public, a health system must take precautions to protect released data from re-identification risks. While specific guidelines and methods vary across data systems and governances [1], redaction of cells with small values is a key component of any approach to preparing data for public release. These preparations, when conducted manually, have proven arduous, time-consuming, and prone to human error. Although finding a “small” value (e.g., “< 5”) is straightforward, detecting conditions in which suppressed values could be recalculated from related cells involves human judgement.

Objectives and Approach: Guided by the real-world objective of reporting the rates of chronic diseases in British Columbia, we aimed to design a reproducible workflow that would augment human decision-making and offer a nimble quality control tool, approachable by epidemiologists without a technical background. Our workflow (1) splits data into disease-by-year data frames of a specific form, (2) applies a sequence of algorithms trained to recognize conditions that make recalculation of suppressed values possible, and (3) prints a graph for each case of suggested automatic redaction, to be confirmed by a human.

Results: The augmented suppression system was successfully integrated into the maintenance of the Chronic Disease Dashboard, an online reporting tool of the Observatory for Population and Public Health designed to address the gap in surveillance of chronic diseases in British Columbia. Anticipating the evolution of suppression logic, we isolated the logical tests responsible for redaction and provided several options to vary the degree of preserved information.

Conclusion / Implications: Instead of employing a complex generalizable solution, we make a case for organizing the procedure for small cell redaction as a data visualization task, which allows straightforward quality control of suppression decisions and is thus more approachable to a non-technical audience. We also make a case for employing such learning devices as workflow maps and function dependency trees for structuring applied projects and ensuring their reproducibility.
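The recalculation problem described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' workflow: the function name `suppress`, the cutoff of 5 (taken from the "< 5" example), and the rule of hiding the smallest remaining cell are all assumptions made for illustration. The sketch redacts small counts in one row, then checks whether a single hidden cell could be recovered by subtracting the visible cells from a published total.

```python
from typing import Dict, Optional

THRESHOLD = 5  # assumed small-cell cutoff, based on the "< 5" example


def suppress(cells: Dict[str, int], total_published: bool = True,
             threshold: int = THRESHOLD) -> Dict[str, Optional[int]]:
    """Return a copy of one row with redacted cells set to None.

    Hypothetical helper: primary suppression hides every count below
    the threshold; secondary suppression hides one more cell when the
    hidden value could otherwise be recalculated from the row total.
    """
    # Primary suppression: hide every count below the threshold.
    out: Dict[str, Optional[int]] = {
        k: (None if v < threshold else v) for k, v in cells.items()
    }
    hidden = [k for k, v in out.items() if v is None]

    # If exactly one cell is hidden and the total is published, the hidden
    # value equals the total minus the visible cells. Hide the smallest
    # visible cell too (an assumed rule) so subtraction no longer works.
    if total_published and len(hidden) == 1:
        visible = [k for k, v in out.items() if v is not None]
        if visible:
            smallest = min(visible, key=lambda k: cells[k])
            out[smallest] = None
    return out


row = {"female": 3, "male": 12, "unknown": 40}
print(suppress(row))
```

In a workflow like the one described, each suggested secondary redaction would be shown to a human reviewer for confirmation rather than applied silently.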


2019 ◽  
Vol 15 (11) ◽  
pp. 155014771988989
Author(s):  
Jinlin Wang ◽  
Haining Yu ◽  
Xing Wang ◽  
Hongli Zhang ◽  
Binxing Fang ◽  
...  

The application of the Internet of Things has produced large amounts of data in different scenarios, accompanied by problems such as consistency and integrity violations. Existing research on dealing with data availability violations is insufficient. In this work, the detection and repair of data availability violations (DRAV) framework is proposed to detect and repair data violations in the Internet of Things within a distributed parallel computing environment. DRAV implements its algorithms in the MapReduce programming framework: detection and repair algorithms based on enhanced conditional function dependency for data consistency violations, and MapJoin and ReduceJoin algorithms based on master data for k-nearest neighbor–based detection and repair of integrity violations. Experiments are conducted to determine the effect of the algorithms. Results show that DRAV improves data availability in the Internet of Things compared with existing methods by detecting and repairing violations.
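As a rough illustration of dependency-based consistency checking, the following Python sketch detects violations of a plain conditional functional dependency in a map-then-reduce style. It is a simplification under stated assumptions, not the DRAV algorithm: the "enhanced" dependency form, the repair step, and the distributed execution are omitted, and the names `map_phase`, `reduce_phase`, `find_violations`, and the sample sensor fields are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, Sequence, Set, Tuple

Record = Dict[str, str]
Key = Tuple[str, ...]


def map_phase(records: Iterable[Record], condition: Callable[[Record], bool],
              lhs: Sequence[str], rhs: str) -> Iterator[Tuple[Key, str]]:
    """Emit (left-hand-side values, right-hand-side value) for each
    record that satisfies the dependency's condition."""
    for r in records:
        if condition(r):
            yield tuple(r[a] for a in lhs), r[rhs]


def reduce_phase(pairs: Iterable[Tuple[Key, str]]) -> Dict[Key, Set[str]]:
    """Group by key; a key with more than one distinct value violates
    the dependency lhs -> rhs."""
    groups: Dict[Key, Set[str]] = defaultdict(set)
    for key, value in pairs:
        groups[key].add(value)
    return {k: v for k, v in groups.items() if len(v) > 1}


def find_violations(records: Iterable[Record],
                    condition: Callable[[Record], bool],
                    lhs: Sequence[str], rhs: str) -> Dict[Key, Set[str]]:
    return reduce_phase(map_phase(records, condition, lhs, rhs))


# Hypothetical readings: within region "north", sensor_type should
# determine unit.
readings = [
    {"region": "north", "sensor_type": "temp", "unit": "C"},
    {"region": "north", "sensor_type": "temp", "unit": "F"},
    {"region": "south", "sensor_type": "temp", "unit": "F"},
    {"region": "north", "sensor_type": "humidity", "unit": "%"},
]
violations = find_violations(
    readings, lambda r: r["region"] == "north", ["sensor_type"], "unit"
)
print(violations)
```

In an actual MapReduce job the two phases would run on separate workers with the framework handling the grouping by key; here both run in a single process to keep the sketch self-contained.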


2010 ◽  
Vol 14 (2) ◽  
pp. 301-315 ◽  
Author(s):  
James Ma ◽  
Daniel Zeng ◽  
Huimin Zhao
