Anonymization of Daily Activity Data by Using ℓ-diversity Privacy Model

2021 ◽  
Vol 12 (3) ◽  
pp. 1-21
Author(s):  
Pooja Parameshwarappa ◽  
Zhiyuan Chen ◽  
Güneş Koru

In the age of IoT, collection of activity data has become ubiquitous. Publishing activity data can be quite useful for various purposes such as estimating the level of assistance required by older adults and facilitating early diagnosis and treatment of certain diseases. However, publishing activity data comes with privacy risks: each dimension, i.e., the activity of a person at any given point in time, can be used to identify a person as well as to reveal sensitive information about the person, such as not being at home at that time. Unfortunately, conventional anonymization methods have shortcomings when it comes to anonymizing activity data. Activity datasets considered for publication are often flat, with many dimensions but typically not many rows, which makes the existing anonymization techniques either inapplicable due to the small number of rows, or else inefficient or ineffective in preserving utility. This article proposes novel multi-level clustering-based approaches using a non-metric weighted distance measure that enforce the ℓ-diversity model. Experimental results show that the proposed methods preserve data utility and are orders of magnitude more efficient than the existing methods.
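As a rough illustration of what enforcing ℓ-diversity means, the sketch below checks the simplest variant (distinct ℓ-diversity): every published group must contain at least ℓ distinct sensitive values. It is not the article's clustering method; the field names `cluster` and `activity` and the sample records are hypothetical.

```python
from collections import defaultdict


def is_l_diverse(records, group_key, sensitive_key, l):
    """Return True if every group holds at least `l` distinct sensitive values.

    This checks distinct l-diversity only; entropy and recursive variants
    of the model are stricter and are not covered here.
    """
    values_by_group = defaultdict(set)
    for record in records:
        values_by_group[record[group_key]].add(record[sensitive_key])
    return all(len(values) >= l for values in values_by_group.values())


# Hypothetical records: each row is one person assigned to a cluster.
records = [
    {"cluster": 0, "activity": "sleeping"},
    {"cluster": 0, "activity": "away"},
    {"cluster": 1, "activity": "cooking"},
    {"cluster": 1, "activity": "cooking"},
]

# Cluster 1 contains a single distinct activity, so l = 2 is violated.
two_diverse = is_l_diverse(records, "cluster", "activity", 2)
one_diverse = is_l_diverse(records, "cluster", "activity", 1)
```

A clustering-based anonymizer would typically keep merging or reassigning clusters until a check of this kind passes for the chosen ℓ.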

2021 ◽  
Author(s):  
Wen-Yang Lin ◽  
Jie-Teng Wang

BACKGROUND Increasingly, spontaneous reporting systems (SRS) have been established to collect adverse drug events and to foster research on adverse drug reaction (ADR) detection and analysis. SRS data contain personal information, so their publication requires anonymization to prevent the disclosure of individual privacy. We previously proposed a privacy model called MS(k, θ*)-bounding and the associated MS-Anonymization algorithm to anonymize SRS data. In the real world, SRS data such as FAERS are usually released periodically to accommodate newly collected adverse drug events. Different anonymized releases of SRS data available to an attacker may thwart our single-release method, i.e., MS(k, θ*)-bounding. OBJECTIVE We investigate the privacy threat caused by periodical releases of SRS data and propose anonymization methods that prevent the disclosure of personal private information while maintaining the utility of the published data. METHODS We identify potential attacks on periodical releases of SRS data, namely BFL-attacks, that are mainly caused by follow-up cases. We present a new privacy model called PPMS(k, θ*)-bounding and propose the associated PPMS-Anonymization algorithm along with two improvements, PPMS+-Anonymization and PPMS++-Anonymization. Empirical evaluations were performed using 32 selected FAERS quarterly datasets, from 2004Q1 to 2011Q4. The performance of the three proposed versions of PPMS-Anonymization was compared against MS-Anonymization in several respects: data distortion, measured by Normalized Information Loss (NIS); privacy risk of the anonymized data, measured by Dangerous Identity Ratio (DIR) and Dangerous Sensitivity Ratio (DSR); and data utility, measured by the bias of signal counting and signal strength (PRR). RESULTS The results show that our new method can prevent privacy disclosure for periodical releases of SRS data with a reasonable sacrifice of data utility and an acceptable deviation in the strength of ADR signals. The best version of PPMS-Anonymization, PPMS++-Anonymization, achieves nearly the same quality as MS-Anonymization in both privacy protection and data utility. CONCLUSIONS The proposed PPMS(k, θ*)-bounding model and PPMS-Anonymization algorithm are effective in anonymizing SRS datasets in the periodical data publishing scenario, protecting the series of releases against the disclosure of sensitive personal information caused by BFL-attacks while maintaining the data utility for ADR signal detection.
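The signal strength named above, the proportional reporting ratio (PRR), is commonly computed from a 2×2 table of report counts. The sketch below uses that standard definition; the relative-deviation helper and the example counts are assumptions added for illustration, not the bias measure or data used in the study.

```python
def prr(a, b, c, d):
    """Proportional reporting ratio from a 2x2 contingency table.

    a: reports with the drug and the event
    b: reports with the drug, without the event
    c: reports with other drugs and the event
    d: reports with other drugs, without the event
    """
    if a + b == 0 or c + d == 0 or c == 0:
        raise ValueError("PRR is undefined for these counts")
    drug_rate = a / (a + b)
    other_rate = c / (c + d)
    return drug_rate / other_rate


def relative_deviation(original, anonymized):
    """Hypothetical helper: relative change in signal strength after anonymization."""
    return abs(anonymized - original) / original


# Made-up counts before and after anonymization.
prr_original = prr(20, 80, 10, 190)
prr_anonymized = prr(18, 82, 10, 190)
deviation = relative_deviation(prr_original, prr_anonymized)
```

A small deviation would indicate that anonymization left the ADR signal largely intact, which is the kind of utility the abstract describes.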


2021 ◽  
Vol 11 (12) ◽  
pp. 3164-3173
Author(s):  
R. Indhumathi ◽  
S. Sathiya Devi

Data sharing is essential in present-day biomedical research. A large quantity of medical information is gathered for different objectives of analysis and study. Because the collection is so large, anonymity is essential, and it is important to preserve privacy and prevent leakage of patients' sensitive information. Most anonymization methods, such as generalisation, suppression and perturbation, are proposed to prevent information leaks, but they degrade the utility of the collected data: during data sanitization, utility is inevitably diminished. The main drawback of Privacy Preserving Data Publishing is the difficulty of maintaining the tradeoff between privacy and data utility. To address this issue, an efficient algorithm called Anonymization based on Improved Bucketization (AIB) is proposed, which increases the utility of published data while maintaining privacy. The bucketization technique is used in this paper together with a clustering method. The proposed work is divided into four stages: (i) vertical and horizontal partitioning, (ii) assigning a sensitive index to attributes in the cluster, (iii) verifying each cluster against the privacy threshold, and (iv) examining for privacy breaches in the Quasi Identifier (QI). To increase the utility of published data, the threshold value is determined based on the distribution of elements in each attribute, and the anonymization method is applied only to the specific QI element. As a result, data utility is improved. Finally, the evaluation results validated the design and demonstrated that it is effective in improving data utility.


2019 ◽  
Vol 2019 (2) ◽  
pp. 291-305 ◽  
Author(s):  
Mohammad Naseri ◽  
Nataniel P. Borges ◽  
Andreas Zeller ◽  
Romain Rouvoy

Abstract To support users with disabilities, Android provides the accessibility services, which implement means of navigating through an app. According to the Android developer’s guide: “Accessibility services should only be used to assist users with disabilities in using Android devices and apps”. However, developers are free to use this service without any restrictions, giving them critical privileges such as monitoring user input or screen content to capture sensitive information. In this paper, we show that simply enabling the accessibility service leaves 72% of the top finance and 80% of the top social media apps vulnerable to eavesdropping attacks, leaking sensitive information such as logins and passwords. A combination of several tools and recommendations could mitigate the privacy risks: We introduce an analysis technique that detects most of these issues automatically, e.g. in an app store. We also found that these issues can be automatically fixed in almost all cases; our fixes have been accepted by 70% of the surveyed developers. Finally, we designed a notification mechanism which would warn users against possible misuses of the accessibility services; 50% of users would follow these notifications.


2017 ◽  
Vol 14 (4) ◽  
pp. 172988141770907 ◽  
Author(s):  
Hanbo Wu ◽  
Xin Ma ◽  
Zhimeng Zhang ◽  
Haibo Wang ◽  
Yibin Li

Human daily activity recognition has been a hot spot in the field of computer vision for many decades. Despite best efforts, activity recognition in naturally uncontrolled settings remains a challenging problem. Recently, by perceiving depth and visual cues simultaneously, RGB-D cameras have greatly boosted the performance of activity recognition. However, due to some practical difficulties, the publicly available RGB-D data sets are not sufficiently large for benchmarking when considering the diversity of their activities, subjects, and backgrounds. This severely limits the applicability of complicated learning-based recognition approaches. To address the issue, this article provides a large-scale RGB-D activity data set by merging five public RGB-D data sets that differ from each other in many aspects such as length of actions, nationality of subjects, or camera angles. This data set comprises 4528 samples depicting 7 action categories (up to 46 subcategories) performed by 74 subjects. To verify how challenging the data set is, three feature representation methods are evaluated: depth motion maps, spatiotemporal depth cuboid similarity feature, and curvature space scale. Results show that the merged large-scale data set is more realistic and challenging and therefore more suitable for benchmarking.


Symmetry ◽  
2019 ◽  
Vol 11 (2) ◽  
pp. 275 ◽  
Author(s):  
Chengdong Cao ◽  
Shouzhen Zeng ◽  
Dandan Luo

The aim of this paper is to present a multiple-attribute group decision-making (MAGDM) framework based on a new single-valued neutrosophic linguistic (SVNL) distance measure. By unifying the ideas of the weighted average and the ordered weighted average into a single-valued neutrosophic linguistic distance, we first developed a new SVNL weighted distance measure, namely an SVNL combined and weighted distance (SVNLCWD) measure. The focal characteristic of the devised SVNLCWD is its ability to combine both the decision-makers’ attitudes toward the importance of the arguments and the weights of the arguments. Various desirable properties and families of the developed SVNLCWD were examined. Moreover, a MAGDM approach based on the SVNLCWD was formulated. Lastly, a real numerical example concerning a low-carbon supplier selection problem was used to demonstrate the superiority and feasibility of the developed approach.


Author(s):  
Pooja Parameshwarappa ◽  
Zhiyuan Chen ◽  
Gunes Koru

Publishing physical activity data can facilitate reproducible health-care research in several areas such as population health management, behavioral health research, and management of chronic health problems. However, publishing such data also brings high privacy risks related to re-identification which makes anonymization necessary. One of the challenges in anonymizing physical activity data collected periodically is its sequential nature. The existing anonymization techniques work sufficiently for cross-sectional data but have high computational costs when applied directly to sequential data. This article presents an effective anonymization approach, multi-level clustering-based anonymization to anonymize physical activity data. Compared with the conventional methods, the proposed approach improves time complexity by reducing the clustering time drastically. While doing so, it preserves the utility as much as the conventional approaches.


2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Haibo Zhang ◽  
Zhimin Mu ◽  
Shouzhen Zeng

The simplified neutrosophic set (SNS) is a popular tool for modelling potential, imprecise, and uncertain information within complex environments. In this paper, a method based on the integrated weighted distance measure and entropy weight is proposed for handling SNS multiple attribute group decision-making (MAGDM) problems. To this end, the simplified neutrosophic (SN) integrated weighted distance (SNIWD) measure is first developed to overcome the limitations of the existing methods. Afterward, several properties and particular cases of the proposed SNIWD are studied. Moreover, a flexible and useful MAGDM approach that combines the strengths of the SNIWD and the SNS is proposed, wherein the SN entropy measure is applied to calculate the unknown weight information regarding attributes. Finally, a numerical case of investment evaluation and a subsequent comparative analysis are conducted to demonstrate the superiority of the proposed framework.


2017 ◽  
Vol 14 (1) ◽  
pp. 59-66 ◽  
Author(s):  
Mhairi MacDonald ◽  
Samantha G. Fawkner ◽  
Ailsa Niven

Background: It is currently not known how much walking should be advocated for good health in adolescent girls. The aim of this study was therefore to recommend health-referenced standards for step-defined physical activity relating to appropriate health criteria/indicators in a group of adolescent girls. Method: Two hundred and thirty adolescent girls aged 12 to 15 years volunteered to take part in the study. Each participant undertook measurements (BMI, waist circumference, % body fat, and blood pressure) to define health status. Activity data were collected by pedometer and used to assess daily step counts and accumulated daily activity time over 7 consecutive days. Results: Individuals classified as ‘healthy’ did not take significantly more steps·day–1 nor spend more time in moderate intensity activity than individuals classified as at health risk or with poor health profiles. Conclusion: ‘Healthy’ adolescent girls do not walk significantly more in terms of steps·day–1 or time spent in activity than girls classified as ‘unhealthy.’ This could suggest that adolescent girls may not walk enough to stratify health and health-related outcomes and, as a result, the data could not be used to inform an appropriate step guideline for this population.

