Intensional and Extensional Views in DL-Lite Ontologies

The use of virtual collections of data is often essential in several data and knowledge management tasks. In the literature, the standard way to define virtual data collections is via views, i.e., virtual relations defined using queries. In data and knowledge bases, the notion of views is a staple of data access, data integration and exchange, query optimization, and data privacy. In this work, we study views in Ontology-Based Data Access (OBDA) systems. OBDA is a powerful paradigm for accessing data through an ontology, i.e., a conceptual specification of the domain of interest written using logical axioms. Intuitively, users of an OBDA system interact with the data only through the ontology's conceptual lens. We present a novel framework to express natural and sophisticated forms of views in OBDA systems and introduce fundamental reasoning tasks for these views. We study the computational complexity of these tasks and present classes of views for which these tasks are tractable or at least decidable.

Download Full-text

Qualitative content analysis for international comparison of data usage agreements

International Journal for Population Data Science ◽

10.23889/ijpds.v1i1.264 ◽

2017 ◽

Vol 1 (1) ◽

Author(s):

Christian Haux ◽

Frank Gabel ◽

Anna-Lena Trescher ◽

Helen Whelton ◽

Geert Van der Heijden ◽

...

Keyword(s):

Content Analysis ◽

Data Integration ◽

Data Privacy ◽

Qualitative Content Analysis ◽

Data Access ◽

Added Value ◽

Security Requirements ◽

Coding System ◽

Privacy And Security ◽

Data Usage

ABSTRACT ObjectivesThe multi-country-EU project ADVOCATE (Added Value for Oral Care) involves the analysis of routinely collected oral health care records from health insurance systems in six European countries, including NHS England and NHS Scotland. The data will be stored in a in a central repository using AnalytiXagility which adheres to strict privacy and security standards. Therefore, data usage agreements must be consented with all partners and being subjected to specific regulations in the respective nation. This will result in different aggregation levels for data integration, e. g. one of the partners does not allow the transfer of data that contain a personal identifier. To understand the variety of requirements and limitations in different countries, we performed a qualitative content analysis of the agreements. ApproachA categorisation system for privacy and data protection aspects was developed. The aspects are based on privacy conditions mentioned in guidance documents, the agreements themselves and the project’s proposal. The agreements were examined for textual elements and systematically coded by three reviewers. Compliance between privacy conditions and the agreements was estimated using a nominal scale, whether the context was available in the agreement or not. The software MAXQDA was used for tagging relevant text passages. ResultsThe initial coding scheme contains eight categories on top-level. They include, inter alia, aspects on data access, -preparation, -transmission, and -usage. The top-levels divide in up to four different levels of detail. The coding system was continuously adapted during full-text analysis. Initially, the agreements from the partners of Denmark and Germany were used. Characteristics in the agreements require a fine granularity of sub-categories. The German agreement, for example, names the whole institution as partner, whereas the Danish agreement differentiates in personal roles, each with own responsibilities. ConclusionUndertaking an overview of privacy conditions can be a valuable step in comparing privacy and security requirements in different national regulations. The qualitative content analysis was found a suitable approach for this purpose because it enables the detection of fine characteristics. By using an incremental design, it is possible to adapt the coding system to include additional partners. However, the current coding system has the limitation that heterogeneity between the agreements leads to a fine granularity of categories that hamper the comparability between partners. Despite these problems, the approach allows the comparison of data privacy and supports the development of a data integration process for international harmonisation.

Download Full-text

Control Cloud Data Access Privilge Anonymity with Attributed Based Encryption

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse.v7i8.68 ◽

2017 ◽

Vol 7 (8) ◽

pp. 279

Author(s):

P. Sudheer ◽

T. Lakshmi Surekha

Keyword(s):

Data Privacy ◽

Low Cost ◽

Data Access ◽

Privacy Concerns ◽

Cloud Data ◽

Computing Paradigm ◽

Attribute Based Encryption ◽

Data Content ◽

Identity Privacy ◽

Cloud Servers

Cloud computing is a revolutionary computing paradigm, which enables flexible, on-demand, and low-cost usage of computing resources, but the data is outsourced to some cloud servers, and various privacy concerns emerge from it. Various schemes based on the attribute-based encryption have been to secure the cloud storage. Data content privacy. A semi anonymous privilege control scheme AnonyControl to address not only the data privacy. But also the user identity privacy. AnonyControl decentralizes the central authority to limit the identity leakage and thus achieves semi anonymity. The Anonymity –F which fully prevent the identity leakage and achieve the full anonymity.

Download Full-text

Twenty Years of Data Linkage in The Australian Longitudinal Study on Women’s Health

International Journal for Population Data Science ◽

10.23889/ijpds.v5i5.1500 ◽

2020 ◽

Vol 5 (5) ◽

Author(s):

Colleen Loos ◽

Gita Mishra ◽

Annette Dobson ◽

Leigh Tooth

Keyword(s):

Longitudinal Study ◽

Data Collection ◽

Women’S Health ◽

Women's Health ◽

Linked Data ◽

Data Linkage ◽

Data Access ◽

National Study ◽

Australian Longitudinal Study ◽

Data Collections

IntroductionLinked health record collections, when combined with large longitudinal surveys, are a rich research resource to inform policy development and clinical practice across multiple sectors. Objectives and ApproachThe Australian Longitudinal Study on Women’s Health (ALSWH) is a national study of over 57,000 women in four cohorts. Survey data collection commenced in 1996. Over the past 20 years, ALSWH has also established an extensive data linkage program. The aim of this poster is to provide an overview of ALSWH’s program of regularly up-dated linked data collections for use in parallel with on-going surveys, and to demonstrate how data are made widely available to research collaborators. ResultsALSWH surveys collect information on health conditions, ageing, reproductive characteristics, access to health services, lifestyle, and socio-demographic factors. Regularly updated linked national and state administrative data collections add information on health events, health outcomes, diagnoses, treatments, and patterns of service use. ALSWH’s national linked data collections, include Medicare Benefits Schedule, Pharmaceutical Benefits Scheme, the National Death Index, the Australian Cancer Database, and the National Aged Care Data Collection. State and Territory hospital collections include Admitted Patients, Emergency Department and Perinatal Data. There are also substudies, such as the Mothers and their Children’s Health Study (MatCH), which involves linkage to children’s educational records. ALSWH has an internal Data Access Committee along with systems and protocols to facilitate collaborative multi-sectoral research using de-identified linked data. Conclusion / ImplicationsAs a large scale Australian longitudinal multi-jurisdictional data linkage and sharing program, ALSWH is a useful model for anyone planning similar research.

Download Full-text

The Brazilian cohort of pulp and paper workers: the logistic of a cancer mortality study

Cadernos de Saúde Pública ◽

10.1590/s0102-311x1998000700012 ◽

1998 ◽

Vol 14 (suppl 3) ◽

pp. S117-S123 ◽

Cited By ~ 4

Author(s):

Anaclaudia Gastal Fassa ◽

Luiz Augusto Facchini ◽

Marinel Mór Dall'Agnol

Keyword(s):

Data Access ◽

Pulp And Paper ◽

Data Accuracy ◽

Data Availability ◽

International Agency ◽

Historical Cohort ◽

Increased Risk ◽

Risk Of Cancer ◽

Access Data ◽

Brazilian Cohort

The International Agency for Research on Cancer (IARC) proposed this international historical cohort study trying to solve the controversy about the increased risk of cancer in the workers of the Pulp and Paper Industry. One of the most important aspects presented by this study in Brazil was the strategies used to overcome the methodological challenges, such as: data access, data accuracy, data availability, multiple data sources, and the large follow-up period. Through multiple strategies it was possible to build a Brazilian cohort of 3,622 workers, to follow them with a 93 percent success rate and to identify in 99 percent of the cases the cause of death. This paper, has evaluated the data access, data accuracy and the effectiveness of the strategies used and the different sources of data.

Download Full-text

From Ignorance Map to Informing PKM4E Framework: Personal Knowledge Management for Empowerment

10.28945/3984 ◽

2018 ◽

Keyword(s):

Knowledge Management ◽

Design Science ◽

Knowledge Workers ◽

Rapid Development ◽

Science Research ◽

Knowledge Bases ◽

Personal Knowledge ◽

Future Research ◽

Development Platform ◽

Personal Knowledge Management

Aim/Purpose: [This Proceedings paper was revised and published in the 2018 issue of the journal Issues in Informing Science and Information Technology, Volume 15] The proposed Personal Knowledge Management (PKM) for Empowerment (PKM4E) Framework expands on the notions of the Ignorance Map and Matrix for further supporting the educational concept of a PKM system-in-progress. Background: The accelerating information abundance is depleting the very attention our cognitive capabilities are able to master, one key cause of individual and collective opportunity divides. Support is urgently needed to benefit Knowledge Workers independent of space (developed/developing countries), time (study or career phase), discipline (natural or social science), or role (student, professional, leader). Methodology: The Design Science Research (DSR) project introducing the novel PKM System (PKMS) aims to support a scenario of a ‘Decentralizing KM Revolution’ giving more power and autonomy to individuals and self-organized groups. Contribution: The portrayal of potential better solutions cannot be accommodated by one-dimensional linear text alone but necessitates the utilization of visuals, charts, and blueprints for the concept as well as the use of colors, icons, and catchy acronyms to successfully inform a diverse portfolio of audiences and potential beneficiaries. Findings: see Recommendation for Researchers Recommendations for Practitioners: The PKM4E learning cycles and workflows apply ‘cumulative synthesis’, a concept which convincingly couples the activities of researchers and entrepreneurs, and assists users to advance their capability endowments via applied learning. Recommendation for Researchers: In substituting document-centric with meme-based knowledge bases, the PKMS approach merges distinctive voluntarily shared knowledge objects/assets of diverse disciplines into a single unified digital knowledge repository and provides the means for advancing current metrics and reputation systems. Impact on Society: The PKMS features provide the means to tackle the widening opportunity divides by affording knowledge workers with continuous life-long support from trainee, student, novice, or mentee towards professional, expert, mentor, or leader. Future Research: After completing the test phase of the PKMS prototype, its transformation into a viable PKM system and cloud-based server based on a rapid development platform and a noSQL-database is estimated to take 12 months.

Download Full-text

Enhanced Integrity Checking for Preserve Data Owner and User Level Privacy Using Dual Cryptography Approach

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit195346 ◽

2019 ◽

pp. 138-146

Author(s):

Poovizhi. M ◽

Raja. G

Keyword(s):

Data Storage ◽

Data Privacy ◽

Capital Expenditure ◽

Data Access ◽

Third Party ◽

Configurable Computing ◽

Local Data ◽

Cloud Data ◽

On Demand ◽

Integrity Checking

Using Cloud Storage, users can tenuously store their data and enjoy the on-demand great quality applications and facilities from a shared pool of configurable computing resources, without the problem of local data storage and maintenance. However, the fact that users no longer have physical possession of the outsourced data makes the data integrity protection in Cloud Computing a formidable task, especially for users with constrained dividing resources. From users’ perspective, including both individuals and IT systems, storing data remotely into the cloud in a flexible on-demand manner brings tempting benefits: relief of the burden for storage management, universal data access with independent geographical locations, and avoidance of capital expenditure on hardware, software, and personnel maintenances, etc. To securely introduce an effective Sanitizer and third party auditor (TPA), the following two fundamental requirements have to be met: 1) TPA should be able to capably audit the cloud data storage without demanding the local copy of data, and introduce no additional on-line burden to the cloud user; 2) The third party auditing process should take in no new vulnerabilities towards user data privacy. In this project, utilize and uniquely combine the public auditing protocols with double encryption approach to achieve the privacy-preserving public cloud data auditing system, which meets all integrity checking without any leakage of data. To support efficient handling of multiple auditing tasks, we further explore the technique of online signature to extend our main result into a multi-user setting, where TPA can perform multiple auditing tasks simultaneously. We can implement double encryption algorithm to encrypt the data twice and stored cloud server in Electronic Health Record applications.

Download Full-text

XRootd, disk-based, caching proxy for optimization of data access, data placement and data replication

Journal of Physics Conference Series ◽

10.1088/1742-6596/513/4/042044 ◽

2014 ◽

Vol 513 (4) ◽

pp. 042044 ◽

Cited By ~ 12

Author(s):

L A T Bauerdick ◽

K Bloom ◽

B Bockelman ◽

D C Bradley ◽

S Dasu ◽

...

Keyword(s):

Data Replication ◽

Data Access ◽

Data Placement ◽

Caching Proxy ◽

Access Data

Download Full-text

Semantic-Based Geospatial Data Integration With Unique Features

Geospatial Intelligence ◽

10.4018/978-1-5225-8054-6.ch012 ◽

2019 ◽

pp. 254-277 ◽

Cited By ~ 1

Author(s):

Ying Zhang ◽

Chaopeng Li ◽

Na Chen ◽

Shaowen Liu ◽

Liming Du ◽

...

Keyword(s):

Data Integration ◽

High Performance ◽

Data Access ◽

Heterogeneous Data ◽

Geospatial Data ◽

Experimental Results ◽

Data Sources ◽

Data Format ◽

Access Protocols ◽

Data Source

Since large amount of geospatial data are produced by various sources, geospatial data integration is difficult because of the shortage of semantics. Despite standardised data format and data access protocols, such as Web Feature Service (WFS), can enable end-users with access to heterogeneous data stored in different formats from various sources, it is still time-consuming and ineffective due to the lack of semantics. To solve this problem, a prototype to implement the geospatial data integration is proposed by addressing the following four problems, i.e., geospatial data retrieving, modeling, linking and integrating. We mainly adopt four kinds of geospatial data sources to evaluate the performance of the proposed approach. The experimental results illustrate that the proposed linking method can get high performance in generating the matched candidate record pairs in terms of Reduction Ratio(RR), Pairs Completeness(PC), Pairs Quality(PQ) and F-score. The integrating results denote that each data source can get much Complementary Completeness(CC) and Increased Completeness(IC).

Download Full-text

Semantic Web and Geospatial Unique Features Based Geospatial Data Integration

Geospatial Intelligence ◽

10.4018/978-1-5225-8054-6.ch011 ◽

2019 ◽

pp. 230-253

Author(s):

Ying Zhang ◽

Chaopeng Li ◽

Na Chen ◽

Shaowen Liu ◽

Liming Du ◽

...

Keyword(s):

Semantic Web ◽

Data Integration ◽

High Performance ◽

Data Access ◽

Heterogeneous Data ◽

Geospatial Data ◽

Data Sources ◽

Modeling Process ◽

Translation Function ◽

Data Source

Since large amount of geospatial data are produced by various sources and stored in incompatible formats, geospatial data integration is difficult because of the shortage of semantics. Despite standardised data format and data access protocols, such as Web Feature Service (WFS), can enable end-users with access to heterogeneous data stored in different formats from various sources, it is still time-consuming and ineffective due to the lack of semantics. To solve this problem, a prototype to implement the geospatial data integration is proposed by addressing the following four problems, i.e., geospatial data retrieving, modeling, linking and integrating. First, we provide a uniform integration paradigm for users to retrieve geospatial data. Then, we align the retrieved geospatial data in the modeling process to eliminate heterogeneity with the help of Karma. Our main contribution focuses on addressing the third problem. Previous work has been done by defining a set of semantic rules for performing the linking process. However, the geospatial data has some specific geospatial relationships, which is significant for linking but cannot be solved by the Semantic Web techniques directly. We take advantage of such unique features about geospatial data to implement the linking process. In addition, the previous work will meet a complicated problem when the geospatial data sources are in different languages. In contrast, our proposed linking algorithms are endowed with translation function, which can save the translating cost among all the geospatial sources with different languages. Finally, the geospatial data is integrated by eliminating data redundancy and combining the complementary properties from the linked records. We mainly adopt four kinds of geospatial data sources, namely, OpenStreetMap(OSM), Wikmapia, USGS and EPA, to evaluate the performance of the proposed approach. The experimental results illustrate that the proposed linking method can get high performance in generating the matched candidate record pairs in terms of Reduction Ratio(RR), Pairs Completeness(PC), Pairs Quality(PQ) and F-score. The integrating results denote that each data source can get much Complementary Completeness(CC) and Increased Completeness(IC).

Download Full-text