Enhancing virtual ontology based access over tabular data with Morph-CSV

Semantic Web ◽  
2021 ◽  
pp. 1-34
Author(s):  
David Chaves-Fraga ◽  
Edna Ruckhaus ◽  
Freddy Priyatna ◽  
Maria-Esther Vidal ◽  
Oscar Corcho

Ontology-Based Data Access (OBDA) has traditionally focused on providing a unified view of heterogeneous datasets (e.g., relational databases, CSV and JSON files), either by materializing integrated data into RDF or by performing on-the-fly querying via SPARQL query translation. In the specific case of tabular datasets represented as several CSV or Excel files, query translation approaches have been applied by considering each source as a single table that can be loaded into a relational database management system (RDBMS). Nevertheless, constraints over these tables are not represented (e.g., referential integrity among sources, datatypes, or data integrity); thus, neither consistency among attributes nor indexes over tables are enforced. As a consequence, efficiency of the SPARQL-to-SQL translation process may be affected, as well as the completeness of the answers produced during the evaluation of the generated SQL query. Our work is focused on applying implicit constraints on the OBDA query translation process over tabular data. We propose Morph-CSV, a framework for querying tabular data that exploits information from typical OBDA inputs (e.g., mappings, queries) to enforce constraints that can be used together with any SPARQL-to-SQL OBDA engine. Morph-CSV relies on both a constraint component and a set of constraint operators. For a given set of constraints, the operators are applied to each type of constraint with the aim of enhancing query completeness and performance. We evaluate Morph-CSV in several domains: e-commerce with the BSBM benchmark; transportation with the GTFS-Madrid benchmark; and biology with a use case extracted from the Bio2RDF project. We compare and report the performance of two SPARQL-to-SQL OBDA engines, without and with the incorporation of Morph-CSV. The observed results suggest that Morph-CSV is able to speed up the total query execution time by up to two orders of magnitude, while it is able to produce all the query answers.
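The constraints the abstract mentions (referential integrity, datatypes, indexes) can be made concrete with a small sketch. The following is a hypothetical illustration, not Morph-CSV itself: two CSV-like sources are loaded into SQLite with explicit constraints instead of as untyped single tables, so that inconsistent rows are rejected and the join attribute is indexed for the SQL produced by query translation. All table and column names are invented for the example.

```python
import sqlite3

# Hypothetical sources "routes" and "stops", loaded with explicit
# datatypes, a foreign key, and an index rather than as raw text tables.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite disables FK checks by default

conn.execute("""
    CREATE TABLE routes (
        route_id   TEXT PRIMARY KEY,   -- implicit uniqueness constraint
        route_name TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE stops (
        stop_id  TEXT PRIMARY KEY,
        route_id TEXT NOT NULL REFERENCES routes(route_id),  -- referential integrity
        lat      REAL,                 -- datatype constraint instead of raw text
        lon      REAL
    )""")
# An index over the join attribute speeds up the generated SQL joins.
conn.execute("CREATE INDEX idx_stops_route ON stops(route_id)")

conn.execute("INSERT INTO routes VALUES ('r1', 'Line 1')")
conn.execute("INSERT INTO stops VALUES ('s1', 'r1', 40.41, -3.70)")

# A row violating referential integrity is rejected rather than silently kept.
try:
    conn.execute("INSERT INTO stops VALUES ('s2', 'missing', 0.0, 0.0)")
except sqlite3.IntegrityError:
    print("rejected: unknown route_id")
```

Without the foreign key and index, the same data would load, but joins could be slow and dangling references would silently yield incomplete answers, which is exactly the completeness problem described above.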

Database ◽  
2020 ◽  
Vol 2020 ◽  
Author(s):  
Claire M Simpson ◽  
Florian Gnad

Abstract Graph representations provide an elegant solution to capture and analyze complex molecular mechanisms in the cell. Co-expression networks are undirected graph representations of transcriptional co-behavior indicating (co-)regulations, functional modules or even physical interactions between the corresponding gene products. The growing avalanche of available RNA sequencing (RNAseq) data fuels the construction of such networks, which are usually stored in relational databases like most other biological data. Inferring linkage by recursive multiple-join statements, however, is computationally expensive and complex to design in relational databases. In contrast, graph databases store and represent complex interconnected data as nodes, edges and properties, making it fast and intuitive to query and analyze relationships. While graph-based database technologies are on their way from a fringe domain to going mainstream, there are only a few studies reporting their application to biological data. We used the graph database management system Neo4j to store and analyze co-expression networks derived from RNAseq data from The Cancer Genome Atlas. Comparing co-expression in tumors versus healthy tissues in six cancer types revealed significant perturbation tracing back to erroneous or rewired gene regulation. Applying centrality, community detection and pathfinding graph algorithms uncovered the destruction or creation of central nodes, modules and relationships in co-expression networks of tumors. Given the speed, accuracy and straightforwardness of managing these densely connected networks, we conclude that graph databases are ready for entering the arena of biological data.
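To illustrate the kind of analysis the abstract describes, here is a minimal, dependency-free sketch (with hypothetical gene pairs) of a co-expression network as an adjacency structure, with degree centrality as the simplest "central node" measure; in Neo4j the same question is a short Cypher pattern match rather than a recursive join.

```python
# Toy co-expression edges (hypothetical pairs chosen for illustration only).
edges = [("TP53", "MDM2"), ("TP53", "CDKN1A"), ("MDM2", "CDKN1A"),
         ("BRCA1", "BARD1")]

# Undirected graph as an adjacency dict: node -> set of co-expressed partners.
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

# Degree centrality: fraction of other nodes each gene is connected to.
# In Cypher this is roughly: MATCH (g:Gene)--() RETURN g, count(*)
n = len(graph)
centrality = {gene: len(nbrs) / (n - 1) for gene, nbrs in graph.items()}
hub = max(centrality, key=centrality.get)
```

In a relational schema, finding multi-hop neighborhoods requires one self-join per hop; in a graph model the relationships are first-class, which is the speed and intuitiveness argument made above.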


2012 ◽  
Vol 20 (2) ◽  
pp. 89-114 ◽  
Author(s):  
H. Carter Edwards ◽  
Daniel Sunderland ◽  
Vicki Porter ◽  
Chris Amsler ◽  
Sam Mish

Large, complex scientific and engineering application codes represent a significant investment in computational kernels that implement their mathematical models. Porting these computational kernels to the collection of modern manycore accelerator devices is a major challenge, as these devices have diverse programming models, application programming interfaces (APIs), and performance requirements. The Kokkos Array programming model provides a library-based approach to implementing computational kernels that are performance-portable to CPU-multicore and GPGPU accelerator devices. This programming model is based upon three fundamental concepts: (1) manycore compute devices, each with its own memory space; (2) data-parallel kernels; and (3) multidimensional arrays. Kernel execution performance is, especially for NVIDIA® devices, extremely dependent on data access patterns. The optimal data access pattern can differ between manycore devices, potentially leading to different implementations of computational kernels specialized for different devices. The Kokkos Array programming model supports performance-portable kernels by (1) separating data access patterns from computational kernels through a multidimensional array API and (2) introducing device-specific data access mappings when a kernel is compiled. An implementation of Kokkos Array is available through Trilinos [Trilinos website, http://trilinos.sandia.gov/, August 2011].
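The idea of device-specific data access mappings can be sketched in a few lines. This is an illustration of the concept, not Kokkos code: the same logical index (i, j) maps to different memory offsets depending on the layout chosen for the target device (row-major traversal favors CPU caching; column-major gives coalesced per-thread access on GPUs), while the kernel itself is written only against the logical index.

```python
# Row-major mapping ("LayoutRight" in Kokkos terms): good for CPU row sweeps.
def offset_row_major(i, j, rows, cols):
    return i * cols + j

# Column-major mapping ("LayoutLeft"): adjacent i values are adjacent in
# memory, giving coalesced access when GPU threads index by i.
def offset_col_major(i, j, rows, cols):
    return j * rows + i

# The "kernel" writes A(i, j) through the mapping; only the mapping changes
# between devices, not the kernel logic.
rows, cols = 3, 4
a_cpu = [0] * (rows * cols)
a_gpu = [0] * (rows * cols)
for i in range(rows):
    for j in range(cols):
        a_cpu[offset_row_major(i, j, rows, cols)] = i * 10 + j
        a_gpu[offset_col_major(i, j, rows, cols)] = i * 10 + j
```

Separating the index mapping behind an array API is what lets one kernel source compile to the layout each device prefers.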


2021 ◽  
Author(s):  
Jahanzeb Yousaf

In this thesis, we describe a simulator of residential power consumption as the first step towards a comprehensive demand-side simulator in the context of the smart grid. The simulator uses a commercial Relational Database Management System (RDBMS) as its simulation engine, and is thus capable of supporting much larger simulated systems than other existing simulators, which are mostly based on a multi-agent simulation (MAS) paradigm. The RDBMS-based design also leads to much improved performance while requiring fewer resources than comparable MAS-based systems. Moreover, the simulator records all events that exceed a certain threshold and, in response, controls demand and ensures the stability of the system based on future and past events. Furthermore, the simulator can help utility companies obtain initial data that can lead to the development of more complex solutions to monitor and control energy consumption, and thus identify target operating points for the generation and distribution systems, with the ultimate goal of balancing demand and supply and improving energy efficiency at the utility level.


Author(s):  
S.Tamil Selvan ◽  
M. Sundararajan

This paper presents the design and implementation of CNTFET-based ternary 1x1 RAM memories for high-performance digital circuits. The CNTFET ternary 1x1 SRAM memory is implemented using a 32 nm technology process. Varying the CNTFET diameter affects performance metrics such as delay, power, and power-delay product. The CNTFET ternary 6T SRAM cell consists of two cross-coupled ternary inverters; the READ and WRITE operations of the cell are performed over the tritline. The design was simulated with high accuracy using the HSPICE and Tanner tools. The proposed cell is suited to low-power applications, and its access time is shorter than that of conventional CMOS technology. The CNTFET ternary 6T SRAM array module (1x1) in 32 nm technology consumes only 0.412 mW of power, with a data access time of about 5.23 ns.
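For readers unfamiliar with ternary logic, the truth tables of the three common ternary inverters are sketched below at the logic level only (this is not circuit-level CNTFET behavior). Cross-coupling two standard ternary inverters is what lets the 6T cell hold one of three states, since double inversion restores the stored value.

```python
# Logic levels are 0, 1, 2 (a "trit"). These are the standard definitions.
def sti(x):  # standard ternary inverter: 0->2, 1->1, 2->0
    return 2 - x

def nti(x):  # negative ternary inverter: 0->2, 1->0, 2->0
    return 2 if x == 0 else 0

def pti(x):  # positive ternary inverter: 0->2, 1->2, 2->0
    return 2 if x in (0, 1) else 0

# A storage loop of two cross-coupled STIs is stable for all three levels,
# because sti(sti(x)) == x.
stored = [sti(sti(x)) for x in (0, 1, 2)]
```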


Author(s):  
Rajasekar Krishnamurthy ◽  
Raghav Kaushik ◽  
Jeffrey F Naughton

Author(s):  
Silvio Gonnet ◽  
Marcela Vegetti ◽  
Horacio Leone ◽  
Gabriela Henning

This contribution points out the various challenges associated with Supply Chain Management (SCM). SCM involves coordinating and integrating material, information, and money flows, both within and across several companies. The integration of these flows is perceived in quite distinct ways by different communities, raising some semantics-related problems. To assist organizations in achieving a unified view of the Supply Chain (SC), a new ontology, named SCOntology, is introduced in this chapter. SCOntology is a framework to formally describe an SC at various abstraction levels, sharing a precise meaning of the information exchanged during communication among the many stakeholders involved in the SC. Moreover, SCOntology provides a foundation for the specification of information logistics processes and also sets the grounds for measuring and evaluating an SC by stating different metrics and performance-related concepts.
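The kind of shared vocabulary an SC ontology provides can be pictured as subject-predicate-object facts that all stakeholders query with the same precisely defined terms. The entities and predicates below are invented for illustration and are not taken from SCOntology itself.

```python
# Hypothetical supply-chain facts as triples, the basic shape of
# ontology-backed data: every stakeholder reads "delivers_to" or
# "produces" with one agreed-upon meaning.
triples = {
    ("Supplier_A", "delivers_to", "Plant_1"),
    ("Plant_1", "produces", "Product_X"),
    ("Plant_1", "ships_to", "DistributionCenter_1"),
}

def objects_of(subject, predicate):
    """All objects related to `subject` by `predicate`."""
    return {o for (s, p, o) in triples if s == subject and p == predicate}
```

Because the predicates carry agreed semantics, a planning department and a logistics department asking `objects_of("Plant_1", "ships_to")` get the same answer with the same meaning, which is the unified-view goal described above.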


2010 ◽  
Vol 27 (4) ◽  
pp. 308 ◽  
Author(s):  
Haw Su-Cheng ◽  
Lee Chien-Sing ◽  
Norwati Mustapha
