aggregate functions
Recently Published Documents


TOTAL DOCUMENTS

70
(FIVE YEARS 13)

H-INDEX

11
(FIVE YEARS 1)

2021 ◽  
Vol 14 (13) ◽  
pp. 3389-3401
Author(s):  
Daniel Hernández ◽  
Luis Galárraga ◽  
Katja Hose

Over the past few years, we have witnessed the emergence of large knowledge graphs built by extracting and combining information from multiple sources. This has propelled many advances in query processing over knowledge graphs; however, the aspect of providing provenance explanations for query results has so far been mostly neglected. We therefore propose a novel method, SPARQLprov, based on query rewriting, to compute how-provenance polynomials for SPARQL queries over knowledge graphs. Contrary to existing works, SPARQLprov is system-agnostic and can be applied to standard and already deployed SPARQL engines without the need for customized extensions. We rely on spm-semirings to compute polynomial annotations that respect the property of commutation with homomorphisms on monotonic and non-monotonic SPARQL queries without aggregate functions. Our evaluation on real and synthetic data shows that SPARQLprov over standard engines incurs an acceptable runtime overhead w.r.t. the original query, competing with state-of-the-art solutions for how-provenance computation.
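As a rough illustration of what a how-provenance polynomial is, the sketch below annotates the answers of a two-hop path query over a toy edge set. It is not SPARQLprov's rewriting and it covers only the positive (monotonic) case, where a join multiplies annotations and a projection adds alternatives; the data, the identifiers `t1` to `t4`, and the function names are all hypothetical.

```python
from collections import Counter

# Hypothetical annotated edges: each one carries a provenance identifier.
edges = {
    ("alice", "bob"): "t1",
    ("bob", "carol"): "t2",
    ("alice", "dave"): "t3",
    ("dave", "carol"): "t4",
}


def two_hop_provenance(edges):
    """Annotate each (x, z) answer of the path query x -> y -> z.

    A polynomial is a Counter mapping a monomial (a sorted tuple of
    identifiers) to its coefficient. Joining two edges multiplies their
    annotations; projecting away y adds the alternative derivations.
    """
    result = {}
    for (x, y1), a in edges.items():
        for (y2, z), b in edges.items():
            if y1 == y2:
                monomial = tuple(sorted((a, b)))
                result.setdefault((x, z), Counter())[monomial] += 1
    return result


def show(poly):
    """Render a polynomial as text, e.g. 't1*t2 + t3*t4'."""
    return " + ".join(
        (f"{coeff}*" if coeff > 1 else "") + "*".join(monomial)
        for monomial, coeff in sorted(poly.items())
    )


prov = two_hop_provenance(edges)
print(show(prov[("alice", "carol")]))
```

The single answer (alice, carol) has two independent derivations, one through bob and one through dave, which is why its polynomial is a sum of two products.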


2021 ◽  
Vol 14 (10) ◽  
pp. 1818-1831
Author(s):  
Rudi Poepsel-Lemaitre ◽  
Martin Kiefer ◽  
Joscha von Hein ◽  
Jorge-Arnulfo Quiané-Ruiz ◽  
Volker Markl

In pursuit of real-time data analysis, approximate summarization structures, i.e., synopses, have gained importance over the years. However, existing stream processing systems, such as Flink, Spark, and Storm, do not support synopses as first-class citizens, i.e., as pipeline operators. Implementing synopses is left to users. This is mainly because of the diversity of synopses, which makes a unified implementation difficult. We present Condor, a framework that supports synopses as first-class citizens. Condor facilitates the specification and processing of synopsis-based streaming jobs while hiding all internal processing details. Condor's key component is its model that represents synopses as a particular case of windowed aggregate functions. An inherent divide-and-conquer strategy allows Condor to efficiently distribute the computation, allowing for high performance and linear scalability. Our evaluation shows that Condor outperforms existing approaches by up to a factor of 75x and that it scales linearly with the number of cores.
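The idea of treating a synopsis as a windowed aggregate function can be sketched with a mergeable Count-Min sketch: each partition of a window builds a partial synopsis, and the partials are merged into one result. This is only an illustration of the divide-and-conquer principle, not Condor's API; the class, the function `window_synopsis`, and all parameters are assumptions made for the example.

```python
import hashlib


class CountMinSketch:
    """A small Count-Min sketch used here as an aggregate with update/merge."""

    def __init__(self, width=64, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, item):
        # Deterministic per-row hash derived from SHA-256.
        digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def update(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += count

    def merge(self, other):
        # Cell-wise addition: valid because both sketches share the same
        # dimensions and hash functions.
        merged = CountMinSketch(self.width, self.depth)
        for row in range(self.depth):
            merged.table[row] = [
                a + b for a, b in zip(self.table[row], other.table[row])
            ]
        return merged

    def estimate(self, item):
        return min(
            self.table[row][self._index(row, item)] for row in range(self.depth)
        )


def window_synopsis(window, partitions=2):
    """Build one partial sketch per partition, then merge the partials."""
    partials = []
    for p in range(partitions):
        sketch = CountMinSketch()
        for item in window[p::partitions]:
            sketch.update(item)
        partials.append(sketch)
    result = partials[0]
    for sketch in partials[1:]:
        result = result.merge(sketch)
    return result


window = ["a", "b", "a", "c", "a"]
merged = window_synopsis(window)
```

Because merging is plain cell-wise addition, the merged sketch is identical to one built sequentially over the whole window, which is the property that lets the work be split across workers.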


Author(s):  
Pedro Cabalar ◽  
Jorge Fandinno ◽  
Torsten Schaub ◽  
Philipp Wanko

Characterizing hybrid ASP solving in a generic way is difficult since one needs to abstract from specific theories. Inspired by lazy SMT solving, this is usually addressed by treating theory atoms as opaque. In contrast, we propose a slightly more transparent approach that includes an abstract notion of a term. Rather than imposing a syntax on terms, we keep them abstract by stipulating only some basic properties. With this, we further develop a semantic framework for hybrid ASP solving and provide aggregate functions for theory variables that adhere to different semantic principles. We show that they generalize existing aggregate semantics in ASP and how we can rely on off-the-shelf hybrid solvers for implementation.


Author(s):  
Shivangi Kanchan ◽  
Parmeet Kaur ◽  
Pranjal Apoorva

Aim: To evaluate the performance of relational and NoSQL databases in terms of execution time and memory consumption during operations involving structured data. Objective: To outline the criteria that decision makers should consider when choosing the database best suited to an application. Methods: Extensive experiments were performed on MySQL, MongoDB, Cassandra, and Redis using the data for an IMDB movies schema prorated into 4 datasets of 1000, 10000, 25000 and 50000 records. The experiments involved the typical database operations of insertion, deletion, update, and read of records, with and without indexing, as well as aggregation operations. Database performance was evaluated by measuring the time taken for operations and computing memory usage. Results: Redis provides the best performance for write, update and delete operations in terms of time elapsed and memory usage, whereas MongoDB gives the worst performance when the size of data increases, due to its locking mechanism. For read operations, Redis provides better performance in terms of latency than Cassandra and MongoDB. MySQL shows the worst performance due to its relational architecture. On the other hand, MongoDB shows the best performance among all databases in terms of efficient memory usage. Indexing improves the performance of any database only for covered queries. Redis and MongoDB give good performance for range-based queries and for fetching complete data in terms of elapsed time, whereas MySQL gives the worst performance. MySQL provides better performance for aggregate functions. NoSQL is not suitable for complex queries and aggregate functions. Conclusion: The extensive empirical analysis found that NoSQL outperforms SQL-based systems in terms of basic read and write operations. However, SQL-based systems are better if queries on the dataset mainly involve aggregation operations.


2019 ◽  
Vol 13 (2) ◽  
pp. 101-107
Author(s):  
Shailender Kumar ◽  
Preetam Kumar ◽  
Aman Mittal

Background: Window aggregate functions are a class of functions that have emerged as a very important tool for big data analytics. They lend support in analysis and decision-making applications. A window aggregate function aggregates and returns the result by applying the function over a limited number of tuples related to the current tuple. We have gone through different patents related to window aggregate functions and their optimization. The cost associated with big data analytics, especially the processing of window functions, is one of the major limiting factors. However, a number of optimization techniques have now evolved for both single and multiple window aggregate functions. Methods: In this paper, the authors discuss various optimization techniques and summarize the latest techniques that have been developed through intensive research in this area. The paper compares the techniques based on parameters such as the degree of parallelism, multiple window function support, and execution time. Results: After analyzing all these techniques, the segment tree data structure appears to be the better technique, as it outperforms the others on grounds such as efficiency, memory overhead, execution speed and degree of parallelism. Conclusion: To optimize window aggregate functions, the segment tree data structure is the better technique, and it can improve the processing of window aggregate functions, specifically in big data analytics.
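A minimal sketch of how a segment tree can serve a window aggregate, assuming a frame of the form "N preceding rows up to the current row". It is a generic textbook segment tree, not the specific technique evaluated in the paper; the class `SegmentTree`, the helper `sliding_aggregate`, and the sample values are hypothetical.

```python
class SegmentTree:
    """Iterative segment tree over a fixed list of values."""

    def __init__(self, values, combine=lambda a, b: a + b, identity=0):
        self.n = len(values)
        self.combine = combine
        self.identity = identity
        # Leaves live at indices n..2n-1; index 0 is unused.
        self.tree = [identity] * self.n + list(values)
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = combine(self.tree[2 * i], self.tree[2 * i + 1])

    def query(self, lo, hi):
        """Aggregate values[lo:hi] in O(log n) combine steps."""
        left, right = self.identity, self.identity
        lo += self.n
        hi += self.n
        while lo < hi:
            if lo & 1:
                left = self.combine(left, self.tree[lo])
                lo += 1
            if hi & 1:
                hi -= 1
                right = self.combine(self.tree[hi], right)
            lo //= 2
            hi //= 2
        return self.combine(left, right)


def sliding_aggregate(values, preceding, combine=lambda a, b: a + b, identity=0):
    """For each row i, aggregate rows max(0, i - preceding) .. i inclusive."""
    tree = SegmentTree(values, combine, identity)
    return [tree.query(max(0, i - preceding), i + 1) for i in range(len(values))]


values = [3, 1, 4, 1, 5]
print(sliding_aggregate(values, preceding=2))
print(sliding_aggregate(values, preceding=2, combine=min, identity=float("inf")))
```

The same tree answers any frame over the partition, so several window functions with different frame bounds could reuse one build, and non-invertible aggregates such as `min` need no special handling.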

