Aggregate Functions: Latest Research Papers

Over the past few years, we have witnessed the emergence of large knowledge graphs built by extracting and combining information from multiple sources. This has propelled many advances in query processing over knowledge graphs; however, providing provenance explanations for query results has so far been mostly neglected. We therefore propose a novel method, SPARQLprov, based on query rewriting, to compute how-provenance polynomials for SPARQL queries over knowledge graphs. Contrary to existing works, SPARQLprov is system-agnostic and can be applied to standard, already deployed SPARQL engines without the need for customized extensions. We rely on spm-semirings to compute polynomial annotations that respect the property of commutation with homomorphisms on monotonic and non-monotonic SPARQL queries without aggregate functions. Our evaluation on real and synthetic data shows that SPARQLprov over standard engines incurs an acceptable runtime overhead w.r.t. the original query, competing with state-of-the-art solutions for how-provenance computation.
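To make the notion of a how-provenance polynomial concrete, here is a minimal sketch in Python. It is not SPARQLprov itself: it uses the plain polynomial semiring for a monotonic join-and-project query rather than the spm-semirings mentioned in the abstract, and the data, the triple labels (`t1` to `t4`), and all function names are hypothetical. A join multiplies the annotations of the triples it combines, and alternative derivations of the same answer are added.

```python
from collections import Counter

# A polynomial is a Counter mapping a monomial (sorted tuple of
# variable names) to its integer coefficient.

def p_var(name):
    """Polynomial consisting of a single variable."""
    return Counter({(name,): 1})

def p_add(p, q):
    """Sum of two polynomials (alternative derivations)."""
    result = Counter(p)
    result.update(q)  # Counter.update adds coefficients
    return result

def p_mul(p, q):
    """Product of two polynomials (joint use of inputs)."""
    result = Counter()
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            result[tuple(sorted(m1 + m2))] += c1 * c2
    return result

def join_and_project(lives_in, located_in):
    """Join person-city with city-country, keep (person, country)."""
    results = {}
    for (person, city), p1 in lives_in.items():
        for (city2, country), p2 in located_in.items():
            if city == city2:
                key = (person, country)
                previous = results.get(key, Counter())
                results[key] = p_add(previous, p_mul(p1, p2))
    return results

# Hypothetical annotated triples: each one carries its own variable.
lives_in = {
    ("alice", "paris"): p_var("t1"),
    ("alice", "lyon"): p_var("t2"),
}
located_in = {
    ("paris", "france"): p_var("t3"),
    ("lyon", "france"): p_var("t4"),
}

answers = join_and_project(lives_in, located_in)
```

In this sketch the answer `("alice", "france")` is annotated with the polynomial t1·t3 + t2·t4: it can be derived either from the two Paris triples together or from the two Lyon triples together.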

In the land of data streams where synopses are missing, one framework to bring them all

Proceedings of the VLDB Endowment ◽

10.14778/3467861.3467871 ◽

2021 ◽

Vol 14 (10) ◽

pp. 1818-1831

Author(s):

Rudi Poepsel-Lemaitre ◽

Martin Kiefer ◽

Joscha von Hein ◽

Jorge-Arnulfo Quiané-Ruiz ◽

Volker Markl

Keyword(s):

Data Analysis ◽

Real Time ◽

Data Streams ◽

High Performance ◽

Stream Processing ◽

Divide And Conquer ◽

Time Data ◽

Real Time Data ◽

Internal Processing ◽

Aggregate Functions

In pursuit of real-time data analysis, approximate summarization structures, i.e., synopses, have gained importance over the years. However, existing stream processing systems, such as Flink, Spark, and Storm, do not support synopses as first-class citizens, i.e., as pipeline operators. Implementing synopses is left to users. This is mainly because of the diversity of synopses, which makes a unified implementation difficult. We present Condor, a framework that supports synopses as first-class citizens. Condor facilitates the specification and processing of synopsis-based streaming jobs while hiding all internal processing details. Condor's key component is its model that represents synopses as a particular case of windowed aggregate functions. An inherent divide-and-conquer strategy allows Condor to distribute the computation efficiently, enabling high performance and linear scalability. Our evaluation shows that Condor outperforms existing approaches by up to a factor of 75x and that it scales linearly with the number of cores.
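The idea of treating a synopsis as a windowed aggregate function that can be computed by divide and conquer can be sketched as follows. This is a minimal illustration in Python and not Condor's API: the `CountMinSketch` class, the `summarize_window` helper, and the round-robin partitioning are assumptions made for the example. The key property is that partial synopses built on disjoint parts of a window can be merged into the synopsis of the whole window.

```python
import hashlib

class CountMinSketch:
    """A small Count-Min sketch: approximate per-key counts in fixed space."""

    def __init__(self, width=64, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _bucket(self, row, key):
        # One deterministic hash per row, derived from SHA-256.
        digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.width

    def update(self, key, count=1):
        for row in range(self.depth):
            self.table[row][self._bucket(row, key)] += count

    def merge(self, other):
        """Combine two sketches of the same shape by adding counters."""
        if (self.width, self.depth) != (other.width, other.depth):
            raise ValueError("sketches must have the same shape")
        merged = CountMinSketch(self.width, self.depth)
        for row in range(self.depth):
            merged.table[row] = [
                a + b for a, b in zip(self.table[row], other.table[row])
            ]
        return merged

    def estimate(self, key):
        """Upper-bound estimate of how often key was seen."""
        return min(
            self.table[row][self._bucket(row, key)]
            for row in range(self.depth)
        )

def summarize_window(events, partitions=4):
    """Hypothetical helper: build partial sketches, then merge them."""
    partials = [CountMinSketch() for _ in range(partitions)]
    for i, key in enumerate(events):
        partials[i % partitions].update(key)  # stand-in for parallel workers
    result = partials[0]
    for partial in partials[1:]:
        result = result.merge(partial)
    return result
```

Because merging only adds counters, the merged sketch is identical to one built sequentially over the whole window, which is what lets the partial sketches be computed independently.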

Principle of Minimizing Empirical Risk and Averaging Aggregate Functions

Journal of Mathematical Sciences ◽

10.1007/s10958-021-05256-y ◽

2021 ◽

Vol 253 (4) ◽

pp. 583-598

Author(s):

Z. M. Shibzukhov

Keyword(s):

Empirical Risk ◽

Aggregate Functions

Aggregate Functions

Microsoft Excel Functions Quick Reference ◽

10.1007/978-1-4842-6613-7_7 ◽

2021 ◽

pp. 103-138

Author(s):

Mandeep Mehta

Keyword(s):

Aggregate Functions

A Uniform Treatment of Aggregates and Constraints in Hybrid ASP

Proceedings of the Seventeenth International Conference on Principles of Knowledge Representation and Reasoning ◽

10.24963/kr.2020/20 ◽

2020 ◽

Cited By ~ 1

Author(s):

Pedro Cabalar ◽

Jorge Fandinno ◽

Torsten Schaub ◽

Philipp Wanko

Keyword(s):

Semantic Framework ◽

Basic Properties ◽

Smt Solving ◽

Abstract Notion ◽

Uniform Treatment ◽

Aggregate Functions

Characterizing hybrid ASP solving in a generic way is difficult since one needs to abstract from specific theories. Inspired by lazy SMT solving, this is usually addressed by treating theory atoms as opaque. In contrast, we propose a slightly more transparent approach that includes an abstract notion of a term. Rather than imposing a syntax on terms, we keep them abstract by stipulating only some basic properties. With this, we further develop a semantic framework for hybrid ASP solving and provide aggregate functions for theory variables that adhere to different semantic principles. We show that they generalize existing aggregate semantics in ASP and that we can rely on off-the-shelf hybrid solvers for implementation.

Empirical Evaluation of NoSQL and Relational Database Systems

Recent Advances in Computer Science and Communications ◽

10.2174/2666255813999200612113208 ◽

2020 ◽

Vol 13 ◽

Author(s):

Shivangi Kanchan ◽

Parmeet Kaur ◽

Pranjal Apoorva

Keyword(s):

Empirical Evaluation ◽

Database Systems ◽

Memory Usage ◽

Nosql Databases ◽

Locking Mechanism ◽

Relational Database Systems ◽

Database Operations ◽

Efficient Memory ◽

Aggregation Operations ◽

Aggregate Functions

Aim: To evaluate the performance of relational and NoSQL databases in terms of execution time and memory consumption during operations involving structured data.

Objective: To outline the criteria that decision makers should consider when choosing the database best suited to an application.

Methods: Extensive experiments were performed on MySQL, MongoDB, Cassandra, and Redis using the data for an IMDB movies schema prorated into 4 datasets of 1000, 10000, 25000 and 50000 records. The experiments involved typical database operations of insertion, deletion, update, and read of records with and without indexing, as well as aggregation operations. The databases' performance was evaluated by measuring the time taken for operations and computing memory usage.

Results:

* Redis provides the best performance for write, update and delete operations in terms of time elapsed and memory usage, whereas MongoDB gives the worst performance when the size of data increases, due to its locking mechanism.

* For the read operations, Redis provides better performance in terms of latency than Cassandra and MongoDB. MySQL shows the worst performance due to its relational architecture. On the other hand, MongoDB shows the best performance among all databases in terms of efficient memory usage.

* Indexing improves the performance of any database only for covered queries.

* Redis and MongoDB give good performance for range-based queries and for fetching complete data in terms of elapsed time, whereas MySQL gives the worst performance.

* MySQL provides better performance for aggregate functions. NoSQL is not suitable for complex queries and aggregate functions.

Conclusion: The extensive empirical analysis found that NoSQL outperforms SQL-based systems in terms of basic read and write operations. However, SQL-based systems are better if queries on the dataset mainly involve aggregation operations.

SUDAF: Sharing User-Defined Aggregate Functions

2020 IEEE 36th International Conference on Data Engineering (ICDE) ◽

10.1109/icde48307.2020.00161 ◽

2020 ◽

Author(s):

Chao Zhang ◽

Farouk Toumani ◽

Bastien Doreau

Keyword(s):

Aggregate Functions

Applicability of Generalized Metropolis-Hastings Algorithm to Estimating Aggregate Functions in Wireless Sensor Networks

Advances in Science Technology and Engineering Systems Journal ◽

10.25046/aj050528 ◽

2020 ◽

Vol 5 (5) ◽

pp. 224-236

Author(s):

Martin Kenyeres ◽

Jozef Kenyeres

Keyword(s):

Wireless Sensor Networks ◽

Sensor Networks ◽

Wireless Sensor ◽

Aggregate Functions

Study of Optimized Window Aggregate Function for Big Data Analytics

Recent Patents on Engineering ◽

10.2174/1872212112666180330162741 ◽

2019 ◽

Vol 13 (2) ◽

pp. 101-107

Author(s):

Shailender Kumar ◽

Preetam Kumar ◽

Aman Mittal

Keyword(s):

Big Data ◽

Data Structure ◽

Data Analytics ◽

Big Data Analytics ◽

Aggregate Function ◽

Segment Tree ◽

Tree Data ◽

Tree Data Structure ◽

Multiple Window ◽

Aggregate Functions

Background: Window aggregate functions are a class of functions that have emerged as a very important tool for Big Data analytics. They support analysis and decision-making applications. A window aggregate function computes its result over a limited number of tuples related to the current tuple. We have gone through different patents related to window aggregate functions and their optimization. The cost associated with Big Data analytics, especially the processing of window functions, is one of the major limiting factors. However, a number of optimization techniques have now evolved for both single and multiple window aggregate functions.

Methods: In this paper, the authors discuss various optimization techniques and summarize the latest techniques that have been developed through intensive research in this area. The paper compares the techniques on parameters such as the degree of parallelism, multiple window function support, and execution time.

Results: After analyzing all these techniques, the segment tree data structure appears to be the better technique, as it outperforms the others on grounds such as efficiency, memory overhead, execution speed and degree of parallelism.

Conclusion: To optimize window aggregate functions, the segment tree data structure is the better technique, and it can improve the processing of window aggregate functions, specifically in Big Data analytics.
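As an illustration of why a segment tree suits window aggregates, the following Python sketch builds a tree once over a column of values and then answers each window's aggregate in logarithmic time instead of rescanning the window. It is a generic textbook construction, not the method of any specific patent surveyed; the class name, the `sliding_window_aggregate` helper, and the trailing-window definition are assumptions made for the example.

```python
class SegmentTree:
    """Iterative segment tree for an associative aggregate (sum, min, ...)."""

    def __init__(self, values, combine, identity):
        self.n = len(values)
        self.combine = combine
        self.identity = identity
        # Leaves live in tree[n:2n]; internal node i covers 2i and 2i+1.
        self.tree = [identity] * (2 * self.n)
        self.tree[self.n:] = list(values)
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = combine(self.tree[2 * i], self.tree[2 * i + 1])

    def query(self, lo, hi):
        """Aggregate of values[lo:hi] (half-open) in O(log n) steps."""
        left_acc = self.identity
        right_acc = self.identity
        lo += self.n
        hi += self.n
        while lo < hi:
            if lo & 1:  # lo is a right child: take it and move right
                left_acc = self.combine(left_acc, self.tree[lo])
                lo += 1
            if hi & 1:  # hi is exclusive: step left and take that node
                hi -= 1
                right_acc = self.combine(self.tree[hi], right_acc)
            lo >>= 1
            hi >>= 1
        return self.combine(left_acc, right_acc)

def sliding_window_aggregate(values, size, combine, identity):
    """For each row i, aggregate the last `size` rows ending at i."""
    tree = SegmentTree(values, combine, identity)
    return [
        tree.query(max(0, i - size + 1), i + 1)
        for i in range(len(values))
    ]

values = [3, 1, 4, 1, 5, 9, 2, 6]
window_sums = sliding_window_aggregate(values, 3, lambda a, b: a + b, 0)
window_mins = sliding_window_aggregate(values, 3, min, float("inf"))
```

The same tree can serve several window sizes over the same column, which is one reason the structure is attractive when a query contains multiple window functions.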